Make AST node content public outside of the crate #175
I'm open to accepting changes like this, but I'd like to understand a little more what the motivation for this is. Normally after a parse all iirc, after |
This is what I'd like to do:

fn get_document_title(document: &str) -> Result<String, Utf8Error> {
    let arena = Arena::new();
    let root = parse_document(&arena, document, &ComrakOptions::default());
    for node in root.children() {
        let value = node.data.clone().into_inner().value;
        let header = match value {
            Heading(c) => c,
            _ => continue,
        };
        if header.level == 1 {
            let data = node.data.clone().into_inner();
            // `data.content` is private
            let title = match std::str::from_utf8(&data.content) {
                Ok(title) => title,
                Err(e) => return Err(e)
            };
            return Ok(title.to_string());
        }
    }
    Ok("Untitled Document".to_string())
}

(I know, the code is hacky.) I don't entirely understand how to do this differently. |
Ah, I see. You'll want to pull the data from the Trying to make that guarantee, which making it The solution is to collect the text from the embedded Lines 370 to 382 in 374c174
I'd accept a PR that makes that public and usable outside the HTML formatter, but you can also get the effect yourself by reproducing it. Here's a full example:

extern crate comrak;
use comrak::{Arena, parse_document, ComrakOptions, nodes::{AstNode, NodeValue}};

fn main() {
    println!("{:?}", get_document_title("# Hello\n"));
    println!("{:?}", get_document_title("## Hello\n"));
    println!("{:?}", get_document_title("# `hi` **there**\n"));
}

fn get_document_title(document: &str) -> String {
    let arena = Arena::new();
    let root = parse_document(&arena, document, &ComrakOptions::default());
    for node in root.children() {
        let header = match node.data.clone().into_inner().value {
            NodeValue::Heading(c) => c,
            _ => continue,
        };
        if header.level != 1 {
            continue;
        }
        let mut text = Vec::new();
        collect_text(node, &mut text);
        // The input was already known good UTF-8 (document: &str) so comrak
        // guarantees the output will be too.
        return String::from_utf8(text).unwrap();
    }
    "Untitled Document".to_string()
}

fn collect_text<'a>(node: &'a AstNode<'a>, output: &mut Vec<u8>) {
    match node.data.borrow().value {
        NodeValue::Text(ref literal) | NodeValue::Code(ref literal) => {
            output.extend_from_slice(literal)
        }
        NodeValue::LineBreak | NodeValue::SoftBreak => output.push(b' '),
        _ => {
            for n in node.children() {
                collect_text(n, output);
            }
        }
    }
}

And the output:

"Hello"
"Untitled Document"
"hi there"

I hope this helps! |
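For readers without comrak at hand, the same idea can be sketched with only the standard library. This is a toy model, not comrak's API: the `Node` enum and the `document_title` function are hypothetical stand-ins for `AstNode`/`NodeValue` and `get_document_title`. It is only meant to show that a heading's text lives in its inline children, so the title has to be gathered by walking them recursively.

```rust
// Toy stand-in for a Markdown AST. These types are hypothetical and are
// NOT comrak's `AstNode` / `NodeValue`.
enum Node {
    Text(String),
    Code(String),
    SoftBreak,
    Strong(Vec<Node>),
    Heading { level: u8, children: Vec<Node> },
}

// Append the plain text found under `node` to `out`, depth first.
fn collect_text(node: &Node, out: &mut String) {
    match node {
        // Leaf nodes carry the literal text themselves.
        Node::Text(s) | Node::Code(s) => out.push_str(s),
        // Breaks become a single space, as in the example above.
        Node::SoftBreak => out.push(' '),
        // Container nodes have no text of their own; recurse into children.
        Node::Strong(children) | Node::Heading { children, .. } => {
            for child in children {
                collect_text(child, out);
            }
        }
    }
}

// Return the text of the first level-1 heading, or a fallback title.
fn document_title(top_level: &[Node]) -> String {
    for node in top_level {
        if let Node::Heading { level: 1, .. } = node {
            let mut title = String::new();
            collect_text(node, &mut title);
            return title;
        }
    }
    "Untitled Document".to_string()
}

fn main() {
    // Hand-built tree roughly corresponding to "# `hi` **there**".
    let doc = vec![Node::Heading {
        level: 1,
        children: vec![
            Node::Code("hi".to_string()),
            Node::Text(" ".to_string()),
            Node::Strong(vec![Node::Text("there".to_string())]),
        ],
    }];
    println!("{:?}", document_title(&doc));
}
```

The heading node itself contributes nothing to the output; everything comes from the `Text` and `Code` leaves beneath it, which is why reading a single field off the heading node would not be enough.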
That helps a lot, thank you very much for the detailed explanation! 😄 From what you explained, it doesn't make sense to make the |
@runebaas Thank you very much for your contribution. :heart: |
I'm currently building a static site generator, and for some representational stuff I'm parsing the headers out of the document. Unfortunately, the content on the AST node is internal to the crate, which prevents me from getting the data I need.
This change, making it public, would be much appreciated, but regardless of whether you decide to make it, thank you for maintaining the crate.