Rework node struct #326
int marker_offset;
int padding;
int start;
cmark_delim_type delimiter;
unsigned char list_type;
I'm just curious (as a C ignoramus) why this is needed. Does the compiler default to using more than one byte for a cmark_delim_type?
The C standard mandates a single implementation-defined type for enums. Compilers typically use int.
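To make the size difference concrete, here is a minimal sketch comparing a struct that stores small enumerations in their enum type with one that stores them in `unsigned char`. The names `delim_type`, `list_wide` and `list_narrow` are hypothetical stand-ins, not cmark's actual definitions, and the exact sizes depend on the compiler and ABI.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an enum such as cmark_delim_type. */
typedef enum { DELIM_NONE, DELIM_PERIOD, DELIM_PAREN } delim_type;

/* Fields declared with the enum type: each one usually occupies
 * sizeof(int) bytes, because compilers typically pick int. */
struct list_wide {
  int marker_offset;
  int padding;
  int start;
  delim_type delimiter;
  delim_type list_type;
};

/* Same fields stored as unsigned char: one byte each, so the two
 * small fields can share a single aligned word. */
struct list_narrow {
  int marker_offset;
  int padding;
  int start;
  unsigned char delimiter;
  unsigned char list_type;
};

/* Reading a narrow field back as the enum needs an explicit cast. */
static delim_type narrow_delimiter(const struct list_narrow *l) {
  return (delim_type)l->delimiter;
}
```

On a typical platform with a 4-byte `int`, `sizeof(struct list_wide)` is 20 and `sizeof(struct list_narrow)` is 16, but neither number is guaranteed by the standard.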
oh, that's good to know.
This looks good to me! Query: does anything in this affect the public API?
Yes, the addition of
OK, can you put something very prominent in the commit message so I'll remember to highlight this API change in the next release?
Use zero-terminated C strings instead of cmark_chunks without storing the length. The length of code literals will be readded in a later commit. strlen overhead for code info should be negligible. Reduces size of struct cmark_node by 8 bytes.
Use zero-terminated C strings instead of cmark_chunks without storing the length. This introduces a few additional strlen computations, but overhead should be low. This makes it possible to reduce the size of struct cmark_node later.
Reduces size of struct cmark_node by 8 bytes.
Use zero-terminated C strings and a separate length field instead of cmark_chunks. Literal inline text will now be copied from the parent block's content buffer, slowing the benchmark down by 10-15%. The node struct never references memory of other nodes now, fixing commonmark#309. Node accessors don't have to check for delayed creation of C strings, so parsing and iterating all literals using the public API should actually be faster than before.
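The idea of a node owning a zero-terminated copy of its literal plus a separate length field can be sketched roughly as follows. This is an illustration only: `literal_node`, `literal_set` and `literal_free` are hypothetical names, and cmark's real code uses its own allocator and node layout.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node that owns its literal text. */
typedef struct {
  unsigned char *data; /* zero-terminated copy owned by the node */
  size_t len;          /* length in bytes, excluding the terminator */
} literal_node;

/* Copy len bytes from src (which need not be zero-terminated, e.g. a
 * slice of the parent block's content) into the node.
 * Returns 0 on success, -1 if allocation fails. */
static int literal_set(literal_node *node, const unsigned char *src,
                       size_t len) {
  unsigned char *copy = malloc(len + 1);
  if (copy == NULL)
    return -1;
  if (len > 0)
    memcpy(copy, src, len);
  copy[len] = '\0';
  free(node->data); /* release any previous literal */
  node->data = copy;
  node->len = len;
  return 0;
}

/* Release the literal; the node never points into another node's memory. */
static void literal_free(literal_node *node) {
  free(node->data);
  node->data = NULL;
  node->len = 0;
}
```

Because the node holds its own copy, later changes to (or freeing of) the source buffer do not affect the literal, at the cost of one allocation and copy per literal.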
Fix another place where an "allocated" cmark_chunk was used.
This makes it possible to reduce the size of struct cmark_node later.
Introduce multi-purpose data/len members in struct cmark_node. This is mainly used to store literal text for inlines, code and HTML blocks. Move the content strbuf for blocks from cmark_node to cmark_parser. When finalizing nodes that allow inlines (paragraphs and headings), detach the strbuf and store the block content in the node's data/len members. Free the block content after processing inlines. Reduces size of struct cmark_node by 8 bytes.
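The detach step described above, where block content accumulated in a parser-level buffer is handed over to the node's data/len members, might look roughly like this. The types `buf` and `node` and all function names are simplified, hypothetical stand-ins rather than cmark's strbuf or node API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer kept by the parser, not by each node. */
typedef struct {
  unsigned char *ptr;
  size_t size; /* bytes used, excluding the terminator */
  size_t cap;  /* bytes allocated */
} buf;

/* Hypothetical node with multi-purpose data/len members. */
typedef struct {
  unsigned char *data;
  size_t len;
} node;

/* Append n bytes and keep the buffer zero-terminated.
 * Returns 0 on success, -1 if allocation fails. */
static int buf_append(buf *b, const char *s, size_t n) {
  if (b->size + n + 1 > b->cap) {
    size_t cap = (b->size + n + 1) * 2;
    unsigned char *p = realloc(b->ptr, cap);
    if (p == NULL)
      return -1;
    b->ptr = p;
    b->cap = cap;
  }
  memcpy(b->ptr + b->size, s, n);
  b->size += n;
  b->ptr[b->size] = '\0';
  return 0;
}

/* Finalize a block: move ownership of the content into the node and
 * leave the parser's buffer empty for the next block. No copy is made. */
static void node_take_content(node *nd, buf *b) {
  nd->data = b->ptr;
  nd->len = b->size;
  b->ptr = NULL;
  b->size = 0;
  b->cap = 0;
}

/* Free the block content once inlines have been processed. */
static void node_free_content(node *nd) {
  free(nd->data);
  nd->data = NULL;
  nd->len = 0;
}
```

Keeping a single buffer in the parser means each node only needs a pointer and a length, which is consistent with the struct shrinking as the commit message describes.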
42ff47c to 30c3095
I changed the pull request to use
Excellent, thanks.
We're getting a new fuzzing error:
Since it just popped up, I suspect it has to do with these changes.
Should be fixed with #329. If it's OK, could you add me to the OSS-Fuzz auto_ccs?
I'd be happy to add you, but I can't figure out how!
You'd have to submit a pull request to https://github.com/google/oss-fuzz, adding me to
Sure, go ahead!
Approved by @jgm here: commonmark/cmark#326 (comment)
…brackets-overflow Fix bug in fuzz harness
Fixes #309.