-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(parser/renderer): Support inline role assignment #598
Conversation
Codecov Report
@@ Coverage Diff @@
## master #598 +/- ##
==========================================
+ Coverage 86.63% 86.68% +0.05%
==========================================
Files 67 68 +1
Lines 4189 4236 +47
==========================================
+ Hits 3629 3672 +43
- Misses 357 360 +3
- Partials 203 204 +1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Btw, I wrote a bunch of test cases for this in the HTML tree, and added code to make these replacements "safe" (HTML escaped), but I did not add tests to the parser side. Hope that's ok.)
yes, please, let's have tests for the parser as well. You can simply use the same test cases, but instead of verifying the HTML output, you just verify the DraftDocument
, Document
or more simply, the DocumentBlock
(there are custom Gomega matchers for each type).
I found that half of the time, the failures in the renderer tests were due to the grammar, so I would have to add tests at the parser level.
@@ -313,6 +313,65 @@ InlineAttributes <- "[" attrs:(GenericAttribute)* "]" { | |||
return types.NewAttributes(attrs.([]interface{})) | |||
} | |||
|
|||
// Quoted Text attributes (attributes in front of marked up text). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest to keep the explanation of the supported syntax short/concise. The investigation you did on how Asciidoctor works here (and its limitations) should rather be reported on the Asciidoctor ML or the WG ML when it exist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. I've cleaned this up so that its limited to just understanding what it is here.
pkg/parser/parser.peg
Outdated
QuotedTextRoleWord <- [^\]]* { | ||
return strings.SplitN(string(c.text), ",", 2)[0], nil | ||
} | ||
QuotedTextRole <- "[" ![#.] role:QuotedTextRoleWord "]" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can even write it as [^#.]
, where the ^
character to invert the match in the class matcher (see https://godoc.org/github.com/mna/pigeon#hdr-Character_class_matcher). I believe it's slightly faster as the parser does not need to use a "NotRuleExpr" to "unmatch" the [#.]
rule
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the issue is that I'd need to that to the QuotedTextRoleWord ... which I could do. I may test it. At one point the grammar returned the object, and it was awkward to have that in place because of conversion to []interface{} when presented with multiple items.
Having said that, the grammar now probably will support this suggestion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I don't think it will work. because it would then fail to match an empty string. I.e. I wouldn't be able to handle []. (right now we do, which is probably not terrifically useful, but I prefer to have a grammar that handles the edge cases.)
Note that the grammar permits a "#" or a "." in the middle, only not in the beginning. Arguably this is another area where asciidoctor is super inconsistent - as it permits short-hand rules like this for block attribute lists, but not quoted text inline attributes. Go figure.
Anyway, if you'd rather lose that possibility, I can do the move. I doubt that it makes a vast amount of difference in performance, but I've not tested. We're still orders of magnitude faster than the alternatives.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've tried to address this in gdamore#2
Also, as you will see in my PR, the AttrRole
attribute value can be a string
or a []string
, depending on how many occurrences were found in the document. This means that the role renderer is also able to deal with these 2 types of values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, I realized that paragraphs and other delimited blocks can also have multiple roles. That's something we can address in #602. For now, I like the idea of having a string
value when there's a single role on an element, and switching to []string
when there is more than one. WDYT?
If we want to always handle roles in []string
, then I would suggest that we rename the AttrRole="role"
attribute to AttrRoles="roles"
to reflect the cardinality. We could do that in #602.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd be ok with renaming it to Roles. I'd kind of like this to always be a []string for consistency. The renderer code does turn it into a single string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, that will be better for the sake of consistency. I'll do that in #602 then ;)
pkg/parser/parser.peg
Outdated
return strings.SplitN(string(c.text), ",", 2)[0], nil | ||
} | ||
QuotedTextRole <- "[" ![#.] role:QuotedTextRoleWord "]" { | ||
return []interface{}{types.Attributes{ "role": role }}, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that here you could use the types.NewElementRole
func:
return []interface{}{types.Attributes{ "role": role }}, nil | |
return NewElementRole(role) |
pkg/parser/parser.peg
Outdated
// Corollary 3: There can be only one role if it contains either a hash or a dot. | ||
// Corollary 4: Neither the ID nor the role may contain a closing bracket. | ||
// | ||
// Conclusion: Limit oneself to [0-9A-Za-z_-] in roles and IDs for reliable results. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea of limiting ourselves to [0-9A-Za-z_-]+
in roles and IDs, but I can't see it in the grammar rules below :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not in the rules. Rather this is a recommendation for document authors -- the results will vary across implementations when using other characters. I've tried to be as generous as we can reasonably allow.
See Postel's Law.
Done. |
This adds support for both [role] and [.role#id] style attributes on quoted text blocks. Fixes bytesparadise#588
thanks @gdamore 🙌 |
This adds support for both
[role]
and[.role#id]
style attributes on quoted text blocks.
Fixes #588
(Btw, I wrote a bunch of test cases for this in the HTML tree, and added code to make these replacements "safe" (HTML escaped), but I did not add tests to the parser side. Hope that's ok.)