feat(parser/renderer): Support inline role assignment #598

gdamore · 2020-06-13T06:32:57Z

This adds support for both [role] and [.role#id]
style attributes on quoted text blocks.

Fixes #588

(Btw, I wrote a bunch of test cases for this in the HTML tree, and added code to make these replacements "safe" (HTML escaped), but I did not add tests to the parser side. Hope that's ok.)
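
For readers unfamiliar with the "safe" replacement mentioned above, here is a minimal sketch of escaping a role before it lands in an HTML attribute. The `renderSpan` helper and the `<span class="...">` output shape are assumptions made for illustration; they are not taken from this PR's renderer code.

```go
package main

import (
	"fmt"
	"html"
)

// renderSpan is a hypothetical helper (not part of this PR): it wraps text
// in a <span> whose class is the role, escaping both values so that a role
// such as `a"><script>` cannot break out of the attribute.
func renderSpan(role, text string) string {
	return `<span class="` + html.EscapeString(role) + `">` +
		html.EscapeString(text) + `</span>`
}

func main() {
	fmt.Println(renderSpan("lead", "hello"))
	fmt.Println(renderSpan(`a"b`, "1 < 2"))
}
```

`html.EscapeString` from the Go standard library escapes `<`, `>`, `&`, `'` and `"`, which is enough for text and for double-quoted attribute values.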

codecov · 2020-06-13T07:48:12Z

Codecov Report

Merging #598 into master will increase coverage by 0.05%.
The diff coverage is 90.56%.

@@            Coverage Diff             @@
##           master     #598      +/-   ##
==========================================
+ Coverage   86.63%   86.68%   +0.05%     
==========================================
  Files          67       68       +1     
  Lines        4189     4236      +47     
==========================================
+ Hits         3629     3672      +43     
- Misses        357      360       +3     
- Partials      203      204       +1

xcoulon

> (Btw, I wrote a bunch of test cases for this in the HTML tree, and added code to make these replacements "safe" (HTML escaped), but I did not add tests to the parser side. Hope that's ok.)

Yes, please, let's have tests for the parser as well. You can simply reuse the same test cases, but instead of verifying the HTML output, you verify the DraftDocument, the Document or, more simply, the DocumentBlock (there are custom Gomega matchers for each type).
I found that half of the time, the failures in the renderer tests were caused by the grammar, so I would have to add tests at the parser level anyway.

xcoulon · 2020-06-13T07:47:39Z

pkg/parser/parser.peg

@@ -313,6 +313,65 @@ InlineAttributes <- "[" attrs:(GenericAttribute)* "]" {
    return types.NewAttributes(attrs.([]interface{}))								
 } 

+// Quoted Text attributes (attributes in front of marked up text).


I would suggest keeping the explanation of the supported syntax short and concise. The investigation you did on how Asciidoctor works here (and its limitations) should rather be reported on the Asciidoctor ML or the WG ML when it exists.

Agreed. I've cleaned this up so that it's limited to explaining what is supported here.

xcoulon · 2020-06-13T07:50:12Z

pkg/parser/parser.peg

+QuotedTextRoleWord <- [^\]]* {
+    return strings.SplitN(string(c.text), ",", 2)[0], nil
+}
+QuotedTextRole <- "[" ![#.] role:QuotedTextRoleWord "]" {


You can even write it as [^#.], where the ^ character inverts the match in the class matcher (see https://godoc.org/github.com/mna/pigeon#hdr-Character_class_matcher). I believe it's slightly faster, as the parser does not need to use a "NotRuleExpr" to "unmatch" the [#.] rule.

So the issue is that I'd need to add that to the QuotedTextRoleWord ... which I could do. I may test it. At one point the grammar returned the object, and it was awkward to have that in place because of the conversion to []interface{} when presented with multiple items.

Having said that, the grammar will now probably support this suggestion.

Ah, I don't think it will work, because it would then fail to match an empty string, i.e. I wouldn't be able to handle []. (Right now we do, which is probably not terrifically useful, but I prefer to have a grammar that handles the edge cases.)

Note that the grammar permits a "#" or a "." in the middle, only not in the beginning. Arguably this is another area where asciidoctor is super inconsistent - as it permits short-hand rules like this for block attribute lists, but not quoted text inline attributes. Go figure.

Anyway, if you'd rather lose that possibility, I can do the move. I doubt that it makes a vast amount of difference in performance, but I've not tested. We're still orders of magnitude faster than the alternatives.
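
To make the difference concrete, here is a minimal hand-written sketch of what the QuotedTextRole rule accepts. The function `parseQuotedTextRole` is hypothetical and only imitates the rule shown in the diff; it is not the code pigeon generates. It shows why the zero-width `![#.]` check still lets `[]` through, whereas a leading `[^#.]` would have to consume a character.

```go
package main

import (
	"fmt"
	"strings"
)

// parseQuotedTextRole is a hypothetical stand-in for the rule
//   "[" ![#.] role:QuotedTextRoleWord "]"
// It returns the role and whether the input matched.
func parseQuotedTextRole(s string) (string, bool) {
	if !strings.HasPrefix(s, "[") {
		return "", false
	}
	body := s[1:]
	// ![#.] consumes nothing: it only rejects a leading '#' or '.',
	// so it also succeeds when the very next character is ']'.
	if strings.HasPrefix(body, "#") || strings.HasPrefix(body, ".") {
		return "", false
	}
	// QuotedTextRoleWord is [^\]]*: everything up to the first ']'.
	end := strings.IndexByte(body, ']')
	if end < 0 {
		return "", false
	}
	// Same action as the rule: keep only the text before the first comma.
	return strings.SplitN(body[:end], ",", 2)[0], true
}

func main() {
	for _, in := range []string{"[role]", "[]", "[a.b#c]", "[.role#id]", "[one,two]"} {
		role, ok := parseQuotedTextRole(in)
		fmt.Printf("%-12q role=%q matched=%v\n", in, role, ok)
	}
}
```

In this sketch `[]` matches with an empty role and `[a.b#c]` matches because `#` and `.` are only rejected in the first position, while `[.role#id]` is left for the shorthand rule.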

I've tried to address this in gdamore#2
Also, as you will see in my PR, the AttrRole attribute value can be a string or a []string, depending on how many occurrences were found in the document. This means that the role renderer is also able to deal with these 2 types of values.

By the way, I realized that paragraphs and other delimited blocks can also have multiple roles. That's something we can address in #602. For now, I like the idea of having a string value when there's a single role on an element, and switching to []string when there is more than one. WDYT?

If we want to always handle roles in []string, then I would suggest that we rename the AttrRole="role" attribute to AttrRoles="roles" to reflect the cardinality. We could do that in #602.

I'd be ok with renaming it to Roles. I'd kind of like this to always be a []string for consistency. The renderer code does turn it into a single string.

yes, that will be better for the sake of consistency. I'll do that in #602 then ;)
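
As an illustration of the string versus []string question discussed in this thread, here is a minimal sketch of a helper that accepts either shape and always hands the renderer a []string. The names `normalizeRoles` and `classAttr` are hypothetical and do not come from the library.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeRoles is a hypothetical helper: it accepts the two value shapes
// discussed above (a single string or a []string) and returns a []string.
func normalizeRoles(v interface{}) []string {
	switch r := v.(type) {
	case string:
		return []string{r}
	case []string:
		return r
	default:
		// Unknown or missing value: no roles.
		return nil
	}
}

// classAttr joins the roles into one space-separated string, the way a
// renderer might build an HTML class attribute value.
func classAttr(v interface{}) string {
	return strings.Join(normalizeRoles(v), " ")
}

func main() {
	fmt.Println(classAttr("lead"))
	fmt.Println(classAttr([]string{"lead", "big"}))
}
```

Always storing a []string, as agreed above, would make the type switch unnecessary and leave only the join step in the renderer.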

xcoulon · 2020-06-13T10:16:02Z

pkg/parser/parser.peg

+    return strings.SplitN(string(c.text), ",", 2)[0], nil
+}
+QuotedTextRole <- "[" ![#.] role:QuotedTextRoleWord "]" {
+    return []interface{}{types.Attributes{ "role": role }}, nil


I believe that here you could use the types.NewElementRole func:

Suggested change

- return []interface{}{types.Attributes{ "role": role }}, nil
+ return NewElementRole(role)

xcoulon · 2020-06-13T12:05:47Z

pkg/parser/parser.peg

+// Corollary 3: There can be only one role if it contains either a hash or a dot.
+// Corollary 4: Neither the ID nor the role may contain a closing bracket.
+//
+// Conclusion: Limit oneself to [0-9A-Za-z_-] in roles and IDs for reliable results.


I like the idea of limiting ourselves to [0-9A-Za-z_-]+ in roles and IDs, but I can't see it in the grammar rules below :(

It's not in the rules. Rather this is a recommendation for document authors -- the results will vary across implementations when using other characters. I've tried to be as generous as we can reasonably allow.

See Postel's Law.
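
For document authors who want to follow that recommendation, a minimal sketch of a check for the conservative character set follows. The `isPortableName` helper is hypothetical; the grammar in this PR deliberately does not enforce it.

```go
package main

import (
	"fmt"
	"regexp"
)

// portableName matches the conservative character set recommended above
// for roles and IDs. This is an author-side lint, not a grammar rule.
var portableName = regexp.MustCompile(`^[0-9A-Za-z_-]+$`)

// isPortableName is a hypothetical helper that reports whether a role or
// ID sticks to [0-9A-Za-z_-].
func isPortableName(s string) bool {
	return portableName.MatchString(s)
}

func main() {
	for _, name := range []string{"lead", "my-role_2", "a.b", "x#y", ""} {
		fmt.Printf("%q portable=%v\n", name, isPortableName(name))
	}
}
```

Names that fail this check may still parse here; they are simply more likely to render differently across implementations.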

gdamore · 2020-06-13T17:59:58Z

> > (Btw, I wrote a bunch of test cases for this in the HTML tree, and added code to make these replacements "safe" (HTML escaped), but I did not add tests to the parser side. Hope that's ok.)

> Yes, please, let's have tests for the parser as well. You can simply reuse the same test cases, but instead of verifying the HTML output, you verify the DraftDocument, the Document or, more simply, the DocumentBlock (there are custom Gomega matchers for each type).
> I found that half of the time, the failures in the renderer tests were caused by the grammar, so I would have to add tests at the parser level anyway.

Done.


xcoulon · 2020-06-14T15:02:53Z

thanks @gdamore 🙌

xcoulon reviewed Jun 13, 2020


feat(parser/renderer): Support inline role assignment

094bef3

This adds support for both [role] and [.role#id] style attributes on quoted text blocks. Fixes bytesparadise#588

gdamore force-pushed the qtinlinerb branch from 17c2ec1 to 094bef3 Compare June 14, 2020 14:56

xcoulon merged commit 4ab8453 into bytesparadise:master Jun 14, 2020
