Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(parser/renderer): Support inline role assignment #598

Merged
merged 1 commit into from
Jun 14, 2020

Conversation

gdamore
Copy link
Collaborator

@gdamore gdamore commented Jun 13, 2020

This adds support for both [role] and [.role#id]
style attributes on quoted text blocks.

Fixes #588

(Btw, I wrote a bunch of test cases for this in the HTML tree, and added code to make these replacements "safe" (HTML escaped), but I did not add tests to the parser side. Hope that's ok.)

@codecov
Copy link

codecov bot commented Jun 13, 2020

Codecov Report

Merging #598 into master will increase coverage by 0.05%.
The diff coverage is 90.56%.

@@            Coverage Diff             @@
##           master     #598      +/-   ##
==========================================
+ Coverage   86.63%   86.68%   +0.05%     
==========================================
  Files          67       68       +1     
  Lines        4189     4236      +47     
==========================================
+ Hits         3629     3672      +43     
- Misses        357      360       +3     
- Partials      203      204       +1     

Copy link
Member

@xcoulon xcoulon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Btw, I wrote a bunch of test cases for this in the HTML tree, and added code to make these replacements "safe" (HTML escaped), but I did not add tests to the parser side. Hope that's ok.)

yes, please, let's have tests for the parser as well. You can simply use the same test cases, but instead of verifying the HTML output, you just verify the DraftDocument, Document or more simply, the DocumentBlock (there are custom Gomega matchers for each type).
I found that half of the time, the failures in the renderer tests were due to the grammar, so I would have to add tests at the parser level.

@@ -313,6 +313,65 @@ InlineAttributes <- "[" attrs:(GenericAttribute)* "]" {
return types.NewAttributes(attrs.([]interface{}))
}

// Quoted Text attributes (attributes in front of marked up text).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest to keep the explanation of the supported syntax short/concise. The investigation you did on how Asciidoctor works here (and its limitations) should rather be reported on the Asciidoctor ML or the WG ML when it exist.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I've cleaned this up so that its limited to just understanding what it is here.

QuotedTextRoleWord <- [^\]]* {
return strings.SplitN(string(c.text), ",", 2)[0], nil
}
QuotedTextRole <- "[" ![#.] role:QuotedTextRoleWord "]" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can even write it as [^#.], where the ^ character to invert the match in the class matcher (see https://godoc.org/github.com/mna/pigeon#hdr-Character_class_matcher). I believe it's slightly faster as the parser does not need to use a "NotRuleExpr" to "unmatch" the [#.] rule

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the issue is that I'd need to that to the QuotedTextRoleWord ... which I could do. I may test it. At one point the grammar returned the object, and it was awkward to have that in place because of conversion to []interface{} when presented with multiple items.

Having said that, the grammar now probably will support this suggestion.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I don't think it will work. because it would then fail to match an empty string. I.e. I wouldn't be able to handle []. (right now we do, which is probably not terrifically useful, but I prefer to have a grammar that handles the edge cases.)

Note that the grammar permits a "#" or a "." in the middle, only not in the beginning. Arguably this is another area where asciidoctor is super inconsistent - as it permits short-hand rules like this for block attribute lists, but not quoted text inline attributes. Go figure.

Anyway, if you'd rather lose that possibility, I can do the move. I doubt that it makes a vast amount of difference in performance, but I've not tested. We're still orders of magnitude faster than the alternatives.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tried to address this in gdamore#2
Also, as you will see in my PR, the AttrRole attribute value can be a string or a []string, depending on how many occurrences were found in the document. This means that the role renderer is also able to deal with these 2 types of values.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, I realized that paragraphs and other delimited blocks can also have multiple roles. That's something we can address in #602. For now, I like the idea of having a string value when there's a single role on an element, and switching to []string when there is more than one. WDYT?

If we want to always handle roles in []string, then I would suggest that we rename the AttrRole="role" attribute to AttrRoles="roles" to reflect the cardinality. We could do that in #602.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be ok with renaming it to Roles. I'd kind of like this to always be a []string for consistency. The renderer code does turn it into a single string.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, that will be better for the sake of consistency. I'll do that in #602 then ;)

return strings.SplitN(string(c.text), ",", 2)[0], nil
}
QuotedTextRole <- "[" ![#.] role:QuotedTextRoleWord "]" {
return []interface{}{types.Attributes{ "role": role }}, nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that here you could use the types.NewElementRole func:

Suggested change
return []interface{}{types.Attributes{ "role": role }}, nil
return NewElementRole(role)

// Corollary 3: There can be only one role if it contains either a hash or a dot.
// Corollary 4: Neither the ID nor the role may contain a closing bracket.
//
// Conclusion: Limit oneself to [0-9A-Za-z_-] in roles and IDs for reliable results.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of limiting ourselves to [0-9A-Za-z_-]+ in roles and IDs, but I can't see it in the grammar rules below :(

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not in the rules. Rather this is a recommendation for document authors -- the results will vary across implementations when using other characters. I've tried to be as generous as we can reasonably allow.

See Postel's Law.

@gdamore
Copy link
Collaborator Author

gdamore commented Jun 13, 2020

(Btw, I wrote a bunch of test cases for this in the HTML tree, and added code to make these replacements "safe" (HTML escaped), but I did not add tests to the parser side. Hope that's ok.)

yes, please, let's have tests for the parser as well. You can simply use the same test cases, but instead of verifying the HTML output, you just verify the DraftDocument, Document or more simply, the DocumentBlock (there are custom Gomega matchers for each type).
I found that half of the time, the failures in the renderer tests were due to the grammar, so I would have to add tests at the parser level.

Done.

This adds support for both [role] and [.role#id]
style attributes on quoted text blocks.

Fixes bytesparadise#588
@xcoulon xcoulon merged commit 4ab8453 into bytesparadise:master Jun 14, 2020
@xcoulon
Copy link
Member

xcoulon commented Jun 14, 2020

thanks @gdamore 🙌

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support inline role assignment
2 participants