Parse legislative list as part of the main conversion
Kramdown handles abbreviations and footnotes natively. Call to action and legislative list are custom Govspeak components built as extensions to Kramdown. These sections are parsed separately in preprocessing and are ignored in any subsequent parsing, because Kramdown ignores HTML. Because of this, we added custom handling of abbreviations and footnotes specifically for these components [1]. With this custom implementation, abbreviations defined anywhere in the document are applied to call to action and legislative list components; without it, only abbreviations defined within these components would be applied.

Several Zendesk tickets have shown that this custom implementation contains bugs. The main issues reported are:

1. Acronyms are inserted into the produced HTML even where undesirable. For example, a link `<email@example.com>` with the acronym `*[email]: Electronic mail is a method of transmitting and receiving messages` would produce `<p>href="mailto:<abbr title="Electronic mail is a method of transmitting and receiving messages">email</abbr>@example.com"<abbr title="Electronic mail is a method of transmitting and receiving messages">email</abbr>@example.com</p>` rather than the expected link.

2. Because the `add_acronym_alt_text` method runs through the text once for each acronym, acronyms can be inserted into other acronyms, creating invalid HTML.

If we change tack and instead allow Kramdown to parse these components as part of the main conversion to HTML (instead of in preprocessing), Kramdown will handle abbreviations and footnotes correctly, fixing both of these issues (as well as some other undocumented issues).

The legislative list component currently works by overriding Kramdown's `list` extension [2] and disabling ordered lists [3].
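
To illustrate the second issue listed above (acronyms inserted into other acronyms), here is a minimal sketch of a per-acronym pass over already-generated HTML. The method name `naive_add_acronyms` and the sample acronyms are hypothetical; this is a simplified stand-in, not the actual `add_acronym_alt_text` code.

```ruby
# Hypothetical, simplified stand-in for a per-acronym replacement pass.
# It is NOT the real Govspeak implementation; it only shows the failure mode.
def naive_add_acronyms(html, acronyms)
  acronyms.each do |term, title|
    pattern = /\b#{Regexp.escape(term)}\b/
    # Each pass rewrites the whole string, including markup (and title
    # attributes) that earlier passes have already inserted.
    html = html.gsub(pattern) { %(<abbr title="#{title}">#{term}</abbr>) }
  end
  html
end

# Sample data, assumed for illustration only.
acronyms = {
  "email" => "Electronic mail",
  "mail" => "Post",
}

result = naive_add_acronyms("<p>Send an email</p>", acronyms)
puts result
```

In this sketch the second pass matches the word "mail" inside the `title` attribute written by the first pass, so an `<abbr>` element ends up nested inside an attribute value, which is invalid HTML.
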
So that we can parse this component as part of the main conversion, this applies two options: `parse_block_html` [4], so that markdown inside the `legislative-list-wrapper` `div` is parsed by Kramdown, and a new custom option, `ordered_lists_disabled`, so that we can control flow inside our `Parser::Govuk` class (a subclass of the Kramdown parser). It is possible to override `parse_list` instead of `parse_block_html`, but this means that the outermost list will not be parsed (by the time `parse_list` is called, the outermost list is already being parsed).

The only difference between this new iteration of the legislative list and the previous one is that we now wrap the component in a `legislative-list-wrapper` `div`. We could remove this in postprocessing, but that would require additional modification of the Kramdown-produced HTML, which seems unnecessary.

This also allows us to remove all remaining code which replicates the footnote and acronym functions of Kramdown. Some of the tests can likely be removed, as we are now testing the functionality of the library; for now, however, they demonstrate that there are no regressions.

[1]: #285
[2]: https://github.com/gettalong/kramdown/blob/bd678ecb59f70778fdb3b08bdcd39e2ab7379b45/lib/kramdown/parser/kramdown/list.rb#L54
[3]: #25
[4]: https://kramdown.gettalong.org/quickref.html#html-elements