Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#38 list continuations #126

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

xyz65535
Copy link
Contributor

@xyz65535 xyz65535 commented Sep 5, 2024

#38 list continuations

  • ListItem: added attaching block paragraph/block/admonition
  • List: added multiple tests in various combinations of the above e.g. nested lists, attached paragraph, attached admonition, two attached paragraphs, nested list with attached paragraph, nested list with attached multiline paragraph, attached paragraph and nested list

Important improvement: code in Coradoc::Parser::Asciidoc::Base that converts grammar rules written as ruby methods into proper parslet rules, which as it turned out was not the case previously. On top of that, it goes beyond what parslet is capable by also supporting arguments. Due to this improvement time for running all the tests came down (on tested hardware, approximately) from 50 seconds to 10 seconds (5x speedup) and time for running utils/round_trip.rb came down from 2.5 minutes to 3 seconds (50x speedup), which is also realistic input. This improvement makes it possible to iterate on grammar rules much faster, develop parsing inlines and potentially parsing input from various sources, for example test cases from asciidoc, in much more reasonable time.

  • Coradoc::Parser::Asciidoc::Base was changed from a module to a class, rules previously contained there were moved to module Coradoc::Parser::Asciidoc::Text

Other improvements:

  • Blocks: added missing block types: listing, open. added missing properties in literal block. (based on things noticed with utils/round_trip.rb)
  • Grammar rules to be used inside of other rules e.g. line_start?, line_not_text?. Purpose of those rules is making sure rules using them are applied by parser only in places they should be applied.
  • Tests for blocks, paragraph, table, two parsing bugs to be fixed
  • Minor fixes in tests

Metanorma PR checklist

Copy link

codecov bot commented Sep 5, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.02%. Comparing base (5904502) to head (ec037ce).
Report is 2 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #126   +/-   ##
=======================================
  Coverage   97.02%   97.02%           
=======================================
  Files           3        3           
  Lines         168      168           
=======================================
  Hits          163      163           
  Misses          5        5           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@ronaldtse ronaldtse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be merged or is it still WIP?

- ListItem: added attaching block paragraph/block/admonition
- List: added multiple tests in various combinations of the above e.g. nested lists, attached paragraph, attached admonition, two attached paragraphs, nested list with attached paragraph, nested list with attached multiline paragraph, attached paragraph and nested list

Important improvement: code in Coradoc::Parser::Asciidoc::Base that converts grammar rules written as ruby methods into proper parslet rules, which as it turned out was not the case previously. On top of that, it goes beyond what parslet is capable by also supporting arguments. Due to this improvement time for running all the tests came down (on tested hardware, approximately) from 50 seconds to 10 seconds (5x speedup) and time for running utils/round_trip.rb came down from 2.5 minutes to 3 seconds (50x speedup), which is also realistic input. This improvement makes it possible to iterate on grammar rules much faster, develop parsing inlines and potentially parsing input from various sources, for example test cases from asciidoc, in much more reasonable time.
- Coradoc::Parser::Asciidoc::Base was changed from a module to a class, rules previously contained there were moved to module Coradoc::Parser::Asciidoc::Text

Other improvements:
- Blocks: added missing block types: listing, open. added missing properties in literal block. (based on things noticed with utils/round_trip.rb)
- Grammar rules to be used inside of other rules e.g. line_start?, line_not_text?. Purpose of those rules is making sure rules using them are applied by parser only in places they should be applied.
- Tests for blocks, paragraph, table, two parsing bugs to be fixed
- Minor fixes in tests
@xyz65535 xyz65535 changed the title WIP #38 list continuations #38 list continuations Nov 5, 2024
@hmdne hmdne linked an issue Nov 5, 2024 that may be closed by this pull request
table = ast.first[:table]


obj = {:table=>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this not a Ruby object tree? There should NOT be any hashes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Coradoc does:

Coradoc text -> AST -> Ruby object tree

In particular, those tests are for the first step. And AST is represented with Ruby hashes.

@@ -30,7 +30,7 @@
generated_adoc = Coradoc::Generator.gen_adoc(doc)
cleaned_adoc = Coradoc::Input::HTML.cleaner.tidy(generated_adoc)
File.open("#{file_path}.roundtrip","w"){|f| f.write(cleaned_adoc)}
`diff -B #{file_path} #{file_path}.roundtrip > #{file_path}.roundtrip.diff`
`diff -BNaur #{file_path} #{file_path}.roundtrip > #{file_path}.roundtrip.diff`
Copy link
Contributor

@ronaldtse ronaldtse Nov 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better write a separate RSpec matcher that compares 2 Coradoc Document trees in addition to two string compares?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is just an utility at this point. It would make sense to make some roundtripping tests, but at the time we are not at 100% coverage.

@ReesePlews
Copy link

is this revision related to this issue #139 ?

@webdev778
Copy link
Contributor

is this revision related to this issue #139 ?

#38

@ReesePlews
Copy link

thanks @webdev778 if you have time, #139 also describes lists that are being truncated, it could be similar. please discuss with @ronaldtse

@hmdne
Copy link
Contributor

hmdne commented Nov 6, 2024

@ReesePlews This PR is for AsciiDoc parsing. You probably mean HTML conversion to AsciiDoc. So this is unrelated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ListItem formatting needs to deal with HardBreaks
5 participants