Refactor directive and role parsing #181

Open
fwkoch opened this issue Feb 3, 2023 · 5 comments
Labels
enhancement New feature or request

Comments

@fwkoch
Collaborator

fwkoch commented Feb 3, 2023

Context:
Over the past year we have been running into limitations of the early design around the markdown-it tokenizer. Many things in that library are now either duplicated or deprecated (e.g. references and state management). It is also difficult to report errors to the CLI, Jupyter, or web contexts in a consistent way: some errors cannot be reported until much later (i.e. after transformations), and some need to be handled much more leniently (e.g. parser errors). We also need to start thinking about adding MyST language extensions (e.g. tabs, diagrams) in a way that works across all serializers and is agnostic to the tokenizer being used (e.g. unified or markdown-it).

Current state:

  • markdown-it-docutils - markdown-it plugin to handle tokenizing roles and directives. In addition to generic roles/directives, it introduces special token stream behaviour for admonitions, code, math, etc.
  • markdown-it-myst-extras - markdown-it parsing support for additional markdown syntax features including colon fence, block breaks, targets, comments
  • mystjs - creates a parser with the markdown-it-docutils plugin, with extensibility to add more markdown-it tokenizers for new directives/roles; defines tokens-to-myst, including the special roles/directives from markdown-it-docutils; defines basic transforms; includes myst-to-hast
  • myst-cli - consumes and extends mystjs, makes it usable from CLI, defines a bunch of new directives/roles (in a markdown-it, token-y way, i.e. has to define parsing to token stream then transforming to mdast)
  • myst-transforms - a bunch of additional mdast transforms consumed by myst-cli
  • myst-ext-card/grid/tabs - directives previously defined in myst-cli, pulled into separate subpackages

Desired state:

  • markdown-it-docutils - continues to exist as-is to support existing vscode integration, unused in mystjs
  • markdown-it-myst-extras - unchanged, continues to be used by mystjs as-is
  • markdown-it-myst - pulls out basic role/directive tokenizing from markdown-it-docutils. There are no custom tokens for any specific roles/directives; instead, they are all just mystRole/Directive with args, options, value and parsed_args/options/value
  • mystjs - no longer exists in its previous capacity. This lets us potentially rename myst-cli to mystjs?
  • myst-parser - this does what mystjs used to do for converting the markdown-it token stream to mdast. However, it moves the plugin functionality for new directives to come after that conversion. That is, all directives/roles become some sort of rawMystDirective node with mystDirectiveArgs, mystDirectiveOptions, etc. children, and new directives/roles are then just transforms of these nodes. We do not want to have to make any decisions about parsing, nor ever have new directives/roles touch the token stream.
  • myst-directives / myst-roles - home for the core directives and roles currently defined in markdown-it-docutils. These will look very different since they are dealing in mdast transforms, not markdown-it tokenizers. These will come into myst-parser as defaults.
  • myst-to-html - stashes mdast-to-hast stuff from mystjs
  • myst-ext-* - structured definitions of new roles/directives, info about arguments, options defined as data, and functions for "validate" and "transform." Eventually this will be a place for additional, directive/role-specific myst-to-* functionality for all the new node types that are created by "transform."
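
As a rough illustration of the desired state above, here is a minimal sketch of a directive defined as data plus "validate" and "transform" functions that operate on generic mdast-like nodes rather than on the token stream. Everything named here (`GenericNode`, `RawDirective`, `DirectiveSpec`, `admonitionSpec`, `applyDirectives`, and the flattened `args`/`options`/`value` fields) is hypothetical and chosen for illustration; it is not the published myst-parser or myst-ext-* API.

```typescript
// Hypothetical node shapes; the real packages may structure these differently.
interface GenericNode {
  type: string;
  value?: string;
  children?: GenericNode[];
  [key: string]: unknown;
}

// A generic, not-yet-interpreted directive as it might come out of the parser.
interface RawDirective extends GenericNode {
  type: 'mystDirective';
  name: string;
  args?: string;
  options?: Record<string, string>;
}

// A directive described as data (name, allowed options) plus two functions.
interface DirectiveSpec {
  name: string;
  options: Record<string, { required?: boolean }>;
  validate(node: RawDirective): string[]; // returns error messages, empty if valid
  transform(node: RawDirective): GenericNode[];
}

const admonitionOptions: Record<string, { required?: boolean }> = { class: {} };

const admonitionSpec: DirectiveSpec = {
  name: 'admonition',
  options: admonitionOptions,
  validate(node) {
    const errors: string[] = [];
    if (!node.args) errors.push('admonition requires a title argument');
    for (const key of Object.keys(node.options ?? {})) {
      if (!(key in admonitionOptions)) errors.push(`unknown option "${key}"`);
    }
    return errors;
  },
  transform(node) {
    return [
      {
        type: 'admonition',
        class: node.options?.class,
        children: [
          { type: 'admonitionTitle', children: [{ type: 'text', value: node.args ?? '' }] },
          { type: 'paragraph', children: [{ type: 'text', value: node.value ?? '' }] },
        ],
      },
    ];
  },
};

// Walk the tree and replace raw directives using the registered specs.
// Errors are collected instead of thrown, and invalid nodes are left in place.
function applyDirectives(tree: GenericNode, specs: DirectiveSpec[], errors: string[]): GenericNode {
  if (!tree.children) return tree;
  const children: GenericNode[] = [];
  for (const child of tree.children) {
    if (child.type !== 'mystDirective') {
      children.push(applyDirectives(child, specs, errors));
      continue;
    }
    const directive = child as RawDirective;
    const spec = specs.find((s) => s.name === directive.name);
    const problems = spec ? spec.validate(directive) : [`unknown directive "${directive.name}"`];
    if (!spec || problems.length > 0) {
      errors.push(...problems);
      children.push(directive);
      continue;
    }
    children.push(...spec.transform(directive));
  }
  return { ...tree, children };
}
```

Because the specs only ever see nodes, the same definition would work regardless of which tokenizer produced the tree, and collected errors can be reported later by whichever context (CLI, Jupyter, web) is running the transform.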
@rowanc1
Member

rowanc1 commented Feb 16, 2023

This has largely been completed in #184. To close this issue, I think we should:

  • add/update the READMEs of myst-parser with the above information
  • update anything affected by bringing this into other contexts, like JupyterLab and the theme demo.

@rowanc1
Member

rowanc1 commented Feb 18, 2023

This has fully landed with myst v0.1.15. 🚀

@tavin
Contributor

tavin commented May 9, 2023

If you have to work with markdown-it and markdown-it-myst, how do you cause directives (e.g. admonitions) to actually be rendered?

@rowanc1
Member

rowanc1 commented May 13, 2023

I think the options for now are: (1) sticking with markdown-it-docutils; (2) introducing a tokenizer transformer on top of the markdown-it-myst layer that modifies the token stream back to an HTML-focused export for use inside of markdown-it; or (3) if you are in control of the render process (you might not be, depending on the use case), using something like myst-to-html after you get an AST out.

I think the best path is probably (2), but it is also probably a decent amount of work. Sticking with (1) should be mostly fine: there haven't been substantial changes at that level, mostly just allowing errors to propagate to the CLI and changing/simplifying the extension mechanism.
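
To give a feel for what option (2) might involve, here is a minimal sketch of a function that rewrites generic directive tokens into `html_block` tokens, which markdown-it's default renderer emits verbatim. The `TokenLike` shape, the `'directive'` token type, and the `info`/`meta.arg` fields are assumptions made for illustration; the actual token types produced by markdown-it-myst are not shown in this thread and would need to be checked. In a real setup, a function like this could be registered as a markdown-it core rule via `md.core.ruler.push` and applied to `state.tokens`.

```typescript
// Minimal stand-in for a markdown-it token; only the fields used in this sketch.
interface TokenLike {
  type: string;
  content: string;
  info?: string; // assumption: holds the directive name
  meta?: { arg?: string }; // assumption: holds the directive argument
}

// Escape text before embedding it in raw HTML output.
function escapeHtml(text: string): string {
  return text.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

// Replace hypothetical generic admonition tokens with html_block tokens.
// All other tokens pass through unchanged.
function directivesToHtml(tokens: TokenLike[]): TokenLike[] {
  return tokens.map((token) => {
    if (token.type !== 'directive' || token.info !== 'admonition') return token;
    const title = escapeHtml(token.meta?.arg ?? '');
    const body = escapeHtml(token.content);
    return {
      type: 'html_block',
      content:
        `<aside class="admonition"><p class="admonition-title">${title}</p>` +
        `<p>${body}</p></aside>\n`,
    };
  });
}
```

This only covers one directive with plain-text content; handling nested markdown in the body and every core directive is where the "decent amount of work" would come in.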

@tavin
Contributor

tavin commented May 26, 2023

Sticking with (1) is already infeasible due to obsolescence :)
