Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for serializing liquid templates along with their VM code #77

Open
dylanahsmith opened this issue Oct 8, 2020 · 3 comments

Comments

@dylanahsmith
Copy link
Contributor

dylanahsmith commented Oct 8, 2020

Once #59 and #60 are merged, the liquid benchmarks show that 60% of the time to parse and render a template happens during parsing. We could save a lot of that time by persisting (e.g. to a cache initially) the parsed template so that it can be re-used for further renders.

I think we should have the persisted format be fairly stable, such that it can be used across deploys of the application. As such, I don't think we should just use Marshal on the liquid gem's parse tree, since that can easily lead to breaking changes from code written without awareness of liquid-c. Instead, we should re-parse tag nodes on deserialization if that don't support liquid VM compilation. For code that is compiled into liquid VM code, we should include two versions in the serialized VM code, one for the liquid-c and one for the application, so either can bump these versions to avoid rendering incompatible VM code.

So the serialized block bodies will consist of the following sections:

  1. liquid VM instructions that don't require deserialization for execution
  2. An array of marshal dumped ruby constants and will require marshal loading on deserialization
  3. Uncompiled tags to parse on deserialization, including the type of tag (e.g. regular, block or raw) and references to any block bodies and "unknown tags" (i.e. handled from Liquid::Block#unknown_tag, such as elsif for Liquid::If)

This will require combining the VM assembler buffers for these block bodies together and removing the assumption that each block body will own its own assembler buffers. Instead, the block bodies should instead reference offsets into the shared buffer for execution. This could be done before adding serialization support (#78).

To keep the ruby constant array simple, we will want to compile the non-ruby constant arguments into the VM instructions as immediate arguments. Currently this isn't the case for OP_RENDER_VARIABLE_RESCUE and OP_WRITE_RAW. Note that doing this for OP_WRITE_RAW will add copying overhead that won't be beneficial without using it for serialization.

Another possible application version compatibility concern is the Liquid::ParseContext, which is an argument for parsing, which can end up being saved and used for rendering (e.g. for parsing included templates). To check for compatibility, we may want to require the application to pass in the same parse options for both parsing and deserializing to consider the serialized template to be compatible with the current application version. We may also want to check the parse context after parsing to make sure the application doesn't depend on mutations of the parse context.

@tobi
Copy link
Member

tobi commented Oct 21, 2020

Is ParseContext context used in Shopify? I think its important that we treat this project as the future of Liquid. It's perfectly OK for us to deprecate a bunch of liquid features that make the work of Liquid-c harder and issue a major release for liquid to go with that.

@dylanahsmith
Copy link
Contributor Author

It looks like we needlessly wrap the parse options in a Liquid::ParseContext, instead of passing a parse options hash directly to Liquid::Template.parse. So we could only support passing in a parse options hash.

We may also want to check the parse context after parsing to make sure the application doesn't depend on mutations of the parse context.

Liquid::ParseContext has a [] method but not a []= method, so I think the intention is already clear that the parse options should be immutable. We even dup the options hash, so mutating the parse options hash passed in won't effect the parse options. We could deeply dup and freeze the options hash if we want to prevent mutable option values.

To check for compatibility, we may want to require the application to pass in the same parse options for both parsing and deserializing to consider the serialized template to be compatible with the current application version.

Actually, I think we could avoid serializing the parse context and just require the application to pass the parse context on deserialization without a compatibility check. Although the parse context can get stored and re-used for rendering (e.g. https://github.com/Shopify/liquid/blob/001fde7694b52cb8a577c3294d0d6152522477fe/lib/liquid/tag/disableable.rb#L16), that is happening on liquid tags objects. So parsing uncompiled tags on deserialization should take care of this concern.

@radixdev
Copy link

Any update here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants