Refactor template parser #3
Merged
The previous implementation re-implemented a full XML parser. This version uses StAX as the XML parser and adds parsing of the characters events to extract JTwig code islands.
This is a big job:
* added more processor utilities
* converted optional behaviour configured in config to processors
* refactored compiler to separate getting the vars and rendering
* created the var and image adjuster processors and removed the corresponding code from the compiler
* added factories
Still a few TODOs in the code before it can run. Still more tests to write.
emeka force-pushed the refactor-template-parser branch from 285fe7d to 77046ae on March 2, 2020 09:23
Added unit tests to cover Config.java. Also removed a couple of unused methods and made attributes private, as they should be
Added unit test for FileResult. Testing the other cases would require a refactoring, which I'll leave for now
Gradle wrapper has been updated from 2.13 to 4.8.1, as IntelliJ doesn't support old versions
Cleaned up the Code class a bit: fixed bugs, removed dead code, made attributes private
…Test TemplateCompilerIntegrationTest was in fact testing LibreOffice pdf conversion which is beyond its scope.
…ment-service into refactor-template-parser
ianaz suggested changes on Mar 16, 2020
src/main/java/com/proxeus/document/odt/ODTManifestProcessor.java (outdated)
src/main/java/com/proxeus/document/odt/ODTManifestProcessor.java (outdated)
ianaz reviewed on Mar 16, 2020
ianaz approved these changes on Mar 19, 2020
src/main/java/com/proxeus/xml/template/DefaultTemplateHandler.java (outdated)
alexblockfactory approved these changes on Mar 19, 2020
Finally, the PR for the document service. It is a big PR, as the XML/template parsing was fully replaced and the renderer/compiler code has been refactored.
The best way to review is to start from the API. The two entry points that have been changed are /compile and /vars.
The old XML parsing reimplemented an XML parser and generated a custom DOM tree whose nodes were XML nodes and JTwig template nodes. It has been replaced by a standard Java StAX XML parser and a template extractor, which is responsible for finding and interpreting template code elements. The two parts are clearly separated: a TemplateParser (in this case a JTwigParser) handles the template syntax, and a TemplateExtractor class handles the generic extraction of the template code that is inside the visible document text. The result must be a well-formed template made of XML and template instructions.
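As a rough illustration of the StAX side of this, the sketch below reads a document with the standard javax.xml.stream event API and scans each characters event for template code islands. The class name, the regular expression, and the assumed JTwig-style delimiters ({{ }}, {% %}, {# #}) are illustrative assumptions, not code from this PR. It also ignores the hard case, an island split across several XML elements, which is what the TemplateExtractor is there to handle.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical sketch, not the PR's implementation.
public class CodeIslandSketch {
    // Assumption: JTwig-style delimiters {{ ... }}, {% ... %} and {# ... #}.
    private static final Pattern ISLAND =
            Pattern.compile("\\{\\{.*?\\}\\}|\\{%.*?%\\}|\\{#.*?#\\}", Pattern.DOTALL);

    /** Returns every code island found inside a single characters event. */
    static List<String> findIslands(String xml) {
        List<String> islands = new ArrayList<>();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // Merge adjacent text chunks so one text run is one characters event.
            factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
            XMLEventReader reader = factory.createXMLEventReader(new StringReader(xml));
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isCharacters()) {
                    Matcher m = ISLAND.matcher(event.asCharacters().getData());
                    while (m.find()) {
                        islands.add(m.group());
                    }
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Malformed XML input", e);
        }
        return islands;
    }

    public static void main(String[] args) {
        String xml = "<p>Hello {{ name }}, {% if vip %}<b>welcome</b>{% endif %}</p>";
        findIslands(xml).forEach(System.out::println);
    }
}
```

Note that the opening and closing instructions of the if block arrive in different characters events, separated by the b element, which is why the real extractor has to reason about where XML elements sit relative to template code blocks.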
The TemplateParser's responsibility is to recognize the template instructions in the characters events of the XML document, to maintain a state machine whose states indicate whether we are in the XML, a code element, a content element or a comment, and to track the current code block. In addition, the TemplateParser segments the XML characters events to isolate the template instructions from the rest of the document text and simplify downstream event processing.
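The state machine and the segmentation can be pictured with a minimal sketch like the one below. It is not the JTwigParser from this PR: the class, the Segment type, and the mapping of {% %} to code, {{ }} to content and {# #} to comment are assumptions made for illustration. The parser keeps its state between calls, so an instruction that starts in one characters event and ends in a later one is emitted as a single segment.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a template state machine, not the PR's JTwigParser.
public class SegmenterSketch {
    enum State { XML, CODE, CONTENT, COMMENT }

    static final class Segment {
        final State state;
        final String text;
        Segment(State state, String text) { this.state = state; this.text = text; }
        @Override public String toString() { return state + ":" + text; }
    }

    private State state = State.XML;
    private final StringBuilder buffer = new StringBuilder();

    /** Feeds the text of one characters event and returns the completed segments. */
    List<Segment> feed(String chars) {
        List<Segment> out = new ArrayList<>();
        int i = 0;
        while (i < chars.length()) {
            if (state == State.XML) {
                State opened = opening(chars, i);
                if (opened != null) {
                    flush(out);                 // emit the plain text seen so far
                    state = opened;
                    buffer.append(chars, i, i + 2);
                    i += 2;
                    continue;
                }
            } else if (chars.startsWith(closing(state), i)) {
                buffer.append(chars, i, i + 2);
                i += 2;
                flush(out);                     // emit the finished instruction
                state = State.XML;
                continue;
            }
            buffer.append(chars.charAt(i++));
        }
        // Plain text is complete at the end of the event; an open
        // instruction stays buffered until its closing delimiter arrives.
        if (state == State.XML) {
            flush(out);
        }
        return out;
    }

    private void flush(List<Segment> out) {
        if (buffer.length() > 0) {
            out.add(new Segment(state, buffer.toString()));
            buffer.setLength(0);
        }
    }

    private static State opening(String s, int i) {
        if (s.startsWith("{{", i)) return State.CONTENT;
        if (s.startsWith("{%", i)) return State.CODE;
        if (s.startsWith("{#", i)) return State.COMMENT;
        return null;
    }

    private static String closing(State st) {
        switch (st) {
            case CONTENT: return "}}";
            case CODE:    return "%}";
            default:      return "#}";
        }
    }

    public static void main(String[] args) {
        SegmenterSketch parser = new SegmenterSketch();
        System.out.println(parser.feed("Dear {{ name }}{% if vip %}, thanks{% endif %}"));
    }
}
```

This sketch does not handle a delimiter that is itself split across two events (for example "{" at the end of one and "%" at the start of the next); a real parser would need to buffer that case as well.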
The TemplateExtractor reads XML events, sends characters to the TemplateParser, interprets the new parser state, moves XML elements out of the template instructions, and ensures that matching XML start and end elements are located in the same template code block.
The whole XML handling has been migrated to an event pipeline composed of XMLEventProcessors. Therefore the template extractor, image adjuster, empty element remover, ODT manifest processor, and the template and image variable extractors all implement the XMLEventProcessor interface. It is now possible to easily add more XML processing to the chain and still understand the full system. Also, each processor can easily be tested separately.
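To make the pipeline idea concrete, here is a small sketch of chained processors over StAX events. The Processor interface below is a hypothetical stand-in and does not reproduce the actual XMLEventProcessor signature from this PR; the two processors (a comment remover and a simplified empty element remover) are likewise illustrative only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical sketch of an event pipeline, not the PR's XMLEventProcessor API.
public class PipelineSketch {
    /** Assumed shape: a processor maps one event sequence to another. */
    interface Processor {
        List<XMLEvent> process(List<XMLEvent> events);
    }

    static final Processor DROP_COMMENTS = events -> {
        List<XMLEvent> out = new ArrayList<>();
        for (XMLEvent e : events) {
            if (e.getEventType() != XMLStreamConstants.COMMENT) out.add(e);
        }
        return out;
    };

    // Removes a start element that is directly followed by its end element.
    static final Processor DROP_EMPTY_ELEMENTS = events -> {
        List<XMLEvent> out = new ArrayList<>();
        for (XMLEvent e : events) {
            if (e.isEndElement() && !out.isEmpty() && out.get(out.size() - 1).isStartElement()) {
                out.remove(out.size() - 1);
            } else {
                out.add(e);
            }
        }
        return out;
    };

    static List<XMLEvent> read(String xml) {
        List<XMLEvent> events = new ArrayList<>();
        try {
            XMLEventReader reader =
                    XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
            while (reader.hasNext()) events.add(reader.nextEvent());
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Malformed XML input", e);
        }
        return events;
    }

    /** Runs the processors in order, each one consuming the previous output. */
    static List<XMLEvent> run(List<XMLEvent> events, Processor... chain) {
        for (Processor p : chain) events = p.process(events);
        return events;
    }

    /** Compact rendering for inspection; attributes and namespaces are ignored. */
    static String describe(List<XMLEvent> events) {
        StringBuilder sb = new StringBuilder();
        for (XMLEvent e : events) {
            if (e.isStartElement()) {
                sb.append('<').append(e.asStartElement().getName().getLocalPart()).append('>');
            } else if (e.isEndElement()) {
                sb.append("</").append(e.asEndElement().getName().getLocalPart()).append('>');
            } else if (e.isCharacters()) {
                sb.append(e.asCharacters().getData());
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<XMLEvent> events = read("<doc><p>Hi</p><b><!-- note --></b></doc>");
        System.out.println(describe(run(events, DROP_COMMENTS, DROP_EMPTY_ELEMENTS)));
    }
}
```

The order of the chain matters in this sketch: removing comments first leaves the b element empty so the second processor can drop it, whereas the reverse order keeps it. Each processor can also be exercised on its own with a hand-built event list, which is the testability benefit described above.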
The XML processor is only part of the full document service. It is responsible for actually rendering an ODT document, used as a document template, to another valid ODT document after executing the template with a given data context. The ODT document is then formatted using a DocumentCompiler. At the moment, only the LibreOffice ODTDocumentCompiler is implemented. The compiler is responsible for reading the input files, rendering the template, gathering the additional assets, including additional fonts, and then formatting the rendered document using a TemplateFormatter, currently the LibreOfficeAssistant, which was not changed in this PR.
The ODTCompiler and the associated ODTRenderer (renamed from the old ODTContext) have been refactored to use the new XML event processor architecture, and the functionality that was inside the old ODTContext has been extracted into separate processors.
Testing has been improved. The XML processing, including the template extraction, is now well tested and covered. The ODTCompiler area testing has been improved slightly. The new document service has been tested with the Proxeus test-api suite with success.
There are still improvements to make. For example, the formatter using LibreOffice inside the document service, which increases the complexity of the service, could be entirely externalized using projects like https://github.com/thecodingmachine/gotenberg, which uses the unoconv project, which itself uses LibreOffice to handle its tasks. Unoconv is a ten-year-old project specializing in document conversion.
The handling of table-row for loops available in the old version has not been migrated yet due to time limits. It is a small task, as it only needs to move the template for-loop instructions written inside table cells (the only place possible if you edit the document using your document editor) outside the row. This will need to be part of a new PR.
In addition, the client has been removed and replaced by curl. The API has been extended to accept additional content types that are easier to use with standard HTTP command-line tools. Please refer to the updated README.
The opaque run and ui commands have been removed and replaced with a plain java -jar command, and the Dockerfile has been updated to use a multi-stage build.