-
Notifications
You must be signed in to change notification settings - Fork 526
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix #3201: Setup Kotlitex library to render raw latex #3194
Conversation
… introduce-kotlitex
@@ -141,6 +150,9 @@ class MathTagHandlerTest { | |||
} | |||
|
|||
@Test | |||
@Ignore | |||
// There are some cases in alpha content where we don't have raw latex but we do have | |||
// svg image, so can we use that image without raw latex? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
earlier we show only when both raw latex and SVG image is available in data, but as per our last discussion and from the alpha data, there are cases where we might have raw latex or svg or both of them, so I had conditioned them here.
git_repository( | ||
name = "kotlitex", | ||
commit = "3e8a0804041121ece4b57099ac3de386d78e0996", | ||
remote = "https://github.com/anandwana001/kotlitex", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
once confirmed, we will move this repo to org and update in this PR only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The need of creating our own version is because we need to use this in Bazel and also the repo's current min SDK is 21 and our min SDK is 19, so need to update as well.
Also, I had removed the sample apps from the kotlitex repo, as it contains various resources like drawable, mipmap, and other colors, styles, etc which cause errors for our Bazel and Gradle for duplicate of items.
@@ -64,6 +64,7 @@ dependencies { | |||
'androidx.appcompat:appcompat:1.0.2', | |||
'androidx.lifecycle:lifecycle-livedata-ktx:2.2.0-alpha03', | |||
'androidx.work:work-runtime-ktx:2.4.0', | |||
'com.github.anandwana001:kotlitex:3e8a0804041121ece4b57099ac3de386d78e0996', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using the latest commit, we might use branch = master
in the Bazel file and -master:SNAPSHOT
in gradle to always use the latest commit from the master branch.
This will reduce updating the commit anytime when we change anything in the kotlitex.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we actually will want to snapshot to a specific version (like we do elsewhere). Even though we control the repository, it's helpful to keep builds as deterministic as possible so there are no unexpected surprises.
) | ||
} else if (content.rawLatex != null) { | ||
output.setSpan( | ||
MathExpressionSpan(content.rawLatex, 72f, assetManager, true), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to come up with the base text height or might need a way to calculate and pass it here as an argument.
Do we have any current implementation with the same use case, where we calculate the text height based on the screen for the content?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use 16 sp
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@anandwana001 For my codeowner files it looks goods.
I also checked the parser implementation which looks good from code but I am unable to check this locally.
I tried replacing content in explorations to
"html": "<p>This represents <oppia-noninteractive-math math_content-with-value=\"{&quot;raw_latex&quot;: &quot;\\\\frac{2}{5}&quot;, &quot;svg_filename&quot;: &quot;mathImg_20200901_220408_fllx4jv5pn_height_3d329_width_1d666_vertical_1d072.svg&quot;}\"></oppia-noninteractive-math>.</p><p>Then, see if you can figure out what to do next, so that you get a fraction that looks like <oppia-noninteractive-math math_content-with-value=\"{&quot;raw_latex&quot;: &quot;\\\\frac{?}{10}&quot;, &quot;svg_filename&quot;: &quot;mathImg_20200901_220408_wpjlg4ybsl_height_3d329_width_2d495_vertical_1d072.svg&quot;}\"></oppia-noninteractive-math>.</p><p>Test exploration with interactions. <oppia-noninteractive-link text-with-value=\""Oppia"\" url-with-value=\""https://oppia.github.io"\"></oppia-noninteractive-link></p>"
But its not working.
Can you mention the steps of how to test in description?
@rt4914 we can meet and discuss on how to test this. |
@rt4914 Have you added the svg file in local data? Or you can simply test with the complete alpha data |
I think this is primarily meant to be a proof-of-concept for the upcoming expressions & equations interactions work, so closing this for now since we haven't started that project quite yet. |
This implementation is heavily based on #3194, including Akshay's custom fork (which was re-forked and slightly patched to work in the latest Oppia Android version).
…4068) ## Explanation Fix part of #4044 This PR introduces support for rendering raw LaTeX in the app using the KotliTeX (https://github.com/karino2/kotlitex) library which is necessary since users may input arbitrary math expressions that need to be pretty-rendered (using the math expression -> LaTeX generation support added in #4054). This implementation is heavily based on #3194 and leverages a new fork (https://github.com/oppia/kotlitex) of KotliTeX based on @anandwana001's fork. In order to keep the implementation simpler, this hangs the functionality onto the existing custom Oppia noninteractive math tag except it only renders LaTeX if there's no SVG filename to load (which will be the case for dynamically generated LaTeX). This approach has the advantage of the app always being able to fall back to LaTeX rendering in the unlikely scenario where content has a math tag without an SVG URL being present (or where the SVG fails to load which could allow potential fault tolerance when downloads support is fully implemented). These changes required several changes to KotliTeX itself which are outlined in more detail below. Furthermore, it was observed that the rendering was noticeably slow on a Nexus 5X for repeated renders (think navigation back and forth between views with rendered LaTeX, or expanding/collapsing multiple answers). This didn't seem acceptable, so this PR also introduces a platform parameter for utilizing Glide to pre-render the LaTeX into a PNG bitmap so that it can cache it. This is not only more performant for repeated exposure and faster to render, but it also offloads the expensive rendering step to a background thread rather than blocking the main thread (see the videos in the UI section below for a before-and-after comparison). ### Details on changes to KotliTeX A custom fork of KotliTeX (https://github.com/oppia/kotlitex) was needed to include the following changes: - Support to be built on Bazel - Lower the min sdk required by the library to match the app's - A slight hack to ensure KotliTeX doesn't auto-fail for larger square roots (the library actually doesn't support them, so this workaround just stretches the smaller square root rather than leading to error text) - A hack to expose the outer bounds of drawing requires by KotliTeX for its span (see caching details below for why) Note that KotliTeX only supports a subset of LaTeX, but it seems to support everything the app needs for now (and it should be extensible by porting other portions of KaTeX as needed). ### Details on the rendering caching & positioning challenges At a high-level, the approach to caching is to render the KotliTeX drawable to a bitmap that can be saved on-disk in a format that Glide can easily load on its own (PNG). While it's not ideal to perform the extra PNG conversion step, it's much more disk-friendly (and in the future, we could look into offloading just this part to another thread if needed). Each individual load is keyed on the exact raw LaTeX, the line height (up to 2 digits), and whether it's inline or block rendered (see later in this section for the distinction). If all three of these properties match, there should be a Glide cache hit and no need to re-render the LaTeX. A couple of issues arose when trying to implement this: - Despite us being able to expose the internal KotliTeX drawable (which wasn't by default), I was having a lot of difficulty getting the text alignment correct. Instead, it was much easier to rely on Android's internal text rendering by just rendering the span in a similar way to what TextView does. - While the above worked, it led to another problem: figuring out how large the bitmap should be. The actual drawable bounds that KotliTeX computes is an underestimate (its ascent and descent go outside of these bounds), so an additional estimation is needed. - Because we're now drawing the LaTeX as a bitmap in an ImageSpan, the text aligns the image either with its bottom line or its baseline. This introduces a vertical center alignment issue as shown below (the image with the red text shows one attempt at fixing this by eliminating unnecessary upper space by using a combination of vertical translation and negative bounds). Due to this quirk, we can't actually have whitespace below the text without modifying the text's font metrics (specifically the bottom/top/ascent/descent properties). #4170 is tracking this work. Note also that this is really only an issue for inline rendering which we're unlikely to use in the medium-term since all user-submitted answers will be rendered in block style which seems to work well. Caching is demonstrably much more performant for large amounts of repetition, and it doesn't result in UI lag like direct rendering does since all rendering is offloaded to a background thread (see the videos below for the comparison). ### Details on testing The tests are mainly focused around changes to the tag handler since the Glide portions can't be tested, and neither can the rendering routines (hence the exemptions). The fake image loader was updated to include support for the math-based rendering load requests. An aside change was made to make the Robolectric library test-only (it's something that was noticed in passing during development, and isn't directly needed by this PR; it's just a nice-to-have). The new interfaces are exempted from tests since they have no logic to verify (their implementations are generated by Dagger and verified at compile time). Furthermore, I didn't add any tests for the changes to the compute affected tests script since it's a bit difficult to test, and already quite an edge case. Please let me know if you have any concerns with this. ### Details on the race condition fix KotliTeX seems to initialized shared state when its parsing the LaTeX, and this with multiple Glide threads can cause a race condition where the LaTeX fails to render. This has been fixed by leveraging coroutines (where we still try to maximize efficiency and parallelization), but it required exposing new injector pathways for the coroutine dispatchers and ``ConsoleLogger``. ### Details on the ComputeAffectedTests fix This PR exposes an issue with the ComputeAffectedTests script wherein it can sometimes exceed the system's maximum argument length. This is being worked around by partitioning values for longer Bazel queries into multiple calls and concatenating the results. I'm fairly certain this is functionally equivalent to the current implementation, and for the vast majority of future PRs only one partition will ever be used. There doesn't seem to be an alternative possible since this is a platform-specific Kernel restriction. ## Essential Checklist - [x] The PR title and explanation each start with "Fix #bugnum: " (If this PR fixes part of an issue, prefix the title with "Fix part of #bugnum: ...".) - [x] Any changes to [scripts/assets](https://github.com/oppia/oppia-android/tree/develop/scripts/assets) files have their rationale included in the PR explanation. - [x] The PR follows the [style guide](https://github.com/oppia/oppia-android/wiki/Coding-style-guide). - [x] The PR does not contain any unnecessary code changes from Android Studio ([reference](https://github.com/oppia/oppia-android/wiki/Guidance-on-submitting-a-PR#undo-unnecessary-changes)). - [x] The PR is made from a branch that's **not** called "develop" and is up-to-date with "develop". - [x] The PR is **assigned** to the appropriate reviewers ([reference](https://github.com/oppia/oppia-android/wiki/Guidance-on-submitting-a-PR#clarification-regarding-assignees-and-reviewers-section)). ## For UI-specific PRs only For the most part, this PR doesn't actually affect users yet since this functionality isn't exposed (all LaTeX currently rendered in the app will utilize the existing SVG pipeline). The user-facing UI changes will be thoroughly documented in the PR that's planned for after this one since it'll be integrating all prior PRs along with the changes introduced here. That being said, the "What is a Fraction?" test revision card has been updated to include rendered LaTeX text in order to demonstrate the functionality more closely. The following is how rendering looks (with bitmap caching): ![cached_latex_rendering](https://user-images.githubusercontent.com/12983742/153363770-ad1ad20a-c06e-4dae-a9ce-4253285f1adb.png) Note that the above, when compared with non-cached rendering, demonstrates where some of the spacing issues mentioned above come in: ![direct_render_latex](https://user-images.githubusercontent.com/12983742/153363841-8318f520-0788-449f-9ea8-276c8898981e.png) Here's the image showing the bounds when trying to force the drawable downward: ![difficult_alignment_inline_math](https://user-images.githubusercontent.com/12983742/153363908-568dca88-6c67-4b11-98f6-fed0e17d56b8.png) This video shows how long a bunch of LaTeX expressions takes to render with direct rendering: https://user-images.githubusercontent.com/12983742/153363983-4ff99db5-52da-4360-9630-cc98397dc07b.mp4 Versus the performance of caching for the same experience (note this occurred with a completely clean local app cache): https://user-images.githubusercontent.com/12983742/153364178-3066d6ce-2a85-45c8-83fe-c0378a786023.mp4 ### Notes on rendering approach Note that rendering during cache time involves a three step process: 1. Approximate the outer bounding volume by tracking all draw calls needed for a given LaTeX expression (these bounds are used to create the bitmap that will store the cacheable render) 2. Render the LaTeX to a canvas with dimensions larger than the approximated bounding (since it's possible for the rendering to exceed these estimates--see notes below) 3. Crop the bitmap to the smallest bounds needed to encapsulate all of the pixels (1) is needed because KotliTeX actually incorrect computes bounds (it sometimes over and underestimates). This isn't as apparent when directly rendering because Android will handle text expanding past the space KotliTeX expects since it's rendered in-line. I spent some time trying to debug this within KotliTeX but I couldn't make much headway. It may be worth looking into this in the future to see if we can reliably calculate the bounds within KotliTeX (since it will likely be much more performant than the solution used here). Seeing KotliTeX's debug lines makes this clearer (where it thinks the bounds are relative to glyphs; note that this screenshot is using rendering without caching): ![kotlitex_wrong_bounds](https://user-images.githubusercontent.com/12983742/155447759-5e21ba68-8cec-496a-843b-db0109278625.png) (2) is needed because the methodology used for (1) is inexact. It's actually really difficult to estimate exactly where the final glyph pixels will be rendered for a given glyph since KotliTeX is directly handling positioning (rather than allowing Android to compute x/y coordinates based on nearby characters to account for tracking, kerning, etc.). The dimension approximation for (1) is much closer to correct than KotliTeX's internal bounds calculations, however it tends to slightly underapproximate the bounds. To counteract this, the app uses a bitmap that's 2x what's actually estimated to be needed and offsets rendering into that bitmap such that 50% of the estimated size is a margin around the drawing area. This provides a large buffer to account for incorrect bounds calculations. The app then crops this to the edges of filled pixels so that the smallest bitmap is used to represent the final rendered equation. While this uses a lot more memory than should actually be required for rendering, it guarantees perfect results. See: Before (notice the over-sized LaTeX, and that some expressions are partially cut-off from the right and above, or have extra space below): | ![latex_rendering_cutoff_problem1](https://user-images.githubusercontent.com/12983742/155447426-0662f705-7ec3-4422-a45f-f36e42948968.png) | ![latex_rendering_cutoff_problem2](https://user-images.githubusercontent.com/12983742/155447432-35ba68d6-b05b-4a91-82ce-7af03a47851b.png) | |------|------| After (with size estimation & croppingl this also includes a fix in UrlImageParser to ensure that LaTeX images are never auto-resized when rendering in block mode since it distorts them, and that they are properly centered): | ![latex_rendering_cutoff_fix1](https://user-images.githubusercontent.com/12983742/155447617-86eb68bc-9c8b-4483-b630-bba14b2c0cb7.png) | ![latex_rendering_cutoff_fix2](https://user-images.githubusercontent.com/12983742/155447622-fea0e52f-0dc1-4cac-b673-6a102fe5f430.png) | |------|------| Commit history: * Add KDocs & test exemptions. * Lint fixes. * Remove temporary TODOs. * Add tests. * Split StringToFractionParser. This is a temporary change that will be finished upstream (since there's an earlier PR that's a better fit for this change). * Address reviewer comments + other stuff. This also fixes a typo and incorrectly ordered exemptions list I noticed during development of downstream PRs. * Move StringExtensions & fraction parsing. This splits fraction parsing between UI & utility components. * Address reviewer comments. * Alphabetize test exemptions. * Fix typo & add regex check. The new regex check makes it so that all parameterized testing can be more easily tracked by the Android TL. * Add missing KDocs. * Post-merge cleanups. Also, fix text file exemption ordering. * Add new test for negation with math symbol. * Post-merge fixes. * Add KDocs. Also, add new regex exemption for new parameterized tests in this branch. * Refactor & simplify real ext impl. Also, fix/clarify some KDocs. * Lint fixes. * Simplify operation list converter a lot. This inlines three recursive operations to be done during the actual computation to simplify the overall converter complexity (and to make determining the test matrix easier). * Prepare for new tests. * Remove the ComparableOperationList wrapper. * Change parameterized method delimiter. * Use utility directly in test. * Post-merge fixes. This adjusts for the removal of ComparableOperationList (i.e. no wrapper proto). * Add first round of tests. This includes fixes to the converter itself as it wasn't distributing both product inversions and negation correctly in several cases. Tests should now be covering these cases. * Finish initial test suite. Still needs to be cleaned up, but after converter refactoring attempts. * Simplify operation sorting comparators. * Remove old tests. * Add remaining missing tests. * KDocs & test exemption. * Renames & lint fixes. * Post-merge fixes. * Add tests. * KDocs + exemptions. Also, clean up polynomial sorting. * Lint fixes. * Post-merge fixes. Also, mark methods/classes that need tests. * Add extension tests. * Add classifier tests. * Use more intentional epsilons for float comparing. * Treat en-dash as a subtraction symbol. * Add explicit platform selection for paramerized. This adds explicit platform selection support rather than it being automatic based on deps. While less flexible for shared tests, this offers better control for tests that don't want to to use Robolectric for local tests. This also adds a JUnit-only test runner, and updates MathTokenizerTest to use it (which led to an almost 40x decrease in runtime). * Exemption fixes. Also, fix name for the AndroidJUnit4 runner. * Remove failing test. * Fix unary expression precedence. Also, use ParameterizedJunitTestRunner for MathExpressionParserTest. * Fixes & add more test cases. * Post-merge fixes & test changes. Also, update RealExtensionsTest to use the faster JUnit runner. * Use utility directly in LaTeX tests. * Post-merge fixes. Also, update ExpressionToComparableOperationConverterTest to use the fast JUnit-only runner. * Post-merge fixes. Also, update PolynomialExtensionsTest to use fast JUnit-only runner. * Post-merge fixes. Also, update float interval per new tests. * Lint & other check fixes. * Replace deprecated term. * Post-merge fixes. * Add full test suites for alg exp classifiers. * Lint & static check fixes. * Fix test on Gradle. * Fix test for Gradle. * Add tests for math equations. And, post-merge fixes. * Static check & lint fixes. * Post-merge fixes. Verified CI checks & all unit tests are passing. * Split up tests. Also, adds dedicated BUILD.bazel file for new test. * Add missing test in Bazel, and fix it. * Correct order for genrule. * Add full test suite. * Clean up + KDocs + exemption. * Lint fixes. * Post-merge fix. * Cache KotliTeX renders. Directly rendering LaTeX through KotliTeX is way too slow, so this introduces a custom flow through Glide that computes a PNG for the LaTeX on a background thread and then caches it as part of Glide's cache to speed up re-renders of the LaTeX. We may need to manage/prune the cache over time, but for now we'll just rely on Glide's internal behaviors. This also documents some new tests that should be added, but it's not comprehensive. * Add tests, docs, and exemptions. * Update to fixed version of KotliTeX. The newer version correctly computes the bounds for rendered LaTeX. * Lint fixes. * Add new dependency licenses. This isn't done yet (some of the licenses still need to be fixed). * Fix license links. Note that the kxml one was tricky since its Maven entry says it's licensed under BSD and CC0 1.0, and its SourceForge link says the same plus LGPL 2.0. However, the source code and GitHub version of the project license it under MIT, and this seems to predate the others so it seems like the most correct license to use in this case and the one that we're using to represent the dependency. * Fix Gradle build. This uses a version of KotliTeX that builds correctly on Jitpack for Gradle, and fixes the StaticLayout creation to use an alignment constant that builds on Gradle (I'm not sure why there's a difference here between Gradle & Bazel, but the previous constant isn't part of the enum per official Android docs). * Create the math drawable synchronously. This requires exposing new injectors broadly in the app since the math model loader doesn't have access to the dependency injection graph directly. * Remove new deps from Maven list. They were incorrectly pulled in by KotliTeX. * Add argument partitioning. This fixes cases where argument calls may be very large and fail to execute due to exceeding system limitations. * Make allowance for empty cases to fix tests. These tests correspond to real scenarios. * Lint fixes. * Address reviewer comment. Clarifies the documentation in the test runner around parameter injection. * Fix broken build. * Fix broken build post-merge. * Fix broken post-merge classifier. * Address reviewer comment. * Post-merge build fixes. * Post-merge build fixes for new classifiers. * Post-merge build fixes. * Correct reference document link. * Ensure LaTeX isn't stretched or cut-off. The comments in-code have much more specifics on the approach. * Add and fix missing test (was broken on Gradle). * Post-merge fix. * More post-merge fixes. * Fix TODO comment. * Post-merge fix. * Update KotliTeX version. This version doesn't have debug drawing enabled.
Explanation
Fix #3201
Need data with the math tag to review this PR.
Documentation
Sample Result
Alpha Content
TODO/Q&A
MathTagHandlerTest
.Checklist