Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

740: Improve complex forms performance - follow up - IANA test/update, move comment #742

Merged
merged 3 commits into from
Dec 9, 2024

Conversation

lindsay-stevens
Copy link
Contributor

@lindsay-stevens lindsay-stevens commented Dec 9, 2024

Follow up on #740 feedback.

Addresses this comment:

Would be good to add a quick test with a longer subtag to test_language_warnings.

And this comment:

This comment looks like it's in the wrong place

Also updates results in performance test cases.

Why is this the best possible solution? Were any other approaches considered?

Detailed notes in commit message and docstrings / inline comments.

For the updater script, considered having it also pull from the URL rather than save manually, but 1) I don't expect that this tag list will require regular updates (last updated 6 years ago) for it to be worth streamlining it further, and 2) by the time another update is done or needed the relevant URL might be different. The main value of the script is having a documented and reproducible way to generate the code lists.

What are the regression risks?

None

Does this change require updates to documentation? If so, please file an issue here and include the link below.

No

Before submitting this PR, please make sure you have:

  • included test cases for core behavior and edge cases in tests
  • run python -m unittest and verified all tests pass
  • run ruff format pyxform tests and ruff check pyxform tests to lint code
  • verified that any code or assets from external sources are properly credited in comments

- test_language_warnings.py
  - add case for a valid >2 char language code
  - add case for invalid 3 char code - example of behaviour where if
    someone refers to ISO 639 and gets a valid tag but that's not
    compliant with the IANA RFC which uses the shortest available tag.
- subtags_updater.py
  - add script to simplify/formalise how the tag lists are prepared.
    I don't know how/where the original list was obtained but this
    produces a purely additive changeset consistent with a published
    changelist https://www.iana.org/assignments/lang-subtags-templates/lang-subtags-templates.xhtml
    so it should be about right.
  - add IANA language codes added to registry since list was originally
    committed 6+ years ago (2018-04-01)
- generally quite a bit faster now but there seems to be something still
  potentially wrong when translations and/or lots of choices/itemsets
  are involved, e.g. 10K text = ~5 seconds,
  vs 10K select + 20K choices = ~80 seconds.
- switched to using `convert()` directly since this now accepts
  markdown input, and it avoids potentially confounding effects of
  extra things done in `assertPyxformXform` (such as re-parsing XML).
  - this had a minor impact (approx 10%) on test execution time but
    considerable impact (approx 30%) on memory usage.
@lindsay-stevens lindsay-stevens marked this pull request as ready for review December 9, 2024 11:44
Copy link
Contributor

@lognaturel lognaturel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice to see the perf improvements and good call on adding a reproducible way to build the IANA tag lists.

@lognaturel lognaturel merged commit 95034d9 into XLSForm:master Dec 9, 2024
10 checks passed
@lindsay-stevens lindsay-stevens deleted the iana-subtags branch December 9, 2024 22:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants