ICU-22707 Unicode 16 alpha #2930
@hsivonen FYI -- I am working on Unicode 16 alpha in ICU. The properties are in.
@hsivonen I think I am done with ICU4C changes for normalization to support the characters with the new combinations of properties. All of the C/C++ normalization tests pass. 🎉 Hopefully this can get you started. There are still ICU4C test failures, but they are currently expected. They are due to missing Unicode 16 data of several types, and some outdated test expectations.
- .nrm formatVersion 5 - updated data format doc & design doc
Notice: the branch changed across the force-push!
~ Your Friendly Jira-GitHub PR Checker Bot
@eggrobin @cjchapman FYI It looks like I got this snapshot to work well enough in ICU4C & ICU4J. There are two problems with generating ICU4X data; for now I disabled those generators and filed separate ICU tickets. I think it's most productive if I can merge this PR and @hsivonen can then look into fixing them on the main branch.

@aheninger FYI I got a UBSan failure in rbbitst.cpp. It was unhappy about accessing the 8-bit version of RBBIStateTableRow at an odd address. I changed the test code to cast directly to the 8-bit or 16-bit version of the row struct.
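For readers unfamiliar with this kind of UBSan report, here is a minimal sketch of the idea, not the actual rbbitst.cpp change. `Row8`, `Row16`, and `acceptingAt` are hypothetical stand-ins, not ICU's real types or functions. The assumption is that going through a pointer to a type that contains 16-bit fields requires a 2-byte-aligned address even when only 8-bit fields are read, whereas casting straight to the 8-bit layout has no such requirement.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for the two row layouts; NOT ICU's real structs.
struct Row8 {
    uint8_t accepting;
    uint8_t lookAhead;
    uint8_t tagsIdx;
    uint8_t nextState[1];
};
struct Row16 {
    uint16_t accepting;
    uint16_t lookAhead;
    uint16_t tagsIdx;
    uint16_t nextState[1];
};

// Row8 may live at any address; Row16 needs an even one.
static_assert(alignof(Row8) == 1, "8-bit rows have no alignment requirement");
static_assert(alignof(Row16) == 2, "16-bit rows need 2-byte alignment");

// Read the "accepting" field of row `index` from a raw state table.
// `rowLen` is the size of one row in bytes; `use8Bits` selects the layout.
// The cast goes directly to the layout in use, so an 8-bit table at an odd
// address is never viewed through a type that demands 2-byte alignment.
uint32_t acceptingAt(const char *table, size_t rowLen, size_t index, bool use8Bits) {
    assert(table != nullptr);
    const char *p = table + rowLen * index;
    if (use8Bits) {
        return reinterpret_cast<const Row8 *>(p)->accepting;
    }
    // Assumption: 16-bit tables start at an even address and have even row lengths.
    return reinterpret_cast<const Row16 *>(p)->accepting;
}
```

A caller would pass the raw table pointer plus a flag saying which layout the table uses; in this sketch the flag is a plain `bool`, which is an illustrative choice rather than ICU's mechanism.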
I intend to look into this in the latter part of this week.
RSLGTM
NF*C_QC=Maybe and NF*D_QC=No, that is, they have two-way mappings but also combine-back themselves, or their decompositions combine-back; also new in Unicode 16
Known issues, to be fixed separately:
Will do later:
For comparison:
Checklist
ALLOW_MANY_COMMITS=true