Regenerate UnicodeData in MakeUnicodeFiles; fix a bug in Unicode_1_Name #449

eggrobin · 2023-04-20T11:06:46Z

This turned out much easier than expected, because Mark had done half the work twenty years ago. This should make the processing of new proposals easier: no need to manually merge-sort the lines from the proposal, and with #444, conflict markers will be automatically eliminated when regenerating the UCD.

This also fixes a (fairly uninteresting) bug. The Unicode_1_Name property is obsolete, but neither deprecated nor stabilized, and importantly it was not blanked like ISO_Comment was. The tools handled it incorrectly because they relied on an ICU API for it, which ICU deprecated in ICU-9013 (unicode-org/icu@c39e5af).

The new options for BagFormatter should allow us to generate files without spaces on either side of the semicolon, as in the 15.0 and earlier LB and EAW files, should that be needed.
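
To illustrate the two layouts in question, here is a minimal sketch of a field joiner with a switch for the padding around the semicolon. It is not the BagFormatter API: the class `SeparatorSketch`, the method `joinFields`, and its boolean parameter are hypothetical names made up for this example.

```java
final class SeparatorSketch {
    /**
     * Joins the fields of one data line with a semicolon.
     * Hypothetical helper for illustration; the real option lives in BagFormatter.
     */
    static String joinFields(boolean spacesAroundSemicolon, String... fields) {
        return String.join(spacesAroundSemicolon ? " ; " : ";", fields);
    }

    public static void main(String[] args) {
        // Compact layout, without spaces on either side of the semicolon.
        System.out.println(joinFields(false, "3000", "F"));
        // Padded layout, with one space on each side.
        System.out.println(joinFields(true, "3000", "F"));
    }
}
```

A single flag keeps both layouts available from the same code path, so a file can be regenerated in whichever style it was last published.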

The function dumbFraction is extended to guess the correct denominator based on the name. If we want the fraction fields in DerivedNumericValues to match UnicodeData for the Meroitic cursive twelfths, we could pass the name there too. For now I am leaving the data files unchanged.
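
As a rough sketch of the idea (not the actual dumbFraction code), the denominator can be looked up from a word in the character name, with a fallback to the smallest denominator that represents the value when the name gives no hint. The class `FractionSketch`, the word table, and both method names are assumptions made for this example.

```java
import java.util.Map;

final class FractionSketch {
    // Hypothetical table of denominator words; not taken from the tools.
    private static final Map<String, Integer> DENOMINATOR_WORDS = Map.of(
            "HALF", 2, "THIRD", 3, "QUARTER", 4, "FIFTH", 5, "SIXTH", 6,
            "EIGHTH", 8, "TWELFTH", 12, "SIXTEENTH", 16);

    /** Returns the denominator suggested by the name, or -1 if none is found. */
    static int guessDenominator(String name) {
        for (String word : name.split("[ -]")) {
            // Strip a plural S, so that TWELFTHS is looked up as TWELFTH.
            String singular = word.endsWith("S") ? word.substring(0, word.length() - 1) : word;
            Integer denominator = DENOMINATOR_WORDS.get(singular);
            if (denominator != null) {
                return denominator;
            }
        }
        return -1;
    }

    /** Formats value as a fraction, preferring the denominator hinted at by the name. */
    static String fraction(double value, String name) {
        int hinted = guessDenominator(name);
        if (hinted > 0) {
            long numerator = Math.round(value * hinted);
            if (Math.abs(numerator / (double) hinted - value) < 1e-9) {
                return numerator + "/" + hinted;
            }
        }
        // No usable hint: take the smallest denominator that represents the value.
        for (int d = 1; d <= 1000; ++d) {
            long n = Math.round(value * d);
            if (Math.abs(n / (double) d - value) < 1e-9) {
                return d == 1 ? Long.toString(n) : n + "/" + d;
            }
        }
        return Double.toString(value);
    }

    public static void main(String[] args) {
        // With the name, the unreduced twelfths are kept.
        System.out.println(fraction(2.0 / 12, "MEROITIC CURSIVE FRACTION TWO TWELFTHS"));
        // Without a name, the value is reduced to lowest terms.
        System.out.println(fraction(2.0 / 12, ""));
    }
}
```

In this sketch the first call yields `2/12` and the second `1/6`, which is the kind of difference between the name-aware and name-agnostic paths that passing the name to DerivedNumericValues would remove.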

macchiati

Looks great, thanks!

eggrobin added 9 commits April 19, 2023 20:33

Fix the Vertical_Orientation from ToolUnicodePropertySource

f5e4ab7

Also generate EAW and VO

742afdf

Update legacy comment

77b4aa9

Regenerate UCD

a2b01f8

Merge remote-tracking branch 'la-vache/main' into regenerate-unicodedata

d9155f2

Surprisingly straightforward after all, and found a bug.

9389498

weird corners

9d2edc2

spotless and all

7d545ae

fields & close

3c4a8fc

eggrobin requested review from macchiati and markusicu April 20, 2023 11:07

macchiati approved these changes Apr 20, 2023

eggrobin merged commit ea4227f into unicode-org:main Apr 20, 2023

eggrobin mentioned this pull request Oct 2, 2023

Make MakeUnicodeFiles regenerate IndicMeowCategory.txt #547

Merged
