Space as thousands separator in numbers #724

Open
limefrogyank opened this issue Jul 26, 2023 · 1 comment

Comments

limefrogyank commented Jul 26, 2023

I've done some testing and thought I would leave this issue here.

First, this is from the International System of Units spec concerning separating digits with spaces (emphasis mine):

"The practice of grouping digits in this way is a matter of choice; it is not always followed in certain specialized applications such as engineering drawings, financial statements and scripts to be read by a computer."

The International System of Units (PDF) (9th ed.). International Bureau of Weights and Measures. 2019. p. 150. ISBN 978-92-822-2272-0.

I take this to mean that the spacing carries no meaning and is only there to make the number easier to read at a glance. Spoken numbers have natural separators like "thousand" and "million".

However, grouping the digits with the Unicode character U+2009 (thin space) causes speech-rule-engine to treat the space as a break in the number, so that 12345 (displayed as 12 345) is read as "twelve three hundred and forty-five" instead of "twelve thousand three hundred and forty-five". Replacing the spaces with literal commas produces the correct reading, but commas are not correct according to the SI rules.

This happens when:

  • MathJax parses LaTeX: 12\,345
  • MathML is generated manually (\u2009 is the Unicode thin space):
    <mn>12\u2009345</mn>
  • Alternative MathML:
    <mrow>
        <mn>12</mn>
        <mo separator='true'>\u2009</mo>
        <mn>345</mn>
    </mrow>
  • Alternative MathML v2:
    <mrow>
        <mn>12</mn>
        <mspace width="thinspace" />
        <mn>345</mn>
    </mrow>
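
For anyone who wants to reproduce the inputs above, here is a minimal Python sketch that builds the single-`<mn>` variant and the `<mspace>` variant from a plain digit string. The helper names (`group_digits`, `mn_variant`, `mspace_variant`) are made up for illustration and are not part of MathJax or speech-rule-engine; it only handles unsigned integers.

```python
import xml.etree.ElementTree as ET

THIN_SPACE = "\u2009"  # U+2009 THIN SPACE


def group_digits(digits: str) -> list:
    """Split an unsigned integer string into groups of three, counted from the right."""
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected a non-empty string of ASCII digits")
    head = len(digits) % 3  # length of the short leading group, if any
    groups = [digits[:head]] if head else []
    groups += [digits[i:i + 3] for i in range(head, len(digits), 3)]
    return groups


def mn_variant(digits: str) -> str:
    """One <mn> whose text contains thin spaces between the groups."""
    mn = ET.Element("mn")
    mn.text = THIN_SPACE.join(group_digits(digits))
    return ET.tostring(mn, encoding="unicode")


def mspace_variant(digits: str) -> str:
    """An <mrow> with one <mn> per group, separated by <mspace> elements."""
    row = ET.Element("mrow")
    for index, group in enumerate(group_digits(digits)):
        if index:
            ET.SubElement(row, "mspace", width="thinspace")
        ET.SubElement(row, "mn").text = group
    return ET.tostring(row, encoding="unicode")
```

Calling `mn_variant("12345")` gives the `<mn>` form with a literal U+2009 between `12` and `345`, and `mspace_variant("12345")` gives the "Alternative MathML v2" structure.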

Pure MathML using <mn>12\u2009345</mn> is the best of these options, because it does not produce any <mo> multiplication in the MathSpeak output. However, it is still not read properly in ClearSpeak.

I'm not sure what else I can try, but ideally spaces between digits inside a plain <mn> element would be treated the same way commas are.

As for a solution, I am generating the MathML directly and can easily add an explicit attribute (data-number-separator = "\u2009" ??). If there's a better way that's less of a band-aid, I'm happy to implement it on my side.
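
In the meantime, one possible band-aid on the generating side is to keep the thin spaces in the MathML that gets displayed and hand the speech engine a separate copy in which the thin spaces between digits are replaced by commas, since commas are read correctly. The sketch below is only an illustration under that assumption: `speech_copy` is a hypothetical helper, not a speech-rule-engine API, and it only touches thin spaces that sit between two ASCII digits inside `<mn>` text.

```python
import re
import xml.etree.ElementTree as ET

THIN_SPACE = "\u2009"  # U+2009 THIN SPACE
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Serialize namespaced MathML without "ns0:" prefixes.
ET.register_namespace("", MATHML_NS)

# A thin space with an ASCII digit on both sides.
_DIGIT_GAP = re.compile(f"(?<=[0-9]){THIN_SPACE}(?=[0-9])")


def speech_copy(mathml: str, spoken_sep: str = ",") -> str:
    """Return a copy of the MathML in which thin spaces between digits
    inside <mn> elements are replaced by `spoken_sep`."""
    root = ET.fromstring(mathml)
    for element in root.iter():
        local_name = element.tag.rsplit("}", 1)[-1]  # drop any namespace part
        if local_name == "mn" and element.text:
            element.text = _DIGIT_GAP.sub(spoken_sep, element.text)
    return ET.tostring(root, encoding="unicode")
```

For example, `speech_copy("<math><mn>12\u2009345</mn></math>")` returns `<math><mn>12,345</mn></math>`, which can be passed to the speech engine while the original string is used for rendering. This only covers the single-`<mn>` form; the `<mrow>` variants would need the groups merged first.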

@limefrogyank

Relevant: mathjax/MathJax#2772
