Possibility of support for CF/UDUNITS-style powers of units #851

Closed
jthielen opened this issue Aug 26, 2019 · 2 comments · Fixed by #911


@jthielen
Contributor

jthielen commented Aug 26, 2019

The CF Conventions are a common standard used in the atmospheric and oceanic sciences, and many NetCDF files follow these conventions. The unit conventions rely upon the UDUNITS package. While most CF/UDUNITS-compliant unit strings can be parsed properly by pint, I've run into some issues recently with powers of units in the default symbol notation, i.e., strings such as "m2 s-2" for meters squared per second squared.

Would support for powers of units in this style be welcome in pint?

If so, this should be fairly simple to implement by adding the following regex substitution pair

(r'(?<=[A-Za-z])(?![A-Za-z])(?<![0-9\-])(?=[0-9\-])', '**')

to

pint/pint/util.py

Lines 565 to 575 in 2afdc4b

#: List of regex substitution pairs.
_subs_re = [('\N{DEGREE SIGN}', " degree"),
            (r"([\w\.\-\+\*\\\^])\s+", r"\1 "),  # merge multiple spaces
            (r"({}) squared", r"\1**2"),  # Handle square and cube
            (r"({}) cubed", r"\1**3"),
            (r"cubic ({})", r"\1**3"),
            (r"square ({})", r"\1**2"),
            (r"sq ({})", r"\1**2"),
            (r"\b([0-9]+\.?[0-9]*)(?=[e|E][a-zA-Z]|[a-df-zA-DF-Z])", r"\1*"),  # Handle numberLetter for multiplication
            (r"([\w\.\-])\s+(?=\w)", r"\1*"),  # Handle space for multiplication
            ]
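
To make the effect of the proposed pair concrete, here is a minimal sketch that applies it on its own with Python's `re` module. The helper name `cf_powers` is made up for illustration and is not part of pint; the sketch also ignores where the pair would sit relative to the other entries in `_subs_re`.

```python
import re

# The substitution pair proposed above: a zero-width match between a letter
# and a following digit or minus sign, replaced by "**".
CF_POWER_SUB = (r'(?<=[A-Za-z])(?![A-Za-z])(?<![0-9\-])(?=[0-9\-])', '**')


def cf_powers(text):
    """Hypothetical helper: rewrite UDUNITS-style powers as explicit '**'."""
    pattern, replacement = CF_POWER_SUB
    return re.sub(pattern, replacement, text)


print(cf_powers("m2 s-2"))   # m**2 s**-2
print(cf_powers("kg m-3"))   # kg m**-3
print(cf_powers("1e6 Hz"))   # 1e**6 Hz
```

The last line shows that, run in isolation like this, the pattern also matches inside scientific notation such as `1e6`, so it would need to be guarded or ordered carefully relative to number handling.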

If not, would it be possible to add something to the registry API in order to add regex substitution pairs for the string preprocessor? Even if pint itself doesn't wish to support this style of powers of units (which would be reasonable...this isn't the clearest syntax), having some officially supported way to add new regex substitutions would make it much easier for downstream libraries like MetPy to implement it if desired for CF/UDUNITS-compatibility.
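
As a rough sketch of what such an extension point could look like, the snippet below keeps an ordered list of string transforms and runs them before any parsing happens. The class `PreprocessorChain` and its method `add_substitution` are hypothetical names invented for this example; they are not pint's API.

```python
import re
from typing import Callable, List, Optional

Preprocessor = Callable[[str], str]


class PreprocessorChain:
    """Hypothetical helper: applies user-supplied string transforms in order."""

    def __init__(self, preprocessors: Optional[List[Preprocessor]] = None):
        self.preprocessors: List[Preprocessor] = list(preprocessors or [])

    def add_substitution(self, pattern: str, replacement: str) -> None:
        """Register a regex substitution pair as one more preprocessing step."""
        compiled = re.compile(pattern)
        self.preprocessors.append(lambda text: compiled.sub(replacement, text))

    def __call__(self, text: str) -> str:
        # Run every registered transform, in registration order.
        for preprocess in self.preprocessors:
            text = preprocess(text)
        return text


# A downstream library could opt in to CF/UDUNITS-style powers like this:
chain = PreprocessorChain()
chain.add_substitution(
    r'(?<=[A-Za-z])(?![A-Za-z])(?<![0-9\-])(?=[0-9\-])', '**'
)
print(chain("m2 s-2"))  # m**2 s**-2
```

Keeping the transforms as plain callables means a downstream library is not limited to regex pairs; any function from string to string would fit the same list.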

Either way, I'd be glad to put in a PR to implement the desired functionality.

xref Unidata/MetPy#1134

@hgrecco
Owner

hgrecco commented Aug 26, 2019

I would answer in different converging directions:

  1. There has been an idea to create an API to edit replacements. See a very early discussion in "parse_expression fails on units with spaces in the name" (#799).
  2. I would agree to add any default replacement that does not create ambiguity. For example, s-2 could mean s minus 2.
  3. Item 2 could be relaxed if we clearly separate parse_units from parse_expression in the docs.

@jthielen
Contributor Author

jthielen commented Aug 29, 2019

Thank you for bringing up the point about ambiguity. After thinking about it more, I'm not sure there is a way to avoid it, given pint's existing behavior and UDUNITS behavior.

While this is a contrived example, I think it demonstrates the ambiguity well:

In pint:

import pint
ureg = pint.UnitRegistry()
test_str = '1e6 Hz s-2'
print(ureg(test_str))
999998.0 dimensionless

In cf_units (a python wrapper for UDUNITS):

from cf_units import Unit
test_str = '1e6 Hz s-2'
print(Unit(test_str).definition)
1000000 s-3
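
The two readings can be reproduced with a toy calculation. This is only an illustration, not pint's or UDUNITS' actual parser: units are stood in for by plain floats (assuming s = 2.0 and Hz = 1/s = 0.5), the whitespace rule is the last pair from the `_subs_re` list quoted earlier, and the UDUNITS-style expression is written out by hand.

```python
import re

# Last pair of the quoted _subs_re list: whitespace between tokens becomes "*".
SPACE_TO_MUL = (r"([\w\.\-])\s+(?=\w)", r"\1*")


def evaluate(expression, values):
    """Evaluate an arithmetic expression with unit names bound to floats."""
    return eval(expression, {"__builtins__": {}}, values)


# Stand-in numeric values for the units (assumption for this sketch only).
values = {"s": 2.0, "Hz": 0.5}

test_str = "1e6 Hz s-2"

# pint-like reading: spaces become multiplication, "-2" stays a subtraction.
pint_like = re.sub(SPACE_TO_MUL[0], SPACE_TO_MUL[1], test_str)
print(pint_like)                      # 1e6*Hz*s-2
print(evaluate(pint_like, values))    # 999998.0

# UDUNITS-like reading, written by hand: "s-2" means s to the power -2.
udunits_like = "1e6*Hz*s**-2"
print(evaluate(udunits_like, values))  # 125000.0, i.e. 1e6 * s**-3 for s = 2.0
```

The same input string yields two different, equally well-formed expressions, which is why a default substitution cannot serve both conventions at once.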

Also, because of the expressions that CF/UDUNITS allows, it would likely require using parse_expression, so I'm not sure if relaxing the unambiguity requirement by clearly separating parse_units and parse_expression would be sufficient (unless I'm misinterpreting something).

With all that in mind, I think the best way forward will be to build on the early discussions about an API to extend the replacements and/or the pre-processor itself (see the comment in #429). I will try putting together an initial PR sometime soon.
