Decorate model functions to indicate their units #361

Closed
denised opened this issue Jul 25, 2021 · 21 comments · Fixed by #525

@denised
Member

denised commented Jul 25, 2021

It would be nice to have a convenient way to know the units of a model function. This information is commonly provided in two ways:

  • Part of the function name, something like co2_mmt_....
  • Part of the docstring

With both techniques, it is either difficult to standardize the process, or the result is a kind of templatization that makes the code (or comments) hard to read.
The model functions already have an attribute that gives their name. We could add an additional units attribute. Any UI layer on top could use it automatically, lowering the chance of a mismatch.
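A minimal sketch of the attribute idea, assuming a hypothetical `units` decorator and a hypothetical `axis_label` helper (neither name comes from the repository). It only illustrates how a UI layer could read the unit from the function object instead of parsing a name or docstring:

```python
def units(unit):
    """Hypothetical decorator: record the unit of a function's result."""
    def decorate(func):
        func.units = unit  # plain function attribute, readable by any caller
        return func
    return decorate


@units("Mt CO2-eq")
def co2eq_reduced():
    # Placeholder body; a real model function would compute this value.
    return 1.5


def axis_label(func):
    """Build a label from the function name and its units attribute, if any."""
    unit = getattr(func, "units", None)
    return func.__name__ if unit is None else f"{func.__name__} [{unit}]"


label = axis_label(co2eq_reduced)
```

Functions without the attribute fall back to the bare name, so the helper can be applied to undecorated functions as well.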

@denised denised added the enhancement New feature or request label Jul 25, 2021
@denised
Member Author

denised commented Jul 25, 2021

Another alternative would be to create types for units and add typing information. This would have the additional benefit of applying to arguments as well as results.
I am not sure whether this would provide the same level of support for a UI, however.
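One way the typing alternative might look, sketched with `typing.Annotated` (Python 3.9+). The aliases `Megatonnes` and `Hectares`, the function `emissions_avoided`, and the helper `unit_hints` are all assumptions made for illustration. The sketch suggests that unit metadata on arguments and results can be read back at run time, which is what a UI would need, but it does not settle how well that works in practice:

```python
from typing import Annotated, get_type_hints

# Hypothetical unit-carrying type aliases; the string is the unit symbol.
Megatonnes = Annotated[float, "Mt"]
Hectares = Annotated[float, "ha"]


def emissions_avoided(area: Hectares, rate: float) -> Megatonnes:
    # Placeholder arithmetic for illustration only.
    return area * rate


def unit_hints(func):
    """Return {name: unit} for each annotated parameter and the return value."""
    hints = get_type_hints(func, include_extras=True)
    return {
        name: hint.__metadata__[0]
        for name, hint in hints.items()
        if hasattr(hint, "__metadata__")  # skip plain, unit-less annotations
    }


hints = unit_hints(emissions_avoided)
```

Here `hints` maps `area` to `"ha"` and `return` to `"Mt"`, while the unit-less `rate` parameter is omitted.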

@gerald-scharitzer
Contributor

Do you have a list of non-SI units that are frequently used in Drawdown? So far I have seen the following, but I am not sure that I interpreted them correctly.

  • mass
    • mmt: metric megaton = 1e9 kg
    • Gt: gigatonne = 1e12 kg
    • Mt: megatonne = 1e9 kg
    • kt: kilotonne = 1e6 kg
  • scalar
    • ppm: parts per million = 1e-6
    • ppb: parts per billion = 1e-9

Do you also have samples of functions where they are used?
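The factors in the list above can be written down as a small lookup table. This is only a sketch that takes the listed interpretations at face value; the names `KILOGRAMS_PER_UNIT`, `FRACTION_PER_UNIT`, and `convert_mass` are hypothetical:

```python
# Factors exactly as listed above; the interpretations are unconfirmed.
KILOGRAMS_PER_UNIT = {"mmt": 1e9, "Gt": 1e12, "Mt": 1e9, "kt": 1e6}
FRACTION_PER_UNIT = {"ppm": 1e-6, "ppb": 1e-9}


def convert_mass(value, source, target):
    """Convert a mass between two listed units by going through kilograms."""
    return value * KILOGRAMS_PER_UNIT[source] / KILOGRAMS_PER_UNIT[target]


gigatonnes_in_megatonnes = convert_mass(2, "Gt", "Mt")
```

Under these factors, `mmt` and `Mt` are the same quantity, so converting between them leaves the value unchanged.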

@denised
Member Author

denised commented Nov 21, 2021 via email

@gerald-scharitzer
Contributor

gerald-scharitzer commented Nov 21, 2021

I understand that solution modules specify their units in the dictionary named units as in solution/airplanes/__init__.py.

Also, several model functions utilize a naming convention, where the second qualifier (first_second_*) specifies the unit as in co2eq_mmt_reduced, where mmt specifies million metric tonnes.

I recommend sticking with your original idea of decorating functions or data frames with their units, because this can be done independently of naming conventions. This will help us keep interfaces stable once we reach version 1.0.0.

Function names and docstrings are unstructured and would have to be parsed based on some to-be-defined grammar.

Attributes on functions or data frames can be more structured and machine-readable to also serve UIs for labeling table columns and chart axes.

I will look at unitadoption.py from that perspective.

@denised
Member Author

denised commented Nov 22, 2021 via email

@gerald-scharitzer
Contributor

Pandas does not provide direct support for units. An issue for this has been open since 2015, and to me it does not look like it will be resolved anytime soon.

Since not only data frames can have units (more specifically the data series therein), but also functions, attributes, and variables, I do not want to inject a units attribute into all of those or subclass them.

I have something less invasive in mind and will try some things. This may take a while.

@denised
Member Author

denised commented Nov 24, 2021

@gerald-scharitzer I look forward to seeing what you come up with.

@gerald-scharitzer
Contributor

So far I am mapping data frame columns (series) to units. Next, I will use those to decorate table columns and chart axes when rendering them for display, to get some practical use cases for user interfaces.
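A rough sketch of what such a column-to-unit mapping could look like when rendering labels. The column names, the units, and the helper `header_labels` are assumptions, since the thread does not show the real mapping:

```python
# Hypothetical column names and units; the thread does not list the real ones.
COLUMN_UNITS = {"co2eq_reduced": "Mt", "land_area": "ha"}


def header_labels(columns, units=COLUMN_UNITS):
    """Map each column name to a display label, appending its unit if known."""
    return {
        name: f"{name} [{units[name]}]" if name in units else name
        for name in columns
    }


labels = header_labels(["year", "co2eq_reduced", "land_area"])
```

Because the result maps old names to new ones, it could be passed to pandas `DataFrame.rename(columns=labels)` for a table, or a single value could be used as a chart axis label.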

@gerald-scharitzer
Contributor

gerald-scharitzer commented Nov 28, 2021

Labeling chart axes with units works, but labeling table headers does not work yet. I will improve the units API and then come back to the table visualization.

@havarsak

havarsak commented Jan 6, 2022

Just to throw in my 2c: this would be really useful!
Especially if the more advanced ideas are covered (carrying units through functions and graphs, or through multiplication and division). But just being able to specify the unit of a dataframe column is useful, as a reminder of what it was.
Just being able to add tags or information to columns would also be highly useful, to keep track of what's stored in them.

@gerald-scharitzer
Contributor

I am back from the end-of-year madness and working on it right now. More specifically, I am working on the prefix API, such that for example "Mt" is megatonnes and thus 1e6 tonnes and 1e9 kilograms.

After that I will have an API for dimensions, units, and prefixes. That should be enough for a first release to integrate into Drawdown.
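The prefix arithmetic can be sketched as follows. This is not the API described above, which the thread does not show; `PREFIX_FACTORS`, `KILOGRAMS_PER_TONNE`, and `tonnes` are hypothetical names, and only the tonne is handled:

```python
# SI prefix factors for the prefixes mentioned in this thread.
PREFIX_FACTORS = {"k": 1e3, "M": 1e6, "G": 1e9}
KILOGRAMS_PER_TONNE = 1e3


def tonnes(symbol):
    """Return the number of tonnes in one unit of a symbol such as 'Mt'."""
    prefix, unit = symbol[:-1], symbol[-1:]
    if unit != "t":
        raise ValueError(f"not a tonne-based unit: {symbol!r}")
    # An empty prefix means the plain tonne.
    return PREFIX_FACTORS[prefix] if prefix else 1.0


megatonne_in_tonnes = tonnes("Mt")
megatonne_in_kilograms = tonnes("Mt") * KILOGRAMS_PER_TONNE
```

A megatonne comes out as 1e6 tonnes, or 1e9 kilograms once the tonne itself is expanded.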

@gerald-scharitzer
Contributor

I have a reasonably stable API now. Next, I will test this in the context of the solutions.

@gerald-scharitzer
Contributor

It also works in this context. Next, I will test it with some interesting use cases.

@gerald-scharitzer
Contributor

gerald-scharitzer commented Feb 5, 2022

I found a bug in my handling of weak references and will fix that now.
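For readers unfamiliar with the technique, here is a sketch of how a weak-reference registry can map objects to units without adding attributes to them or subclassing them. It is an assumption about the general approach, not the actual implementation; `set_unit`, `get_unit`, and `Scenario` are hypothetical names:

```python
import gc
import weakref

# Keys are held weakly, so the registry never keeps an object alive.
_UNITS = weakref.WeakKeyDictionary()


def set_unit(obj, unit):
    """Associate a unit with any hashable, weak-referenceable object."""
    _UNITS[obj] = unit


def get_unit(obj, default=None):
    """Look up the unit of an object, or return the default if unmapped."""
    return _UNITS.get(obj, default)


class Scenario:
    """Stand-in for a user-defined object."""


scenario = Scenario()
set_unit(scenario, "ha")
unit_while_alive = get_unit(scenario)

# Once the last strong reference is gone, the entry disappears as well.
del scenario
gc.collect()  # not needed on CPython, but harmless on other interpreters
entries_after_delete = len(_UNITS)
```

The trade-off is that only weak-referenceable objects can be registered; plain `int` or `str` values, for example, cannot serve as keys.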

@gerald-scharitzer
Contributor

Hi @denised, my proposal to map functions and user-defined objects to units looks like this. Please let me know whether this is going in the right direction for you, or whether you want something else.

@denised
Member Author

denised commented Feb 9, 2022

@gerald-scharitzer I like this very much. Please proceed!

@gerald-scharitzer
Contributor

So far I have decorated only one function and have not used the unit anywhere except in the Jupyter notebook that demonstrates potential use cases. Do you have any actual use cases in this repository in mind where you think this would be useful?

@denised
Member Author

denised commented Feb 15, 2022

(I thought I had answered earlier -- must not have hit send?)

The Jupyter notebook scenario is the biggest one right now. And I anticipate (but have not designed) some tools that generate interesting charts and use this functionality to auto-populate legends.

...But there are a lot of functions that could be annotated, and it would make sense to focus on an initial segment. Two that pop to mind are CO2e and Hectares (either one would be a good starting place)

@gerald-scharitzer
Contributor

No problem, meanwhile I improved my test coverage.
In this context, I would document in the units notebook that a function not yet mapped to a unit can easily be mapped with the @map_to_unit decorator.
This way, we enable developers to use the feature when the demand or a use case arises.
At the same time we always have the option to launch a campaign to decorate the functions of specific modules or packages.
Or do you want to keep this issue open until all model functions are decorated?
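To illustrate the workflow, here is a stand-in for such a decorator. Only the name `map_to_unit` comes from this thread; its signature, the string unit argument, the registry, and the `unit_of` lookup are assumptions and may differ from the units-of-measure package:

```python
# Stand-in registry; the real package may store the mapping differently.
_UNIT_BY_FUNCTION = {}


def map_to_unit(unit):
    """Hypothetical signature: decorator that maps a function to a unit."""
    def decorate(func):
        _UNIT_BY_FUNCTION[func] = unit
        return func  # the function itself is returned unchanged
    return decorate


def unit_of(func):
    """Return the mapped unit, or None for a function not mapped yet."""
    return _UNIT_BY_FUNCTION.get(func)


@map_to_unit("ha")
def land_area():
    # Placeholder value for illustration only.
    return 100.0


def co2eq_reduced():
    # Placeholder value; defined without a unit at first.
    return 1.5


unit_before = unit_of(co2eq_reduced)
# Mapping can happen later, without editing the function definition.
map_to_unit("Mt")(co2eq_reduced)
unit_after = unit_of(co2eq_reduced)
```

Applying the decorator as an ordinary function call is what lets units be added on demand, one function at a time.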

@denised
Member Author

denised commented Feb 17, 2022

@gerald-scharitzer I think your suggestion is good. We'll put the stake in the ground, and then add more built-in functionality as we have time to invest in it.

@gerald-scharitzer
Contributor

Then I will release version 1.0 of units-of-measure and create the pull request for the branch units of the solutions.
