Unit conventions #12

Closed
markbandstra opened this issue Jan 31, 2017 · 12 comments

Comments

@markbandstra
Member

Following up on our discussion about using pint, I looked a bit more into its usability in our ecosystem. It seems very tightly coupled with numpy and uncertainties to the point where using it is nearly transparent, but its integration with pandas is poor. Quantities with both uncertainties and units can be stored in pandas data structures, but they need to be handled one-by-one instead of as an entire DataFrame or Series. (This is a known issue with pandas not supporting different methods of incorporating units.)

This could be a deal-breaker if we decide to rely on pandas heavily in this project.

I, for one, am not a big pandas user, so for me the cost-benefit of using pint is weighted heavily toward benefit. For example, is this spectrum in units of counts, or counts per second, or counts per second per keV? Is this branching ratio a percentage or a dimensionless number? I have an activity in becquerels; how do I do the conversion to mCi again?
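For comparison, here is a minimal sketch of that last conversion done by hand, without pint. The constant and function names are hypothetical and not part of any existing module; the only physical fact assumed is the definition 1 Ci = 3.7e10 Bq.

```python
# Hypothetical helper names; only the definition 1 Ci = 3.7e10 Bq is assumed.
BQ_PER_CI = 3.7e10              # becquerels per curie (exact by definition)
BQ_PER_MCI = BQ_PER_CI * 1e-3   # becquerels per millicurie


def bq_to_mci(activity_bq):
    """Convert an activity in becquerels to millicuries."""
    return activity_bq / BQ_PER_MCI


def mci_to_bq(activity_mci):
    """Convert an activity in millicuries to becquerels."""
    return activity_mci * BQ_PER_MCI
```

The catch with this approach is that nothing stops a caller from passing a value that is already in mCi; the unit lives only in the name and the docstring.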

@markbandstra
Member Author

markbandstra commented Jan 31, 2017

Here is a script that uses pint for various types of quantities; please give it a try.

@bplimley
Contributor

bplimley commented Feb 2, 2017

I'm leaning toward not wanting to use pint, instead relying on clear conventions, clear variable names, and good testing.

A similar question, though, is whether we want to use uncertainties for uncertainties? It could save a lot of manual error propagation which has potential for bugs. The API doesn't have to rely on it, I think, but we could use uncertainties under the hood.

@markbandstra
Member Author

I am a huge fan of uncertainties. If you're just using Gaussian error propagation, it takes care of all that for you. If you're doing something fancier (e.g., asymmetric error bars), you would want to write your own code anyway.
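To make concrete what gets automated, here is a rough sketch of the hand-written first-order Gaussian propagation for a simple ratio. It uses only the standard library, not the uncertainties package; the function name and the example numbers are made up for illustration.

```python
import math


def ratio_with_uncertainty(a, sigma_a, b, sigma_b):
    """Return (a / b, sigma), assuming independent Gaussian errors on a and b.

    Both a and b must be nonzero, because relative uncertainties are used.
    """
    value = a / b
    # First-order propagation: relative uncertainties add in quadrature.
    rel = math.sqrt((sigma_a / a) ** 2 + (sigma_b / b) ** 2)
    return value, abs(value) * rel


# Illustrative numbers: 400 +/- 20 counts measured over 10.0 +/- 0.1 s.
rate, sigma_rate = ratio_with_uncertainty(400.0, 20.0, 10.0, 0.1)
```

Every new formula needs its own version of this bookkeeping, which is where the potential for bugs comes from.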

@bplimley
Contributor

I did implement pint in an electron range module that had some conversions between energies, lengths, densities, and mass thicknesses, in order to try pint for myself. It is elegant in some ways, but other things I don't like (see below). So I still vote that we avoid using pint.

Specifically:

  1. If you divide a keV quantity by a MeV quantity, the dimensions don't automatically cancel; you need to apply a method like to_base_units().
  2. You're not supposed to use quantities from different unit registries (1 bottom of page), which makes it hard for the user to pass an input argument with units, because the user will be working from a different unit registry.

markbandstra added a commit that referenced this issue Mar 6, 2017
@markbandstra
Member Author

I have become convinced that pint is too much of a hassle for our purposes. Perhaps in the future it might have some use, but I agree that we should just use conventions and testing. I have removed the pint dependency from the xcom branch.

My proposal is:

  • All energies will be in keV
  • All other units are in CGS
  • Any special unit cases to be handled on a case-by-case basis

@bplimley
Contributor

bplimley commented Mar 6, 2017

(CGS? You're such an astrophysicist...)
I'm fine with CGS. Most of the lengths we work with are small (on the order of the size of detectors), and if densities are involved then g/cm³ makes sense.

Until you suggest we use ergs for something. There I'll draw the line. ;-)

@bplimley
Contributor

bplimley commented Mar 6, 2017

What about a variable naming convention? Do we still suffix energy variables with _kev, lengths with _cm, etc.?
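As a sketch of what the suffix idea might look like under the keV and CGS conventions (all function and variable names below are hypothetical, and the numbers are only example values):

```python
# Hypothetical names, illustrating the unit-suffix idea only.
def mass_thickness_g_cm2(density_g_cm3, thickness_cm):
    """Return mass thickness in g/cm^2 from density (g/cm^3) and thickness (cm)."""
    return density_g_cm3 * thickness_cm


def mev_to_kev(energy_mev):
    """Convert an energy from MeV to the project-wide keV convention."""
    return energy_mev * 1000.0


# The suffixes make a unit mismatch visible at the call site.
energy_kev = mev_to_kev(0.662)
areal_density_g_cm2 = mass_thickness_g_cm2(density_g_cm3=5.323, thickness_cm=1.5)
```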

@markbandstra
Member Author

Good question. I am worried that appending units to variable names could make things too verbose, but I do see the utility of doing that. What do others think?

markbandstra reopened this Mar 6, 2017
@bplimley
Contributor

bplimley commented Mar 6, 2017 via email

bplimley changed the title from "Decide whether to use pint for units" to "Unit conventions" Mar 7, 2017
markbandstra added a commit that referenced this issue Apr 18, 2017
Units are noted in the DataFrame column names for now, e.g., "Energy
Level (MeV)".
markbandstra added a commit that referenced this issue Apr 21, 2017
Units are noted in the DataFrame column names for now, e.g., "Energy
Level (MeV)".
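A column label such as "Energy Level (MeV)" can still be used programmatically. The following is a rough, standard-library-only sketch of pulling the unit out of such a label and converting values to the keV convention. The helper names and the small conversion table are assumptions made for illustration, not code from the referenced commits.

```python
import re

# Assumed conversion factors to the keV convention; extend as needed.
TO_KEV = {"eV": 1e-3, "keV": 1.0, "MeV": 1e3}

# Matches a label that ends in a parenthesized unit, e.g. "Energy Level (MeV)".
_LABEL_RE = re.compile(r"^(?P<name>.*?)\s*\((?P<unit>[^()]+)\)\s*$")


def split_label(label):
    """Split 'Energy Level (MeV)' into ('Energy Level', 'MeV').

    Returns (label, None) when no trailing '(unit)' is present.
    """
    match = _LABEL_RE.match(label)
    if match is None:
        return label, None
    return match.group("name"), match.group("unit")


def column_to_kev(label, values):
    """Convert energy values to keV using the unit in the column label."""
    name, unit = split_label(label)
    if unit not in TO_KEV:
        raise ValueError(f"no keV conversion known for unit {unit!r}")
    return f"{name} (keV)", [v * TO_KEV[unit] for v in values]
```

The same functions could be applied to the column names of a DataFrame, but that wiring is left out here to keep the sketch dependency-free.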
@markbandstra
Member Author

Can we re-close this issue? I think we have decided to prioritize simplicity over using pint, through a combination of clear conventions, good documentation, and subscript hints.

@bplimley
Contributor

I only hesitated because I wasn't very clear on when to name a variable with e.g. _kev and when not to. But I'm okay closing it and revisiting the variable naming once we have more code to have a feel for it.

@markbandstra
Member Author

Sounds good.

2 participants