Support boost_histogram/hist's categorical axes #672

Closed
jpivarski opened this issue Aug 11, 2022 · 0 comments · Fixed by #764
Labels
feature New feature or request good first issue Good for newcomers

Inspired by #659 (@jlabounty). Currently, the boost-histogram/hist → ROOT TH* conversion takes a categorical axis with n categories as a continuous regular axis of n bins from 0 to n-1 (inclusive). ROOT TH* objects have no notion of categorical axes, but they do have fLabels, which can be set to a list of strings. When this value is set to n strings, ROOT draws these strings under each tick, effectively simulating a categorical axis.

This feature request would be (1) to have boost-histogram/hist → ROOT TH* set fLabels, somewhere here (may have to be expanded to a regular for loop):

# convert all axes in one list comprehension
axes = [
    to_TAxis(
        fName=getattr(axis, "name", default_name),
        fTitle=getattr(axis, "label", getattr(obj, "name", "")),
        fNbins=len(axis),
        fXmin=axis.edges[0],
        fXmax=axis.edges[-1],
        fXbins=_fXbins_maybe_regular(axis, boost_histogram),
    )
    for axis, default_name in zip(obj.axes, ["xaxis", "yaxis", "zaxis"])
]
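
As a rough sketch of part (1), the labels could be computed per axis before each TAxis is built. The helper name `category_labels` and the stand-in class below are hypothetical. With the real library, `categorical_types` would presumably be `(boost_histogram.axis.StrCategory, boost_histogram.axis.IntCategory)`, and the sketch assumes that iterating over such an axis yields its categories. Whether `to_TAxis` accepts the result as an `fLabels` argument is also an assumption to check against its signature.

```python
def category_labels(axis, categorical_types):
    """Return the axis categories as strings, or None for a non-categorical axis.

    `categorical_types` is a tuple of classes treated as categorical
    (an assumption: the boost-histogram category axis classes).
    """
    if isinstance(axis, categorical_types):
        # fLabels can only hold strings, so every category is stringified
        return [str(category) for category in axis]
    return None


class StandInStrCategory(list):
    """Hypothetical stand-in for a categorical axis; iterates over its categories."""


# a categorical axis yields one label per category
print(category_labels(StandInStrCategory(["signal", "background"]), (StandInStrCategory,)))

# any other axis yields None, so no labels would be set
print(category_labels([0.0, 1.0, 2.0], (StandInStrCategory,)))
```

Returning None for non-categorical axes keeps the existing behavior for regular and variable axes unchanged.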

And (2) to interpret a TAxis for which fLabels is not None and len(fLabels) != 0 as a boost-histogram/hist categorical axis when converting the other way:

def _boost_axis(axis, metadata):
    boost_histogram = uproot.extras.boost_histogram()
    fNbins = axis.member("fNbins")
    fXbins = axis.member("fXbins", none_if_missing=True)
    if axis.member("fLabels") is not None:
        out = boost_histogram.axis.StrCategory([str(x) for x in axis.member("fLabels")])
    elif fXbins is None or len(fXbins) != fNbins + 1:
        out = boost_histogram.axis.Regular(
            fNbins,
            axis.member("fXmin"),
            axis.member("fXmax"),
            underflow=True,
            overflow=True,
        )
    else:
        out = boost_histogram.axis.Variable(fXbins, underflow=True, overflow=True)
    for k, v in metadata.items():
        setattr(out, k, axis.member(v))
    return out
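
The branch order for part (2), including the empty-list check from the condition stated before the function, can be sketched without any library dependency. The function name `axis_kind` and its string return values are hypothetical. It only mirrors the decision logic of `_boost_axis` and takes the already-extracted member values as plain arguments.

```python
def axis_kind(fNbins, fXbins, fLabels):
    """Decide which kind of axis a TAxis should become.

    Returns "category", "regular", or "variable" (illustrative names only).
    """
    # a non-empty list of labels means the axis simulates a categorical axis
    if fLabels is not None and len(fLabels) != 0:
        return "category"
    # missing or inconsistent bin edges mean evenly spaced bins
    if fXbins is None or len(fXbins) != fNbins + 1:
        return "regular"
    return "variable"


print(axis_kind(3, None, ["a", "b", "c"]))    # labels present
print(axis_kind(3, None, []))                 # empty labels are ignored
print(axis_kind(3, [0.0, 1.0, 2.5, 4.0], None))  # explicit bin edges
```

Checking the length as well as `is not None` means an axis with an empty label list falls through to the regular or variable branch.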

We probably won't be able to preserve the int-ness of IntCategory (fLabels are strictly strings), but if none of the strings raises ValueError when cast to a Python int, that would be a good indicator.
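
A minimal sketch of that heuristic, assuming the labels have already been converted to Python strings. The helper name `labels_as_ints` is hypothetical.

```python
def labels_as_ints(labels):
    """Return the labels as a list of ints if every one parses, else None."""
    try:
        return [int(label) for label in labels]
    except ValueError:
        # at least one label is not an integer literal: keep them as strings
        return None


print(labels_as_ints(["1", "2", "3"]))
print(labels_as_ints(["1", "two", "3"]))
```

A result of None would suggest a string category axis, and a list of ints would suggest an integer one. The heuristic cannot tell an integer category axis from a string category axis whose labels happen to look like integers, so the round trip is only approximate.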
