This repository has been archived by the owner on Mar 31, 2019. It is now read-only.

Implementation of export to TH2 #47

Closed
clelange opened this issue Aug 24, 2018 · 4 comments

Comments

@clelange
Contributor

Hi, I've been playing around with uproot and histbook, and I find it nice for data exploration.
However, I would then like to use RooFit in my analysis workflow, so I usually export the histogram to ROOT (and then import it as a RooDataHist).
When trying to export a 2-dimensional Hist, defined e.g. as

hp_dwcReferenceType_ntracks = Hist(bin("dwcReferenceType", 16, -.5, 15.5), profile("ntracks"))

or

h2_dwcReferenceType_ntracks = Hist(bin("dwcReferenceType", 16, -.5, 15.5), bin("ntracks", 2, -.5, 1.5))

I get a NotImplementedError: TH2 (from https://github.com/scikit-hep/histbook/blob/master/histbook/export.py#L290-L292).

Will exporting to TH2 be available soon and/or do you have a timescale for that?

@jpivarski
Member

Without this request, the timescale would be a few months because of backlog.

Would you be interested in implementing it yourself and submitting a pull request? It may be easier than you think. Do you see the NotImplementedError in export.py, as well as the code for handling one-dimensional histograms and profiles above it? The only tricky thing is underflow/overflow handling: ROOT treats the 0th index as underflow, and histbook allows the underflow/overflow/nanflow to not exist, which means that the meaning of the bins shifts by one depending on whether there's an underflow or not. (ROOT has no concept of nanflow, so ignore it.) However, the one-dimensional implementation illustrates this pattern.

The one thing that the one-dimensional version doesn't handle is multiple dimensions. The content that all dimensionalities use as input is from Hist.table, a high-level, user-facing function that makes Numpy record arrays with a shape that matches the dimensionality of the histogram. Just as with the one-dimensional case, you'd be able to pick out "count()" and "err(count())" (as a record array) but also use nested indexes to pick out bins in a simple double-for loop (accounting for the possible missing underflow at each level). The hard part might be the ROOT multidimensional bin indexing. (I don't know if ROOT sets bin contents with a serialized bin index, a multidimensional one, or both.)
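To make the index shifting concrete, here is a minimal NumPy-only sketch, not histbook's actual export code. It assumes the counts have already been pulled out of the table as a plain 2-d array, and that each axis stores its slots in the order underflow, regular bins, overflow, nanflow, with any of the three flow slots possibly absent (that ordering is an assumption to check against export.py). The helper names `pad_axis`, `to_root_layout` and `global_bin` are hypothetical. On the ROOT side, TH2 accepts both `SetBinContent(binx, biny, value)` and a serialized global bin, which `TH2::GetBin` computes as `binx + (nx + 2) * biny`.

```python
import numpy as np

def pad_axis(values, axis, nbins, underflow, overflow, nanflow):
    """Return `values` with exactly nbins + 2 slots along `axis` (ROOT layout).

    Assumed input order along the axis: [underflow], bins..., [overflow], [nanflow],
    where each bracketed slot exists only if its flag is True.
    """
    values = np.asarray(values, dtype=float)
    expected = nbins + int(underflow) + int(overflow) + int(nanflow)
    if values.shape[axis] != expected:
        raise ValueError("axis %d has %d slots, expected %d"
                         % (axis, values.shape[axis], expected))
    if nanflow:
        # ROOT has no nanflow bin, so drop the last slot.
        values = np.delete(values, -1, axis=axis)
    # Add a zero-filled slot wherever underflow or overflow is missing.
    pad = [(0, 0)] * values.ndim
    pad[axis] = (0 if underflow else 1, 0 if overflow else 1)
    return np.pad(values, pad, mode="constant")

def to_root_layout(counts, nx, ny, xflags, yflags):
    """Pad both axes; xflags/yflags are (underflow, overflow, nanflow) tuples."""
    out = pad_axis(counts, 0, nx, *xflags)
    return pad_axis(out, 1, ny, *yflags)

def global_bin(binx, biny, nx):
    """Serialized bin index, mirroring TH2::GetBin(binx, biny)."""
    return binx + (nx + 2) * biny

# Example: x axis has an underflow slot only, y axis has no flow slots at all.
counts = np.arange(6).reshape(3, 2)
padded = to_root_layout(counts, nx=2, ny=2,
                        xflags=(True, False, False),
                        yflags=(False, False, False))
print(padded.shape)  # (4, 4): index 0 is underflow and index 3 is overflow on each axis
```

With the array in this layout, the double for loop needs no special cases: `padded[i, j]` goes straight into `SetBinContent(i, j, ...)` for `i` in `range(nx + 2)` and `j` in `range(ny + 2)`, and the same padding can be applied to the errors.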

@clelange
Contributor Author

Hi Jim,

OK, that sounds doable, but I won't be able to look into this before September, and then the timescale would be weeks. I'll write again once I've started looking into it. With TH2 implemented, TH3 should be straightforward.

@jpivarski
Member

Thanks, and I understand about the timescale!

I'll leave this issue open, which you can use to ask me any questions about it and other users can follow the development (or take it over if they need 2-d histograms on a shorter timescale).

@jpivarski
Member

Thanks!
