DeserializationError too frequent when HistoPath is specified in HiFa XML #1690

Closed
1 task done
kratsg opened this issue Nov 10, 2021 · 0 comments · Fixed by #1691
Labels
bug (Something isn't working), perf (A code change that improves performance)

Comments

@kratsg
Contributor

kratsg commented Nov 10, 2021

Summary

When a `<Sample>` specifies a non-empty HistoPath, pyhf frequently hits a DeserializationError from uproot, and uproot is very slow to raise that exception (see scikit-hep/uproot5#504).

pyhf needs to be smarter about how it checks for valid keys and, in particular, should fix its logic so that it neither hits nor relies on DeserializationError.
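
A minimal sketch of the kind of key check meant here, assuming the set of keys is read once per file and then reused. This is only an illustration, not pyhf's actual code: `find_histogram`, `fake_file`, and `fake_keys` are hypothetical names, and a plain dict stands in for the opened ROOT file rather than a real uproot object.

```python
from typing import Any, Mapping, Set


def find_histogram(
    root_file: Mapping[str, Any], keys: Set[str], path: str, name: str
) -> Any:
    """Look up a histogram via membership tests instead of try/except."""
    # Normalize the HistoPath: treat None as empty, drop surrounding slashes.
    path = (path or "").strip("/")
    fullname = f"{path}/{name}" if path else name

    # Consult the precomputed key set first, so a missing key never
    # triggers a failed (and potentially slow) read on the file object.
    if name in keys:
        return root_file[name]
    if fullname in keys:
        return root_file[fullname]
    raise KeyError(f"Neither {name!r} nor {fullname!r} found in file")


# Hypothetical stand-in for an opened ROOT file: a dict keyed by object path.
fake_file = {"CR_hists/data/hData_obs": [1.0, 2.0, 3.0]}
fake_keys = set(fake_file)  # computed once per file, then reused

hist = find_histogram(fake_file, fake_keys, "CR_hists/data", "hData_obs")
```

The point of the design is that a lookup which would fail is rejected by a cheap set membership test, so the exception path is only taken when the histogram is missing under both names.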

OS / Environment

$ system_profiler -detailLevel mini SPSoftwareDataType | head -n 6
Software:

    System Software Overview:

      System Version: macOS 10.14.6 (18G9323)
      Kernel Version: Darwin 18.7.0

Steps to Reproduce

See #1687 for the fundamental issue. For now this can only be reproduced with private workspaces, but it has been confirmed to be reproducible.

$ time pyhf xml2json monotop_twmetComb0L1LBoosted_allCRs_normDStoDR_unblind_sigTheo_envelope__pmoder_sig_a250_DM10_H900_tb1_st0p7/NormalMeasurement.xml --basedir monotop_twmetComb0L1LBoosted_allCRs_normDStoDR_unblind_sigTheo_envelope__pmoder_sig_a250_DM10_H900_tb1_st0p7
Processing ./NormalMeasurement_CR_0LBoosted_ttbar_cuts.xml:   0%|                           | 0/1 [00:00<?, ?channel/s
monotop_twmetComb0L1LBoosted_allCRs_normDStoDR_unblind_sigTheo_envelope__pmoder_sig_a250_DM10_H900_tb1_st0p7/results/monotop_twmetComb0L1LBoosted_allCRs_normDStoDR_unblind_sigTheo_envelope__pmoder_sig_a250_DM10_H900_tb1_st0p7/Exclusion_combined_NormalMeasurement_model.root not in filecache
path=CR_0LBoosted_ttbar_cuts_hists/data, name=hData_CR_0LBoosted_ttbar_obs_cuts
deserialization error, trying fullname=CR_0LBoosted_ttbar_cuts_hists/data/hData_CR_0LBoosted_ttbar_obs_cuts instead
monotop_twmetComb0L1LBoosted_allCRs_normDStoDR_unblind_sigTheo_envelope__pmoder_sig_a250_DM10_H900_tb1_st0p7/results/monotop_twmetComb0L1LBoosted_allCRs_normDStoDR_unblind_sigTheo_envelope__pmoder_sig_a250_DM10_H900_tb1_st0p7/Exclusion_combined_NormalMeasurement_model.root in filecache
path=CR_0LBoosted_ttbar_cuts_hists/Top0LBoosted, name=hTop0LBoostedNom_CR_0LBoosted_ttbar_obs_cuts
deserialization error, trying fullname=CR_0LBoosted_ttbar_cuts_hists/Top0LBoosted/hTop0LBoostedNom_CR_0LBoosted_ttbar_obs_cuts instead
monotop_twmetComb0L1LBoosted_allCRs_normDStoDR_unblind_sigTheo_envelope__pmoder_sig_a250_DM10_H900_tb1_st0p7/results/monotop_twmetComb0L1LBoosted_allCRs_normDStoDR_unblind_sigTheo_envelope__pmoder_sig_a250_DM10_H900_tb1_st0p7/Exclusion_combined_NormalMeasurement_model.root in filecache
path=CR_0LBoosted_ttbar_cuts_hists/Top0LBoosted, name=hTop0LBoostedEG_EffLow_CR_0LBoosted_ttbar_obs_cutsNorm
deserialization error, trying fullname=CR_0LBoosted_ttbar_cuts_hists/Top0LBoosted/hTop0LBoostedEG_EffLow_CR_0LBoosted_ttbar_obs_cutsNorm instead

File Upload (optional)

No response

Expected Results

`pyhf xml2json` should be fast.

Actual Results

`pyhf xml2json` is slow.

pyhf Version

This impacts all pyhf versions up to 0.6.4.

Code of Conduct

  • I agree to follow the Code of Conduct
kratsg added the bug and needs-triage labels on Nov 10, 2021
matthewfeickert added the perf label and removed the needs-triage label on Nov 10, 2021