-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Re-implementation of some of the ATLAS Collider DY Datasets #1866
base: master
Are you sure you want to change the base?
Conversation
@scarlehoff, is something maybe wrong with the loading when the treatment of the systematics is set to Comparing the old:
with the new implementation:
while you can see that dumped values are exactly the same (modulo the first column
|
It does look wrong, specially since there's nothing that would justify a 10^-7, isn't there? The data is all > 1 so it cannot be a add vs mult problem, let me have a look. |
By just converting the percentage (
So I think it is really a difference between how the systematics are represented (unless I am doing something stupid here). |
The I think the difference might be that you are implementing the multiplicative uncertainties as % or relative, while in the new commondata format they should be implemented always as absolute. I thought that we had added this to the documentations but it seems we didn't. Let me update it! |
This definitely explains why using absolute
Is this actually correct (I don't think so!)? If the values are quoted as absolute then their treatment have to be |
Regardless on how they are given in hepdata (they could tell you it's a relative value but give you a table with the absolute values) you can convert them to absolute. I honestly don't remember why we went for everything absolute, I guess it is more consistent this way. |
Right. I just want to emphasize that if everything now is given as absolute, then only the treatment Re everything absolute, we might want to keep in mind the following sentence from the docs:
|
Why? The first thing the parser does is to make it relative to the central values. The way it is written in the actual file doesn't really matter that much. (that said... it makes it unreliable in closure tests? we need @enocera here!) |
But such extra-operation is not needed at all if everything is defined as Absolute |
Not for the t0 covmat. |
@scarlehoff, @enocera, this is also ready for review. Here is the report: https://vp.nnpdf.science/kuWT56KBSlai3_-XqLZxeA==/ For some datasets, I couldn't find the commondata and/or FK tables to compare to. |
The ones you didn't find the commondata for is because they have no corresponding old dataset, right? |
I understand that these are the 3D ATLAS distributions, of which we implemented only the 2D version. |
So let's forget about the 3D distributions, for the moment. |
Ok! Thanks. First comments, then I'll start going through all the old-new datasets one by one: What about these ones? Did you forget about them or are they part of another set (or maybe they have a different name in your list?)
And these four I think already asked you about, so I know you were not taking care of them but just for completeness:
|
So the status is then the following:
All in all, still a few to be done before this PR is complete 😅 |
kinematic_coverage: [_zero, mu2, sqrt_s] | ||
kinematics: | ||
variables: | ||
_zero: {description: "", label: "", units: ""} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_zero: {description: "", label: "", units: ""} | |
_Zero: {description: "", label: "", units: ""} |
(maybe I should automatically fill a column with zero
when one is missing... the constrain of having k1, k2, k3 is silly in both directions...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be best indeed! In this case it we don't overcrowd the implementation with spurious variables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you are at it, how to plot the results as a function of the Gauge bosons? Right now, it seems that plot_x
can only accept one of the kinematic variables. Here is an example using the corrected ATLAS_WZ_TOT_13TEV
: https://vp.nnpdf.science/B4_E9eBKRjadW8gBoZZZRw==/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not doing this yet! (but I will)
For the plot_x
, just do whatever was done in the previous plotting file. If things are not working it just means I had not encountered that situation before and I need to fix it.
(which might mean you have to use some specific kinematic override
or transformation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In some of the previous files, plot_x
are not defined, as is the case for the particular example above for instance https://github.com/NNPDF/nnpdf/blob/master/nnpdfcpp/data/commondata/PLOTTING_ATLAS_WZ_TOT_13TEV.yaml (not sure how exactly it works in such a case).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, for now it will complain!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so for the benefit of producing report comparisons I leave its value to some reasonable kinematic while waiting for the parser to accommodate this. I added a TODO in the description to not forget about this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, but don't worry for the comparisons for now, I'll try to fix by today! (I hope it's nothing supercomplicated)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(let me know when I should look at this again btw)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can look into this now, this is the only ATLAS dataset that not only has one kinematic variable that is zero but also should not have plot_x
(in the same way as before) as it should plotted as a function of the Gauge bosons.
In 1976617, I have both removed the _zero
from the kinematics and the plot_x
in the plotting.
PS: As for the remaining 4 datasets (inc the JETS), I will finish them by early next week.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the parser so that automatically repeats a column if one is missing.
Here's the report for the one with the weird plot_x
option: https://vp.nnpdf.science/fkjMDKmrSC6XpBXv7-mhHA==
I've repeated the tests and now I have:
old: ATLASWZRAP36PB vs new: ATLAS_DY_7TEV_EMUON_Y
# Differences in the computation of chi2 32.119931215298024 vs 32.21855155561812
The covmats are different
even the diagonal
old: ATLAS_DY_2D_8TEV_LOWMASS vs new: ATLAS_Z0_8TEV_LOWMASS_2D
> Everything ok
> old: ATLAS_WZ_TOT_13TEV vs new: ATLAS_DY_13TEV_FID
The t0 chi2 is different: 10934.218358793676 vs 81138.4387008005
> old: ATLASDY2D8TEV vs new: ATLAS_Z0_8TEV_HIGHMASS_2D
% difference in the data
Differences in the computation of chi2 80.2445870631963 vs 76.30021056788038
The covmats are different
even the diagonal
In the last one I've noticed the data itself is different at the level of few-per-mille which could be driving the difference (since a difference in the data will modify also the covmat through the multiplicative uncertainties).
For the one that has a very different t0 (but nothing else was different) I guess the MULT and ADD uncertainties are very wrong? Or I've done something else wrong...
The one that is only t0, might be a problem with MULT and ADD?
As usual, thanks a lot for the detailed checks! For the one with different As for the rests, the differences are exactly understood. Before implementing the legacy versions, maybe I am just missing something from the new hepdata (?), which we'd need @enocera. PS: I will also check the boson plotting now. |
Thanks @Radonirinaunimi, your last commit fixes the t0 issue. |
This is also now ready for review. For all of the datasets (except one), they have been implemented in the same way as in the old commondata (for legacy purposes), and comments are left in the table above to describe what I've found to be different wrt the hepdata. Nevertheless, sometimes, the numerical values of the correlated systematics (and even the central values) are not exactly equal because it might happen that the numerical values quoted in the hepdata tables are slightly different from the rawdata used in the old commondata. PS: there are only the |
When the results are different you can implement the hepdata one and then a Btw, did you check that when loading the entire set of datasets the associated covariance matrix is the same as the old (same for the datasets in the other PRs)? |
The issue that I am struggling at the moment is that I am not sure if it makes sense to have legacy versions for some particular datasets are not. And this is really one of the things we should discuss (cc @enocera). Let me provide two explicit examples:
Yes, for the datasets listed here and have a checkmark in the column comparison vs old. For some of the CMS datasets in #1869, it is a bit tricky because of the numerical differences mentioned in the first point as I tried to use as much as possible the hepdata files instead. |
e64d092
to
929b692
Compare
I'm going to rebase these datasets on top of the ones currently in master. @Radonirinaunimi I'll leave this as PR and not merge immediately in case you want to rollback the changes that you did for legacy purposes. Now we have the legacy version for reproduction as the copy from the old one but I think it is better in general to have the proper hepdata one as well |
276c240
to
baaeadb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I need some feedback here, some of the keys in the plotting dictionary refer to the wrong datasets. I guess you copied part of it since it is shared, but could you have a second look to make sure the rest is ok?
(if it is only the labels I can change those)
npoints: [48] | ||
plotting: | ||
kinematics_override: ewk_rap_sqrt_scale | ||
dataset_label: "ATLAS DY 2D 8 TeV low mass" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the high mass dataset, is the rest of the plotting data equal between the two?
npoints: [8, 11, 11] | ||
plotting: | ||
kinematics_override: ewk_rap_sqrt_scale | ||
dataset_label: "LHCb $W,Z \\to \\mu$ 8 TeV" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one also contains data from another ds.
npoints: [9, 6] | ||
plotting: | ||
kinematics_override: ewk_rap_sqrt_scale | ||
dataset_label: "LHCb $W,Z \\to \\mu$ 7 TeV" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and this one
That sounds good! I will revert back to before it produced the legacy versions. I guess in doing so, I will need to call the uncertainty files to something else? Thanks for the comments plotting metadata, I will have a second look at them and make sure they are fully correct. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since there are quite some changes to be made to the metadata of these files, I will move it to the right folder, with the right names, etc, and I'll leave you to review the labels / plotting options / etc @Radonirinaunimi
data/kinematics/uncertainties should be ok, I've used your data when it was compatible up to 10^-3 with the legacy data and the legacy data when it was between 10^-2 and 10^-3 (every difference was sub-% anyway)
ndata: 64 | ||
npoints: [8, 8, 8, 20, 20] | ||
plotting: | ||
kinematics_override: ewk_ptrap_sqrt_scale |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has also change with respect to what's in the old commondata files (which uses jet_sqrt_scale
)
The following PR implements some of the ATLAS collider DY in the new commondata format. The status is summarized in the table below.
☑️ in Comparison vs. Old means that the results are fully identical while 🔴 means that comparisons are available but noticeable differences are perceived.
Remains TODO: