Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generalize the APPLgrid exporter #285

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open

Conversation

janw20
Copy link
Collaborator

@janw20 janw20 commented Apr 30, 2024

Hi,
I implemented the cases not yet implemented in the APPLgrid exporter. This includes:

  1. Adding support for exporting PineAPPL grids with multi-dimensional and non-consecutive bins by exporting them as APPLgrids with integer bin limits 0..N. The ordering is the same as the PineAPPL bins, e.g. given by pineappl read --bins. An info message about the bin limits is printed to the user when a PineAPPL grid with multi-dimensional and non-consecutive bins is exported.
  2. Adding support for arbitrary Q2, x1, x2 grids
  3. Adding the option epsilon = 1e-12 to approx_eq! in lines 240 & 257. Otherwise, this failed for the grid I tested this with, for the two values 0.04979197630496172 and 0.04979197630496281 (the difference is only in the last 3 digits). Or would you handle this differently?

* Add support for multi-dimensional and non-consecutive bins by assigning them integer bin limits 0..N after informing the user
* Add support for arbitrary Q2, x1, x2 grids
* Use the `epsilon` option to of `approx_eq!` to not report a false error
@cschwan
Copy link
Contributor

cschwan commented May 2, 2024

Concerning 1): this is a good idea, it's surely better to do something than to error out. There's probably no meaningful alternative to the standard bin limits (0, 1), (1, 2), ... in the general case.

2): The problem with general bin limits is that APPLgrid requires a fixed functional form, which means that the bin limits x_i must be the results of a function x_i = f(i), and f is hard-coded in APPLgrid. In LagrangeSubgridV{1,2} we use the same function and therefore it happens to work in most cases. However, imagine changing the spacing between some of the points between x_min and x_max; in that case the APPLgrid won't give the right results. In other words: you only ever set x_min and x_max, but never the points in between, and they can be random.

3): I'd try changing the ulps parameter, there shouldn't be such a large difference (try doubling it).

For points 2) and 3) it would be good to have grids to test that with.

@janw20
Copy link
Collaborator Author

janw20 commented May 2, 2024

  1. Right, we could throw an error for now if a subgrid is not a LagrangeSubgridVx? Also, I did not know how to extract the interpolation order from the grid values, so I am just leaving it at the default value. Is there a way how to reconstruct the interpolation order from just the grid points? Or should the interpolation order maybe be stored in the grid files?
  2. At ulps ≈ 470 this works without the epsilon, below that it still fails for x = 0.000014007074397454824 and 0.00001400707439745562 in my case. Should I just set it to 512 or is there a more systematic way to determine it? The float_cmp author recommends to set both ulps and epsilon, where epsilon is a small multiple of f64::EPSILON. For me this starts working with about 5.0 * f64::EPSILON with ulps = 128.

@cschwan
Copy link
Contributor

cschwan commented May 2, 2024

  1. Right, we could throw an error for now if a subgrid is not a LagrangeSubgridVx? Also, I did not know how to extract the interpolation order from the grid values, so I am just leaving it at the default value. Is there a way how to reconstruct the interpolation order from just the grid points? Or should the interpolation order maybe be stored in the grid files?

We can't throw an error, because then optimized grids can't be exported. That would be a big loss.

There's no way to reconstruct the interpolation order from the grid points alone, you can use the 50 that we always use for different interpolation orders. However I don't think the interpolation order in the exported APPLgrid is important, usually you don't want to fill them after having them exported.

  1. At ulps ≈ 470 this works without the epsilon, below that it still fails for x = 0.000014007074397454824 and 0.00001400707439745562 in my case. Should I just set it to 512 or is there a more systematic way to determine it? The float_cmp author recommends to set both ulps and epsilon, where epsilon is a small multiple of f64::EPSILON. For me this starts working with about 5.0 * f64::EPSILON with ulps = 128.

Set it to 512. I don't think we need an epsilon comparison; I believe this is only needed when we compare against a zero result (0.0 != x for every non-zero x no matter the ulps parameter).

@janw20
Copy link
Collaborator Author

janw20 commented May 2, 2024

I could also add a --ulps argument to pineappl export, which lets the user set a different value than the value of 512, with 512 then being a hard-coded default value?

@cschwan
Copy link
Contributor

cschwan commented May 3, 2024

I could also add a --ulps argument to pineappl export, which lets the user set a different value than the value of 512, with 512 then being a hard-coded default value?

I don't think this is needed.

@janw20
Copy link
Collaborator Author

janw20 commented May 3, 2024

Okay, I set ulps = 512 now in the checks of the x grid and removed the epsilon argument. From my side this can be merged

@cschwan
Copy link
Contributor

cschwan commented May 4, 2024

What's missing are tests that check the features that you implemented. We need a PineAPPL grid

  1. with multidimensional bin limits that have bin widths not equal to one, and one
  2. with custom x-grid values.

Those two cases are possibly broken, so we better make sure to test it.


let x_grid = grid
.subgrids()
.slice(s![order, bin, ..])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably doesn't work quite as expected. APPLgrid stores different channels in a single appl_igrid, which has common parameter values NQ2, NQ2min, NQ2max, NQorder, Nx, xmin, xmax, xorder. This means that:

  • if several channels have different SubgridParam/ExtraSubgridParam values we should error out, unless this is a DIS grid in which we ignore the second dimension. In PineAPPL the second dimension can be either x1 or x2
  • the flat_map should therefore probably be a simple map and we must make sure that for all channels the entries are the same
  • the same applies also for mu_grid

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The OPTIMIZE_SUBGRID_TYPE optimization can change the ends of the x and mu2 grid points, right? So I would check if the grid points are equal modulo some endpoints, and then use the union as the x points that will be passed to APPLgrid?

Copy link
Contributor

@cschwan cschwan May 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would address one point of it, yes. It'll get interesting if you've got a single channel with different minimums/maximums.

.chain(limits.last().map(|vec| vec[0].1))
.collect();
let limits = if integer_bin_limits {
(0..=limits.len()).map(|x| x as f64).collect::<Vec<_>>()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here there's very likely a normalization factor missing. If you change the bin sizes you must correct the cross sections.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, thanks. Does APPLgrid also divide by the bin widths in the end?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICR yes!

pineappl_cli/src/export/applgrid.rs Outdated Show resolved Hide resolved
@janw20
Copy link
Collaborator Author

janw20 commented May 10, 2024

While working on this, I realized: Why are the SubgridParams reconstructed? Why aren't the SubgridParams from the Grid used, e.g. by exposing them in a pub fn subgrid_params()?

@cschwan
Copy link
Contributor

cschwan commented May 10, 2024

Why are the SubgridParams reconstructed? Why aren't the SubgridParams from the Grid used, e.g. by exposing them in a pub fn subgrid_params()?

Not every Subgrid necessarily has SubgridParams in the way understood by LagrangeSubgridV{1,2}. ImportOnlySubgridV{1,2} may have arbitrary x-node values, for instance linearly-spaced ones like [0.1, 0.2, 0.3, 0.4, 0.5]. These points are impossible to generate with the LagrangeSubgridV{1,2}.

@janw20
Copy link
Collaborator Author

janw20 commented May 10, 2024

linearly-spaced ones like [0.1, 0.2, 0.3, 0.4, 0.5]

But the exporter can't handle these anyway, since APPLgrid can't handle them, right? (At least that's how I understood your first comment in this pull request.) So it would only matter (for the exporter) how the LagrangeSubgridV{1,2} handles the SubgridParams.

The ExtraSubgridParams are also not much of an issue, I think, since APPLgrid doesn't support different x1 and x2 values.

Also, is the Grid::subgrid_params field updated when optimizing the grid? Because if not, using this field would even solve the issue of stripped-away grid points at the start and the end of the grid axes.

@janw20
Copy link
Collaborator Author

janw20 commented May 10, 2024

One could also properly document a potential Grid::subgrid_params() function, i.e. say that these are the parameters the subgrids were initialized with, not necessarily the current parameters.

@cschwan
Copy link
Contributor

cschwan commented May 10, 2024

linearly-spaced ones like [0.1, 0.2, 0.3, 0.4, 0.5]

But the exporter can't handle these anyway, since APPLgrid can't handle them, right? (At least that's how I understood your first comment in this pull request.) So it would only matter (for the exporter) how the LagrangeSubgridV{1,2} handles the SubgridParams.

Yes, APPLgrid only understands the grid-node values spaced as LagrangeSubgridV{1,2} uses them.

The ExtraSubgridParams are also not much of an issue, I think, since APPLgrid doesn't support different x1 and x2 values.

It does support DIS grids, however, where one of the is trivial (zero and usually one x-grid value in the unused x-dimension)

Also, is the Grid::subgrid_params field updated when optimizing the grid? Because if not, using this field would even solve the issue of stripped-away grid points at the start and the end of the grid axes.

No it's not updated, but you can't rely on this field, because you can merge any kind of subgrid into any Grid. Grid::subgrid_params is solely for the purpose of filling temporarily empty subgrids (and this is an optimization, see the implementation of Grid::fill).

@janw20
Copy link
Collaborator Author

janw20 commented May 10, 2024

No it's not updated, but you can't rely on this field, because you can merge any kind of subgrid into any Grid.

I see, this would be a showstopper for using Grid::subgrid_params then. Thanks for the clarification.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants