-
Notifications
You must be signed in to change notification settings - Fork 53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement per-axis periodicity and box size #276
Conversation
Hello @lgarrison! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2022-07-19 15:10:00 UTC |
Fantastic! I'll check this PR against halotools in the next few days and report back. |
I currently don't find agreement with halotools. Take the following code that counts pairs in spheres with radius of 10 in a uniformly sampled, non-cubic box.
I get agreement with halotools and the expectation value (within error) for cubic boundary conditions. For the above code with a non-cubic box, Corrfunc currently does not agree with halotools. However, if I only consider points with Z<40, there is agreement with halotools. So it seems that the periodicity along the z-axis might not be taken into account by Corrfunc in the above code. |
Awesome! Just tested the updated version and everything looks good. |
The current PR doesn't implement non-cubic boundary conditions for Corrfunc.theory functions other than DD, right? Is it possible to also add it for the other pair counters, especially DDsmu? |
Great! And yes, I haven't ported this feature to the other modules yet. It's not too bad, just involves echoing the @manodeep, can you review the change to DD before we proceed to the other modules? |
I took one quick look and noted some comments, but I need to look more closely. Different boxsizes per dimension might impact the min-separation + periodic features, particularly in the calculation of the cell-pairs. |
I agree, if you could check those areas of the code, that would be great. Fortunately, there were few places in the code that used |
…=-1 in that dimension
I've implemented the semantics from #276 (comment). Specifically, a boxsize of -1 along any dimension now means that dimension is not periodic. The surface area of the change was pretty minimal; it just affected the generation of the cell pairs. I've tested that turning off periodicity per-dimension works by comparing to the periodic answer where the boxsize along that dimension has been extended beyond the particle distribution, such that it is effectively non-periodic. The answers agree exactly. @manodeep, can you take a look? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright I have gone through and really have a couple of major comments
-
the
x/y/zdiff
calculation and in get_binsize might need to be switched back. This is in part due to not-so-great naming choices -x/y/zdiff
in the*impl.c.src
refers to periodic wrapping lengths, whilex/y/zdiff
ingridlink*
refers to an actualdiff = (xmax-xmin)
. -
would highly recommend against changing parameter semantics while keeping the function signature the same
-
think calls to generate_cell_pairs should invoke separate periodic_x/y/z rather than the replicated options->periodic
theory/DD/countpairs_impl.c.src
Outdated
const DOUBLE xdiff = (options->periodic && options->boxsize_x > 0) ? options->boxsize_x : (xmax-xmin); | ||
const DOUBLE ydiff = (options->periodic && boxsize_y > 0) ? boxsize_y : (ymax-ymin); | ||
const DOUBLE zdiff = (options->periodic && boxsize_z > 0) ? boxsize_z : (zmax-zmin); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I follow the logic here. It also seems that we might want to initialise to 0.0 for the cases where -1 has been passed as the box size.
Separately, this might be a good opportunity to clean up the naming for x/y/zdiff -> x/y/z_wrap or something like that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've renamed these to xwrap
now. These values are never used if a dimension is not periodic. We could set them to 0 in that case, but I'm not sure how to do that without adding another conditional, which I'm afraid would make the logic even harder to follow.
Thanks for the review; I think we've uncovered an unintended behavior in the binsize calculation predating this PR. Can you take a look at #276 (comment) and see if you agree? |
Hey @manodeep, can you take a look at this when you get the chance? |
I had looked at the comment and it seems fine. (Realised just now that I never "submitted" the comment - so it was pending ...) |
Great! The code now uses the spatial extent instead of the boxsize, and all the tests are passing, so I think we're ready to port to the other modules. Will work on that next week. |
@manodeep: xi and wp take a |
I did some more in-depth tests and now I get some errors but only sometimes. Let me see whether I can find out what exactly triggers it. |
@lgarrison I get errors when the number of tracers is very small. For example, if I take my code above and set
|
@johannesulf Corrfunc requires a minimum of 3 cells per dimension and so will crash based on the calculated number of cells -- check is here. It looks like the actual particle extent is significantly smaller than 100 (or 50) - e.g., the two tracers are separated by ~13 Mpc in Y (-> only 2 cells) and ~1e-3 in Z (-> only 1 cell). You can avoid this crash by explicitly passing the boxsize. |
I just checked, the crash still happens if the boxsize is passed (as Johannes' above example does). But this makes sense, because we changed the gridding to only span the particle extent, whereas before it was using the boxsize. So now, regardless of the boxsize, we may have few cells if the particle extent is narrow. I think the fix is actually pretty simple: only apply the 3 cell minimum if any pairs can be counted across the wrap; that is, if |
…ap makes wrap-crossing pairs possible
Works much better. But I'm getting an out-of-bounds error when calculating an auto-correlation function with e.g. |
Wonderful! Everything looks good now. |
Great! @manodeep do you want to check my last two commits? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that we have to make some changes to the grid-cell assignment (with setting inv_x/y/zbinsize to 0
, perhaps it would be better to use the entire boxsize for cases where x/y/zdiff is 0 (and print a warning). I know I am going back on what I said before, but I hadn't realised this edge-case of particles all lying on a plane/line etc
Thanks @manodeep, I've added the warning and the parentheses as you suggested. I'm not sure I follow the suggestion about the inverse xdiff. Wouldn't using the boxsize when xdiff is zero just create a lot of empty cells to process? And if the user requested automatic detection of the particle extent, then we don't even have a sensible non-zero value of the boxsize to use. I think the code handles the planar case fine: once all the particles are gridded, it no longer matters that they were in a plane, or even what the size of their containing cell is. The binsize calculation will still raise an error if the user tries to apply periodicity in that dimension that could cause repeat counts. |
Agree that we need a single cell (i.e. cell index == 0) and the easiest way to do that is by setting inv_xyzdiff to 0. However, that might be confusing to read - may be add some lines of comments to show why inv_xyzdiff is being set to 0. That way, other users and later developers will not be confused. Yup - agree that when the particles are co-planar (or worse), we have a meaningful boxsize only in the periodic cases and not in the non-periodic ones.
If there is only a single cell, then I am not sure what conditions needs to be checked - (xdiff + rmax) >= xwrap or (xdiff + rmax) >= 0.5*xwrap. Did you have a simple unit-test added that can check? |
I'm thinking of the condition as: "is this problem periodic?" (this previous condition was literally There's a unit test added here to make sure the code gives the right answer in the single-cell case (I've updated it slightly in the last commit): https://github.com/manodeep/Corrfunc/pull/276/files#diff-c7d4836bf94aa48cd9d4e4410ddab869be6ec08d83dfc0999ee5eac26cd8d30cR100 If we revisit #210, we might be able to relax this condition to allow a greater fraction of the boxsize. But I think this version is the most literal translation of the existing code to this PR. |
* Attempt to enable Rmax comparable to half boxsize by removing (unnecessary?) duplicate cell checks * Comment out unused var * Add test implementing Manodeep's example * Restore duplicate cell pair check, but only count a pair as a duplicate if the wrap value is identical. * Add pragmas for CI (will revisit) * Update GNU assembler bug detection (#278) * Update GNU assembler bug detection * Cosmetic enhancement to suppress spurious warnings during GAS bug test * Fix test error code * Add another test of large Rmax, comparing against brute-force * Add const qualifiers * Apply @manodeep's fix for large Rmax test against brute-force, and require nmesh>=2 to avoid duplicate cell pairs * Make boxsize non-trivial in test. Remove extra print statement. * Add comments on array broadcasting to test * Changed variable name for clarity Co-authored-by: Manodeep Sinha <[email protected]>
…ze arg types, and tests for those arg types. Remove old large Rmax test.
I merged the Corrfunc/utils/gridlink_utils.c.src Line 37 in d30e59a
The only other thing was to add tests that simultaneously checked the large Rmax and non-cubic boxsize features, which is in the last commit. Everything looks fine. @manodeep Ready to merge? The only check failure is the unrelated CircleCI. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to add the python tests for the DDrppi and DDsmu as well. Otherwise, seems to be ready to be merged in
const DOUBLE xwrap = (options->periodic && options->boxsize_x > 0) ? options->boxsize_x : (xmax-xmin); | ||
const DOUBLE ywrap = (options->periodic && boxsize_y > 0) ? boxsize_y : (ymax-ymin); | ||
const DOUBLE zwrap = (options->periodic && boxsize_z > 0) ? boxsize_z : (zmax-zmin); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Think I asked this before as well - shouldn't x/y/zwrap be set to 0 when the relevant boxsize_x/y/z is < 0?
Also, if we are only testing for a negative number for setting non-periodic, then the python docstrings need to be updated. Currently, the docstrings say to set to -1.0 for non-periodic along any axis
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I thought it didn't matter, but indeed we do end up passing xwrap
to get_binsize_DOUBLE
, so this had the side effect of disallowing Rmax > L/2 for non-periodic boxes, when really Rmax should be unrestricted. The expanded brute-force test caught this. So now we do set xwrap = 0
for non-periodic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We actually use BOXSIZE_NOTGIVEN = -2.
as a special value in the C API, so the docstrings are correct that -1 is the right value to pass.
boxsize > 0
is really testing boxsize != 0 && boxsize != -1
. I will add a comment to that effect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me know if you want to add the comment - otherwise, I will merge this in
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy with the new comment; ready for merge, I think!
" the maximum difference within each dimension of the X/Y/Z arrays.\n\n" | ||
" If the boxsize in a dimension is 0., then\n" | ||
" then that dimension's wrap is done based on the extent of the particle\n" | ||
" distribution. If the boxsize in a dimension is -1., then periodicity\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this might need to be "If the boxsize in a dimension is negative, then ..."
…passing boxsize=None to the Python API.
I've significantly expanded the brute-force test; it now tests almost all combinations of options for DD, DDrppi, and DDsmu. I actually think that if we wanted to reduce the test runtime, we could just use this instead of the The expanded tests did catch one or two regressions in supported configurations, namely |
@lgarrison Is this ready to merge? And then (after the readme), release v2.5 with this per-axis periodicity and R ~ Lbox/2 features? |
This PR allows the user to specify a periodic box length in each dimension, rather than assuming a cube. Internally, Corrfunc already supports this, so the only necessary change was to add support for passing a Python tuple as the box size instead of a scalar.
This change is API and ABI backward-compatible (EDIT: no longer ABI backwards compatible, see #276 (comment)). Currently, I've only implemented it for
theory.DD
, but all the other modules continue to work without modification.@johannesulf, would you be willing to test this PR? I've tested that
boxsize=L
andboxsize=(L,L,L)
get the same answer, but I haven't actually verified the code gets the right answer forLx!=Ly!=Lz
!Will close #247.