Add check for too many grids in NSGrid #3441

scal444 · 2021-10-16T18:38:28Z

The NSGrid Cython code uses int indexing, which is typically
32-bit. 3D grids with small cutoffs or giant box sizes can have
more than 1000 cells per dimension, so the total cell count (the
product over all three dimensions) overflows the integer used to
index the grid.

Fixes #3183
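
To make the overflow concrete, here is a small sketch of the arithmetic in plain Python. It assumes the flat cell index is a signed 32-bit integer and that the grid is cubic; the helper names are illustrative and not taken from nsgrid.pyx. Signed overflow is formally undefined in C, so the wraparound shown is only what typically happens in practice.

```python
INT32_MAX = 2**31 - 1  # largest value of a signed 32-bit C int


def max_cells_per_dim(limit: int = INT32_MAX) -> int:
    """Largest n such that n**3 still fits within `limit`."""
    n = round(limit ** (1.0 / 3.0))
    # Correct any floating-point error in the cube root.
    while n**3 > limit:
        n -= 1
    while (n + 1) ** 3 <= limit:
        n += 1
    return n


def wraps_as_int32(value: int) -> int:
    """Emulate two's-complement wraparound of a signed 32-bit int."""
    return (value + 2**31) % 2**32 - 2**31


if __name__ == "__main__":
    n = max_cells_per_dim()
    print(n, n**3)                        # 1290 2146689000
    print(wraps_as_int32((n + 1) ** 3))   # negative: the index has wrapped
```

Under these assumptions the largest safe cubic grid has 1290 cells per side, which matches the MAX_DIM value quoted later in this thread.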

Changes made in this Pull Request:

PR Checklist

Tests?
Docs?
CHANGELOG updated?
Issue raised/referenced?

codecov · 2021-10-16T18:56:49Z

Codecov Report

Merging #3441 (1be6f64) into develop (a33e303) will not change coverage.
The diff coverage is 100.00%.

❗ Current head 1be6f64 differs from pull request most recent head 6afc2dc. Consider uploading reports for the commit 6afc2dc to get more accurate results

@@           Coverage Diff            @@
##           develop    #3441   +/-   ##
========================================
  Coverage    93.75%   93.75%           
========================================
  Files          176      176           
  Lines        23163    23163           
  Branches      3297     3297           
========================================
  Hits         21717    21717           
  Misses        1395     1395           
  Partials        51       51

Impacted Files	Coverage Δ
package/MDAnalysis/lib/nsgrid.pyx	`97.65% <100.00%> (ø)`

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a33e303...6afc2dc.

scal444 · 2021-10-16T19:37:07Z

I'm a bit confused by the coverage output - it's not really possible that this change could have made such a delta. I'd suspect that some test might be failing now (and therefore not covering other things it would have), but the mdanalysis-CI seems to pass.

IAlibay · 2021-10-16T20:04:34Z

I'm a bit confused by the coverage output - it's not really possible that this change could have made such a delta. I'd suspect that some test might be failing now (and therefore not covering other things it would have), but the mdanalysis-CI seems to pass.

It's just complaining because the majority of CI hasn't run yet, so the diff is off.

Actions limits workflow runs for first-time contributors, and it's only recently that they introduced more relaxed rules, so we've not changed it back.

hmacdope

Thanks for this! One small nitpick.

testsuite/MDAnalysisTests/lib/test_nsgrid.py

IAlibay

Thanks for working on this @scal444 - just a couple of comments from a quick review.

package/CHANGELOG

testsuite/MDAnalysisTests/lib/test_nsgrid.py

package/MDAnalysis/lib/nsgrid.pyx

tylerjereddy

Probably out of scope for PR, but we may also want to use Py_ssize_t for indexing:
https://stackoverflow.com/q/20987390/2942522

I probably need to follow that advice more often myself (using int is pretty common indeed though).

hmacdope · 2021-10-17T00:36:15Z

We noticed some improved -O3 autovectorisation with unsigned size_t over int in distopia, so worth doing IMO. Py_ssize_t is signed, but it is possibly worth adhering to anyway, especially if recommended by the Cython devs.
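
As a rough illustration of why the index type matters, the sketch below uses the standard-library ctypes module to compare the width of a C int with that of ssize_t (the type Py_ssize_t is normally defined as). The printed widths depend on the platform running the script; nothing here is taken from the NSGrid code.

```python
import ctypes


def max_signed(bits: int) -> int:
    """Largest value representable by a signed integer of the given width."""
    return 2 ** (bits - 1) - 1


# Widths on the machine running this script (platform dependent).
INT_BITS = 8 * ctypes.sizeof(ctypes.c_int)
SSIZE_BITS = 8 * ctypes.sizeof(ctypes.c_ssize_t)

if __name__ == "__main__":
    print(f"C int:      {INT_BITS} bits, max {max_signed(INT_BITS)}")
    print(f"Py_ssize_t: {SSIZE_BITS} bits, max {max_signed(SSIZE_BITS)}")
```

On common 64-bit platforms the second line reports 64 bits, which is why indexing with Py_ssize_t would push the overflow limit far beyond any realistic grid.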

richardjgowers

This doesn't have to raise an error. The algorithm can still work by capping the number of boxes to this limit.

scal444 · 2021-10-17T16:16:54Z

This doesn't have to raise an error. The algorithm can still work by capping the number of boxes to this limit.

Good point. I wonder if we should still fail, since it's over a billion cells and very unlikely to be performant. Not sure where this team's preference lies between automatically adjusting parameters vs failing and having the user adjust themselves. I guess it makes sense for us to do it, given that it won't affect the algorithm accuracy to adjust the grid size upwards, but now there's the question of what to cap it at? My first thought is to increase the grid size so that there's at least an average of x particles per grid, but we might need to do some research on what a good limit is.

richardjgowers

If we're up at ~1,000 cells a side, and even if we're searching for bonds (so r=1A), you've got a box with 100nm sides. This probably has about 100 million atoms in it? I'm not sure brute force methods are going to be more performant here.

I think at best this is a warning that the number of cells got truncated and that the devs should finally write some templates/fused types to handle the ultra large case.
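
The back-of-envelope estimate above can be reproduced as follows. The density of roughly 100 atoms per cubic nanometre is an assumption (about that of liquid water, at ~33 molecules/nm^3 with 3 atoms each); the function name is illustrative.

```python
ANGSTROM_PER_NM = 10.0
# Assumption: roughly the atom density of liquid water.
ATOMS_PER_NM3 = 100.0


def estimate_atoms(cells_per_side: int,
                   cell_size_angstrom: float,
                   atoms_per_nm3: float = ATOMS_PER_NM3) -> float:
    """Approximate atom count in a uniformly filled cubic box."""
    side_nm = cells_per_side * cell_size_angstrom / ANGSTROM_PER_NM
    return side_nm**3 * atoms_per_nm3


if __name__ == "__main__":
    # 1000 cells of 1 A each -> 100 nm sides -> about 1e8 atoms
    print(f"{estimate_atoms(1000, 1.0):.1e}")
```

The estimate only holds when the box is filled at roughly uniform density.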

package/MDAnalysis/lib/nsgrid.pyx

richardjgowers · 2021-10-18T08:59:14Z

That said, that all assumed homogeneous positions, which might not be true. If the ratio of particles per cell fell below some heuristic threshold, then switching to a different algorithm (sparse cell grid thing) might well be a good idea.

scal444 · 2021-10-19T13:52:57Z

This probably has about 100 million atoms in it

The example in the original issue is a counterexample here, where a box with a smaller number of atoms is inflated. The grid is set based off of the box size and cutoff, so you end up with more grids than particles to pack onto the grid. I was suggesting something like increasing the grid size to at least achieve some reasonable particle density.
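
One way to picture the density-based suggestion is the sketch below. It is purely hypothetical: the function name, the `min_atoms_per_cell` parameter and the orthorhombic-box assumption are illustrative, and this is not the approach that was eventually merged. The cell edge is never allowed below the cutoff, so the neighbour search stays correct.

```python
def cell_size_for_density(box_lengths, n_atoms, cutoff,
                          min_atoms_per_cell=1.0):
    """Cell edge giving at least `min_atoms_per_cell` atoms per cell on average.

    Assumes an orthorhombic box; the result is never smaller than `cutoff`.
    """
    if n_atoms <= 0:
        raise ValueError("n_atoms must be positive")
    lx, ly, lz = box_lengths
    volume = lx * ly * lz
    # Edge of a cubic cell whose volume holds the target number of atoms.
    density_edge = (min_atoms_per_cell * volume / n_atoms) ** (1.0 / 3.0)
    return max(cutoff, density_edge)


if __name__ == "__main__":
    # Sparse, inflated box: the density criterion dominates.
    print(cell_size_for_density((10000.0, 10000.0, 10000.0), 1000, 1.0))
    # Dense box: the cutoff dominates.
    print(cell_size_for_density((10.0, 10.0, 10.0), 100000, 3.0))
```

For the inflated box the cell edge grows to about 1000 Å, giving roughly 10 cells per side instead of 10000.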

scal444 · 2021-10-20T13:53:26Z

In the interest of moving this along, how about I implement the suggestion to cap the number of boxes, and we can revisit for optimizations. I'll see how this affects the comments regarding unit tests and apply whatever suggestions are still relevant.

richardjgowers · 2021-10-20T14:20:35Z

Yep sounds good 👍


pep8speaks · 2021-10-24T23:09:38Z

Hello @scal444! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

In the file testsuite/MDAnalysisTests/lib/test_nsgrid.py:

Line 73:80: E501 line too long (80 > 79 characters)

Comment last updated at 2021-11-09 15:09:32 UTC

scal444 · 2021-10-24T23:22:08Z

RE pep8 violations - if I run autopep8 it flags other problems in the same file. What's the project's stance on refactoring? In general, I guess it's better to make a preliminary cleanup change, but here should I just add the other style changes to the same PR? Or just change the affected lines?

lilyminium · 2021-10-24T23:38:33Z

Thanks for asking — we tend to keep style changes only to the code that is currently being changed, i.e. the only ones that get flagged by the bot. It does mean that we have some code with a lot of violations, as you can see, but it does mean that we keep the git history a bit cleaner (it can be very handy to run git blame on a line of code and look up why it’s like that).


hmacdope · 2021-10-25T00:04:54Z

Thanks for incorporating feedback @scal444! I will approve the workflow run; it's blocked because you are a first-time contributor.

testsuite/MDAnalysisTests/lib/test_nsgrid.py

IAlibay

Just a couple of extra changes on my end.

package/MDAnalysis/lib/nsgrid.pyx

testsuite/MDAnalysisTests/lib/test_nsgrid.py

The NSGrid cython code uses integer indexing, which is typically 32 bit. 3D grids with small cutoffs or giant box sizes can have grids > 1000 per dimension, which leads to integer overflow when trying to index the grids. Fixes MDAnalysis#3183

Rather than throwing an exception, the number of grids is capped, not affecting the algorithm output
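
A minimal sketch of the capping idea, assuming one box dimension at a time and a cap of 1290 cells (the value quoted elsewhere in this thread). The function name and signature are illustrative and are not taken from nsgrid.pyx. Because capping only makes cells larger, each cell stays at least as wide as the cutoff and the search results are unchanged.

```python
import math

# Assumption: 1290**3 is the largest cube that fits in a signed 32-bit int.
MAX_CELLS_PER_DIM = 1290


def cells_per_dim(box_length: float, cutoff: float,
                  max_cells: int = MAX_CELLS_PER_DIM):
    """Return (number of cells, cell size) along one box dimension."""
    if box_length <= 0 or cutoff <= 0:
        raise ValueError("box_length and cutoff must be positive")
    # As many cells as fit while keeping each cell at least `cutoff` wide.
    n = int(math.floor(box_length / cutoff))
    # Cap the count; always keep at least one cell.
    n = max(1, min(n, max_cells))
    return n, box_length / n


if __name__ == "__main__":
    print(cells_per_dim(10.0, 3.0))      # small box: 3 cells
    print(cells_per_dim(10000.0, 1.0))   # huge box: capped at 1290 cells
```

With a 10000 Å box and a 1 Å cutoff the uncapped grid would have 10000 cells per side; after capping, each cell is about 7.75 Å wide.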

richardjgowers · 2021-10-26T15:29:22Z

The issue is almost literally a memory segfault, so… no? :)

On Tue, Oct 26, 2021 at 17:25, Irfan Alibay wrote, in testsuite/MDAnalysisTests/lib/test_nsgrid.py:

@@ -159,12 +163,16 @@ def test_nsgrid_distances(universe):
 @pytest.mark.parametrize('box, results',
                          ((None, [3, 13, 24]),
-                          (np.array([10., 10., 10., 90., 90., 90.]), [3, 13, 24, 39, 67]),
-                          (np.array([10., 10., 10., 60., 75., 90.]), [3, 13, 24, 39, 60, 79])))
+                          (np.array([10000., 10000., 10000., 90., 90., 90.]),
+                           [3, 13, 24]),

(I'm sure there's some env variable set on our CI matrix we can detect)

pretty sure actions sets $CI to True whenever it runs (see: https://docs.github.com/en/actions/learn-github-actions/environment-variables)

Is there a way we can reduce the memory cost but still replicate the issue? I would prefer not have re-add psutil as a dependency if we can avoid it.

richardjgowers

Can we skip the test on CI please? Otherwise looks good.

IAlibay · 2021-10-27T10:58:59Z

Can we skip the test on CI please? Otherwise looks good.

We'll want to make this skip on more than just CI - we can't make an 8 GB memory test run on folks' laptops; it's going to be a nightmare for people on lower-end workstations and laptops.

scal444 · 2021-10-27T14:41:56Z

We'll want to make this skip on more than just CI - we can't make an 8 GB memory test run on folk's laptops, it's going to be a nightmare for folks on lower end workstations and laptops.

Alternatively - could we patch the max grid dim variable in the test to bring it down to something smaller? We'd still be testing the same error case, and it won't change the rest of the algorithm. Not sure how to override cython constants in python unit tests but there's probably something we could do.

richardjgowers · 2021-10-28T09:30:23Z

We'll want to make this skip on more than just CI - we can't make an 8 GB memory test run on folk's laptops, it's going to be a nightmare for folks on lower end workstations and laptops.

Alternatively - could we patch the max grid dim variable in the test to bring it down to something smaller? We'd still be testing the same error case, and it won't change the rest of the algorithm. Not sure how to override cython constants in python unit tests but there's probably something we could do.

The constraint is going to be very baked in. Cython goes sideways into C++ then gets compiled, so the constraint is a constant somewhere in the object code. For the same reason, I'd rather not have it as a user variable.

hmacdope · 2021-10-28T12:04:08Z

Does a DEF work here ? I’m not sure …


richardjgowers · 2021-10-28T12:05:56Z

DEF is #define I think. You’d have to recompile the whole thing again with the new constant.

hmacdope · 2021-10-28T12:18:07Z

I was thinking something like:

DEF test_ns_grid_max = 1  # set on compile, only to run on special occasions, like Christmas

IF test_ns_grid_max:
    DEF MAX_DIM = reasonable_value_to_test
ELSE:
    DEF MAX_DIM = 1290

but I'm probably missing something / it's super ugly / disabling on CI is much cleaner.

richardjgowers · 2021-10-29T13:01:36Z

I was thinking something like:
DEF test_ns_grid_max = 1  # set on compile, only to run on special occasions, like Christmas

IF test_ns_grid_max:
    DEF MAX_DIM = reasonable_value_to_test
ELSE:
    DEF MAX_DIM = 1290

but I'm probably missing something / it's super ugly / disabling on CI is much cleaner.

Yeah this would work, but you'd be reinstalling the entire package for a single test. I think it's probably fine not to test for this (unless someone is actively tinkering with the algorithm..).

IAlibay · 2021-10-29T13:14:38Z

Yeah this would work, but you'd be reinstalling the entire package for a single test. I think it's probably fine not to test for this (unless someone is actively tinkering with the algorithm..).

I think we're overthinking this. As @richardjgowers first pointed out, let's just guard the test with an env variable, set it in our macOS runners (which for now have >10 GB ram), and document it somewhere - job done.

hmacdope · 2021-10-31T00:09:37Z

Good point @IAlibay, sorry for muddying the waters, much better to just block test.

Enable test in macos runners which have sufficient memory

scal444 · 2021-10-31T20:16:05Z

Test is now guarded by an env variable
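
For readers unfamiliar with the pattern, guarding a test with an environment variable might look roughly like the sketch below. The variable name `RUN_HIGH_MEM_TESTS` and the helper `env_flag` are hypothetical and are not taken from the merged test; the thread only states that an env variable and a bool check converter were used.

```python
import os

# Hypothetical variable name, chosen for illustration only.
HIGH_MEM_ENV_VAR = "RUN_HIGH_MEM_TESTS"

_TRUTHY = {"1", "true", "yes", "on"}


def env_flag(name: str, default: bool = False, environ=None) -> bool:
    """Interpret an environment variable as a boolean switch."""
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in _TRUTHY


# In a pytest test module, the flag could then gate the expensive test:
#
#   @pytest.mark.skipif(not env_flag(HIGH_MEM_ENV_VAR),
#                       reason="needs several GB of memory; set "
#                              "RUN_HIGH_MEM_TESTS=1 to enable")
#   def test_high_grid_count(): ...

if __name__ == "__main__":
    print(env_flag(HIGH_MEM_ENV_VAR))
```

Converting the string explicitly matters because any non-empty value, including "0" or "false", is truthy when tested directly in Python.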

testsuite/MDAnalysisTests/lib/test_nsgrid.py

IAlibay

Just the one thing on my end. Aside from @richardjgowers' comment, everything else looks good. Thanks @scal444 !

testsuite/MDAnalysisTests/lib/test_nsgrid.py

Used parentheses for string concatenation and a bool check converter

hmacdope

Looks great to me! Thanks @scal444

IAlibay

Just a final PEP8 thing please.

testsuite/MDAnalysisTests/lib/test_nsgrid.py

fixed it

richardjgowers · 2021-11-09T15:10:55Z

@scal444 thanks for fixing this and congrats on getting your first PR on the board!

github-actions bot added the Component-lib label Oct 16, 2021

hmacdope requested changes Oct 16, 2021

testsuite/MDAnalysisTests/lib/test_nsgrid.py (outdated, resolved)

IAlibay requested changes Oct 16, 2021

tylerjereddy reviewed Oct 16, 2021

richardjgowers requested changes Oct 17, 2021

richardjgowers requested changes Oct 18, 2021

package/MDAnalysis/lib/nsgrid.pyx (resolved)

richardjgowers reviewed Oct 25, 2021

testsuite/MDAnalysisTests/lib/test_nsgrid.py (resolved)

IAlibay requested changes Oct 26, 2021

package/MDAnalysis/lib/nsgrid.pyx (resolved)

testsuite/MDAnalysisTests/lib/test_nsgrid.py (resolved)

IAlibay reviewed Oct 26, 2021

testsuite/MDAnalysisTests/lib/test_nsgrid.py (outdated, resolved)

Kevin Boyd added 4 commits October 26, 2021 07:53

Add check for too many grids in NSGrid

9e6daa8

The NSGrid cython code uses integer indexing, which is typically 32 bit. 3D grids with small cutoffs or giant box sizes can have grids > 1000 per dimension, which leads to integer overflow when trying to index the grids. Fixes MDAnalysis#3183

Automatically resize grids to avoid int overflow

dff262b

Rather than throwing an exception, the number of grids is capped, not affecting the algorithm output

Add Kevin Boyd to authors and changelog

736f4cf

Pep8 fixes

6506201

scal444 force-pushed the nsgrid_debug branch from a7cefa9 to 6506201 Compare October 26, 2021 14:54

richardjgowers requested changes Oct 27, 2021


richardjgowers mentioned this pull request Oct 29, 2021

Release 2.1.0 #3446

Closed

5 tasks

Extract high-grid test to own fixture and add skip criterion.

685a6a0

Enable test in macos runners which have sufficient memory

Kevin Boyd and others added 4 commits November 2, 2021 16:39

Undo some accidental autoformatting

f67b46c

Add versionchanged

c440f0d

Fix more accidental autoformatting

0c6fd4a

Merge branch 'develop' into nsgrid_debug

b58048b

richardjgowers reviewed Nov 4, 2021

testsuite/MDAnalysisTests/lib/test_nsgrid.py (outdated, resolved)

IAlibay requested changes Nov 4, 2021

testsuite/MDAnalysisTests/lib/test_nsgrid.py (outdated, resolved)

Address review comments

1be6f64

Used parentheses for string concatenation and a bool check converter

richardjgowers approved these changes Nov 7, 2021

hmacdope approved these changes Nov 8, 2021

IAlibay previously requested changes Nov 8, 2021

testsuite/MDAnalysisTests/lib/test_nsgrid.py (resolved)

testsuite/MDAnalysisTests/lib/test_nsgrid.py (outdated, resolved)

Update test_nsgrid.py

6afc2dc

richardjgowers merged commit 735abb8 into MDAnalysis:develop Nov 9, 2021

IAlibay added the defect label Sep 25, 2023
