Out-of-bounds indexing in material_grids_addgradient
#1861
Comments
What's an example of a "safer indexing primitive" we can use to avoid issues like this in the future?
Having a type for n-dimensional arrays and views into arrays with methods to index into the array would be a starting point. In debug builds, the indexing methods can perform bounds checking. This would reduce the amount of indexing and stride calculation logic that is distributed over the codebase.
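To make the suggestion above concrete, here is a minimal sketch of such a view type. It assumes a row-major layout over a contiguous buffer; `NdView` and its members are hypothetical names for illustration and do not exist in the codebase. Bounds checks use `assert`, so they are active in debug builds and compiled out when `NDEBUG` is defined.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical non-owning view over a contiguous buffer (row-major layout).
// The name NdView and its interface are illustrative assumptions.
template <typename T, std::size_t Rank>
class NdView {
public:
  NdView(T *data, const std::array<std::size_t, Rank> &shape)
      : data_(data), shape_(shape) {
    // Row-major strides: the last dimension is contiguous in memory.
    std::size_t stride = 1;
    for (std::size_t d = Rank; d-- > 0;) {
      strides_[d] = stride;
      stride *= shape_[d];
    }
  }

  // Element access with one index per dimension. Indices are converted to a
  // signed type first so that a negative computed index is detected instead
  // of silently wrapping around. The check vanishes when NDEBUG is defined.
  template <typename... Idx>
  T &operator()(Idx... idx) const {
    static_assert(sizeof...(Idx) == Rank, "one index per dimension");
    const std::array<std::ptrdiff_t, Rank> ix{static_cast<std::ptrdiff_t>(idx)...};
    std::size_t offset = 0;
    for (std::size_t d = 0; d < Rank; ++d) {
      assert(ix[d] >= 0 && static_cast<std::size_t>(ix[d]) < shape_[d] &&
             "index out of bounds");
      offset += static_cast<std::size_t>(ix[d]) * strides_[d];
    }
    return data_[offset];
  }

  std::size_t extent(std::size_t d) const { return shape_[d]; }

private:
  T *data_;
  std::array<std::size_t, Rank> shape_;
  std::array<std::size_t, Rank> strides_{};
};
```

With a type along these lines, a caller would write `view(i, j, k)` rather than computing `i * stride0 + j * stride1 + k` by hand, so the stride arithmetic lives in one place and a negative or too-large index trips the assertion in a debug build.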
This is great -- thanks, Andreas. I'll also add that not only would these new data structures prevent bugs, but they would also significantly simplify the arduous process of adding new features to the C++ backend. For this particular PR that you reference, I spent 80% of my time trying to nail down the complicated array logic (and obviously it still has issues...). Also note that #1855 completely revamps how these fields are accessed, so let's not dwell too much on "fixing" this particular issue (rather, let's just get the new PR working correctly from the very beginning). In addition to the above suggestion, I imagine you have a series of techniques you use to identify and diagnose issues like these? It might be nice to someday document these techniques for future use. I'm sure these are standard "tricks of the trade" used across software development, but many of us aren't traditional software devs.
When it comes to techniques, one line of defense is the use of address sanitizers. Those were introduced as a GitHub Action in #1610 but unfortunately disabled, likely due to the large number of identified issues. Here is the output for ASAN at the current state in the repo: https://gist.github.com/ahoenselaar/13d0b764bb38bbab8b579ed414c49592 At the moment, the sanitizer only covers C++ tests and thus cannot identify the OOB indexing in gradient-related methods. IIRC, JAX was causing issues in the ASAN-enabled build but those could probably be resolved.
Closed by #1886?
Calculated indices `fwd1_idx` and `fwd2_idx` into the DFT field arrays are, at times, negative and cause out-of-bounds data to be read.
Repro:
Instrument `material_grids_addgradient` as follows:
then run the Python test suite and observe failures in adjoint tests.
Without the introduction of safer indexing primitives and better test coverage, bugs of this type are going to occur over and over.
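As an illustration of the kind of check that exposes a negative index like this, here is a minimal sketch. It is not the instrumentation used in the report: `checked_index` is a hypothetical helper, and only the names `fwd1_idx` and `fwd2_idx` come from the description above. The helper validates a signed index against the array length and throws instead of letting the read proceed.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (an assumption, not part of the codebase): returns the
// index as an unsigned offset if it lies in [0, length), otherwise throws
// std::out_of_range with a message naming the offending variable.
inline std::size_t checked_index(std::ptrdiff_t idx, std::size_t length,
                                 const char *name) {
  if (idx < 0 || static_cast<std::size_t>(idx) >= length) {
    std::ostringstream msg;
    msg << name << " = " << idx << " is outside [0, " << length << ")";
    throw std::out_of_range(msg.str());
  }
  return static_cast<std::size_t>(idx);
}
```

An array read would then take the form `array[checked_index(fwd1_idx, array_length, "fwd1_idx")]`, where `array` and `array_length` are placeholders for the DFT field array and its size. A failing call reports the variable name and the offending value rather than silently reading adjacent memory.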