Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add scale_factor write support for NetCDF writer #3294

Merged
merged 18 commits into from
Aug 20, 2021

Conversation

IAlibay
Copy link
Member

@IAlibay IAlibay commented May 8, 2021

Fixes #2327 completes #3185

I doubt this will make it into the beta release (unless someone feels like reviewing it tonight), but it should make it into the final release as it was a promised changed in 1.1.1.

Changes made in this Pull Request:

  • Adds scale_factor writing support to NCDFWriter.
  • Changes the default scale_factor value for velocity writing to 20.455 in order to match the AMBER default

From ambertools' AmberNetdf.F90

    double precision, parameter :: velocityScale = 20.455d0
  • Cleans up the NCDFReader code a bit (still could be improved, but at least now it's more readable).

For context, here is the AMBER NetCDF convention: http://ambermd.org/netcdf/nctraj.xhtml

PR Checklist

  • Tests? -- finishing
  • Docs?
  • CHANGELOG updated? -- to do
  • Issue raised/referenced?

@IAlibay IAlibay added this to the 2.0 milestone May 8, 2021
@codecov
Copy link

codecov bot commented May 8, 2021

Codecov Report

Merging #3294 (46f63ef) into develop (00c4c3c) will increase coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff            @@
##           develop    #3294   +/-   ##
========================================
  Coverage    93.69%   93.70%           
========================================
  Files          177      177           
  Lines        22959    22990   +31     
  Branches      3234     3247   +13     
========================================
+ Hits         21511    21542   +31     
  Misses        1397     1397           
  Partials        51       51           
Impacted Files Coverage Δ
package/MDAnalysis/coordinates/TRJ.py 97.88% <100.00%> (+0.16%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 00c4c3c...46f63ef. Read the comment docs.

@pep8speaks
Copy link

pep8speaks commented May 8, 2021

Hello @IAlibay! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 1268:80: E501 line too long (120 > 79 characters)

Line 938:80: E501 line too long (81 > 79 characters)
Line 1010:80: E501 line too long (87 > 79 characters)

Comment last updated at 2021-08-20 20:06:10 UTC

@IAlibay IAlibay marked this pull request as ready for review May 8, 2021 16:01
@IAlibay
Copy link
Member Author

IAlibay commented May 8, 2021

Copy link
Member

@lilyminium lilyminium left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Waiting on new tests to finish running for diagnostic purposes before addressing those :)

package/MDAnalysis/coordinates/TRJ.py Outdated Show resolved Hide resolved
package/MDAnalysis/coordinates/TRJ.py Outdated Show resolved Hide resolved
package/MDAnalysis/coordinates/TRJ.py Outdated Show resolved Hide resolved
package/MDAnalysis/coordinates/TRJ.py Outdated Show resolved Hide resolved
package/MDAnalysis/coordinates/TRJ.py Outdated Show resolved Hide resolved
package/MDAnalysis/coordinates/TRJ.py Outdated Show resolved Hide resolved
package/MDAnalysis/coordinates/TRJ.py Outdated Show resolved Hide resolved
testsuite/MDAnalysisTests/coordinates/test_netcdf.py Outdated Show resolved Hide resolved
testsuite/MDAnalysisTests/coordinates/test_netcdf.py Outdated Show resolved Hide resolved
@IAlibay
Copy link
Member Author

IAlibay commented May 8, 2021

How odd, I can't replicate this locally, https://github.com/MDAnalysis/mdanalysis/pull/3294/checks?check_run_id=2535134574#step:10:548

Ok can confirm the issue is with netcdf4. Doesn't crash with scipy's netcdf reader, but fails with netcdf4.

@IAlibay
Copy link
Member Author

IAlibay commented May 8, 2021

Ok complication number 1:

maskandscale: "whether or not to do automatic conversion based on the value of scale_factor

In scipy this is False by default, in netcdf4 this is True by default....

So as far as I can tell when we fixed reading (because it's scipy only reading), then we were fine (i.e. auto scaling was off so we just applied scaling ourselves). But with writing, if we write with netcdf4 then it's on, if we write without it then it's off... unless we enforce it to be off....

@IAlibay
Copy link
Member Author

IAlibay commented May 11, 2021

So one way to make the NetCDF4 and scipy default behaviours the same is just to enable maskandscale when creating the scipy netcdf file.

@tylerjereddy [person most likely to know this] do you happen to have any insights here on a) performance overheads of using numpy masked arrays, b) potential issues in using masked arrays (i.e. can it lead to the unintentional hiding of corrupted netcdf files).

@IAlibay IAlibay mentioned this pull request May 13, 2021
@IAlibay IAlibay mentioned this pull request Aug 13, 2021
4 tasks
@IAlibay IAlibay requested a review from lilyminium August 17, 2021 17:53
@IAlibay
Copy link
Member Author

IAlibay commented Aug 17, 2021

@MDAnalysis/coredevs this is the last thing we definitely promised in 2.0, can we try to get this reviewed/merged as a priority so we can finally release 2.0?

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reviewed first half, looking good so far

package/MDAnalysis/coordinates/TRJ.py Outdated Show resolved Hide resolved
Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks sensible to me. Kudos for writing these tests!

I have some minor comments regarding docs — I am missing some knowledge about netcdf details. Assuming that I am somewhat representing an average user here, the question is if the available options should carry warning signs that mere mortals should leave them at defaults. Links to other docs would also be helpful.

As I mentioned in a comment, we should remove a whole bunch of stuff from the API docs for Writers, namely the start, stop, step and dt kwargs/attributes should not be required. That's another issue, though.

Co-authored-by: Oliver Beckstein <[email protected]>
@IAlibay
Copy link
Member Author

IAlibay commented Aug 20, 2021

That should have address the review comments @orbeckst - let me know what you think

@lilyminium do you want to re-review and/or are you ok with your review being dismissed?

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 65 bit -> 64 bit fix still needs to be added. @IAlibay you can use my suggestion or do it yourself. I won't hold up proceedings. Thanks for addressing my comments.

package/MDAnalysis/coordinates/TRJ.py Outdated Show resolved Hide resolved
@lilyminium
Copy link
Member

Sorry -- LGTM

@lilyminium lilyminium merged commit bd0ed5d into MDAnalysis:develop Aug 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improving the NCDFWriter behaviour (units + scale_factors)
4 participants