Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Created hole2 module #2523

Merged
merged 7 commits into from
Mar 4, 2020
Merged

Created hole2 module #2523

merged 7 commits into from
Mar 4, 2020

Conversation

lilyminium
Copy link
Member

@lilyminium lilyminium commented Feb 10, 2020

Fixes #2522

Changes made in this Pull Request:

  • Wrote a new hole2 module

Working example in this notebook.

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

@codecov
Copy link

codecov bot commented Feb 10, 2020

Codecov Report

Merging #2523 into develop will decrease coverage by 0.23%.
The diff coverage is 79.35%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #2523      +/-   ##
===========================================
- Coverage    90.68%   90.45%   -0.24%     
===========================================
  Files          169      173       +4     
  Lines        22828    23381     +553     
  Branches      2939     3038      +99     
===========================================
+ Hits         20702    21149     +447     
- Misses        1540     1605      +65     
- Partials       586      627      +41
Impacted Files Coverage Δ
package/MDAnalysis/analysis/hole2/templates.py 100% <100%> (ø)
package/MDAnalysis/analysis/hole2/__init__.py 100% <100%> (ø)
package/MDAnalysis/analysis/hole.py 71.54% <54.54%> (ø) ⬆️
package/MDAnalysis/analysis/hole2/utils.py 75.77% <75.77%> (ø)
package/MDAnalysis/analysis/hole2/hole.py 82.18% <82.18%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update eb18a33...fc938b6. Read the comment docs.

@orbeckst
Copy link
Member

The original idea had been that HOLE is like running hole on a single structure from the command line. But I guess it makes sense to just make it work in such a way that it doesn't matter if you feed a single frame or a trajectory. For backwards compatibility I would just create a new HoleAnalysis and deprecate the other two.

(There is also the stalled PR #1814 which I am just mentioning as another HOLE thing...)

@orbeckst
Copy link
Member

Should do this

For backwards compatibility I would just create a new HoleAnalysis and deprecate the other two.

and deprecate in 1.0 for removal in 2.0.

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea to rewrite from scratch. I like it.

Tests, including in docs, and deprecations are obviously still missing.

package/MDAnalysis/analysis/hole.py Outdated Show resolved Hide resolved
package/MDAnalysis/analysis/hole2/hole.py Show resolved Hide resolved
package/MDAnalysis/analysis/hole.py Show resolved Hide resolved
@lilyminium
Copy link
Member Author

lilyminium commented Feb 16, 2020

hole2 (now) has:

  • hole: a function that replaces hole.HOLE to work on PDB files.
    • it deletes HOLE files if keep_files=False (default: True)
  • HoleAnalysis: a class that replaces hole.HOLEtraj
    • It no longer takes order parameters in __init__, but returns profiles ordered over given order parameters with over_order_parameters()
    • It deletes HOLE files if used as a context manager
    • Unlike any other AnalysisBase subclass, takes more arguments in run() than start, stop, step, verbose. It takes what I think of as runtime arguments:
      • cpoint: coordinate inside the pore
      • cvect: search direction
      • sample: distance of sample points
      • end_radius: end radius of pore
      • output_level: output level of HOLE
      • random_seed: random seed
      • prefix: prefix for HOLE files
      • write_input_files: whether to write input files
        Given that no other classes do this, it can be moved to __init__. Opinions @richardjgowers @orbeckst?

Serendipitously, hole2 is also the name of HOLE as installed from GitHub, but open to suggestions for nonnumeric names

@lilyminium lilyminium changed the title [WIP] Rework hole.HOLEtraj Created hole2 module Feb 16, 2020
@richardjgowers
Copy link
Member

disclaimer, I have no idea what HOLE does yet..

@lilyminium I'd move those things to __init__ if they're parameters of how the analysis is done, whereas start/stop/step are more like sampling variables (so random seed should stay in run). Maybe philosophically an analysis class with identical __init__ args is measuring the same thing, the run kwargs just define how well it is sampled?

I'm confused about the name, is hole2.py using HOLE 2.0, or rather it's our second version of HOLE bindings? (Or Both). Either way it's a confusing name, I'd rather just it have a different name (eg hbonds vs hbond_analysis) and nag users to use the newer one.

@orbeckst
Copy link
Member

Hole2 calculates radius of molecular pores in proteins. It is the de facto standard in ion channel structural biology.

The program is hole2 so it would be appropriate to call the module hole2.

Note: I’d love to have Python bindings, just for the package name pyhole ;-)

@orbeckst
Copy link
Member

@lilyminium I would put the parameters into init , following @richardjgowers rationale (and the original implementation).

@orbeckst
Copy link
Member

raseed should be in run(), though.

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just needs more testing.

Optional: I think that the average pore profile would be a good thing to have because many users might use it directly. For the power users it would be even more useful to just have an example for how to get the data so that they can plot themselves or a method that returns the aggregated data.

@orbeckst orbeckst self-assigned this Feb 18, 2020
@lilyminium
Copy link
Member Author

Added new functions: gather(), bin_radii(), histogram_radii(), plot_mean_profile(). Are these similar to what you mean, @orbeckst? (See example usage in this notebook)

Tests and documentation coming soon.

@orbeckst
Copy link
Member

Yes, these functions are very useful and the plot_mean_profile will allow users to quickly have a look at the data. Nice.

@richardjgowers
Copy link
Member

version 1.0 is 99% complete, is this close enough to completion to delay by a few days? If so I can dive into it and review

@lilyminium
Copy link
Member Author

@richardjgowers I was hoping it would be nearly done; the tests aren't failing on my local Python 2.7 or 3.7 environments so not sure what's going on there. When did you want to release?

Copy link
Member

@richardjgowers richardjgowers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe hole2.helpers name can stay to avoid conflict with lib.util.

There’s a lot of bending over backwards to get data in to this program, which obviously isn’t going to be fast (ie writing and reading files is slow). How important is this analysis program? If it’s open source then I can look at building real bindings to it, maybe this is a better long term solution.

package/MDAnalysis/analysis/hole2/__init__.py Show resolved Hide resolved
package/MDAnalysis/analysis/hole2/helper.py Outdated Show resolved Hide resolved
package/MDAnalysis/analysis/hole2/helper.py Outdated Show resolved Hide resolved
package/MDAnalysis/analysis/hole2/hole.py Outdated Show resolved Hide resolved
@orbeckst
Copy link
Member

orbeckst commented Mar 1, 2020

There’s a lot of bending over backwards to get data in to this program, which obviously isn’t going to be fast (ie writing and reading files is slow). How important is this analysis program? If it’s open source then I can look at building real bindings to it, maybe this is a better long term solution.

HOLE2 is FORTRAN with various limitations on file name lengths etc. Python bindings would be great (it is open source nowadays, link to repo in docs) (pyhole!) but it's not worthwhile holding up this PR, which is almost done and is a definitive improvement over what we currently have.

Performance in file I/O is pretty much a non-issue IIRC because HOLE is slow per frame. It would be a prime candidate for PMDA or similar approaches.

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some gaps in the tests. At minimum, plot_mean_profile() and plot_oreder_parameters() should be covered. For all other comments use your discretion. (The filename shortening shenanigans in helpers.py have patchy coverage but that's probably on me because my original functions were not tested well – feel free to ignore and just raise an issue that this needs testing... maybe a gsoc issue?)

package/MDAnalysis/analysis/hole2/helper.py Outdated Show resolved Hide resolved
package/MDAnalysis/analysis/hole2/helper.py Outdated Show resolved Hide resolved
package/MDAnalysis/analysis/hole2/helper.py Outdated Show resolved Hide resolved
package/MDAnalysis/analysis/hole2/helper.py Outdated Show resolved Hide resolved
package/MDAnalysis/analysis/hole2/hole.py Show resolved Hide resolved
package/MDAnalysis/analysis/hole2/hole.py Show resolved Hide resolved
package/MDAnalysis/analysis/hole2/hole.py Show resolved Hide resolved
package/MDAnalysis/analysis/hole2/hole.py Show resolved Hide resolved
package/MDAnalysis/analysis/hole2/hole.py Show resolved Hide resolved
package/MDAnalysis/analysis/hole2/hole.py Outdated Show resolved Hide resolved
@lilyminium
Copy link
Member Author

lilyminium commented Mar 3, 2020

Added tests for plots and checking long filenames.

@richardjgowers hole gets a few questions on the mailing list. I started this PR because combining the pore surfaces for each frame into a single VMD script from the original surface files isn't immediately easy (and had to rewrite the class because otherwise it would keep overwriting intermediate files). So it's worth keeping that part, at least, the pore videos are neat.

Thanks for the reviews!

@orbeckst
Copy link
Member

orbeckst commented Mar 3, 2020

This is a veritable amount of tests. Kudos! (Some are falling for some versions, though, e.g. "AttributeError: 'Line3D' object has no attribute 'get_data_3d'" — for any tests that only fail on 2.7 I'd be ok with a skipif on the test, provided it's only the testing code that fails.)

@orbeckst
Copy link
Member

orbeckst commented Mar 3, 2020

@lilyminium do you want to rewrite your history into a smaller number of commits? I don't think that squashing all of them is very clear (there's stuff in old and new hole, tests, etc) but I also don't want to merge 31 commits where some a little corrections.

Awesome work!!

@lilyminium
Copy link
Member Author

Thanks @orbeckst , I've squashed the history from 31 commits -> 7.

@orbeckst orbeckst merged commit 9d602ec into MDAnalysis:develop Mar 4, 2020
@orbeckst
Copy link
Member

orbeckst commented Mar 4, 2020

Nice work!

@Y0geshh
Copy link

Y0geshh commented Mar 8, 2020

Hi oliver is this module ready to use now?

@lilyminium lilyminium deleted the hole branch March 8, 2020 13:53
@lilyminium
Copy link
Member Author

@Y0geshh Yes -- as it is now merged into the develop version of MDAnalysis, you can just clone the develop branch and build that.

@fiona-naughton fiona-naughton added enhancement deprecation Deprecated functionality to give advance warning for API changes. labels Sep 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component-Analysis deprecation Deprecated functionality to give advance warning for API changes. enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

hole.HOLEtraj makes pore VMD files for every frame
6 participants