Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Converters API #2882

Merged
merged 41 commits into from
May 10, 2021
Merged

Converters API #2882

merged 41 commits into from
May 10, 2021

Conversation

cbouy
Copy link
Member

@cbouy cbouy commented Jul 30, 2020

Fixes #2790

Changes made in this Pull Request:

  • Makes u.atoms.convert_to("PACKAGE") case insensitive
  • Pass kwargs to converters: u.atoms.convert_to("rdkit", NoImplicit=False)
  • Adds u.atoms.convert_to.package() methods automatically from the metaclass magic, with TAB completion support, and docstring from the converter

The same logic can be applied to writers if that's needed

The tests will fail for now as I have written them with #2775 in mind.

Also, I'm not sure if I've put the code in the right place and if the names I've come up with are relevant.

I'm not sure how this will work out with Sphinx though, since all the convert_to.package() methods are made automatically through setattr. Similar concern for convert_to in AtomGroup, is there a way to tell Sphinx to show a particular docstring instead of considering it like an attribute, and if yes, which docstring should it show ?

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

@pep8speaks
Copy link

pep8speaks commented Jul 30, 2020

Hello @cbouy! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 211:1: E402 module level import not at top of file

Line 31:80: E501 line too long (146 > 79 characters)

Line 27:80: E501 line too long (113 > 79 characters)
Line 61:80: E501 line too long (88 > 79 characters)
Line 76:80: E501 line too long (94 > 79 characters)
Line 145:80: E501 line too long (95 > 79 characters)
Line 166:80: E501 line too long (95 > 79 characters)
Line 225:80: E501 line too long (86 > 79 characters)
Line 245:80: E501 line too long (83 > 79 characters)
Line 259:80: E501 line too long (85 > 79 characters)
Line 274:13: E731 do not assign a lambda expression, use a def
Line 334:80: E501 line too long (109 > 79 characters)
Line 335:80: E501 line too long (84 > 79 characters)
Line 336:80: E501 line too long (99 > 79 characters)

Line 44:80: E501 line too long (85 > 79 characters)
Line 169:9: E266 too many leading '#' for block comment
Line 276:9: E266 too many leading '#' for block comment

Line 27:80: E501 line too long (100 > 79 characters)
Line 28:80: E501 line too long (80 > 79 characters)
Line 139:80: E501 line too long (91 > 79 characters)

Line 311:28: E231 missing whitespace after ','

Comment last updated at 2021-05-10 18:20:47 UTC

@cbouy
Copy link
Member Author

cbouy commented Jul 31, 2020

Forgot to tag @MDAnalysis/coredevs !

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your pandas plot-style interface is a good middle-ground. I don't know how to solve the problem with the docs. Perhaps look into how pandas documents DataFrame.plot.

I suggest you start building converter things in MDAnalysis.converters although others might have different opinions.

package/MDAnalysis/lib/accessors.py Outdated Show resolved Hide resolved
package/MDAnalysis/core/groups.py Show resolved Hide resolved
package/MDAnalysis/core/groups.py Outdated Show resolved Hide resolved
@orbeckst orbeckst added the GSoC GSoC project label Aug 7, 2020
@IAlibay IAlibay added this to the 2.0 milestone Apr 22, 2021
@codecov
Copy link

codecov bot commented Apr 25, 2021

Codecov Report

Merging #2882 (26ebd6e) into develop (d734a89) will decrease coverage by 0.00%.
The diff coverage is 96.14%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #2882      +/-   ##
===========================================
- Coverage    93.56%   93.56%   -0.01%     
===========================================
  Files          172      176       +4     
  Lines        22785    22823      +38     
  Branches      3191     3193       +2     
===========================================
+ Hits         21319    21354      +35     
- Misses        1416     1419       +3     
  Partials        50       50              
Impacted Files Coverage Δ
package/MDAnalysis/coordinates/__init__.py 100.00% <ø> (ø)
package/MDAnalysis/topology/__init__.py 100.00% <ø> (ø)
package/MDAnalysis/converters/ParmEd.py 93.18% <93.18%> (ø)
package/MDAnalysis/converters/ParmEdParser.py 98.44% <98.44%> (ø)
package/MDAnalysis/__init__.py 92.10% <100.00%> (+0.21%) ⬆️
package/MDAnalysis/converters/OpenMM.py 97.05% <100.00%> (ø)
package/MDAnalysis/converters/OpenMMParser.py 100.00% <100.00%> (ø)
package/MDAnalysis/converters/RDKit.py 97.31% <100.00%> (ø)
package/MDAnalysis/converters/RDKitParser.py 96.72% <100.00%> (ø)
package/MDAnalysis/converters/__init__.py 100.00% <100.00%> (ø)
... and 9 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d734a89...26ebd6e. Read the comment docs.

@lilyminium
Copy link
Member

@cbouy please just let us know when this is ready for review :-)

@cbouy
Copy link
Member Author

cbouy commented Apr 29, 2021

I still need to write the tests but most of the code and docs should be ready for review

@richardjgowers
Copy link
Member

I'm not really convinced that we need convert_to.rdkit() rather than convert_to('rdkit'). The syntax of convert_to.X doesn't feel hugely intuitive to me, I can't think of a similar case? The selling point seems to be tab completion, could we not instead do something smart with the docstrings of convert_to? being magically patched? If we split out the case insensitivity we can merge that as is.

@orbeckst
Copy link
Member

orbeckst commented May 4, 2021

The syntax of convert_to.X doesn't feel hugely intuitive to me, I can't think of a similar case?

There was a fairly long discussion on this and we found that pandas DataFrames do something similar with their different plots (see #2790 (comment) and thereabouts). The majority seemed to favor TAB-discoverability.

@richardjgowers richardjgowers marked this pull request as ready for review May 5, 2021 09:02
Copy link
Member

@richardjgowers richardjgowers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok lgtm

@cbouy
Copy link
Member Author

cbouy commented May 7, 2021

I've included the OpenMM converter as well (just noticed it was there), and hopefully fixed #3262 in the process

@jbarnoud
Copy link
Contributor

jbarnoud commented May 7, 2021

I've included the OpenMM converter as well (just noticed it was there), and hopefully fixed #3262 in the process

Make sure you also move the documentation stubs.

@lilyminium
Copy link
Member

lilyminium commented May 8, 2021

@cbouy could you change the np.int in your tests to int?

Edit: Ah, you already test that. Could you explain why you're also testing np.int?

@lilyminium
Copy link
Member

@cbouy could you please rebase on develop and see if that fixes the tests? :)

@IAlibay
Copy link
Member

IAlibay commented May 9, 2021

Sorry @cbouy in the interest of getting this merged faster & not forcing you to work on a Sunday I went ahead and did the update against develop.

Copy link
Member

@IAlibay IAlibay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Went ahead and fixed the doc issues.


.. versionadded:: 2.0.0


Converts an
`OpenMM <http://docs.openmm.org/latest/api-python/generated/simtk.openmm.app.topology.Topology.html#simtk.openmm.app.topology.Topology>`_
`OpenMM topology <http://docs.openmm.org/latest/api-python/generated/simtk.openmm.app.topology.Topology.html#simtk.openmm.app.topology.Topology>`_
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sphinx was complaining about doubling of OpenMM link definition, so I renamed the topology as such.

@@ -22,15 +22,15 @@
#

"""
RDKit topology parser
=====================
RDKit topology parser --- :mod:`MDAnalysis.converters.RDKitParser`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added these :mod: entries since they were present in the non Parser modules.


.. rubric:: Available converters

.. toctree::
:maxdepth: 1

converters/ParmEdParser
converters/RDKitParser
converters/init
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sphinx was complaining about the presence of init.rst without a toctree entry for it. I added it, although it isn't super informative as an entry, should we just remove it?

Copy link
Member

@lilyminium lilyminium May 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, we could move the converters.rst contents into converters.__init__ ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Went with the path of least resistance here and just moved a tiny bit of init's doc to the converters.rst. We can always clean up in 2.1.0 (or post beta) if it's really worth it.

@IAlibay
Copy link
Member

IAlibay commented May 9, 2021

Sorry again for hijaking the PR @cbouy, just added a couple of small commits (docs + tests) to see if we can get it merged today.

@IAlibay
Copy link
Member

IAlibay commented May 9, 2021

This is concerning, the 3.6 / numpy 1.16 run failed twice with the same error... (just restarted things to see if it'll do it a third time in a row)

[gw1] linux -- Python 3.6.13 /usr/share/miniconda/envs/test/bin/python

self = <test_rdkit.TestRDKitConverter object at 0x7f3a1297ceb8>, smi = '[He]'

    @pytest.mark.parametrize("smi", ["[H]", "C", "O", "[He]"])
    def test_single_atom_mol(self, smi):
        u = mda.Universe.from_smiles(smi, addHs=False,
                                     generate_coordinates=False)
        mol = u.atoms.convert_to("RDKIT")
        assert mol.GetNumAtoms() == 1
>       assert mol.GetAtomWithIdx(0).GetSymbol() == smi.strip("[]")
E       AssertionError: assert 'C' == 'He'
E         - He
E         + C

Ok, it's not done it again, it's probably the whole cache thing -- let's get this merged so we can get that one merged too.

edit: the failure rate for py3.6 numpy 1.16 is very high, I wonder if it's actually linked to rdkit 2020 vs 2021.

@IAlibay
Copy link
Member

IAlibay commented May 10, 2021

@lilyminium @orbeckst can we get a quick re-review with aim to merge please?

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After a quick read through, this looks good. I would just add an explicit statement to CHANGELOG about the new converter module. I added a suggestion — @IAlibay @cbouy you can either apply the suggestion or proceed without it if you think that's more appropriate.

package/CHANGELOG Show resolved Hide resolved
Copy link
Member

@lilyminium lilyminium left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks @cbouy and @IAlibay for pushing it forward

@IAlibay
Copy link
Member

IAlibay commented May 10, 2021

Only failing check here is codecov (just an error cover, can deal with it some other time).

Thanks for your hard work here @cbouy 🎉 -- squash merging

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

API for converters
8 participants