increased the maximum number of matches from an rdkit smarts query #3470

Merged on Jun 1, 2022 (50 commits)
Changes from 11 commits
Commits
7a57af1
Fixed issue #2915 and added a pytest to demonstrate the fix. Sphzone …
orionarcher Apr 2, 2021
7a2a155
Merge branch 'issue_#2915' into develop. Updates to selection and sel…
orionarcher Apr 3, 2021
89a38d3
removed a unncesessary TODO statement
orionarcher Apr 3, 2021
a9194a3
added issue #2915 fix to CHANGELOG
orionarcher Apr 3, 2021
dcb23b5
added self to AUTHORS and CHANGELOG, moved lines in selection
orionarcher Apr 3, 2021
889b28d
fixed issue #2915 for cylayer, cyzone, and sphlayer. added testing to…
orionarcher Apr 3, 2021
96b63e3
created decorator to test for empty selections. moved around testing …
orionarcher Apr 3, 2021
d5fde7d
added another blank line to be consistent with PEP-8
orionarcher Apr 3, 2021
8a81f54
seperated testing for empty atom selection for sph* and cy* selection…
orionarcher Apr 4, 2021
a4c7e41
added stacked decorators instead of repeated functionality in one dec…
orionarcher Apr 4, 2021
b6dfd9a
removed the decorator implementation and added code directly into the…
orionarcher Apr 4, 2021
0cc38ad
updated CHANGELOG with more detail
orionarcher Apr 5, 2021
d03d329
Merge branch 'develop' of https://github.com/MDAnalysis/mdanalysis in…
orionarcher May 22, 2021
d0198e1
increased the maximum number of matches from an rdkit smarts query
orionarcher Nov 29, 2021
3ac808c
update changelog
orionarcher Nov 29, 2021
ad874f0
set default max_matches to largest unsigned int and add a configurabl…
orionarcher Nov 30, 2021
b2a9446
add better documentation to the smarts selection query
orionarcher Nov 30, 2021
da317ed
added a test for max_matches behavior
orionarcher Nov 30, 2021
bef38a9
change max_matches -> maxMatches and set default max to group.n_atoms
orionarcher Nov 30, 2021
547d8f6
update CHANGELOG to reflect recent changes
orionarcher Nov 30, 2021
937e580
clean up syntax and expose useChirality kwarg
orionarcher Nov 30, 2021
4e126d1
add better documentation for rdkit_kwargs
orionarcher Nov 30, 2021
086e050
use two kwargs variables to pass kwargs to smarts queries
orionarcher Dec 2, 2021
14d163a
update testing to use smarts_kwargs
orionarcher Dec 2, 2021
d89b131
small bug fix
orionarcher Dec 3, 2021
1d7d898
updated testing to increase code coverage
orionarcher Dec 3, 2021
207e858
cleaned up testing for smarts kwargs
orionarcher Dec 3, 2021
a210d43
updated and clarified documentation
orionarcher Dec 10, 2021
8a67c76
cleaned up code style and used setdefaults rather than if statement
orionarcher Dec 10, 2021
dc5a704
added versionchaged info at 2.0.1
orionarcher Dec 10, 2021
e984be0
returned to default to 1000
orionarcher Jan 14, 2022
cbcba0f
update changelog to reflect no change to defaults
orionarcher Jan 14, 2022
8c252d2
Merge branch 'develop' into increase_smarts_matches
IAlibay Jan 16, 2022
04b415b
Updating to new develop
orionarcher Jan 28, 2022
6317edb
Merge branch 'develop' of https://github.com/MDAnalysis/mdanalysis in…
orionarcher May 16, 2022
8fff72a
Merge branch 'develop' into increase_smarts_matches
orionarcher May 16, 2022
fc129f6
Merge branch 'develop' into increase_smarts_matches
IAlibay May 16, 2022
e57695a
added docstring parameter for smarts_kwargs
orionarcher May 18, 2022
9eabb4c
updated documentation description for smarts query and duplicated in …
orionarcher May 18, 2022
ace0c57
Merge branch 'increase_smarts_matches' of https://github.com/orioncoh…
orionarcher May 18, 2022
ae5c5e6
change default maxMatches to 10 * n_atoms and add test
orionarcher May 19, 2022
1e6eba0
update documentation for smarts kwargs
orionarcher May 19, 2022
e0e28a3
add smarts kwargs warning to docs
orionarcher May 19, 2022
e07f40b
merge develop
orionarcher May 27, 2022
92828c8
Merge branch 'develop' into increase_smarts_matches
orionarcher May 27, 2022
9574661
Merge branch 'develop' into increase_smarts_matches
IAlibay May 31, 2022
e38e219
various doc improvements and maxMatches fix
IAlibay Jun 1, 2022
e7d6a68
update selections.rst
IAlibay Jun 1, 2022
30c3f35
Merge branch 'develop' into increase_smarts_matches
IAlibay Jun 1, 2022
c739404
fix tests
IAlibay Jun 1, 2022
5 changes: 4 additions & 1 deletion package/CHANGELOG
@@ -13,7 +13,8 @@ The rules for this file:
   * release numbers follow "Semantic Versioning" http://semver.org

 ------------------------------------------------------------------------------
-??/??/?? IAlibay, melomcr, mdd31, ianmkenney, richardjgowers, hmacdope, orbeckst, scal444
+??/??/?? IAlibay, melomcr, mdd31, ianmkenney, richardjgowers, hmacdope,
+         orbeckst, scal444, orioncohen
   * 2.1.0

 Fixes
@@ -29,6 +30,8 @@ Fixes
 Enhancements
   * Add option for custom compiler flags for C/C++ on build and remove march
     option from setup.cfg (Issue #3428, PR #3429).
+  * Increase smarts atom selection limit from 1000 -> n_atoms. Add
+    maxMatches and useChirality to smarts_kwargs. (Issue #3469, PR #3470)

 Changes
   * Dropped python 3.6 support and raised minimum numpy version to 1.18.0
15 changes: 12 additions & 3 deletions package/MDAnalysis/core/groups.py
@@ -2826,7 +2826,7 @@ def ts(self):

     def select_atoms(self, sel, *othersel, periodic=True, rtol=1e-05,
                      atol=1e-08, updating=False, sorted=True,
-                     rdkit_kwargs=None, **selgroups):
+                     rdkit_kwargs=None, smarts_kwargs=None, **selgroups):
         """Select atoms from within this Group using a selection string.

         Returns an :class:`AtomGroup` sorted according to their index in the
@@ -2977,7 +2977,15 @@ def select_atoms(self, sel, *othersel, periodic=True, rtol=1e-05,
             smarts *SMARTS-query*
                 select atoms using Daylight's SMARTS queries, e.g. ``smarts
                 [#7;R]`` to find nitrogen atoms in rings. Requires RDKit.
-                All matches (max 1000) are combined as a unique match
+                All matches are combined as a unique match. Uses two sets of
+                kwargs: rdkit_kwargs is passed to `RDKitConverter.convert()`
+                and smarts_kwargs is passed to RDKit's [GetSubstructMatches](
+                https://www.rdkit.org/docs/source/
+                rdkit.Chem.rdchem.html#rdkit.Chem.rdchem.Mol.GetSubstructMatches).
+                The useChirality kwarg is True by default.
+
+                >>> universe.select_atoms("smarts C", smarts_kwargs={"maxMatches": 100})
+                <AtomGroup with 100 atoms>

         **Boolean**

@@ -3147,7 +3155,8 @@ def select_atoms(self, sel, *othersel, periodic=True, rtol=1e-05,
                                                    periodic=periodic,
                                                    atol=atol, rtol=rtol,
                                                    sorted=sorted,
-                                                   rdkit_kwargs=rdkit_kwargs)
+                                                   rdkit_kwargs=rdkit_kwargs,
+                                                   smarts_kwargs=smarts_kwargs)
                             for s in sel_strs))
         if updating:
             atomgrp = UpdatingAtomGroup(self, selections, sel_strs)
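The signature change in groups.py keeps two keyword dictionaries apart because they target different callees: one goes to the converter, the other to the substructure matcher. A minimal sketch of that routing follows. `convert_stub`, `match_stub` and `select` are hypothetical stand-ins for illustration, not MDAnalysis or RDKit functions.

```python
def convert_stub(**kwargs):
    """Hypothetical stand-in for the converter; reports what it received."""
    return {"converter_kwargs": kwargs}


def match_stub(**kwargs):
    """Hypothetical stand-in for the matcher; reports what it received."""
    return {"matcher_kwargs": kwargs}


def select(rdkit_kwargs=None, smarts_kwargs=None):
    # None defaults avoid sharing one mutable dict between calls
    rdkit_kwargs = rdkit_kwargs or {}
    smarts_kwargs = smarts_kwargs or {}
    # each dictionary reaches only its own callee
    return convert_stub(**rdkit_kwargs), match_stub(**smarts_kwargs)


converted, matched = select(rdkit_kwargs={"force": True},
                            smarts_kwargs={"maxMatches": 2})
print(converted)
print(matched)
```

Keeping the dictionaries separate means a matcher option such as `maxMatches` is never handed to the converter, which would not recognise it.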
11 changes: 9 additions & 2 deletions package/MDAnalysis/core/selection.py
@@ -643,6 +643,7 @@ def __init__(self, parser, tokens):
             pattern.append(val)
         self.pattern = "".join(pattern)
         self.rdkit_kwargs = parser.rdkit_kwargs
+        self.smarts_kwargs = parser.smarts_kwargs

     def _apply(self, group):
         try:
@@ -656,7 +657,12 @@ def _apply(self, group):
         if not pattern:
             raise ValueError(f"{self.pattern!r} is not a valid SMARTS query")
         mol = group.convert_to("RDKIT", **self.rdkit_kwargs)
-        matches = mol.GetSubstructMatches(pattern, useChirality=True)
+        # override GetSubstructMatches default values
+        if "useChirality" not in self.smarts_kwargs:
+            self.smarts_kwargs["useChirality"] = True
+        if "maxMatches" not in self.smarts_kwargs:
+            self.smarts_kwargs["maxMatches"] = group.n_atoms
+        matches = mol.GetSubstructMatches(pattern, **self.smarts_kwargs)
         # convert rdkit indices to mdanalysis'
         indices = [
             mol.GetAtomWithIdx(idx).GetIntProp("_MDAnalysis_index")
@@ -1388,7 +1394,7 @@ def expect(self, token):
                 "".format(self.tokens[0], token))

     def parse(self, selectstr, selgroups, periodic=None, atol=1e-08,
-              rtol=1e-05, sorted=True, rdkit_kwargs=None):
+              rtol=1e-05, sorted=True, rdkit_kwargs=None, smarts_kwargs=None):
         """Create a Selection object from a string.

         Parameters
@@ -1433,6 +1439,7 @@ def parse(self, selectstr, selgroups, periodic=None, atol=1e-08,
         self.rtol = rtol
         self.sorted = sorted
         self.rdkit_kwargs = rdkit_kwargs or {}
+        self.smarts_kwargs = smarts_kwargs or {}

         self.selectstr = selectstr
         self.selgroups = selgroups
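The default handling in the `_apply` hunk above can be sketched in isolation. `with_smarts_defaults` is a hypothetical helper, not part of MDAnalysis. It assumes the intended defaults are `useChirality=True` and `maxMatches=n_atoms`, as the CHANGELOG entry in this diff states; copying the dictionary before filling it is a choice made for this sketch, whereas the hunk fills `self.smarts_kwargs` in place.

```python
def with_smarts_defaults(smarts_kwargs, n_atoms):
    """Return a copy of smarts_kwargs with missing defaults filled in."""
    filled = dict(smarts_kwargs)  # copy so the caller's dict is untouched
    # setdefault only writes a key that the user did not supply
    filled.setdefault("useChirality", True)
    filled.setdefault("maxMatches", n_atoms)
    return filled


user = {"maxMatches": 2}
print(with_smarts_defaults(user, n_atoms=50))
print(with_smarts_defaults({}, n_atoms=50))
```

A user-supplied `maxMatches` survives, and an empty dictionary picks up both defaults. The commit list mentions moving to `setdefault` (8a67c76), which is the same idea applied directly to the stored dictionary.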
8 changes: 7 additions & 1 deletion testsuite/MDAnalysisTests/core/test_atomselections.py
@@ -575,11 +575,17 @@ def test_invalid_smarts_sel_raises_error(self, u2):
         with pytest.raises(ValueError, match="not a valid SMARTS"):
             u2.select_atoms("smarts foo")

-    def test_passing_args_to_converter(self):
+    def test_passing_rdkit_kwargs_to_converter(self):
         u = mda.Universe.from_smiles("O=C=O")
         sel = u.select_atoms("smarts [$(O=C)]", rdkit_kwargs=dict(force=True))
         assert sel.n_atoms == 2

+    def test_passing_smarts_kwargs_to_converter(self, u2):
+        sel = u2.select_atoms("smarts C", smarts_kwargs=dict(maxMatches=2))
+        assert sel.n_atoms == 2
+        sel2 = u2.select_atoms("smarts c")
+        assert sel2.n_atoms == 4
+

 class TestSelectionsNucleicAcids(object):
     @pytest.fixture(scope='class')
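The new test expects `maxMatches=2` to give exactly two atoms for `smarts C`. That holds because each match of a one-atom pattern contributes one index and, per the docstring, all matches are combined into a unique set. A minimal sketch of that combination step follows, assuming matches arrive as a tuple of index tuples (the shape RDKit's `GetSubstructMatches` returns). `combine_matches` is a hypothetical name, not the MDAnalysis implementation.

```python
def combine_matches(matches):
    """Flatten match tuples into a sorted list of unique atom indices."""
    return sorted({idx for match in matches for idx in match})


single_atom = ((0,), (3,))     # two matches of a one-atom pattern
two_atom = ((0, 1), (1, 2))    # two matches of a two-atom pattern, sharing atom 1
print(combine_matches(single_atom))
print(combine_matches(two_atom))
```

For multi-atom patterns the number of selected atoms is therefore not simply the number of matches: overlapping matches share atoms, so `maxMatches` limits matches rather than atoms.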