Add writing u.trajectory.ts.data['molecule_tag'] as molecule_tag atom attribute to LAMMPS datafile #4114

Merged: 6 commits merged on May 21, 2023

mglagolev
Contributor

@mglagolev mglagolev commented Apr 4, 2023

Fixes #3548

Being able to write the molecule ID to a LAMMPS data file is crucial for using MDAnalysis as a generator of LAMMPS initial configurations, for example, in machine learning workflows.
Currently, the molecule_tag field is read into atoms.resids, but when a data file is written, zeros are hardcoded in that field.
Following the discussion in #3548, I've implemented writing ts.data['molecule_tag'] into this field.

Changes made in this Pull Request:

Added data as an input parameter to DATAWriter._write_atoms.
If data['molecule_tag'] is present, _write_atoms writes its values as the molecule_tag attribute in the Atoms section of the data file. Otherwise, it fills molecule_tag with zeros, consistent with the previous behavior.

Added a special read-write-read function to the tests that sets data['molecule_tag'] from atoms.resids after reading the data file. Added a test class that checks the resids after this round trip.
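
For readers skimming the thread, the fallback described above can be sketched roughly as follows. This is a standalone illustration and not the code in LAMMPS.py: the function names format_atoms_section and read_molecule_tags, the argument layout, and the column order (atom ID, molecule tag, type, optional charge, x, y, z) are assumptions made for this example.

```python
from typing import List, Mapping, Optional, Sequence


def format_atoms_section(
    positions: Sequence[Sequence[float]],
    types: Sequence[int],
    data: Mapping[str, Sequence[int]],
    charges: Optional[Sequence[float]] = None,
) -> List[str]:
    """Return the lines of a LAMMPS-style "Atoms" section (hypothetical helper)."""
    n_atoms = len(positions)
    # Use the per-atom tags if the caller supplied them; otherwise fall back
    # to zeros, which the PR describes as the previous behavior.
    moltags = data.get("molecule_tag", [0] * n_atoms)
    if len(moltags) != n_atoms:
        raise ValueError("molecule_tag must have one entry per atom")

    lines = ["Atoms", ""]
    for i, (moltag, atype, (x, y, z)) in enumerate(zip(moltags, types, positions)):
        index = i + 1  # LAMMPS atom IDs start at 1
        if charges is not None:
            lines.append(
                f"{index:d} {int(moltag):d} {atype:d} {charges[i]:f} {x:f} {y:f} {z:f}"
            )
        else:
            # No charge column when the system carries no charge information.
            lines.append(f"{index:d} {int(moltag):d} {atype:d} {x:f} {y:f} {z:f}")
    return lines


def read_molecule_tags(lines: Sequence[str]) -> List[int]:
    """Read the second column back, skipping the header and the blank line."""
    return [int(line.split()[1]) for line in lines[2:]]
```

Calling read_molecule_tags on the output of format_atoms_section mirrors, in miniature, the read-write-read check described above: the tags that went in should come back unchanged, with or without a charge column.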

PR Checklist

  • [V] Tests?
  • Docs?
  • [V] CHANGELOG updated?
  • [V] Issue raised/referenced?

📚 Documentation preview 📚: https://readthedocs-preview--4114.org.readthedocs.build/en/4114/

@github-actions

github-actions bot commented Apr 4, 2023

Linter Bot Results:

Hi @mglagolev! Thanks for making this PR. We linted your code and found the following:

Some issues were found with the formatting of your code.

Code Location | Outcome
main package  | ⚠️ Possible failure
testsuite     | ⚠️ Possible failure

Please have a look at the darker-main-code and darker-test-code steps here for more details: https://github.com/MDAnalysis/mdanalysis/actions/runs/4870972704/jobs/8687365788


Please note: The black linter is purely informational, you can safely ignore these outcomes if there are no flake8 failures!

@codecov

codecov bot commented Apr 4, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.02 🎉

Comparison is base (fd978d2) 93.59% compared to head (eb7c177) 93.61%.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #4114      +/-   ##
===========================================
+ Coverage    93.59%   93.61%   +0.02%     
===========================================
  Files          192      192              
  Lines        25134    25140       +6     
  Branches      4056     4056              
===========================================
+ Hits         23524    23536      +12     
+ Misses        1092     1088       -4     
+ Partials       518      516       -2     
Impacted Files | Coverage Δ
package/MDAnalysis/coordinates/LAMMPS.py | 95.03% <100.00%> (+1.61%) ⬆️

... and 7 files with indirect coverage changes


@orbeckst
Member

orbeckst commented Apr 7, 2023

@hmacdope do you have expertise with LAMMPS to review this PR, or can you suggest someone else?

Member

@RMeli RMeli left a comment

A few small comments. I'll leave it to LAMMPS experts for a more in-depth review.

Three outdated, resolved review threads on package/MDAnalysis/coordinates/LAMMPS.py
Comment on lines 119 to 122
LAMMPSdata,
LAMMPSdata_mini,
LAMMPScnt,
LAMMPShyd,
Member

From the code coverage comment above, it seems that none of those covers the case where there is no charge. Can we add one such case?

@mglagolev
Contributor Author

@hmacdope Should I improve anything in this PR?

Member

@hmacdope hmacdope left a comment

@mglagolev sorry for the delay. This looks good. If you could address @RMeli's last comments about f-strings and add test coverage for a case with no charge, we should be good to go.

@mglagolev
Contributor Author

Thank you @hmacdope and, once again, @RMeli!
I've applied @RMeli's suggestions, except for running a test on an uncharged system.
There is an uncharged data file, which I added in my previous commit, but it is compressed.
When I add it to the test fixture, I run into an error, apparently because the filename is changed on writing from .data.bz2 to .data.data.

Here's a simple script to test this behavior:

import MDAnalysis as mda
import numpy as np

u = mda.Universe.empty(1, trajectory=True)
u.add_TopologyAttr('type')
u.add_TopologyAttr('mass')
atom = mda.core.groups.Atom(u=u, ix=0)
atom.position = np.array([0., 0., 0.])
u.atoms = mda.AtomGroup([atom])
u.add_bonds([])
u.add_angles([])
u.add_dihedrals([])
u.add_impropers([])
u.atoms.types = ['1']
u.atoms.masses = [1.]
u.dimensions = [1, 1, 1, 90, 90, 90]
with mda.Writer("test.data.bz2", n_atoms=1) as w:
    w.write(u.atoms)

All the attributes are set only so that the simplest possible data file is written without an error.
In my environments, using both the development branch and the version from pip, the data files are written under the name "test.data.data".
Am I missing something? Is it a bug or a feature?
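
Until the cause of the renaming is understood, one way to sidestep it might be to write the plain .data file first and compress it in a separate step. The sketch below uses only the Python standard library. The helper name compress_bz2 is hypothetical, and nothing here is confirmed by the thread as the recommended approach.

```python
import bz2
import shutil
from pathlib import Path


def compress_bz2(path):
    """Compress `path` to `<path>.bz2` and return the new path (hypothetical helper)."""
    src = Path(path)
    dst = src.with_name(src.name + ".bz2")
    # Stream the bytes across so large data files are not loaded into memory.
    with open(src, "rb") as fin, bz2.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst
```

For example, after u.atoms.write("test.data"), calling compress_bz2("test.data") would leave a test.data.bz2 file next to the uncompressed one.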

@mglagolev
Contributor Author

Maybe @fenilsuchak can help?

@RMeli
Member

RMeli commented Apr 28, 2023

Sounds odd, I'll have to double check this behaviour. Thanks for checking and reporting.

A possible workaround/alternative to get this PR done is to use del_TopologyAttr on the universe obtained from one of the current files to remove the charge attribute.

In [1]: from MDAnalysisTests.datafiles import LAMMPSdata

In [2]: import MDAnalysis as mda

In [3]: u = mda.Universe(LAMMPSdata)

In [4]: u.atoms.charges
Out[4]: array([0., 0., 0., ..., 0., 0., 0.])

In [5]: u.del_TopologyAttr('charges')

In [6]: u.atoms.charges
---------------------------------------------------------------------------
NoDataError                               Traceback (most recent call last)
Cell In[6], line 1
----> 1 u.atoms.charges

File ~/Documents/git/mdanalysis/package/MDAnalysis/core/groups.py:2539, in AtomGroup.__getattr__(self, attr)
   2537 elif attr == 'positions':
   2538     raise NoDataError('This Universe has no coordinates')
-> 2539 return super(AtomGroup, self).__getattr__(attr)

File ~/Documents/git/mdanalysis/package/MDAnalysis/core/groups.py:614, in GroupBase.__getattr__(self, attr)
    612     else:
    613         err = 'This Universe does not contain {singular} information'
--> 614         raise NoDataError(err.format(singular=cls.singular))
    615 else:
    616     return super(GroupBase, self).__getattr__(attr)

NoDataError: This Universe does not contain charge information

@mglagolev
Contributor Author

@RMeli, thanks for the idea!
The test passes without an error.

Member

@RMeli RMeli left a comment

Thanks @mglagolev. Small nitpick (you can safely ignore it), otherwise LGTM.

One outdated, resolved review thread on testsuite/MDAnalysisTests/coordinates/test_lammps.py
@mglagolev mglagolev requested a review from IAlibay May 4, 2023 13:18
@RMeli RMeli requested a review from hmacdope May 8, 2023 17:02
@mglagolev
Contributor Author

@IAlibay, sorry to bother you. Is anything else required to merge this?

@hmacdope hmacdope merged commit 374f0e9 into MDAnalysis:develop May 21, 2023
@RMeli
Member

RMeli commented May 21, 2023

Thanks for the contribution @mglagolev, and sorry if it took quite long to get it merged.

@mglagolev
Contributor Author

Great! Thanks to the helpful people here for making this small contribution possible :-)

By the way, @RMeli did you reproduce the issue with compressed filenames?

@mglagolev mglagolev deleted the lammps-write-molecule-tag branch May 23, 2023 16:39
@RMeli
Member

RMeli commented May 28, 2023

@mglagolev unfortunately no, I forgot. Thanks for reminding me! If you don't mind, could you please open an issue with your observation? Otherwise I'll open one.

@mglagolev
Contributor Author

@RMeli Sure!
#4159

Successfully merging this pull request may close these issues.

LAMMPS.DATAWriter should allow to write residue ID
5 participants