Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mmtf parser concatenates models #1496

Closed
kain88-de opened this issue Jul 17, 2017 · 11 comments
Closed

mmtf parser concatenates models #1496

kain88-de opened this issue Jul 17, 2017 · 11 comments

Comments

@kain88-de
Copy link
Member

kain88-de commented Jul 17, 2017

EDIT by @jbarnoud: The issue actually comes from the handling of models by the mmtf paser. See discussion.

Expected behaviour

both give the same number of segments for 1ubq

Actual behaviour

the mmtf parser gives two segments and the pdb parser gives one

Code to reproduce the behaviour

import MDAnalysis as mda

u = mda.fetch_mmtf('1UBQ')
print(u.atoms.segments)
u = mda.Universe('1ubq.pdb')
print(u.atoms.segments)

....

Currently version of MDAnalysis:

0.17.0dev (couple days old)

@kain88-de
Copy link
Member Author

Behavior is even worse for *1Q0W` where the mmtf parser finds 40 segments while the PDB parser finds only 2.

@jbarnoud
Copy link
Contributor

It seems that the problem is not so much the segments, but the whole content of the files:

import MDAnalysis as mda

u_mmtf = mda.fetch_mmtf('1Q0W')
print(len(u_mmtf.atoms))  # 32420
u_pdb = mda.Universe('/home/jon/Downloads/1q0w.pdb')
print(len(u_pdb.atoms))  # 1621

The MMTF file contains 20 times the number of atoms the PDB file has. Looking at the PDB file, there are 20 models. The issue seems to be that we do not account for models in the same way in the PDB and the MMTF parsers.

@richardjgowers
Copy link
Member

Yeah so PDB uses models to differentiate between frames in a trajectory. In MMTF we don't do this, but maybe we should? Ie we could use u.trajectory[i] to select the ith model. I think the problem with this was that models didn't necessarily have the same number of atoms, which breaks our trajectory iteration model.

@kain88-de
Copy link
Member Author

We could default to the first model though.

@jbarnoud
Copy link
Contributor

That is what we do with the PDB parser. Actually, we do not even read after the first model when reading the topology: https://github.com/MDAnalysis/mdanalysis/blob/develop/package/MDAnalysis/topology/PDBParser.py#L158.

@jbarnoud jbarnoud changed the title mmtf parser gives different number of segments then pdb parser mmtf parser concatenates models Jul 25, 2017
@richardjgowers
Copy link
Member

@jbarnoud for topology purposes, we read the first model and assume that all other models have the same topology, the coordinate reader will iterate (the coordinates) over models

@orbeckst
Copy link
Member

One issue is the handling of models, the other the handling of segments or did I misunderstand?

For PDB, segments are parts of a model. Is the same true for mmtf? Are there even segments defined in MMTF?

Assuming that a model broadly means the same thing in both formats then I would also go with the current approach to choose the first model for topology building. I think MMTFParser effectively switches the topology when another model is selected, doesn't it?

@richardjgowers
Copy link
Member

@orbeckst the problem is that different models can have a different number of atoms. We currently can't change the topology with frame (model) changes.

The PDB format using models as frames is a special case, as the number of atoms remains constant, which is why we can treat it as a traj.

@kain88-de
Copy link
Member Author

The PDB format using models as frames is a special case, as the number of atoms remains constant, which is why we can treat it as a traj.

That is what most PDBs do. Actually I do not know of a rule in PDB that forbids that models have a different number of atoms. It's just what most people implicitly do, especially since it's abused as a trajectory format.

@orbeckst
Copy link
Member

orbeckst commented Aug 22, 2017 via email

@richardjgowers
Copy link
Member

probably not relevant now mmtf is going extinct

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants