Add missing crosslink attributes #118
Codecov Report
Additional details and impacted files

@@            Coverage Diff             @@
##             main     #118      +/-   ##
==========================================
- Coverage   99.76%   97.01%   -2.76%
==========================================
  Files          29       28       -1
  Lines        7296     6728     -568
  Branches     1749     1121     -628
==========================================
- Hits         7279     6527     -752
- Misses         11      180     +169
- Partials        6       21      +15

... and 5 files with indirect coverage changes
As a general rule, python-ihm does not store IDs (instead, we use pointers to other Python objects) and does not read redundant information (which this is). All of this information is already in the cross link object.
From a user perspective, this is somewhat unintuitive behavior. mmCIF is a table-based format, so when I see a table in the mmCIF file and see the field names, I expect to be able to access this information. For instance, you can currently get asym and atom from xl, but not the residue (neither residue number nor residue name). You have to guess that the residue is linked inside the experimental_crosslink. I don't see a reason why one would want to impose a non-redundant scheme (essentially filtering and separating the information manually) if redundancy is part of the format. Linking to instances is completely OK unless it obscures the information. Again, as in the example with why not be consistent?
That is not how python-ihm is designed. The internal representation is a hierarchy of Python objects, not a bunch of tables.
No guessing is required. We can certainly add convenience properties where necessary to reduce the number of hops though.
Nothing has a
I've added convenience accessors so this information should be available for a given
This should work for the majority of depositions. If memory serves, there are one or two where they elected to enforce cross-links on different residues from those identified experimentally (these are easy to see because the comp_ids in the mmCIF file are not all LYS, for example). python-ihm doesn't currently handle that; see #119. (If your intention is to preserve data exactly as read from the mmCIF file, python-ihm probably isn't the best tool for the job because it is not designed to do that. Although you can use its low-level classes if you want to read the file as a bunch of tables, there are other tools such as Biopython which can do that too.)
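To make the "hierarchy of objects plus convenience accessors" idea concrete, here is a minimal sketch of the pattern being discussed. It is not python-ihm's actual API: the class names (`Residue`, `ExperimentalCrossLink`, `ModeledCrossLink`) and their attributes are hypothetical, chosen only to show how a read-only property can expose residue information without storing it a second time.

```python
from dataclasses import dataclass

# NOTE: every class and attribute name below is a hypothetical illustration
# of the design pattern, not python-ihm's real interface.


@dataclass(frozen=True)
class Residue:
    asym_id: str   # chain the residue belongs to
    seq_id: int    # residue number
    comp_id: str   # residue name, e.g. "LYS"


@dataclass(frozen=True)
class ExperimentalCrossLink:
    # The experimentally identified pair of residues is stored once, here.
    residue1: Residue
    residue2: Residue


@dataclass
class ModeledCrossLink:
    # Holds a reference to the experimental object instead of copying IDs.
    experimental: ExperimentalCrossLink
    atom1: str
    atom2: str

    @property
    def residue1(self) -> Residue:
        # Convenience accessor: saves the caller one hop through
        # `experimental`, and introduces no redundant state.
        return self.experimental.residue1

    @property
    def residue2(self) -> Residue:
        return self.experimental.residue2


exp = ExperimentalCrossLink(Residue("A", 12, "LYS"), Residue("B", 40, "LYS"))
xl = ModeledCrossLink(exp, atom1="CA", atom2="CA")

# Both routes reach the very same object.
assert xl.residue1 is xl.experimental.residue1
```

Because the property returns the object held by the experimental cross-link, this sketch shares the limitation mentioned above: it cannot represent a restraint enforced on residues other than those identified experimentally.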
Some mandatory attributes are missing from the interface in the current version: