Making CIF metrics usable #52
@aozalevsky @brindakv recommendations are very welcome.
afaik, direct support of
where All scripts for pLDDT coloring for PyMOL rely on the As for PyMOL, seems like only the minimal useful information from
As for PAE, it's already supported in the ModelCIF as The low-level support for ModelCIF (including reading/writing metrics) is already implemented in the https://github.com/ihmwg/python-modelcif
Showing _ma_qa_metric_local_pairwise in ChimeraX is not related to the capability added by @e-pettersen in April 2023 to show per-residue attributes from _ma_qa_metric since a residue-pair score would be handled entirely differently than a score on individual residues. But I have used the AlphaFold PAE plot to show residue-residue distances as shown here
That code just rewrote the distances in the AlphaFold PAE JSON file format and read the file. I could possibly try some Python code in ChimeraX that reads the _ma_qa_metric_local_pairwise table from a ModelCIF entry and shows it with the AlphaFold PAE plot in ChimeraX. Could you point me to a ModelCIF file that has that table? The dictionary description of the table suggests it has all the information needed to display it like PAE data.
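A minimal sketch of the kind of rewrite described above, assuming the pairwise rows have already been read from the table as (row, column, value) triples with 1-based residue indices. The function name and input layout are hypothetical, and the output key names follow the AlphaFold Database PAE JSON layout as I understand it, so they should be checked against a real file before relying on them.

```python
import json


def pairs_to_pae_json(pairs, n_residues, fill=0.0):
    """Build an AlphaFold-style PAE JSON string from sparse pairwise scores.

    pairs: iterable of (i, j, value) with 1-based residue indices (assumed layout).
    Pairs missing from the input keep the `fill` value.
    """
    matrix = [[fill] * n_residues for _ in range(n_residues)]
    for i, j, value in pairs:
        # Row i is the aligned residue, column j the scored residue.
        matrix[i - 1][j - 1] = float(value)
    max_value = max(max(row) for row in matrix)
    # Key names are an assumption based on the AlphaFold Database PAE files.
    return json.dumps([{
        "predicted_aligned_error": matrix,
        "max_predicted_aligned_error": max_value,
    }])
```

The (i, j) and (j, i) entries are stored independently, so an asymmetric score survives the conversion unchanged.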
@arogozhnikov @tomgoddard I took one of the examples from the Chai-1 webserver and generated a CIF file with embedded PAEs. There are a few comments, though:
The ChimeraX PAE plotting for AlphaFold 2 and AlphaFold 3 assumes that the matrix rows and columns include all residues in order and, for AlphaFold 3, all atoms of all non-standard residues. I believe the AlphaFold 3 PAE files contain residue numbers and chain ids (not sure about atom names) for each PAE row, but ChimeraX is not using those because the ChimeraX code was written for early AlphaFold 2 PAE files, which had none of that information labeling the matrix rows. Also, the PAE matrix is not symmetric in AlphaFold: the PAE value is defined by aligning on the row residue/atom and then giving an error for the placement of the column residue/atom, so it is by definition not symmetric. As a result, ChimeraX has no provision to handle just one triangle of a symmetric matrix. All these ChimeraX PAE limitations could be fixed for ModelCIF. My purpose in this comment is just to explain how ChimeraX PAE plotting is currently based entirely on AlphaFold (and ESMFold) PAE conventions.
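To make the triangle point concrete: a plotting path that expects a full, possibly asymmetric matrix needs a one-triangle symmetric score mirrored first. A minimal sketch, assuming 0-based (i, j, value) entries with i <= j. The helper names are hypothetical and this is not ChimeraX code.

```python
def expand_upper_triangle(entries, n):
    """Return a full n x n matrix from upper-triangle entries of a symmetric score."""
    full = [[None] * n for _ in range(n)]
    for i, j, value in entries:
        if i > j:
            raise ValueError(f"entry ({i}, {j}) is not in the upper triangle")
        full[i][j] = value
        full[j][i] = value  # mirror; only valid for symmetric scores
    return full


def is_symmetric(matrix, tol=1e-9):
    """Check whether a full matrix is symmetric (AlphaFold PAE generally is not)."""
    n = len(matrix)
    return all(
        abs(matrix[i][j] - matrix[j][i]) <= tol
        for i in range(n)
        for j in range(i + 1, n)
    )
```

Mirroring is only appropriate when the metric itself is symmetric. Applying it to AlphaFold PAE would discard the row/column distinction described above.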
First of all let me state that it is great to see the interest in properly storing the quality metrics using ModelCIF. Given my involvement in ModelCIF and ModelArchive (MA), I can hopefully provide some useful input here.
ChimeraX plots the AlphaFold 3 PAE scores and handles its combination of residue and atom level values (one PAE value per residue for standard amino and nucleic acids, and one PAE value per atom for all other atoms). Specifically, it knows that the per-residue score is a residue score, not just a score for the C-alpha atom. If you treat it as a C-alpha atom score you lose some knowledge about the meaning of the score. I agree it is messy to handle though.
I made an example ChimeraX command to plot pairwise residue scores included directly in ModelCIF structure files described here
Yeah, the "meaning of AlphaFold 3 PAE scores" is a tricky one here. In terms of what is being predicted, it relates only to the CA/C1' atoms for standard amino and nucleic acids. But I could make the same argument about per-residue pLDDT and PAE in AF2. In terms of interpreting the score (i.e. making it useful), I of course agree that PAE values for standard amino and nucleic acids relate to the whole residue and not just to the CA/C1' atoms. Anyway, I updated my suggested update of ModelCIF (ihmwg/ModelCIF#21) with an idea of how one could represent QE metrics which are listed in separate ModelCIF tables. I.e. the same PAE metric would then have values in
A simple idea for making the mmCIF tables handle pairwise scores between atoms and residues is to add two atom name fields to the _ma_qa_metric_local_pairwise table
If the atom name were given as "." then that would mean the score applies to the whole residue, while if an atom is named, e.g. "CA", then the score is associated with that atom. This allows a pair score to be between an atom and a residue, as is done in AlphaFold 3. I think it will be important to handle AlphaFold 3 PAE scores. They will probably be the most heavily used pairwise scores in the next few years.
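A small sketch of how a reader might interpret that convention, assuming each end of a pair is identified by a chain id, a sequence id, and an atom-name string. The class, function, and field names here are hypothetical illustrations, not ModelCIF data item names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Endpoint:
    chain: str
    seq_id: int
    atom: Optional[str]  # None means the score applies to the whole residue


def parse_endpoint(chain, seq_id, atom_name):
    """Interpret "." as 'whole residue' and anything else as an atom name."""
    atom = None if atom_name == "." else atom_name
    return Endpoint(chain, int(seq_id), atom)


def describe(ep):
    """Human-readable label for one end of a pairwise score."""
    target = "residue" if ep.atom is None else f"atom {ep.atom}"
    return f"{ep.chain}:{ep.seq_id} ({target})"
```

With this, a residue-to-atom pair such as `parse_endpoint("A", 12, ".")` against `parse_endpoint("B", 1, "CA")` can be represented without a separate table for mixed-level scores.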
We have a somewhat similar concept in ihm for crosslinks (which are also, typically, pairwise restraints).
Agree with the above. Supporting AF3 PAE will be important and we need to make sure that whatever we implement can handle pairs between single atoms and full residues. I will update the suggestion in ihmwg/ModelCIF#21 accordingly. One worry of adding more columns to
We recently switched from PDB > CIF with two main motivations:
We followed the recommendation to use modelcif, and specifically _ma_qa_metric, but I don't see a way to color by this field in PyMOL, and I don't see pairwise metrics in PyMOL either. I need help figuring out what's the right pipeline for users.