metals.json updated #57

Merged
merged 1 commit into from
Sep 17, 2024

Conversation

GaribMurshudov
Contributor

@GaribMurshudov GaribMurshudov commented Sep 4, 2024

New metal-ligand distance file. The new file contains distance statistics for all metal/non-metal "bonds", broken down by coordination number. If the distance distribution has multiple modes, all of them are listed (small modes may be absent).

@GaribMurshudov
Contributor Author

Do I need to do anything else on this branch?

@keitaroyam
Contributor

A short description of how the file was updated would help.

I need to do some tests.

@GaribMurshudov
Contributor Author

I added some description.

@keitaroyam
Contributor

I just wondered why some MAD values were so different from the old metals.json. For example, K-O distance:

# Old
   coord    median       mad      mean       std  count
0      3  2.753206  0.155887  2.808922  0.191985     36
1      4  2.602137  0.046944  2.618844  0.059042     34
2      6  2.761015  0.103126  2.754573  0.139016    722
3      8  2.818235  0.040012  2.818390  0.060503   4680
4     10  2.881260  0.057027  2.879357  0.073333     76

# New
   coord    median       mad      mean       std  count
0      2  3.251116  0.136209  2.905904  0.574963      8
1      3  2.753206  0.096156  2.808922  0.189300     36
2      4  2.602137  0.030799  2.618844  0.058167     34
3      6  2.761015  0.077418  2.754573  0.138920    722
4      8  2.818235  0.027879  2.818390  0.060496   4680
5      9  2.852053  0.031486  2.843358  0.081653     18
6     10  2.879385  0.052707  2.883545  0.075945     87
7     11  2.675899  0.008922  2.698282  0.046524      4
8     12  2.799075  0.074610  2.822165  0.110755     16
9     16  2.782353  0.024644  2.772753  0.042812     16

For coord = 8, the MAD was 0.040 and is now 0.028, despite the same count. Cs-Cl differs even more:

# Old
   coord   median       mad      mean       std  count
0      3  3.48288  0.000000  3.482880  0.000000      1
1      6  2.91600  0.240004  3.095503  0.270005      9

# New
   coord   median    mad      mean       std  count
0      3  3.48288  0.000  3.482880  0.000000      1
1      6  2.91600  0.001  3.095503  0.254563      9

Has the calculation method been changed?

@Lekaveh

Lekaveh commented Sep 17, 2024

I checked and found that the previous version used the deprecated "mad" function from pandas, which actually computes the mean absolute deviation. The latest calculation uses the median absolute deviation (np.median(np.abs(distances - np.median(distances)))).
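
To make the difference concrete, here is a minimal sketch of the two statistics. It uses only the Python standard library so that it runs without dependencies; the median-based function is meant to mirror the NumPy expression quoted above. The function names and the `distances` sample are made up for illustration and are not taken from the generation script or from metals.json.

```python
from statistics import fmean, median


def mean_abs_deviation(values):
    """Mean absolute deviation around the mean (what the removed pandas .mad() returned)."""
    center = fmean(values)
    return fmean([abs(v - center) for v in values])


def median_abs_deviation(values):
    """Median absolute deviation around the median."""
    center = median(values)
    return median([abs(v - center) for v in values])


# Hypothetical distances in angstroms with one outlier; NOT real metals.json data.
distances = [2.80, 2.81, 2.82, 2.82, 2.83, 2.84, 3.10]

mean_ad = mean_abs_deviation(distances)
median_ad = median_abs_deviation(distances)

print(f"mean absolute deviation:   {mean_ad:.4f}")
print(f"median absolute deviation: {median_ad:.4f}")
```

With a sample like this, a single outlier inflates the mean-based value while the median-based value barely moves, which would be consistent with the new MAD column being smaller for the same counts.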

@keitaroyam
Contributor

Thank you! Could you provide a link to the calculation code, if it is available on GitHub?

@Lekaveh

Lekaveh commented Sep 17, 2024

It is not available on GitHub. I can post it here.

@keitaroyam keitaroyam merged commit 80ddcba into master Sep 17, 2024
1 check passed
@keitaroyam keitaroyam deleted the update-metals-json branch September 17, 2024 13:43
@keitaroyam
Contributor

This metals.json has some inappropriate values. For example, the Mg-O entry shown below (coord 12) has a far too small std/mad, which causes a very strong force in (servalcat) refinement if the Mg-O distance is long in the model.

 {'coord': 12,
  'median': 2.081025467416805,
  'mad': 0.00028628237234551435,
  'mean': 2.0810127465104333,
  'std': 0.00033989375892045,
  'count': 12,
  'modes': [{'mode': 2.0810127465104333,
    'std': 0.00033989375892045,
    'weight': 1.0}]}

Servalcat was fixed in 0.4.94 to cap the sigma, but that version has not been released yet. It is probably better to fix metals.json as well?
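
As a rough illustration of why such a small sigma is a problem, the sketch below computes how many sigmas a model distance lies from the ideal value. It assumes a generic restraint whose strength scales with (distance - ideal) / sigma; this is an assumption for illustration and not necessarily servalcat's exact target function. The model distance of 2.18 and the wider sigma of 0.02 are hypothetical values.

```python
def z_score(model_distance, ideal_distance, sigma):
    """Deviation from the ideal distance, in units of sigma."""
    return (model_distance - ideal_distance) / sigma


# Median and std copied from the Mg-O coord 12 entry quoted above.
ideal = 2.081025467416805
raw_sigma = 0.00033989375892045

model = 2.18        # hypothetical Mg-O distance in the model (angstroms)
wider_sigma = 0.02  # hypothetical, more forgiving sigma (angstroms)

z_raw = z_score(model, ideal, raw_sigma)
z_wide = z_score(model, ideal, wider_sigma)

print(f"z with tabulated sigma: {z_raw:.1f}")
print(f"z with wider sigma:     {z_wide:.2f}")
```

Under these assumptions a deviation of about 0.1 is hundreds of sigmas away with the tabulated value, but only a few sigmas away with the wider one.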

@GaribMurshudov
Contributor Author

It looks like there is only one example in the COD of Mg with coordination 12, and all these values come from that example. Capping is good practice.
How should we change metals.json? Should it be done by @Lekaveh during generation, or should there be a postprocessing step?
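
For the postprocessing option, one possible sketch is below. It assumes only the per-entry keys visible in the Mg-O example above (`std`, `mad`, and a `modes` list whose items have `std`); the overall layout of metals.json is not shown in this thread, so the function works on a plain list of such entries. The function name `floor_sigmas` and the `MIN_SIGMA` value are hypothetical and are not taken from servalcat or from this PR.

```python
import copy

MIN_SIGMA = 0.02  # hypothetical lower bound in angstroms; choose a real value deliberately


def floor_sigmas(entries, min_sigma=MIN_SIGMA):
    """Return a copy of the entries with 'std', 'mad' and each mode's 'std' raised to at least min_sigma."""
    fixed = copy.deepcopy(entries)  # leave the input untouched
    for entry in fixed:
        for key in ("std", "mad"):
            if key in entry:
                entry[key] = max(entry[key], min_sigma)
        for mode in entry.get("modes", []):
            mode["std"] = max(mode["std"], min_sigma)
    return fixed


# The Mg-O coord 12 entry quoted earlier in this thread.
mg_o = [{
    "coord": 12,
    "median": 2.081025467416805,
    "mad": 0.00028628237234551435,
    "mean": 2.0810127465104333,
    "std": 0.00033989375892045,
    "count": 12,
    "modes": [{"mode": 2.0810127465104333,
               "std": 0.00033989375892045,
               "weight": 1.0}],
}]

fixed = floor_sigmas(mg_o)
print(fixed[0]["std"], fixed[0]["mad"], fixed[0]["modes"][0]["std"])
```

Doing the same clamp during generation would avoid a second pass, while a postprocessing step like this keeps the raw statistics reproducible; which is preferable is left open here.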
