-
Notifications
You must be signed in to change notification settings - Fork 70
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
So what if i dont know the charges #43
Comments
The first screen photo looks like atoms of oxygen is assigned a negative charge each,
If this snippet is about the complete molecule, the total charge would be equal to
or xyz2mol.py molecule.xyz --charge -4 respectively because the earlier line is about "create bonds for all molecule's atoms and an overall charge of ...". However, for the size of the molecule, Share the xyz file in question, for instance within a zip then attached to the mask here. |
@Ltwicke The default bond perception method uses interatomic distances together with cutoffs. The interatomic distances can be slightly different for different conformers, which can give different results for different conformers. You can try different value of covFactor or useHueckel=True to see if that helps. It's also important that you use the correct charge, i.e. the same charge you used for your QM calculations |
@nbehrnd Your not entirely correct, it also assigns the charge 1 to C twice (always below the two Os), so the total charge would therefore be -2. The xyz "file" is actually just a string block created shortly to make the mol object. @jhjensen2 The questions still remains, why the bond perception method needs the charge to work. I assume that when for some conformers, the distance between two atoms becomes so large, the algorithm tries to justify it by assigning a charge to the molecule and then the charge parameter is more like an assertion that this assumption is right? Then i still wonder, why there is no option to just infer the charge in this sense... |
@Ltwicke Better to share these data as plain text, than by a screen photo. Request the script to write a .xyz file .or. to copy-paste it -- it is less prone to errors in transmission, and faster to then replicate your observations. Post it as a fenced code block (parlance of [GitHub flavored] markdown, cf. for instance with learnxinyminutes) after a revision (vide infra). I run an online OCR (here) with the second screen photo of yours. The initial result ( An initial inspection with Jmol of the structure -- without assignment of bond orders different to one. It rises an eyebrow because the motif of two oxygen atoms next to each other in an organic molecule (i.e., an organic peroxide) usually yields an unstable molecule with tendency to decompose. Even more so if this motif occurs twice per molecule. And check the peroxide oxygen atoms depicted as trivalent, too. Playground: I once wrote a moderator xyz2mol_b to easier test an overall charge different from zero. Against the odds, the submission of python xyz2mol_b.py intermediate.xyz -c -4 > molecule_minus4.mol were in line with a sum of 4 negative charges. The attempt with |
Hey, sorry, your right, i was already kind of giving up on the probem. Heres an actual example from my dataset: This is the xyz string: '26\n\nC 3.048553 -0.031229 0.757984\nC 2.511941 -0.828616 -0.409149\nO 3.206721 -1.692353 -0.970628\nN 1.181584 -0.712026 -0.833386\nC 0.171254 0.129608 -0.366781\nC -1.139305 -0.411149 -0.153312\nC -2.131229 0.413783 0.404978\nC -1.884858 1.743819 0.752641\nC -0.644931 2.305070 0.473622\nC 0.373153 1.521099 -0.110496\nN 1.543497 2.238024 -0.538759\nO 1.750514 3.353296 -0.110506\nO 2.333591 1.685481 -1.409296\nC -1.487353 -1.827544 -0.517160\nO -0.708038 -2.533700 -1.166728\nC -2.827449 -2.391606 -0.091887\nH 2.260289 0.388497 1.376909\nH 3.664566 -0.684247 1.370458\nH 3.683724 0.777107 0.402794\nH 0.829627 -1.550721 -1.327940\nH -3.122948 0.007975 0.574823\nH -2.668454 2.343480 1.197834\nH -0.448350 3.354552 0.666391\nH -2.883419 -3.440655 -0.365380\nH -3.639944 -1.863558 -0.587756\nH -2.972832 -2.294489 0.980732' I then proceed to use
However, it tells me: ValueError: Final molecular charge (0) does not match input (-1); could not find valid bond ordering If i were to change the charge to -1, then i get the following response: ValueError: Final molecular charge (-1) does not match input (-4); could not find valid bond ordering When using another conformer of the same molecule however, with this following xyz string: '26\n\nC -1.935455 -1.367446 1.551091\nC -1.145513 -1.991177 0.421168\nO -1.291752 -3.188290 0.120949\nN -0.146416 -1.284075 -0.257162\nC 0.223788 0.055350 -0.112289\nC 1.608755 0.418802 -0.060006\nC 1.990547 1.758481 0.182665\nC 1.029890 2.742352 0.360649\nC -0.324359 2.412579 0.236727\nC -0.747634 1.099760 -0.019541\nC -2.198708 0.866646 -0.379629\nO -2.522873 -0.075443 -1.096743\nC -3.240425 1.855507 0.114015\nH -3.075026 2.119007 1.156450\nH -3.191555 2.773606 -0.470310\nH -4.234264 1.435943 -0.000793\nH -1.063149 3.203195 0.305317\nH 1.325072 3.765648 0.554457\nH 3.048659 1.995623 0.221612\nN 2.680117 -0.514933 -0.229924\nO 3.813063 -0.192512 0.033971\nO 2.409476 -1.707898 -0.691785\nH 0.538618 -1.888031 -0.737694\nH -1.486071 -0.452125 1.925956\nH -2.002204 -2.084458 2.365521\nH -2.951581 -1.154254 1.228369' The code works with charge = 0. Thank you for taking your time to look into it. |
@Ltwicke Thank you for sharing the update, I can replicate your findings, i.e.
The issues here are
over which your influence varies.
reading: "between atom
3),4) By 2) one could assume "bond order isn't 1, nor 2, so let's try 1.5". However as far as I know, fractional bond orders are not in the scope of a .mol file. The total charge of the molecule (which in case of the moderator script were the optional flag
Your revised data: I reset the line feeds by sed 's/\\n/\n/g' record_01.txt > record_01.xyz
sed 's/\\n/\n/g' record_02.txt > record_02.xyz then submitted them to the conversion moderated Not knowing your work flow, I suggest
|
Hello,
im a physicist and i learned about molecules in terms of quantum mechanical many body systems rather than in the good old chemical fashion that speaks bond orders and stereochemistry, so please be easy on me, if i got something fundamentally wrong.
I have a dataset of xyz file from qm calculations and i need to make mol objects out of them for further processing. Im using this:
as im expecting the molecules just to be neutrally charged, i have no information sadly. Now this procedure fails for some of my conformers:
with
ValueError: Final molecular charge (-2) does not match input (0); could not find valid bond ordering
Strangely enough, it fails for a certain conformer of a molecule, while it is able to process the others before. These are the atomic coordinates, in case someone can see something from them:
My question is: Is there a physical reason, why this algorithm cant find the bonds if the charge is unknown? I suppose it tries to distribute the given charge over the atoms and is unhappy if it cant match the charge with some pairs of coordinates based on some rules. Then i wonder; why is there no option to infer the most likely charge and just use this?
If it helps anything: im using version '2024.03.5' of rdkit.
Hope to hear from you,
Thanks!
Edit: For my total of 59k, it fails only 184 times, however, as stated before, it randomly fails for different conformers for the same molecule...
The text was updated successfully, but these errors were encountered: