Empty dataframe #15
Comments
Hi @TheKipiDragon , I'm assuming that in your script Don't hesitate if there's another problem |
Sorry, I didn't specify the path. |
Ok, then the problem probably comes from using Anyway, I would suggest you modify your script like this:

import pathlib
import prolif as plf
import MDAnalysis as mda

# path to your rootdir, containing all the folders with PDB and SDF files
rootdir = pathlib.Path("/home/cedric/projects/ProLIF/prolif/data")

for subdir in rootdir.glob("*"):
    # skip if subdir is not a directory
    if not subdir.is_dir():
        continue
    try:
        # search for pdb and sdf files in subdir
        protfile = next(subdir.glob("*.pdb"))
        ligfile = next(subdir.glob("*.sdf"))
    except StopIteration:
        # skip if subdir is missing either pdb or sdf file
        continue
    print(subdir.name, ligfile.name, protfile.name)
    # load protein
    prot = mda.Universe(protfile)
    prot = plf.Molecule.from_mda(prot)
    # load ligands
    lig_suppl = list(plf.sdf_supplier(str(ligfile)))
    # generate fingerprint
    fp = plf.Fingerprint()
    fp.run_from_iterable(lig_suppl, prot)
    df = fp.to_dataframe()
    print(df)

Tell me if that works.

PS: if you want to share some code, put it between triple backticks like so:
|
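The folder-pairing step of the script above (one PDB and one SDF file per subfolder) can be checked on its own before ProLIF gets involved. This is a minimal sketch using only the standard library; `find_pairs` is a hypothetical helper name, not part of ProLIF, and it simply takes the first file of each kind in alphabetical order.

```python
import pathlib


def find_pairs(rootdir):
    """Return (subfolder name, pdb path, sdf path) for each subfolder holding both files."""
    pairs = []
    for subdir in sorted(pathlib.Path(rootdir).iterdir()):
        # skip plain files sitting next to the subfolders
        if not subdir.is_dir():
            continue
        pdb_files = sorted(subdir.glob("*.pdb"))
        sdf_files = sorted(subdir.glob("*.sdf"))
        # skip subfolders that are missing either file type
        if not pdb_files or not sdf_files:
            continue
        pairs.append((subdir.name, pdb_files[0], sdf_files[0]))
    return pairs
```

Printing the result of `find_pairs(rootdir)` shows which folders would be processed. An empty list would mean the problem is with the paths rather than with the fingerprint.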
Alright, thank you! rootdir = pathlib.Path("/home/milax/Escriptori/test") The terminal shows me the following:
Thank you, again, for all the help provided! |
Ok, can you try to run this as well:

import MDAnalysis as mda
import prolif as plf

protfile = "path/to/Mpro-x10387_protein.pdb"
ligfile = "/path/to/Mpro-x10387_ligand.sdf"
prot = mda.Universe(protfile)
prot = plf.Molecule.from_mda(prot)
lig_suppl = plf.sdf_supplier(ligfile)
lig = next(lig_suppl)
pocket_residues = plf.get_residues_near_ligand(lig, prot, cutoff=6.0)
print(pocket_residues)

It should print a list of residues that are close to your ligand. |
Alright, that works, but what I want to obtain are the descriptors for the ligand-protein interaction. Thank you for your assistance! |
Okay, that's reassuring! Do both the PDB and SDF files contain explicit hydrogens for all atoms? |
Yeah, they both contain the hydrogens. Thanks for the help! |
Mmh that's strange then... Have you tried running the quickstart tutorial? Just to make sure that there's no problem with your installation. If you get empty dataframes even with the tutorial, can you tell me which Python version you are using, and the output of the following commands:

import prolif, MDAnalysis, rdkit
print("prolif", prolif.__version__,
      "MDAnalysis", MDAnalysis.__version__,
      "rdkit", rdkit.__version__)

If the tutorial worked, can you try running the code snippet from 2 messages ago (the one with the pocket_residues), then run the following:

fp = plf.Fingerprint()
for resid in pocket_residues:
    bv = fp.bitvector(lig, prot[resid])
    print(resid, bv)

If it prints some |
I don't know if this prolif version is supposed to show up like that, since I followed the tutorial to install it. If I installed it incorrectly, please do tell me.
As for the second one, it didn't print any Trues at all. Thanks for all the help! |
And if you run the quickstart tutorial (with the topology and trajectory available in plf.datafiles) do you get empty dataframes or not? If yes, can you also check if |
I was trying it, but when doing
I get the following error
I copy pasted the code from the quickstart. Thanks for the assistance! |
It's actually the line just below that is failing:

df = fp.to_dataframe()
print(df)

If it's empty, also do |
Alright! |
Ok so the installation is working correctly, which is reassuring. I have a few more ideas as to where the problem could come from. Could you run the following code snippet please:

import MDAnalysis as mda
from rdkit import Chem
from rdkit.Chem import Draw
import prolif as plf

print(plf.__file__)
protfile = "path/to/Mpro-x10387_protein.pdb"
ligfile = "/path/to/Mpro-x10387_ligand.sdf"

# draw protein residues
prot = mda.Universe(protfile)
prot = plf.Molecule.from_mda(prot)
frags = []
for res in prot:
    mol = Chem.RemoveHs(res)
    mol.RemoveAllConformers()
    frags.append(mol)
img = Draw.MolsToGridImage(frags, legends=[str(res.resid) for res in prot], subImgSize=(200, 140), molsPerRow=5, maxMols=prot.n_residues)
img.save("prot.png")

# draw ligand
lig_suppl = plf.sdf_supplier(ligfile)
lig = next(lig_suppl)
lig = Chem.RemoveHs(lig)
lig.RemoveAllConformers()
img = Draw.MolToImage(lig, size=(500, 500))
img.save("lig.png")

The first print should show where prolif is installed, and include which version of prolif is installed (that's the only way to obtain it for now, until I find a way to fix it). Thanks, |
Sorry for taking so long to answer!
As for the images, I can't view them as I'm running the script from a Linux terminal. Also, it gave me a lot of lines like the following:
And this error:
Thanks for the help! |
No worries take your time! pip list | grep prolif in a bash shell and it should give you the full version string (something like For the error with the images code, can you add the following import at the beginning of the script: And I think the warning might be a clue as to why the fingerprint is empty but we'll see |
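The same version lookup can also be done from inside Python. This is a minimal sketch assuming Python 3.8 or later, where `importlib.metadata` is in the standard library; `installed_version` is a hypothetical helper name, and the distribution names passed to it are assumptions that may differ from the import names depending on how the packages were installed.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if no metadata is found."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# distribution names are assumptions; adjust them to match your environment
for name in ("prolif", "MDAnalysis", "rdkit"):
    print(name, installed_version(name))
```

A `None` here only means that no package metadata was found under that name, which is not necessarily the same as the package being missing.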
Sorry for the wait again!
As for the image itself, I've gotten this error
Thanks for the help! I'll try not to take as long this time, sorry for any inconvenience again! |
Okay so you're on the latest version. For the error, try adding |
I tried it, but it still gives me the same error, 'Image' object has no attribute 'save' |
Mmh ok, let's try it slightly differently then:

import MDAnalysis as mda
from rdkit import Chem
from rdkit.Chem import Draw
import prolif as plf

protfile = "path/to/Mpro-x10387_protein.pdb"
ligfile = "/path/to/Mpro-x10387_ligand.sdf"

# draw protein residues
prot = mda.Universe(protfile)
prot = plf.Molecule.from_mda(prot)
for i in range(3):
    res = prot[i]
    mol = Chem.RemoveHs(res)
    mol.RemoveAllConformers()
    Draw.MolToImageFile(mol, f"{res.resid}.png", size=(500, 400))

# draw ligand
lig_suppl = plf.sdf_supplier(ligfile)
lig = next(lig_suppl)
lig = Chem.RemoveHs(lig)
lig.RemoveAllConformers()
Draw.MolToImageFile(lig, "lig.png", size=(500, 400))

This should save PNG images for the first 3 residues in your protein and 1 file for your ligand. Could you have a look at them and make sure that there are no unexpected formal charges (typically (-) charges on carbons)? Thanks! |
Alright! It's working now, although it only prints one image file for the residues, not three (I've checked the protein, it's supposed to have 306 residues). The ligand structure is printed properly. |
Okay I'm even more confused now 😆

import MDAnalysis as mda
import prolif as plf

protfile = "path/to/Mpro-x10387_protein.pdb"
prot = mda.Universe(protfile)
prot = plf.Molecule.from_mda(prot)
for i in range(10):
    res = prot[i]
    print(i, res.resid)

Actually, is it possible for you to send me the protein file (my email is on my profile)? Or is it sensitive data? It will make it easier for me to debug as I'm starting to run out of ideas |
This is the output of the code
I don't think that the PDB is sensitive data, but I'll check with the director about it nonetheless. |
Hi @TheKipiDragon, just checking back on this issue: is your director ok with sharing the PDB file? You could also just share a part of it; something like the first 5 residues should be enough for me to pinpoint the source of the problem. Just make sure the "reduced" file also reproduces the same behavior as the previous code snippet, i.e.:

import MDAnalysis as mda
import prolif as plf

protfile = "path/to/first_5_residues.pdb"
prot = mda.Universe(protfile)
prot = plf.Molecule.from_mda(prot)
for i in range(5):
    res = prot[i]
    print(i, res.resid)

should output
|
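One way to produce such a reduced file is to filter the PDB as plain text. This is a minimal sketch assuming a standard fixed-column PDB file; `first_residues` is a hypothetical helper name, not part of ProLIF or MDAnalysis, and only ATOM/HETATM records are kept.

```python
def first_residues(pdb_lines, n_residues=5):
    """Keep the ATOM/HETATM lines belonging to the first `n_residues` residues."""
    kept = []
    seen = []
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        # chain ID (column 22), residue number (columns 23-26), insertion code (column 27)
        key = (line[21:22], line[22:26], line[26:27])
        if key not in seen:
            if len(seen) == n_residues:
                # a new residue starts after the quota is reached: stop here
                break
            seen.append(key)
        kept.append(line)
    return kept
```

Usage would look like `open("first_5_residues.pdb", "w").writelines(first_residues(open(protfile)))`. Since this drops CONECT and every other record type, it is worth rerunning the snippet above on the reduced file to confirm it still shows the same behavior.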
Hi, I have recently been trying to use prolif to get protein-protein interactions from a single PDB file. I also faced a very similar problem: the dataframe is empty. All the test files run properly, with results as expected. I am going to share the 10-line script. If you want, I can also give you the PDB file. I am not sure if the topology file is necessary for the analysis. `import MDAnalysis as mda u1 = mda.Universe("trial_part.pdb") fp = plf.Fingerprint() |
Hi @soumyosen, I don't see any problem in your script, could you send me the pdb file ( Thanks in advance! |
Thank you @cbouy |
I managed to reproduce the issue, thanks! Can you try this and tell me if it works ?:
|
@cbouy Thanks it is working now. Just one more question. By default, I am supposed to get only the "True" results. Right? |
Yes, by default |
OK, Thanks a lot for your time. |
Hello @cbouy . Thanks for this. |
### Added
- Improved the documentation on how to properly restrict interactions to ignore the protein backbone (Issue #22), how to fix the empty dataframe issue when no bond information is present in the PDB file (Issue #15), how to save the LigNetwork diagram (Issue #21), and some clarifications on using `fp.generate`

### Fixed
- Mixing residue type with interaction type in the interactive legend of the LigNetwork would incorrectly display/hide some residues on the canvas
- MOL2 files starting with a comment (`#`) would lead to an error
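The changelog entry above ties the empty dataframe to a PDB file with no bond information. In the PDB format, explicit bonds are stored in CONECT records, so a quick text-level check can show whether a file carries any. This is a minimal sketch assuming that missing CONECT records is what "no bond information" refers to here; `pdb_summary` is a hypothetical helper name, and the hydrogen count relies on a simple heuristic when the element column is blank.

```python
def pdb_summary(pdb_lines):
    """Count atoms, hydrogens and CONECT records in PDB-formatted lines."""
    n_atoms = n_hydrogens = n_conect = 0
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            n_atoms += 1
            # element symbol is in columns 77-78; if blank, guess from the atom
            # name (columns 13-16) after dropping leading digits (heuristic)
            element = line[76:78].strip()
            if not element:
                element = line[12:16].strip().lstrip("0123456789")[:1]
            if element.upper() == "H":
                n_hydrogens += 1
        elif line.startswith("CONECT"):
            n_conect += 1
    return {"atoms": n_atoms, "hydrogens": n_hydrogens, "conect": n_conect}
```

Calling `pdb_summary(open(protfile))` on the protein file gives a first hint: zero hydrogens or zero CONECT records would both be worth mentioning when reporting an empty dataframe.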
Greetings.
I made a program following the instructions in the "How to" page, to obtain the descriptors of the interaction between ligand and protein, but all I get as a result are empty dataframes.
I'll attach the code that I'm using.
Thank you for your attention.
Alberto Blanco, University Rovira i Virgili, Tarragona, Spain.