Issue on using IDSL_MINT with cuda device #1
Hi @NTuan-Nguyen, can you please help Sara fix this issue? Thanks! Dinesh |
Hello Sara, I think this issue may sometimes occur on systems running PyTorch with the Triton backend, particularly multi-GPU systems. I was able to replicate the issue on Colab using the default Colab PyTorch 2.3 with CUDA 12. A workaround is to revert to an earlier PyTorch build that uses CUDA 11.8. This can be done by adding the following line to the notebook during the installation step:
!pip install torch==2.0.1+cu118 torchvision torchaudio torchinfo --extra-index-url https://download.pytorch.org/whl/cu118
I have tested this in the Colab environment, but please let me know if the issue persists or if you're unable to apply the fix in your workspace. |
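(Editor's note: a quick, generic way to confirm the downgraded build is active after reinstalling; this is a minimal sanity check, not part of IDSL_MINT itself.)
import torch
# Expect something like "2.0.1+cu118" and CUDA reported as available on a GPU runtime
print(torch.__version__)          # installed PyTorch build
print(torch.version.cuda)         # CUDA version the wheel was built against
print(torch.cuda.is_available())  # True if the GPU is visible to PyTorch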
Thanks, this worked in resolving the issue with the Triton library. |
Hi NTuan,
Thank you for resolving the last issue. I was wondering if you could also assist me with the prediction part of IDSL_MINT, for MS2FP. I am providing test samples in .msp format to the prediction method, 250 spectra in one case and 500 in another. However, it only returns predictions for 40-55 samples, depending on the dataset. Could you kindly advise why predictions were not provided for all samples?
I have attached a screenshot of the prediction output in Jupyter. As can be seen, the blocks are read perfectly well, but when model prediction is initiated it only provides outputs for a portion of the samples.
I look forward to hearing from you.
Regards,
Sara
One sample: [screenshot of the prediction output]
Another sample: [screenshot of the prediction output]
|
The YAML file has a section for MSP processing criteria used to filter out MSP blocks that fall outside the model's training space. If an MSP block does not meet these criteria, it is not carried through to the prediction step. You can find a log file in the output folder, which records any issues with MSP block processing. An example of an MSP block for Aspirin is provided on the main GitHub page. The required row entries for an MSP block are Name, PrecursorMZ, and Num Peaks. |
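(Editor's note: for reference, a minimal sketch of how one could scan an .msp file for blocks missing the required Name, PrecursorMZ, and Num Peaks entries. This is illustrative only, not IDSL_MINT's own code; the file path is a placeholder and field matching is simplified and case-sensitive.)
# Illustrative required-field check over blank-line-separated MSP blocks
required = ("Name:", "PrecursorMZ:", "Num Peaks:")

with open("compound22_neg.msp") as f:  # placeholder path
    blocks = [b for b in f.read().split("\n\n") if b.strip()]

for i, block in enumerate(blocks, start=1):
    missing = [field for field in required if field not in block]
    if missing:
        print(f"Block {i} is missing: {', '.join(missing)}")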
Understood. Am I correct in assuming that in cases where the Names are not unique, or where we only have access to a compound ID or InChIKey (such as the CASMI 2022 dataset), the algorithm would not be able to provide predictions?
|
Names do not have to be unique values, but the Name row entry must be present. You should standardize your MSP blocks before feeding them into MINT. |
You mentioned the uniqueness of the name isn’t important. I checked the log and msp file.
This is just one of the warnings in the log file:
WARNING!!! Removed MSP block ID `1` related to `A_M8_negPFP_03`!
We have provided the three fields you mentioned in all samples, and this is the MSP block corresponding to the removed block mentioned above:
Name: A_M8_negPFP_03
PrecursorMZ: 959.4857
accession:
formula: C46H74O18
inchi:
inchikey: ZKCHQVRAXCCTLE-YXHZOQBQSA-N
instrument:
instrument_type:
ion_mode: Negative
mspfilename: compound22_neg.msp
origin:
precursor_type:
smiles: CC1(C2CCC3(C(C2(CCC1OC4C(C(C(CO4)OC5C(C(C(C(O5)CO)O)O)O)O)OC6C(C(C(C(O6)CO)O)O)O)C)CC=C7C3(CCC8(C7CC(CC8)(C)O)C(=O)O)C)C)C
Num Peaks: 20
589.371337890625 1000.0
913.4783935546876 586.0267162402822
71.01252746582031 515.8543538237518
113.02287292480467 468.4553416691734
101.02297973632812 428.1174459461769
457.33154296875 397.6219270407272
85.02803039550781 337.9321735558277
89.02291107177734 274.9899145437956
275.0784912109375 250.24985495915308
161.04464721679688 217.49185721113724
304.286376953125 201.07134509328185
733.4205322265625 180.99497331719678
119.03327178955078 130.82086499261035
73.02812957763672 68.71968997340647
485.32757568359375 68.51101645221802
571.3665161132812 27.78861894280747
377.1240234375 22.856618471034004
199.70016479492188 19.588991572034402
221.0661163330078 15.165348488986465
365.5905151367187 15.03111104927375
As can be seen, the “Name”, “PrecursorMZ” and “Num Peaks” entries are all provided. Based on your method, I also normalized the peaks so their intensities fall within [10, 1000].
Is there anything we missed that would lead to this block being omitted from the prediction process?
|
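(Editor's note: one way to rescale peak intensities into the [10, 1000] range mentioned above is a simple linear min-max mapping. This is only an illustrative sketch under that assumption, not necessarily the exact normalization IDSL_MINT applies.)
def rescale_intensities(intensities, lo=10.0, hi=1000.0):
    # Linear min-max rescaling of a list of peak intensities into [lo, hi]
    mn, mx = min(intensities), max(intensities)
    if mx == mn:
        return [hi for _ in intensities]  # degenerate case: all peaks equal
    return [lo + (x - mn) * (hi - lo) / (mx - mn) for x in intensities]

# Example with three intensities from the block above
print(rescale_intensities([1000.0, 586.0267162402822, 15.03111104927375]))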
WARNING!!! Removed MSP block ID 1 related to A_M8_negPFP_03!
Names don't need to be unique, but in this case MSP block ID 1 should be specifically investigated. The m/z thresholds refer to the mass values, not their intensities. Could you please also share your YAML file? |
Sure, please find it below:
MINT_MS2FP_predictor:
  ## You should try to use identical parameters used in the training step to maximize the performance of the model.
  MSP:
    Directory to MSP files: IDSL_MINT_files/msp_files/
    MSP files: compound22_neg_sorted.msp # A string OR a list of msp files in [brackets]
    Minimum m/z: 100
    Maximum m/z: 900
    Interval m/z: 0.1 # This parameter is also used as a maximum mass deviation parameter
    Minimum number of peaks: 5
    Maximum number of peaks: 512
    Noise removal threshold: 0.01
    Allowed spectral entropy: True
    Number of CPU processing threads: 4
  Model Parameters:
    ## Model parameters must be identical to the parameters used in the training step; otherwise, PyTorch cannot load the weight parameters.
    Number of m/z tokens: 8003 # This parameter is calculated as: 3 + (Maximum m/z - Minimum m/z)/Interval m/z
    Dimension of model: 512 # general dimension of the model
    Embedding norm of m/z tokens: 2
    Dropout probability of embedded m/z: 0.1
    Number of total fingerprint bits: 2051 # This number should also include three special tokens dedicated to this workflow (e.g. 2048 + 3)
    Maximum number of available fingerprint bits: 200
    Number of attention heads: 2
    Number of encoder layers: 3
    Number of decoder layers: 3
    Dropout probability of transformer: 0.1
    Activation function: relu # relu OR gelu
    Model address to load weights: /home/ec2-user/SageMaker/IDSL_MINT_files/ms2fp_cmp_neg/MINT_MS2FP_model.pth
  Prediction Parameters:
    Directory to store predictions: /home/ec2-user/SageMaker/IDSL_MINT_files/ms2fp_neg22_prediction
    Device: cuda # cuda OR cpu. When None, it automatically finds the processing device.
    Beam size: 3
    Number of CPU processing threads: 4
|
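(Editor's note: with the values in this YAML, the formula given in the "Number of m/z tokens" comment works out as follows; a quick sanity check, not IDSL_MINT code.)
min_mz, max_mz, interval_mz = 100, 900, 0.1
num_mz_tokens = 3 + int(round((max_mz - min_mz) / interval_mz))
print(num_mz_tokens)  # 8003, matching the "Number of m/z tokens" entry above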
By this, are you referring to the fact that the minimum and maximum m/z should be determined over the masses of all samples (train/validation and test), so that the YAML file can include those values? Could this be why the algorithm bypasses some samples? I will also look into the names for a more accurate representation.
|
Your precursor mass is out of the mass range specified in the YAML file:
Minimum m/z: 100
Maximum m/z: 900
There is a 10% tolerance in the number of peaks whose fragment masses may fall outside the training space, but the precursor mass must be within this range. Additionally, keep in mind:
Minimum number of peaks: 5, counted after the noise removal threshold of 0.01 is applied.
|
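(Editor's note: a minimal sketch of the block-level filtering logic as described in this thread, using the YAML values above. This is an illustration, not IDSL_MINT's actual implementation; treating the noise threshold as relative to the base peak and the 10% tolerance check are assumptions.)
def block_passes(precursor_mz, peaks, min_mz=100.0, max_mz=900.0,
                 noise_threshold=0.01, min_peaks=5):
    # peaks: list of (mz, intensity) pairs from one MSP block
    # 1) The precursor mass itself must lie inside [min_mz, max_mz].
    if not (min_mz <= precursor_mz <= max_mz):
        return False
    # 2) Drop peaks below the noise removal threshold (relative to the base peak).
    base = max(i for _, i in peaks)
    kept = [(mz, i) for mz, i in peaks if i / base >= noise_threshold]
    # 3) Enough peaks must remain after noise removal.
    if len(kept) < min_peaks:
        return False
    # 4) At most ~10% of the remaining fragment masses may fall outside [min_mz, max_mz].
    outside = sum(1 for mz, _ in kept if not (min_mz <= mz <= max_mz))
    return outside <= 0.1 * len(kept)

# The block shown earlier fails the first check: precursor 959.4857 > 900
print(block_passes(959.4857, [(589.37, 1000.0), (913.48, 586.03)]))  # False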
Yes, each m/z value is represented by a specific embedded token. If that token is not in the training space, the model cannot represent your chemical space. |
Thanks Sadjad for your help and advice.
I believe I have three items on my plate for further investigation. I appreciate the explanation.
Kind regards,
Sara
|
You're welcome! |
I have downloaded and run your code from GitHub in various environments (Colab, GC, AWS); however, when we change the YAML file to use the GPU server, it throws an error pointing to a problem with the Triton library. This is the error:
ValueError: Pointer argument (at 2) cannot be accessed from Triton (cpu tensor?)
Please note that I have made sure we are using a GPU server (already checked and confirmed using the nvidia-smi command) and have modified the related YAML file so that the device is “cuda”. I have also checked the training file to ensure everything is passed to the specified device.
Do you have any insights into why this issue comes up? I have tested the code on the CPU with the YAML file modified to use a CPU, and that works fine. The issue only arises when using a GPU server and specifying the device as cuda.
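(Editor's note: for anyone hitting the same Triton pointer error, a common cause is a CPU tensor reaching a CUDA-compiled kernel. Below is a minimal, generic check that a model and its inputs live on the same device; model and batch are placeholders, not IDSL_MINT objects.)
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Placeholder objects standing in for the real model and input batch
model = torch.nn.Linear(4, 2).to(device)
batch = torch.randn(8, 4).to(device)

# All parameters and inputs should report the same device before a forward pass
print({p.device for p in model.parameters()}, batch.device)
out = model(batch)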