Update examples and README
ben-clancy committed Aug 28, 2024
1 parent aa1c2fa commit c5fd28c
Showing 5 changed files with 43 additions and 40 deletions.
11 changes: 7 additions & 4 deletions README.md
@@ -1,20 +1,23 @@
# MONSTROUS

MOlecular traNSporT inhibitoR and substrate predictOr Utility Server (MONSTROUS) is a computational transporter profiler that predicts the potential of a chemical to interact with transporters recommended for testing in drug development by regulatory agencies. Currently, these transporters are considered to be a major player in determining the safety and efficacy of drugs. MONSTROUS utilizes either graph convolutional neural networks or similarity-based cheminformatics approaches to screen query chemicals against 12 transporters widely expressed in various tissues, including liver, brain, and kidney, and makes predictions as to their potential to be inhibitors as well as substrates.

##### Supporting information for paper:
Title:
Authors:
Journal:
### Intro
This repository contains the data and models used to make MONSTROUS's predictions and has sorted this data into 3 sections:
This repository contains the data and models used to make MONSTROUS's predictions and has sorted this data into 4 sections:
- A GCNN folder containing the data for our GCNN transporters. This data includes csv files containing lists of compounds for each GCNN transporter as well as their GCNN models.
- A Similarity Approach folder containing the data for our non-GCNN transporters. This data includes csv files containing lists of compounds for each transporter.
- A python folder containing the code that will run the MONSTROUS command line tool.
- An examples folder containing an example input as well as output files for each output format the tool supports.

### GCNN
The GCNN folder contains two subfolders: compounds and models. In the compounds folder are CSV files for each transporter, containing a list of compounds that are _____ for the given transporter. These compounds are used in generating the applicability domain for that transporter. In the models folder we hold the models for each GCNN transporter. These models are used to generate the values for GCNN transporters
The GCNN folder contains two subfolders: compounds and models. In the compounds folder are CSV files for each transport protein, containing a list of reference compounds that are known inhibitors or substrates for the given transporter. These compounds are used in generating the applicability domain for that transporter. In the models folder we hold the models for each GCNN transporter. These models are used to generate the values for GCNN transporters

### Similarity Approach
The similarity approach folder contains CSV files for each transporter, containing a list of compounds that are _____ for the given transporter. These compounds are used in generating the applicability domain for that transporter, as well as generating the values for similarity approach transporters
The similarity approach folder contains CSV files for each transport protein, containing a list of reference compounds that are known inhibitors or substrates for the given transporter. These compounds are used in generating the applicability domain for that transporter, as well as generating the values for similarity approach transporters

## MONSTROUS Command Line Tool

@@ -31,7 +34,7 @@ Next, navigate to this repository's folder and enter the following command to in

### Running the MONSTROUS command line tool

Once everything is installed, you can then run the script by running `python python/monstrous_clt.py` followed by any of the following tags (and must include the `-i` , input file, tag):
Once everything is installed, you can then run the script by running `python python/monstrous_clt.py` followed by any of the following tags (and must include the `-i` , input file and tag):
- `-h` or `--help`: Shows a help message explaining these tags.
- `-i [INPUT]` or `--input [INPUT]`: The file location of a .CSV file whose first column is 'Name' and whose second is 'SMILES' and contains the list of SMILES to be submitted.
- `-o [OUTPUT]` or `--output [OUTPUT]`: The output file path
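
Reviewer note: the input format described above can be sketched as follows. This is a minimal illustration, not part of the commit. It writes a CSV whose first column is 'Name' and whose second is 'SMILES', then assembles a command using only the `-i` and `-o` tags listed above. The helper names `write_input_csv` and `build_command`, the file names `query.csv` and `predictions.csv`, and the choice of two compounds are assumptions made for the example; the tool itself is not invoked here.

```python
import csv
import sys
import tempfile
from pathlib import Path


def write_input_csv(path, compounds):
    """Write (name, smiles) pairs under the 'Name' and 'SMILES' header (hypothetical helper)."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["Name", "SMILES"])
        writer.writerows(compounds)


def build_command(input_path, output_path=None):
    """Assemble the argument list for the command line tool (hypothetical helper)."""
    cmd = [sys.executable, "python/monstrous_clt.py", "-i", str(input_path)]
    if output_path is not None:
        cmd += ["-o", str(output_path)]
    return cmd


# Two compounds taken from the example files in this commit.
compounds = [
    ("Metformin", "CN(C)C(=N)N=C(N)N"),
    ("Procainamide", "CCN(CC)CCNC(=O)C1=CC=C(C=C1)N"),
]

workdir = Path(tempfile.mkdtemp())
input_path = workdir / "query.csv"  # assumed file name
write_input_csv(input_path, compounds)

command = build_command(input_path, workdir / "predictions.csv")
print(" ".join(command))
# To run the tool, execute this from the repository root, for example:
#   subprocess.run(command, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems if a path contains spaces.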
8 changes: 4 additions & 4 deletions examples/example_csv_output_inhibitor.csv
@@ -1,16 +1,16 @@
Name,SMILES (original),SMILES (standardized),BCRP_inhibitor,BSEP_inhibitor,MRP1_inhibitor,OATP1B1_inhibitor,OATP1B3_inhibitor,PGP_inhibitor,MATE1_inhibitor,MATE2_inhibitor,OAT1_inhibitor,OAT3_inhibitor,OCT2_inhibitor,MRP2_inhibitor
Verapamil,COC1=C(OC)C=C(CCN(C)CCCC(C#N)(C(C)C)C2=CC(OC)=C(OC)C=C2)C=C1,COc1ccc(CCN(C)CCCC(C#N)(c2ccc(OC)c(OC)c2)C(C)C)cc1OC,True,False,True,False,False,True,False,False,False,False,False,False
Lopinavir,CC1=C(C(=CC=C1)C)OCC(=O)NC(CC2=CC=CC=C2)C(CC(CC3=CC=CC=C3)NC(=O)C(C(C)C)N4CCCNC4=O)O,Cc1cccc(C)c1OCC(=O)NC(Cc1ccccc1)C(O)CC(Cc1ccccc1)NC(=O)C(C(C)C)N1CCCNC1=O,False,False,True,True,True,True,False,False,False,False,False,False
Lopinavir,CC1=C(C(=CC=C1)C)OCC(=O)NC(CC2=CC=CC=C2)C(CC(CC3=CC=CC=C3)NC(=O)C(C(C)C)N4CCCNC4=O)O,Cc1cccc(C)c1OCC(=O)NC(Cc1ccccc1)C(O)CC(Cc1ccccc1)NC(=O)C(C(C)C)N1CCCNC1=O,False,False,True,True,True,True,True,True,False,False,True,False
Probenecid,CCCN(CCC)S(=O)(=O)C1=CC=C(C=C1)C(=O)O,CCCN(CCC)S(=O)(=O)c1ccc(C(=O)O)cc1,False,False,True,False,False,False,False,False,True,True,False,False
Saquinavir,CC(C)(C)NC(=O)C1CC2CCCCC2CN1CC(C(CC3=CC=CC=C3)NC(=O)C(CC(=O)N)NC(=O)C4=NC5=CC=CC=C5C=C4)O,CC(C)(C)NC(=O)C1CC2CCCCC2CN1CC(O)C(Cc1ccccc1)NC(=O)C(CC(N)=O)NC(=O)c1ccc2ccccc2n1,False,False,True,False,False,True,False,False,False,False,False,False
Saquinavir,CC(C)(C)NC(=O)C1CC2CCCCC2CN1CC(C(CC3=CC=CC=C3)NC(=O)C(CC(=O)N)NC(=O)C4=NC5=CC=CC=C5C=C4)O,CC(C)(C)NC(=O)C1CC2CCCCC2CN1CC(O)C(Cc1ccccc1)NC(=O)C(CC(N)=O)NC(=O)c1ccc2ccccc2n1,False,False,True,False,False,True,True,True,False,False,False,False
Bromosulphophthalein,[O-]S(=O)(=O)c1c(O)ccc(c1)C3(OC(=O)c2c(Br)c(Br)c(Br)c(Br)c23)c4ccc(O)c(c4)S([O-])(=O)=O,O=C1OC(c2ccc(O)c(S(=O)(=O)O)c2)(c2ccc(O)c(S(=O)(=O)O)c2)c2c(Br)c(Br)c(Br)c(Br)c21,True,False,True,True,True,False,False,False,False,False,False,False
Methotrexate,CN(CC1=CN=C2C(=N1)C(=NC(=N2)N)N)C3=CC=C(C=C3)C(=O)NC(CCC(=O)O)C(=O)O,CN(Cc1cnc2nc(N)nc(N)c2n1)c1ccc(C(=O)NC(CCC(=O)O)C(=O)O)cc1,False,False,True,False,False,False,False,False,False,False,False,False
Furosemide,C1=COC(=C1)CNC2=CC(=C(C=C2C(=O)O)S(=O)(=O)N)Cl,NS(=O)(=O)c1cc(C(=O)O)c(NCc2ccco2)cc1Cl,False,False,True,False,False,False,False,False,False,False,False,False
Metformin,CN(C)C(=N)N=C(N)N,CN(C)C(=N)N=C(N)N,False,False,True,False,False,False,False,False,False,False,False,False
Cimetidine,CC1=C(N=CN1)CSCCNC(=NC)NC#N,CN=C(NC#N)NCCSCc1nc[nH]c1C,False,False,True,False,False,False,True,False,False,False,True,False
Procainamide,CCN(CC)CCNC(=O)C1=CC=C(C=C1)N,CCN(CC)CCNC(=O)c1ccc(N)cc1,False,False,False,False,False,False,False,False,False,False,False,False
Procainamide,CCN(CC)CCNC(=O)C1=CC=C(C=C1)N,CCN(CC)CCNC(=O)c1ccc(N)cc1,False,False,False,False,False,False,False,False,True,True,False,False
Oestrone,CC12CCC3C(C1CCC2=O)CCC4=C3C=CC(=C4)O,CC12CCC3c4ccc(O)cc4CCC3C1CCC2=O,False,False,True,False,False,False,True,True,False,False,True,False
Pravastatin,CCC(C)C(=O)OC1CC(C=C2C1C(C(C=C2)C)CCC(CC(CC(=O)O)O)O)O,CCC(C)C(=O)OC1CC(O)C=C2C=CC(C)C(CCC(O)CC(O)CC(=O)O)C21,False,False,True,True,True,False,False,False,False,True,False,False
Pravastatin,CCC(C)C(=O)OC1CC(C=C2C1C(C(C=C2)C)CCC(CC(CC(=O)O)O)O)O,CCC(C)C(=O)OC1CC(O)C=C2C=CC(C)C(CCC(O)CC(O)CC(=O)O)C21,False,False,True,True,True,False,False,False,False,True,False,True
Delaviridine,CC(C)NC1=C(N=CC=C1)N1CCN(CC1)C(=O)C1=CC2=C(N1)C=CC(NS(C)(=O)=O)=C2,CC(C)Nc1cccnc1N1CCN(C(=O)c2cc3cc(NS(C)(=O)=O)ccc3[nH]2)CC1,True,False,True,False,False,True,False,False,False,False,False,False
Loperamide,CN(C)C(=O)C(CCN1CCC(O)(CC1)C1=CC=C(Cl)C=C1)(C1=CC=CC=C1)C1=CC=CC=C1,CN(C)C(=O)C(CCN1CCC(O)(c2ccc(Cl)cc2)CC1)(c1ccccc1)c1ccccc1,False,False,True,False,False,True,False,False,False,False,False,False
Rifampicin,CC1C=CC=C(C(=O)NC2=C(C(=C3C(=C2O)C(=C(C4=C3C(=O)C(O4)(OC=CC(C(C(C(C(C(C1O)C)O)C)OC(=O)C)C)OC)C)C)O)O)C=NN5CCN(CC5)C)C,COC1C=COC2(C)Oc3c(C)c(O)c4c(O)c(c(C=NN5CCN(C)CC5)c(O)c4c3C2=O)NC(=O)C(C)=CC=CC(C)C(O)C(C)C(O)C(C)C(OC(C)=O)C1C,False,False,True,False,True,False,False,False,False,False,False,True
16 changes: 8 additions & 8 deletions examples/example_csv_output_substrate.csv
@@ -1,16 +1,16 @@
Name,SMILES (original),SMILES (standardized),BCRP_substrate,MRP1_substrate,PGP_substrate,BSEP_substrate,MATE1_substrate,MATE2_substrate,OAT1_substrate,OAT3_substrate,OATP1B1_substrate,OATP1B3_substrate,OCT2_substrate,MRP2_substrate
Verapamil,COC1=C(OC)C=C(CCN(C)CCCC(C#N)(C(C)C)C2=CC(OC)=C(OC)C=C2)C=C1,COc1ccc(CCN(C)CCCC(C#N)(c2ccc(OC)c(OC)c2)C(C)C)cc1OC,False,True,True,False,False,False,False,False,False,False,False,False
Lopinavir,CC1=C(C(=CC=C1)C)OCC(=O)NC(CC2=CC=CC=C2)C(CC(CC3=CC=CC=C3)NC(=O)C(C(C)C)N4CCCNC4=O)O,Cc1cccc(C)c1OCC(=O)NC(Cc1ccccc1)C(O)CC(Cc1ccccc1)NC(=O)C(C(C)C)N1CCCNC1=O,False,False,True,False,False,False,False,False,False,False,False,True
Probenecid,CCCN(CCC)S(=O)(=O)C1=CC=C(C=C1)C(=O)O,CCCN(CCC)S(=O)(=O)c1ccc(C(=O)O)cc1,True,False,False,False,False,False,False,False,False,False,False,True
Lopinavir,CC1=C(C(=CC=C1)C)OCC(=O)NC(CC2=CC=CC=C2)C(CC(CC3=CC=CC=C3)NC(=O)C(C(C)C)N4CCCNC4=O)O,Cc1cccc(C)c1OCC(=O)NC(Cc1ccccc1)C(O)CC(Cc1ccccc1)NC(=O)C(C(C)C)N1CCCNC1=O,False,False,True,True,False,False,False,False,False,False,False,True
Probenecid,CCCN(CCC)S(=O)(=O)C1=CC=C(C=C1)C(=O)O,CCCN(CCC)S(=O)(=O)c1ccc(C(=O)O)cc1,True,False,False,False,True,True,False,True,False,False,False,True
Saquinavir,CC(C)(C)NC(=O)C1CC2CCCCC2CN1CC(C(CC3=CC=CC=C3)NC(=O)C(CC(=O)N)NC(=O)C4=NC5=CC=CC=C5C=C4)O,CC(C)(C)NC(=O)C1CC2CCCCC2CN1CC(O)C(Cc1ccccc1)NC(=O)C(CC(N)=O)NC(=O)c1ccc2ccccc2n1,False,False,True,True,False,False,False,False,False,True,False,True
Bromosulphophthalein,[O-]S(=O)(=O)c1c(O)ccc(c1)C3(OC(=O)c2c(Br)c(Br)c(Br)c(Br)c23)c4ccc(O)c(c4)S([O-])(=O)=O,O=C1OC(c2ccc(O)c(S(=O)(=O)O)c2)(c2ccc(O)c(S(=O)(=O)O)c2)c2c(Br)c(Br)c(Br)c(Br)c21,True,True,False,False,False,False,False,False,True,True,False,True
Methotrexate,CN(CC1=CN=C2C(=N1)C(=NC(=N2)N)N)C3=CC=C(C=C3)C(=O)NC(CCC(=O)O)C(=O)O,CN(Cc1cnc2nc(N)nc(N)c2n1)c1ccc(C(=O)NC(CCC(=O)O)C(=O)O)cc1,True,False,False,False,False,False,False,True,False,False,False,True
Furosemide,C1=COC(=C1)CNC2=CC(=C(C=C2C(=O)O)S(=O)(=O)N)Cl,NS(=O)(=O)c1cc(C(=O)O)c(NCc2ccco2)cc1Cl,True,False,False,False,False,False,False,False,False,False,False,False
Metformin,CN(C)C(=N)N=C(N)N,CN(C)C(=N)N=C(N)N,True,True,False,False,False,False,False,False,False,False,False,False
Furosemide,C1=COC(=C1)CNC2=CC(=C(C=C2C(=O)O)S(=O)(=O)N)Cl,NS(=O)(=O)c1cc(C(=O)O)c(NCc2ccco2)cc1Cl,True,False,False,False,False,False,False,True,False,False,False,False
Metformin,CN(C)C(=N)N=C(N)N,CN(C)C(=N)N=C(N)N,True,True,False,False,True,True,False,False,False,False,True,False
Cimetidine,CC1=C(N=CN1)CSCCNC(=NC)NC#N,CN=C(NC#N)NCCSCc1nc[nH]c1C,True,False,False,False,True,True,True,True,False,False,True,False
Procainamide,CCN(CC)CCNC(=O)C1=CC=C(C=C1)N,CCN(CC)CCNC(=O)c1ccc(N)cc1,False,False,True,False,True,True,False,False,False,False,False,False
Oestrone,CC12CCC3C(C1CCC2=O)CCC4=C3C=CC(=C4)O,CC12CCC3c4ccc(O)cc4CCC3C1CCC2=O,True,False,False,False,True,True,False,True,True,False,False,True
Pravastatin,CCC(C)C(=O)OC1CC(C=C2C1C(C(C=C2)C)CCC(CC(CC(=O)O)O)O)O,CCC(C)C(=O)OC1CC(O)C=C2C=CC(C)C(CCC(O)CC(O)CC(=O)O)C21,True,True,False,True,False,False,False,True,False,True,False,True
Procainamide,CCN(CC)CCNC(=O)C1=CC=C(C=C1)N,CCN(CC)CCNC(=O)c1ccc(N)cc1,False,False,True,False,True,True,True,True,False,False,False,True
Oestrone,CC12CCC3C(C1CCC2=O)CCC4=C3C=CC(=C4)O,CC12CCC3c4ccc(O)cc4CCC3C1CCC2=O,True,False,False,True,True,True,False,True,True,False,False,True
Pravastatin,CCC(C)C(=O)OC1CC(C=C2C1C(C(C=C2)C)CCC(CC(CC(=O)O)O)O)O,CCC(C)C(=O)OC1CC(O)C=C2C=CC(C)C(CCC(O)CC(O)CC(=O)O)C21,True,True,False,True,False,False,False,True,True,True,False,True
Delaviridine,CC(C)NC1=C(N=CC=C1)N1CCN(CC1)C(=O)C1=CC2=C(N1)C=CC(NS(C)(=O)=O)=C2,CC(C)Nc1cccnc1N1CCN(C(=O)c2cc3cc(NS(C)(=O)=O)ccc3[nH]2)CC1,True,False,True,False,False,False,False,False,False,False,False,False
Loperamide,CN(C)C(=O)C(CCN1CCC(O)(CC1)C1=CC=C(Cl)C=C1)(C1=CC=CC=C1)C1=CC=CC=C1,CN(C)C(=O)C(CCN1CCC(O)(c2ccc(Cl)cc2)CC1)(c1ccccc1)c1ccccc1,False,False,True,False,False,False,False,False,False,False,False,False
Loperamide,CN(C)C(=O)C(CCN1CCC(O)(CC1)C1=CC=C(Cl)C=C1)(C1=CC=CC=C1)C1=CC=CC=C1,CN(C)C(=O)C(CCN1CCC(O)(c2ccc(Cl)cc2)CC1)(c1ccccc1)c1ccccc1,False,False,True,False,False,False,False,True,True,False,False,False
Rifampicin,CC1C=CC=C(C(=O)NC2=C(C(=C3C(=C2O)C(=C(C4=C3C(=O)C(O4)(OC=CC(C(C(C(C(C(C1O)C)O)C)OC(=O)C)C)OC)C)C)O)O)C=NN5CCN(CC5)C)C,COC1C=COC2(C)Oc3c(C)c(O)c4c(O)c(c(C=NN5CCN(C)CC5)c(O)c4c3C2=O)NC(=O)C(C)=CC=CC(C)C(O)C(C)C(O)C(C)C(OC(C)=O)C1C,True,True,True,False,False,False,False,False,True,True,False,True
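
Reviewer note: a short sketch of how one might read these output files, assuming the layout shown in the hunk above (three descriptive columns followed by one True/False column per transporter). The function name `predicted_transporters` is hypothetical. The excerpt embeds the header, the unchanged Verapamil row, and the Metformin row that appears to be the updated one; since the +/- markers are not visible in this view, that last point is an inference from row order.

```python
import csv
import io

# Excerpt copied from examples/example_csv_output_substrate.csv as shown in this diff.
EXCERPT = """Name,SMILES (original),SMILES (standardized),BCRP_substrate,MRP1_substrate,PGP_substrate,BSEP_substrate,MATE1_substrate,MATE2_substrate,OAT1_substrate,OAT3_substrate,OATP1B1_substrate,OATP1B3_substrate,OCT2_substrate,MRP2_substrate
Verapamil,COC1=C(OC)C=C(CCN(C)CCCC(C#N)(C(C)C)C2=CC(OC)=C(OC)C=C2)C=C1,COc1ccc(CCN(C)CCCC(C#N)(c2ccc(OC)c(OC)c2)C(C)C)cc1OC,False,True,True,False,False,False,False,False,False,False,False,False
Metformin,CN(C)C(=N)N=C(N)N,CN(C)C(=N)N=C(N)N,True,True,False,False,True,True,False,False,False,False,True,False
"""


def predicted_transporters(csv_text, suffix="_substrate"):
    """Map each compound name to the transporters whose column reads 'True' (hypothetical helper)."""
    hits = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        hits[row["Name"]] = [
            column[: -len(suffix)]  # strip the suffix to leave the transporter name
            for column, value in row.items()
            if column.endswith(suffix) and value == "True"
        ]
    return hits


for name, transporters in predicted_transporters(EXCERPT).items():
    print(name, transporters)
```

Passing `suffix="_inhibitor"` would apply the same parsing to the inhibitor output file, whose columns follow the same naming pattern.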