[Official Python implementation]
This repository contains code for automating the modelling efforts of the PINK project.
Project Leader: Haralambos Sarimveis (hsarimv@central.ntua.gr)
Contributors: Giannis Pitoskas (jpitoskas@gmail.com)
Project_dir/
├── data/
├── experiments/
├── models/
└── src/
The src/ directory contains the source code for training Graph Neural Networks (GNNs) on SMILES representations of molecules. For details on the source code and its usage, see the internal README file inside the src/ directory.
The models/ directory contains configurable class implementations of different types of graph neural networks (GNNs).
These implementations provide a framework for constructing and training GNNs, so users can experiment with different architectures and hyperparameters.
A data/ directory is expected in the project's root directory. It stores the datasets for the different molecular properties (endpoints).
Each property has its own subdirectory, and dataset files follow a consistent naming convention:
- Subdirectory Format: Dataset subdirectories should follow the format data/{property}/
- Naming Convention: Dataset files should follow the format {property}_dataset.csv
An example is given below:
data/
├── propertyA/
│ └── propertyA_dataset.csv
├── propertyB/
│ └── propertyB_dataset.csv
│
└── ...
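The naming convention above can be sketched in a few lines of Python. This is only an illustration, not code from the repository: the helper names dataset_path and list_available_properties are hypothetical, and the default root of data/ assumes the script is run from the project's root directory.

```python
from pathlib import Path


def dataset_path(property_name: str, root: Path = Path("data")) -> Path:
    """Build the expected CSV path, e.g. data/propertyA/propertyA_dataset.csv."""
    # Hypothetical helper: mirrors the data/{property}/{property}_dataset.csv convention.
    return root / property_name / f"{property_name}_dataset.csv"


def list_available_properties(root: Path = Path("data")) -> list[str]:
    """Return the properties whose subdirectory holds a correctly named dataset file."""
    if not root.is_dir():
        return []
    return sorted(
        sub.name
        for sub in root.iterdir()
        if sub.is_dir() and dataset_path(sub.name, root).is_file()
    )


if __name__ == "__main__":
    # Print the expected location for one property and list what is present on disk.
    print(dataset_path("propertyA"))
    print(list_available_properties())
```

Subdirectories that exist but lack a file named after the property are skipped, which makes misnamed datasets easy to spot.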
The experiments/ directory is where training logs and metadata are stored. For details on its contents, see the internal README file inside the src/ directory.
The project requires the following Python version:
- Python: Version 3.10.9 or higher
The project requires the following Python packages:
- NumPy: Version 1.24.1
- Pandas: Version 1.5.3
- Torch: Version 2.0.0+cu117
- Torch Geometric: Version 2.4.0
- Tqdm: Version 4.64.1
- Scikit-learn: Version 1.4.1
- RDKit: Version 2022.9.5
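One way to compare an environment against the versions listed above is a small check based on the standard library's importlib.metadata. This is a sketch under assumptions rather than part of the project: the function name check_environment is hypothetical, and the distribution names used as dictionary keys (for example torch-geometric and rdkit) are assumptions about how each package is registered in your environment and may differ for conda installs.

```python
import sys
from importlib import metadata

# Pinned versions copied from the list above. The keys are ASSUMED distribution
# names; adjust them if your installer registers a package under another name.
REQUIRED = {
    "numpy": "1.24.1",
    "pandas": "1.5.3",
    "torch": "2.0.0+cu117",
    "torch-geometric": "2.4.0",
    "tqdm": "4.64.1",
    "scikit-learn": "1.4.1",
    "rdkit": "2022.9.5",
}
MIN_PYTHON = (3, 10, 9)


def check_environment(required=REQUIRED):
    """Return a list of human-readable problems; an empty list means everything matches."""
    problems = []
    if sys.version_info[:3] < MIN_PYTHON:
        found_py = ".".join(str(part) for part in sys.version_info[:3])
        problems.append(f"python: found {found_py}, expected 3.10.9 or higher")
    for name, wanted in required.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed (expected {wanted})")
            continue
        # Exact string comparison, because the list above gives exact versions.
        if found != wanted:
            problems.append(f"{name}: found {found}, expected {wanted}")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

The comparison is deliberately strict. A CPU-only Torch build, for instance, would be reported as a mismatch against 2.0.0+cu117 even if it works for your purposes.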