feat: Integration of torch models in main #34

jpitoskas · 2024-06-14T10:34:02Z

Incorporate Torch Model training into jaqpotpy

The jaqpotpy_torch module supports the following types of modelling tasks:

Binary Classification of SMILES using Graph NNs
Multiclass Classification of SMILES using Graph NNs
Regression of SMILES using Graph NNs
Binary Classification of SMILES combined with external features using Graph NNs along with a Fully Connected NN
Multiclass Classification of SMILES combined with external features using Graph NNs along with a Fully Connected NN
Regression of SMILES combined with external features using Graph NNs along with a Fully Connected NN
Binary Classification of Tabular Data using a Fully Connected NN
Multiclass Classification of Tabular Data using a Fully Connected NN
Regression of Tabular Data using a Fully Connected NN

Notes for future implementation

Featurizers

Featurizers that inherit from our base featurizer class Featurizer must implement the featurize method.
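
As a rough illustration of this contract, here is a minimal sketch. The `Featurizer` class below is a stand-in written for this example, not the actual jaqpotpy base class, and `CharCountFeaturizer` is a hypothetical toy subclass. The `__call__` delegation mirrors the behaviour one of the commits in this PR describes.

```python
from abc import ABC, abstractmethod


class Featurizer(ABC):
    """Stand-in for the base featurizer class (assumed shape, not jaqpotpy code)."""

    @abstractmethod
    def featurize(self, sample, **kwargs):
        """Turn one raw sample into model-ready features."""

    def __call__(self, sample, **kwargs):
        # Calling the object simply delegates to featurize().
        return self.featurize(sample, **kwargs)


class CharCountFeaturizer(Featurizer):
    """Hypothetical toy featurizer: counts selected characters in a string."""

    def __init__(self, symbols=("C", "N", "O")):
        self.symbols = tuple(symbols)

    def featurize(self, sample, **kwargs):
        return [sample.count(symbol) for symbol in self.symbols]


featurizer = CharCountFeaturizer()
features = featurizer("CCO")  # SMILES string for ethanol
```

Because `featurize` is declared abstract, instantiating the base class directly raises a `TypeError`, which catches subclasses that forget to implement it.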

Datasets

Datasets must inherit from Torch Dataset and implement/override the __len__ and __getitem__ methods.
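
A minimal sketch of a dataset that satisfies this requirement is shown below. `ToyTabularDataset` is a hypothetical class made up for the example and is not the `TabularDataset` from this PR; it only assumes that PyTorch is installed.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ToyTabularDataset(Dataset):
    """Hypothetical dataset holding a feature matrix and one target per row."""

    def __init__(self, features, targets):
        if len(features) != len(targets):
            raise ValueError("features and targets must have the same length")
        self.features = torch.as_tensor(features, dtype=torch.float32)
        self.targets = torch.as_tensor(targets, dtype=torch.float32)

    def __len__(self):
        # Number of samples in the dataset.
        return len(self.features)

    def __getitem__(self, idx):
        # One (features, target) pair.
        return self.features[idx], self.targets[idx]


dataset = ToyTabularDataset(
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    [0.0, 1.0, 0.0],
)
loader = DataLoader(dataset, batch_size=2, shuffle=False)
first_x, first_y = next(iter(loader))
```

With these two methods in place, the standard `DataLoader` can batch the samples without any further code.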

Trainers

Trainers that inherit from our base class TorchModelTrainer must implement get_model_type, train, evaluate, prepare_for_deployment.
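
The sketch below illustrates the four required methods. `TrainerSketch` is a stand-in written for this example and is not the real `TorchModelTrainer`. All method signatures, the `"toy-mean-baseline"` label, and the returned dictionary are assumptions, since the notes above only name the methods.

```python
from abc import ABC, abstractmethod


class TrainerSketch(ABC):
    """Stand-in base class; the signatures here are assumed, not jaqpotpy's."""

    @abstractmethod
    def get_model_type(self):
        """Return an identifier for the kind of model being trained."""

    @abstractmethod
    def train(self, train_data, val_data=None):
        """Fit the model on train_data."""

    @abstractmethod
    def evaluate(self, data):
        """Return a metric computed on data."""

    @abstractmethod
    def prepare_for_deployment(self, *args, **kwargs):
        """Return a JSON-serializable description of the trained model."""


class MeanBaselineTrainer(TrainerSketch):
    """Hypothetical toy trainer: predicts the mean of the training targets."""

    def __init__(self):
        self.mean = None

    def get_model_type(self):
        return "toy-mean-baseline"  # made-up label, not a jaqpotpy model type

    def train(self, train_data, val_data=None):
        targets = [y for _, y in train_data]
        self.mean = sum(targets) / len(targets)

    def evaluate(self, data):
        # Mean squared error of the constant prediction.
        return sum((y - self.mean) ** 2 for _, y in data) / len(data)

    def prepare_for_deployment(self, *args, **kwargs):
        return {"model_type": self.get_model_type(), "mean": self.mean}


trainer = MeanBaselineTrainer()
data = [([0.0], 1.0), ([1.0], 3.0)]
trainer.train(data)
mse = trainer.evaluate(data)
payload = trainer.prepare_for_deployment()
```

Declaring the methods abstract means a subclass that omits any of them cannot be instantiated.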

Models

Models must inherit from nn.Module and implement __init__ and forward.
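
A minimal sketch of such a model follows. `TinyFullyConnected` is a hypothetical class for illustration and is not the `FullyConnectedNetwork` added in this PR; the layer sizes are arbitrary.

```python
import torch
from torch import nn


class TinyFullyConnected(nn.Module):
    """Hypothetical two-layer network used only to illustrate the contract."""

    def __init__(self, input_dim, hidden_dim, output_dim):
        # nn.Module must be initialised before any submodule is assigned.
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, output_dim),
        )

    def forward(self, x):
        # x has shape (batch_size, input_dim).
        return self.layers(x)


model = TinyFullyConnected(input_dim=4, hidden_dim=8, output_dim=1)
output = model(torch.randn(5, 4))
```

Omitting `super().__init__()` makes the first submodule assignment fail, which is the kind of issue one of the fix commits in this PR addresses.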

This commit adds the following implementations to the models_torch subpackage:
- Added __init__.py for the subpackage
- Implemented GraphAttentionNetwork in graph_attention_network.py
- Implemented GraphConvolutionalNetwork in graph_convolutional_network.py
- Implemented GraphSAGENetwork in graph_sage_network.py
- Implemented GraphTransformerNetwork in graph_transformer_network.py

These implementations provide configurable neural network architectures for training graph-based models using PyTorch.

Create a new directory named jaqpotpy_torch/ to organize all torch-related code. We'll decide later whether to keep this code here or move it to an entirely new standalone torch-specific package.

This commit initializes the featurizers_torch subpackage, adding the following implementations:
- Added __init__.py for the subpackage
- Implemented SmilesGraphFeaturizer in smiles_graph_featurizer.py

The SmilesGraphFeaturizer class is designed to create custom graph featurizations from SMILES strings. It offers highly configurable options, allowing users to choose from a wide range of both atom and bond characteristics to be included.

This commit initializes the datasets_torch subpackage, adding the following implementations:
- Added __init__.py for the subpackage
- Implemented SmilesGraphDataset in smiles_graph_dataset.py

The SmilesGraphDataset class is designed to create a custom torch Dataset for graph-featurized SMILES. Its __getitem__ method is overridden to return a torch_geometric Data object encapsulating the following information:
- Node attributes (x)
- Edge indices (edge_index)
- Edge attributes (edge_attr)
- Target labels (y)
- The original SMILES representation (smiles)

SmilesGraphDataset enables straightforward integration into torch-based ML pipelines, facilitating the development of graph-based predictive models.

This commit removes the _torch suffix from directory names within the jaqpotpy_torch module:
- Renamed datasets_torch directory to datasets
- Renamed featurizers_torch directory to featurizers
- Renamed models_torch directory to models

This commit initializes the trainers subpackage, adding the following implementations:
- Added __init__.py for the subpackage
- Implemented an initial version of TorchModelTrainer abstract class in torch_model_trainer.py

This commit extends the trainers subpackage, adding the following implementations:
- Added BinaryGraphModelTrainer, RegressionGraphModelTrainer in __init__.py
- Extended TorchModelTrainer class with additional attributes

This commit adds the following implementations to the trainers subpackage:
- BinaryGraphModelTrainer subclass
- RegressionGraphModelTrainer subclass

This commit adds the SmilesGraphDatasetWithExternal class implementation to the datasets subpackage. This class inherits from SmilesGraphDataset and adds the ability to provide an external feature vector along with the SMILES representation.

This commit adds the implementation of the Featurizer abstract class. It also defines the abstract method featurize().

This commit adds the FullyConnectedNetwork class implementation to the models.

This commit adds the following implementations to the models subpackage:
- GraphAttentionNetworkWithExternal
- GraphConvolutionalNetworkWithExternal
- GraphSAGENetworkWithExternal
- GraphTransformerNetworkWithExternal

In these models, the corresponding graph neural network produces global-level representations from SMILES. These representations are then concatenated with the external feature vectors, and the concatenated vector is passed through a fully connected network to produce the final output.
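
The concatenation step described above can be sketched as follows. This is an illustration under stated assumptions, not the code from this PR: `EmbeddingWithExternal` is a hypothetical class, and a random tensor stands in for the graph-level embedding that a graph neural network would normally produce.

```python
import torch
from torch import nn


class EmbeddingWithExternal(nn.Module):
    """Hypothetical head combining a graph embedding with external features."""

    def __init__(self, graph_embedding_dim, num_external_features,
                 hidden_dim, output_dim):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(graph_embedding_dim + num_external_features, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, output_dim),
        )

    def forward(self, graph_embedding, external):
        # Concatenate along the feature axis: (batch, emb) + (batch, ext).
        combined = torch.cat([graph_embedding, external], dim=1)
        return self.fc(combined)


batch_size = 3
graph_embedding = torch.randn(batch_size, 16)  # stand-in for GNN output
external = torch.randn(batch_size, 4)          # external feature vectors
head = EmbeddingWithExternal(16, 4, hidden_dim=8, output_dim=1)
prediction = head(graph_embedding, external)
```

The first linear layer must accept the sum of both feature dimensions, since the two inputs are joined before the fully connected part.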

- Fixed a circular import error of the FullyConnectedNetwork class
- Added super().__init__() to all the models supporting external features

This commit adds the deploy_model implementation for both RegressionGraphModelTrainer and BinaryGraphModelTrainer. deploy_model() is an abstract method of the TorchModelTrainer base class and must be implemented in every class that inherits from TorchModelTrainer to support model deployment on Jaqpot.

In this commit we:
- Implement the deployment logic for models and trainers that use external features
- Move the deploy_model function to the TorchModelTrainer class
- Define the abstract method prepare_for_deployment(), which takes a dynamic set of arguments per trainer subclass and transforms the data into the appropriate JSON
- Add 'SMILES' to a protected namespace so that external features cannot use this name
- Fix bugs regarding model input arguments

This commit provides:
- Torch models that are ready for deployment
- Code that is up to date with the current structure of the API

This commit implements:
- TabularDataset class inheriting from torch.utils.data.Dataset
- BinaryFCModelTrainer & RegressionFCModelTrainer
- The required changes in the BinaryModelTrainer and RegressionModelTrainer abstract classes to also support data from torch.utils.data.DataLoader

This commit:
- Adds example code blocks in SmilesGraphFeaturizer
- Sets '.' instead of '_' as the separator when showing atom/bond characteristics vector labels
- Adds __call__ to the abstract class Featurizer, which calls the abstract featurize() method

This commit:
- Adds/refactors docs for models
- Modifies author section

alarv · 2024-09-11T12:58:31Z

Closing this as it has been integrated into the main branch as part of JAQPOT-254 and is partly done on #47

jpitoskas and others added 30 commits May 15, 2024 15:30

feat: Move torch-related code in jaqpotpy_torch/

69d0c8f

refactor: Rename subdirectories in jaqpotpy_torch

35ed377

feat: Add trainers package for torch models

05a8791

feat: Extended trainers package for torch models

6fd3be3

feat: Implement Binary & Regression Graph Trainers

6aa65ba

refactor: Add abstract class Featurizer

9bbd19e

feat: Implement Fully Connected Network model

b26555c

fix: Circular import error and add super()

59895d8

fix: Change "mu" to "mean" for consistency

611430c

fix: Fix log_filepath var name & rm unused libs

686106e

Merge remote-tracking branch 'origin/main' into feat/JAQPOT-62/torch-…

6b683aa

…graph-training

Merge branch 'main' into feat/JAQPOT-62/torch-graph-training

c3227d2

Merge branch 'feat/JAQPOT-62/torch-graph-training' of https://github.…

9d5a50b

…com/ntua-unit-of-control-and-informatics/jaqpotpy into feat/JAQPOT-62/torch-graph-training

refactor: Change dir structure of trainers

dedceae

feat: Fully functional torch model upload

e863a12

feat: Method to get all installed packages in env

26fb527

feat: Add bond features as edge_attr

927d84d

feat: Make categorical values as str

c61006c

feat: Implement Multiclass Trainer

7eee23f

fix: Model type of multiclass fc model trainer

5dbe355

fix: multiclass_fc_model_trainer.py rename typo

0e68b16

fix: Add zero_division in both precision & recall

f026247

jpitoskas added 14 commits June 28, 2024 12:28

refactor: Add documentation to models

2aabc6d

refactor: Fix in docs of models

f344713

refactor: Add documentation to trainers

4ee8c11

feat: Add scheduler to trainer

9fd1b39

refactor: Add docstring to smiles graph featurizer

16b9046

fix: Forgot scheduler to regression_model_trainer

78b6d3e

fix: Edge dim to transformer graph network

9828db8

refactor: Add docstrings to Datasets

d7c30f1

refactor: Refactor docs for featurizer and dataset

a1c4922

refactor: Add docs for models etc

1f89d87

fix: Move model to device in Trainer constructor

b7a4462

refactor: More fixes in docs

56a6c11

Merge branch 'feat/JAQPOT-62/torch-graph-training' of https://github.…

1fe1096

…com/ntua-unit-of-control-and-informatics/jaqpotpy into feat/JAQPOT-62/torch-graph-training

jpitoskas requested review from periklis91 and vassilismin June 28, 2024 09:58

jpitoskas added 8 commits June 28, 2024 13:13

refactor: Change all 'Argument:' to 'Args' in docs

3bb3fcb

refactor: Fix some docs and add model_type names

d5ce368

refactor: Change 'NUMERICAL' to 'FLOAT'

0f3cebf

chore: Add port 8002, to match jaqpotpy-inference

38d4226

fix: __len__() in SmilesGraphDatasetWithExternal

a67ec47

fix: Add model back to cpu before torch.jit.script

ee19ef3

chore: Change format of creator in schemas

951d0ff

fix: Forgot to change a 'NUMERICAL' to 'FLOAT'

dd6a050

jpitoskas changed the title ~~feat: Getting up to date with main~~ feat: Integration of torch models in main Jun 30, 2024

jpitoskas and others added 2 commits June 30, 2024 04:11

Merge branch 'main' into feat/JAQPOT-62/torch-graph-training

0db9db0

chore: Temporarily add docs_jaqpotpy_torch

658536d

alarv closed this Sep 11, 2024

alarv deleted the feat/JAQPOT-62/torch-graph-training branch September 11, 2024 12:58
