How To: Adding Machine Learning Models
This framework currently supports the integration of neural network models saved in Keras format. Such models are loaded at runtime using the tensorflow-cpu package, and can be run together with the analysis to evaluate scores on inputs. This section describes how to integrate a Keras model into bucoffea and run it on the intended analysis input.
Note: Work is ongoing to support PyTorch models as well, under the branch 2022-06-12_pytorch.
The saved neural network models are stored in the models directory. During analysis execution (e.g. in the VBF H(inv) processor), a model at a specified path can be loaded at runtime, and the score predictions from this model can be obtained and saved as a histogram. In the VBF H(inv) processor, this is implemented here.
In order to specify a certain model, two things need to be done:
- The model directory (output of Keras' save_model) should be placed under the bucoffea/models directory.
- This model path must then be specified in the vbfhinv.yaml configuration file, here.
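Since a wrong path in the configuration only shows up when the model is loaded, it can help to verify the directory before running the processor. The snippet below is a minimal sketch of such a check. The helper name resolve_model_dir and the directory name my_convnet are hypothetical and are not part of bucoffea; the framework's own bucoffea_path function is what the processor actually uses.

```python
from pathlib import Path


def resolve_model_dir(repo_root, relative_path):
    """Return the absolute model directory, or raise if it is missing.

    Hypothetical helper for illustration only; not part of bucoffea.
    """
    model_dir = (Path(repo_root) / relative_path).resolve()
    if not model_dir.is_dir():
        raise FileNotFoundError(f"Model directory not found: {model_dir}")
    return model_dir
```

For example, resolve_model_dir("/path/to/bucoffea", "models/my_convnet") returns the resolved directory if it exists, and raises FileNotFoundError otherwise.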
Below is a working example of a Convolutional Neural Network (CNN) model used in the VBF H(inv) processor. The path of the model is read from the configuration, and the model is loaded within the VBF H(inv) processor as follows:
# Load the model, notice how the directory path is read dynamically
# from the configuration file
model_dir = bucoffea_path(cfg.NN_MODELS.CONVNET.PATH)
model = load_model(model_dir)
# Normalize the images + put them in a stack with the proper shape
jetimages_norm = prepare_data_for_cnn(jet_images)
# Obtain predictions from the model
df['cnn_score'] = model.predict(jetimages_norm)
The score distribution is then saved as a histogram, which is done here.
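To illustrate what saving the score distribution amounts to, the following is a minimal, framework-independent sketch of filling a fixed-width histogram with scores. It is not the histogram code used in the processor. The function name fill_score_histogram, the binning and the [0, 1] score range are assumptions made for this example.

```python
def fill_score_histogram(scores, n_bins=10, low=0.0, high=1.0):
    """Count scores in n_bins equal-width bins on [low, high].

    Values outside the range are ignored; a value equal to `high`
    is counted in the last bin.
    """
    counts = [0] * n_bins
    width = (high - low) / n_bins
    for score in scores:
        if score < low or score > high:
            continue
        # Clamp the index so that score == high lands in the last bin
        index = min(int((score - low) / width), n_bins - 1)
        counts[index] += 1
    return counts
```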
Note: Work on this feature is ongoing, and it is not yet merged to the master_UL branch.
PyTorch DNN models can be specified to the bucoffea framework in a similar fashion to CNN ones. The vbfhinv.yaml configuration file can be updated as follows:
deepnet:
  path: models/pytorch/2022-06-12_dense
  features:
    - mjj
    - detajj
    - dphijj
  arch_parameters:
    n_features: 3
    n_classes: 2
    n_nodes: [18,10,5,2]
    dropout: 0.2
Namely, three sets of parameters need to be specified:
- Path to the PyTorch model: This is a directory under models/pytorch, which must contain a model_state_dict.pt file. Due to PyTorch-related limitations, the model_state_dict.pt file only contains a mapping of the weights and biases of the trained model, not the whole model object. A model is instantiated at runtime using arch_parameters and the model class definition within bucoffea (see below), and this set of weights and biases is then loaded into the model instance.
- List of features: This is the list of features that the model was trained on. These features are selected from the analysis input, pre-processed (scaled to zero mean and unit variance), and fed to the deep neural network model.
- Model architecture: These are the parameters specifying the architecture of the model (annoying, but necessary to instantiate the PyTorch model in bucoffea). Please make sure that the arch parameters are the same as those of the model that is being used.
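The scaling to zero mean and unit variance mentioned in the list of features can be sketched as follows. This is a minimal illustration rather than the pre-processing code in bucoffea: the function name standardize_columns is hypothetical, and the mean and standard deviation are computed from the sample passed in, which may differ from how the framework obtains them.

```python
import statistics


def standardize_columns(columns):
    """Scale each feature column to zero mean and unit variance.

    `columns` maps a feature name to a list of values. The statistics
    are computed from the given values (an assumption of this sketch).
    """
    scaled = {}
    for name, values in columns.items():
        mean = statistics.fmean(values)
        std = statistics.pstdev(values)  # population standard deviation
        if std == 0.0:
            raise ValueError(f"Feature {name!r} has zero variance")
        scaled[name] = [(v - mean) / std for v in values]
    return scaled
```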
Additionally, for the instantiation of the PyTorch model to work, the DNN class definition is included in the helpers/pytorch.py file, here. This class is a simplified version of the original model, with functions to build the model architecture and perform a feed-forward pass. Before evaluation, please check that the model architecture defined in this class is in line with the evaluated model.
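Part of this consistency check can be automated. The sketch below compares the configured features and arch parameters against the shapes of the linear-layer weights. It relies on the fact that PyTorch linear layers store their weight with shape (out_features, in_features). The function name check_arch_against_shapes is hypothetical, and the sketch assumes the weight shapes have already been collected in layer order, for example by taking tuple(tensor.shape) of the two-dimensional entries of the state dictionary.

```python
def check_arch_against_shapes(features, arch_parameters, weight_shapes):
    """Return a list of mismatch messages (empty if everything agrees).

    weight_shapes: (out_dim, in_dim) tuples of the linear layers,
    ordered from the first layer to the last (assumed ordering).
    """
    problems = []
    n_features = arch_parameters["n_features"]
    n_classes = arch_parameters["n_classes"]

    # The number of configured features must match n_features
    if n_features != len(features):
        problems.append(
            f"n_features={n_features} but {len(features)} features are listed"
        )
    # The first layer must accept n_features inputs
    first_in = weight_shapes[0][1]
    if first_in != n_features:
        problems.append(
            f"first layer expects {first_in} inputs, n_features={n_features}"
        )
    # The last layer must produce n_classes outputs
    last_out = weight_shapes[-1][0]
    if last_out != n_classes:
        problems.append(
            f"last layer gives {last_out} outputs, n_classes={n_classes}"
        )
    return problems
```

An empty list means the three checks passed. It does not prove that the hidden layers or the dropout setting agree with the trained model.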
Given the model class definition and the parameters listed above, the workflow to get predictions looks like this:
# Create an instance of the PyTorch model with the arch parameters specified in .yaml file
dnn_model = FullyConnectedNN(
    **dict(cfg.NN_MODELS.DEEPNET.ARCH_PARAMETERS)
)
# Load the state dictionary (set of weights + biases) of a previously trained model
state_dict = load_pytorch_state_dict(cfg.NN_MODELS.DEEPNET.PATH)
# Set the weights and biases for the current model instance
dnn_model.load_state_dict(state_dict)
# Get the predictions from this model
df['dnn_score'] = dnn_model.predict(dnn_features.to_numpy())