These scripts can be used to reproduce the study described in the article "Inverse modelling pyrolization kinetics with ensemble learning methods" [1]. Unless noted otherwise, the scripts are intended to be used with their default configuration. To apply this model to your own data, you can adapt the scripts in `./test_model/`.
The scripts `generate_1c.py`, `generate_2c.py` and `generate_3c.py` generate the training data for the model, with one script each for materials with 1, 2 and 3 components. The reaction kinetic parameters and component fractions are sampled randomly, and the mass loss rates for TGA experiments at four different constant heating rates are then calculated. The scripts are intended to be run on multiple CPU cores. Example data sets are available for download [2].
The following table lists the parameters that can easily be set by the user. Further parameters, such as heating rates and sampling rates, can also be modified in the scripts but require more caution.
Parameter | Description | Default value |
---|---|---|
i | number of elements that will be generated | 6400000 |
cores | number of CPU cores used for generation | 128 |
rrlimlow | lower boundary of peak reaction rate sampling (in /s) | 0.001 |
rrlimup | upper boundary of peak reaction rate sampling (in /s) | 0.01 |
rtlimlow | lower boundary of peak reaction temperature sampling (in °C) | 100 |
rtlimup | upper boundary of peak reaction temperature sampling (in °C) | 500 |
Tstart | Start temperature of experiment (in °C) | 20 |
Tend | End temperature of experiment (in °C) | 550 |
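
As a rough illustration of how the sampling limits above could be used, the following sketch draws one random parameter set for a material with `n` components. The README only states that parameters and fractions are sampled randomly; the uniform distribution, the normalisation of the fractions and the helper name `sample_material` are assumptions made for this example, so the generation scripts remain the reference.

```python
import random


def sample_material(n, rrlimlow=0.001, rrlimup=0.01,
                    rtlimlow=100.0, rtlimup=500.0, rng=random):
    """Draw peak reaction rates (1/s), peak temperatures (°C) and fractions.

    Assumption: uniform sampling within the limits; the real scripts may
    use a different distribution.
    """
    rates = [rng.uniform(rrlimlow, rrlimup) for _ in range(n)]
    temps = [rng.uniform(rtlimlow, rtlimup) for _ in range(n)]
    # Assumption: fractions are random weights normalised to sum to 1.
    weights = [rng.random() + 1e-12 for _ in range(n)]
    total = sum(weights)
    fractions = [w / total for w in weights]
    return rates, temps, fractions


# Example: one reproducible sample for a material with 3 components.
rates, temps, fractions = sample_material(3, rng=random.Random(0))
```

Passing a seeded `random.Random` instance keeps the sample reproducible, which is convenient when comparing runs.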
By default, the output is the mass loss rate for four TGA experiments with the following configuration:
Heating rate | Heating rate value | Time step | Temperature step |
---|---|---|---|
$\beta_1$ | 5 K/min | 24 s | 2 K |
$\beta_2$ | 10 K/min | 12 s | 2 K |
$\beta_3$ | 30 K/min | 4 s | 2 K |
$\beta_4$ | 40 K/min | 3 s | 2 K |
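
The time steps in the table follow from the heating rate and the fixed temperature step of 2 K. The short sketch below reproduces them and the number of samples per experiment for the default temperature range; the function names are chosen for this example only.

```python
T_START = 20.0   # °C, default Tstart
T_END = 550.0    # °C, default Tend
DELTA_T = 2.0    # K, temperature step between two samples


def time_step_seconds(heating_rate_k_per_min, delta_t=DELTA_T):
    """Time needed to heat by delta_t at a constant heating rate."""
    return delta_t * 60.0 / heating_rate_k_per_min


def samples_per_experiment(t_start=T_START, t_end=T_END, delta_t=DELTA_T):
    """Number of temperature samples, including both end points."""
    return int(round((t_end - t_start) / delta_t)) + 1


for beta in (5, 10, 30, 40):
    print(beta, "K/min:", time_step_seconds(beta), "s per sample")
print("samples per experiment:", samples_per_experiment())
```

With the defaults this gives 266 samples per experiment, so four experiments yield the 1064 values per row described in the data structure tables.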
The scripts write numbered files named `features*` and `labels*`, one pair for each CPU core used. In addition, a single `labels*` file holds all generated labels. The naming convention is given in the table below; the data structure of the files is described under Data structure below.
Filename | Description |
---|---|
labels{$N_u$/1000}k_{$\beta_1$}{$\beta_2$}{$\beta_3$}{$\beta_4$}{$n$}_{$\Delta T$/s}.csv | generated labels |
features{$N_u$/1000}k_{$\beta_1$}{$\beta_2$}{$\beta_3$}{$\beta_4$}{$n$}{$\Delta T$/s}{$i_{core}$}.csv | generated features, output per core |
labels{$N_u$/1000}k_{$\beta_1$}{$\beta_2$}{$\beta_3$}{$\beta_4$}{$n$}{$\Delta T$/s}{$i_{core}$}.csv | generated labels, output per core |
These scripts generate the individual sub models as described in the following table.
Filename | Output | Description |
---|---|---|
sm1.py | sm1.pickle | Sub model 1 (classifier for estimating the number of components) |
sm3_1c.py | sm3_1c.pickle | Sub model 3 for materials with 1 component (regressor for estimating the reaction kinetic parameters) |
sm3_2c.py | sm3_2c.pickle | Sub model 3 for materials with 2 components (regressor for estimating the reaction kinetic parameters) |
sm3_3c.py | sm3_3c.pickle | Sub model 3 for materials with 3 components (regressor for estimating the reaction kinetic parameters) |
The default parameters are those used in the corresponding publication. Training with them requires a contemporary high performance system (128 CPU cores, 1 TB RAM). The hyperparameter values in the following table can be taken as lower limits: they produce models with slightly less accurate predictions at a much lower computational cost. Pre-built models are available for download [3].
Parameter | Sub model 1 (sm1.py) | Sub model 3, 1 component (sm3_1c.py) | Sub model 3, 2 components (sm3_2c.py) | Sub model 3, 3 components (sm3_3c.py) |
---|---|---|---|---|
Number of estimators | 500 | 1 | 50 | 50 |
Maximum tree depth | 100 | 50 | 50 | 50 |
Scripts to test the complete model
`calculate.py`, in its default configuration, calculates predictions for 150000 elements that were not used during training and writes the predicted reaction kinetic parameters to `prediction.csv`.
`evaluate.py` then compares the true (prescribed) and predicted reaction kinetic parameters and initial fractions by calculating R^2 scores. It also calculates mass loss rates from the predictions and compares them to the mass loss rates originally used as input features. For this comparison, the normalised RMSE is calculated and its distribution is plotted.
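
To make the two metrics concrete, here is a minimal, dependency-free sketch of an R^2 score and a normalised RMSE. It is not taken from `evaluate.py`: in particular, normalising the RMSE by the range of the reference values is an assumption, since other conventions (mean, maximum) are also in use.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination; undefined if y_true is constant."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def nrmse(reference, predicted):
    """RMSE divided by the range of the reference values (assumed convention)."""
    n = len(reference)
    rmse = math.sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)
    return rmse / (max(reference) - min(reference))


# Example: compare a prescribed and a predicted parameter series.
true_values = [1.0, 2.0, 3.0]
predicted_values = [1.1, 1.9, 3.2]
score = r2_score(true_values, predicted_values)
error = nrmse(true_values, predicted_values)
```

An R^2 close to 1 and a normalised RMSE close to 0 indicate good agreement.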
The mass loss rates calculated from the reaction kinetic parameters of a given row can be found in the same row of the corresponding `features*` file. The files do not have a header.
Column description:
Column | Symbol | Description | Unit |
---|---|---|---|
1 | | Peak reaction rate for component 1 | s^-1 |
2 | | Peak reaction rate for component 2 | s^-1 |
3 | | Peak reaction rate for component 3 | s^-1 |
4 | | Peak reaction temperature for component 1 | °C |
5 | | Peak reaction temperature for component 2 | °C |
6 | | Peak reaction temperature for component 3 | °C |
7 | | Fraction of component 1 | 1 |
8 | | Fraction of component 2 | 1 |
9 | | Fraction of component 3 | 1 |
10 | | Pre-exponential factor for component 1 | s^-1 |
11 | | Pre-exponential factor for component 2 | s^-1 |
12 | | Pre-exponential factor for component 3 | s^-1 |
13 | | Activation energy for component 1 | J/mol |
14 | | Activation energy for component 2 | J/mol |
15 | | Activation energy for component 3 | J/mol |
16 | | Fraction of component 1 | 1 |
17 | | Fraction of component 2 | 1 |
18 | | Fraction of component 3 | 1 |
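
The labels files list both peak quantities (columns 1 to 6) and Arrhenius parameters (columns 10 to 15), but this README does not state how the two are related. A commonly used approximation for a single first-order reaction under a constant heating rate $\beta$ assumes that roughly 1/e of the component remains at the peak, which gives $E = e\, r_p R T_p^2 / \beta$ and $A = e\, r_p \exp(E / (R T_p))$. The sketch below applies that approximation. The function name `peak_to_arrhenius`, the choice of heating rate and the use of this particular relation are assumptions; check the generation scripts before relying on it.

```python
import math

R = 8.314462618  # J/(mol K), universal gas constant


def peak_to_arrhenius(r_peak, t_peak_c, heating_rate_k_per_min):
    """Approximate A (1/s) and E (J/mol) from peak reaction rate and temperature.

    Assumption: single first-order reaction, constant heating rate, and
    about 1/e of the component left at the peak.
    """
    t_peak = t_peak_c + 273.15               # °C -> K
    beta = heating_rate_k_per_min / 60.0     # K/min -> K/s
    e_act = math.e * r_peak * R * t_peak ** 2 / beta
    a_pre = math.e * r_peak * math.exp(e_act / (R * t_peak))
    return a_pre, e_act


# Example: peak rate 0.005 1/s at 300 °C, evaluated for 5 K/min.
a_pre, e_act = peak_to_arrhenius(0.005, 300.0, 5.0)
```

The heating rate is an explicit argument because the result depends on it and the reference heating rate is not given here.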
The files do not have a header. Each row holds the mass loss rate records of four TGA experiments with identical reaction kinetic parameters and four different heating rates. A row can be split into the four experiments as shown in the following table. The mass loss rates are calculated from the reaction kinetic parameters in the same row of the corresponding `labels*` file.
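Column | Description | Unit |
---|---|---|
1...266 | Mass loss rate at heating rate $\beta_1$ | s^-1 |
267...532 | Mass loss rate at heating rate $\beta_2$ | s^-1 |
533...798 | Mass loss rate at heating rate $\beta_3$ | s^-1 |
799...1064 | Mass loss rate at heating rate $\beta_4$ | s^-1 |
With the default configuration, the 266 values of each experiment correspond to temperatures from 20 °C to 550 °C in steps of 2 K.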
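
Splitting one row of a features file into its four experiments can be sketched as follows. The block size of 266 comes from the table above; mapping the blocks to 5, 10, 30 and 40 K/min in that order is an assumption based on the order of the heating rates in this README, and `split_feature_row` is a name chosen for this example.

```python
N_PER_EXPERIMENT = 266
# Assumption: blocks are stored in ascending order of heating rate.
HEATING_RATES_K_PER_MIN = (5, 10, 30, 40)


def split_feature_row(row):
    """Return {heating rate: list of 266 mass loss rates} for one row."""
    expected = N_PER_EXPERIMENT * len(HEATING_RATES_K_PER_MIN)
    if len(row) != expected:
        raise ValueError(f"expected {expected} values, got {len(row)}")
    return {
        beta: list(row[k * N_PER_EXPERIMENT:(k + 1) * N_PER_EXPERIMENT])
        for k, beta in enumerate(HEATING_RATES_K_PER_MIN)
    }


# Example with placeholder data instead of a real features row.
experiments = split_feature_row([0.0] * 1064)
```

Note that the table uses 1-based column numbers, while the Python slices are 0-based.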
The file holds a single Python object of class `sklearn.ensemble.ExtraTreesClassifier` [4]. It is pre-trained to estimate the number of components, each with a single reaction, represented by the input mass loss rate data. It can be loaded into Python via pickle [5].
`sm1.predict(X)` expects 1064 features as input with the following properties:
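Feature | Description | Unit |
---|---|---|
1...266 | Mass loss rate at heating rate $\beta_1$ | s^-1 |
267...532 | Mass loss rate at heating rate $\beta_2$ | s^-1 |
533...798 | Mass loss rate at heating rate $\beta_3$ | s^-1 |
799...1064 | Mass loss rate at heating rate $\beta_4$ | s^-1 |
With the default configuration, the 266 values of each experiment correspond to temperatures from 20 °C to 550 °C in steps of 2 K.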
The output is a single integer (1, 2 or 3): the number of components in the material represented by the TGA mass loss rate profile.
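
Loading the classifier and querying it for one sample could look like the sketch below. The helper names `load_model` and `predict_n_components` are made up for this example; scikit-learn must be installed for unpickling to work, ideally in the version used to create the file, and only pickle files from a trusted source should be loaded.

```python
import pickle

N_FEATURES = 1064  # 4 experiments x 266 mass loss rate values


def load_model(path):
    """Unpickle a pre-trained sub model, e.g. sm1.pickle."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def predict_n_components(model, mass_loss_rates):
    """Return the predicted number of components for one sample."""
    if len(mass_loss_rates) != N_FEATURES:
        raise ValueError(f"expected {N_FEATURES} features, got {len(mass_loss_rates)}")
    # predict() expects a 2D input: one row per sample.
    return int(model.predict([list(mass_loss_rates)])[0])
```

A typical call would be `predict_n_components(load_model("sm1.pickle"), row)`, where `row` holds the 1064 mass loss rate values of one material.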
The file holds a single Python object of class `sklearn.ensemble.ExtraTreesRegressor` [6]. It is pre-trained to estimate the reaction kinetic parameters of materials consisting of one component, represented by the input mass loss rate data. It can be loaded into Python via pickle [5].
`sm3_1r.predict(X)` expects 1067 features as input with the following properties:
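Feature | Description | Unit |
---|---|---|
1...266 | Mass loss rate at heating rate $\beta_1$ | s^-1 |
267...532 | Mass loss rate at heating rate $\beta_2$ | s^-1 |
533...798 | Mass loss rate at heating rate $\beta_3$ | s^-1 |
799...1064 | Mass loss rate at heating rate $\beta_4$ | s^-1 |
1065 | Initial fraction of component 1 | 1 |
1066 | Initial fraction of component 2 | 1 |
1067 | Initial fraction of component 3 | 1 |
With the default configuration, the 266 values of each experiment correspond to temperatures from 20 °C to 550 °C in steps of 2 K.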
The output consists of the estimated reaction kinetic parameters.
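
The input for a sub model 3 regressor is the 1064 mass loss rate values followed by the three initial fractions. A minimal sketch for assembling such a row is shown below. The helper name `build_sm3_input` is hypothetical, and filling the fractions of unused components with zero is an assumption that should be checked against the training scripts.

```python
N_RATES = 1064
N_FRACTIONS = 3


def build_sm3_input(mass_loss_rates, fractions):
    """Concatenate mass loss rates and initial fractions into 1067 features."""
    if len(mass_loss_rates) != N_RATES:
        raise ValueError(f"expected {N_RATES} mass loss rates, got {len(mass_loss_rates)}")
    if not 1 <= len(fractions) <= N_FRACTIONS:
        raise ValueError("expected between 1 and 3 fractions")
    # Assumption: components that are not present get a fraction of 0.
    padded = list(fractions) + [0.0] * (N_FRACTIONS - len(fractions))
    return list(mass_loss_rates) + padded


# Example: a two-component material with placeholder mass loss rates.
features = build_sm3_input([0.0] * 1064, [0.4, 0.6])
```

The resulting list can be passed to `predict` as a single-row 2D input, for example `model.predict([features])`.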
The file holds a single Python object of class `sklearn.ensemble.ExtraTreesRegressor` [6]. It is pre-trained to estimate the reaction kinetic parameters of materials consisting of two components, represented by the input mass loss rate data. It can be loaded into Python via pickle [5].
`sm3_2r.predict(X)` expects 1067 features as input with the following properties:
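Feature | Description | Unit |
---|---|---|
1...266 | Mass loss rate at heating rate $\beta_1$ | s^-1 |
267...532 | Mass loss rate at heating rate $\beta_2$ | s^-1 |
533...798 | Mass loss rate at heating rate $\beta_3$ | s^-1 |
799...1064 | Mass loss rate at heating rate $\beta_4$ | s^-1 |
1065 | Initial fraction of component 1 | 1 |
1066 | Initial fraction of component 2 | 1 |
1067 | Initial fraction of component 3 | 1 |
With the default configuration, the 266 values of each experiment correspond to temperatures from 20 °C to 550 °C in steps of 2 K.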
The output consists of the estimated reaction kinetic parameters.
The file holds a single Python object of class `sklearn.ensemble.ExtraTreesRegressor` [6]. It is pre-trained to estimate the reaction kinetic parameters of materials consisting of three components, represented by the input mass loss rate data. It can be loaded into Python via pickle [5].
`sm3_3r.predict(X)` expects 1067 features as input with the following properties:
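Feature | Description | Unit |
---|---|---|
1...266 | Mass loss rate at heating rate $\beta_1$ | s^-1 |
267...532 | Mass loss rate at heating rate $\beta_2$ | s^-1 |
533...798 | Mass loss rate at heating rate $\beta_3$ | s^-1 |
799...1064 | Mass loss rate at heating rate $\beta_4$ | s^-1 |
1065 | Initial fraction of component 1 | 1 |
1066 | Initial fraction of component 2 | 1 |
1067 | Initial fraction of component 3 | 1 |
With the default configuration, the 266 values of each experiment correspond to temperatures from 20 °C to 550 °C in steps of 2 K.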
The output consists of the estimated reaction kinetic parameters.
This is the column description of the output of `calculate.py`. The file has no header.
Column | Symbol | Description | Unit |
---|---|---|---|
1 | | Pre-exponential factor for component 1 | s^-1 |
2 | | Pre-exponential factor for component 2 | s^-1 |
3 | | Pre-exponential factor for component 3 | s^-1 |
4 | | Activation energy for component 1 | J/mol |
5 | | Activation energy for component 2 | J/mol |
6 | | Activation energy for component 3 | J/mol |
7 | | Fraction of component 1 | 1 |
8 | | Fraction of component 2 | 1 |
9 | | Fraction of component 3 | 1 |
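
Because the prediction file has no header, it helps to attach names to the nine columns when reading it. The sketch below does this with the standard `csv` module. The column names (`A_1`, `E_1`, `Y_1`, ...) are labels invented for this example, and a comma as delimiter is an assumption.

```python
import csv
import io

# Hypothetical names for the nine columns described in the table above.
COLUMNS = ("A_1", "A_2", "A_3", "E_1", "E_2", "E_3", "Y_1", "Y_2", "Y_3")


def read_predictions(handle):
    """Read header-less prediction rows into a list of dicts."""
    rows = []
    for record in csv.reader(handle):
        if not record:
            continue  # skip empty lines
        values = [float(v) for v in record]
        if len(values) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} columns, got {len(values)}")
        rows.append(dict(zip(COLUMNS, values)))
    return rows


# Example with an in-memory file; the numbers are placeholders, not results.
sample = io.StringIO("1e10,2e10,3e10,1e5,2e5,3e5,0.2,0.3,0.5\n")
predictions = read_predictions(sample)
```

For a real file, pass an open file object instead, for example `with open("prediction.csv", newline="") as fh: read_predictions(fh)`.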