
Merge pull request #214 from Evovest/docs
Docs
jeremiedb authored Mar 17, 2023
2 parents bd46e50 + dc0ab24 commit 4da60ae
Showing 2 changed files with 25 additions and 12 deletions.
16 changes: 15 additions & 1 deletion src/MLJ.jl
@@ -174,14 +174,15 @@ A model type for constructing an EvoTreeRegressor, based on [EvoTrees.jl](https:/
# Hyper-parameters
- `loss=:linear`: Loss to be minimized during training. One of:
- `:linear`
- `:logistic`
- `:gamma`
- `:tweedie`
- `:quantile`
- `:L1`
- `nrounds=10`: Number of rounds. It corresponds to the number of trees that will be sequentially stacked.
- `eta=0.1`: Learning rate. Each tree's raw predictions are scaled by `eta` prior to being added to the stack of predictions.
A lower `eta` results in slower learning, requiring a higher `nrounds`, but typically improves model performance.
- `lambda::T=0.0`: L2 regularization term on weights. Must be >= 0. Higher lambda can result in a more robust model.
- `gamma::T=0.0`: Minimum gain improvement needed to perform a node split. Higher gamma can result in a more robust model.
- `alpha::T=0.5`: Loss specific parameter in the [0, 1] range:
@@ -198,6 +199,7 @@ A model type for constructing an EvoTreeRegressor, based on [EvoTrees.jl](https:/
- `nbins=32`: Number of bins into which each feature is quantized. Buckets are defined based on quantiles, hence resulting in equal weight bins.
- `monotone_constraints=Dict{Int, Int}()`: Specify monotonic constraints using a dict where the key is the feature index and the value the applicable constraint (-1=decreasing, 0=none, 1=increasing).
Only `:linear`, `:logistic`, `:gamma` and `:tweedie` losses are supported at the moment.
- `T=Float32`: The float precision on which the model will be trained. One of `Float32` or `Float64`.
- `rng=123`: Either an integer used as a seed to the random number generator or an actual random number generator (`::Random.AbstractRNG`).
- `device="cpu"`: Hardware device to use for computations. Can be either `"cpu"` or `"gpu"`. Only `:linear`, `:logistic`, `:gamma` and `:tweedie` losses are supported on GPU.
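As an illustrative sketch (not part of this diff), the hyper-parameters above might be combined through the MLJ interface roughly as follows; `make_regression` is MLJ's synthetic-data helper and the specific values are arbitrary:

```julia
using MLJ, EvoTrees

# Synthetic regression data (MLJ helper); any Tables-compatible X works.
X, y = make_regression(200, 5)

model = EvoTreeRegressor(
    loss=:linear,   # one of :linear, :logistic, :gamma, :tweedie, :quantile, :L1
    nrounds=100,    # number of trees stacked sequentially
    eta=0.05,       # lower eta -> slower learning, typically more rounds needed
    max_depth=4,
    lambda=0.1)

mach = machine(model, X, y)
fit!(mach)
yhat = predict(mach, X)   # point predictions
```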
@@ -302,6 +304,8 @@ EvoTreeClassifier is used to perform multi-class classification, using cross-ent
# Hyper-parameters
- `nrounds=10`: Number of rounds. It corresponds to the number of trees that will be sequentially stacked.
- `eta=0.1`: Learning rate. Each tree's raw predictions are scaled by `eta` prior to being added to the stack of predictions.
A lower `eta` results in slower learning, requiring a higher `nrounds`, but typically improves model performance.
- `lambda::T=0.0`: L2 regularization term on weights. Must be >= 0. Higher lambda can result in a more robust model.
- `gamma::T=0.0`: Minimum gain improvement needed to perform a node split. Higher gamma can result in a more robust model.
- `max_depth=5`: Maximum depth of a tree. Must be >= 1. A tree of depth 1 is made of a single prediction leaf.
@@ -311,6 +315,7 @@ EvoTreeClassifier is used to perform multi-class classification, using cross-ent
- `rowsample=1.0`: Proportion of rows that are sampled at each iteration to build the tree. Should be in `]0, 1]`.
- `colsample=1.0`: Proportion of columns / features that are sampled at each iteration to build the tree. Should be in `]0, 1]`.
- `nbins=32`: Number of bins into which each feature is quantized. Buckets are defined based on quantiles, hence resulting in equal weight bins.
- `T=Float32`: The float precision on which the model will be trained. One of `Float32` or `Float64`.
- `rng=123`: Either an integer used as a seed to the random number generator or an actual random number generator (`::Random.AbstractRNG`).
- `device="cpu"`: Hardware device to use for computations. Can be either `"cpu"` or `"gpu"`.
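A minimal MLJ sketch for the classifier (illustrative only; the iris dataset ships with MLJ):

```julia
using MLJ, EvoTrees

X, y = @load_iris   # y is a categorical (multi-class) target

model = EvoTreeClassifier(nrounds=50, eta=0.1, max_depth=4, rowsample=0.8)
mach = machine(model, X, y)
fit!(mach)

probs  = predict(mach, X)       # probabilistic predictions
labels = predict_mode(mach, X)  # hard class labels
```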
@@ -422,6 +427,8 @@ EvoTreeCount is used to perform Poisson probabilistic regression on count target
# Hyper-parameters
- `nrounds=10`: Number of rounds. It corresponds to the number of trees that will be sequentially stacked.
- `eta=0.1`: Learning rate. Each tree's raw predictions are scaled by `eta` prior to being added to the stack of predictions.
A lower `eta` results in slower learning, requiring a higher `nrounds`, but typically improves model performance.
- `lambda::T=0.0`: L2 regularization term on weights. Must be >= 0. Higher lambda can result in a more robust model.
- `gamma::T=0.0`: Minimum gain improvement needed to perform a node split. Higher gamma can result in a more robust model.
- `max_depth=5`: Maximum depth of a tree. Must be >= 1. A tree of depth 1 is made of a single prediction leaf.
@@ -432,6 +439,7 @@ EvoTreeCount is used to perform Poisson probabilistic regression on count target
- `colsample=1.0`: Proportion of columns / features that are sampled at each iteration to build the tree. Should be in `]0, 1]`.
- `nbins=32`: Number of bins into which each feature is quantized. Buckets are defined based on quantiles, hence resulting in equal weight bins.
- `monotone_constraints=Dict{Int, Int}()`: Specify monotonic constraints using a dict where the key is the feature index and the value the applicable constraint (-1=decreasing, 0=none, 1=increasing).
- `T=Float32`: The float precision on which the model will be trained. One of `Float32` or `Float64`.
- `rng=123`: Either an integer used as a seed to the random number generator or an actual random number generator (`::Random.AbstractRNG`).
- `device="cpu"`: Hardware device to use for computations. Can be either `"cpu"` or `"gpu"`.
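A sketch of the count workflow, under the assumption of a Tables-compatible feature set and an integer count target (the data below is made up):

```julia
using MLJ, EvoTrees
using Statistics  # for the generic `mean`

X = (x1 = randn(300), x2 = randn(300))
y = rand(0:5, 300)              # hypothetical count target

model = EvoTreeCount(nrounds=50, eta=0.1, nbins=32)
mach = machine(model, X, y)
fit!(mach)

dists  = predict(mach, X)       # Poisson distributions, one per row
counts = mean.(dists)           # expected counts
```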
@@ -547,6 +555,8 @@ EvoTreeGaussian is used to perform Gaussian probabilistic regression, fitting μ
# Hyper-parameters
- `nrounds=10`: Number of rounds. It corresponds to the number of trees that will be sequentially stacked.
- `eta=0.1`: Learning rate. Each tree's raw predictions are scaled by `eta` prior to being added to the stack of predictions.
A lower `eta` results in slower learning, requiring a higher `nrounds`, but typically improves model performance.
- `lambda::T=0.0`: L2 regularization term on weights. Must be >= 0. Higher lambda can result in a more robust model.
- `gamma::T=0.0`: Minimum gain improvement needed to perform a node split. Higher gamma can result in a more robust model.
- `max_depth=5`: Maximum depth of a tree. Must be >= 1. A tree of depth 1 is made of a single prediction leaf.
@@ -558,6 +568,7 @@ EvoTreeGaussian is used to perform Gaussian probabilistic regression, fitting μ
- `nbins=32`: Number of bins into which each feature is quantized. Buckets are defined based on quantiles, hence resulting in equal weight bins.
- `monotone_constraints=Dict{Int, Int}()`: Specify monotonic constraints using a dict where the key is the feature index and the value the applicable constraint (-1=decreasing, 0=none, 1=increasing).
!Experimental feature: note that for Gaussian regression, constraints may not be enforced systematically.
- `T=Float64`: The float precision on which the model will be trained. One of `Float32` or `Float64`.
- `rng=123`: Either an integer used as a seed to the random number generator or an actual random number generator (`::Random.AbstractRNG`).
- `device="cpu"`: Hardware device to use for computations. Can be either `"cpu"` or `"gpu"`.
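An illustrative sketch with arbitrary data: the Gaussian model returns a fitted Normal distribution per observation, from which μ and σ can be read off.

```julia
using MLJ, EvoTrees
using Statistics  # for the generic `mean`

X = (x = randn(200),)
y = 2 .* X.x .+ randn(200)      # toy linear target

model = EvoTreeGaussian(nrounds=50, nbins=32)
mach = machine(model, X, y)
fit!(mach)

dists = predict(mach, X)        # vector of Normal distributions
μ̂ = mean.(dists)                # fitted means
```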
@@ -680,6 +691,8 @@ EvoTreeMLE performs maximum likelihood estimation. Assumed distribution is speci
- `:gaussian` / `:gaussian_mle`
- `:logistic` / `:logistic_mle`
- `nrounds=10`: Number of rounds. It corresponds to the number of trees that will be sequentially stacked.
- `eta=0.1`: Learning rate. Each tree's raw predictions are scaled by `eta` prior to being added to the stack of predictions.
A lower `eta` results in slower learning, requiring a higher `nrounds`, but typically improves model performance.
- `lambda::T=0.0`: L2 regularization term on weights. Must be >= 0. Higher lambda can result in a more robust model.
- `gamma::T=0.0`: Minimum gain improvement needed to perform a node split. Higher gamma can result in a more robust model.
- `max_depth=5`: Maximum depth of a tree. Must be >= 1. A tree of depth 1 is made of a single prediction leaf.
@@ -691,6 +704,7 @@ EvoTreeMLE performs maximum likelihood estimation. Assumed distribution is speci
- `nbins=32`: Number of bins into which each feature is quantized. Buckets are defined based on quantiles, hence resulting in equal weight bins.
- `monotone_constraints=Dict{Int, Int}()`: Specify monotonic constraints using a dict where the key is the feature index and the value the applicable constraint (-1=decreasing, 0=none, 1=increasing).
!Experimental feature: note that for MLE regression, constraints may not be enforced systematically.
- `T=Float64`: The float precision on which the model will be trained. One of `Float32` or `Float64`.
- `rng=123`: Either an integer used as a seed to the random number generator or an actual random number generator (`::Random.AbstractRNG`).
- `device="cpu"`: Hardware device to use for computations. Can be either `"cpu"` or `"gpu"`.
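The same pattern applies to EvoTreeMLE; a sketch with the logistic likelihood (data arbitrary):

```julia
using MLJ, EvoTrees

X = (x = randn(200),)
y = randn(200)

model = EvoTreeMLE(loss=:logistic_mle, nrounds=50)
mach = machine(model, X, y)
fit!(mach)

dists = predict(mach, X)   # fitted Logistic distributions (location, scale)
```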
21 changes: 10 additions & 11 deletions src/fit.jl
@@ -288,17 +288,16 @@ Main training function. Performs model fitting given configuration `params`, `x_
- `offset_eval::VecOrMat`: evaluation data offset. Should match the size of the predictions.
- `metric`: The evaluation metric that will be tracked on `x_eval`, `y_eval` and optionally `w_eval` / `offset_eval` data.
Supported metrics are:
- `:mse`: mean-squared error. Adapted for general regression models.
- `:rmse`: root-mean-squared error (CPU only). Adapted for general regression models.
- `:mae`: mean absolute error. Adapted for general regression models.
- `:logloss`: Adapted for `:logistic` regression models.
- `:mlogloss`: Multi-class cross entropy. Adapted to `EvoTreeClassifier` classification models.
- `:poisson`: Poisson deviance. Adapted to `EvoTreeCount` count models.
- `:gamma`: Gamma deviance. Adapted to regression problem on Gamma like, positively distributed targets.
- `:tweedie`: Tweedie deviance. Adapted to regression problem on Tweedie like, positively distributed targets with probability mass at `y == 0`.
- `:gaussian_mle`: Gaussian log-likelihood. Adapted to MLE when using `EvoTreeMLE` with `loss = :gaussian_mle`.
- `:logistic_mle`: Logistic log-likelihood. Adapted to MLE when using `EvoTreeMLE` with `loss = :logistic_mle`.
- `early_stopping_rounds::Integer`: number of consecutive rounds without metric improvement after which fitting is stopped.
- `print_every_n`: sets at which frequency logging info should be printed.
- `verbosity`: set to 1 to print logging info during training.
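As a sketch of the training call described above (assuming the keyword-based `fit_evotree` signature of this EvoTrees version; data and values are arbitrary):

```julia
using EvoTrees

config = EvoTreeRegressor(nrounds=500, eta=0.05)

x_train, y_train = randn(800, 4), randn(800)
x_eval,  y_eval  = randn(200, 4), randn(200)

m = fit_evotree(config;
    x_train, y_train,
    x_eval, y_eval,
    metric=:mse,               # tracked on the evaluation data
    early_stopping_rounds=20,  # stop after 20 rounds without improvement
    print_every_n=50,
    verbosity=1)

pred = m(x_eval)               # fitted model is callable on a feature matrix
```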
