
Add Brier Score metrics #10

Merged: 31 commits into main on Jul 17, 2023
Conversation

@Vincent-Maladiere (Collaborator) commented Jun 28, 2023

What does this PR implement?

This is the implementation of the Brier score metrics module:

  • BrierScoreComputer
  • BrierScoreSampler to be moved to the GBIncidence source code
  • brier_score
  • brier_score_incidence
  • integrated_brier_score
  • integrated_brier_score_incidence

This also implements the IPCWEstimator used by the BrierScoreComputer. It introduces a dependency on lifelines, but not on scikit-survival.

Remarks

- This fixes a subtle bug in y_binary that made our previous implementation of brier_score() incorrect; it now matches scikit-survival's results.
Edit: y_true_binary is 1 when we observed an event and 0 otherwise, which is the correct target for our Gradient Boosting classification or regression.
However, when computing the Brier score metric, we actually need np.logical_not(y_true_binary): once an event is observed, the survival probability drops to 0; otherwise it stays at 1. The Brier score is defined on the survival probability, not the incidence probability.

  • brier_score computes the regular Brier score, as scikit-survival does, where the y_pred input is the survival probability. By contrast, brier_score_incidence follows the Kretowska formula for competing events and expects y_pred to be the incidence probability for the kth cause of failure. Under the hood, brier_score simply calls brier_score_incidence with 1 - y_pred.

    This disambiguates our Brier score implementation. We could expose only brier_score_incidence without the regular brier_score, but keeping the latter is handy: otherwise we would ask users to compute 1 - y_pred for binary events every time.

  • When running

    brier_score(y_train, y_test, y_pred, times)

    times must be the time grid used to compute y_pred, which has shape (n_samples, n_times); otherwise the Brier score will be incorrect. This is hard to verify when y_pred is not a dataframe, so we only check shapes.
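The y_true_binary remark and the 1 - y_pred delegation above can be illustrated with a small numeric sketch (the probability values below are made up, and IPCW weighting is ignored for brevity):

```python
import numpy as np

# 1 when an event was observed by the evaluation time, 0 otherwise.
y_true_binary = np.array([1, 0, 1, 0])
# Made-up predicted incidence P(T <= t) and the matching survival S(t).
incidence_pred = np.array([0.8, 0.1, 0.9, 0.2])
survival_pred = 1.0 - incidence_pred

# The survival-based score targets np.logical_not(y_true_binary):
# the survival probability drops to 0 once the event occurs.
survival_target = np.logical_not(y_true_binary).astype(float)
survival_brier = np.mean((survival_target - survival_pred) ** 2)

# The incidence-based score targets the event indicator itself.
incidence_brier = np.mean((y_true_binary - incidence_pred) ** 2)

# The two scores coincide, which is why brier_score can delegate to
# brier_score_incidence by passing 1 - y_pred.
assert np.isclose(survival_brier, incidence_brier)
```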

@ogrisel (Contributor) left a comment:

First quick pass of feedback.

Resolved review threads (outdated): hazardous/metrics/brier_score.py (2), hazardous/tests/test_metrics.py (2), hazardous/tests/test_ipcw.py (3)
@ogrisel (Contributor) left a comment:

Resolved review threads (outdated): hazardous/_ipcw.py (7)
@ogrisel (Contributor) commented Jul 10, 2023

Let's add some tests to check some mathematical properties:

  • IPCWEstimator should always predict 1.0 when the training data has no censored values,
  • check that assert_allclose(ipcw.predict([0]), [1.0]) when ipcw is fit on random data, even when the shortest duration in the training set is censored,
  • check that ipcw.predict([t + eps]) is strictly larger than 1.0 when ipcw is fit on random censored data and t is the smallest censored duration in the training set,
  • check that IPCWEstimator predicts monotonically increasing values on np.linspace(0, t_max, 100) when fit on random censored event data,
  • check that when a training set has deterministic censoring beyond a given t value (100% of the largest training events are censored), predictions at large t values are clipped to 1 / ipcw.min_censoring_survival_prob for different values of min_censoring_survival_prob.
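A few of these properties can be sketched in a self-contained way with a minimal Kaplan-Meier estimator of the censoring distribution used as a stand-in for IPCWEstimator (the function names `km_censoring_survival` and `ipcw_weights` and the default 0.05 below are illustrative, not the package's API; only min_censoring_survival_prob is named in the list above):

```python
import numpy as np

def km_censoring_survival(durations, events, eval_times):
    """Kaplan-Meier estimate of the censoring survival function G(t).

    Censoring (event == 0) plays the role of the event of interest.
    """
    uniq = np.unique(durations)
    g, g_steps = 1.0, []
    for t in uniq:
        at_risk = np.sum(durations >= t)
        n_censored = np.sum((durations == t) & (events == 0))
        g *= 1.0 - n_censored / at_risk
        g_steps.append(g)
    g_steps = np.asarray(g_steps)
    # Step function: G(t) is the product over censoring times <= t.
    idx = np.searchsorted(uniq, eval_times, side="right") - 1
    return np.where(idx >= 0, g_steps[np.clip(idx, 0, None)], 1.0)

def ipcw_weights(durations, events, eval_times, min_censoring_survival_prob=0.05):
    g = km_censoring_survival(durations, events, eval_times)
    return 1.0 / np.clip(g, min_censoring_survival_prob, 1.0)

rng = np.random.default_rng(0)
durations = rng.exponential(scale=10.0, size=200)
events = rng.integers(0, 2, size=200)  # ~50% censoring
times = np.linspace(0.0, durations.max(), 100)

w = ipcw_weights(durations, events, times)
assert w[0] == 1.0                     # weight 1 at t = 0
assert np.all(np.diff(w) >= 0.0)       # monotonically non-decreasing
assert np.all(w <= 1.0 / 0.05 + 1e-9)  # capped by the clipped G(t)

# With no censoring at all, every weight is exactly 1.
w_uncensored = ipcw_weights(durations, np.ones_like(events), times)
assert np.allclose(w_uncensored, 1.0)
```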

@ogrisel (Contributor) left a comment:

More feedback.

Resolved review threads (outdated): hazardous/tests/test_metrics.py (3), hazardous/metrics/brier_score.py (7), pyproject.toml (1), hazardous/utils.py (2)
@ogrisel (Contributor) commented Jul 13, 2023

OK @Vincent-Maladiere, I think this is ready for a final pass of review on your end.

@ogrisel (Contributor) commented Jul 13, 2023

I rendered the HTML of the docs locally, but I realized we do not have a CI job that builds it for each PR; we only build it when publishing the result after the PR is merged.

I think the easiest way to achieve this would be to configure a CircleCI job for this project. Let's do that later in a separate PR.

@ogrisel merged commit 17e3950 into main on Jul 17, 2023; 6 checks passed. The add_brier_score branch was deleted on July 17, 2023 at 07:40.