
Add Brier Score metrics #10

Merged: 31 commits into main on Jul 17, 2023
Conversation

@Vincent-Maladiere (Collaborator) commented Jun 28, 2023

What does this PR implement?

This is the implementation of the Brier score metrics module:

  • BrierScoreComputer
  • BrierScoreSampler to be moved to the GBIncidence source code
  • brier_score
  • brier_score_incidence
  • integrated_brier_score
  • integrated_brier_score_incidence

This also implements the IPCWEstimator used by the BrierScoreComputer. It introduces a dependency on lifelines, but not on scikit-survival.

Remarks

- This fixes a subtle bug in y_binary that made our previous implementation of brier_score() incorrect; it now matches scikit-survival's results.
Edit: y_true_binary is 1 when we observed an event and 0 otherwise, which is the correct target for our Gradient Boosting classification or regression.
However, when computing the Brier score metric, we actually need np.logical_not(y_true_binary): once an event is observed, the survival probability drops to 0; otherwise it stays at 1. The Brier score is defined on the survival probability, not the incidence probability.

  • brier_score computes the regular Brier score, as scikit-survival does, where the y_pred input is the survival probability. By contrast, brier_score_incidence follows the Kretowska formula for competing events and expects y_pred to be the incidence probability for the kth cause of failure. Under the hood, brier_score simply calls brier_score_incidence with 1 - y_pred.

    This disambiguates our Brier score implementation. We could expose only brier_score_incidence without the regular brier_score, but keeping the latter is handy: otherwise we would ask users to compute 1 - y_pred for binary events every time.

  • When running

    brier_score(y_train, y_test, y_pred, times)

    times must be the time grid used to compute y_pred, which has shape (n_samples, n_times); otherwise the Brier score will be incorrect. This is hard to verify when y_pred is not a dataframe, so we only check shapes.
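The y_true_binary remark and the 1 - y_pred delegation above can be illustrated with a small numeric sketch (the probability values below are made up, and IPCW weighting is ignored for brevity):

```python
import numpy as np

# 1 when an event was observed by the evaluation time, 0 otherwise.
y_true_binary = np.array([1, 0, 1, 0])
# Made-up predicted incidence P(T <= t) and the matching survival S(t).
incidence_pred = np.array([0.8, 0.1, 0.9, 0.2])
survival_pred = 1.0 - incidence_pred

# The survival-based score targets np.logical_not(y_true_binary):
# the survival probability drops to 0 once the event occurs.
survival_target = np.logical_not(y_true_binary).astype(float)
survival_brier = np.mean((survival_target - survival_pred) ** 2)

# The incidence-based score targets the event indicator itself.
incidence_brier = np.mean((y_true_binary - incidence_pred) ** 2)

# The two scores coincide, which is why brier_score can delegate to
# brier_score_incidence by passing 1 - y_pred.
assert np.isclose(survival_brier, incidence_brier)
```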

@ogrisel (Contributor) left a comment:

First quick pass of feedback.

Resolved review threads (outdated): hazardous/metrics/brier_score.py (2), hazardous/tests/test_metrics.py (2), hazardous/tests/test_ipcw.py (3)
@ogrisel (Contributor) left a comment:

Resolved review threads (outdated): hazardous/_ipcw.py (7)
@ogrisel (Contributor) commented Jul 10, 2023

Let's add some tests to check some mathematical properties:

  • IPCWEstimator should always predict 1.0 when the training data has no censored values,
  • check that assert_allclose(ipcw.predict([0]), [1.0]) when ipcw is fit on random data, even when the shortest duration in the training set is censored,
  • check that ipcw.predict([t + eps]) is strictly larger than 1.0 when ipcw is fit on random censored data and t is the smallest censored duration in the training set,
  • check that IPCWEstimator predicts monotonically increasing values on np.linspace(0, t_max, 100) when fit on random censored event data,
  • check that when a training set has deterministic censoring beyond a given t value (100% of the largest training events are censored), predictions at large t values are clipped to 1 / ipcw.min_censoring_survival_prob for different values of min_censoring_survival_prob.
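A few of these properties can be sketched in a self-contained way with a minimal Kaplan-Meier estimator of the censoring distribution used as a stand-in for IPCWEstimator (the function names `km_censoring_survival` and `ipcw_weights` and the default 0.05 below are illustrative, not the package's API; only min_censoring_survival_prob is named in the list above):

```python
import numpy as np

def km_censoring_survival(durations, events, eval_times):
    """Kaplan-Meier estimate of the censoring survival function G(t).

    Censoring (event == 0) plays the role of the event of interest.
    """
    uniq = np.unique(durations)
    g, g_steps = 1.0, []
    for t in uniq:
        at_risk = np.sum(durations >= t)
        n_censored = np.sum((durations == t) & (events == 0))
        g *= 1.0 - n_censored / at_risk
        g_steps.append(g)
    g_steps = np.asarray(g_steps)
    # Step function: G(t) is the product over censoring times <= t.
    idx = np.searchsorted(uniq, eval_times, side="right") - 1
    return np.where(idx >= 0, g_steps[np.clip(idx, 0, None)], 1.0)

def ipcw_weights(durations, events, eval_times, min_censoring_survival_prob=0.05):
    g = km_censoring_survival(durations, events, eval_times)
    return 1.0 / np.clip(g, min_censoring_survival_prob, 1.0)

rng = np.random.default_rng(0)
durations = rng.exponential(scale=10.0, size=200)
events = rng.integers(0, 2, size=200)  # ~50% censoring
times = np.linspace(0.0, durations.max(), 100)

w = ipcw_weights(durations, events, times)
assert w[0] == 1.0                     # weight 1 at t = 0
assert np.all(np.diff(w) >= 0.0)       # monotonically non-decreasing
assert np.all(w <= 1.0 / 0.05 + 1e-9)  # capped by the clipped G(t)

# With no censoring at all, every weight is exactly 1.
w_uncensored = ipcw_weights(durations, np.ones_like(events), times)
assert np.allclose(w_uncensored, 1.0)
```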

@ogrisel (Contributor) left a comment:

More feedback.

Resolved review threads (outdated): hazardous/tests/test_metrics.py (3), hazardous/metrics/brier_score.py (7), pyproject.toml (1), hazardous/utils.py (2)
@ogrisel (Contributor) commented Jul 13, 2023

OK @Vincent-Maladiere, I think this is ready for a final pass of review on your end.

@ogrisel (Contributor) commented Jul 13, 2023

I rendered the HTML of the docs locally, but I realized we do not have a CI job that builds it for each PR; we only build it when publishing the result after the PR is merged.

I think the easiest way to achieve this would be to configure a CircleCI job for this project. Let's do that later in a separate PR.

@ogrisel merged commit 17e3950 into main on Jul 17, 2023; 6 checks passed. The add_brier_score branch was deleted on July 17, 2023 at 07:40.