Add Brier Score metrics #10
Conversation
First quick pass of feedback.
Here is the suggestion related to https://github.com/soda-inria/hazardous/pull/10/files#r1257862724.
Let's add some tests to check some mathematical properties:
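For instance, the kind of properties worth asserting can be sketched on an uncensored toy version of the score (`brier_score_uncensored` is a hypothetical helper written for this illustration, not the PR's implementation):

```python
import numpy as np

def brier_score_uncensored(y_true_binary, y_pred_survival):
    # Without censoring, the time-dependent Brier score reduces to the mean
    # squared difference between the predicted survival probability and the
    # survival indicator, i.e. np.logical_not(y_true_binary).
    y_survived = 1 - np.asarray(y_true_binary)
    return np.mean((y_survived - np.asarray(y_pred_survival)) ** 2)

y_true = np.array([1, 0, 1, 0])

# Property 1: a perfect prediction yields a score of exactly 0.
assert brier_score_uncensored(y_true, 1 - y_true) == 0.0

# Property 2: the score is always bounded in [0, 1].
rng = np.random.default_rng(0)
assert 0.0 <= brier_score_uncensored(y_true, rng.uniform(size=4)) <= 1.0

# Property 3: a constant, uninformative prediction of 0.5 scores 0.25.
assert abs(brier_score_uncensored(y_true, np.full(4, 0.5)) - 0.25) < 1e-12
```

The same three properties (perfect prediction scores 0, the score stays in [0, 1], a 0.5 prediction scores 0.25) also hold for the censored IPCW variants on data without censoring, which makes them cheap sanity checks.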
More feedback.
Ok @Vincent-Maladiere, I think this is ready for a final pass of review on your end.
I rendered the HTML of the doc locally, but I realize that we do not have a CI job to do it for the PR, only for publishing the result on merge of the PR. I think the easiest way to achieve this would be to configure a CircleCI job for this project. Let's do that later in a separate PR.
What does this PR implement?
This is the implementation of the Brier score metrics module:
- `BrierScoreComputer` (to be moved to the GBIncidence source code)
- `BrierScoreSampler`
- `brier_score`
- `brier_score_incidence`
- `integrated_brier_score`
- `integrated_brier_score_incidence`

This also implements the `IPCWEstimator` used by the `BrierScoreComputer`. We have a dependency on lifelines, but not on scikit-survival.

Remarks
- This fixes a subtle bug in `y_binary`, making our previous implementation of `brier_score()` incorrect. It now matches scikit-survival results. Edit: `y_true_binary` is 1 when we observed an event, and 0 otherwise, which is correct for our Gradient Boosting classification or regression target. However, when computing the Brier score metric, we actually need `np.logical_not(y_true_binary)`. Indeed, when we observe an event, the survival probability drops to 0; otherwise, it stays at 1. The Brier score uses the survival probability, not the incidence probability.
- `brier_score` computes the regular Brier score like scikit-survival does, where the `y_pred` input is the survival probability. On the contrary, `brier_score_incidence` follows the Kretowska formula for competing events and expects `y_pred` to be the incidence probability for the kth cause of failure instead. Under the hood though, `brier_score` relies on `brier_score_incidence`, by simply inputting `1 - y_pred`.
- This is needed to disambiguate our Brier score implementation. We could also only expose `brier_score_incidence` without the regular `brier_score`, but I thought it was handy to have it, instead of telling the user to compute `1 - y_pred` for binary events all the time.
- When running, `times` must be the times used for computing `y_pred` (of shape `(n_samples, n_times)`), otherwise the Brier score will be incorrect. This is tricky to check when `y_pred` is not a dataframe, so we only check for shapes.
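To make these remarks concrete, here is a rough, self-contained sketch of an IPCW-weighted Brier score at a single time horizon. All names and shapes here are assumptions for illustration, not the module's actual API, and the sketch glosses over tie handling and left limits of the censoring distribution G:

```python
import numpy as np

def km_censoring_survival(durations, events, t):
    """Kaplan-Meier estimate G(t) of the censoring survival function,
    treating censoring (events == 0) as the event of interest.
    This is the quantity an IPCW estimator provides."""
    durations, events = np.asarray(durations), np.asarray(events)
    g = 1.0
    for u in np.unique(durations[events == 0]):  # sorted censoring times
        if u > t:
            break
        at_risk = np.sum(durations >= u)
        d = np.sum((durations == u) & (events == 0))
        g *= 1.0 - d / at_risk
    return g

def brier_score_at(durations, events, y_pred_survival, t):
    """IPCW Brier score at horizon t, where y_pred_survival[i] ~ P(T_i > t).
    Note the target is the *survival* indicator: 0 once an event occurred."""
    durations, events = np.asarray(durations), np.asarray(events)
    score = 0.0
    for i in range(len(durations)):
        if durations[i] <= t and events[i] == 1:
            # Event observed before t: the survival indicator is 0.
            w = 1.0 / km_censoring_survival(durations, events, durations[i])
            score += w * (0.0 - y_pred_survival[i]) ** 2
        elif durations[i] > t:
            # Still at risk at t: the survival indicator is 1.
            w = 1.0 / km_censoring_survival(durations, events, t)
            score += w * (1.0 - y_pred_survival[i]) ** 2
        # Samples censored before t contribute nothing directly;
        # the IPCW weights of the other samples compensate for them.
    return score / len(durations)

durations = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
events = np.array([1, 1, 0, 1, 0])  # 1 = event observed, 0 = censored

# A perfect survival prediction at t = 2.5 scores exactly 0.
y_perfect = np.array([0.0, 0.0, 1.0, 1.0, 1.0])
assert brier_score_at(durations, events, y_perfect, 2.5) == 0.0
```

Since the survival-probability score delegates to the incidence one through the complement, the incidence-flavored version of this sketch is simply `brier_score_at(durations, events, 1 - y_pred_incidence, t)`.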