Potential problem of the CSI score #1
Dear authors of the SEVIR benchmark, we have recently been running the SEVIR benchmark and noticed a potential problem in the current implementation of the CSI score:

sevir_challenges/src/metrics/metrics.py, line 138 (commit 6e18184)

When the threshold is large, it is possible that the hits, misses and fas of the model are all zero. The current formulation produces csi = 1.0 in that case, while it should give csi = 0.0. To avoid this problem, a better formula might be one that returns 0.0 whenever hits + misses + fas is zero. CC @gaozhihan also.
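A minimal sketch of such a guard (illustrative only; the function name and the 0.0 fallback are assumptions, not the exact snippet proposed in the issue):

```python
def csi(hits: int, misses: int, fas: int) -> float:
    """Critical Success Index: hits / (hits + misses + fas).

    Returns 0.0 when hits, misses and fas are all zero, instead of the
    degenerate 1.0 the current formulation can produce.
    """
    denom = hits + misses + fas
    return hits / denom if denom > 0 else 0.0
```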
Comments

The same problem also appears in the implementations of POD, SUCR and BIAS. We found that the CSI score is the most vulnerable, and it can be quite misleading, especially when batch_size is small.
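The same zero-denominator guard carries over to the other scores. A sketch using the standard contingency-table definitions (the 0.0 fallback is one possible choice, not the repository's actual behavior):

```python
def pod(hits: int, misses: int) -> float:
    """Probability of Detection: hits / (hits + misses)."""
    denom = hits + misses
    return hits / denom if denom > 0 else 0.0

def sucr(hits: int, fas: int) -> float:
    """Success Ratio: hits / (hits + fas)."""
    denom = hits + fas
    return hits / denom if denom > 0 else 0.0

def bias(hits: int, misses: int, fas: int) -> float:
    """Frequency bias: (hits + fas) / (hits + misses)."""
    denom = hits + misses
    return (hits + fas) / denom if denom > 0 else 0.0
```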
Thanks for the feedback -- I remember struggling with this choice. In general these metrics are unreliable when computed over small batches, so I recommend they only be used for evaluating over your entire test set. If running in batches is necessary, one can also tally hits, misses and false alarms per batch, and combine them at the end. That being said, I agree there is probably a better way to handle the case where hits, misses and false alarms are all zero, which would be more of a compromise between your suggestion and the way it is now.
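A minimal sketch of the batch-tallying approach described above (the function name, the thresholding rule, and the (prediction, target) batch format are assumptions for illustration, not the repository's API):

```python
import numpy as np

def csi_over_batches(batches, threshold):
    """Accumulate hits, misses and false alarms across all batches, then
    compute a single CSI at the end, so small batches never yield a
    degenerate per-batch score.

    `batches` is any iterable of (prediction, target) numpy arrays.
    """
    hits = misses = fas = 0
    for pred, target in batches:
        p = pred >= threshold
        t = target >= threshold
        hits += int(np.sum(p & t))     # predicted event, event occurred
        misses += int(np.sum(~p & t))  # missed event
        fas += int(np.sum(p & ~t))     # false alarm
    denom = hits + misses + fas
    return hits / denom if denom > 0 else 0.0
```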
Thanks for your reply. It's quite helpful.