c2st fails when one feature is constant #1204

Closed
Baschdl opened this issue Jul 24, 2024 · 1 comment · Fixed by #1205
Labels
bug Something isn't working

Baschdl commented Jul 24, 2024

Describe the bug

Running c2st/c2st_scores with the default z_score=True when at least one feature is constant (all data points share the same value for that feature) fails with ValueError: Input X contains NaN. RandomForestClassifier does not accept missing values encoded as NaN natively....
The cause is the z-scoring step, which divides the data by the per-feature standard deviation. For a constant feature that standard deviation is zero, so the division produces NaN:

sbi/sbi/utils/metrics.py

Lines 161 to 165 in 83e122a

if z_score:
    X_mean = torch.mean(X, dim=0)
    X_std = torch.std(X, dim=0)
    X = (X - X_mean) / X_std
    Y = (Y - X_mean) / X_std
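
The effect can be seen in isolation, without calling c2st. The following is a minimal sketch that applies the same z-scoring arithmetic to a small tensor whose second column is constant. The tensor values are made up for illustration and are not taken from the issue.

```python
import torch

# Illustrative data (an assumption, not from the issue):
# the second feature has the same value in every row.
X = torch.tensor([[1.0, 3.0], [2.0, 3.0], [4.0, 3.0]])

X_mean = torch.mean(X, dim=0)
X_std = torch.std(X, dim=0)  # the constant column has a standard deviation of 0

# For the constant column this computes 0 / 0, which is NaN.
# The non-constant column is z-scored as usual.
X_z = (X - X_mean) / X_std

print("std per feature:", X_std)
print("NaN in constant column:", torch.isnan(X_z[:, 1]).all().item())
print("NaN in varying column:", torch.isnan(X_z[:, 0]).any().item())
```

Any NaN introduced here is then passed on to the classifier, which matches the ValueError quoted above.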

To Reproduce

from sbi.utils.metrics import c2st
import torch

X, Y = torch.ones(5,2), torch.zeros(5,2)
c2st(X, Y)

janfb commented Jul 26, 2024

Thanks for reporting this @Baschdl

I can reproduce it only when all features are constant. But still, this should not happen. I suggest setting std=1 when the feature is constant so that we are only shifting it to zero in that case.
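
A rough sketch of that suggestion, assuming the standard deviation is simply replaced by 1 wherever it is exactly zero. The helper name z_score_pair is hypothetical and only for illustration; this is not necessarily how the fix in #1205 is implemented.

```python
import torch


def z_score_pair(X, Y):
    """Hypothetical helper: z-score X and Y using the statistics of X."""
    X_mean = torch.mean(X, dim=0)
    X_std = torch.std(X, dim=0)
    # Where a feature is constant (std == 0), use 1 instead, so that the
    # feature is only shifted by its mean and never divided by zero.
    X_std = torch.where(X_std == 0, torch.ones_like(X_std), X_std)
    return (X - X_mean) / X_std, (Y - X_mean) / X_std


# Same inputs as in the reproduction above: every feature is constant.
X, Y = torch.ones(5, 2), torch.zeros(5, 2)
X_z, Y_z = z_score_pair(X, Y)

print("NaN in X_z:", torch.isnan(X_z).any().item())
print("NaN in Y_z:", torch.isnan(Y_z).any().item())
```

With this guard, a constant feature in X is mapped to zero and the corresponding feature in Y keeps its offset from the mean of X, so no NaN reaches the classifier.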
