Add class to handle value filling #46
base: main
Modify Standardizer to ensure compatibility with the new parent class and method
…tested with the original standardizer class
…caler (issues with computing the median)
…on was moved to the data_transformer class
@dnerini Should we already aim for a ValueFiller that can differentiate between different variables?
mlpp_lib/value_filler.py
Outdated
from typing_extensions import Self


class ValueFiller:
I'd use "Imputer" since imputation is a widely used concept in ML, see e.g. https://scikit-learn.org/stable/modules/impute.html
def fit(
    self,
    dataset: xr.Dataset,
    dims: Optional[list] = None,
):
    self.fillvalue = self.fillvalue
Here we currently don't require a fill value for every variable. This is not an issue, since we remove NaNs afterwards anyway, but users may not expect this behaviour.
Should we require all variables to be present? Or assign a "default" value to the remaining variables?
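To make the two options concrete, here is a minimal sketch of how they could be combined. It is not the code in this PR: the class name `Imputer`, the `default` parameter, and the use of a plain mapping of variable names to lists of floats (standing in for an `xr.Dataset`) are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Mapping, Optional


class Imputer:
    """Hypothetical sketch: fill NaNs with a per-variable value.

    A plain mapping {variable name: list of floats} stands in for an
    xr.Dataset here so that the example has no third-party dependencies.
    """

    def __init__(
        self,
        fillvalue: Mapping[str, float],
        default: Optional[float] = None,
    ) -> None:
        self.fillvalue: Dict[str, float] = dict(fillvalue)
        self.default = default

    def fit(self, dataset: Mapping[str, List[float]]) -> "Imputer":
        # Variables in the dataset that have no explicit fill value.
        missing = [name for name in dataset if name not in self.fillvalue]
        if missing and self.default is None:
            # Strict option: every variable must have a fill value.
            raise ValueError(f"No fill value for variables: {sorted(missing)}")
        # Lenient option: the remaining variables get the default value.
        for name in missing:
            self.fillvalue[name] = self.default
        return self

    def transform(
        self, dataset: Mapping[str, List[float]]
    ) -> Dict[str, List[float]]:
        # Replace each NaN with the fill value of its variable.
        return {
            name: [self.fillvalue[name] if math.isnan(v) else v for v in values]
            for name, values in dataset.items()
        }
```

With this shape, leaving `default=None` gives the strict behaviour (fitting fails if a variable has no fill value), while passing a `default` fills the remaining variables silently. Which of the two should be the out-of-the-box behaviour is still the open question.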