Add class to handle value filling #46

louisPoulain · 2024-06-26T09:29:13Z

No description provided.

louisPoulain · 2024-06-26T09:30:01Z

@dnerini Should we already aim for a ValueFiller that can differentiate between variables?

dnerini · 2024-06-27T15:49:07Z

mlpp_lib/value_filler.py

+from typing_extensions import Self
+
+
+class ValueFiller:


I'd use "Imputer" since imputation is a widely used concept in ML, see e.g. https://scikit-learn.org/stable/modules/impute.html

louisPoulain · 2024-10-23T09:47:52Z

mlpp_lib/imputer.py

+    def fit(self,
+        dataset: xr.Dataset,
+        dims: Optional[list] = None,
+    ):
+        self.fillvalue = self.fillvalue


Here we currently don't require a fill value for every variable. This is not an issue, since we remove NaNs afterwards anyway, but users may not expect this behaviour.
Should we force all variables to be present? Or assign a "default" value to the remaining variables?
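
To make the two options concrete, here is a minimal sketch of how the per-variable fill values could be resolved before filling. It is only an illustration under assumptions: `resolve_fill_values`, its `default` and `strict` parameters, and the variable names `t2m` and `rh` are hypothetical and not part of the current `imputer.py`.

```python
from typing import Any, Hashable, Iterable, Mapping, Optional


def resolve_fill_values(
    variables: Iterable[Hashable],
    fillvalue: Mapping[Hashable, Any],
    default: Optional[Any] = None,
    strict: bool = False,
) -> dict:
    """Return the fill value to use for each variable.

    Hypothetical helper, not part of mlpp_lib.
    - strict=True: every variable must have an explicit fill value.
    - default given: variables without an explicit value get the default.
    - otherwise: variables without a value are left out (not filled).
    """
    variables = list(variables)

    # Fill values for variables that do not exist are most likely a typo.
    unknown = [name for name in fillvalue if name not in variables]
    if unknown:
        raise KeyError(f"Fill values given for unknown variables: {unknown}")

    missing = [name for name in variables if name not in fillvalue]
    if strict and missing:
        raise ValueError(f"No fill value given for variables: {missing}")

    resolved = {}
    for name in variables:
        if name in fillvalue:
            resolved[name] = fillvalue[name]
        elif default is not None:
            resolved[name] = default
    return resolved


# Hypothetical variable names, for illustration only.
print(resolve_fill_values(["t2m", "rh"], {"t2m": -5.0}, default=0.0))
```

With `strict=True` the missing `rh` entry would raise instead, which corresponds to forcing all variables to be present. The resolved mapping could then be handed to the filling step; as far as I know, `xr.Dataset.fillna` accepts a dict keyed by data variable name, but that should be checked against the xarray version in use.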

louisPoulain added 30 commits June 6, 2024 15:34

Define a first draft for a Normalizer "meta-class"

a766beb

Extend normaliser class with new method.

ddadee9

Modify Standardizer to ensure compatibility with the new parent class and method

Continue working on normalizer class. Experimental, still need to be tested with the original standardizer class

6bf868c

Changed a lot the meta-normalizer architecture.

a0d030c

Early testing with the meta class and the initial standardizer class.

3067f28

Update gitignore to hide test files

c261abc

Add MinMaxScaler

1a7764b

Add and perform tests for MinMaxScaler

d52f86a

Change the testing to a more general framework.

5b71fd7

Correct a few issues

e9fa689

Add and test MinMaxScaler

b4f5333

Add RobustScaling (standard norma with median and iqr)

181f8b8

Make the test file work with an unlimited number of normalizer subclasses

ca556fe

Add MaxAbsScaler, BoxCox and YeoJohnson transformations

8c2761f

Changed minor things

5f562c6

Minor changes, this test script is vowed to disappear

7f3bd13

Modification of the naming

efb00d8

Modification to take into account the new class

e4a91f1

Modif ignore

869472f

Add a try/except to test if normalizer was fitted. To be changed in the future

c6b0245

Rename some parameters and change a bit the structure. Remove RobustScaler (issues with computing the median)

6e4b25a

Add function to bypass dask.nanmedian (not working currently)

68c9ae2

Few text changes

65c88d2

Add fillvalue in arguments, modify the way an object is created from to_dict

e30ed07

Reduce tolerance to 1e-7 for allclose tests

bcbe64d

Minor change

86f8e64

Add logger message to track the multinormalizer.

f25ea28

Debug a bit the lengths of datasets

61399a2

Add default normalizer

41b7240

Move save and load to the super-class

e79aec4

louisPoulain added 16 commits June 20, 2024 13:33

Changes according to standardizers.py

88b7bf5

Changes according to standardizers.py

4f15068

Rename the test to prevent it from being launched by pytest (test to be removed)

8750db9

Modify code to work with the new normalizers

2ba7abc

Clean a bit the code

eb3e8be

Update doc, remove print messages

d5629b0

Update test

7866d46

Remove import

67ddcc6

Modify the default normalization handling

745f843

Add fitting of data

b3b2e37

Minor simplification

a3ae2fb

Change names to reflect more accurately the roles of the classes

1cfd1b7

Revert to use "normalizer" as input name

5f76a3e

See #43 (comment)

Use again as pytesting. Removed the serialization test as serialization was moved to the data_transformer class

0edf87d

Rename the method according to #43 (comment)

1a7129a

Add class to handle value filling

a8ce0db

dnerini reviewed Jun 27, 2024

Base automatically changed from feature/normalizer to main August 31, 2024 16:00

Louis Poulain--Auzéau and others added 9 commits October 18, 2024 16:20

Merge branch 'main' into feature/ValueFiller

4774838

Add some initial thoughts and rename value_filler.py to imputer.py

5d53eb9

Prepare a template for imputation methods

00c203d

(WIP) Prepare PersistentImputation

4fff489

(WIP) Finish PersistentImputation

da26df6

Improve classes, finish dataImputation super-class

5d2f1fa

Create DataImputer on the model of DataTransformer, left important TODOs to address

7a4808a

Small text fixes

53bedd9

Update ConstantImputer

0a17fa5

louisPoulain commented Oct 23, 2024

Delete tests/test_data_transformer.py

855d6ca
