patch: scikit-learn 1.6 compatibility #726

FBruzzesi · 2024-12-16T08:59:40Z

Description

Supersedes #720 by using and adapting sklearn-compat

Tested locally on Python 3.10 and 3.12 with scikit-learn 1.5 and 1.6: all tests pass in these runs

Tip for review

I tried to carefully commit one module at a time, hence the best way forward would be to review one commit at a time.

One big disclaimer: at the end I realized that validate_data was not going to work in all situations, but an adaptation of check_X_y would. I therefore rolled those changes back entirely in another branch, which was then merged into this one.

Edit: as suggested, using validate_data is the right way to go, and after a deeper dive I was able to refactor the codebase to use that function

sklego/_sklearn_compat.py

adrinjalali · 2024-12-16T10:05:38Z

sklego/decomposition/pca_reconstruction.py

        X = check_array(X, estimator=self, dtype=FLOAT_DTYPES)
+        _check_n_features(self, X, reset=True)


You probably want to call validate_data instead of check_array, check_X_y, and _check_n_features. In sklearn itself, we rarely call these functions directly. It's almost always validate_data
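To illustrate the suggestion, here is a minimal sketch of replacing the check_array plus _check_n_features pair with a single validate_data call. It is not code from this PR: `CenteringTransformer` is a hypothetical estimator, and the `ImportError` fallback to the private `_validate_data` method is an assumption about how one might support scikit-learn versions older than 1.6 without sklearn-compat.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.utils.validation import FLOAT_DTYPES, check_is_fitted

try:
    # scikit-learn >= 1.6 exposes validate_data as a public function.
    from sklearn.utils.validation import validate_data
except ImportError:
    # Assumption: on older versions, delegate to the private method.
    def validate_data(estimator, X="no_validation", y="no_validation", **kwargs):
        return estimator._validate_data(X=X, y=y, **kwargs)


class CenteringTransformer(TransformerMixin, BaseEstimator):
    """Hypothetical transformer used only for illustration (not in sklego)."""

    def fit(self, X, y=None):
        # One call validates X and, with reset=True, records n_features_in_.
        X = validate_data(self, X=X, dtype=FLOAT_DTYPES, reset=True)
        self.mean_ = X.mean(axis=0)
        return self

    def transform(self, X):
        check_is_fitted(self, "mean_")
        # reset=False checks X against the n_features_in_ seen during fit.
        X = validate_data(self, X=X, dtype=FLOAT_DTYPES, reset=False)
        return X - self.mean_


X_train = np.array([[1.0, 2.0], [3.0, 4.0]])
transformer = CenteringTransformer().fit(X_train)
centered = transformer.transform(X_train)
```

With `reset=False`, passing an array with a different number of columns to `transform` raises a `ValueError`, which is the check that `_check_n_features` performed separately in the snippet above.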

FBruzzesi · 2024-12-16

Thanks @adrinjalali! I will have a deeper look.

If you look at the first 7 commits of this PR, that was exactly how I approached it, i.e. moving from check_array and check_X_y to validate_data. However, many tests were failing (especially for meta estimators), so I moved forward with the safe path of adjusting what we already have.

adrinjalali · 2024-12-16

I'd be happy to have a look if you point me to some errors.

FBruzzesi · 2024-12-16

It seems it was a skill issue of mine! Should be fixed now 😇

koaning

Not the biggest fan of a big vendor, but I guess it is the most pragmatic way to go about it for now. I guess time will tell if this becomes a burden but I guess by that time we can just assume folks use scikit-learn >> 1.6.

@FBruzzesi well done! This was a lot of work!

FBruzzesi added 9 commits December 12, 2024 13:01

WIP: low hanging fix

a52bec1

add sklearn-compat dependency

8cfa6c7

preprocessing module

8052bc4

decomposition module

9693384

mixture and feature_selection modules

bac7e83

meta module

e1b2520

top level modules

d3cb19f

WIP: do not use validate_data

6290c99

check_X_y with changed check_array

54e7664

FBruzzesi commented Dec 16, 2024

adrinjalali reviewed Dec 16, 2024

FBruzzesi added 2 commits December 16, 2024 12:27

use validate_data

f2f28bb

use validate_data

2c90729

koaning approved these changes Dec 17, 2024

koaning merged commit c17cd27 into main Dec 17, 2024
15 checks passed

koaning deleted the patch/scikit-learn-16-compat branch December 17, 2024 12:06

FBruzzesi mentioned this pull request Dec 17, 2024

Transition to scikit-learn 1.6 #719

Closed
