[Logistic Regression] Support fit on two classes #343

lijinf2 · 2023-07-25T23:24:08Z

Running the tests requires installing the cuml 23.08 nightly.

lijinf2 · 2023-07-25T23:30:34Z

build

wbo4958 · 2023-07-26T03:37:48Z

python/src/spark_rapids_ml/classification.py

+    ) -> Callable[[FitInputType, Dict[str, Any]], Dict[str, Any],]:
+        num_workers = self.num_workers
+        array_order = self._fit_array_order()
+        num_classes = dataset.select(alias.label).distinct().count()


It seems num_classes is never used.
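For illustration only: `distinct().count()` is a Spark action, so computing an unused value costs an extra pass over the data. If the count is kept rather than removed, one plausible use is a guard that the input really has two classes, since this PR targets binary fit. The sketch below uses plain Python instead of Spark, and `check_binary_labels` is a hypothetical helper, not part of spark_rapids_ml.

```python
from typing import Iterable


def check_binary_labels(labels: Iterable[float]) -> int:
    """Return the number of distinct labels, raising if it is not 2.

    Hypothetical helper for illustration; not part of spark_rapids_ml.
    """
    num_classes = len(set(labels))
    if num_classes != 2:
        raise ValueError(f"expected 2 classes, found {num_classes}")
    return num_classes
```

Whether such a check belongs in the fit path or in the tests is a design choice this thread does not settle.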

wbo4958 · 2023-07-26T03:40:31Z

python/src/spark_rapids_ml/classification.py

+
+            # filter only supported params
+            init_parameters = {
+                k: v for k, v in init_parameters.items() if k in supported_params


Seems supported_params is always an empty list?

Yeah, working on C++/Cython support: rapidsai/cuml#5516
Will address this in the next PR.

Will you be integrating the init params in this PR, now that the cuml init params PR is merged? It would help flesh out the param mapping and the compat tests.

Yeah, not sure actually.

The init PR would change code in multiple places in this PR, which may make this PR too long and add review overhead.

I'm thinking of getting this one merged first. Then I will create a new PR for transform and init and update all the tests in test_logistic_regression.py.
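A small sketch of why the empty list matters, assuming the filter is the dict comprehension shown in the diff: with an empty `supported_params`, every init parameter is dropped. `filter_supported` and the parameter names below are illustrative, not the PR's actual mapping.

```python
from typing import Any, Dict, List


def filter_supported(
    init_parameters: Dict[str, Any], supported_params: List[str]
) -> Dict[str, Any]:
    """Keep only the entries whose key is listed in supported_params."""
    return {k: v for k, v in init_parameters.items() if k in supported_params}


# Illustrative parameter names, not the PR's actual mapping.
params = {"max_iter": 100, "tol": 1e-6, "verbose": False}

print(filter_supported(params, []))            # {} : every key is dropped
print(filter_supported(params, ["max_iter"]))  # {'max_iter': 100}
```

Once the cuml side accepts init params, filling in the list would be enough to let the matching keys through.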

eordentlich · 2023-08-01T18:33:42Z

python/tests/test_logistic_regression.py

@@ -0,0 +1,111 @@
+from typing import Any, Dict, Tuple


It would be good to have a 'compat' test similar to those for other classes, with branches for unsupported APIs.

The 'compat' test requires vector input type. Currently spark-rapids-ml converts vector input into float64, which cuml C++/Cython 23.08 does not support yet.

Do we want to work out a workaround?

lijinf2 · 2023-08-05T00:30:34Z

build

leewyang

One typo (I think), otherwise LGTM

leewyang · 2023-08-08T18:13:47Z

python/src/spark_rapids_ml/classification.py

+            feature_type,
+        ) = super()._pre_process_data(dataset)
+
+        # if input format is vectorUDT, convert data type from float32


"to float32"?
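To make the direction of the conversion concrete, here is a stdlib-only sketch of downcasting float64 values to float32. It is not the PR's code, which works on Spark columns; `to_float32` is a hypothetical name used only here.

```python
from array import array


def to_float32(values):
    """Downcast an iterable of Python floats (float64) to a float32 array."""
    return array("f", values)


features64 = array("d", [0.1, 1.5, 2.0])  # 'd' is C double (float64)
features32 = to_float32(features64)       # 'f' is C float (float32)

print(features64.itemsize, features32.itemsize)
```

Values such as 1.5 survive the downcast exactly, while 0.1 is rounded to the nearest float32, so the conversion trades some precision for compatibility.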

Signed-off-by: Jinfeng <[email protected]>

lijinf2 · 2023-08-08T23:11:22Z

build

lijinf2 force-pushed the lr_fit branch from cdf5d5a to d2036f1 Compare July 25, 2023 23:26

wbo4958 reviewed Jul 26, 2023

eordentlich reviewed Aug 1, 2023

leewyang reviewed Aug 8, 2023

lijinf2 added 7 commits August 8, 2023 16:08

added a test on toy example for LR

d6a9276

Signed-off-by: Jinfeng <[email protected]>

add test comparing cuml on large dataset

649ab3a

Signed-off-by: Jinfeng <[email protected]>

clean code

6a3ad99

Signed-off-by: Jinfeng <[email protected]>

revised accordingly to the new cuml fit wrapper

abe3f9e

Signed-off-by: Jinfeng <[email protected]>

get run_tests.sh passed and added docstring

8c77ac5

Signed-off-by: Jinfeng <[email protected]>

revised PR and ignore test_logistic_regression in 23.06 test environment

2b251f1

Signed-off-by: Jinfeng <[email protected]>

rebase latest with ci 23.08 and revise typo

20fa5d3

lijinf2 force-pushed the lr_fit branch from d61191e to 20fa5d3 Compare August 8, 2023 23:09

leewyang approved these changes Aug 9, 2023

lijinf2 merged commit e6f95f7 into NVIDIA:branch-23.08 Aug 9, 2023
1 check passed

lijinf2 deleted the lr_fit branch March 6, 2024 05:01
