Implement apply() in FIL #5358

Merged: 29 commits into rapidsai:branch-23.06 on May 26, 2023
Conversation

@hcho3 (Contributor) commented on Apr 12, 2023

Replaces #5307

Depends on #5365

@hcho3 requested review from a team as code owners on Apr 12, 2023
@github-actions bot added the CUDA/C++ and Cython / Python labels on Apr 12, 2023
@hcho3 mentioned this pull request on Apr 12, 2023
@hcho3 added the non-breaking and improvement labels on Apr 12, 2023
@csadorf (Contributor) left a comment:

Only reviewed the Python/Cython code: added a few suggestions, but overall LGTM.

Comment on lines +823 to +825
fm = ForestInference.load(
model_path, output_class=True, model_type="xgboost"
)
Contributor: Does it matter in which using_device_type context this is instantiated? If not, should we maybe test expected behavior if this is done wrongly?

Contributor: It should not matter, and yes, we should definitely test that.
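
For concreteness, a minimal sketch of such a test, assuming the experimental FIL import paths, the "cpu"/"gpu" device-type names, and model_path/X fixtures set up as in the surrounding test (all assumptions, not taken from this PR):

    import itertools

    import numpy as np
    from cuml.common.device_selection import using_device_type  # assumed import path
    from cuml.experimental import ForestInference  # assumed import path

    def test_apply_load_device_does_not_matter(model_path, X):
        # Load and infer under every device-type combination; the leaf IDs
        # returned by apply() should be identical regardless of where the
        # model object was instantiated.
        results = {}
        for load_device, infer_device in itertools.product(["cpu", "gpu"], repeat=2):
            with using_device_type(load_device):
                fm = ForestInference.load(
                    model_path, output_class=True, model_type="xgboost"
                )
            with using_device_type(infer_device):
                # With a NumPy input, cuML mirrors the input type, so the
                # result is a host array in either case.
                results[(load_device, infer_device)] = np.asarray(fm.apply(X))
        reference = results[("cpu", "cpu")]
        for pred in results.values():
            np.testing.assert_array_equal(pred, reference)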

model_path, output_class=True, model_type="xgboost"
)

with using_device_type(infer_device):
Contributor: Is there a way to ensure/test that the inference is actually performed on the correct device?
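
One possible (admittedly indirect) probe, sketched below: monkeypatch a spy over the CPU inference entry point and assert it is never hit when the GPU device type is selected. The attribute name _predict_cpu is hypothetical and used purely for illustration; the real internal entry point would have to be looked up in the FIL sources.

    def test_apply_runs_on_selected_device(fm, X, monkeypatch):
        called = {"cpu": False}

        def cpu_spy(*args, **kwargs):
            # Record that the (hypothetical) CPU path was taken.
            called["cpu"] = True
            raise AssertionError("CPU kernel invoked under GPU device type")

        # `_predict_cpu` is a hypothetical internal name, for illustration only.
        monkeypatch.setattr(type(fm), "_predict_cpu", cpu_spy, raising=False)
        with using_device_type("gpu"):
            fm.apply(X)
        assert not called["cpu"]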


with using_device_type(infer_device):
pred_leaf = fm.apply(X).astype(np.int32)
expected_pred_leaf = bst.predict(xgb.DMatrix(X), pred_leaf=True)
Contributor: Unless this is affected by using_device_type(), I'd suggest moving it outside of the context. The same goes for all other code where this principle applies.
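
For concreteness, the restructuring asked for here might look like the sketch below: only the FIL call, which using_device_type actually affects, stays inside the context, while the host-side XGBoost reference prediction moves out (the final comparison line is assumed from the surrounding test):

    expected_pred_leaf = bst.predict(xgb.DMatrix(X), pred_leaf=True)

    with using_device_type(infer_device):
        pred_leaf = fm.apply(X).astype(np.int32)

    np.testing.assert_array_equal(pred_leaf, expected_pred_leaf)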

classification=True,
)

model_path = os.path.join(tmp_path, "xgb_class.model")
Contributor: The tmp_path fixture is a pathlib.Path object, so this is equivalent:

Suggested change:
- model_path = os.path.join(tmp_path, "xgb_class.model")
+ model_path = tmp_path / "xgb_class.model"

preds
If non-None, outputs will be written in-place to this array.
Therefore, if given, this should be a C-major array of shape
n_rows * n_trees.
Contributor: You use an X symbol in row 1254; maybe use the same here to be consistent?

Suggested change:
- n_rows * n_trees.
+ n_rows X n_trees.
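
As a usage sketch of the in-place path this docstring describes, assuming preds is accepted as a keyword argument of apply(), that the tree count is known for the loaded model, and that the buffer dtype matches FIL's output type (all assumptions):

    n_rows = X.shape[0]
    n_trees = 100  # assumed known for the loaded model
    # Preallocate a C-major (row-major) buffer of shape n_rows X n_trees;
    # apply() then writes the per-tree leaf IDs into it in place.
    preds = np.empty((n_rows, n_trees), order="C", dtype=np.float32)
    fm.apply(X, preds=preds)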

@wphicks (Contributor) left a comment:

Looking pretty good. After we changed our approach a bit for predict_per_tree, though, I'm wondering if we can't simplify the logic a bit for apply as well.

@wphicks (Contributor) left a comment:

This is a really nice refactor of the previous implementation! I think we can simplify the logic for output indexing just a little further, but otherwise (assuming perf testing shakes out), this looks perfect.

categorical_data,
infer_type);
if (infer_type == infer_kind::leaf_id) {
infer_kernel_cpu<has_categorical_nodes, true>(
@hcho3 (Author): I'm only adding the predict_leaf template parameter to the CPU kernel. Adding it to the GPU kernel adds too much boilerplate.

@dantegd (Member) commented on May 26, 2023:

/merge

@rapids-bot merged commit 36c1ea9 into rapidsai:branch-23.06 on May 26, 2023
Labels: CUDA/C++ · Cython / Python · improvement · non-breaking