Add weighted_n_node_samples field in sklearn importer #330
Codecov Report
@@             Coverage Diff              @@
##           mainline     #330      +/-   ##
============================================
- Coverage     85.06%   84.16%   -0.90%
  Complexity       42       42
============================================
  Files           108      108
  Lines          8374     8350      -24
  Branches         40       40
============================================
- Hits           7123     7028      -95
- Misses         1228     1299      +71
  Partials         23       23
Looks great!
The 2.2.0 version of Treelite incorporates the following major improvements:
* dmlc/treelite#314
* dmlc/treelite#322, dmlc/treelite#327
* dmlc/treelite#325
* dmlc/treelite#332
* dmlc/treelite#330
* dmlc/treelite#333
* dmlc/treelite#334
* dmlc/treelite#304
* dmlc/treelite#335

In particular, dmlc/treelite#332, dmlc/treelite#330, dmlc/treelite#333 are required for #4447.

Requires rapidsai/integration#412.

EDIT. Using 2.2.1 patch release, to incorporate a hotfix (dmlc/treelite#340).

Authors:
- Philip Hyunsu Cho (https://github.com/hcho3)

Approvers:
- AJ Schmidt (https://github.com/ajschmidt8)
- Dante Gama Dessavre (https://github.com/dantegd)

URL: #4484
Add support for:
- [x] cuML RF classifiers
- [x] scikit-learn RF regressors
- [x] scikit-learn RF classifiers

TODOs
- [x] Add test cases
- [x] De-duplicate path extraction logic

Requires dmlc/treelite#330

Authors:
- Philip Hyunsu Cho (https://github.com/hcho3)

Approvers:
- William Hicks (https://github.com/wphicks)
- Dante Gama Dessavre (https://github.com/dantegd)

URL: #4447
Scikit-learn tree models have two fields for sample counts in nodes: `n_node_samples` (unweighted count, int64) and `weighted_n_node_samples` (weighted count, float64). So far, Treelite kept only the unweighted count and discarded the weighted count.

This PR stores `weighted_n_node_samples` in the Treelite object, using the `sum_hess` field.

Required by rapidsai/cuml#4447
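
To make the difference between the two counts concrete, here is a minimal pure-Python sketch. It is not scikit-learn's or Treelite's implementation: the `Node` class and the `count_samples` helper are hypothetical and exist only for illustration. It routes a few weighted samples through a one-split tree and records, per node, how many samples arrive (the analogue of `n_node_samples`) and the sum of their weights (the analogue of `weighted_n_node_samples`).

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Node:
    """Hypothetical tree node, used only for this illustration."""
    feature: Optional[int] = None          # None marks a leaf
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    n_node_samples: int = 0                # unweighted count
    weighted_n_node_samples: float = 0.0   # sum of sample weights


def count_samples(root: Node,
                  X: Sequence[Sequence[float]],
                  sample_weight: Sequence[float]) -> None:
    """Accumulate both counts on every node that each sample passes through."""
    for row, weight in zip(X, sample_weight):
        node: Optional[Node] = root
        while node is not None:
            node.n_node_samples += 1
            node.weighted_n_node_samples += weight
            if node.feature is None:       # reached a leaf
                break
            # Samples with value <= threshold go left, the rest go right.
            if row[node.feature] <= node.threshold:
                node = node.left
            else:
                node = node.right


# One split on feature 0 at 0.5, with two leaves.
root = Node(feature=0, threshold=0.5, left=Node(), right=Node())
X: List[List[float]] = [[0.1], [0.4], [0.9]]
sample_weight = [1.0, 2.5, 0.5]
count_samples(root, X, sample_weight)

for name, node in (("root", root), ("left", root.left), ("right", root.right)):
    print(name, node.n_node_samples, node.weighted_n_node_samples)
```

With uniform weights of 1.0 the two counts coincide; they diverge only when samples carry non-uniform weights, which is the case where discarding the weighted count loses information.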