
Use DFS to get a valid but smaller new node order #570

Merged · 5 commits · Jul 13, 2024
Conversation

@tmct (Contributor) commented Jun 24, 2024

Please see #571 for details

@tmct (Contributor, Author) commented Jun 24, 2024

Resolves #571

@hcho3 (Collaborator) commented Jun 24, 2024

Hi, thanks for submitting a pull request. The fix looks good to me.

Can you add a unit test that uses a LightGBM tree with depth > 32?

@tmct (Contributor, Author) commented Jun 24, 2024

Good idea. I've synthesised a model that reproduces the issue; I will try to get it into a unit test.

'tree\nversion=v4\nnum_class=1\nnum_tree_per_iteration=1\nlabel_index=0\nmax_feature_idx=0\nobjective=regression\nfeature_names=this\nfeature_infos=[0:100]\ntree_sizes=1119\n\nTree=0\nnum_leaves=32\nnum_cat=0\nsplit_feature=0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\nsplit_gain=0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1\nthreshold=1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\ndecision_type=2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2\nleft_child=1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 -31\nright_child=-1 -2 -3 -4 -5 -6 -7 -8 -9 -10 -11 -12 -13 -14 -15 -16 -17 -18 -19 -20 -21 -22 -23 -24 -25 -26 -27 -28 -29 -30 -32\nleaf_value=31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 0 1\nleaf_weight=1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\nleaf_count=1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\ninternal_value=0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\ninternal_weight=1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\ninternal_count=1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\nis_linear=0\nshrinkage=1\n\n\nend of trees\n\nfeature_importances:\nthis=31\n\npandas_categorical:null\n'
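To illustrate what the hand-made chain-shaped model above is exercising, here is a minimal sketch (not Treelite's actual implementation) of why a BFS/heap-style node numbering overflows on deep trees while a DFS numbering does not. In a heap layout, node `i`'s children get ids `2*i+1` and `2*i+2`, so a node at depth `d` can carry an id as large as `2**(d+1) - 2`, which exceeds the signed 32-bit range once `d > 30`; a DFS numbering assigns ids in visitation order, so the largest id is simply `num_nodes - 1` regardless of depth. The function names below are hypothetical, chosen for this demonstration only.

```python
INT32_MAX = 2**31 - 1


def max_bfs_id(depth: int) -> int:
    """Largest heap-style (BFS) id any node at the given depth can have."""
    return 2 ** (depth + 1) - 2


def dfs_number(num_nodes: int) -> list[int]:
    """DFS numbering of any tree shape: consecutive ids 0..num_nodes-1."""
    return list(range(num_nodes))


# A chain-shaped tree like the model above has only one or two nodes per
# level, yet its deepest node's BFS id still grows as roughly 2**depth.
print(max_bfs_id(31) > INT32_MAX)  # True: the BFS id no longer fits in int32
print(max(dfs_number(63)))         # 62: a 63-node chain stays well within int32
```

This is the essence of the fix: numbering nodes in DFS order keeps ids bounded by the actual node count instead of by the tree's depth.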

codecov bot commented Jun 25, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 84.66%. Comparing base (27b81f7) to head (8175573).
Report is 5 commits behind head on mainline.

Additional details and impacted files
@@             Coverage Diff              @@
##           mainline     #570      +/-   ##
============================================
+ Coverage     84.58%   84.66%   +0.08%     
============================================
  Files            75       75              
  Lines          6547     6549       +2     
  Branches        528      528              
============================================
+ Hits           5538     5545       +7     
+ Misses         1009     1004       -5     


@hcho3 (Collaborator) commented Jun 25, 2024

@tmct Can you post the code that generated the model? It would be nice to not add the model file itself to the git repo.

@tmct (Contributor, Author) commented Jun 25, 2024

The model is effectively handmade outside of LightGBM using the LightGBM model format, and I can't share the specific construction code for it, sorry. I suspect it would be very difficult to produce such a small model with this depth property through the natural training dynamics of LightGBM, though you might be able to achieve it through clever use of features such as sample weights and custom objectives.

The model file above is a string of less than 2 kB; I made it as small as I could while still having it fail the (pre-fix) test.

@hcho3 (Collaborator) commented Jun 25, 2024

I see. Let's include the model file in the repo then. If you need help setting up the test, ping me.

@tmct (Contributor, Author) commented Jun 25, 2024

Thank you, will do.

@hcho3 (Collaborator) commented Jul 10, 2024

@tmct Do you mind if I take over this pull request? I'd like to include this as part of the next release of Treelite.

@tmct (Contributor, Author) commented Jul 11, 2024

@hcho3 I would be delighted if you could finish the tests for me, please. Sorry that I have not yet found the time.

@hcho3 merged commit f1b910e into dmlc:mainline on Jul 13, 2024 (19 checks passed)
@hcho3 (Collaborator) commented Jul 13, 2024

Done!

@tmct (Contributor, Author) commented Jul 13, 2024

Many thanks!

@tmct deleted the patch-1 branch on July 19, 2024.
rapids-bot pushed a commit to rapidsai/cuml that referenced this pull request on Jul 24, 2024:
Treelite 4.3.0 contains the following improvements:

* Support XGBoost 2.1.0, including the UBJSON format (dmlc/treelite#572, dmlc/treelite#578)
* [GTIL] Allow inferencing with FP32 input + FP64 model (dmlc/treelite#574). Related: triton-inference-server/fil_backend#391
* Prevent integer overflow for deep LightGBM trees by using DFS order (dmlc/treelite#570).
* Support building with latest RapidJSON (dmlc/treelite#567)

Authors:
  - Philip Hyunsu Cho (https://github.com/hcho3)

Approvers:
  - James Lamb (https://github.com/jameslamb)
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #5968