-
Notifications
You must be signed in to change notification settings - Fork 100
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Hotfix for serializer #273
Conversation
@@ -44,7 +44,7 @@ inline void TestRoundTrip(treelite::Model* model) { | |||
std::unique_ptr<treelite::Model> received_model = treelite::Model::DeserializeFromFile(fp); | |||
std::fclose(fp); | |||
|
|||
ASSERT_EQ(TreeliteToBytes(model), TreeliteToBytes(received_model.get())); | |||
ASSERT_TRUE(TreeliteToBytes(model) == TreeliteToBytes(received_model.get())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ASSERT_EQ
will dump all the raw bytes into a string, leading to OOM error for this test case. So use ASSERT_TRUE
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add a comment to this effect so that someone doesn't naively reintroduce this problem down the line?
tree->SetLeafNode(3, frontend::Value::Create<LeafOutputType>(tree_id + 3)); | ||
tree->SetLeafNode(4, frontend::Value::Create<LeafOutputType>(tree_id + 1)); | ||
tree->SetLeafNode(5, frontend::Value::Create<LeafOutputType>(tree_id + 4)); | ||
tree->SetLeafNode(6, frontend::Value::Create<LeafOutputType>(tree_id + 2)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I crafted the test case so that all the trees will have different leaf outputs. The bug being addressed causes all trees to be identical.
Codecov Report
@@ Coverage Diff @@
## mainline #273 +/- ##
==============================================
+ Coverage 84.31% 84.68% +0.36%
Complexity 46 46
==============================================
Files 97 97
Lines 7442 7618 +176
Branches 50 50
==============================================
+ Hits 6275 6451 +176
Misses 1142 1142
Partials 25 25
Continue to review full report at Codecov.
|
@@ -925,6 +925,7 @@ ModelImpl<ThresholdType, LeafOutputType>::InitFromPyBuffer( | |||
|
|||
auto tree_hanlder = [&begin](Tree<ThresholdType, LeafOutputType>& tree) { | |||
tree.InitFromPyBuffer(begin, begin + kNumFramePerTree); | |||
begin += kNumFramePerTree; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we're capturing begin
by reference, won't this iterate begin forward as a side effect of executing the lambda? It's not immediately clear to me why that is the correct thing to do with this handler. Maybe add a comment or refactor such that the lambda returns the next iterator to be processed rather than directly changing begin
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we're capturing begin by reference, won't this iterate begin forward as a side effect of executing the lambda?
Yes, this is intended behavior.
refactor such that the lambda returns the next iterator to be processed
That would not be feasible, since the tree handler for the file stream does not involve iterators.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great! Can we get a comment just to make it clear to future developers that the side-effect is intentional?
@@ -44,7 +44,7 @@ inline void TestRoundTrip(treelite::Model* model) { | |||
std::unique_ptr<treelite::Model> received_model = treelite::Model::DeserializeFromFile(fp); | |||
std::fclose(fp); | |||
|
|||
ASSERT_EQ(TreeliteToBytes(model), TreeliteToBytes(received_model.get())); | |||
ASSERT_TRUE(TreeliteToBytes(model) == TreeliteToBytes(received_model.get())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add a comment to this effect so that someone doesn't naively reintroduce this problem down the line?
Upgrade to Treelite 1.3.0 to take advantage of the following new features: * Faster model import for scikit-learn tree models (dmlc/treelite#264). Fixes #3768 * Binary serializer to a file stream (dmlc/treelite#270, dmlc/treelite#273) * [EXPERIMENTAL] Add GTIL, reference inference backend (dmlc/treelite#274) Make progress towards #3853 Depends on rapidsai/integration#270 Authors: - Philip Hyunsu Cho (https://github.com/hcho3) Approvers: - William Hicks (https://github.com/wphicks) - AJ Schmidt (https://github.com/ajschmidt8) - Dante Gama Dessavre (https://github.com/dantegd) URL: #3855
Upgrade to Treelite 1.3.0 to take advantage of the following new features: * Faster model import for scikit-learn tree models (dmlc/treelite#264). Fixes rapidsai#3768 * Binary serializer to a file stream (dmlc/treelite#270, dmlc/treelite#273) * [EXPERIMENTAL] Add GTIL, reference inference backend (dmlc/treelite#274) Make progress towards rapidsai#3853 Depends on rapidsai/integration#270 Authors: - Philip Hyunsu Cho (https://github.com/hcho3) Approvers: - William Hicks (https://github.com/wphicks) - AJ Schmidt (https://github.com/ajschmidt8) - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#3855
Failing tests in rapidsai/cuml#3829 revealed a subtle bug in #270. It appears that the testing suite in Treelite was not sufficiently robust to let this bug through 😞
The bug causes multiple trees to deserialize from the same set of frames.
I added a few test cases that are affected by the bug. The current Treelite version will fail these tests.
Once this PR is merged, I will publish a patch release (1.2.1).