expensive CHECK_GE #5722
Comments
There was a patch in dmlc-core for reducing the overhead from checks. I believe it should help once we update dmlc-core. @hcho3 Correct me if I'm wrong.
@trivialfis Yes, that's correct.
dmlc/dmlc-core#613? I think it solves a slightly different issue. In any case, I've updated dmlc-core and ran the test again after adding another `if` to avoid the call.
avg = 7.53s vs 9.37s (25% data loading time win). I'll submit a PR with that change to dmlc.
Is this CHECK_GE useful for catching any potential bug?
xgboost/src/data/data.cc
Lines 653 to 658 in 91c6463
It's fairly expensive for what it does: removing it reduces data loading time from ~9s to ~7s on the Higgs dataset (the whole training time is ~1m on that machine, so ~3% of the total).
The call to `LogCheck_GE` is not inlined (gcc 7.4, x86):
disasm:
Forcing it to be inlined is likely not a good idea, as that function is pretty big.
If the check is needed, one way to improve would be to short-circuit the condition in the macro itself and avoid the call:
That'd still be a 1-2 second win (on that specific configuration). I can make the change, but if it's OK to just remove the check, that would be easier.
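For illustration, the short-circuit idea described above could look roughly like the sketch below. This is not the dmlc-core macro and not the code originally posted in this issue: `SKETCH_CHECK_GE`, `FailCheckGE`, and `CountValid` are hypothetical names, and the failure path here throws `std::runtime_error` rather than going through dmlc's logging. The only point is that the comparison happens inline, and the larger message-building function is called only when the check actually fails.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical failure path. All the expensive work (stream setup, message
// formatting) lives here, so it only runs when a check fails.
template <typename X, typename Y>
[[noreturn]] void FailCheckGE(const X& x, const Y& y, const char* expr) {
  std::ostringstream os;
  os << "Check failed: " << expr << " (" << x << " vs. " << y << ")";
  throw std::runtime_error(os.str());
}

// Hypothetical macro: evaluate each operand once, compare inline, and call
// the failure function only if the condition is false. Unlike the real
// CHECK_GE, this sketch does not support streaming an extra message.
#define SKETCH_CHECK_GE(x, y)                            \
  do {                                                   \
    const auto& sketch_x_ = (x);                         \
    const auto& sketch_y_ = (y);                         \
    if (!(sketch_x_ >= sketch_y_)) {                     \
      FailCheckGE(sketch_x_, sketch_y_, #x " >= " #y);   \
    }                                                    \
  } while (0)

// Hypothetical hot loop: the common (passing) case costs one compare and one
// branch per element, with no function call.
std::size_t CountValid(const std::vector<std::int64_t>& indices) {
  std::size_t n = 0;
  for (std::int64_t idx : indices) {
    SKETCH_CHECK_GE(idx, 0);
    ++n;
  }
  return n;
}
```

With this shape, `CountValid({0, 1, 2})` returns 3 without ever calling `FailCheckGE`, while an input containing a negative index throws. Whether the compiler keeps the failure function out of line is still up to the optimizer; the sketch only guarantees that it is not called on the passing path.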