Nasty bugs in custom update kernels #439
Ironically, I was adding some new functionality in parallel and extending the tests to cover it, and those tests were starting to fail because of the second of these issues, so the testing was nearly good enough 😟
Codecov Report
@@            Coverage Diff             @@
##           master     #439      +/-   ##
==========================================
+ Coverage   87.91%   87.94%   +0.03%
==========================================
  Files          78       78
  Lines       16445    16494      +49
==========================================
+ Hits        14457    14506      +49
  Misses       1988     1988
Sorry for the delay. I will have to trust you on the details of this one.
Is there a very typical case you can add as a test?
Then I would say, let's go ahead and merge these.
Two really nasty bugs have come to light as I've started building models which do more complex stuff with custom updates:
Firstly, we were building the group start indices (the arrays that get binary searched to find the index of the right data if groups get merged) across all custom update groups. This is clearly wrong, as separate kernel(s) are generated for each update group.
Secondly, in the CUDA kernel, the group start indices for custom updates and custom weight updates weren't being generated together, so if there were enough of either to require binary searching, there would be problems.
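For anyone reviewing without the code generator in their head, here is a minimal host-side sketch of the idea behind both fixes. It is not the code from this PR: `MergedGroup`, `buildStartIDs` and `findGroup` are hypothetical names, and it assumes that custom updates and custom weight updates belonging to the same update group share one range of thread indices. Start indices are accumulated separately for each update group (each one restarts at zero, matching a separate kernel), and every merged group in an update group, whichever kind it is, advances the same running offset.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical description of one merged group: the update group it
// belongs to and the number of threads it needs in that group's kernel.
struct MergedGroup {
    std::string updateGroup;
    unsigned int numThreads;
};

// Build start indices separately for each update group. Thread indices
// restart at zero in every kernel, so the running offset is tracked per
// update group rather than across all of them. Custom updates and custom
// weight updates are assumed to be passed in the same vector so that
// they share this offset.
std::map<std::string, std::vector<unsigned int>> buildStartIDs(
    const std::vector<MergedGroup> &groups)
{
    std::map<std::string, std::vector<unsigned int>> startIDs;
    std::map<std::string, unsigned int> nextID;  // value-initialised to 0
    for (const auto &g : groups) {
        startIDs[g.updateGroup].push_back(nextID[g.updateGroup]);
        nextID[g.updateGroup] += g.numThreads;
    }
    return startIDs;
}

// Binary search for the merged group that owns thread index `id`:
// the last entry in `starts` that is less than or equal to `id`.
// Assumes `starts` is non-empty, sorted, and begins with 0.
std::size_t findGroup(const std::vector<unsigned int> &starts, unsigned int id)
{
    const auto it = std::upper_bound(starts.begin(), starts.end(), id);
    return static_cast<std::size_t>(it - starts.begin()) - 1;
}
```

With this layout, a lookup in one update group's kernel never sees offsets contributed by another update group, which is the failure mode described in the first bug.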