[CI] MachineLearningLicensingTests.testAutoCloseJobWithDatafeed fails with NPE #51285
Comments
Pinging @elastic/ml-core (:ml)
@henningandersen how long did you sleep for to make it easily reproducible? I tried a variety of values from 20ms to 9 seconds and none of them reproduced the problem. (10 seconds or more causes an unrelated failure because an [...]) Having said that, I can see a theoretical flaw in the code, and a reason why adding a sleep would make it more likely to happen. So I am happy to fix that today.
The ID of the datafeed's associated job was being obtained frequently by looking up the datafeed task in a map that was being modified in other threads. This could lead to NPEs if the datafeed stopped running at an unexpected time. This change reduces the number of places where a datafeed's associated job ID is looked up to avoid the possibility of failures when the datafeed's task is removed from the map of running tasks during multi-step operations in other threads. Fixes elastic#51285
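For anyone reading this later, here is a minimal sketch of the kind of race described above and of the "look the job ID up once" approach. It is not the actual Elasticsearch code: `DatafeedJobIdSketch`, `Holder`, `running`, `autoCloseFragile` and `autoCloseSafe` are all hypothetical names, and the `concurrentStep` hook only simulates another thread removing the datafeed task part-way through a multi-step operation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class DatafeedJobIdSketch {

    /** Hypothetical stand-in for the per-datafeed state kept while a datafeed runs. */
    static final class Holder {
        final String jobId;

        Holder(String jobId) {
            this.jobId = jobId;
        }
    }

    /** Hypothetical map of running datafeed tasks, modified by other threads. */
    static final Map<Long, Holder> running = new ConcurrentHashMap<>();

    /**
     * Fragile variant: the job ID is looked up in the map a second time after
     * another step has run. If the task was removed in between, the second
     * lookup returns null and dereferencing it throws a NullPointerException.
     */
    static String autoCloseFragile(long taskId, Runnable concurrentStep) {
        Holder holder = running.get(taskId);
        if (holder == null) {
            return null; // datafeed already stopped
        }
        concurrentStep.run(); // simulates work done by another thread
        return "closing job [" + running.get(taskId).jobId + "]";
    }

    /**
     * Safer variant: the job ID is captured once, up front, and the captured
     * value is used for the rest of the operation, so a later removal from
     * the map cannot cause an NPE here.
     */
    static String autoCloseSafe(long taskId, Runnable concurrentStep) {
        Holder holder = running.get(taskId);
        if (holder == null) {
            return null; // datafeed already stopped
        }
        String jobId = holder.jobId; // single lookup
        concurrentStep.run(); // simulates work done by another thread
        return "closing job [" + jobId + "]";
    }
}
```

In this sketch, passing a `concurrentStep` that removes the task from `running` makes `autoCloseFragile` throw an NPE, while `autoCloseSafe` still completes with the captured job ID. That would also be consistent with a sleep making the failure more likely, since it widens the window between the two lookups.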
I think #51302 will fix this, but it's hard to be certain as I couldn't reproduce it locally.
Thanks @droberts195, you are right, it is only surfaced due to the added assertion in my PR (sorry, I should have double-checked against master before reporting here). It looks like the resulting NPE is ignored in [...]. I used a sleep of 1 second. I checked after merging in your changes locally and it seems to work fine, thanks for fixing this.
MachineLearningLicensingTests.testAutoCloseJobWithDatafeed fails rarely with NPE. Can be reproduced by adding a sleep in the onResponse method here and using the following repro line:
It failed on a couple of PR builds, for instance this one:
https://gradle-enterprise.elastic.co/s/doocabu3vhmdm/tests/7bifs6bt3pims-kf5sbxwq5fdpm
The assertion error in that build is new code, but caused by the NullPointerException: