CI Failures: 80_data_frame_jobs_crud #47943
Comments
Pinging @elastic/ml-core (:ml/Transform)
@benwtrent five more failures with this one in the last 30 minutes.
Looks like those 5 failures are all on the 7.x branch w/ varying
@nknize for sure. I will mute the tests
This shows TWO bugs: the predicate for _stop was not satisfied even though the task was indeed cancelled and removed, and we are failing to create a checkpoint in a mixed cluster. The first bug probably has a couple of causes:
The error trace (when running in a mixed 7.3.0, 7.5.0 cluster):
This occurred in the Don't let the trace fool you, this The transform task was then assigned to the new 7.5.0 node and it attempted to create a new checkpoint since it is attempting to
The task is DEFINITELY cancelled and removed as I see these logs later:
Also, there MAY be another funky issue as I see this in the logs too (from the old nodes):
This explains the missing index: the template upgrader only runs on the master node, if the master node was The issue, however, already exists between (7.2, 7.3) and 7.4, because in 7.4 we changed the index name, too. But some tests do exist or are blacklisted for older versions, so maybe we were "lucky". The situation is odd. I see 2 solutions: A) do not run the task on a node newer than master (a flavor of that could be a minimum_compat_version check, assuming we do not introduce a new internal index with every minor)
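To make option A concrete, here is a rough sketch of the node filter it implies: only nodes whose version is not newer than the master's are eligible to run the task. This is illustrative Python, not Elasticsearch code; `Version`, `eligible_nodes`, and the node ids are hypothetical names made up for the example, and the real assignment logic may look quite different.

```python
from typing import Dict, List, NamedTuple


class Version(NamedTuple):
    """Hypothetical major.minor.patch version; tuples compare field by field."""
    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text: str) -> "Version":
        major, minor, patch = (int(part) for part in text.split("."))
        return cls(major, minor, patch)


def eligible_nodes(master_version: str, node_versions: Dict[str, str]) -> List[str]:
    """Return the ids of nodes that are not newer than the master node.

    Assumption: a node newer than master may expect an internal index or
    template that the (older) master has not created yet, so it is skipped.
    """
    master = Version.parse(master_version)
    return sorted(
        node_id
        for node_id, version in node_versions.items()
        if Version.parse(version) <= master
    )


if __name__ == "__main__":
    # Mixed 7.3.0 / 7.5.0 cluster with an old master: the 7.5.0 node is skipped.
    cluster = {"node-0": "7.3.0", "node-1": "7.5.0", "node-2": "7.3.0"}
    print(eligible_nodes("7.3.0", cluster))
```

With a filter like this, the task would stay on old nodes until the master itself is upgraded, at which point every node becomes eligible again.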
#46553 was supposed to do this. There must be some situation where it doesn’t do it quickly enough or its attempt to create the template fails.
add alias for backwards compatibility with 7.4 relates #47943
Fixed with #48247.
These tests have failed for quite a while on 7.x:
https://groups.google.com/a/elastic.co/forum/#!searchin/build-elasticsearch/80_data_frame_jobs_crud%7Csort:date
Don't think we can mute yml tests though?