Restarted node takes a long time to become useful due to re-processing already terminal allocations #20354
Hi @marvinchin, nice to hear from you again!
The catch here is that the client doesn't know the allocations are terminal unless it's already seen them, and it doesn't know whether it has seen an allocation before on startup if it doesn't have an alloc runner for it. And the server doesn't necessarily know the client doesn't have an alloc runner for them, because the client could have restarted and come back up without missing a heartbeat. If we zoom out a bit from your code snippet:

```go
for allocID, modifyIndex := range resp.Allocs {
	// Pull the allocation if we don't have an alloc runner for the
	// allocation or if the alloc runner requires an updated allocation.
	//XXX Part of Client alloc index tracking exp
	c.allocLock.RLock()
	currentAR, ok := c.allocs[allocID]
	c.allocLock.RUnlock()

	// Ignore alloc updates for allocs that are invalid because of
	// initialization errors
	c.invalidAllocsLock.Lock()
	_, isInvalid := c.invalidAllocs[allocID]
	c.invalidAllocsLock.Unlock()

	if (!ok || modifyIndex > currentAR.Alloc().AllocModifyIndex) && !isInvalid {
		// Only pull allocs that are required. Filtered allocs might be
		// at a higher index, so ignore it.
		if modifyIndex > pullIndex {
			pullIndex = modifyIndex
		}
		pull = append(pull, allocID)
	} else {
		filtered[allocID] = struct{}{}
	}
}
```

It might even be possible to leave those stopped allocs out of the `.Allocs` field entirely.
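For reference, the `resp` being ranged over here comes from the `Node.GetClientAllocs` RPC; to the best of my understanding its relevant shape is roughly the following (struct and field names recalled from the codebase, so treat this as an approximation rather than a definitive reference):

```go
// Approximate shape of the reply the client loop above consumes: the server
// sends only alloc ID -> modify index, not the allocation itself or its
// status. The client compares each index against its local alloc runner's
// copy to decide whether to pull the full allocation.
type NodeClientAllocsResponse struct {
	Allocs map[string]uint64 // alloc ID -> AllocModifyIndex

	// (other fields such as migrate tokens and query metadata omitted)
}
```

Because the map value is only an index, the client can't tell from this response alone that an allocation is terminal; it has to pull the full allocation to find out, which is exactly what makes the restart slow.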
Likewise :)
Ah, I didn't notice that! Thanks for pointing that out.
Hm, that's an interesting idea. My immediate thought is that we wouldn't be able to distinguish between an allocation that has been GC-ed from the server and one that has reached a terminal client state. But maybe it doesn't matter, because they should be handled the same way anyway? I'm not confident in this at all, though!
Hi @tgross, we might be able to work on a PR for this issue 😄 Have you had the chance to see whether leaving the stopped allocs out of the `.Allocs` field is a workable approach? Otherwise, I'm tempted to go with the more straightforward approach of adding the desired status of the allocation to the response and filtering based on that. Let me know which you'd prefer!
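For concreteness, a minimal sketch of what that more straightforward approach might look like, assuming we extend the response with an invented `DesiredStatuses` field (this is illustrative, not the actual Nomad API):

```go
// Hypothetical extension of NodeClientAllocsResponse: alongside the modify
// indexes, carry each allocation's desired status so the client can filter
// terminal allocs without pulling them. DesiredStatuses is an invented name.
type NodeClientAllocsResponse struct {
	Allocs map[string]uint64 // alloc ID -> AllocModifyIndex (existing)

	// DesiredStatuses maps alloc ID -> desired status, e.g. "run" or "stop".
	DesiredStatuses map[string]string
}
```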
Hi @marvinchin, that'd be great!
Yeah, we almost certainly don't want to do that. If the alloc is simply missing from the list, the client takes that as a signal not only that it should stop the alloc but that the alloc should be client-side garbage collected (because the server no longer knows about it; this typically happens for clients that have been offline for a while and have reconnected). That would cause allocation logs to be cleaned up immediately after an allocation gets rescheduled, and we don't want that.
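To paraphrase why that's dangerous (a rough restatement of the behavior described above, not the literal Nomad source):

```go
// Rough paraphrase, not the literal Nomad code: allocs the client knows
// about but the server's response omits are treated as "the server no
// longer knows this alloc" and are destroyed AND garbage collected locally,
// which removes their logs and data directories.
for allocID := range c.allocs {
	if _, known := resp.Allocs[allocID]; !known {
		c.destroyAndGC(allocID) // illustrative helper name, not a real method
	}
}
```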
Sorry for the delay; just wanted to mention that I haven't managed to find the time to work on this yet, in case anyone else is interested in picking it up. In the meantime we've been working around this by reducing the GC interval, to avoid accumulating too many allocations in the state.
Nomad version
1.6.x
Operating system and Environment details
Unix
Issue
When a client is restarted, it spends a long time on startup re-processing allocations that are already known to be terminal and had previously been GC-ed from its state.
As a result, the client node takes a long time to start new allocations assigned to it, because it does not poll the server for new allocations until it has finished processing the initial set of allocations it receives. In our environment, we have observed this delay to be on the order of minutes for busy client nodes.
This is similar to #16381, but focuses on the impact on the client. #17074 fixed the server-side issue that caused the modify index of allocations to be bumped on client restart, but the impact on the client remains.
Relevant code
Looking at the client's logic for filtering out allocations to process, in `nomad/client/client.go` (lines 2374 to 2384 at commit `594fedb`):
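For readers without the repo handy, those lines are the tail of the filtering loop quoted earlier in this thread; the decisive part is the pull condition, where `!ok` (no local alloc runner) always triggers a pull, even for terminal allocs:

```go
if (!ok || modifyIndex > currentAR.Alloc().AllocModifyIndex) && !isInvalid {
	// Only pull allocs that are required. Filtered allocs might be at a
	// higher index, so ignore it.
	if modifyIndex > pullIndex {
		pullIndex = modifyIndex
	}
	pull = append(pull, allocID)
} else {
	filtered[allocID] = struct{}{}
}
```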
It appears that it will pull allocations which are both:

- in a terminal state (according to the server), and
- not tracked by a local alloc runner (because they were previously GC-ed from the client's state).

My understanding is that such allocations need not be pulled, since they are already known to be terminal and there is no local state for them. If this is true, then we can simply update the filtering logic to filter out such allocations (see the sketch below), and that should fix the issue.
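If desired-status information were made available to this loop (for example via the response extension discussed in the comments above), the filtering change could look roughly like the following; `resp.DesiredStatuses` is the same hypothetical field as above, not an existing API, and this is a sketch rather than a finished fix:

```go
// Sketch of the proposed filter, under the assumption that the response now
// carries per-alloc desired statuses (hypothetical DesiredStatuses field).
// An alloc with no local runner that the server already wants stopped has
// nothing to run and no local state to clean up, so skip pulling it.
if !ok && resp.DesiredStatuses[allocID] == structs.AllocDesiredStatusStop {
	filtered[allocID] = struct{}{}
	continue
}
// ...existing pull/filter logic continues unchanged...
```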
Reproduction steps
Same as in #16381, in particular the part where we observe in the client logs that the client pulls an allocation even though both of the allocations it receives are already terminal.
Expected Result
Clients should not re-process allocations which are terminal on restart.
Actual Result
Clients re-process terminal allocations on restart, causing delays in starting new allocations.