Introduce assertions that IO does not occur on network threads #54066
Comments
Pinging @elastic/es-core-infra (:Core/Infra/Core)
This is an interesting idea! One note on implementation: we would need to think about how to compose with, or still allow extending, the filesystem abstraction, since external systems sometimes use this with Elasticsearch (e.g. a quota-aware filesystem).
++. Another area we will need to watch is logging (including audit logging, see #39658), where filesystem access currently occurs on a network thread.
We discussed this issue in a Core/Infra team meeting today. The team agreed that these assertions would be very good to have, and would serve as guardrails to help us catch tricky performance issues in tests rather than in production. In short, this is something we should plan to do, and we should aim to do it before the final minor release in the 7.x line so that we have these guardrails in the 7.x code as we maintain it.

What we don't know is how much of a drag these extra assertions would put on our test times, and we need to find that out in order to discuss the tradeoffs. So the first effort at this issue should be some experiments that show the effect of this change on our tests, followed by another round of team discussion.

We also noted that file I/O is not the only slow or blocking operation that can inadvertently clog up our network threads, and we should think about whether there are any other situations for which we could add similar guardrails.
We have already implemented guardrail assertions against calling elasticsearch/server/src/main/java/org/elasticsearch/common/util/concurrent/BaseFuture.java Lines 91 to 96 in b01322e
As a rule we should not be touching the filesystem on a network thread (`transport_worker` or `http_server_worker`), but today we do not assert this to be the case. It is all too easy to inadvertently move some filesystem access onto a network thread (see e.g. #53985 for a recent example of this). @jpountz suggested that it should be possible to introduce assertions during all IO to ensure that it does not occur on a network thread. For instance, we could implement a `FileSystemProvider` to check the current thread's name (cf. `Transports#assertNotTransportThread`).
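To make the proposal concrete, here is a minimal sketch of the thread-name check it describes. It is not the Elasticsearch implementation: the class `NetworkThreadGuard`, its methods, and the substring matching on thread names are all assumptions made for illustration. Only the two thread-name fragments come from the issue text. A real `FileSystemProvider` wrapper would delegate every operation to the underlying provider and call a check like this first; a single guarded read stands in for that here.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper for illustration only; not part of Elasticsearch. */
final class NetworkThreadGuard {

    // Thread-name fragments named in the issue. Matching by substring is an assumption.
    private static final String[] NETWORK_THREAD_MARKERS = { "transport_worker", "http_server_worker" };

    private NetworkThreadGuard() {}

    /** Returns true if the given thread name contains one of the network-thread markers. */
    static boolean isNetworkThread(String threadName) {
        for (String marker : NETWORK_THREAD_MARKERS) {
            if (threadName.contains(marker)) {
                return true;
            }
        }
        return false;
    }

    /**
     * Trips a Java assertion if the current thread looks like a network thread.
     * The check only fires when assertions are enabled (-ea), so it costs nothing
     * in production. Returns true so it can itself be used inside an assert statement.
     */
    static boolean assertNotNetworkThread(String operation) {
        String name = Thread.currentThread().getName();
        assert isNetworkThread(name) == false
            : "filesystem operation [" + operation + "] on network thread [" + name + "]";
        return true;
    }

    /** Example of one guarded filesystem operation; a provider wrapper would guard all of them. */
    static byte[] readAllBytes(Path path) throws IOException {
        assertNotNetworkThread("readAllBytes");
        return Files.readAllBytes(path);
    }
}
```

Because the check relies on `assert`, it fires only in test runs with assertions enabled, which matches the goal of catching these problems in tests rather than in production.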