-
Notifications
You must be signed in to change notification settings - Fork 24.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce enthusiasm for calling flush APIs #38503
Conversation
Today we only really describe the benefits of the flush and synced-flush APIs: > _... reduces recovery times ..._ > _... NOTE: It is harmless to request a synced flush ..._ This might lead users to believe that they should call these APIs frequently as a matter of course, but this is not what we recommend. This change rewords the docs a bit to talk about trade-offs and to mention more prominently that Elasticsearch does these things automatically.
Pinging @elastic/es-distributed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some comments. I general I think it's no surprise that the old docs read better to me, because I wrote them :) That said - I think this PRs tries to do too much in one go. If you want to re-write the whole page, that's fine with me but it seems the goal was to only make it clearer that (normal) flushes should be very rarely called.
The flush process of an index makes sure that any data that is currently only | ||
stored in the <<index-modules-translog,transaction log>> is also permanently | ||
stored in Lucene. When restarting, {es} replays any unflushed operations from | ||
the transaction log into Lucene to ensure that searches do not return stale |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I liked the previous explanation better 👼 The real reason is that we want to bring lucene back to where it was before the shard was shutdown and lucene resets itself to the last time it was flushed. Stale searches are only one aspect of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's ok with me - see updated wording in https://github.com/elastic/elasticsearch/pull/46245/files#diff-31f3afe6b50f556a95a396324be535e5R6
flush can be executed if another flush operation is already executing. | ||
The default is `false` and will cause an exception to be thrown on | ||
the shard level if another flush operation is already running. | ||
flush can be executed if another flush operation is already executing. The |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the main point is here that the command waits until a flush has happened and all operations indexed before the command was called are guaranteed to be flushed. I'm tempting to also say we need to change the default, but that's a different discussion :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. This wasn't spelled out before as far as I can see, but it's a good point to make. See https://github.com/elastic/elasticsearch/pull/46245/files#diff-31f3afe6b50f556a95a396324be535e5R22-R24.
The default is true
, fixed by Nhat recently.
|
||
NOTE: It is harmless to request a synced flush while there is ongoing indexing. Shards that are idle will succeed and shards | ||
that are not will fail. Any shards that succeeded will have faster recovery times. | ||
The Synced Flush API allows an administrator to initiate a synced flush |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you change the old explanation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes I made were to address a number of points that I originally found confusing:
- It says it's useful specifically for rolling restarts, but it's actually useful for all restarts whether rolling or otherwise.
- The note about future flushes removing the
sync_id
is out of place in this section on the API, since it applies even if you don't use this API. - It introduces the concept of a "low level Lucene commit point" without definition or explanation. This isn't really necessary, we can make that same point more clearly by talking about flushes directly.
- The formatting (i.e. the numbered list and the
NOTE:
callout) were unnecessary and made it harder to read than just straight prose.
an opportunity for Elasticsearch to reduce shard resources and also perform | ||
a special kind of flush, called `synced flush`. A synced flush performs a normal flush, then adds | ||
a generated unique marker (sync_id) to all shards. | ||
{es} keeps track of which shards have received indexing activity recently, and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as in the previous comment - I like the previous explanation better. It explains what the thing does. With the new version, it provides a higher level description that to me is harder to understand what really happens.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't follow you here. I've perhaps reordered the information to start with the higher-level view and then drop down to the details, since this is easier to follow if you aren't already intimately familiar with it. I don't think I've lost any detail here have I?
what should we do with this PR? It was opened months ago and there was some discussion, hence it was not merged. Should we close it or get it in? |
I am holding off on further changes here right now while the work on history retention (#41536) is still in flight, because that is intertwined with the flushing and recovery mechanisms. This PR is still on my list. |
Today we only really describe the benefits of the flush and synced-flush APIs:
This might lead users to believe that they should call these APIs frequently as
a matter of course, but this is not what we recommend.
This change rewords the docs a bit to talk about trade-offs and to mention more
prominently that Elasticsearch does these things automatically.