Possible Bulk Update Optimizations #26802
Comments
@jpountz you are right, thanks a lot for the clarification. I have updated the issue.
We spoke about this in FixitFriday. While this optimization is doable on the client side (for now), we would like to first look into a more generic optimization: holding a secondary (engine-private) index reader that is used internally and does not pay the price of refreshing caches, loading global ordinals, rebuilding parent-child relationships, etc. It would also mean that we can honor the refresh semantics if a realtime get is used. I will keep this issue open for now and open a new one to explore the opportunities.
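To make the idea concrete, here is a minimal Lucene-level sketch of what such a secondary reader could look like. All names here are hypothetical and the real engine would need locking, refcounting, and reader wrapping; the point is only that a GET-only reader can skip the warming (caches, global ordinals, parent-child joins) that the external search reader pays for:

```java
import java.io.IOException;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;

// Hypothetical sketch: two readers over the same IndexWriter. The external
// reader is the one searches see and is the one that gets wrapped/warmed;
// the internal reader is refreshed on demand for realtime gets and skips
// all of that work.
class TwoReaderEngine {
    private final IndexWriter writer;
    private DirectoryReader externalReader; // refreshed on the index refresh interval
    private DirectoryReader internalReader; // refreshed cheaply, only for GETs

    TwoReaderEngine(IndexWriter writer) throws IOException {
        this.writer = writer;
        this.externalReader = DirectoryReader.open(writer);
        this.internalReader = DirectoryReader.open(writer);
    }

    synchronized DirectoryReader readerForGet() throws IOException {
        // Reopening only picks up newly flushed segments; no cache warming
        // or global-ordinal rebuild is triggered by this call.
        DirectoryReader changed = DirectoryReader.openIfChanged(internalReader, writer);
        if (changed != null) {
            internalReader.close();
            internalReader = changed;
        }
        return internalReader;
    }
}
```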
…er to disk (#26972) Today, when ES detects it's using too much heap vs. the configured indexing buffer (default 10% of the JVM heap), it opens a new searcher to force Lucene to move the bytes to disk, clear the version map, etc. But this has the unexpected side effect of making newly indexed/deleted documents visible to future searches, which is not nice for users who are trying to prevent that, e.g. #3593. This is also an indirect spinoff from #26802, where we potentially pay a big price on rebuilding caches etc. when updates / realtime gets are used. We refresh the internal reader for realtime gets, which causes, for instance, global ordinals to be rebuilt. I think we can gain quite a bit if we used a reader that is only used for GETs and not for searches; that way we can also solve the problem of searchers being refreshed unexpectedly, apart from replica recovery / relocation. Closes #15768 Closes #26912
@imotov I got the index reader change in that I was talking about. While batching up those kinds of updates might be a double-edged sword, I am still interested in seeing what it would look like code-wise. Would it be as simple as sorting the updates by ID, using insertion order as a secondary sort, and then running them all consecutively? I think it can be as simple as checking whether the previous ID is the same and not fetching the document again. I really wonder if we can have a prototype that does the sorting in the test as a start?
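For what it's worth, here is a minimal sketch of that sorting idea. All helper names are hypothetical, and error handling, versioning, and the actual engine plumbing are omitted:

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical item: one scripted update inside a shard-level bulk request.
record BulkItem(int position, String id, String script) {}

class UpdateBatcher {
    void execute(BulkItem[] items) {
        // Sort by document ID, with the original bulk position as a secondary
        // key so that updates to the same document keep their insertion order.
        Arrays.sort(items, Comparator.comparing(BulkItem::id)
                                     .thenComparingInt(BulkItem::position));
        String previousId = null;
        Object doc = null; // stand-in for the fetched document source
        for (BulkItem item : items) {
            if (!item.id().equals(previousId)) {
                // Refresh + realtime get happens once per distinct ID
                // instead of once per update.
                doc = getFromEngine(item.id());
                previousId = item.id();
            }
            // Consecutive updates to the same ID reuse the in-memory document.
            doc = applyScript(doc, item.script());
        }
        // ... index the final version of each document once (omitted)
    }

    private Object getFromEngine(String id) { /* hypothetical */ return new Object(); }
    private Object applyScript(Object doc, String script) { /* hypothetical */ return doc; }
}
```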
@s1monw yes, I think this rearrangement shouldn't cause any differences in the final result of the bulk operation (at least at the level where we can offer any guarantees). We will need special handling for deletes (clearing the buffer), but otherwise that should address one of the issues. We will still have the issue of indexing the same record over and over again and sending multiple copies of the same record to replicas, but I am fully sold on progress over perfection here :)
Can you clarify what you mean here? We have to send these operations to maintain the semantics of the translog as an operation log and the semantics of sequence IDs.
@jasontedor yes, if we reindex the same record again and again on the primary shard, we have to send it to the replicas just as many times. I was just thinking that, maybe, if we index it on the primary only after all sequential operations on the same record have been applied, we would need to send only one copy to the replicas. But as I said, we don't need to bundle that in; it is a somewhat separate optimization.
@imotov For a shard-level bulk request, we still send a single bulk request to the replicas containing the results of all the indexing operations on the primary. That is, a shard bulk request is a single replication operation.
@jasontedor yes, the question here is: if a bulk request contains 100 script updates on the same record, do we need to send 100 slightly different copies of this record to the replicas, or can we just send the final version and ignore all intermediate states on both the primary and the replicas?
On the replica we cannot do that: if we apply an operation on the primary, we have to send it to the replica. We cannot coalesce operations from the primary to the replica, as doing so would violate operation and sequence ID semantics.
@jasontedor if I get the suggestion right, the idea is to do one get from the engine, apply multiple updates, and then index the versioned result once. This would result in one write operation to the engine with one issued seq#. I think that's fine? All that said, I'm really not happy with the state of TransportShardBulkAction and its complexity. I think we should figure out what we want to do there (I don't have a good suggestion yet) before introducing more complexity.
We historically removed reading from the transaction log in order to get consistent results from _GET calls. There was also the motivation that the read-modify-update principle we apply should not be hidden from the user. We still agree that we should not hide these aspects, but the impact on updates is quite significant, especially if the same document is updated before it's written to disk and made searchable. This change adds back the ability to read from the transaction log, but only for update calls. Calls to the _GET API will always do a refresh if necessary to return consistent results, i.e. if stored fields or doc values fields are requested. Closes #26802
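Conceptually, the path this adds back might look something like the following sketch. The types and method names here are invented for illustration and are not the actual Elasticsearch classes:

```java
// Hypothetical shapes, invented for illustration; not the real Elasticsearch classes.
class TranslogGet {
    interface VersionMap { TranslogLocation locationFor(String id); }
    interface Translog { byte[] readSource(TranslogLocation location); }
    record TranslogLocation(long generation, long offset) {}

    private final VersionMap versionMap;
    private final Translog translog;

    TranslogGet(VersionMap versionMap, Translog translog) {
        this.versionMap = versionMap;
        this.translog = translog;
    }

    // GET issued by an update: if the live version map still knows where the
    // latest copy of the document sits in the translog, read it back from the
    // log directly, with no refresh and no reader reopen.
    byte[] getForUpdate(String id) {
        TranslogLocation location = versionMap.locationFor(id);
        if (location != null) {
            return translog.readSource(location);
        }
        // Document is already searchable: fall back to the normal reader-based get.
        return getFromIndexReader(id);
    }

    private byte[] getFromIndexReader(String id) { /* omitted */ return null; }
}
```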
We had several reports of slow bulk updates in 5.0 and above; some of them are summarized in #23792. The main issue that leads to the significant slowdown seems to be the refresh that is now performed before each update operation. Basically, in order to perform a bulk request with 1000 updates on the same record, we have to perform a refresh 1000 times. This problem could be avoided if we combined all update operations on the same record within a shard batch into a single update operation; that way we could get away with performing the refresh only once.
For example, suppose we have a bulk request along these lines (an illustrative request with repeated scripted updates to a single document; the exact payload doesn't matter):
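```
POST _bulk
{ "update" : { "_index" : "test", "_type" : "doc", "_id" : "1" } }
{ "script" : { "source" : "ctx._source.counter += 1" } }
{ "update" : { "_index" : "test", "_type" : "doc", "_id" : "1" } }
{ "script" : { "source" : "ctx._source.counter += 1" } }
{ "update" : { "_index" : "test", "_type" : "doc", "_id" : "1" } }
{ "script" : { "source" : "ctx._source.counter += 1" } }
```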
it currently translates into something like this on the primary shard, with one refresh-and-get cycle per update:
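```
refresh → GET doc 1 → run script → INDEX doc 1   (update #1)
refresh → GET doc 1 → run script → INDEX doc 1   (update #2)
refresh → GET doc 1 → run script → INDEX doc 1   (update #3)
```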
If we combined all the updates together, we could transform it into a single cycle, roughly:
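```
refresh → GET doc 1 → run script → run script → run script → INDEX doc 1
```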
We would still have an issue if a mix of index or delete operations and update operations targets the same record in the same bulk request, but perhaps we can either fall back to the current slow way of doing things or optimize only as much as we can. For example, we can ignore all updates before an index operation, and all updates and index operations before a delete, etc.