[Optimization] Adaptive shard selection for writes for auto generated ids. #4984

itiyama · 2022-10-31T03:38:18Z

When there are multiple parallel bulk requests from a client with no document id, the coordinator does the following:

1. Generate a new id per document in the bulk request.
2. Create a per shard request out of the entire set of documents, based on the hash of each document id.
3. Send out the per shard bulk requests to the workers.
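To make the steps above concrete, here is a minimal sketch of the split as described. It is not the actual coordinator code: `shard_for` and `split_bulk` are hypothetical names, and `zlib.crc32` only stands in for whatever routing hash the engine really uses.

```python
import uuid
import zlib
from collections import defaultdict


def shard_for(doc_id: str, num_shards: int) -> int:
    # Stand-in routing hash (assumption): crc32 of the id, modulo shard count.
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def split_bulk(docs, num_shards):
    """Assign an auto generated id to each doc and group docs by target shard."""
    per_shard = defaultdict(list)
    for doc in docs:
        doc_id = uuid.uuid4().hex  # step 1: new id per document
        per_shard[shard_for(doc_id, num_shards)].append((doc_id, doc))  # step 2
    # Step 3 would send each per shard list as its own bulk request.
    return dict(per_shard)
```

With this shape, one client bulk request fans out to up to `num_shards` sub-requests, and the response can only be assembled once the slowest of them completes.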

If any of the shards is slow, all the bulk requests need to wait on the coordinator. The client is also blocked until the coordinator returns the response. When the response is returned to the customer, the customer reads the bulk response and then retries the remaining requests. Imagine a shard being in the INITIALIZING state: all the bulk requests will wait on the coordinator and cause the entire system to be slow.

How about the coordinator always sends all documents in a bulk request to one shard and then round-robins the requests across shards? This would reduce the number of requests waiting in the coordinator queue as a result of a slow shard and also free up resources on the client side.
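A minimal sketch of that idea, assuming a single coordinator and a hypothetical `RoundRobinBulkRouter` class (not an existing API): every bulk request goes to exactly one shard, and consecutive requests rotate through the shards.

```python
import itertools


class RoundRobinBulkRouter:
    """Hypothetical router: one shard per bulk request, rotating across shards."""

    def __init__(self, num_shards: int):
        # Not thread-safe as written; a real coordinator would need an atomic counter.
        self._cycle = itertools.cycle(range(num_shards))

    def route(self, docs):
        # The whole bulk request lands on a single shard, so only that
        # shard's speed determines how long this request waits.
        shard = next(self._cycle)
        return shard, list(docs)
```

For example, with three shards, five consecutive calls to `route` pick shards 0, 1, 2, 0, 1.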

What are the downsides of this approach? Can it result in an imbalance of documents across shards? If a shard is really slow, the documents would be imbalanced even with a uniform splitting approach, because the shard would not be able to complete its work in time and its requests would time out on the coordinator.

This optimization will not work for customer generated ids or for custom routing use-cases.

anasalkouz · 2022-11-08T21:16:22Z

@adnapibar Could you please take a look, since this is related to the streaming index API?

itiyama · 2022-12-06T16:24:38Z

With segment replication and remote storage where the failovers are slow, this optimization can be combined with an adaptive shard selection based on latencies or shard availability.

itiyama · 2023-04-27T19:25:24Z

@nknize Your ghost writer solution will address this problem too, right?

itiyama · 2023-08-29T11:49:49Z

Automatic routing is one way to solve this, but automatic routing makes updates/gets inefficient. The issue that our solution should solve is that the id is tightly coupled with routing. Here are a few options to decouple the two while still handling updates and custom document ids.

1. Encode the shard in the document id for auto generated ids.
2. Pre-generate document ids and associate them with shards on the fly. This will waste some memory on the coordinating node.
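As an illustration of the first option, here is a minimal sketch, assuming a made-up id layout of `<shard in hex>-<random part>`. The helper names `make_id` and `shard_from_id` are hypothetical, and the real id format would have to be decided separately.

```python
import uuid


def make_id(shard: int) -> str:
    # Assumed layout: zero-padded hex shard number, a dash, then a random suffix.
    return f"{shard:04x}-{uuid.uuid4().hex}"


def shard_from_id(doc_id: str) -> int:
    # Updates and gets can recover the shard from the id itself,
    # so no hash of the id is needed to route them.
    prefix, _, _ = doc_id.partition("-")
    return int(prefix, 16)
```

Because the shard is recoverable from the id, the coordinator is free to pick any shard at write time, and later updates/gets by id still reach the right shard.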

This solution does not prevent us from using custom doc ids or custom routing - it is just that the optimization does not work when custom doc id or routing is used.

Adaptive shard selection for indexing - To implement this, each shard reports to the coordinator the number of documents it has indexed with auto generated ids, its size, and stats such as shard request queue and processing time. Based on these, requests using auto generated ids can adaptively select the shard each document lands in. Each coordinator will calculate a balance score based on the number of documents with auto ids within each shard, and once the score breaches a threshold, the adaptive shard selection will attempt to fix the balance.
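A rough sketch of how a coordinator might combine these signals. Everything here is an assumption made for illustration: the `ShardStats` fields, the balance score formula (spread of auto-id document counts relative to their mean), and the default threshold of 0.2 are not defined anywhere in this issue.

```python
from dataclasses import dataclass


@dataclass
class ShardStats:
    shard: int
    auto_id_docs: int      # documents indexed with auto generated ids
    queue_size: int        # pending requests in the shard request queue
    avg_latency_ms: float  # recent processing time


def balance_score(stats) -> float:
    # Assumed definition: (max - min) / mean of auto-id doc counts; 0 means balanced.
    counts = [s.auto_id_docs for s in stats]
    mean = sum(counts) / len(counts)
    if mean == 0:
        return 0.0
    return (max(counts) - min(counts)) / mean


def select_shard(stats, threshold: float = 0.2) -> int:
    if balance_score(stats) > threshold:
        # Balance breached: steer new documents to the emptiest shard.
        return min(stats, key=lambda s: s.auto_id_docs).shard
    # Otherwise prefer the least loaded shard, breaking ties on latency.
    return min(stats, key=lambda s: (s.queue_size, s.avg_latency_ms)).shard
```

The two branches pull in different directions on purpose: load-based selection favors fast shards and can skew document counts over time, so the score check acts as a corrective once the skew grows too large.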

This feature will be opt-in, and it will be enabled by default for data streams.

itiyama added the enhancement and untriaged labels Oct 31, 2022

minalsha added the distributed framework label Nov 3, 2022

anasalkouz removed the untriaged label Nov 8, 2022

adnapibar added the Indexing label Nov 17, 2022

anasalkouz added Migration:Backlog and removed Migration:Backlog labels Mar 17, 2023

Bukhtawar mentioned this issue Aug 10, 2023

[RFC] Automatic routing for bulk #9219

Open

anasalkouz removed the distributed framework label Sep 19, 2023

itiyama changed the title ~~[Optimization] Send all sub-bulk requests to a single shard when there are multiple concurrent bulk requests~~ [Optimization] Adaptive shard selection for writes for auto generated ids. Oct 4, 2023
