[CCR] Re-evaluate shard follow parameter defaults #31717
Re-evaluate the shard follow parameter defaults in `ShardFollowNodeTask` and `ShardFollowTask`.

Comments
Pinging @elastic/es-distributed
TL;DR

It's hard to come up with general defaults that are optimal for all cases, especially in multi-node environments with different (primary + replica) shard settings. This comment shows two categories of defaults that work well in different scenarios:
The next comment will include optimal defaults for:
The final defaults will be based on the results from all the shard setting scenarios; however, the detailed results per category are still useful for documentation and tuning guidance. All benchmarks were done with the same commit from master (73417bf), which does not include the append-only final optimizations from #34099 that are, as of now, still unmerged.

5P 1R, Network SSD Storage, 3 node cluster

"max_concurrent_read_batches": 5,
"max_concurrent_write_batches": 5,
"max_batch_operation_count": 32768,
"max_batch_size_in_bytes": 4718592,
"max_write_buffer_size": 164000 NOTE: While these defaults have been benchmarked against a large amount of scenarios, they are still static parameters. Elasticsearch will not auto-adjust based on performance during CCR activity and performance depends also on the amount of shards. Therefore they will likely still need tuning depending on environmental peculiarities. Benchmarks ExecutedAll benchmarks where executed on the Cloud (AWS and GCP). GCP environment1 cluster, 3 leader nodes in 1 cluster, 3 follower nodes in 1 load gen on europe-west ( Each node root disk size: Each instance is: Minimum allowable processor, Skylake (https://cloud.google.com/compute/docs/machine-types). 14GiB RAM (7GiB heap) AWS environment1 cluster, 3 leader nodes in 1 cluster, 3 follower nodes in Load gen node: ( Each node root disk size: Each instance is: 15GiB RAM (7GiB heap). Rally Details(see https://github.com/elastic/rally-tracks, https://github.com/elastic/rally-eventdata-track) Tracks used:
Type of load: indexing only

Results with recommended settings:

GCP:
AWS:
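To experiment with the recommended values above on a test cluster, here is a minimal sketch of how they could be passed when creating a follower index. It assumes a follow endpoint of the form PUT /<follower_index>/_ccr/follow, a remote cluster already registered on the follower as "leader_cluster", and a follower node on localhost:9200; the exact endpoint and parameter names have changed across Elasticsearch versions, so treat this as an illustration rather than the exact calls used in the benchmarks.

```bash
# Sketch: create a follower index with the 5P/1R recommended parameters made explicit.
# "leader_cluster", "leader-index", and "follower-index" are placeholder names.
curl -X PUT "localhost:9200/follower-index/_ccr/follow" \
  -H 'Content-Type: application/json' \
  -d '{
    "remote_cluster": "leader_cluster",
    "leader_index": "leader-index",
    "max_concurrent_read_batches": 5,
    "max_concurrent_write_batches": 5,
    "max_batch_operation_count": 32768,
    "max_batch_size_in_bytes": 4718592,
    "max_write_buffer_size": 164000
  }'
```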
Other details

Total benchmarks executed:

GCP: 43 (gradually increasing max_concurrent_batch_read/write values and max_batch_size_in_bytes)

1P 1R, Local SSD Storage, 3 node cluster

"max_concurrent_read_batches": 9,
"max_concurrent_write_batches": 9,
"max_batch_operation_count": 163840,
"max_batch_size_in_bytes": 4718592,
"max_write_buffer_size": 2942700 NotesThis setup exerts high pressure on I/O (primarily) and cpu (secondarily) , as essentially only two out of three nodes deal with the load. Trying locally attached SSD disks shifted the bottleneck to the cpu (see details). After finding the right instances in terms of cpu and i/o performance it was possible to tune settings to provide the optimal performance. Benchmarks ExecutedFinal AWS environment1 cluster, 3 leader nodes in `eu-central-1`, each node in different AZ (`us-east-2a`, `us-east-2b`, `us-east-2c`).1 cluster, 3 follower nodes in Load gen node: ( Each node root disk size: Each instance is: 30GiB RAM (15GiB heap) (Also tried i3.xlarge, which didn't have enough CPU capacity to deal with http_logs benchmarks at max indexing performance). GCP environment1 cluster, 3 leader nodes in `europe-west`, each node in different AZ `europe-west4-a`, `europe-west4-b`, `europe-west4-c` (had to switch from europe-west1-* due to lack of resources).1 cluster, 3 follower nodes in us-central, each node in different AZ 1 load gen on europe-west ( Each node root disk size: Each instance is: Minimum allowable processor, Skylake (https://cloud.google.com/compute/docs/machine-types). 14GiB RAM (7GiB heap). Rally Details(see https://github.com/elastic/rally-tracks, https://github.com/elastic/rally-eventdata-track)indexing was throttled to 45000 doc/s Tracks used:
Type of load: indexing only

Results with recommended settings:

GCP:
AWS:
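As a rough sketch of how a run like this could be reproduced, the Rally invocation below drives the http_logs track against the leader cluster in benchmark-only mode. The host is a placeholder, recent Rally versions use the `race` subcommand (older ones invoke `esrally` directly), and throughput caps such as the 45000 doc/s throttle mentioned above are applied through track/challenge parameters that vary per track, so this is an outline rather than the exact command used for these benchmarks.

```bash
# Sketch: run the http_logs Rally track against an existing leader cluster.
# <leader-node> is a placeholder hostname; Rally does not provision a cluster
# here because of --pipeline=benchmark-only.
esrally race \
  --track=http_logs \
  --challenge=append-no-conflicts \
  --pipeline=benchmark-only \
  --target-hosts=<leader-node>:9200
```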
3P 1R, Network SSD Storage, 3 node cluster

"max_concurrent_read_batches": 5,
"max_concurrent_write_batches": 5,
"max_batch_operation_count": 32768,
"max_batch_size_in_bytes": 4718592,
"max_write_buffer_size": 164000 NotesAs expected, using 3p 1r shard settings is lighter on cpu and I/O pressure than the 1/1 scenario. All benchmarks were executed using local SSD drives and http_logs in particular throttled at 85% of max indexing performance, as otherwise normalized CPU consumption on AWS ranged between 95-98% (the instance on AWS has less cpus). AWS environment1 cluster, 3 leader nodes in 1 cluster, 3 follower nodes in Load gen node: ( Each node root disk size: Each instance is: 30GiB RAM (15GiB heap) (Also tried i3.xlarge, which didn't have enough CPU capacity to deal with http_logs benchmarks at max indexing performance). GCP environment1 cluster, 3 leader nodes in 1 cluster, 3 follower nodes in us-central, each node in different AZ 1 load gen on europe-west ( Each node root disk size: Each instance is: Minimum allowable processor, Skylake (https://cloud.google.com/compute/docs/machine-types). 14GiB RAM (7GiB heap). Rally Details(see https://github.com/elastic/rally-tracks, https://github.com/elastic/rally-eventdata-track) Tracks used:
Type of load: indexing only

Results with recommended settings:

GCP:
AWS:
Next steps

With the append-only follower optimizations merged, the next step is to rerun the above benchmarks (at least the ones that demonstrated high resource usage, especially CPU) and evaluate whether the common defaults used for the 5P/1R and 3P/1R scenarios demonstrate lower resource usage and whether they are good enough even for 1P/1R.

For reference, here are some visualizations of CPU usage (charts not reproduced here):

AWS http_logs 3P/1R unthrottled: CPU
AWS http_logs 3P/1R throttled to 85% of max indexing throughput: CPU; progress of follower vs leader with sequence numbers
GCP http_logs 3P/1R unthrottled: CPU
GCP http_logs 3P/1R throttled to 85% of max indexing throughput: CPU
AWS http_logs 1P/1R throttled to 85% of max indexing throughput: CPU
GCP http_logs 1P/1R throttled to 85% of max indexing throughput: CPU
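The follower-vs-leader progress above is tracked through sequence-number checkpoints. As a hedged sketch, assuming a follower stats endpoint of the form GET /<follower_index>/_ccr/stats (the endpoint name and response fields differ across Elasticsearch versions), that progress could be polled during a benchmark like this:

```bash
# Sketch: poll follower shard stats every 10 seconds to compare the follower's
# checkpoints against the leader's. jq is only used for pretty-printing.
while true; do
  curl -s "localhost:9200/follower-index/_ccr/stats" | jq .
  sleep 10
done
```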
@dliappis and I did some more research and here is a summary of our discoveries:
A few observations that we have made while doing the experiments:
W.r.t. defaults, we have discussed multiple options, including being smart and using the number of processors on the machines to figure out our defaults. That said, I feel we don't have a good enough grip on all the possible use cases/machines, and it will take considerable effort to get it, with an unclear chance of actually arriving at a clear and simple picture. I'd prefer going with simple numbers that are easy to communicate and change. Once we gather more input we can adapt them. With that in mind, I suggest going with the following defaults:
@dliappis did I forget anything?
Per #31717 this commit changes the defaults to the following: Batch size of 5120 ops. Maximum of 12 concurrent read requests. Maximum of 9 concurrent write requests. These are not necessarily our final values, but it's good to have these as defaults for the purposes of initial testing.
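To try these values explicitly on an existing follower (rather than relying on the built-in defaults), one option is to pause and resume the follower with the parameters spelled out. This is only a sketch: it assumes POST /<follower_index>/_ccr/pause_follow and POST /<follower_index>/_ccr/resume_follow endpoints and maps the commit message's numbers onto the parameter names used earlier in this issue; both the endpoints and the names vary across Elasticsearch versions.

```bash
# Sketch: pause the follower, then resume it with the new default values
# from the commit above made explicit (parameter-name mapping assumed here).
curl -X POST "localhost:9200/follower-index/_ccr/pause_follow"

curl -X POST "localhost:9200/follower-index/_ccr/resume_follow" \
  -H 'Content-Type: application/json' \
  -d '{
    "max_batch_operation_count": 5120,
    "max_concurrent_read_batches": 12,
    "max_concurrent_write_batches": 9
  }'
```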
@bleskes Thank you for the concise summation. Maybe worth mentioning that all experiments were done on instances using locally attached SSDs (i.e. not AWS's EBS / GCP's persistent disks). What initial defaults should we use for

I'd also have added the need to clarify defaults for
Here's a summary of what has happened in the meantime. We devised a testing plan comprising 3 stages:
Observations/Actions:
@bleskes Did I miss anything?

NOTE: Some of the Kibana links in the gist point to a private Cloud cluster.
I am closing since we have good default parameters now. Thanks @dliappis for the awesome work.