Reduce DTS costs for cross zone data transfer within Elasticsearch #73501

Open
dakrone opened this issue May 27, 2021 · 6 comments
Labels
:Data Management/ILM+SLM (Index and Snapshot lifecycle management), >enhancement, Meta, Team:Data Management (Meta label for data/management team)

Comments

@dakrone
Member

dakrone commented May 27, 2021

DTS, or Data Transfer & Storage, is a significant cost for users running either on ESS or on their own cloud deployment. Users rely on forced awareness to keep copies of indices in multiple availability zones for high availability. Elasticsearch currently does no special handling to reduce the amount of data transferred between zones.

This meta issue links to issues with potential ideas for mitigating the cost of DTS.
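
For context, forced awareness is normally enabled through the `cluster.routing.allocation.awareness.*` cluster settings. The following is a minimal sketch of building such a settings body; the helper name `forced_awareness_settings`, the attribute name `zone`, and the zone names `zone1` and `zone2` are illustrative assumptions, not values taken from this issue.

```python
import json


def forced_awareness_settings(attribute, zones):
    """Build a cluster-settings body that enables forced awareness.

    `attribute` and `zones` are hypothetical example values; use the
    attribute and zone names that match your own deployment.
    """
    prefix = "cluster.routing.allocation.awareness"
    return {
        "persistent": {
            # Attribute that Elasticsearch uses to spread shard copies.
            f"{prefix}.attributes": attribute,
            # Listing the expected values makes the awareness "forced":
            # copies are held back rather than piled into surviving zones.
            f"{prefix}.force.{attribute}.values": ",".join(zones),
        }
    }


body = forced_awareness_settings("zone", ["zone1", "zone2"])
print(json.dumps(body, indent=2))
```

Each node would also need a matching `node.attr.zone` value in its `elasticsearch.yml` for the attribute to take effect. Because every shard then has copies in more than one zone, any operation that rebuilds or moves a copy can pull data across a zone boundary.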

Sources of inter-AZ data transfer

Some of these sources do not transfer data directly, but their execution leads to data being transferred between availability zones:

  • Regular relocation
  • Shrinking an index
  • Force merge
  • ILM migration (migrating between tiers)
  • Rollups
  • ILM allocation (through the "allocate" action)
  • Rollover
  • Changing the number of replicas
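
To make the cost of one of these sources concrete, here is a rough back-of-the-envelope sketch for the replica case. It assumes that every new shard copy is recovered in full from the primary and that only copies placed outside the primary's zone count as cross-zone traffic. It ignores compression, reuse of existing segment files, and recovery from snapshots, so treat it as an upper-bound illustration rather than how Elasticsearch accounts for transfer. The function name and the sizes are hypothetical.

```python
def cross_zone_bytes(shard_bytes, primary_zone, new_copy_zones):
    """Estimate bytes crossing zone boundaries when new copies are built.

    Assumption: each new copy is copied in full from the primary, so a
    copy in the primary's own zone costs no cross-zone transfer.
    """
    return sum(shard_bytes for zone in new_copy_zones if zone != primary_zone)


# Hypothetical example: a 50 GiB shard with its primary in zone1 and
# three new copies, two of which land in other zones.
shard_size = 50 * 1024**3
estimate = cross_zone_bytes(shard_size, "zone1", ["zone2", "zone3", "zone1"])
print(estimate)
```

Under these assumptions, only the two copies outside `zone1` contribute, so the estimate is twice the shard size.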

Separate issues

@dakrone added the >enhancement, Meta, :Distributed Indexing/Distributed, and :Data Management/ILM+SLM labels May 27, 2021
@elasticmachine added the Team:Distributed (Obsolete) and Team:Data Management labels May 27, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@elasticmachine
Collaborator

Pinging @elastic/es-core-features (Team:Core/Features)

@dakrone
Member Author

dakrone commented Jun 16, 2021

#73971 may also be related, dealing with DTS costs from async search.

@seang-es

seang-es commented Jul 7, 2021

#62194 related

@DaveCTurner
Contributor

I think the ability to relocate shards via snapshots has reduced or eliminated many of the DTS costs mentioned above. I'm therefore removing this from the distrib team area.

@DaveCTurner removed the :Distributed Indexing/Distributed and Team:Distributed (Obsolete) labels Jul 28, 2022
@elasticsearchmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

5 participants