[DOCS] Clarify migration from warm to cold tier for searchable snapshots #77583
Specify that we can't just reuse the local data as a cold cache and the migration incurs data transfer costs. Closes elastic#74385
Pinging @elastic/es-docs (Team:Docs)
Pinging @elastic/es-distributed (Team:Distributed)
[NOTE]
Migration from the warm to the cold tier requires snapshotting the data in the repo
and reading it back to the node which incurs data transfer costs.
Thanks for this update @arteam!
Is this isolated to the warm->cold transition?
It seems like you'd incur this cost whenever you create a searchable snapshot, regardless of the tier or phase transition. For example, if someone goes directly from hot->frozen, I think you'd incur the same data transfer costs.
I think we should elaborate a bit here, mentioning that this is also true when hot/warm and cold are co-located. I think this is the part that can confuse some users. Also, we should note that if you already take snapshots of your data, the actual data uploaded to the repo will be minimal. Finally, I am a bit worried about the terminology "data transfer costs"; I think we should clarify that it requires a download of data, which may have costs depending on the operating environment. Perhaps link to what David wrote in #77607 rather than elaborate too much here on the costs.
I think frozen is less confusing since:
- We recommend using dedicated frozen nodes.
- Partially mounted indices will only download a minimum set of data initially.
- The bulk of the download happens on search.
Still, adding a note may be worthwhile.
I just came across this, sorry I didn't know you were working on these docs too @arteam (which is why I opened #77607). Echoing what Henning says: yes, I think we need to say more here, as this is the sort of thing that comes up frequently when speaking to customers. If we just said this, we would be giving the impression that searchable snapshots normally cost more in terms of data transfer than regular indices, which isn't the case.
Totally agree, that's why I left my comment #77583 (comment). The doc change #77607 is definitely much clearer, more detailed, and more articulate than this change, which is too blunt. I totally agree that we shouldn't give a "data transfer costs" warning by default because for the majority of customers these costs would be negligible. I think we can close this issue in favour of #77607 when it's merged.
I think it is pretty much covered in #77607. We say these things
I've adjusted this last bit to say "... involves copying the shard contents ..." in 93e0141. Do you think that's enough?
Superseded by #77607