Support cloning of searchable snapshot indices #56595

Merged

DaveCTurner
Contributor

Today you can convert a searchable snapshot index back into a regular index by
restoring the underlying snapshot, but this is somewhat wasteful if the shards
are already in cache since it copies the whole index from the repository again.

Instead, we can make use of the locally-cached data by using the clone API to
copy the contents of the cache into the layout expected by a regular shard.
This commit marks the searchable snapshot's private index settings as
`NotCopyableOnResize` so that they are removed by resize operations such as
cloning.
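
The effect of marking settings this way can be modelled with a minimal sketch. This is not the Elasticsearch implementation; the `SettingDef` type, the registry, and the setting key `index.example.private_snapshot_setting` are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class SettingDef:
    """Hypothetical description of an index setting."""
    key: str
    not_copyable_on_resize: bool = False


# Hypothetical registry: one ordinary setting and one private setting that
# only makes sense on a snapshot-backed index.
EXAMPLE_REGISTRY: Dict[str, SettingDef] = {
    "index.number_of_replicas": SettingDef("index.number_of_replicas"),
    "index.example.private_snapshot_setting": SettingDef(
        "index.example.private_snapshot_setting", not_copyable_on_resize=True
    ),
}


def settings_for_resize_target(
    source_settings: Dict[str, str], registry: Dict[str, SettingDef]
) -> Dict[str, str]:
    """Return the settings a resize target would inherit from its source.

    Settings flagged as not copyable on resize are dropped; everything
    else, including keys unknown to the registry, is carried over.
    """
    target = {}
    for key, value in source_settings.items():
        definition = registry.get(key)
        if definition is not None and definition.not_copyable_on_resize:
            continue  # private to the source index, so not inherited
        target[key] = value
    return target
```

Under these assumptions the clone inherits only the ordinary settings, so it no longer looks like a searchable snapshot index. The clone itself is requested through the clone index API, `POST /<source>/_clone/<target>`.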

Cloning a regular index typically hard-links the underlying files rather than
copying them, but this is tricky to support for a searchable snapshot, so this
commit takes the simpler approach of always copying the underlying files.
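
The difference between the two strategies can be illustrated at the filesystem level. The following is only a sketch of hard-linking versus copying in general, assuming a flat directory of files; the function name and the `allow_hard_links` flag are hypothetical and do not correspond to anything in the Elasticsearch code base.

```python
import os
import shutil
from pathlib import Path


def copy_or_link_files(source_dir: Path, target_dir: Path, allow_hard_links: bool) -> None:
    """Populate target_dir with the regular files found in source_dir.

    With allow_hard_links=True each target is a hard link, so no bytes are
    duplicated, but source and target must live on the same filesystem.
    With allow_hard_links=False every file is copied byte for byte.
    """
    target_dir.mkdir(parents=True, exist_ok=True)
    for source in sorted(source_dir.iterdir()):
        if not source.is_file():
            continue  # this sketch ignores subdirectories
        target = target_dir / source.name
        if allow_hard_links:
            os.link(source, target)
        else:
            shutil.copyfile(source, target)
```

In this model, the approach described above corresponds to always calling the function with `allow_hard_links=False` when the source is a searchable snapshot.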

DaveCTurner requested a review from tlrx May 12, 2020 12:18
@elasticmachine
Collaborator

Pinging @elastic/es-distributed (:Distributed/Snapshot/Restore)

elasticmachine added the Team:Distributed (Obsolete) label May 12, 2020
Member

tlrx left a comment

LGTM

This change is much simpler than I expected. I think that we could improve cloning from a cached searchable snapshot directory, as in the current state it will cache data locally first before copying bytes to the new cloned shard.

DaveCTurner merged commit 718c228 into elastic:master May 13, 2020
@DaveCTurner
Contributor Author

Thanks @tlrx

DaveCTurner deleted the 2020-05-11-clone-searchable-snapshot branch May 13, 2020 09:56
DaveCTurner added a commit that referenced this pull request May 13, 2020
tlrx added a commit that referenced this pull request Jul 13, 2021
Today, if we try to shrink or split a searchable snapshot
index using the Resize API, a new index is created but
cannot be assigned. Even if it were assigned it would not
work, because the number of shards cannot be changed and
must always match the number of shards in the snapshot.

This commit adds verification that prevents snapshot-backed
indices from being resized and, if an attempt is made,
throws a clearer error message.

Note that cloning has been supported since #56595, and this
change ensures that it is only used to convert the
searchable snapshot index back to a regular index.

Relates #74977 (comment)
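
The verification described in this commit message can be sketched roughly as follows. This is a hypothetical model, not the actual Elasticsearch check: the `ResizeType` enum, the boolean parameters, and the error messages are all assumptions made for illustration.

```python
from enum import Enum


class ResizeType(Enum):
    SHRINK = "shrink"
    SPLIT = "split"
    CLONE = "clone"


def validate_resize(
    resize_type: ResizeType,
    source_index: str,
    source_is_snapshot_backed: bool,
    target_is_snapshot_backed: bool = False,
) -> None:
    """Raise ValueError if the resize is not allowed in this model.

    Assumed rules: regular indices are never rejected here; a
    snapshot-backed index can only be cloned, and only into a regular index.
    """
    if not source_is_snapshot_backed:
        return
    if resize_type is not ResizeType.CLONE:
        raise ValueError(
            f"cannot {resize_type.value} snapshot-backed index [{source_index}]: "
            "its number of shards must match the snapshot"
        )
    if target_is_snapshot_backed:
        raise ValueError(
            f"cloning snapshot-backed index [{source_index}] must produce a regular index"
        )
```

Rejecting the request up front, rather than creating an index that can never be assigned, is what allows the error message to explain the restriction.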
tlrx added a commit to tlrx/elasticsearch that referenced this pull request Jul 13, 2021
elasticsearchmachine pushed a commit that referenced this pull request Jul 13, 2021
Labels
:Distributed Coordination/Snapshot/Restore, >enhancement, Team:Distributed (Obsolete), v7.9.0, v8.0.0-alpha1

4 participants