Introduce repository UUIDs #67829

DaveCTurner · 2021-01-21T14:52:48Z

Today a snapshot repository does not have a well-defined global identity. The
closest we have is the repository name, but this is under the control of the
user and users are free to choose a different name each time they register the
repository.

This presents problems for cases where we need to refer to a specific snapshot
in a specific repository in a globally-unique fashion. Although snapshots
themselves have a globally-unique identifier, it is not feasible to search all
the available repositories for a specific snapshot. Today we expect the
repository to be registered under the same name on every cluster, but this is
not something on which we can rely.

To solve this, this commit adds a persistent UUID to each repository. The
repository UUID is stored in the top-level index blob, represented by
RepositoryData, and is also usually copied into the RepositoryMetadata that
represents the repository in the cluster state. The repository UUID is exposed
by the get-repositories API; other more meaningful consumers will be added in
due course.

Relates #66431

Today a snapshot repository does not have a well-defined identity. It can be reregistered with a different cluster under a different name, and can even be registered with multiple clusters in readonly mode. This presents problems for cases where we need to refer to a specific snapshot in a globally-unique fashion. Today we rely on the repository being registered under the same name on every cluster, but this is not a safe assumption. This commit adds a UUID that can be used to uniquely identify a repository. The UUID is stored in the top-level index blob, represented by `RepositoryData`, and is also usually copied into the `RepositoryMetadata` that represents the repository in the cluster state. The repository UUID is exposed in the get-repositories API; other more meaningful consumers will be added in due course.
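
For illustration only, here is a minimal Java sketch of the relationship described above: the UUID lives with the repository contents, and each cluster's registration copies it regardless of the name the user chose. The classes `RepoDataSketch` and `RepoRegistrationSketch` are hypothetical stand-ins, not the real `RepositoryData` and `RepositoryMetadata`, and generating the UUID randomly on first write is an assumption.

```java
import java.util.Objects;
import java.util.UUID;

// Hypothetical stand-in for the contents of the top-level index blob.
// This is NOT the real RepositoryData class, only an illustration.
final class RepoDataSketch {
    final String uuid; // persistent identity, stored in the repository itself

    RepoDataSketch(String uuid) {
        this.uuid = Objects.requireNonNull(uuid);
    }

    // Assumption: a fresh random UUID is chosen when the repository is first written.
    static RepoDataSketch createFresh() {
        return new RepoDataSketch(UUID.randomUUID().toString());
    }
}

// Hypothetical stand-in for the per-cluster registration held in the cluster state.
final class RepoRegistrationSketch {
    final String name; // chosen by the user, may differ between clusters
    final String uuid; // copied from the repository contents

    RepoRegistrationSketch(String name, String uuid) {
        this.name = Objects.requireNonNull(name);
        this.uuid = Objects.requireNonNull(uuid);
    }

    // Registering copies the UUID out of the repository contents.
    static RepoRegistrationSketch register(String name, RepoDataSketch data) {
        return new RepoRegistrationSketch(name, data.uuid);
    }

    // Identity is decided by UUID, never by name.
    boolean sameRepositoryAs(RepoRegistrationSketch other) {
        return uuid.equals(other.uuid);
    }
}

final class RepositoryUuidDemo {
    public static void main(String[] args) {
        RepoDataSketch contents = RepoDataSketch.createFresh();
        RepoRegistrationSketch onClusterA = RepoRegistrationSketch.register("backups", contents);
        RepoRegistrationSketch onClusterB = RepoRegistrationSketch.register("old-backups-readonly", contents);
        // Different names, same underlying repository.
        System.out.println(onClusterA.sameRepositoryAs(onClusterB)); // prints true
    }
}
```

The point of the sketch is only that two registrations with different names compare equal once identity is keyed on the UUID rather than the name.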

elasticmachine · 2021-01-21T14:52:53Z

Pinging @elastic/es-distributed (Team:Distributed)

original-brownbear

This looks just fine I think, just one stupid edge case left as far as I can tell?

original-brownbear · 2021-01-21T15:20:05Z

server/src/main/java/org/elasticsearch/repositories/RepositoriesService.java

+            verifyStep.whenComplete(ignored -> threadPool.generic().execute(
+                    ActionRunnable.wrap(getRepositoryDataStep, l -> repository(request.name()).getRepositoryData(l))), listener::onFailure);
+
+            // When the repository metadata is ready, update the repository UUID stored in the cluster state, if available


There is one slightly strange spot here for the case of read-only repositories.
You could technically have a repo at some UUID and mount it to a cluster read-only. Then some outside force could wipe the repo completely (as in delete all the files) and write new RepositoryData with a different UUID, in which case the UUID in the metadata won't ever get fixed. It's kind of a weird edge case, but it's the reason why we don't track the repo generation in the cluster state when it comes to read-only repositories as well.

I guess in this case we explicitly want this UUID visible in read-only repos also so maybe we need some check on the UUID when loading RepositoryData on read-only repos?

Ugh I see, ok, should be nbd since getRepositoryData is already async.

See b680eb3.
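
A rough sketch of the kind of check discussed in this thread, under stated assumptions: whenever repository data is loaded, compare its UUID with the one currently known and push an update if they differ. This is not the code from b680eb3; `UuidFreshnessSketch`, `onRepositoryDataLoaded`, and the `metadataUpdater` callback are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.function.Consumer;

// Hypothetical sketch of a "keep the known UUID fresh" check. NOT the code from this PR.
final class UuidFreshnessSketch {
    private String knownUuid; // what the cluster currently believes
    private final Consumer<String> metadataUpdater; // hypothetical hook that would update the cluster state

    UuidFreshnessSketch(String initialUuid, Consumer<String> metadataUpdater) {
        this.knownUuid = Objects.requireNonNull(initialUuid);
        this.metadataUpdater = Objects.requireNonNull(metadataUpdater);
    }

    // Called each time repository data has been (re)loaded from the blob store.
    synchronized void onRepositoryDataLoaded(String loadedUuid) {
        if (loadedUuid.equals(knownUuid) == false) {
            // The contents were replaced behind our back, e.g. a read-only repository
            // that was wiped and rewritten externally. Record and propagate the new UUID.
            knownUuid = loadedUuid;
            metadataUpdater.accept(loadedUuid);
        }
    }

    synchronized String knownUuid() {
        return knownUuid;
    }
}

final class UuidFreshnessDemo {
    public static void main(String[] args) {
        List<String> updates = new ArrayList<>();
        UuidFreshnessSketch tracker = new UuidFreshnessSketch("uuid-1", updates::add);
        tracker.onRepositoryDataLoaded("uuid-1"); // unchanged, no update submitted
        tracker.onRepositoryDataLoaded("uuid-2"); // changed, one update submitted
        System.out.println(updates); // prints [uuid-2]
    }
}
```

Because the comparison happens on every load, a read-only registration would pick up a replaced UUID the next time the repository data is read rather than never.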

original-brownbear · 2021-01-21T15:24:30Z

Also there is some unfortunate interaction here with the S3 repo cooldown enforcement tests in https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+pull-request-1/16657/testReport/junit/org.elasticsearch.repositories.s3/S3BlobStoreRepositoryTests/testEnforcedCooldownPeriod/ (also needs some work around for the repo data loading dance in some form since we fake downgrade the repo there).

DaveCTurner · 2021-01-21T17:11:48Z

> Also there is some unfortunate interaction here with the S3 repo cooldown enforcement tests

NBD I think, it works to set ?verify=false.

DaveCTurner · 2021-01-21T18:07:45Z

Ugh the docs tests are kinda broken, they don't clear the repo and we use the same path everywhere so it ends up mostly having a UUID 😕

Edit: they don't even clear the repo between runs, yech, it's the same UUID every run!

DaveCTurner · 2021-01-22T10:21:26Z

#67841 will address the docs tests confusion.

original-brownbear

LGTM thanks !:)

tlrx

LGTM

DaveCTurner · 2021-01-25T10:03:17Z

Failure looks unrelated, I opened #67884.

@elasticmachine please run elasticsearch-ci/1

Today a snapshot repository does not have a well-defined identity. It can be reregistered with a different cluster under a different name, and can even be registered with multiple clusters in readonly mode. This presents problems for cases where we need to refer to a specific snapshot in a globally-unique fashion. Today we rely on the repository being registered under the same name on every cluster, but this is not a safe assumption. This commit adds a UUID that can be used to uniquely identify a repository. The UUID is stored in the top-level index blob, represented by `RepositoryData`, and is also usually copied into the `RepositoryMetadata` that represents the repository in the cluster state. The repository UUID is exposed in the get-repositories API; other more meaningful consumers will be added in due course. Backport of elastic#67829

This commit mostly reverts elastic#67934, except for the change to the version constant `REPOSITORY_UUID_IN_REPO_DATA_VERSION`. Completes the backport of elastic#67829 via elastic#67899

In elastic#67829 we introduced a `@Before` method to set up the repository for the docs tests. In fact this only needs to run once for the whole suite, not once per test. Relates elastic#67853

In #67829 we introduced a `@Before` method to set up the repository for the docs tests. In fact this only needs to run once for the whole suite, not once per test. Relates #67853 Co-authored-by: David Turner

DaveCTurner added the >enhancement, :Distributed Coordination/Snapshot/Restore, v8.0.0, and v7.12.0 labels Jan 21, 2021

DaveCTurner requested review from tlrx and original-brownbear January 21, 2021 14:52

elasticmachine added the Team:Distributed (Obsolete) label Jan 21, 2021

original-brownbear reviewed Jan 21, 2021

DaveCTurner added 5 commits January 21, 2021 16:40

Merge remote-tracking branch 'upstream/master' into 2021-01-21-reposi…

d3efa89

…tory-uuid

Fix docs examples

9de1352

Avoid verify when creating broken repo

9692b40

Remove UUID from BWC repository data in SLMSnapshotBlockingIntegTests

0b8b62d

Ensure UUID is fresh when loading a new RepositoryData

b680eb3

DaveCTurner requested a review from original-brownbear January 21, 2021 17:11

DaveCTurner added 3 commits January 22, 2021 11:01

Fork off the cluster applier thread

be71e72

Set up repo before running docs tests

65814c0

Corrupt encryption metadata before re-registering it, and expect fail…

ecd6ba4

…ure during verify

original-brownbear approved these changes Jan 25, 2021

tlrx approved these changes Jan 25, 2021

Merge branch 'master' into 2021-01-21-repository-uuid

093a80a

DaveCTurner merged commit e5a15d4 into elastic:master Jan 25, 2021

DaveCTurner deleted the 2021-01-21-repository-uuid branch January 25, 2021 12:17

DaveCTurner added the backport pending label Jan 25, 2021

DaveCTurner mentioned this pull request Jan 25, 2021

Introduce repository UUIDs #67899

Merged

DaveCTurner mentioned this pull request Jan 25, 2021

Reinstate BWC snapshot tests #67938

Merged

DaveCTurner added a commit that referenced this pull request Jan 25, 2021

Reinstate BWC snapshot tests (#67938)

06e1418

This commit mostly reverts #67934, except for the change to the version constant `REPOSITORY_UUID_IN_REPO_DATA_VERSION`. Completes the backport of #67829 via #67899

DaveCTurner removed the backport pending label Jan 25, 2021

mark-vieira mentioned this pull request Jan 26, 2021

[CI] DocsClientYamlTestSuiteIT execution performance degraded causing timeouts #67853

Closed

DaveCTurner mentioned this pull request Jan 26, 2021

Only populate snapshot repo once in docs tests #68013

Merged

original-brownbear mentioned this pull request Jan 26, 2021

Only populate snapshot repo once in docs tests (#68013) #68016

Merged

jakelandis added v8.0.0-alpha1 and removed v8.0.0 labels Jul 26, 2021
