Creating a snapshot fails at the very last stage with NullPointerException #29649
Comments
Thanks @redlus. It looks similar to another issue we saw. I'm going to take a closer look at this.
Thanks. Is there an open issue for what you mentioned?
Pinging @elastic/es-distributed
@tlrx a quick update: This seems to indicate that repositories and snapshots created with Elasticsearch 5.2.2 have some collision with create / restore / delete snapshot commands run on the same repos from Elasticsearch 6.2.3.
Just to double check, does it mean that two clusters on different versions were writing to the same repository (Azure container) at the same time?
There were two phases. First, v5.2.2 successfully created snapshots and v6.2.3 successfully restored them. Then the old v5.2.2 cluster was taken down, and v6.2.3 tried to create snapshots in the same repo, resulting in the NullPointerException above.
Thanks for your responses. In your case, you snapshotted data from 5.2.2 and restored it on 6.2.3. We usually advise upgrading to the latest 5.x version first and then moving to the next major version (see https://www.elastic.co/guide/en/elasticsearch/reference/6.2/setup-upgrade.html). Also, according to the documentation:
It means that a repository created by the 5.2.2 cluster must only be used by that 5.2.2 cluster for creating or deleting snapshots. The 6.2.3 cluster can register the 5.2.2 repository too, but only as read-only. This way the 6.2.3 cluster can restore data from 5.2.2 but cannot create or delete snapshots. If you want to create snapshots of the 6.2.3 cluster, you have to register a new repository in a different container/bucket/folder, as you did.
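As a rough illustration of that setup, the sketch below builds the two repository registration requests that the 6.2.3 cluster would send: one for the old repository with `readonly` set to true, and one for a new repository in a separate container. The repository and container names are hypothetical placeholders, and the code only prints the request bodies rather than contacting a cluster.

```python
import json


def repository_request(name, container, readonly):
    """Build the method, path and JSON body for registering an Azure snapshot repository."""
    body = {
        "type": "azure",
        "settings": {
            "container": container,  # Azure container that holds the snapshots
            "readonly": readonly,    # True: restore only, no create/delete
        },
    }
    return "PUT", f"/_snapshot/{name}", body


# Hypothetical names, chosen only for this example.
old_repo = repository_request("backups_5x", "es-backups-5x", readonly=True)
new_repo = repository_request("backups_6x", "es-backups-6x", readonly=False)

for method, path, body in (old_repo, new_repo):
    print(method, path)
    print(json.dumps(body, indent=2))
```

Each printed body can be sent to the 6.2.3 cluster with any HTTP client. The important part is that the two repositories point at different containers and that only the new one is writable.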
Finally found it :) See #29052, where a user had an issue similar to yours, and pull request #26127, which fixed issue #25878. I'm going to close this issue in favor of #29052.
Ok, thanks for the update.
Hi,
Creating a snapshot of a single index is reported as finished correctly by _snapshot/_status ("state": "SUCCESS"), but a NullPointerException is thrown and the snapshot never shows up in the repository.
Elasticsearch version:
6.2.3
Plugins installed:
ingest-attachment
ingest-geoip
mapper-murmur3
mapper-size
repository-azure
repository-gcs
repository-s3
JVM version:
openjdk version "1.8.0_151"
OpenJDK Runtime Environment (build 1.8.0_151-8u151-b12-0ubuntu0.16.04.2-b12)
OpenJDK 64-Bit Server VM (build 25.151-b12, mixed mode)
OS version:
Linux 4.13.0-1011-azure #14-Ubuntu SMP 2018 x86_64 GNU/Linux
Description of the problem including expected versus actual behavior:
We recently upgraded our Elasticsearch cluster from v5.2.2 to v6.2.3. We ran both clusters in parallel and used the snapshot feature on Azure blob storage to replicate the data to the new cluster. Since the migration, snapshot creation tasks run daily as backups from the new v6.2.3 cluster into the same repositories. Starting a snapshot succeeds, copying the data to Azure takes the expected amount of time, and _snapshot/_status even reports "state": "SUCCESS":
After the snapshot creation finishes (_snapshot/_status returns an empty array), the snapshot does not show up in the repository and a NullPointerException is thrown in the master node logs:
This has been reproduced with multiple create-snapshot operations on multiple repositories.
Additionally, deleting the repository (without individually deleting the snapshots it contains) and re-creating it (thereby loading the available snapshots from 5.2.2) did not solve the problem.
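Because _snapshot/_status can report success while the snapshot never lands in the repository, a backup job may want to confirm the result by listing the repository contents (for example with GET _snapshot/<repository>/_all) instead of relying on the status alone. Here is a minimal sketch of that check, assuming the listing response has a top-level "snapshots" array whose entries carry "snapshot" and "state" fields; the sample response and snapshot names are made up for illustration.

```python
def snapshot_is_listed(list_response, snapshot_name):
    """Return True only if the named snapshot appears in the listing with state SUCCESS."""
    for entry in list_response.get("snapshots", []):
        if entry.get("snapshot") == snapshot_name:
            return entry.get("state") == "SUCCESS"
    # Snapshot is missing from the repository listing altogether.
    return False


# Made-up sample listing: only an older snapshot is present.
sample_listing = {
    "snapshots": [
        {"snapshot": "daily-old", "state": "SUCCESS"},
    ]
}

print(snapshot_is_listed(sample_listing, "daily-old"))
print(snapshot_is_listed(sample_listing, "daily-new"))
```

In the failure described here, the second call mirrors what we see: the new snapshot is absent from the listing even though the status endpoint reported success earlier.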
I'd be happy to provide any additional information as needed.
Thanks!