Encrypted blob store repository #50846
Conversation
FYI I have made the following changes to the main code, besides the changes suggested above in the review comments:
Thanks @albertzaharovits, I found a few more things (still not a full review pass I'm afraid, but I'll get to it shortly :)). The build/license situation is a problem though, I think, and it's probably good to address/refactor it asap as this may be a little tricky to get right.
for (int i = 0; i < 32; i++) {
    names.add("test-repo-" + i);
}
repositoryNames = Collections.synchronizedList(names);
Does this list have to be synchronized? No, right?
It does, because the test methods in the class can run in parallel and there should not be two tests that use the same repository.
Nah I don't think we can have two test methods from the same class run concurrently in the same JVM. How would that work with the single static reference to the internal node/cluster and its cleanup after every test?
Makes sense, I don't know what made me believe otherwise.
@@ -54,6 +54,8 @@ dependencies {
compile 'com.google.apis:google-api-services-storage:v1-rev20190426-1.28.0'

testCompile project(':test:fixtures:gcs-fixture')
// required by the test for the encrypted GCS repository
testCompile project(path: ':x-pack:plugin:repository-encrypted', configuration: 'testArtifacts')
I'm not the right person to judge this, but this pattern in the build seems troubling to me. We are now using non-Apache-licensed code in the tests of Apache-licensed code. Is that OK? Maybe it's better to run these tests in a separate module downstream?
You are definitely using this dependency in Apache-licensed files; this seems wrong.
We discussed this synchronously, and the tests requiring this might not be relevant anymore, but I'm presenting my counter-argument for posterity.
I see your point, and I hadn't looked at it like this. I had minor qualms about having tests for non-Apache-licensed code be Apache licensed. The fact that we pull non-Apache test artifacts is a slightly different point.
But I tend to dismiss the Apache purists who get offended if the tests require an artifact dependency which is not Apache licensed. It feels squarely like a mean-spirited exaggeration; if we don't publish the tests jar, they should not even notice.
When I went down this path, I worried that the reverse alternative (having the encrypted repo depend in tests on the cloud repository) would entail more project setup boilerplate. I'm not so sure about that right now, and I think it is beneficial to keep these tests inside the encrypted-repo module, for subjective reasons of code organization.
// blobs are larger after encryption
Map<String, BlobMetaData> blobsWithSizeAfterEncryption = new HashMap<>();
blobs.forEach((name, meta) -> {
    blobsWithSizeAfterEncryption.put(name, new BlobMetaData() {
NIT: Probably shorter to just do:
new PlainBlobMetaData(meta.name(), EncryptedRepository.getEncryptedBlobByteLength(meta.length()));
here :)
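For context, a hedged sketch of how the suggested one-liner could slot into the loop from the diff above; `PlainBlobMetaData` and `EncryptedRepository.getEncryptedBlobByteLength` are the names referenced in the comment, so their exact signatures are assumptions here.

```java
// Sketch only: replaces the anonymous BlobMetaData with the suggested PlainBlobMetaData,
// assuming getEncryptedBlobByteLength(long) returns the blob size after encryption.
Map<String, BlobMetaData> blobsWithSizeAfterEncryption = new HashMap<>();
blobs.forEach((name, meta) -> blobsWithSizeAfterEncryption.put(
    name,
    new PlainBlobMetaData(meta.name(), EncryptedRepository.getEncryptedBlobByteLength(meta.length()))));
```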
Alright finally had some quiet time to go over this in-depth :)
I think the main remaining point for me in production code is the delete efficiency. We shouldn't be adding new LIST operations to bulk deletes unless we absolutely can't avoid them. I put some more details on this in-line. But what it boils down to from my perspective is this:
We have blobs that can be overwritten and for those we need the meta-id at the beginning of the blob which makes deletes and reads a little tricky since we don't know the meta path straight from the blob itself.
The vast majority of blobs though (everything but `meta-` and `snap-`) are never overwritten, and for those we don't need the meta-id magic. They can be deleted as efficiently as any other blob (or at least half as efficiently) because we could technically make the naming of the metadata fixed for them.
I know we already have the `index.latest` special case, but right now special-casing `snap-` and `meta-` blobs looks like the only way of achieving happiness here. We can then work on removing the possibility of overwriting these blobs from the core blob-store implementation (`meta-` is already done pretty much and `snap-` is easy) and get rid of the complication with the meta id eventually.
final InputStream encryptedDataInputStream = delegatedBlobContainer.readBlob(blobName);
try {
    // read the metadata identifier (fixed length) which is prepended to the encrypted blob
    final byte[] metaId = encryptedDataInputStream.readNBytes(MetadataIdentifier.byteLength());
NIT: maybe just pre-allocate the array and read into it since this API can't be backported to 7.x
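A minimal sketch of the pre-allocate-and-read variant mentioned in the NIT, using only the plain `InputStream.read(byte[], int, int)` API; the surrounding names come from the diff above, everything else is an assumption.

```java
// Sketch: read exactly MetadataIdentifier.byteLength() bytes into a pre-allocated array,
// failing if the stream ends early (readNBytes(int) is a Java 11 API).
final byte[] metaId = new byte[MetadataIdentifier.byteLength()];
int offset = 0;
while (offset < metaId.length) {
    final int read = encryptedDataInputStream.read(metaId, offset, metaId.length - offset);
    if (read == -1) {
        throw new IOException("Premature end of blob while reading the encrypted blob metadata identifier");
    }
    offset += read;
}
```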
// read the metadata identifier (fixed length) which is prepended to the encrypted blob
final byte[] metaId = encryptedDataInputStream.readNBytes(MetadataIdentifier.byteLength());
if (metaId.length != MetadataIdentifier.byteLength()) {
    throw new IOException("Failure to read encrypted blob metadata identifier");
Maybe make this `<` and make it clear in the exception message that the blob is corrupted/too short?
(Maybe we could technically get here when mounting an unencrypted repository? Should we mention that possibility in the exception message?)
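A hedged sketch of what the suggested check and message could look like, reusing the names from the diff above; the exact wording is illustrative only.

```java
// Sketch: treat a short read as a corrupted/truncated blob (or a blob that was never
// written by the encrypted repository, e.g. when mounting an unencrypted repository).
if (metaId.length < MetadataIdentifier.byteLength()) {
    throw new IOException("Blob [" + blobName + "] is too short to contain the encryption metadata identifier; "
            + "it is either corrupted or was not written by an encrypted repository");
}
```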
// find all the blob names that must be deleted
Set<String> blobNamesSet = new HashSet<>(blobNames);
Set<String> blobNamesToDelete = new HashSet<>();
for (String existingBlobName : delegatedBlobContainer.listBlobs().keySet()) {
This we shouldn't do. We should just optimistically delete the blobs and assume they're all there. It's really costly to run this listing, and it may not be accurate on S3, which would cause you to not even try to delete a certain blob if it doesn't show up in the listing, I think?
> We should just optimistically delete the blobs and assume they're all there.

There is a problem with this approach:
For every existing data blob there must be a metadata blob already visible, because metadata blobs are written first. For every blob name, you must delete both the meta and the data blobs. But because the meta blobs are written first, it's possible that a `deleteBlob` operation deletes just the meta and fails to delete the data, because the data is not visible yet and might appear later.
This approach makes sure that you only delete the meta of existing data blobs, so that no corruption is possible. Am I right?

> it may not be accurate on S3 and would cause you to not try and delete a certain blob if it doesn't show up in the listing

I believe that the limitation of deleting only the blobs you can list is OK.
Let's discuss, maybe we can come up with a more performant alternative.
// Deleting metadata when there are multiple metadata blobs for the same blob is unsafe, so don't try it now.
// It is unsafe because metadata "appears" before the data and there could be an overwrite in progress for which only
// the metadata, but not the encrypted data, shows up.
List<String> metadataBlobNamesToDelete = new ArrayList<>(blobNamesToMetadataNamesToDelete.size());
This doesn't matter IMO. On e.g. S3 we have no ordering guarantees at all anyway so we should just delete everything here. There is no point in trying to protect ourselves against concurrency issues at this layer. If the caller wants a certain blob name gone, the fact that someone may be writing this very same blob name right now is never an issue due to the repository consistency measures we have in place for blob naming.
The code here is mostly to protect against the write-after-delete cases.
I understand this is not a concern because of the code higher up, but I would still keep this protection.
I don't think discounting that case buys us anything with respect to efficiency of cloud API calls.
Map<String, List<String>> blobNamesToMetadataNamesToDelete = new HashMap<>(blobNamesToDelete.size());
Set<String> allMetadataBlobNames = new HashSet<>();
try {
    allMetadataBlobNames = encryptionMetadataBlobContainer.listBlobs().keySet();
This could become an issue as it makes deleting very expensive for large paths.
I think we must find a cleaner way of dealing with this. One solution I see would be to not use the UUID-based metadata naming for all blobs. We could just do it for `snap-` and `meta-` blobs (these are the only ones that can technically get overwritten). I know it's another special case, but this would allow us to simply delete all other blobs' metadata without listing the metadata path. For the `snap-` and `meta-` blobs we can then do a listing including the prefix in the meta directory, and it's effectively always a small listing.
WDYT?
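A hypothetical sketch of the special-casing discussed above: non-overwritable blobs get a fixed, derivable metadata name (so their metadata can be deleted without any listing), while `snap-` and `meta-` blobs keep a per-write unique suffix. The naming format itself is an assumption for illustration, not the PR's code.

```java
// Sketch only: derive the metadata blob name from the data blob name.
static String metadataBlobName(String blobName, String metaId) {
    if (blobName.startsWith("snap-") || blobName.startsWith("meta-")) {
        // overwritable blobs: the per-write id must be resolved from the data blob itself,
        // but deletes only need a small prefix listing in the metadata container
        return blobName + "." + metaId;
    }
    // never-overwritten blobs: fixed mapping, so deletes require no listing at all
    return blobName;
}
```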
Given some of the concerns enunciated above by Armin (i.e. the performance problem of deleting metadata), as well as other issues related to overwriting blobs, I've questioned again the mapping relationship of metadata to encrypted data. I think I've found a simpler and more efficient way to do it, and after discussing it with Armin, I've got his endorsement as well!

The idea is to reuse the data encryption key for as many blobs as possible, not only a single one as it stands right now. In this case, on every node, the repository lazily generates a random DEK upon a blob write. It then reuses the key, with monotonically increasing IVs, for every subsequent blob write.

The only drawback of this approach is tracing the inverse relationship from metadata to blobs. But it's expected that there will be few metadata blobs (DEKs) for the entire lifetime of the repository.

@tvernum Please let me know what you think about this approach. I'm very excited to start implementing it ASAP.
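A minimal, hypothetical sketch of the DEK-reuse idea described above: one randomly generated AES-256 key per node, reused across blob writes with a monotonically increasing IV counter so that no (key, IV) pair is ever repeated. Class and method names are illustrative, not the PR's implementation.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicLong;

final class ReusedDek {
    private final SecretKey dek;
    private final AtomicLong ivCounter = new AtomicLong();

    ReusedDek() throws Exception {
        // one randomly generated AES-256 data encryption key, reused for many blob writes
        KeyGenerator keyGen = KeyGenerator.getInstance("AES");
        keyGen.init(256);
        this.dek = keyGen.generateKey();
    }

    // each blob write gets a fresh cipher whose 96-bit nonce encodes the next counter value,
    // so the same (key, IV) pair is never reused as long as the counter does not wrap around
    Cipher newEncryptionCipher() throws Exception {
        long counter = ivCounter.getAndIncrement();
        byte[] iv = ByteBuffer.allocate(12).putLong(4, counter).array();
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, dek, new GCMParameterSpec(128, iv));
        return cipher;
    }
}
```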
The relationship between the encrypted data and the encryption key is obviously critically important to the design of the encrypted repository. There are many constraints on the design, which cannot be discussed at length in a GH comment, but I believe briefly summarizing them is necessary to frame the thinking around this topic. In no particular order:
@tvernum and I brainstormed on the improvement options over #50846 (comment) for a way to track the blob containers that a key was used for (given the many blobs that can be encrypted with the same key, tracking blob containers is a better trade-off, especially because cleanup also works at the "indices" blob container level). But we've agreed that an imperfect cleanup is something we can improve on later, possibly even acknowledging it in the docs.

One breakthrough observation Tim made is that the whole indirection story with the data encryption key (not using the repository password to generate the encryption key directly) exists to give a better key rotation experience, where you don't have to decrypt and re-encrypt the whole data. But the key rotation facilitated by the DEK indirection is itself an imperfect solution. That's because if the password was compromised before it was rotated, the attacker can decrypt the DEKs and will still have access to the data even after the rotation. Even though new data (new snapshots) will not be compromised, and the compromised data might already be on the attacker's server, the user might still be surprised that the attacker can still decrypt, after the rotation, the old data from before the key rotation. So decrypting and re-encrypting the whole data once in a while, at large intervals of time (months to years), by recreating the repository with the new password, might still be desirable, at which point accumulated encryption metadata garbage might cease to be a problem.
After the above discussion, I got the idea of storing the encryption keys several times, creating redundant copies of the same key, once for every blob container (containing blobs that are encrypted by said key). If the key copy is removed every time a blob container is removed, then all the key copies will be automatically removed when all the containers using the key are removed. This redundancy might prove reasonable with respect to space and blob count overhead, but might not work for key rotation. I need to think through it a bit more.
Another consequence of this (that we should track somewhere) is that password rotation should also trigger new DEKs, so that we can guarantee that new snapshots will not be accessible by an attacker who previously exfiltrated the password and used it to retrieve the existing DEKs.
I have a rough plan for the strategy to reuse DEKs.

As discussed, the DEKs will be reused locally by the node. A new DEK is generated upon repository creation, on the local node, and used for all the blobs encrypted under the repository. The DEK continues to be used for the lifetime of the repository. It can however be discontinued, and a new one generated, in the following cases: the IV space is exhausted, the usage log exceeds an internal size bound, or the key has been used for more than a maximum allowed lifetime, measured in repository generations; the last two conditions are discussed later on.

A snapshot references files across multiple nodes. Therefore, to facilitate the restore operation, the encrypted blob indicates which DEK must be used to decrypt it. This is achieved by prepending the name of the DEK (a UUID) to the encrypted content of the blob. Under normal operating circumstances there will be a small number of active DEKs, one for each data and master node, which are reused for the lifetime of the nodes (the maximum number of encryptions with the same key is 2^32, or even 2^64).

The inverse association, i.e. given a key, trace back the blobs encrypted with it, is only required during the cleanup operation, so that the keys used for blobs no longer referenced by the existing snapshots are removed as well. The following explains how this association is maintained and which trade-offs have been assumed. Given the potentially large number of blobs, the very small size of a key, and the existing resolution of the cleanup procedure (which only removes unreferenced root blobs and unreferenced indices blob containers), the association would actually record the index ids for every DEK. The cleanup uses this information to remove the DEKs which have only touched indices which are not referenced anymore. More precisely, a given key is eligible for cleanup if its usage log does not contain "live" indices.

But there is an important corner case. The view of the usage log might not be complete during a cleanup, because the node terminated abruptly (or the connection to the cloud service failed) and because of the dreaded "eventual consistency" issue. To circumvent this, the DEK has a maximum lifetime in terms of repository generations, and the usage log is considered complete only after the DEK lifetime plus some extra interval has been exceeded.
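A small, hedged sketch of the blob layout described in the plan above: the DEK's name (a UUID) is prepended to the encrypted content so that restore can look up the right key. The helper class and its framing are assumptions for illustration only.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

final class DekTaggedBlob {
    // Sketch only: prepend the fixed-length DEK UUID (36 ASCII chars) to the encrypted bytes;
    // a reader strips the first 36 bytes to learn which DEK decrypts the rest.
    static byte[] withDekId(String dekUuid, byte[] encryptedContent) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream(36 + encryptedContent.length);
        out.write(dekUuid.getBytes(StandardCharsets.US_ASCII));
        out.write(encryptedContent);
        return out.toByteArray();
    }

    static String dekIdOf(byte[] blobBytes) {
        return new String(blobBytes, 0, 36, StandardCharsets.US_ASCII);
    }
}
```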
@albertzaharovits the plan above sounds fine to me. I wonder though, is there any strong argument at this point against simply never deleting any metadata/keys? The storage cost seems trivial, and I still don't see any other argument for not just keeping these around forever; the above plan thus seems to me like it would create non-trivial complexity in the implementation for no tangible benefit?
I think it would be reasonable to accept that we don't delete DEKs at this point in time. However, much of the plan above is needed even in that case, so to make sure we're all on the same page when considering "work that could be avoided" if we never delete DEKs, I believe the rough outline of what it would mean is:

We obviously need to refresh DEKs under some circumstances, due to the IV space. I think for good security measures we also ought to rotate them after some length of time (which can be measured by repo generations rather than clock time). And we want them to be in-memory only and tied to a single repo, so they also need to be regenerated on repository creation and node startup.

We would therefore avoid writing out the usage log whenever it is added to, since there would be no usage log. We would clearly also skip the cleanup steps, since we would have no usage log to compare against.

Of the above, the cleanup steps seem like the larger amount/complexity of work and the usage log seems relatively small/simple. I would propose that we implement the usage log if we have time, but don't implement the cleanup steps. That feels like it saves most of the complexity of the implementation, but still leaves open the opportunity to implement it in the future if we find that our analysis of the overhead of orphaned DEKs was incorrect.
@tvernum @original-brownbear I appreciate your input and thinking. Thank you!

I will be raising a different PR, asap, analogous to this one (this one has too much complexity fluff associated with the one-to-one key to data mapping, and it could be useful for reference as it stands).

This conversation made me think about another way of reusing DEKs. Please spend some thought on it as well; it has interesting trade-offs compared to the universal DEK reuse strategy we've been discussing.

In a similar way, the DEKs are generated randomly and managed locally on the node (the same strategy: generate them when the repository starts, keep them in memory only, store them on the cloud metadata container encrypted with the repo password, and discontinue them when the IV space is exhausted or the node shuts down). The difference is that the node does not reuse DEKs across indices. More precisely, when DEKs are bound to be used only for blobs of a specific index id (which can be part of many snapshots), the name of the DEK blob can be made to contain that index's id.

In this case, the cleanup is very simple and efficient. This is the main advantage of this design. The cleanup already computes the set of referenced "live" index ids, and the encrypted repo's cleanup, in addition, does a listing of the DEKs container and deletes the DEK blobs that, given their name, are not used for live indices. The only other blobs which are not part of an index are stored in the root container. The DEKs used to encrypt those have a different naming scheme and are cleaned up the hard way (read each live blob, identify the DEK) or not at all (they are few, and only "one" DEK is used at a time to encrypt them, because only the master writes to the root container).

This alternative design is comparable in implementation effort to the usage log option, because it is easy to differentiate which blobs are written under an index id (this writing is done through a separate blob container reference). The main drawback is that this generates more DEKs: one for every new index (worst case), even the smallest one. I'm not too worried about it though; the DEK generation burden is shared among the cluster nodes, and DEKs can also be generated and pooled in a background thread. The larger number of DEKs also increases the cost of password rotation, although the comparison against the universally reused DEKs option is difficult to make.

To me, this sounds like a better alternative to both universally reusing DEKs and reusing DEKs with a usage log. But it might just be tickling the engineer in me who doesn't like "leaking" resources, rather than adding practical benefits. Please let me know your thoughts!
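To make the per-index DEK alternative above concrete, here is a hypothetical sketch of the cleanup step it enables; the `<indexId>.<dekUuid>` naming and the functional parameters are assumptions for illustration, not actual repository code.

```java
import java.util.Set;
import java.util.function.Consumer;

final class IndexDekCleanup {
    // Sketch only: delete DEK blobs whose name points at an index id that is no longer live.
    static void cleanUp(Set<String> dekBlobNames, Set<String> liveIndexIds, Consumer<String> deleteBlob) {
        for (String dekBlobName : dekBlobNames) {
            int dot = dekBlobName.lastIndexOf('.');
            if (dot <= 0) {
                continue; // root-container DEKs use a different naming scheme and are not handled here
            }
            String indexId = dekBlobName.substring(0, dot);
            if (liveIndexIds.contains(indexId) == false) {
                deleteBlob.accept(dekBlobName); // this DEK only encrypted blobs of indices that no longer exist
            }
        }
    }
}
```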
We've migrated away from this approach to the one from #53352, hence closing.
The client-side encrypted repository is a new type of snapshot repository that internally delegates to the regular variants of snapshot repositories (of types Azure, S3, GCS, FS, and maybe others, but not yet tested). After the encrypted repository is set up, it is transparent to the snapshot and restore APIs (i.e. all snapshots stored in the encrypted repository are encrypted, no other parameters required). The encrypted repository is protected by a password stored in every node's keystore (which must be the same across the nodes). The password is used to generate a key encryption key (KEK), using the PBKDF2 function, which is used to encrypt (using the AES Wrap algorithm) other symmetric keys (referred to as DEKs, data encryption keys), which themselves are generated randomly and which are ultimately used to encrypt the snapshot blobs.

For example, here is how to set up an encrypted FS repository:

1) Make sure that the cluster runs under at least a "platinum" license (the simplest test configuration is to put `xpack.license.self_generated.type: "trial"` in the elasticsearch.yml file).
2) Identical to the un-encrypted FS repository, specify the mount point of the shared FS in the elasticsearch.yml conf file (on all the cluster nodes), e.g. `path.repo: ["/tmp/repo"]`.
3) Store the repository password inside the elasticsearch.keystore, *on every cluster node*. In order to support changing the password of an existing repository (implemented in a follow-up), the password itself must be named, e.g. for the "test_enc_pass" repository password name: `./bin/elasticsearch-keystore add repository.encrypted.test_enc_pass.password` (type in the password).
4) Start up the cluster and create the new encrypted FS repository, named "test_enc", by calling: `curl -X PUT "localhost:9200/_snapshot/test_enc?pretty" -H 'Content-Type: application/json' -d' { "type": "encrypted", "settings": { "location": "/tmp/repo/enc", "delegate_type": "fs", "password_name": "test_enc_pass" } }'`
5) The snapshot and restore APIs work unmodified when they refer to this new repository, e.g. `curl -X PUT "localhost:9200/_snapshot/test_enc/snapshot_1?wait_for_completion=true"`.

Related: #49896 #41910 #50846 #48221 #65768
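A hedged sketch of the password-to-key scheme summarized above (a PBKDF2-derived KEK wrapping randomly generated DEKs with AES Wrap). The salt handling, iteration count, and hash function are illustrative assumptions, not the actual parameters used by the repository.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;
import java.security.SecureRandom;

final class KekDekSketch {
    // Sketch only: derive a KEK from the repository password and wrap a fresh DEK with it.
    static byte[] generateAndWrapDek(char[] repositoryPassword, byte[] salt) throws Exception {
        // derive the key encryption key (KEK) from the repository password via PBKDF2
        SecretKeyFactory pbkdf2 = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA512");
        byte[] kekBytes = pbkdf2.generateSecret(new PBEKeySpec(repositoryPassword, salt, 10_000, 256)).getEncoded();
        SecretKey kek = new SecretKeySpec(kekBytes, "AES");
        // generate a random data encryption key (DEK) and wrap (encrypt) it with the KEK
        KeyGenerator keyGen = KeyGenerator.getInstance("AES");
        keyGen.init(256, new SecureRandom());
        SecretKey dek = keyGen.generateKey();
        Cipher aesWrap = Cipher.getInstance("AESWrap");
        aesWrap.init(Cipher.WRAP_MODE, kek);
        return aesWrap.wrap(dek); // the wrapped DEK is what gets stored in the repository
    }
}
```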
This builds upon the data encryption streams from #49896 to create an encrypted snapshot repository. The repository encryption works with the following existing repository types: FS, Azure, S3, GCS. The encrypted repository is password protected. The `platinum` license is required to snapshot to the encrypted repository, but no license is required to list or restore already encrypted snapshots.

Example how to use the encrypted FS repository: configure `path.repo: ["/tmp/repo"]` and create the `test_enc` repository.

Overview of how it works

Every data blob is encrypted (AES/GCM) independently with a randomly generated AES256 secret key. The key is stored in another metadata blob, which is itself encrypted (AES/GCM) with a key derived from the repository password. The metadata blob tree structure mimics the data blob tree structure, but it is rooted by the fixed blob container `encryption-metadata`.

I will detail more how each piece works by commenting in the source code.
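A tiny hypothetical illustration of the metadata layout described in the overview above: the metadata tree mirrors the data tree under the fixed `encryption-metadata` container. The exact path shapes are assumptions for illustration, not the PR's exact layout.

```java
// Sketch only (illustrative path shapes): the metadata blob for a data blob lives at the
// same relative path, but rooted under the fixed "encryption-metadata" container.
static String metadataPathFor(String dataBlobPath) {
    // e.g. "indices/ab12/0/__some-blob" -> "encryption-metadata/indices/ab12/0/__some-blob"
    return "encryption-metadata/" + dataBlobPath;
}
```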
Relates #49896
Relates #41910
Obsoletes #48221