libroach: encrypt data at rest #21580
nit: place the (The release note extraction assumes the release note can contain multiple paragraphs, so the entire text of your PR description goes into it, except when it's "none", which is why it's just a nit here.)
I haven't done a complete first pass, but I have some questions about the EnvContext/SwitchingProvider design so here are my initial comments.

Reviewed 3 of 30 files at r4, 1 of 33 files at r7, 1 of 4 files at r8, 13 of 30 files at r10, 1 of 2 files at r11, 2 of 5 files at r12, 3 of 6 files at r13.

c-deps/libroach/switching_provider.cc, line 24 at r6 (raw file):
You can use

c-deps/libroach/switching_provider.h, line 27 at r4 (raw file):
keps?

c-deps/libroach/switching_provider.h, line 28 at r4 (raw file):
When would you want to access base_env other than when creating db_env?

c-deps/libroach/switching_provider.h, line 30 at r4 (raw file):
s/EnvContext/EnvManager/? That's not a great name either, but "Env" and "Context" are nearly synonymous. (Also, when I saw EnvContext in another header, I wasn't sure which file would have its declaration, so having some word in common between EnvContext and SwitchingProvider would be helpful.)

c-deps/libroach/switching_provider.h, line 67 at r13 (raw file):
"Env levels" need some documentation. It looks like this should be an enum instead of an int.

c-deps/libroach/switching_provider.h, line 69 at r13 (raw file):
I don't see a

c-deps/libroach/rocksdbutils/aligned_buffer.h, line 4 at r1 (raw file):
The apache and bsd references here should be rewritten to refer to the copies in our

c-deps/libroach/rocksdbutils/env_encryption.h, line 76 at r10 (raw file):
Why did you introduce this env_level parameter that gets plumbed through EncryptedEnv to CreateCipherStream, instead of giving each EncryptedEnv a separate EncryptionProvider that knows its "level"?

pkg/ccl/cmdccl/enc_utils/main.go, line 9 at r5 (raw file):
Should this be linked into the main binary as a debug command instead of a separate executable? Or is it too much of a proof-of-concept to even be useful as a debug command? (Turning this into a unit test might be nice just to make sure that the encryption is happening as expected.)

pkg/ccl/cmdccl/enc_utils/main.go, line 28 at r5 (raw file):
I think these should just reference constants from engineccl.

Comments from Reviewable
Review status: 12 of 37 files reviewed at latest revision, 10 unresolved discussions.

c-deps/libroach/switching_provider.cc, line 24 at r6 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
I tried that but it couldn't find

c-deps/libroach/switching_provider.h, line 27 at r4 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
Done. "kept".

c-deps/libroach/switching_provider.h, line 28 at r4 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
The store-level Env (the one using the store key manager) uses the

c-deps/libroach/switching_provider.h, line 30 at r4 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
Sorry. Renaming to

c-deps/libroach/switching_provider.h, line 67 at r13 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
Switched to the

c-deps/libroach/switching_provider.h, line 69 at r13 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
moved to the mention of

c-deps/libroach/rocksdbutils/aligned_buffer.h, line 4 at r1 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
I've added them to the same directory. This feels more contained than putting everything in ours, and cleaner than doing a mix. This also has the advantage of having everything clearly in one place (the original intent of this subdirectory).

c-deps/libroach/rocksdbutils/env_encryption.h, line 76 at r10 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
The goal of the env_level is to allow for multiple cipher stream types and ensure that we can tell that all the required ones are present (OSS vs CCL plain vs CCL with encryption options). I'll update this PR to have the The SwitchingProvider itself is still useful as it provides the coordination with the file registry (and the added bonus of owning all creators). Ultimately, I also want to move plaintext (when using the switching env) to use a level 0 stream creator. This still bypasses block operations so should be fine to replace the

pkg/ccl/cmdccl/enc_utils/main.go, line 9 at r5 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
This is mostly proof-of-concept for now, hence the completely separate command. CLI interaction with rocksdb files should be properly-wrapped debug commands and perhaps some high-level encryption-related command in C++ (eg: show encryption settings and usage). I expect this code to disappear in favor of a utility used in Go-side testing.

pkg/ccl/cmdccl/enc_utils/main.go, line 28 at r5 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
Do you mean one set of constants on the Go side, and the same set of constants on the C++ side? Co-located with the proto could be ok, but either way we'll have two things to keep in sync.
I've addressed most review comments. I'll change the env_level use as described in a comment below. Should be in tomorrow morning.

Review status: 12 of 37 files reviewed at latest revision, 10 unresolved discussions.
Reviewed 4 of 30 files at r10, 1 of 5 files at r12, 7 of 12 files at r14, 7 of 7 files at r15.

c-deps/libroach/db.cc, line 1486 at r15 (raw file):
Why isn't this done in the CCL DBOpenHook? Then I don't think we'd need to expose switching_provider on the OSS side at all and can move it all over to CCL. (I guess we'd need a check something like (If this goes, I think we can get EnvManager over to the CCL side too, but I'll save that for the next round. I'm pushing for this not because I want to maximize how much stuff falls under the CCL but because I think straddling the divide is forcing some awkward design decisions and I don't think we need this many crossing points)

c-deps/libroach/env_manager.h, line 25 at r15 (raw file):
Can we just make a copy of Env::Default when we use it so that we can have everything be owned by the EnvManager instead of a mix of raw and unique pointers?

c-deps/libroach/rocksdbutils/aligned_buffer.h, line 4 at r1 (raw file):
Previously, mberhault (marc) wrote…
We want to include all the licenses for anything in the repo in the top-level

pkg/ccl/cmdccl/enc_utils/main.go, line 28 at r5 (raw file):
Previously, mberhault (marc) wrote…
I was thinking just one set of constants period, but I wasn't thinking about cross-language issues. It's not worth plumbing constants through cgo just to avoid this duplication.

pkg/storage/engine/enginepb/file_registry.proto, line 26 at r15 (raw file):
"Level" makes these sound ordered/comparable, but they're really not.

pkg/storage/engine/enginepb/file_registry.proto, line 53 at r15 (raw file):
Why isn't this inside the encryption_settings protobuf? We don't create FileEntries for plaintext files, so this would just indicate whether it's "store" or "data", which only the CCL side cares about. Combining this and my previous comment, I suggest moving this to the ccl key_registry.proto and renaming to something like KeyScope or KeyType.
Review status: 20 of 35 files reviewed at latest revision, 8 unresolved discussions, some commit checks failed.

c-deps/libroach/db.cc, line 1486 at r15 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
Because we absolutely must check whether encryption was used before we start up, regardless of the current build mode or flags. Otherwise the only indication is garbage files (eg: rocksdb's

c-deps/libroach/env_manager.h, line 25 at r15 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
We could have yet another env wrapper that only implements the things that can't be inherited and use that right away. The use of

c-deps/libroach/rocksdbutils/aligned_buffer.h, line 4 at r1 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
Done.

pkg/ccl/cmdccl/enc_utils/main.go, line 28 at r5 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
Yeah, I wish proto3 had constants, or even the proto2 "default" work-around.

pkg/storage/engine/enginepb/file_registry.proto, line 26 at r15 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
True. Right now, the higher level requires the lower level env, but if we add different types of ciphers we may want to give them their own levels (it's pointless to ask a CTR cipher to decode GCM cipher data). I could call them

pkg/storage/engine/enginepb/file_registry.proto, line 53 at r15 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
We don't, but there's no reason we couldn't save some (or all) plaintext files as well. Ultimately, it's to be able to have the OSS side properly handle mismatched levels. This will become even more important when we have other users of the encryption envs. Having just a blob we can't read is a bit unfriendly.
Review status: 17 of 35 files reviewed at latest revision, 8 unresolved discussions, some commit checks failed.

c-deps/libroach/db.cc, line 1486 at r15 (raw file):
Previously, mberhault (marc) wrote…
Right, but can we get that from the FileRegistry (

pkg/storage/engine/enginepb/file_registry.proto, line 26 at r15 (raw file):
Previously, mberhault (marc) wrote…

pkg/storage/engine/enginepb/file_registry.proto, line 53 at r15 (raw file):
Then we could store a bool for plaintext or not, or just infer it from the non-empty
What other users are you thinking about?
Review status: 0 of 37 files reviewed at latest revision, 8 unresolved discussions, some commit checks failed.

c-deps/libroach/db.cc, line 1486 at r15 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
We can ask the file registry how many it needs, but if I register the wrong ones (eg: eventually we have

pkg/storage/engine/enginepb/file_registry.proto, line 26 at r15 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
Done.

pkg/storage/engine/enginepb/file_registry.proto, line 53 at r15 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
True, but this is again about nicer error handling and misconfiguration detection. The other use I'm thinking of is the sideloading stuff. We may be able to reuse the env, but not always (eg: debug commands).
Review status: 0 of 37 files reviewed at latest revision, 7 unresolved discussions, some commit checks failed.

c-deps/libroach/db.cc, line 1486 at r15 (raw file):
Previously, mberhault (marc) wrote…
But that part of the validation can happen on the CCL side in DBOpenHook.

pkg/storage/engine/enginepb/file_registry.proto, line 53 at r15 (raw file):
Previously, mberhault (marc) wrote…
I'm not seeing how an int plus a blob on the OSS side is any better than just a blob. Any user of this is going to need to be able to see into the blob.
Review status: 0 of 37 files reviewed at latest revision, 7 unresolved discussions, some commit checks failed.

c-deps/libroach/db.cc, line 1486 at r15 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
Sure, I can check in the CCL hook 1) when I have no encryption args, and 2) when I do, then in OSS always. At some point, it seems simpler to have a single safeguard that all encryption objects are loaded before trying to start rocksdb.

pkg/storage/engine/enginepb/file_registry.proto, line 53 at r15 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
It keeps all the
Review status: 0 of 37 files reviewed at latest revision, 8 unresolved discussions, some commit checks failed.

c-deps/libroach/rocksdbutils/env_encryption.cc, line 322 at r26 (raw file):
I'm not sure passing a CipherStreamCreator is any better than passing the level. When I suggested getting rid of the level argument here, I was thinking of making the EncryptionProvider self-contained: there wouldn't be a SwitchingProvider any more. Instead, there would be a separate EncryptionProvider for each level. (The validation functionality now performed by SwitchingProvider would go in some "registry" object instead of in an EncryptionProvider.) Basically I'm trying to keep the stuff on the OSS side limited to "registry" type objects, and anything dealing with encryption, cipher streams, etc. on the CCL side.
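The "registry"-style validation debated in this thread (confirming, before RocksDB starts, that every env type recorded in the file registry has a matching env loaded) could be sketched roughly as follows. This is an illustration under assumptions, not code from the PR: `FileEntry`, `EnvType`, and `CheckRequiredEnvs` are hypothetical stand-ins for whatever the real registry exposes.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical tags for the env a file was written through.
enum class EnvType { Plain, Store, Data };

// Hypothetical, simplified registry entry: a file name plus the env type
// that is needed to read the file back.
struct FileEntry {
  std::string filename;
  EnvType env_type;
};

// Returns an empty string when every env type referenced by the registry
// has been registered, or an error message naming the first file whose
// env is missing.
inline std::string CheckRequiredEnvs(const std::vector<FileEntry>& registry,
                                     const std::set<EnvType>& registered) {
  for (const auto& entry : registry) {
    if (registered.count(entry.env_type) == 0) {
      return "no env registered for file: " + entry.filename;
    }
  }
  return "";
}
```

A check of this shape runs once at startup, so a build without the encryption envs fails with a readable message instead of handing RocksDB files it cannot decode.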
Dropped the switching provider in the last commit, making the encrypted env(s) directly own the cipher stream creators. Since the last review batch, some small tweaks to 1) fix path handling (rocksdb sometimes uses double slashes) and 2) dump more info in the Go tool (still need to figure out the long-term goal for this, most likely move it into debug commands).
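The PR's actual path-handling fix is not shown in this conversation. As a rough sketch of the kind of normalization that the double-slash remark implies, a registry could collapse repeated separators before using a path as a lookup key. The helper name `CollapseSlashes` is hypothetical.

```cpp
#include <string>

// Collapses runs of '/' into a single '/', so that "/data//000123.sst"
// and "/data/000123.sst" map to the same key. Hypothetical helper, not
// taken from the PR.
inline std::string CollapseSlashes(const std::string& path) {
  std::string out;
  out.reserve(path.size());
  for (char c : path) {
    if (c == '/' && !out.empty() && out.back() == '/') {
      continue;  // skip a separator that directly follows another one
    }
    out.push_back(c);
  }
  return out;
}
```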
This is looking pretty good.

Reviewed 8 of 30 files at r30, 1 of 2 files at r31, 1 of 5 files at r32, 3 of 18 files at r36, 1 of 2 files at r38, 4 of 12 files at r39, 2 of 4 files at r40, 1 of 5 files at r43, 1 of 1 files at r44, 16 of 16 files at r45.

c-deps/libroach/ccl/ctr_stream.cc, line 133 at r45 (raw file):
You'd be surprised what compilers can do. GCC and Clang can both turn this into something optimal provided both

c-deps/libroach/ccl/db.cc, line 42 at r45 (raw file):
Should

c-deps/libroach/ccl/db.cc, line 108 at r45 (raw file):
Remove these comments? I'm not too worried about people stumbling across unreleased enterprise features and using them.

c-deps/libroach/rocksdbutils/env_encryption.cc, line 22 at r45 (raw file):
Are there any particular parts of this file I should be looking at? It looks like it's mostly copied from the rocksdb version so I haven't gone through it too closely.

pkg/ccl/cmdccl/enc_utils/README.md, line 6 at r45 (raw file):
Coming back to this after a long time away, the use of 48-byte keys is striking (why can't we just use 16-byte keys again?). We should (in a subsequent PR) offer a

pkg/storage/engine/enginepb/file_registry.proto, line 53 at r15 (raw file):
Previously, mberhault (marc) wrote…
Now that SwitchingProvider is gone, is it still useful to have EnvType here?

pkg/storage/engine/enginepb/file_registry.proto, line 46 at r45 (raw file):
What kinds of files could be written to this registry that are outside the rocksdb dir? I assume the reason for relative paths is that we want to allow moving the rocksdb directory to a different path; could that be needed for other files too? We might need an enum of different roots and always store paths relative to some root.
I've added some cleanup/rename/todos to #19783, some of them are trivial but touch a lot of files.

Review status: 37 of 38 files reviewed at latest revision, 8 unresolved discussions.

c-deps/libroach/ccl/ctr_stream.cc, line 133 at r45 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
Yeah, I'm not too concerned, but cryptopp (and others) have optimized primitives for these kinds of things, it may be worth investigating depending on the benchmark/profiling results.

c-deps/libroach/ccl/db.cc, line 42 at r45 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
Sure. The main change now is the file registry. This touches the Go engine code as well, so I propose to do this in a followup.

c-deps/libroach/ccl/db.cc, line 108 at r45 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
Done.

c-deps/libroach/rocksdbutils/env_encryption.cc, line 322 at r26 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
Done.

c-deps/libroach/rocksdbutils/env_encryption.cc, line 22 at r45 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
The main bits are

pkg/ccl/cmdccl/enc_utils/README.md, line 6 at r45 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
This was when we switched to grabbing a key ID out of the key file itself as opposed to a hash of the key. The reason for the 256 bit ID length (which is definitely excessive) is to be able to tell from the size of the file what key size we're talking about and detect when the user generated the wrong size (eg: 16/24/32 bytes are invalid, you need 48/56/64). I'm happy to revisit this, either in this PR or later (later might be best, it's a small, easy change after all).

pkg/storage/engine/enginepb/file_registry.proto, line 53 at r15 (raw file):
Previously, bdarnell (Ben Darnell) wrote…

pkg/storage/engine/enginepb/file_registry.proto, line 46 at r45 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
The most obvious ones will be ingested sstables and temporary work dir. There's quite a bit of work left to do for those, so I left this unspecified for now. I propose to flesh this out properly in the next phase ("encrypt other uses of local disk")
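The key-file sizing rule from the README.md thread above (a 256-bit key ID stored in the key file, so valid files are 48, 56, or 64 bytes and bare 16/24/32-byte keys are rejected) can be sketched as a small size check. This is a sketch under assumptions: the conversation does not say how the ID and key are laid out inside the file, so only the total size is inspected, and `KeyLengthFromFileSize` and `kKeyIDLength` are hypothetical names.

```cpp
#include <cstddef>
#include <optional>

// Assumed from the discussion: the key file carries a 32-byte (256-bit)
// key ID in addition to a 16-, 24-, or 32-byte AES key.
constexpr std::size_t kKeyIDLength = 32;

// Returns the AES key length in bytes implied by the key file size, or
// std::nullopt when the size matches no supported combination.
// Hypothetical helper, not taken from the PR.
inline std::optional<std::size_t> KeyLengthFromFileSize(std::size_t file_size) {
  if (file_size <= kKeyIDLength) {
    // Covers bare 16/24/32-byte keys written without an ID.
    return std::nullopt;
  }
  const std::size_t key_length = file_size - kKeyIDLength;
  switch (key_length) {
    case 16:  // AES-128
    case 24:  // AES-192
    case 32:  // AES-256
      return key_length;
    default:
      return std::nullopt;
  }
}
```

Because the ID is as long as the largest key, no valid file size (48/56/64) collides with a bare key size (16/24/32), which is the misconfiguration the comment wants to detect.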
Ping? I think this is ready to go in post alpha sha picking.
Reviewed 8 of 30 files at r49, 1 of 2 files at r50, 1 of 5 files at r51, 3 of 18 files at r55, 1 of 2 files at r57, 4 of 12 files at r58, 2 of 4 files at r59, 1 of 5 files at r62, 1 of 1 files at r63, 12 of 16 files at r64, 2 of 2 files at r65.
Part of encryption-at-rest (see cockroachdb#19783)

Quick overview:
* file registry records encryption settings for encrypted files
* an encrypted env has a "env level" (plain, store, data) and a key manager as a source
* the data key manager uses an encrypted env with the store key manager as key source
* rocksdb uses an encrypted env with the data key manager as key source

Release note: none
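The layering in this overview can be pictured with a small structural sketch. It models only who owns what (no cipher streams, no file I/O), and every class and function name here is a hypothetical stand-in rather than the libroach API.

```cpp
#include <memory>
#include <string>
#include <utility>

// Env types named in the overview; unordered tags rather than levels.
enum class EnvType { Plain, Store, Data };

// Hypothetical key source for an encrypted env.
class KeyManager {
 public:
  virtual ~KeyManager() = default;
  virtual std::string CurrentKeyID() const = 0;
};

// Stand-in for an encrypted env: a type tag plus the key manager it
// draws keys from.
class EncryptedEnv {
 public:
  EncryptedEnv(EnvType type, std::shared_ptr<KeyManager> keys)
      : type_(type), keys_(std::move(keys)) {}
  EnvType type() const { return type_; }
  const KeyManager& key_manager() const { return *keys_; }

 private:
  EnvType type_;
  std::shared_ptr<KeyManager> keys_;
};

// Key source for the store-level env.
class StoreKeyManager : public KeyManager {
 public:
  explicit StoreKeyManager(std::string key_id) : key_id_(std::move(key_id)) {}
  std::string CurrentKeyID() const override { return key_id_; }

 private:
  std::string key_id_;
};

// Key source for the data-level env. It holds a store-level encrypted env,
// mirroring "the data key manager uses an encrypted env with the store key
// manager as key source".
class DataKeyManager : public KeyManager {
 public:
  DataKeyManager(std::shared_ptr<EncryptedEnv> store_env, std::string key_id)
      : store_env_(std::move(store_env)), key_id_(std::move(key_id)) {}
  std::string CurrentKeyID() const override { return key_id_; }
  const EncryptedEnv& store_env() const { return *store_env_; }

 private:
  std::shared_ptr<EncryptedEnv> store_env_;
  std::string key_id_;
};

// Builds the env that RocksDB would be opened with in this sketch:
// data env -> data key manager -> store env -> store key manager.
inline std::shared_ptr<EncryptedEnv> BuildDataEnv(const std::string& store_key_id,
                                                  const std::string& data_key_id) {
  auto store_keys = std::make_shared<StoreKeyManager>(store_key_id);
  auto store_env = std::make_shared<EncryptedEnv>(EnvType::Store, store_keys);
  auto data_keys = std::make_shared<DataKeyManager>(store_env, data_key_id);
  return std::make_shared<EncryptedEnv>(EnvType::Data, data_keys);
}
```

The chain is strictly one-directional: the data env depends on the store env through the data key manager, while the store env knows nothing about the data layer.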
bors r+
21580: libroach: encrypt data at rest r=mberhault a=mberhault

Part of encryption-at-rest (see #19783)

Quick overview:
* file registry records encryption settings for encrypted files
* an encrypted env has a "env level" (plain, store, data) and a key manager as a source
* the data key manager uses an encrypted env with the store key manager as key source
* rocksdb uses an encrypted env with the data key manager as key source

Release note: none

25240: distsqlrun: forward-port regression test for topk panic r=jordanlewis a=jordanlewis

Release note: None

Co-authored-by: marc <[email protected]>
Co-authored-by: Jordan Lewis <[email protected]>
Build succeeded
Cleanup following cockroachdb#21580, the switching env went away. Release note: None
25248: encryption: rename "switching env" format version to "file registry". r=mberhault a=mberhault

Cleanup following #21580, the switching env went away. This is a rename only, the only thing persisted was the format version iota, and that has not changed.

Release note: None

Co-authored-by: marc <[email protected]>