ccl/storageccl/engineccl: crash testing #96670
Labels
A-storage: Relating to our storage engine (Pebble) on-disk storage.
C-cleanup: Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior.
quality-friday: A good issue to work on on Quality Friday
T-storage: Storage Team
Comments
jbowens added the C-cleanup, A-storage, and T-storage labels on Feb 6, 2023
sumeerbhola added commits to sumeerbhola/cockroach that referenced this issue on Jul 26, Jul 27, and Jul 28, 2023, each with the same message:

The encryptedFS can return an error after doing part of the work, as modifying the encryption metadata and the underlying FS is not atomic. This makes some operations (rename, link, remove) non-idempotent, which is harmless for the CockroachDB use cases (since they don't retry on the same files). The test works around these by retrying in a way that makes them idempotent. Additionally, the test catches panics caused by FS errors, in order to test a node that crashes because of a panic caused by a transient error, and is subsequently restarted.

Epic: none
Informs: cockroachdb#96670
Release note: None
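The non-idempotence described in the commit message can be illustrated with a minimal Go sketch. Everything here is hypothetical: `toyEncFS`, `errInjected`, and `renameIdempotent` are invented for illustration and are not the encryptedFS code or the test's helpers. The sketch assumes a rename updates the encryption metadata first and the underlying file second, so an error injected between the two steps leaves the rename half applied and a plain retry fails.

```go
package main

import (
	"errors"
	"fmt"
)

// errInjected stands in for a transient error from the underlying FS.
var errInjected = errors.New("injected error")

// toyEncFS is a hypothetical stand-in for an encrypted FS: file contents and
// per-file encryption metadata live in separate maps, updated in two steps.
type toyEncFS struct {
	files    map[string]string
	meta     map[string]string
	failNext bool // when set, the next Rename fails between its two steps
}

// Rename moves the metadata entry first and the file second (an assumed
// ordering). An injected error between the steps leaves a partial rename.
func (fs *toyEncFS) Rename(oldName, newName string) error {
	m, ok := fs.meta[oldName]
	if !ok {
		return fmt.Errorf("rename %s: no metadata entry", oldName)
	}
	fs.meta[newName] = m
	delete(fs.meta, oldName)
	if fs.failNext {
		fs.failNext = false
		return errInjected
	}
	fs.files[newName] = fs.files[oldName]
	delete(fs.files, oldName)
	return nil
}

// renameIdempotent retries Rename once after an injected error. Before the
// retry it rolls the metadata back so the retry starts from the pre-rename
// state, which is what makes the retry safe.
func renameIdempotent(fs *toyEncFS, oldName, newName string) error {
	err := fs.Rename(oldName, newName)
	if !errors.Is(err, errInjected) {
		return err // success, or an error this sketch does not retry
	}
	// Partial state: the metadata moved but the file did not.
	if _, fileStillOld := fs.files[oldName]; fileStillOld {
		if m, ok := fs.meta[newName]; ok {
			fs.meta[oldName] = m
			delete(fs.meta, newName)
		}
	}
	return fs.Rename(oldName, newName)
}

func main() {
	fs := &toyEncFS{
		files:    map[string]string{"a": "data"},
		meta:     map[string]string{"a": "key1"},
		failNext: true,
	}
	err := renameIdempotent(fs, "a", "b")
	fmt.Println(err, fs.files["b"], fs.meta["b"])
}
```

Calling `fs.Rename` a second time without the rollback would return the "no metadata entry" error, because the first attempt already removed the source metadata. That is the kind of partial state a retrying test has to repair.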
craig bot pushed a commit that referenced this issue on Aug 1, 2023:

107618: engineccl: add randomized error injector test for encryptedFS r=jbowens a=sumeerbhola

The encryptedFS can return an error after doing part of the work, as modifying the encryption metadata and the underlying FS is not atomic. This makes some operations (rename, link, remove) non-idempotent, which is harmless for the CockroachDB use cases (since they don't retry on the same files). The test works around these by retrying in a way that makes them idempotent. Additionally, the test catches panics caused by FS errors, in order to test a node that crashes because of a panic caused by a transient error, and is subsequently restarted.

Epic: none
Informs: #96670
Release note: None

107927: roachtest: ignore some ORM tests r=rafiss a=rafiss

fixes #107698
fixes #107849
fixes #107861
fixes #107869
Release note: None

Co-authored-by: sumeerbhola
Co-authored-by: Rafi Shamim
Pebble performs crash testing using vfs.NewStrictMem, a vfs.FS filesystem that intentionally loses all data that is not synced. This is invaluable within Pebble for finding durability bugs. We don't currently perform the same type of testing in Cockroach to test encryption-at-rest. We should improve our test coverage here.

We could consider running the Pebble metamorphic tests with encryption-at-rest if we converted them to an externally-runnable library.
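To make the "loses all data that is not synced" behavior concrete, here is a minimal self-contained Go sketch of the idea. It is a toy model, not Pebble's implementation: `strictMem`, `strictFile`, and their methods are hypothetical names, and the sketch ignores directory syncs, so a file that was created but never synced survives a crash with empty contents.

```go
package main

import "fmt"

// strictFile tracks all bytes written and how many of them are durable.
type strictFile struct {
	data   []byte // everything written, including unsynced bytes
	synced int    // length of the durable prefix of data
}

// strictMem is a toy in-memory filesystem (hypothetical, not Pebble's vfs)
// that drops unsynced bytes when a crash is simulated.
type strictMem struct {
	files map[string]*strictFile
}

func newStrictMem() *strictMem {
	return &strictMem{files: map[string]*strictFile{}}
}

// Write appends p to the named file without making it durable.
func (fs *strictMem) Write(name string, p []byte) {
	f, ok := fs.files[name]
	if !ok {
		f = &strictFile{}
		fs.files[name] = f
	}
	f.data = append(f.data, p...)
}

// Sync marks everything written so far to the named file as durable.
func (fs *strictMem) Sync(name string) {
	if f, ok := fs.files[name]; ok {
		f.synced = len(f.data)
	}
}

// Crash simulates a process crash by truncating every file to its
// durable prefix.
func (fs *strictMem) Crash() {
	for _, f := range fs.files {
		f.data = f.data[:f.synced]
	}
}

// Read returns the current contents of the named file, or "" if absent.
func (fs *strictMem) Read(name string) string {
	if f, ok := fs.files[name]; ok {
		return string(f.data)
	}
	return ""
}

func main() {
	fs := newStrictMem()
	fs.Write("wal", []byte("committed"))
	fs.Sync("wal")
	fs.Write("wal", []byte("+unsynced"))
	fs.Crash()
	fmt.Println(fs.Read("wal")) // only the synced prefix remains
}
```

A crash test for encryption-at-rest could follow the same shape: layer the encrypted filesystem over a filesystem like this, crash at arbitrary points, and check that the data and the encryption metadata are still consistent after restart.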
Adjacent to cockroachdb/pebble#2086.
Jira issue: CRDB-24270