-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
storage/engine: MVCC Metamorphic test suite, first phase #44458
Conversation
8d4aaac
to
bb8bf97
Compare
I'll designate this as WIP for now until I finish the ability to parse/check an output file later today. Until that happens, some of the parsing code is unused and buggier than I'd like. |
Fixed bugs in the parsing paths, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First pass comments that jumped out at me. I'll continue to review this later today.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @itsbilal, @petermattis, and @sumeerbhola)
pkg/storage/engine/metamorphic/generator.go, line 35 at r1 (raw file):
engine engine.Engine tsGenerator tsGenerator managers map[operandType]operandManager
I suspect there is a level of abstraction here that isn't serving much purpose. Given that there are a fixed number of "*managers", rather than a map you could embed the concrete types here. There are a few places where you're using a variable as a key to managers
, but my suspicion is that is due to the operand abstraction which I'm also questioning.
pkg/storage/engine/metamorphic/operands.go, line 30 at r1 (raw file):
const ( OPERAND_TRANSACTION operandType = iota
Nit: Go style is camelcase for constants. For example, operandTransaction
.
pkg/storage/engine/metamorphic/operations.go, line 32 at r1 (raw file):
// dependantOps commands should be stateless, with all state stored in the // passed-in test runner or its operand managers. type mvccOp struct {
You took a different approach here than what is done in the Pebble metamorphic test. Rather than having an op
interface with concrete implementations that refer to concrete arguments, you've abstracted the operands. Do you mind motivating this different approach? On the surface, it seems to push more work to the op execution phase. That isn't necessarily problematic. It also mixes in generation with execution (e.g. mvccOp.weight
which shouldn't be needed at execution time). I'm also not quite clear on what to make of dependentOps
. In the Pebble metamorphic test, op dependencies are implicit in their ordering.
36a57d3
to
a295ae9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @itsbilal, @petermattis, and @sumeerbhola)
pkg/storage/engine/metamorphic/operations.go, line 32 at r1 (raw file):
Previously, petermattis (Peter Mattis) wrote…
You took a different approach here than what is done in the Pebble metamorphic test. Rather than having an
op
interface with concrete implementations that refer to concrete arguments, you've abstracted the operands. Do you mind motivating this different approach? On the surface, it seems to push more work to the op execution phase. That isn't necessarily problematic. It also mixes in generation with execution (e.g.mvccOp.weight
which shouldn't be needed at execution time). I'm also not quite clear on what to make ofdependentOps
. In the Pebble metamorphic test, op dependencies are implicit in their ordering.
The operand abstraction came from an ideal goal of being able to cleanly define what operands depended on others - so if you need a batch and have zero batches, you'll just create one by inserting a batch's opener
, and the same for transactions, etc. I figured this would be an easier way to generate correct sequences of operations, to just insert openers as necessary, instead of restricting your generator to the set of legal operations (a set that can be of high complexity to completely generate; the interaction between batches and transactions being the most tricky one that comes to mind). The hope was also that all state information about active objects would live in their managers only.
In practice it all ended up being more intertwined and less modular than I was hoping, but the general idea still works.
All non-opener dependent operations are defined in dependentOps
. I should probably describe this in a comment too, but basically, every time you "randomly generate" an operation, you go through its operands list, see if any need to be opened, then recursively run their openers (if necessary), then run the dependentOps
method, and recursively run those operations (which could also require running more openers and more dependentOps), then you run the operation itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @itsbilal and @sumeerbhola)
pkg/storage/engine/metamorphic/operands.go, line 58 at r3 (raw file):
} type operand interface{}
Usage of the empty interface is a big signal that something is off with the abstractions. The required use of type assertions is a related indicator. My instinct is that there should be abstraction at the op level, with concrete implementations of the various MVCC operations that know the concrete types of their parameters/operands. For example:
type mvccScanOp struct {
reader Reader
start roachpb.Key
end roachpb.Key
max int64
...
}
One nuance here is that you don't want to store an actual engine.Reader
in mvccScanOp
. That's why the Pebble metamorphic tests use IDs for all of the objects. The level of indirection allows an op to refer to an object that hasn't been created yet. This is similar to the way you have ops retrieving objects from managers.
pkg/storage/engine/metamorphic/operations.go, line 32 at r1 (raw file):
The operand abstraction came from an ideal goal of being able to cleanly define what operands depended on others - so if you need a batch and have zero batches, you'll just create one by inserting a batch's opener, and the same for transactions, etc.
I don't think it implies the operand
abstraction. I think you primarily get this functionality via the mvccOp.operands
specification.
It isn't clear to me if this goal is necessary. The Pebble metamorphic test generator simply skips generating an op if it is applied to an object for which there are no live objects. For example, if you try to generate an iter.Close
op and there are no live iterators. Immediately opening and then closing an iterator doesn't seem to be terribly useful.
The hope was also that all state information about active objects would live in their managers only.
I'm good with placing the state information about active objects in the managers. My feeling is that the managers should be a op-generation time component. Currently they are also an op-execution component as well.
In practice it all ended up being more intertwined and less modular than I was hoping, but the general idea still works.
I think it is worthwhile to disentangle the code and state needed for generation from the state needed for executing ops. The current mvccOps
structure is good on the generation side, but the way execution is mixed in there is a bit messy. And I think this will get even messier when you add parsing and what not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This testing infrastructure is complex enough, that it is beneficial for it to have its own tests so you can debug the component pieces in isolation. It would be nice to have a test that outputs the ops that are generated (without running them) for a particular seed and count. See also TestGenerator
in the Pebble metamorphic test which verifies that all of the live objects are properly closed. Another bit that could use a test is the parsing. Right now that is a bit difficult because execution is interwoven with parsing. Being able to test parsing in isolation is a good reason for disentangling that.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @itsbilal and @sumeerbhola)
pkg/storage/engine/metamorphic/generator.go, line 248 at r3 (raw file):
func (t *tsGenerator) randomPastTimestamp(rng *rand.Rand) hlc.Timestamp { var result hlc.Timestamp result.WallTime = int64(float64(t.lastTS.WallTime+1) * rng.Float64())
I suspect it would be better for the historic times to be skewed towards the current time. Not sure precisely how I would accomplish that. Perhaps with a zipfian distribution.
pkg/storage/engine/metamorphic/meta_test.go, line 119 at r3 (raw file):
defer leaktest.AfterTest(t) ctx := context.Background() seeds := []int64{123}
By default, I think you should generate a few random seeds. Perhaps make this a command line flag.
pkg/storage/engine/metamorphic/meta_test.go, line 136 at r3 (raw file):
} for _, seed := range seeds { t.Run(fmt.Sprintf("seed=%d", seed), func(t *testing.T) {
The seed won't generate the same series of operations if the use of the random number generator changes. Is that a problem?
pkg/storage/engine/metamorphic/operands.go, line 112 at r3 (raw file):
} return k.liveKeys[int(k.rng.Float64()*float64(len(k.liveKeys)))]
Better to do k.liveKeys[k.rng.Intn(len(k.liveKeys))]
. Using floats to generate random numbers in a range like this is actually not uniformly random which is sort of surprising. I don't think it really matters, but it is better to use the supported idiom for generating integers in a range.
Thanks for the review! I just added support for more operations, as well as standardized behaviour between rocksdb and pebble enough that you can pass output from one engine into the other with --check and it should always pass. Functionally I'll cap this PR off at this point. I'll make another parallel PR for the restart point work - which should be straightforward now that everything it depends on is done. Once that is up and running (hopefully with no obvious failures) I'll come back here and address code review comments - so thanks for doing a review quickly. I'm open to pulling out all execution bits from the And speaking about the file checking mode, it turned out to be relatively straightforward, because assuming the rest of the state machine behaves predictably and the output file was well generated, all |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Functionally I'll cap this PR off at this point.
Let me give this another review today and we'll try to get it merged in its current form. The suggestions I'm making can be handled in follow-on PRs. Probably easier for everyone that way.
operands can be ID strings going forward, and just parsed by the relevant manager upon execution time.
I'm suggesting that there is an explicit parsing step for both operators and operands. If an operator has an int operator, it should be an int
field in the associated mvccOp
struct.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @itsbilal and @sumeerbhola)
070afcd
to
20103e1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @itsbilal and @sumeerbhola)
pkg/storage/engine/metamorphic/deck.go, line 24 at r4 (raw file):
// ensures that each element is returned with a desired frequency within the // size of the deck. type Deck struct {
Only used within this package, so this can be unexported: s/Deck/deck/g
.
pkg/storage/engine/metamorphic/generator.go, line 100 at r4 (raw file):
// generateAndRun generates n operations using a TPCC-style deck shuffle with // weighted probabilities of each operation appearing. func (m *metaTestRunner) generateAndRun(n uint64) {
Nit: we're unlikely to run more than 2 billion operations. s/uint64/int/g
. This also simplifies the loop below slightly.
pkg/storage/engine/metamorphic/generator.go, line 101 at r4 (raw file):
// weighted probabilities of each operation appearing. func (m *metaTestRunner) generateAndRun(n uint64) { m.ops = make([]*mvccOp, n)
You're initializing m.ops
to length n
, but then seem to be appending to m.ops
in runOp
. I think the end result is that you'll end up with len(m.ops) == 2 * n
. Perhaps you meant to make([]*mvccOp, 0, n)
?
pkg/storage/engine/metamorphic/generator.go, line 105 at r4 (raw file):
for i := uint64(0); i < n; i++ { opToAdd := &operations[deck.Int()]
Are you adding the op somewhere? Seems like you're just running it. Perhaps s/opToAdd/op/g
.
pkg/storage/engine/metamorphic/meta_test.go, line 47 at r4 (raw file):
// createTestPebbleEngine returns a new in-memory Pebble storage engine. func createTestPebbleEngine(path string) (engine.Engine, error) { return engine.NewEngine(enginepb.EngineTypePebble, 1<<20, base.StorageConfig{
The base.StorageConfig
looks identical between this function and createTestRocksDBEngine
. Perhaps pull out into a helper: func makeStorageConfig(dir string) base.StorageConfig
.
pkg/storage/engine/metamorphic/meta_test.go, line 109 at r4 (raw file):
testRunner.parseFileAndRun(checkFile) } else { testRunner.generateAndRun(10000)
Add a TODO about making this configurable. Note that in Pebble, the default is a random number in the range [5000,10000)
.
pkg/storage/engine/metamorphic/operands.go, line 49 at r4 (raw file):
// writers, etc) should be stored in an operandManager. type operandManager interface { get() operand
Can you provide some commentary around these? get()
seems to generate an operand (would generate
be a better name?). opener()
seems to return the name of an operator.
pkg/storage/engine/metamorphic/operands.go, line 178 at r4 (raw file):
liveTxns []*roachpb.Transaction txnIdMap map[string]*roachpb.Transaction inFlightBatches map[*roachpb.Transaction][]engine.Batch
s/inFlightBatches/openBatches/g
?
pkg/storage/engine/metamorphic/operands.go, line 256 at r4 (raw file):
i-- found = true // Don't break; this batch could appear multiple times.
Can you elaborate? Why would the same batch appear multiple times in the in-flight batches list.
pkg/storage/engine/metamorphic/operations.go, line 131 at r4 (raw file):
// List of operations, where each operation is defined as one instance of mvccOp. var operations = []mvccOp{
I think there are some operations missing here. Can you add a TODO? For example, MVCCConditionalPut
, MVCCBlindPut
, MVCCMerge
, MVCCClearTimeRange
. Perhaps you've added these in subsequent PRs.
I'm not seeing ingestion, which is another important bit of surface area to trigger. The way ingestion works in the Pebble metamorphic test is that it converts a batch into an sstable and ingests that. Something similar could be done here. Again, a TODO is sufficient for this PR.
pkg/storage/engine/metamorphic/operations.go, line 161 at r4 (raw file):
val, intent, err := engine.MVCCGet(ctx, reader, key.Key, txn.ReadTimestamp, engine.MVCCGetOptions{ Inconsistent: false, Tombstones: true,
Add a TODO that all of these parameters should be operands.
pkg/storage/engine/metamorphic/operations.go, line 313 at r4 (raw file):
result = append(result, opRun{ op: m.nameToOp["batch_commit"], args: []operand{batch},
I don't want to beat a dead horse too severely, but this is another indication of the problem with the current abstractions. "batch_commit"
is referring to an mvccOp
and requiring specific arguments. There is no type checking here on the operands. No type checking that you've spelled "batch_commit"
correctly. The alternative would be to have something like:
result = append(result, batchCommitOp{
batch: batch,
})
Let's leave this change to a follow-on PR, as it will be extensive.
pkg/storage/engine/metamorphic/operations.go, line 529 at r4 (raw file):
tmpKey := key key = endKey endKey = tmpKey
The Go idiom for swapping two variables is: key, endKey = endKey, key
. No need for tmpKey
.
pkg/storage/engine/metamorphic/operations.go, line 563 at r4 (raw file):
OPERAND_MVCC_KEY, }, weight: 5,
I suspect the weight of this operation should be even lower than this. We don't perform manual compactions very often, usually only after a delete-range.
20103e1
to
6edc57d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the very extensive review, really appreciated! I've addressed all the simpler points, and noted the major refactor ones for a follow-up PR. I'll update the restarts PR once this is merged.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @petermattis and @sumeerbhola)
pkg/storage/engine/metamorphic/deck.go, line 24 at r4 (raw file):
Previously, petermattis (Peter Mattis) wrote…
Only used within this package, so this can be unexported:
s/Deck/deck/g
.
Done.
pkg/storage/engine/metamorphic/generator.go, line 35 at r1 (raw file):
Previously, petermattis (Peter Mattis) wrote…
I suspect there is a level of abstraction here that isn't serving much purpose. Given that there are a fixed number of "*managers", rather than a map you could embed the concrete types here. There are a few places where you're using a variable as a key to
managers
, but my suspicion is that is due to the operand abstraction which I'm also questioning.
Deferred for a followup PR that refactors operand abstraction.
pkg/storage/engine/metamorphic/generator.go, line 248 at r3 (raw file):
Previously, petermattis (Peter Mattis) wrote…
I suspect it would be better for the historic times to be skewed towards the current time. Not sure precisely how I would accomplish that. Perhaps with a zipfian distribution.
Done.
pkg/storage/engine/metamorphic/generator.go, line 100 at r4 (raw file):
Previously, petermattis (Peter Mattis) wrote…
Nit: we're unlikely to run more than 2 billion operations.
s/uint64/int/g
. This also simplifies the loop below slightly.
Done.
pkg/storage/engine/metamorphic/generator.go, line 101 at r4 (raw file):
Previously, petermattis (Peter Mattis) wrote…
You're initializing
m.ops
to lengthn
, but then seem to be appending tom.ops
inrunOp
. I think the end result is that you'll end up withlen(m.ops) == 2 * n
. Perhaps you meant tomake([]*mvccOp, 0, n)
?
Done. Whoops. I was keeping this around for printing output runs in the end, but it's ultimately unnecessary so I got rid of this slice altogether.
pkg/storage/engine/metamorphic/generator.go, line 105 at r4 (raw file):
Previously, petermattis (Peter Mattis) wrote…
Are you adding the op somewhere? Seems like you're just running it. Perhaps
s/opToAdd/op/g
.
Done.
pkg/storage/engine/metamorphic/meta_test.go, line 119 at r3 (raw file):
Previously, petermattis (Peter Mattis) wrote…
By default, I think you should generate a few random seeds. Perhaps make this a command line flag.
Given test runtime can be quite slow, I've resorted to adding 1 fixed seed, 1 user-specified seed, and 1 random seed.
pkg/storage/engine/metamorphic/meta_test.go, line 136 at r3 (raw file):
Previously, petermattis (Peter Mattis) wrote…
The seed won't generate the same series of operations if the use of the random number generator changes. Is that a problem?
It shouldn't be an issue. The purpose of having a seeded RNG is to more easily be able to recreate a test run if the output file gets lost for whatever reason (eg. on CI). Presumably if there's a code change, the seed will not exactly match up to the same deterministic set of operations, but that's alright.
pkg/storage/engine/metamorphic/meta_test.go, line 47 at r4 (raw file):
Previously, petermattis (Peter Mattis) wrote…
The
base.StorageConfig
looks identical between this function andcreateTestRocksDBEngine
. Perhaps pull out into a helper:func makeStorageConfig(dir string) base.StorageConfig
.
Done.
pkg/storage/engine/metamorphic/meta_test.go, line 109 at r4 (raw file):
Previously, petermattis (Peter Mattis) wrote…
Add a TODO about making this configurable. Note that in Pebble, the default is a random number in the range
[5000,10000)
.
Done.
pkg/storage/engine/metamorphic/operands.go, line 30 at r1 (raw file):
Previously, petermattis (Peter Mattis) wrote…
Nit: Go style is camelcase for constants. For example,
operandTransaction
.
Done.
pkg/storage/engine/metamorphic/operands.go, line 58 at r3 (raw file):
Previously, petermattis (Peter Mattis) wrote…
Usage of the empty interface is a big signal that something is off with the abstractions. The required use of type assertions is a related indicator. My instinct is that there should be abstraction at the op level, with concrete implementations of the various MVCC operations that know the concrete types of their parameters/operands. For example:
type mvccScanOp struct { reader Reader start roachpb.Key end roachpb.Key max int64 ... }
One nuance here is that you don't want to store an actual
engine.Reader
inmvccScanOp
. That's why the Pebble metamorphic tests use IDs for all of the objects. The level of indirection allows an op to refer to an object that hasn't been created yet. This is similar to the way you have ops retrieving objects from managers.
Noted. Will address this in a refactor in an upcoming PR.
pkg/storage/engine/metamorphic/operands.go, line 112 at r3 (raw file):
Previously, petermattis (Peter Mattis) wrote…
Better to do
k.liveKeys[k.rng.Intn(len(k.liveKeys))]
. Using floats to generate random numbers in a range like this is actually not uniformly random which is sort of surprising. I don't think it really matters, but it is better to use the supported idiom for generating integers in a range.
Done.
pkg/storage/engine/metamorphic/operands.go, line 49 at r4 (raw file):
Previously, petermattis (Peter Mattis) wrote…
Can you provide some commentary around these?
get()
seems to generate an operand (wouldgenerate
be a better name?).opener()
seems to return the name of an operator.
Done. Get usually does not generate an operand if there already exists one. I say usually because keys are an exception; we want to sometimes reuse keys and other times return new ones. Added commentary about this.
pkg/storage/engine/metamorphic/operands.go, line 178 at r4 (raw file):
Previously, petermattis (Peter Mattis) wrote…
s/inFlightBatches/openBatches/g
?
It's batches that have writes that need to happen before this transaction commits - so maybe "in-flight" makes more sense? The transaction struct itself has an inFlightWrites field.
pkg/storage/engine/metamorphic/operands.go, line 256 at r4 (raw file):
Previously, petermattis (Peter Mattis) wrote…
Can you elaborate? Why would the same batch appear multiple times in the in-flight batches list.
At write time, we don't check if this transaction has already written to this batch. We just append it to the end of this slice. It's up to the read case to resolve the dedups. I've clarified it here.
pkg/storage/engine/metamorphic/operations.go, line 32 at r1 (raw file):
Previously, petermattis (Peter Mattis) wrote…
The operand abstraction came from an ideal goal of being able to cleanly define what operands depended on others - so if you need a batch and have zero batches, you'll just create one by inserting a batch's opener, and the same for transactions, etc.
I don't think it implies the
operand
abstraction. I think you primarily get this functionality via themvccOp.operands
specification.It isn't clear to me if this goal is necessary. The Pebble metamorphic test generator simply skips generating an op if it is applied to an object for which there are no live objects. For example, if you try to generate an
iter.Close
op and there are no live iterators. Immediately opening and then closing an iterator doesn't seem to be terribly useful.The hope was also that all state information about active objects would live in their managers only.
I'm good with placing the state information about active objects in the managers. My feeling is that the managers should be a op-generation time component. Currently they are also an op-execution component as well.
In practice it all ended up being more intertwined and less modular than I was hoping, but the general idea still works.
I think it is worthwhile to disentangle the code and state needed for generation from the state needed for executing ops. The current
mvccOps
structure is good on the generation side, but the way execution is mixed in there is a bit messy. And I think this will get even messier when you add parsing and what not.
Noted. Will address in a follow-up PR.
pkg/storage/engine/metamorphic/operations.go, line 131 at r4 (raw file):
Previously, petermattis (Peter Mattis) wrote…
I think there are some operations missing here. Can you add a TODO? For example,
MVCCConditionalPut
,MVCCBlindPut
,MVCCMerge
,MVCCClearTimeRange
. Perhaps you've added these in subsequent PRs.I'm not seeing ingestion, which is another important bit of surface area to trigger. The way ingestion works in the Pebble metamorphic test is that it converts a batch into an sstable and ingests that. Something similar could be done here. Again, a TODO is sufficient for this PR.
Done.
pkg/storage/engine/metamorphic/operations.go, line 161 at r4 (raw file):
Previously, petermattis (Peter Mattis) wrote…
Add a TODO that all of these parameters should be operands.
Done.
pkg/storage/engine/metamorphic/operations.go, line 313 at r4 (raw file):
Previously, petermattis (Peter Mattis) wrote…
I don't want to beat a dead horse too severely, but this is another indication of the problem with the current abstractions.
"batch_commit"
is referring to anmvccOp
and requiring specific arguments. There is no type checking here on the operands. No type checking that you've spelled"batch_commit"
correctly. The alternative would be to have something like:result = append(result, batchCommitOp{ batch: batch, })
Let's leave this change to a follow-on PR, as it will be extensive.
Noted for a follow on PR. Thanks!
pkg/storage/engine/metamorphic/operations.go, line 529 at r4 (raw file):
Previously, petermattis (Peter Mattis) wrote…
The Go idiom for swapping two variables is:
key, endKey = endKey, key
. No need fortmpKey
.
Done.
pkg/storage/engine/metamorphic/operations.go, line 563 at r4 (raw file):
Previously, petermattis (Peter Mattis) wrote…
I suspect the weight of this operation should be even lower than this. We don't perform manual compactions very often, usually only after a delete-range.
Lowered it to 2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
modulo the comment on the type of inFlightBatches
. Definitely want to see the follow-on PRs as I think the clarification will be significant.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @itsbilal and @sumeerbhola)
pkg/storage/engine/metamorphic/generator.go, line 35 at r1 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
Deferred for a followup PR that refactors operand abstraction.
Ack.
pkg/storage/engine/metamorphic/operands.go, line 58 at r3 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
Noted. Will address this in a refactor in an upcoming PR.
Ack.
pkg/storage/engine/metamorphic/operands.go, line 178 at r4 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
It's batches that have writes that need to happen before this transaction commits - so maybe "in-flight" makes more sense? The transaction struct itself has an inFlightWrites field.
The transaction's in-flight writes refers to network operations (I think).
pkg/storage/engine/metamorphic/operands.go, line 256 at r4 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
At write time, we don't check if this transaction has already written to this batch. We just append it to the end of this slice. It's up to the read case to resolve the dedups. I've clarified it here.
Seems like what you want is the set of uncommitted batches per transaction. Could you change the type of inFlightBatches
to map[*roachpb.Transaction]map[engine.Batch]struct{}
?
pkg/storage/engine/metamorphic/operations.go, line 32 at r1 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
Noted. Will address in a follow-up PR.
Ack.
pkg/storage/engine/metamorphic/operations.go, line 131 at r4 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
Done.
I'm not sure if my list was exhaustive. Did you double-check that there aren't other operations?
pkg/storage/engine/metamorphic/operations.go, line 313 at r4 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
Noted for a follow on PR. Thanks!
Ack.
This PR adds a new test-only sub-package to engine, metamorphic, which has one test, TestMeta, that generates and runs random MVCC operations on rocksdb and pebble instances with default settings. Future additions to this test suite could include: - A "check" mode that takes an output file as input, parses it, runs the operations in that sequence, and compares output strings. - Diffing test output between rocksdb and pebble and failing if there's a difference - Adding support for more operations - Adding a "restart" operation that closes the engine and restarts a different kind of engine in the store directory, then confirming operations after that point generate the same output. First-but-biggest part of cockroachdb#43762 . Release note: None
When run with the --check <filename> flag, TestMeta will parse the specified output file from a past TestMeta run for operations. It will run them in that order, and check for matching output strings. If there's a mismatch, the test will throw an error. Will be useful for when this test suite will try swapping engine types after a restart and will verify if the output stays the same. Release note: None.
Implements more MVCC operations such as MVCCDelete, MVCCInconsistent{Get,Scan}. Also updates batch iterator use to make the behaviour appear the same between rocksdb and pebble. There are some minor differences in the implementations: for instance, RocksDB batch iterators don't support SeekLT, Prev, don't respond well to SeekGEs before lower bounds, and appear to be SeekGE'd to the lower bound upon initialization. This branch ensures batch iterators are used so that TestMeta runs on rocksdb and pebble generate the same output. Important for when output from one type of TestMeta run is --check ed with the other engine. Release note: None.
6edc57d
to
1659148
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review!
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @itsbilal, @petermattis, and @sumeerbhola)
pkg/storage/engine/metamorphic/operands.go, line 178 at r4 (raw file):
Previously, petermattis (Peter Mattis) wrote…
The transaction's in-flight writes refers to network operations (I think).
Done.
pkg/storage/engine/metamorphic/operands.go, line 256 at r4 (raw file):
Previously, petermattis (Peter Mattis) wrote…
Seems like what you want is the set of uncommitted batches per transaction. Could you change the type of
inFlightBatches
tomap[*roachpb.Transaction]map[engine.Batch]struct{}
?
Done.
pkg/storage/engine/metamorphic/operations.go, line 131 at r4 (raw file):
Previously, petermattis (Peter Mattis) wrote…
I'm not sure if my list was exhaustive. Did you double-check that there aren't other operations?
There are others: MVCCIterate (seems unused), MVCCResolveWriteIntent (which we kinda already call in txn_commit), MVCCFindSplitKey, MVCCIncrement, MVCCInitPut, as well as some variants of Get/Put meant for protos that we can probably ignore. I'll add them too, and say that this isn't an exhaustive list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @itsbilal and @sumeerbhola)
pkg/storage/engine/metamorphic/operands.go, line 262 at r8 (raw file):
func (t *txnManager) clearBatch(batch engine.Batch) { for _, batches := range t.openBatches { for batch2 := range batches {
Is this loop necessary? Can't you blindly delete(batches, batch)
here?
1659148
to
503eb08
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed linter-reported issues too. TFTR!
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @itsbilal, @petermattis, and @sumeerbhola)
pkg/storage/engine/metamorphic/operands.go, line 262 at r8 (raw file):
Previously, petermattis (Peter Mattis) wrote…
Is this loop necessary? Can't you blindly
delete(batches, batch)
here?
🤦♂️ yes, you can. My bad.
dc0bb59
to
ad8f2d0
Compare
This PR builds on top of cockroachdb#44458 (all commits before the last one are from that PR). It adds two things: one, it ensures that whenever successive engine types are tested, it does a comparison for matching outputs. This will ensure CI will fail if any change down the line causes an observed difference in behaviour between rocksdb and pebble. Furthermore, it adds a new kind of operation: restart. Restarts are special in that they're added after the specified n number of runs have happened; so a test run with 3 specified engine types (so 2 restarts), and n = 10000 will have 3 * 10000 = 30000 operations. A restart closes all open objects, then closes the engine, and starts up the next engine in the specified engine sequence. This, combined with the aforementioned checking functionality across different engine runs, lets us test for bidirectional compatibility across different engines. Fixes cockroachdb#43762 . Release note: None.
Lots of minor changes such as renaming all operandTypes to camel casing, TODO insertions wherever necessary, the use of the pastTS operand type for inconsistent gets/scans, and the use of rand.Zipf for generating past timestamps. Release note: None
ad8f2d0
to
b03ca7f
Compare
bors r+ |
44458: storage/engine: MVCC Metamorphic test suite, first phase r=itsbilal a=itsbilal This PR adds a new test-only sub-package to engine, metamorphic, which has one test, TestMeta, that generates and runs random MVCC operations on rocksdb and pebble instances with default settings. Future additions to this test suite could include: - [x] A "check" mode that takes an output file as input, parses it, runs the operations in that sequence, and compares output strings. - [ ] Diffing test output between rocksdb and pebble and failing if there's a difference - [ ] Adding support for more operations - [ ] Adding a "restart" operation that closes the engine and restarts a different kind of engine in the store directory, then confirming operations after that point generate the same output. First-but-biggest part of #43762 . Release note: None 44730: storage: fix some tests that were fooling themselves r=andreimatei a=andreimatei A couple of tests wanted multiple write too old errors, but they were running at the wrong timestamp and so they were really getting a single one. Also add a test showing a funky scenario where 1PC batches behave differently from non-1PC. Release note: None 44786: jobs: allow blocking job adoption via sentinel file r=dt a=dt This change adds a check for a sentinel file, DISABLE_STARTING_BACKGROUND_JOBS, in the first on-disk store directory to the job adoption loop. If it is found, jobs will not be adopted by that node and current job executions are cancelled. Operators can thus touch this file if a misbehaving job is causing problems and otherwise preventing traditional job control that requires being able to write to the database. Release note (general change): background job execution is disabled on nodes where the file DISABLE_STARTING_BACKGROUND_JOBS exists in the first store directory providing an emergency tool for disabling job execution. Co-authored-by: Bilal Akhtar <[email protected]> Co-authored-by: Andrei Matei <[email protected]> Co-authored-by: David Taylor <[email protected]>
Build succeeded |
This PR builds on top of cockroachdb#44458 (all commits before the last one are from that PR). It adds two things: one, it ensures that whenever successive engine types are tested, it does a comparison for matching outputs. This will ensure CI will fail if any change down the line causes an observed difference in behaviour between rocksdb and pebble. It also adds more MVCC operations that would be useful to test, such as MVCCFindSplitKey, MVCCDeleteRange, MVCCClearTimeRange, MVCCConditionalPut, etc. Furthermore, it adds a new kind of operation: restart. Restarts are special in that they're added after the specified n number of runs have happened; so a test run with 3 specified engine types (so 2 restarts), and n = 10000 will have 3 * 10000 = 30000 operations. A restart closes all open objects, then closes the engine, and starts up the next engine in the specified engine sequence. This, combined with the aforementioned checking functionality across different engine runs, lets us test for bidirectional compatibility across different engines. Fixes cockroachdb#43762 . Release note: None.
This PR builds on top of cockroachdb#44458 (all commits before the last one are from that PR). It adds two things: one, it ensures that whenever successive engine types are tested, it does a comparison for matching outputs. This will ensure CI will fail if any change down the line causes an observed difference in behaviour between rocksdb and pebble. It also adds more MVCC operations that would be useful to test, such as MVCCFindSplitKey, MVCCDeleteRange, MVCCClearTimeRange, MVCCConditionalPut, etc. Furthermore, it adds a new kind of operation: restart. Restarts are special in that they're added after the specified n number of runs have happened; so a test run with 3 specified engine types (so 2 restarts), and n = 10000 will have 3 * 10000 = 30000 operations. A restart closes all open objects, then closes the engine, and starts up the next engine in the specified engine sequence. This, combined with the aforementioned checking functionality across different engine runs, lets us test for bidirectional compatibility across different engines. Fixes cockroachdb#43762 . Release note: None.
This PR builds on top of cockroachdb#44458 (all commits before the last one are from that PR). It adds two things: one, it ensures that whenever successive engine types are tested, it does a comparison for matching outputs. This will ensure CI will fail if any change down the line causes an observed difference in behaviour between rocksdb and pebble. It also adds more MVCC operations that would be useful to test, such as MVCCFindSplitKey, MVCCDeleteRange, MVCCClearTimeRange, MVCCConditionalPut, etc. Furthermore, it adds a new kind of operation: restart. Restarts are special in that they're added after the specified n number of runs have happened; so a test run with 3 specified engine types (so 2 restarts), and n = 10000 will have 3 * 10000 = 30000 operations. A restart closes all open objects, then closes the engine, and starts up the next engine in the specified engine sequence. This, combined with the aforementioned checking functionality across different engine runs, lets us test for bidirectional compatibility across different engines. Fixes cockroachdb#43762 . Release note: None.
44630: engine/metamorphic: Add engine restarts, compare across runs r=itsbilal a=itsbilal This PR builds on top of #44458 (all commits before the last one are from that PR). It adds two things: one, it ensures that whenever successive engine types are tested, it does a comparison for matching outputs. This will ensure CI will fail if any change down the line causes an observed difference in behaviour between rocksdb and pebble. Furthermore, it adds a new kind of operation: restart. Restarts are special in that they're added after the specified n number of runs have happened; so a test run with 3 specified engine types (so 2 restarts), and n = 10000 will have 3 * 10000 = 30000 operations. A restart closes all open objects, then closes the engine, and starts up the next engine in the specified engine sequence. This, combined with the aforementioned checking functionality across different engine runs, lets us test for bidirectional compatibility across different engines. Fixes #43762 . Release note: None. Co-authored-by: Bilal Akhtar <[email protected]>
This PR adds a new test-only sub-package to engine, metamorphic,
which has one test, TestMeta, that generates and runs random MVCC
operations on rocksdb and pebble instances with default settings.
Future additions to this test suite could include:
First-but-biggest part of #43762 .
Release note: None