internal/metamorphic: TestMeta failed #1459
Comments
Reproduced locally with
The panic message looks like it's describing a sequence number inversion bug. Could be a bug in #1432?
Confirmed that without #1432 this doesn't happen. If it helps, I used the following test helper to just run the single variant, which speeds up debugging:

    func Test(t *testing.T) {
    	runDir := "./_meta/220120-074952.888/random-016" // will be different
    	historyPath := "/tmp/hist"
    	testMetaRun(t, runDir, 1642659109724527380, historyPath)
    }
Currently, if the level checker encounters a violation and the iterator for the last level has been closed, the debug string providing information on the level and the file is lost. Save the debug string for potential use after the iterator is closed.

```
// Before
checker failed with error: found InternalKey zzqwavxgrec@12#4092,SINGLEDEL in L0.2: fileNum=000830 and InternalKey zzqwavxgrec@12#4627,DEL in %!s(<nil>)

// After
checker failed with error: found InternalKey zzqwavxgrec@12#4092,SINGLEDEL in L0.2: fileNum=000830 and InternalKey zzqwavxgrec@12#4627,DEL in L6: fileNum=000997
```

Related to cockroachdb#1459.
Hi, found another test failure on one of my branches:
Don't know if it's the same error, but looking.
This manifest dump clearly shows the sequence number inversion on the key
The key descended the LSM improperly in an L0->L5 compaction:
The version edit for the flush that added the key to L0:
Instrumented to collect the verbose
Fix is incoming. Taking the opportunity to convert
When Version.Overlaps is called for L0, the overlap window is iteratively expanded until stable. In cockroachdb#1432, the Overlaps function was adjusted to allow specifying that the end bound should be considered exclusive. However, cockroachdb#1432 failed to update the exclusivity of the end bound when the range widened. This improperly excluded files with largest keys that exactly equaled the new widened end bound. This commit also transforms the TestOverlaps test into a datadriven test, introducing a few helpers for parsing the DebugString output of a Version. Fix cockroachdb#1459.
74318: tracing: add /debug/tracez rendering the active spans r=andreimatei a=andreimatei

`/debug/tracez` lets users take a snapshot of the active spans registry and render the new snapshot, or one of the previously taken snapshots. The Tracer can hold up to 10 snapshots in memory.

It looks like this: ![Screenshot from 2022-01-04 19-03-39](https://user-images.githubusercontent.com/377201/148140272-306658d5-5b9c-4f2a-b59c-28df9c5ed10c.png)

When visualizing a snapshot, the page lets you do a number of things:
1. List all the spans.
2. See the (current) stack trace for each span's goroutine (if the goroutine was still running at the time when the snapshot was captured). Stack traces can be toggled visible/hidden.
3. Sort the spans by name or start time.
4. Filter the span according to text search. The search works across the name and stack trace.
5. Go from a span to the full trace containing that span.

For the table Javascript providing sorting and filtering, this patch embeds the library from https://listjs.com/ .

Limitations:
- for now, only the registry of the local node is snapshotted. In the future I'll collect info from all nodes.
- for now, the relationships between different spans are not represented in any way. I'll work on the ability to go from a span to the whole trace that the span is part of.
- for now, tags and structured and unstructured log messages that a span might have are not displayed in any way.

At the moment, span creation is not enabled in production by default (i.e. the Tracer is put in `TracingModeOnDemand` by default, instead of the required `TracingModeActiveSpansRegistry`). This patch does not change that, so in order to benefit from /debug/tracez in all its glory, one has to run with `COCKROACH_REAL_SPANS=1` for now. Not for long, though.

Release note: None

74867: sql: Support CREATE DATABASE WITH OWNER r=Fenil-P a=Fenil-P

fixes #67817

Release note (sql change): Allow users to specify the owner when creating a database. Similar to postgresql: CREATE DATABASE name [ [ WITH ] [ OWNER [=] user_name ]

74871: sql: add a tracing tag with the txn ID r=andreimatei a=andreimatei

This patch adds the txn's ID as a tag to the tracing span representing a SQL txn. I'm creating a UI to explore the current spans, and this ID will make it easy to navigate between a query/request blocking on a lock held by some other txn, and the activity of that other txn.

Release note: None

75114: sql: directly specify columns in TableReader r=RaduBerinde a=RaduBerinde

~Note: the first commit is #74922.~

The internal columns of the TableReader (as well as the row fetcher) are all the columns of the table, with only a subset of values actually produced. This design decision has been carried over way past the point where it makes sense (I admit, it's questionable whether it ever made sense). For one, "all the columns" is ambiguous (does it contain non-public columns? does it include system columns?) leading to various flags and inherent fragility. Second, it relies on the execution engine to figure out (based on the PostProcessSpec) which columns are actually needed, which the optimizer already figures out for us now.

This commit changes the TableReader spec and the interface of row.Fetcher to always produce a given specific set of column IDs. The diagram for table readers now specifies the columns by name.

The JoinReader, InvertedJoiner, ZigzagJoiner are not changed in this commit (but they should be cleaned up as well).

Release note: None

75175: colfetcher: fix the bytes read statistic collection r=yuzefovich a=yuzefovich

During 21.2 release we adjusted the `cFetcher` to be `Close`d eagerly when it is returning the zero-length batch. This was done in order to release some references in order for the memory to be GCed sooner; additionally, the `cFetcher` started being used for the index join where the fetcher is restarted from scratch for every batch of spans, so it seemed reasonable to close it automatically.

However, that eager closure broke "bytes read" statistic collection since the `row.KVFetcher` was responsible for providing it, and we were zeroing it out. This commit fixes this problem by the `cFetcher` memorizing the number of bytes it has read in `Close`. Some care needs to be taken to not double-count the bytes read in the index join, so a couple of helper methods have been introduced.

Additionally this commit applies the same eager-close optimization to the `cFetcher` when the last batch is returned (which makes it so that if we've just exhausted all KVs, we close the fetcher - previously, we would set the zero length on the batch and might never get into `stateFinished`).

Fixes: #75128.

Release note (bug fix): Previously, CockroachDB could incorrectly report `KV bytes read` statistic in `EXPLAIN ANALYZE` output. The bug is present only in 21.2.x versions.

75215: cmd/github-post: fix Pebble metamorphic reproduction command r=jbowens a=jbowens

When posting a github issue for a Pebble metamorphic test failure, include the correct `-ops` flag. Discovered because cockroachdb/pebble#1459 contained a reproduction command that contained too few ops to reproduce the issue.

Release note: none

75228: logictestccl: skip flaky TestCCLLogic/fakedist-metadata/partitioning_enum r=mgartner a=mgartner

Informs #75227

Release note: None

75237: cli,rpc: don't check the active cluster version in the CLI r=andreimatei a=knz

This commit removes a code path that would tickle an assertion failure if we were to later fix the context propagation in the RPC heartbeat method (see PR #71243): there's no "active cluster version" in the CLI and so we can't compare it in a client interceptor.

Release note: None

75254: scripts: add `dev generate --mirror` to `bump-pebble.sh` script r=jbowens a=nicktrav

CI now expects that dependencies are mirrored to cloud storage and will fail if the TODO for mirroring the repo is left unaddressed in the `DEPS.bzl` file. Add a mirroring step to the `bump-pebble.sh` script.

Release note: none

Co-authored-by: Andrei Matei
Co-authored-by: Fenil Patel
Co-authored-by: Radu Berinde
Co-authored-by: Yahor Yuzefovich
Co-authored-by: Jackson Owens
Co-authored-by: Marcus Gartner
Co-authored-by: Raphael 'kena' Poss
Co-authored-by: Nick Travers
internal/metamorphic.TestMeta failed with artifacts on master @ 7f1a70dc4fb56a501489a453f72f37333e9bccd5:
Help
To reproduce, try:
This test on roachdash | Improve this report!