metamorphic test failure: inconsistent keys in range #3174

Closed
RaduBerinde opened this issue Dec 19, 2023 · 14 comments · Fixed by #3187

@RaduBerinde
Member

CI ran into a metamorphic test failure in PR #3173. It's unlikely (but not impossible) that the PR itself introduces the problem (it's mostly code cleanup).

https://github.com/cockroachdb/pebble/actions/runs/7266582327/job/19798611718?pr=3173


		===== SEED =====
		1703013741613279788
		===== DIFF =====
		_meta/231219-192221.6133753307642/{standard-000,random-009}
		@@ -2103,11 +2103,11 @@
 iter20.SetBounds("rvxxunegblh@1", "tjpwuigvwge@1") // <nil> #2102
 iter20.SeekLT("tjpwuigvwge@1", "") // [true,"tjaj",<no point>,["tjaj","tjpwuigvwge@1")=>{"@6"="gqxiqxgpiskukyar"}*] <nil> #2103
 iter20.Next("") // [false] <nil> #2104
 iter20.Next("") // [false] <nil> #2105
 snap17.Close() // <nil> #2106
-iter22.Next("") // [true,"rpsllbp",<no point>,["rpsllbp","rvxxunegblh")=>{"@6"="iclhhyqvjdtfc","@4"="qumipltbbtehcoala","@3"="gtsviaw"}*] <nil> #2107
+iter22.Next("") // [true,"rpsllbp",<no point>,["rpsllbp","rvxxunegblh")=>{""="aevhlsxcjbxypnnas","@6"="iclhhyqvjdtfc","@4"="qumipltbbtehcoala","@3"="gtsviaw"}*] <nil> #2107
@RaduBerinde
Member Author

Reproduced on master (69994dd):

		===== SEED =====
		1703020073158492000
		===== DIFF =====
		_meta/231219-130753.1583218548355/{standard-000,random-027}
		@@ -4276,11 +4276,11 @@
 iter44.Prev("cksgzvdspi@8") // [valid,"gwmb",<no point>,["gwmb","jjbudo")=>{""="irymnotaxnqtzfxjdv","@9"="kyet","@6"=""}*] <nil> #4275
 batch30.Delete("zsgbcatts@6") // <nil> #4276
 db2.Ingest(batch30) // <nil> #4277
 iter44.SeekPrefixGE("zkqrrqt@5") // [false] pebble: SeekPrefixGE supplied with key outside of upper bound #4278
 iter44.Prev("") // [false] pebble: SeekPrefixGE supplied with key outside of upper bound #4279
-iter44.SeekPrefixGE("fusdchsqf@8") // [true,"fusdchsqf@8",<no point>,["fusdchsqf","fusdchsqf\x00")=>{"@9"="kyet","@6"=""}*] <nil> #4280
+iter44.SeekPrefixGE("fusdchsqf@8") // [true,"fusdchsqf@8",<no point>,["fusdchsqf","fusdchsqf\x00")=>{"@9"="kyet","@6"="","@5"="jmngn"}*] <nil> #4280

@nicktrav
Contributor

Can you share the repro command? I tried the following, but it doesn't repro for me on Darwin or Linux, on master @ 69994dd.

$ CGO_ENABLED=0 go test -test.v -run TestMeta$ ./internal/metamorphic -seed 1703020073158492000 -keep

@itsbilal
Member

Looks like it's TestMetaTwoInstance that's affected, which is also the only metamorphic test failure to have VSSTs and related functionality enabled.

@nicktrav
Contributor

Ah, you're right. This does it:

$ go test -run TestMetaTwoInstance$ ./internal/metamorphic -seed 1703020073158492000

@RaduBerinde
Member Author

Interestingly I got a different error with that same seed:

                checker failed with error: out of order keys qtvsxobnl@1#294,DELSIZED >= qtvsxobnl@1#294,DELSIZED in L6: fileNum=000070

@itsbilal
Member

@RaduBerinde is this with your change to surface the start bound by any chance? I'm guessing not, but still good to know.

@RaduBerinde
Member Author

No, it is reproducible on master (see 2nd comment).

I realized we are not running `TestMetaTwoInstance` in the nightlies; filed #3175.

@RaduBerinde
Member Author

I wrote a little tool to reduce the number of operations to get a small reproduction. I got one with these ops:

Init(2 /* dbs */, 50 /* batches */, 73 /* iters */, 50 /* snapshots */)
db2.DeleteRange("pdzxpawc@1", "zyltfzrkfx")
db2.Delete("qtvsxobnl@1")
snap3 = db2.NewSnapshot("cqlkmvj", "gwmb", "pdzxpawc", "pjtsroxxhzl", "rvutqvp", "tyklrgn", "vqdcha", "xmmrme", "ztgkqdcuwk", "zyltfzrkfx")
db2.Delete("qtvsxobnl@1")
db2.Compact("rvutqvp@1", "tyklrgn@1", false /* parallelize */)
db2.Replicate(db1, "gwmb", "rdiijw")
db1.Replicate(db2, "exiocx", "urziitjlj")

and these OPTIONS

[Version]
  pebble_version=0.1

[Options]
  bytes_per_sync=131072
  cache_size=64
  cleaner=delete
  compaction_debt_concurrency=1073741824
  comparer=pebble.internal.testkeys
  disable_wal=true
  flush_delay_delete_range=275ms
  flush_delay_range_key=685ms
  flush_split_bytes=16
  format_major_version=16
  l0_compaction_concurrency=1
  l0_compaction_file_threshold=1024
  l0_compaction_threshold=14
  l0_stop_writes_threshold=38
  lbase_max_bytes=4194304
  level_multiplier=160
  max_concurrent_compactions=3
  max_manifest_file_size=2
  max_open_files=1000
  mem_table_size=8192
  mem_table_stop_writes_threshold=2
  min_deletion_rate=1048576
  merger=pebble.concatenate
  multilevel_compaction_heuristic=wamp(0.00, false)
  read_compaction_rate=16000
  read_sampling_multiplier=16
  strict_wal_tail=true
  table_cache_shards=10
  table_property_collectors=[]
  validate_on_ingest=true
  wal_dir=data/wal
  wal_bytes_per_sync=0
  max_writer_concurrency=0
  force_writer_parallelism=false
  secondary_cache_size_bytes=33554432
  create_on_shared=1
  disable_elision_only_compactions=true

[Level "0"]
  block_restart_interval=58
  block_size=2
  block_size_threshold=89
  compression=NoCompression
  filter_policy=testing_bloom_filter/bits_per_key=16
  filter_type=table
  index_block_size=4
  target_file_size=1024

[TestOptions]
  delete_sized=true
  replace_single_delete=true
  threads=10
  async_apply_to_db=true
  shared_storage_enabled=true
  secondary_cache_enabled=true
  seed_efos=11947775591189849618
  use_shared_replicate=true
  efos_always_creates_iterators=true

Error is:

db1.Replicate(db2, "exiocx", "urziitjlj") // <nil> #7
Seed: 0
checker failed with error: out of order keys qtvsxobnl@1#13,DELSIZED >= qtvsxobnl@1#13,DELSIZED in L6: fileNum=000009

@itsbilal
Member

@RaduBerinde nice find! It looks like "boomerang files" (files that get shared to another Pebble instance and then shared back to the same one) also need `hideObsoletePoints` set to true in `tableCacheShard.newIters`:

https://github.com/cockroachdb/pebble/blob/5be92739e7bb3cc2bb2158ad14be32bf7bc19e9f/table_cache.go#L552C1-L552C1

I think we'd want to set that to true for any shared ingested files (we can determine a file was ingested by checking whether its start and end seqnums are equal), not just shared foreign files. That also mimics the way we track ingested files for substituting sequence numbers, so it is easier to reason about.
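A sketch of that rule, using hypothetical field and function names rather than Pebble's actual metadata types: an ingested sstable is assigned a single sequence number, so its smallest and largest seqnums coincide, and obsolete points should be hidden for any shared ingested file, not only foreign ones.

```go
package main

import "fmt"

// fileMetadata is a simplified stand-in for Pebble's file metadata; only
// the fields needed for this sketch are included.
type fileMetadata struct {
	smallestSeqNum uint64
	largestSeqNum  uint64
	shared         bool // backed by shared (remote) storage
	foreign        bool // currently owned by another store
}

// isIngested: an ingested sstable gets a single sequence number, so its
// smallest and largest seqnums are equal.
func (m fileMetadata) isIngested() bool {
	return m.smallestSeqNum == m.largestSeqNum
}

// hideObsoletePoints sketches the proposed rule: hide obsolete points for
// any shared ingested file, which covers "boomerang" files shared out and
// ingested back, not only files that are currently foreign.
func hideObsoletePoints(m fileMetadata) bool {
	return m.shared && m.isIngested()
}

func main() {
	// A boomerang file: shared, ingested back, no longer foreign.
	boomerang := fileMetadata{smallestSeqNum: 10, largestSeqNum: 10, shared: true, foreign: false}
	fmt.Println(hideObsoletePoints(boomerang)) // true; a foreign-only rule would miss it
}
```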

@RaduBerinde
Member Author

Thanks. Yes, that seems to fix the problem. I am however running into other problems related to table format versions being too low. I'm working on a change to require the VirtualSSTables format version when using shared objects.

RaduBerinde added a commit to RaduBerinde/pebble that referenced this issue Dec 21, 2023
We weren't setting this flag correctly when a shared file is ingested
by another store and then ingested back into the original store,
leading to an "out of order keys" error in `TestMetaTwoInstance`.

Informs cockroachdb#3174.
@RaduBerinde
Member Author

I posted PR #3179 for the "key out of order" issue. I can still repro a compare failure. I will extend the "reduce" tool to work on compare as well.

RaduBerinde added a commit that referenced this issue Dec 21, 2023
@RaduBerinde
Member Author

RaduBerinde commented Dec 22, 2023

I managed to reduce a repro to this:

Init(2 /* dbs */, 49 /* batches */, 63 /* iters */, 45 /* snapshots */)
db2.RangeKeySet("hesu", "olwqrm", "@6", "ybdmeuxousglnwebka")
snap9 = db2.NewSnapshot("boqrkyiv", "fhgoonxbav", "iifk", "jhbioo", "lipbcfybquf", "nvolewnkbok", "olwqrm", "vqzangor")
db2.RangeKeyDelete("iifk", "njojeaszbsh")
db2.Compact("nvolewnkbok@2", "uunagerm@9", true /* parallelize */)
db2.Replicate(db1, "faaopmh", "njojeaszbsh")
iter25 = db1.NewIter("", "", 2 /* key types */, 0, 0, false /* use L6 filters */, "@7" /* masking suffix */)
iter25.SeekGE("lipbcfybquf@3", "")
-iter25.SeekGE("lipbcfybquf@3", "") // [false] <nil> #7
+iter25.SeekGE("lipbcfybquf@3", "") // [true,"lipbcfybquf@3",<no point>,["hesu","njojeaszbsh")=>{"@6"="ybdmeuxousglnwebka"}*] <nil> #7

@itsbilal the expected result here is the second one, right? It's a little weird: without actual point keys, the RangeKeySet wouldn't do anything, so they're both technically correct.

@RaduBerinde
Member Author

I simplified the keys; the repro is now:

Init(2 /* dbs */, 49 /* batches */, 63 /* iters */, 45 /* snapshots */)
db2.RangeKeySet("c", "h", "@6", "foo")
snap9 = db2.NewSnapshot("a", "z")
db2.RangeKeyDelete("d", "f")
db2.Compact("g@2", "i@9", true /* parallelize */)
db2.Replicate(db1, "b", "f")
iter25 = db1.NewIter("", "", 2 /* key types */, 0, 0, false /* use L6 filters */, "@7" /* masking suffix */)
iter25.SeekGE("e@3", "")

This is a visual of the operations:

               |-----------------------------| RangeKeySet
   |-----------------------------------------------| Snapshot
                     |-----------| RangeKeyDelete
                                       |-----------| Compact
         |-----------------------| Replicate
                           |-> SeekGE
   a --- b --- c --- d --- e --- f --- g --- h --- i

When running with these OPTIONS, we get:

iter25.SeekGE("e@3", "") // [true,"e@3",<no point>,["c","f")=>{"@6"="foo"}*] <nil> #7

This is incorrect, as this portion of the RangeKeySet has been removed by the RangeKeyDel.

The Replicate op involves sharing and virtualizing an SST file. The physical SST file contains these range keys (note, this is after seqnum rewriting):

c-d:{(#10,RANGEKEYSET,@6,foo)}
d-f:{(#10,RANGEKEYDEL) (#10,RANGEKEYSET,@6,foo)}
f-h:{(#10,RANGEKEYSET,@6,foo)}

[ Question: the snapshot prevented the RANGEKEYSET from being removed when writing out the SST file, correct? ]

The virtual raw range key iter contains:

c-d:{(#10,RANGEKEYSET,@6,foo)}
d-f:{(#10,RANGEKEYDEL) (#10,RANGEKEYSET,@6,foo)}

Tracking what our hefty iterator stack is doing is pretty hard, but it looks like the `keyspan.MergingIter` is calling `UserIteratorConfig.Transform`, which skips RANGEKEYDEL keys.

The snapshot is important for the reproduction. Without the snapshot, the SST contains nicely resolved fragments:

c-d:{(#10,RANGEKEYSET,@6,foo)}
d-f:{(#10,RANGEKEYDEL)}
f-h:{(#10,RANGEKEYSET,@6,foo)}

Also, if we query the original db2, we get the correct result: `iter25.SeekGE("e@3", "") // [true,"f",<no point>,["f","h")=>{"@6"="foo"}*] <nil> #7`. In this case `MergingIter`->`Transform`->`collapse` does the right thing, because the Delete key shadows the other key (as per this invariant).

It feels like the problem stems from the fact that we rewrite the sequence numbers to the same value while we have keys that are supposed to shadow other keys. With the new sequence numbers, the RangeKeyDelete no longer shadows the RangeKeySet because they have the same sequence number.
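A toy model of the collapse step's shadowing rule (not Pebble's actual `Transform` code; types are hypothetical) shows why the ingest-time seqnum rewrite defeats it:

```go
package main

import "fmt"

type kind int

const (
	rangeKeyDel kind = iota
	rangeKeySet
)

// rangeKey is a simplified range key within one fragment.
type rangeKey struct {
	seqNum uint64
	kind   kind
}

// shadows reports whether a RANGEKEYDEL hides a RANGEKEYSET within the
// same fragment, in the spirit of the collapse step: a delete only
// shadows keys at strictly lower sequence numbers.
func shadows(del, set rangeKey) bool {
	return del.kind == rangeKeyDel && set.seqNum < del.seqNum
}

func main() {
	// In the original db2 the delete was written after the set, so its
	// higher seqnum shadows the set.
	fmt.Println(shadows(rangeKey{12, rangeKeyDel}, rangeKey{10, rangeKeySet})) // true
	// After ingest-time seqnum rewriting both keys carry the same seqnum
	// (#10 in the fragments above), so the delete no longer shadows the
	// set and the stale RANGEKEYSET resurfaces.
	fmt.Println(shadows(rangeKey{10, rangeKeyDel}, rangeKey{10, rangeKeySet})) // false
}
```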

@itsbilal what do you think? Perhaps we never want to share SSTs that have unresolved keys because of snapshots?

Also, what assertions are we missing? Should we somehow check that shared SSTs have no shadowed keys?

itsbilal added a commit to itsbilal/pebble that referenced this issue Dec 27, 2023
This change updates the virtual sstable range key iterator to
hide obsolete range keys for shared ingested files. It also
updates the sstable writer to set the obsolete key bit for
range key spans that are underneath existing range key spans
with a matching prefix and/or the one above is a del.

This change also cleans up some dangling references to the older
`isForeign` way of determining whether to hide obsolete keys,
in lieu of the newer isSharedIngested approach to address
boomerang shared files.

Fixes cockroachdb#3174.
@itsbilal
Member

@RaduBerinde thank you for digging into this and for the clear and concise explanation of what's going on!

As we discussed on Slack, the best approach to addressing this is to use the hideObsolete logic for block iters to hide away the keys that were preserved only for snapshots. I opened #3187, which applies that fix. In your example the shadowed RANGEKEYSET would have its obsolete bit set, and when that sstable gets shared, the receiver will hide the RANGEKEYSET before applying the sequence number substitution.

I'll have to think more about what assertions we can add that'd harden this. It might be tricky but we can probably throw something into UserIteratorConfig that catches this case for shared ingested files only.
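A minimal sketch of the reader-side half of that fix, with hypothetical types (Pebble's actual logic lives in the sstable iterators and the range key transformer): when iterating a shared ingested file, keys carrying the obsolete bit are dropped before sequence number substitution.

```go
package main

import "fmt"

// spanKey is a simplified range key within one fragment.
type spanKey struct {
	value    string
	obsolete bool // set by the writer when the key is shadowed and kept only for snapshots
}

// transformSharedIngested sketches the idea behind the fix: when reading a
// shared ingested sstable, drop keys whose obsolete bit is set before the
// sequence numbers are substituted.
func transformSharedIngested(keys []spanKey) []spanKey {
	out := keys[:0:0]
	for _, k := range keys {
		if !k.obsolete {
			out = append(out, k)
		}
	}
	return out
}

func main() {
	// The shadowed RANGEKEYSET from the d-f fragment carries the obsolete
	// bit, so only the RANGEKEYDEL survives on the receiving store.
	keys := []spanKey{{value: "RANGEKEYDEL"}, {value: "RANGEKEYSET @6=foo", obsolete: true}}
	fmt.Println(transformSharedIngested(keys))
}
```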

itsbilal added a commit to itsbilal/pebble that referenced this issue Jan 2, 2024
This change updates the virtual sstable range key iterator to
hide obsolete range keys for shared ingested files. It does
this by reusing the ForeignSSTTransformer that was removed
in cockroachdb#2782.

This change also cleans up some dangling references to the older
`isForeign` way of determining whether to hide obsolete keys,
in lieu of the newer isSharedIngested approach to address
boomerang shared files.

Fixes cockroachdb#3174.
itsbilal added a commit that referenced this issue Jan 3, 2024
This change updates the virtual sstable range key iterator to
hide obsolete range keys for shared ingested files. It does
this by reusing the ForeignSSTTransformer that was removed
in #2782.

This change also cleans up some dangling references to the older
`isForeign` way of determining whether to hide obsolete keys,
in lieu of the newer isSharedIngested approach to address
boomerang shared files.

Fixes #3174.
@jbowens jbowens moved this to Done in [Deprecated] Storage Jun 4, 2024