changefeedccl: time-bound iterator and CDC in 2.1 #32799

danhhz · 2018-12-03T19:32:24Z

There is an unresolved bug in time-bound iterator. It manifests as incorrectly seeing an unresolved intent (the bug is that deletion of the intent key is missed). All current ideas for fixing this are too scary to backport to 2.1, but changefeeds use them (via ExportRequest). See much more detail on the bug in #28358. We have a customer commitment for changefeeds in 2.1 so something has to be done here. Some possible options:

(1) TBIs are a performance optimization, we could disable them. This is what we did in almost every use of them in crdb. Changefeeds and incremental backup (the two uses of ExportRequest) are the only remaining TBI uses in 2.1.
(2) When an ExportRequest hits what appears to be an unresolved intent, double check with a normal (non-TBI) iterator.

danhhz · 2018-12-03T20:02:34Z

I ran some performance tests for option (1) last week. Summary: I'm not sure that disabling time-bound iterators completely is workable.

Changefeeds in 2.1 work by repeatedly polling all the "watched" data for any changes. This is done by picking a new high-water mark and sending ExportRequests with the previous and new high-water as time bounds. This poll happens every changefeed.experimental_poll_interval which currently defaults to 1s.

Setting the time bounds in ExportRequest triggers the time-bound iterator optimization. Internally, it uses the bounds along with the metadata on the sstables in the RocksDB LSM to avoid even looking at ones that can't possibly relevant. Since changefeeds are constantly looking at only the most recent data, this means they get to skip over most of the larger sstables in the lower levels of the LSM. Running tpcc warehouses=1000 on a 3 node n1-standard-16 cluster, a full poll usually takes ~100ms. Most of the time, it easily finishes within the 1s poll duration.

Last week, I ran the above test with TBIs completely disabled. Initially, the max ExportRequest duration for a single range stays around 10-20s (contrast this to 100ms for all ~5k ranges with TBI). The requests are run with some limited amount of concurrency and it all works out to a new poll every 10-15m. Over time, the max duration for a single range continually creeps up. I ran the test on and off (kept hitting roachtest timeouts that I thought I disabled) for 3 days (before my cluster met its premature demise 😢) and the max was up to ~300s. p50/p90 started at 1.6s/4.8s and got to 7.0s/20.4s.

One theory for this is at the start of the test, we've just done a restore and all data is in L6 tables. As the test runs, the intermediate LSM levels fill in. If the load and changfeed are stopped and restarted, it doesn't drop back down, so I don't think its resource leakage.

Ultimately, the good news is that somehow, the most recent resolved timestamp of the changefeeds seems to stay around 8-15m behind realtime. However, the time these ExportRequests are taking (and the growth over several days) makes it hard to be confidant that we could productionize it.

danhhz · 2018-12-04T17:57:48Z

Here's a graph with the difference between (1) and and a prototype of (2). Ignore the small spikes, they're from shutting my laptop while commuting.

For reasons described in cockroachdb#28358, a time-bound iterator will sometimes see an unresolved intent where there is none. A normal iterator doesn't have this problem, so we work around it in MVCCIncrementalIterator by double checking all non-value keys. If the normal iterator has a different value for the key, it's used instead. If the normal iterator doesn't have that key, it's skipped. This fixes both changefeeds and incremental backup. Closes cockroachdb#32104 Closes cockroachdb#32799 Release note (bug fix): `CHANGEFEED`s and incremental `BACKUP`s no longer indefinitely hang under an infrequent condition.

32909: engineccl/mvcc: work around time-bound iterator bug r=bdarnell,benesch a=danhhz For reasons described in #28358, a time-bound iterator will sometimes see an unresolved intent where there is none. A normal iterator doesn't have this problem, so we work around it in MVCCIncrementalIterator by double checking all non-value keys. If the normal iterator has a different value for the key, it's used instead. If the normal iterator doesn't have that key, it's skipped. This fixes both changefeeds and incremental backup. Closes #32104 Closes #32799 Release note (bug fix): `CHANGEFEED`s and incremental `BACKUP`s no longer indefinitely hang under an infrequent condition. Co-authored-by: Daniel Harrison <[email protected]>

For reasons described in cockroachdb#28358, a time-bound iterator will sometimes see an unresolved intent where there is none. A normal iterator doesn't have this problem, so we work around it in MVCCIncrementalIterator by double checking all non-value keys. If the normal iterator has a different value for the key, it's used instead. If the normal iterator doesn't have that key, it's skipped. This fixes both changefeeds and incremental backup. Closes cockroachdb#32104 Closes cockroachdb#32799 Release note (bug fix): `CHANGEFEED`s and incremental `BACKUP`s no longer indefinitely hang under an infrequent condition.

danhhz added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. C-performance Perf of queries or internals. Solution not expected to change functional behavior. A-cdc Change Data Capture labels Dec 3, 2018

danhhz self-assigned this Dec 3, 2018

danhhz mentioned this issue Dec 5, 2018

libroach: timebound iterator metadata doesn't consider intents #28358

Closed

danhhz mentioned this issue Dec 6, 2018

engineccl/mvcc: work around time-bound iterator bug #32909

Merged

danhhz mentioned this issue Dec 13, 2018

release-2.1: engineccl/mvcc: work around time-bound iterator bug #33141

Merged

craig bot closed this as completed in #32909 Dec 13, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

changefeedccl: time-bound iterator and CDC in 2.1 #32799

changefeedccl: time-bound iterator and CDC in 2.1 #32799

danhhz commented Dec 3, 2018

danhhz commented Dec 3, 2018

danhhz commented Dec 4, 2018

changefeedccl: time-bound iterator and CDC in 2.1 #32799

changefeedccl: time-bound iterator and CDC in 2.1 #32799

Comments

danhhz commented Dec 3, 2018

danhhz commented Dec 3, 2018

danhhz commented Dec 4, 2018