changefeedccl: Add a large doc comment
I found drawing out this diagram useful when working on this system,
perhaps it'll be useful to others as well.

Release note: None
stevendanna committed Apr 29, 2021
1 parent 7396811 commit 78f8e0f
Showing 2 changed files with 92 additions and 0 deletions.
1 change: 1 addition & 0 deletions pkg/ccl/changefeedccl/BUILD.bazel
@@ -8,6 +8,7 @@ go_library(
"changefeed_dist.go",
"changefeed_processors.go",
"changefeed_stmt.go",
"doc.go",
"encoder.go",
"errors.go",
"metrics.go",
91 changes: 91 additions & 0 deletions pkg/ccl/changefeedccl/doc.go
@@ -0,0 +1,91 @@
// Copyright 2021 The Cockroach Authors.
//
// Licensed as a CockroachDB Enterprise file under the Cockroach Community
// License (the "License"); you may not use this file except in compliance with
// the License. You may obtain a copy of the License at
//
// https://github.com/cockroachdb/cockroach/blob/master/licenses/CCL.txt

/*
Package changefeedccl is the internal implementation behind
changefeeds.

Changefeeds emit KV events on user-specified tables to user-specified
sinks.

Changefeeds are built on top of rangefeeds, which provide a stream of
KV events for a given keyspan as well as periodic "resolved
timestamps" for those spans. For more information on rangefeeds see

    docs/RFCS/20170613_range_feeds_storage_primitive.md

The changefeed machinery encodes and delivers both the KV events
and resolved timestamps to the sinks. It further uses the resolved
timestamps to periodically checkpoint a changefeed's progress such
that it can be resumed in the case of a failure.

To ensure that we can correctly encode every KV returned by the
rangefeed, changefeeds also monitor for schema changes.

"Enterprise" changefeeds are all changefeeds with a sink. These
feeds emit KV events to external systems and are run via the job
system.

"Sinkless" or "Experimental" changefeeds are changefeeds without a
sink which emit rows back to the original SQL node that issues the
CREATE CHANGEFEED request.

The major components of this system are:

changefeedAggregator: Reads events from a kvfeed, encodes and emits
KV events to the sink and forwards resolved timestamps to the
changeFrontier.

changeFrontier: Keeps track of the high-watermark of resolved
timestamps seen across the spans we are tracking. Periodically, it
emits resolved timestamps to the sink and checkpoints the
changefeed progress in the job system.

kvfeed: Coordinates the consumption of the rangefeed with the
schemafeed. It starts a set of goroutines that consume the
rangefeed events and forward them back to the
changefeedAggregator once the schema for the event is known.

schemafeed: Periodically polls the table descriptors
table. Rangefeed events are held until the schemafeed is sure it
knows the schema for the relevant table at the event's timestamp.

               +-----------------+
+------+       |                 |       +-----+
| sink |<------+ changeFrontier  +------>| job |
+------+       |                 |       +-----+
               +--------+--------+
                        ^
                        |
                +-------+--------+
+------+        |                |
| sink +<-------+ changefeedAgg  |<------------+
+------+        |                |             |
                +--+-------------+        chanBuffer
                   |                           |
                   v                    +------+------+
                 +--------------+       |             |
                 |              +------>| copyFromTo  +--+
                 |    kvfeed    |       |             |  |
                 |              |       +------+------+  |
                 +--------+---+-+              ^         |
                          |   |            memBuffer     |
                          |   |                |         |
                          |   |          +-----+------+  |   +-----------+
                          |   |          |            |  |   |           |
                          |   +--------> |physical    +----->| rangefeed |
                          |              |   feed     |  |   |           |
                          |              +------------+  |   +-----------+
                          |                              |
                          |                              |
                          |              +------------+  |
                          +------------> | schemafeed |<-|
                                         | (polls)    |
                                         +------------+
*/
package changefeedccl
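
For readers new to the "high-watermark of resolved timestamps" that the changeFrontier description mentions, the following is a minimal sketch of the idea, not the code in this commit. It assumes the frontier is the minimum resolved timestamp across all tracked spans, so a checkpoint can only move forward once every span has resolved past it. The names `Timestamp`, `spanFrontier`, `newSpanFrontier`, `Forward`, and `Frontier` are hypothetical and chosen for illustration; spans are modeled as plain strings and timestamps as integers.

```go
package main

import "fmt"

// Timestamp is a simplified stand-in for a resolved timestamp
// (assumption: a single integer rather than a hybrid logical clock).
type Timestamp int64

// spanFrontier is a hypothetical tracker that records the latest
// resolved timestamp reported for each span.
type spanFrontier struct {
	resolved map[string]Timestamp
}

// newSpanFrontier starts every tracked span at timestamp 0.
func newSpanFrontier(spans ...string) *spanFrontier {
	f := &spanFrontier{resolved: make(map[string]Timestamp, len(spans))}
	for _, s := range spans {
		f.resolved[s] = 0
	}
	return f
}

// Frontier returns the lowest resolved timestamp across all spans.
// Everything at or below this timestamp has been seen for every span.
func (f *spanFrontier) Frontier() Timestamp {
	first := true
	var lowest Timestamp
	for _, ts := range f.resolved {
		if first || ts < lowest {
			lowest, first = ts, false
		}
	}
	return lowest
}

// Forward records a resolved timestamp for one span. It reports whether
// the overall frontier advanced, which is the moment a checkpoint or a
// resolved-timestamp emission would make sense. Unknown spans and
// timestamps that do not move a span forward are ignored.
func (f *spanFrontier) Forward(span string, ts Timestamp) bool {
	cur, ok := f.resolved[span]
	if !ok || ts <= cur {
		return false
	}
	before := f.Frontier()
	f.resolved[span] = ts
	return f.Frontier() > before
}

func main() {
	f := newSpanFrontier("a-m", "m-z")

	// Only one span has resolved, so the frontier stays at 0.
	fmt.Println(f.Forward("a-m", 10), f.Frontier()) // false 0

	// Both spans have now resolved; the frontier is the smaller value.
	fmt.Println(f.Forward("m-z", 7), f.Frontier()) // true 7
}
```

Taking the minimum rather than the maximum is the design choice that makes resumption safe in this sketch: restarting from the frontier can re-emit events for spans that were further ahead, but it cannot skip events for a span that was lagging.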