Sequence Handling
NOTE: All of the information below describes the current (June 2015) Sync Gateway approach to sequence management.
The replication protocol used by Sync Gateway and Couchbase Lite is based on documents being associated with unique, monotonically increasing sequence values. When a client initiates a pull replication, it asks Sync Gateway for all changes since a specified sequence. Sync Gateway responds with the list of changed documents since this sequence, and also returns a last sequence value (lastseq).
With this response, Sync Gateway is providing a consistency guarantee to the client: that the client has been sent the complete set of documents (filtered for access control) between since and lastseq. The next time that client replicates, it sends lastseq as the new since value.
The sequence value itself is intended to be opaque to the client, and can be defined as any JSON element.
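The since/lastseq exchange can be sketched as a small checkpoint loop. This is an illustrative sketch only: `fake_changes`, `pull_once`, `CHANGE_LOG`, and the `results`/`last_seq` field names are assumptions made for the example, standing in for a real changes request.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical in-memory stand-in for the server's change log: (sequence, doc_id).
CHANGE_LOG: List[Tuple[int, str]] = [(1, "a"), (2, "b"), (3, "a"), (5, "c")]


def fake_changes(since: int) -> Dict[str, Any]:
    """Return every change after `since`, plus the last sequence covered."""
    results = [{"seq": s, "id": d} for s, d in CHANGE_LOG if s > since]
    last_seq = results[-1]["seq"] if results else since
    return {"results": results, "last_seq": last_seq}


def pull_once(since: int, fetch_changes: Callable[[int], Dict[str, Any]]):
    """Run one pull: fetch changes after `since`, return (doc ids, new checkpoint)."""
    response = fetch_changes(since)
    doc_ids = [change["id"] for change in response["results"]]
    # The client treats last_seq as opaque and simply echoes it back next time.
    return doc_ids, response["last_seq"]


checkpoint = 0
first_ids, checkpoint = pull_once(checkpoint, fake_changes)

CHANGE_LOG.append((6, "d"))  # a new write lands between replications
second_ids, checkpoint = pull_once(checkpoint, fake_changes)
```

The second pull only sees the change written after the first checkpoint, which is the behaviour the consistency guarantee relies on.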
Each document written by Sync Gateway includes a sequence value, stored in the document's _sync metadata:
{
  "_sync": {
    "rev": "1-480a0a76c43f80e572405c164ffc7e3d",
    "sequence": 181,
    "recent_sequences": [
      181
    ],
    "history": {
      "revs": [
        "1-480a0a76c43f80e572405c164ffc7e3d"
      ],
      "parents": [
        -1
      ],
      "channels": [
        null
      ]
    },
    "time_saved": "2015-06-18T14:34:56.349529424-07:00"
  },
  "value": "1"
}
Sync Gateway uses a counter document in the bucket (_sync:seq) to generate new sequence values. Whenever Sync Gateway writes a new document to the bucket, it first does an atomic increment on the _sync:seq value to obtain a new sequence value. The write proceeds in these steps:
1. Validate the document (using the Sync Function).
2. If valid, obtain a new sequence for the document by incrementing _sync:seq, and insert it as the sequence property in the document's _sync metadata.
3. Do a CAS update of the document.
4. If the CAS fails, steps 1 and 2 are repeated. When repeated, the previously allocated sequence(s) are stored in the document's _sync metadata (as unusedSequences), for use during sequence buffering (see below).
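The write steps above can be sketched as a retry loop. This is a minimal sketch under stated assumptions, not Sync Gateway's implementation: `FakeBucket`, `write_document`, and the `before_cas` hook are hypothetical names, and the bucket is an in-memory stand-in with a counter and a compare-and-swap write.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple


class FakeBucket:
    """Hypothetical in-memory bucket with an atomic counter and CAS writes."""

    def __init__(self) -> None:
        self.counter = 0  # plays the role of the _sync:seq counter document
        self.docs: Dict[str, Tuple[int, Dict[str, Any]]] = {}  # id -> (cas, body)

    def incr(self) -> int:
        self.counter += 1
        return self.counter

    def get(self, doc_id: str) -> Tuple[int, Dict[str, Any]]:
        return self.docs.get(doc_id, (0, {}))

    def cas_write(self, doc_id: str, expected_cas: int, body: Dict[str, Any]) -> bool:
        current_cas, _ = self.get(doc_id)
        if current_cas != expected_cas:
            return False  # someone else wrote the document first
        self.docs[doc_id] = (current_cas + 1, body)
        return True


def write_document(bucket: FakeBucket, doc_id: str, value: Any,
                   validate: Callable[[Any], bool],
                   before_cas: Optional[Callable[[], None]] = None) -> int:
    """Validate, allocate a sequence, CAS-write; on CAS failure, retry."""
    unused: List[int] = []  # sequences allocated by earlier, failed attempts
    while True:
        cas, _ = bucket.get(doc_id)
        if not validate(value):                 # step 1
            raise ValueError("document rejected by validation")
        seq = bucket.incr()                     # step 2
        sync: Dict[str, Any] = {"sequence": seq}
        if unused:
            sync["unusedSequences"] = list(unused)
        if before_cas is not None:
            before_cas()                        # hook to simulate a racing writer
        if bucket.cas_write(doc_id, cas, {"_sync": sync, "value": value}):  # step 3
            return seq
        unused.append(seq)                      # step 4: remember it, then retry


# Usage: force one CAS failure by writing the same document mid-attempt.
bucket = FakeBucket()
attempts = {"n": 0}

def interfere() -> None:
    attempts["n"] += 1
    if attempts["n"] == 1:
        bucket.cas_write("doc1", 0, {"value": "other writer"})

final_seq = write_document(bucket, "doc1", "1", lambda v: True, interfere)
stored_sync = bucket.get("doc1")[1]["_sync"]
```

In this run the first attempt allocates sequence 1 and loses the CAS race, so the successful write carries sequence 2 and records 1 under unusedSequences.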
Sync Gateway listens to the mutation feed (TAP or DCP) from the Couchbase Server cluster to build an in-memory cache of recent mutations. This cache is used (in part) to handle replication requests. There is no ordering guarantee for sequence values arriving on the mutation feed (there are many opportunities for variable latency between a Sync Gateway node incrementing _sync:seq, Sync Gateway writing the document to a Server node, and that document showing up on the feed).
In order to deliver the client consistency guarantee, Sync Gateway buffers the sequences seen on the feed, and doesn't replicate data to the client until it has a contiguous set of sequence values. Sync Gateway tracks the highest sequence up to which every sequence value has been seen on the feed (the stable sequence value) and only replicates documents up to that sequence value.
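A rough sketch of this buffering, assuming sequences start at 1 and that a document's unusedSequences count as "seen" when the document arrives (an interpretation for illustration; `SequenceBuffer` and `receive` are hypothetical names):

```python
from typing import Iterable, Set


class SequenceBuffer:
    """Tracks the stable sequence: every sequence <= stable has been seen."""

    def __init__(self, stable: int = 0) -> None:
        self.stable = stable
        self.pending: Set[int] = set()  # seen sequences still above a gap

    def receive(self, seq: int, unused: Iterable[int] = ()) -> int:
        """Record a sequence from the feed and return the new stable sequence."""
        for s in (seq, *unused):
            if s > self.stable:
                self.pending.add(s)
        # Advance across every contiguous sequence now available.
        while self.stable + 1 in self.pending:
            self.stable += 1
            self.pending.remove(self.stable)
        return self.stable


buf = SequenceBuffer()
after_two = buf.receive(2)               # 1 is still missing, so nothing is stable
after_one = buf.receive(1)               # gap filled: 1 and 2 become stable
after_four = buf.receive(4, unused=[3])  # 3 was allocated but never written
```

Only documents with a sequence at or below the returned value would be eligible for replication in this model.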
There are some optimizations in place to ensure that slow-arriving sequences don't block replication indefinitely: if sequences have been pending in the buffer for more than a fixed interval (defaulting to 5s), Sync Gateway can send these to clients with a compound sequence number of the form stable_seq::seq. Clients that subsequently send a since value that's a compound sequence will receive all mutations since stable_seq, and deduplicate any previously seen revisions using the standard revs_diff replication processing.
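As an illustration of the compound form, the sketch below splits a since value into its two parts and picks the point to resume from. The helper names `parse_since` and `resume_point` are hypothetical, and the sketch assumes both parts are plain integers.

```python
from typing import Tuple, Union


def parse_since(since: Union[int, str]) -> Tuple[int, int]:
    """Split a since value into (stable_seq, seq).

    A simple value such as 181 yields (181, 181); a compound value such as
    "180::185" yields (180, 185).
    """
    text = str(since)
    if "::" in text:
        stable_part, seq_part = text.split("::", 1)
        return int(stable_part), int(seq_part)
    value = int(text)
    return value, value


def resume_point(since: Union[int, str]) -> int:
    """Changes are replayed from stable_seq, so some may be sent twice."""
    return parse_since(since)[0]


simple = parse_since(181)
compound = parse_since("180::185")
start = resume_point("180::185")
```

Replaying from the stable part may resend revisions the client already holds, which is why the text relies on revs_diff to discard them.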