diff --git a/docs/reference/docs.asciidoc b/docs/reference/docs.asciidoc
index a8ab282853c8f..5c4c471b0a131 100644
--- a/docs/reference/docs.asciidoc
+++ b/docs/reference/docs.asciidoc
@@ -50,3 +50,5 @@ include::docs/termvectors.asciidoc[]
 include::docs/multi-termvectors.asciidoc[]

 include::docs/refresh.asciidoc[]
+
+include::docs/concurrency-control.asciidoc[]
diff --git a/docs/reference/docs/bulk.asciidoc b/docs/reference/docs/bulk.asciidoc
index 7ee634ccef649..0aae2365d965e 100644
--- a/docs/reference/docs/bulk.asciidoc
+++ b/docs/reference/docs/bulk.asciidoc
@@ -197,6 +197,17 @@ size for your particular workload.
 If using the HTTP API, make sure that the client does not send HTTP
 chunks, as this will slow things down.

+[float]
+[[bulk-optimistic-concurrency-control]]
+=== Optimistic Concurrency Control
+
+Each `index` and `delete` action within a bulk API call may include the
+`if_seq_no` and `if_primary_term` parameters in their respective action
+and meta data lines. The `if_seq_no` and `if_primary_term` parameters control
+how operations are executed, based on the last modification to existing
+documents. See <<optimistic-concurrency-control>> for more details.
+
+
 [float]
 [[bulk-versioning]]
 === Versioning
diff --git a/docs/reference/docs/concurrency-control.asciidoc b/docs/reference/docs/concurrency-control.asciidoc
new file mode 100644
index 0000000000000..d457b14068e26
--- /dev/null
+++ b/docs/reference/docs/concurrency-control.asciidoc
@@ -0,0 +1,114 @@
+[[optimistic-concurrency-control]]
+== Optimistic concurrency control
+
+Elasticsearch is distributed. When documents are created, updated, or deleted,
+the new version of the document has to be replicated to other nodes in the cluster.
+Elasticsearch is also asynchronous and concurrent, meaning that these replication
+requests are sent in parallel, and may arrive at their destination out of sequence.
+Elasticsearch needs a way of ensuring that an older version of a document never
+overwrites a newer version.
+
+
+To ensure an older version of a document doesn't overwrite a newer version, every
+operation performed on a document is assigned a sequence number by the primary
+shard that coordinates that change. The sequence number is increased with each
+operation and thus newer operations are guaranteed to have a higher sequence
+number than older operations. Elasticsearch can then use the sequence number of
+operations to make sure a newer document version is never
+overridden by a change that has a smaller sequence number assigned to it.
+
+For example, the following indexing command will create a document and assign it
+an initial sequence number and primary term:
+
+[source,js]
+--------------------------------------------------
+PUT products/_doc/1567
+{
+    "product" : "r2d2",
+    "details" : "A resourceful astromech droid"
+}
+--------------------------------------------------
+// CONSOLE
+
+You can see the assigned sequence number and primary term in
+the `_seq_no` and `_primary_term` fields of the response:
+
+[source,js]
+--------------------------------------------------
+{
+    "_shards" : {
+        "total" : 2,
+        "failed" : 0,
+        "successful" : 1
+    },
+    "_index" : "products",
+    "_type" : "_doc",
+    "_id" : "1567",
+    "_version" : 1,
+    "_seq_no" : 362,
+    "_primary_term" : 2,
+    "result" : "created"
+}
+--------------------------------------------------
+// TESTRESPONSE[s/"_seq_no" : \d+/"_seq_no" : $body._seq_no/ s/"_primary_term" : 2/"_primary_term" : $body._primary_term/]
+
+
+Elasticsearch keeps track of the sequence number and primary term of the last
+operation to have changed each of the documents it stores.
+The sequence number
+and primary term are returned in the `_seq_no` and `_primary_term` fields in
+the response of the <>:
+
+[source,js]
+--------------------------------------------------
+GET products/_doc/1567
+--------------------------------------------------
+// CONSOLE
+// TEST[continued]
+
+returns:
+
+[source,js]
+--------------------------------------------------
+{
+    "_index" : "products",
+    "_type" : "_doc",
+    "_id" : "1567",
+    "_version" : 1,
+    "_seq_no" : 362,
+    "_primary_term" : 2,
+    "found": true,
+    "_source" : {
+        "product" : "r2d2",
+        "details" : "A resourceful astromech droid"
+    }
+}
+--------------------------------------------------
+// TESTRESPONSE[s/"_seq_no" : \d+/"_seq_no" : $body._seq_no/ s/"_primary_term" : 2/"_primary_term" : $body._primary_term/]
+
+
+Note: The <> can return the `_seq_no` and `_primary_term`
+for each search hit by requesting the `_seq_no` and `_primary_term` <>.
+
+The sequence number and the primary term uniquely identify a change. By noting down
+the sequence number and primary term returned, you can make sure to only change the
+document if no other change was made to it since you retrieved it. This
+is done by setting the `if_seq_no` and `if_primary_term` parameters of either the
+<> or the <>.
+
+For example, the following indexing call will make sure to add a tag to the
+document without losing any potential change to the description or an addition
+of another tag by another API:
+
+[source,js]
+--------------------------------------------------
+PUT products/_doc/1567?if_seq_no=362&if_primary_term=2
+{
+    "product" : "r2d2",
+    "details" : "A resourceful astromech droid",
+    "tags": ["droid"]
+}
+--------------------------------------------------
+// CONSOLE
+// TEST[continued]
+// TEST[catch: conflict]
+
diff --git a/docs/reference/docs/delete.asciidoc b/docs/reference/docs/delete.asciidoc
index 49f31eb2d75fb..3a4559773613c 100644
--- a/docs/reference/docs/delete.asciidoc
+++ b/docs/reference/docs/delete.asciidoc
@@ -35,6 +35,16 @@ The result of the above delete operation is:
 // TESTRESPONSE[s/"_primary_term" : 1/"_primary_term" : $body._primary_term/]
 // TESTRESPONSE[s/"_seq_no" : 5/"_seq_no" : $body._seq_no/]

+[float]
+[[optimistic-concurrency-control-delete]]
+=== Optimistic concurrency control
+
+Delete operations can be made conditional and only be performed if the last
+modification to the document was assigned the sequence number and primary
+term specified by the `if_seq_no` and `if_primary_term` parameters. If a
+mismatch is detected, the operation will result in a `VersionConflictException`
+and a status code of 409. See <<optimistic-concurrency-control>> for more details.
+
 [float]
 [[delete-versioning]]
 === Versioning
diff --git a/docs/reference/docs/index_.asciidoc b/docs/reference/docs/index_.asciidoc
index b0ddb483d9f72..e52aa7faa2b83 100644
--- a/docs/reference/docs/index_.asciidoc
+++ b/docs/reference/docs/index_.asciidoc
@@ -78,91 +78,6 @@ Automatic index creation can include a pattern based white/black list,
 for example, set `action.auto_create_index` to `+aaa*,-bbb*,+ccc*,-*` (+
 meaning allowed, and - meaning disallowed).

-[float]
-[[index-versioning]]
-=== Versioning
-
-Each indexed document is given a version number.
-The associated
-`version` number is returned as part of the response to the index API
-request. The index API optionally allows for
-http://en.wikipedia.org/wiki/Optimistic_concurrency_control[optimistic
-concurrency control] when the `version` parameter is specified. This
-will control the version of the document the operation is intended to be
-executed against. A good example of a use case for versioning is
-performing a transactional read-then-update. Specifying a `version` from
-the document initially read ensures no changes have happened in the
-meantime (when reading in order to update, it is recommended to set
-`preference` to `_primary`). For example:
-
-[source,js]
---------------------------------------------------
-PUT twitter/_doc/1?version=2
-{
-    "message" : "elasticsearch now has versioning support, double cool!"
-}
---------------------------------------------------
-// CONSOLE
-// TEST[continued]
-// TEST[catch: conflict]
-
-*NOTE:* versioning is completely real time, and is not affected by the
-near real time aspects of search operations. If no version is provided,
-then the operation is executed without any version checks.
-
-By default, internal versioning is used that starts at 1 and increments
-with each update, deletes included. Optionally, the version number can be
-supplemented with an external value (for example, if maintained in a
-database). To enable this functionality, `version_type` should be set to
-`external`. The value provided must be a numeric, long value greater or equal to 0,
-and less than around 9.2e+18. When using the external version type, instead
-of checking for a matching version number, the system checks to see if
-the version number passed to the index request is greater than the
-version of the currently stored document. If true, the document will be
-indexed and the new version number used.
-If the value provided is less
-than or equal to the stored document's version number, a version
-conflict will occur and the index operation will fail.
-
-WARNING: External versioning supports the value 0 as a valid version number.
-This allows the version to be in sync with an external versioning system
-where version numbers start from zero instead of one. It has the side effect
-that documents with version number equal to zero cannot neither be updated
-using the <> nor be deleted
-using the <> as long as their
-version number is equal to zero.
-
-A nice side effect is that there is no need to maintain strict ordering
-of async indexing operations executed as a result of changes to a source
-database, as long as version numbers from the source database are used.
-Even the simple case of updating the Elasticsearch index using data from
-a database is simplified if external versioning is used, as only the
-latest version will be used if the index operations are out of order for
-whatever reason.
-
-[float]
-==== Version types
-
-Next to the `internal` & `external` version types explained above, Elasticsearch
-also supports other types for specific use cases. Here is an overview of
-the different version types and their semantics.
-
-`internal`:: only index the document if the given version is identical to the version
-of the stored document.
-
-`external` or `external_gt`:: only index the document if the given version is strictly higher
-than the version of the stored document *or* if there is no existing document. The given
-version will be used as the new version and will be stored with the new document. The supplied
-version must be a non-negative long number.
-
-`external_gte`:: only index the document if the given version is *equal* or higher
-than the version of the stored document. If there is no existing document
-the operation will succeed as well. The given version will be used as the new version
-and will be stored with the new document.
-The supplied version must be a non-negative long number.
-
-*NOTE*: The `external_gte` version type is meant for special use cases and
-should be used with care. If used incorrectly, it can result in loss of data.
-There is another option, `force`, which is deprecated because it can cause
-primary and replica shards to diverge.
-
 [float]
 [[operation-type]]
 === Operation Type
@@ -238,6 +153,16 @@ The result of the above index operation is:
 --------------------------------------------------
 // TESTRESPONSE[s/W0tpsmIBdwcYyG50zbta/$body._id/ s/"successful" : 2/"successful" : 1/]

+[float]
+[[optimistic-concurrency-control-index]]
+=== Optimistic concurrency control
+
+Index operations can be made conditional and only be performed if the last
+modification to the document was assigned the sequence number and primary
+term specified by the `if_seq_no` and `if_primary_term` parameters. If a
+mismatch is detected, the operation will result in a `VersionConflictException`
+and a status code of 409. See <<optimistic-concurrency-control>> for more details.
+
 [float]
 [[index-routing]]
 === Routing
@@ -380,3 +305,83 @@ PUT twitter/_doc/1?timeout=5m
 }
 --------------------------------------------------
 // CONSOLE
+
+[float]
+[[index-versioning]]
+=== Versioning
+
+Each indexed document is given a version number. By default,
+internal versioning is used that starts at 1 and increments
+with each update, deletes included. Optionally, the version number can be
+set to an external value (for example, if maintained in a
+database). To enable this functionality, `version_type` should be set to
+`external`. The value provided must be a numeric, long value greater than or equal to 0,
+and less than around 9.2e+18.
+
+When using the external version type, the system checks to see if
+the version number passed to the index request is greater than the
+version of the currently stored document. If true, the document will be
+indexed and the new version number used.
+If the value provided is less
+than or equal to the stored document's version number, a version
+conflict will occur and the index operation will fail. For example:
+
+[source,js]
+--------------------------------------------------
+PUT twitter/_doc/1?version=2&version_type=external
+{
+    "message" : "elasticsearch now has versioning support, double cool!"
+}
+--------------------------------------------------
+// CONSOLE
+// TEST[continued]
+
+*NOTE:* versioning is completely real time, and is not affected by the
+near real time aspects of search operations. If no version is provided,
+then the operation is executed without any version checks.
+
+The request above will succeed since the supplied version of 2 is higher than
+the current document version of 1. If the document was already updated
+and its version was set to 2 or higher, the indexing command will fail
+and result in a conflict (409 HTTP status code).
+
+WARNING: External versioning supports the value 0 as a valid version number.
+This allows the version to be in sync with an external versioning system
+where version numbers start from zero instead of one. It has the side effect
+that documents with version number equal to zero can neither be updated
+using the <> nor be deleted
+using the <> as long as their
+version number is equal to zero.
+
+A nice side effect is that there is no need to maintain strict ordering
+of async indexing operations executed as a result of changes to a source
+database, as long as version numbers from the source database are used.
+Even the simple case of updating the Elasticsearch index using data from
+a database is simplified if external versioning is used, as only the
+latest version will be used if the index operations are out of order for
+whatever reason.
+
+[float]
+==== Version types
+
+In addition to the `external` version type explained above, Elasticsearch
+also supports other types for specific use cases.
+Here is an overview of
+the different version types and their semantics.
+
+`internal`:: only index the document if the given version is identical to the version
+of the stored document.
+
+`external` or `external_gt`:: only index the document if the given version is strictly higher
+than the version of the stored document *or* if there is no existing document. The given
+version will be used as the new version and will be stored with the new document. The supplied
+version must be a non-negative long number.
+
+`external_gte`:: only index the document if the given version is *equal* or higher
+than the version of the stored document. If there is no existing document
+the operation will succeed as well. The given version will be used as the new version
+and will be stored with the new document. The supplied version must be a non-negative long number.
+
+*NOTE*: The `external_gte` version type is meant for special use cases and
+should be used with care. If used incorrectly, it can result in loss of data.
+There is another option, `force`, which is deprecated because it can cause
+primary and replica shards to diverge.
+
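Reviewer note: the acceptance rules this change documents can be summarised as two small predicates. The Python below is only an illustrative model of the prose in the diff, not Elasticsearch code. The function names and the `stored_*` arguments are hypothetical, and the behaviour of `internal` when no document exists is an assumption, since the text does not cover that case.

```python
from typing import Optional


def seq_no_check_passes(if_seq_no: int, if_primary_term: int,
                        stored_seq_no: int, stored_primary_term: int) -> bool:
    """Model of the `if_seq_no` / `if_primary_term` condition.

    The write is accepted only if the last modification to the document
    was assigned exactly the given sequence number and primary term.
    """
    return (if_seq_no == stored_seq_no
            and if_primary_term == stored_primary_term)


def version_check_passes(version_type: str, given: int,
                         stored: Optional[int]) -> bool:
    """Model of the version types listed in the docs.

    `stored` is the version of the existing document, or None if there
    is no existing document.
    """
    if given < 0:
        raise ValueError("version must be a non-negative long")
    if version_type == "internal":
        # Assumption: with no stored document there is nothing to match.
        return stored is not None and given == stored
    if version_type in ("external", "external_gt"):
        # Strictly higher than the stored version, or no existing document.
        return stored is None or given > stored
    if version_type == "external_gte":
        # Equal or higher than the stored version, or no existing document.
        return stored is None or given >= stored
    raise ValueError(f"unknown version type: {version_type}")
```

In this model, the `PUT twitter/_doc/1?version=2&version_type=external` example corresponds to `version_check_passes("external", 2, 1)`, which is accepted. The same call against a stored version of 2 is rejected, matching the 409 conflict described in the text. A rejected `seq_no_check_passes` likewise corresponds to the `VersionConflictException` case.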