
[DOCS] CCR disaster recovery #91491

Merged (30 commits) on Apr 21, 2023

Commits
d3856eb  Add bi-directional disaster recovery (Leaf-Lin, Nov 10, 2022)
c213bc7  add ccr bi-directional disaster recovery image (Leaf-Lin, Nov 10, 2022)
f3b9c0a  add link to bi-directional disaster recovery (Leaf-Lin, Nov 10, 2022)
e7dc8a2  add image (Leaf-Lin, Nov 10, 2022)
e9833ed  add [source] (Leaf-Lin, Nov 10, 2022)
7c4d5b3  fix language (Leaf-Lin, Nov 10, 2022)
432c7a5  Update bi-directional-disaster-recovery.asciidoc (Leaf-Lin, Nov 10, 2022)
98dcdcd  Update bi-directional-disaster-recovery.asciidoc (Leaf-Lin, Nov 10, 2022)
0d78708  Update bi-directional-disaster-recovery.asciidoc (Leaf-Lin, Nov 10, 2022)
4343a5d  Apply suggestions from code review (Leaf-Lin, Nov 30, 2022)
f405129  Apply suggestions from code review (Leaf-Lin, Nov 30, 2022)
3475190  Apply suggestions from code review (Leaf-Lin, Nov 30, 2022)
67c97d8  Apply suggestions from code review (Leaf-Lin, Nov 30, 2022)
14f1e71  Apply suggestions from code review (Leaf-Lin, Nov 30, 2022)
c1bb454  add test (Leaf-Lin, Nov 30, 2022)
297f6d4  Update docs/reference/ccr/bi-directional-disaster-recovery.asciidoc (Leaf-Lin, Nov 30, 2022)
1b94503  Add test (Leaf-Lin, Nov 30, 2022)
ba54b43  Add uni-directional DR doc (Leaf-Lin, Nov 30, 2022)
41f7aed  Add uni-directional image (Leaf-Lin, Nov 30, 2022)
01b9c20  add uni-directional doc reference (Leaf-Lin, Nov 30, 2022)
ad3b549  Update docs/reference/ccr/uni-directional-disaster-recovery.asciidoc (Leaf-Lin, Mar 9, 2023)
b92270a  Apply suggestions from code review (Leaf-Lin, Mar 9, 2023)
b446106  Apply suggestions from code review (Leaf-Lin, Mar 9, 2023)
67c14d3  Pushing up minor edits to restart build. Previous build failure 'Coul… (amyjtechwriter, Apr 5, 2023)
5123f3f  Merge branch 'main' into Leaf-Lin-bi-directional-DR (elasticmachine, Apr 6, 2023)
82d8cbb  Apply suggestions from code review (Leaf-Lin, Apr 14, 2023)
908d284  Merge branch 'main' into Leaf-Lin-bi-directional-DR (elasticmachine, Apr 17, 2023)
892207d  Merge branch 'main' into Leaf-Lin-bi-directional-DR (elasticmachine, Apr 17, 2023)
8a31894  Tip formatting and renaming follwer index to _copy in uni-direction (amyjtechwriter, Apr 18, 2023)
a3acfac  Fix failing CI doc check (abdonpijpelink, Apr 18, 2023)
256 changes: 256 additions & 0 deletions docs/reference/ccr/bi-directional-disaster-recovery.asciidoc
@@ -0,0 +1,256 @@
[role="xpack"]
[[ccr-disaster-recovery-bi-directional-tutorial]]
=== Tutorial: Disaster recovery based on bi-directional {ccr}
++++
<titleabbrev>Bi-directional disaster recovery</titleabbrev>
++++

Learn how to set up disaster recovery between two clusters based on
bi-directional {ccr}. The following tutorial is designed for data streams that support
<<{ref}/use-a-data-stream.html#update-docs-in-a-data-stream-by-query,update_by_query>> and
<<{ref}/use-a-data-stream.html#delete-docs-in-a-data-stream-by-query,delete_by_query>>.
You can only perform these actions on the leader index.

This tutorial works with Logstash as the source of ingestion. It takes
advantage of a Logstash feature where <<{logstash-ref}/plugins-outputs-elasticsearch,the output can be load balanced
across an array of specified hosts>>. Beats and agents currently do not
support multiple outputs. It should also be possible to set up a proxy
(load balancer) to redirect traffic without Logstash in this tutorial.

This tutorial covers:

* Setting up a remote cluster on `clusterA` and `clusterB`.
* Setting up bi-directional cross-cluster replication with exclusion patterns.
* Setting up Logstash with multiple hosts to allow automatic load balancing and switching during disasters.

image::images/ccr-bi-directional-disaster-recovery.png[Bi-directional cross cluster replication failover and failback]

==== Initial setup
. Set up a remote cluster on both clusters.
+
[source,console]
----
### On cluster A ###
PUT _cluster/settings
{
  "persistent": {
    "cluster": {
      "remote": {
        "clusterB": {
          "mode": "proxy",
          "skip_unavailable": true,
          "server_name": "clusterb.es.australia-southeast1.gcp.elastic-cloud.com",
          "proxy_socket_connections": 18,
          "proxy_address": "clusterb.es.australia-southeast1.gcp.elastic-cloud.com:9400"
        }
      }
    }
  }
}

### On cluster B ###
PUT _cluster/settings
{
  "persistent": {
    "cluster": {
      "remote": {
        "clusterA": {
          "mode": "proxy",
          "skip_unavailable": true,
          "server_name": "clustera.es.australia-southeast1.gcp.elastic-cloud.com",
          "proxy_socket_connections": 18,
          "proxy_address": "clustera.es.australia-southeast1.gcp.elastic-cloud.com:9400"
        }
      }
    }
  }
}
----
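+
TIP: You can confirm that each cluster can reach the other with the remote info API:
+
[source,console]
----
GET _remote/info
----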

. Set up bi-directional cross-cluster replication.
+
[source,console]
----
### On cluster A ###
PUT /_ccr/auto_follow/logs-generic-default
{
  "remote_cluster": "clusterB",
  "leader_index_patterns": [
    ".ds-logs-generic-default-20*"
  ],
  "leader_index_exclusion_patterns": "{{leader_index}}-replicated_from_clustera",
  "follow_index_pattern": "{{leader_index}}-replicated_from_clusterb"
}

### On cluster B ###
PUT /_ccr/auto_follow/logs-generic-default
{
  "remote_cluster": "clusterA",
  "leader_index_patterns": [
    ".ds-logs-generic-default-20*"
  ],
  "leader_index_exclusion_patterns": "{{leader_index}}-replicated_from_clusterb",
  "follow_index_pattern": "{{leader_index}}-replicated_from_clustera"
}
----
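+
TIP: You can verify which auto-follow patterns are in place on each cluster with the get auto-follow pattern API:
+
[source,console]
----
GET /_ccr/auto_follow
----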
+
IMPORTANT: Existing data on the cluster will not be replicated by
`_ccr/auto_follow` even though the patterns may match. This function will only
replicate newly created backing indices (as part of the data stream).
+
IMPORTANT: Be sure to set `leader_index_exclusion_patterns` to avoid recursion.
+
TIP: `follow_index_pattern` allows lowercase characters only.
+
TIP: This step cannot be executed via the Kibana UI because the UI lacks an exclusion
pattern field. Use the API for this step.

. Set up the Logstash configuration file.
+
The following example uses the `generator` input plugin to demonstrate the document
count in the clusters. Reconfigure this section
to suit your own use case.
+
[source,logstash]
----
### On Logstash server ###
### This is a logstash config file ###
input {
  generator {
    message => 'Hello World'
    count => 100
  }
}
output {
  elasticsearch {
    hosts => ["https://clustera.es.australia-southeast1.gcp.elastic-cloud.com:9243","https://clusterb.es.australia-southeast1.gcp.elastic-cloud.com:9243"]
    user => "logstash-user"
    password => "same_password_for_both_clusters"
  }
}
----

Review comment (Contributor): Would Logstash not be a single point of failure? I think the original blog post would index data locally in the current DC to the local cluster, avoiding this issue to some degree.

Reply (Author): Exactly. Unfortunately, the original blog post, with its DC locality isolation, does not satisfy most of the bi-directional CCR use cases, where users have a single source of ingestion going into two separate clusters. That's part of the reason I was working on this tutorial: to highlight this use case. There are other possibilities for avoiding Logstash being the single point of failure, but they are outside the scope of this tutorial.
+
IMPORTANT: The key point is that when `clusterA` is down, all traffic will be
automatically redirected to `clusterB`. Once `clusterA` comes back, traffic is
automatically redirected back to it again. This is achieved by the
`hosts` option, where multiple ES cluster endpoints are specified in the
array `[clusterA, clusterB]`.
+
TIP: Set up the same password for the same user on both clusters to use this load-balancing feature.
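+
For example, you could create the `logstash-user` on both clusters with the create user API (a sketch; `superuser` is only a placeholder, so grant whatever role your pipeline actually needs):
+
[source,console]
----
### On cluster A and cluster B ###
POST /_security/user/logstash-user
{
  "password": "same_password_for_both_clusters",
  "roles": [ "superuser" ]
}
----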

. Start Logstash with the above configuration file.
+
[source,sh]
----
### On Logstash server ###
bin/logstash -f multiple_hosts.conf
----

. Observe document counts in data streams
+
The setup above creates a data stream named `logs-generic-default`
on each of the clusters. When both clusters are alive, Logstash writes 50% of the
documents to `clusterA` and 50% of the documents to `clusterB`.
+
Bi-directional {ccr} will create one more data stream on each of the clusters
with the `-replicated_from_cluster{a|b}` suffix. At the end of this step,
you should see the following (a quick way to verify these counts is shown after the list):
+
* Data streams on `clusterA`:
** 50 documents in `logs-generic-default-replicated_from_clusterb`
** 50 documents in `logs-generic-default`
* Data streams on `clusterB`:
** 50 documents in `logs-generic-default-replicated_from_clustera`
** 50 documents in `logs-generic-default`
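+
TIP: One way to spot-check these counts is the count API. For example, on `clusterA` (data stream names assume the default setup above):
+
[source,console]
----
### On cluster A ###
GET logs-generic-default/_count
GET logs-generic-default-replicated_from_clusterb/_count
----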

. Set up queries to search across both data streams.
+
If you perform a search on `logs*` on either of the clusters, you should see 100
hits in total.
+
[source,console]
----
GET logs*/_search?size=0
----


==== Failover when `clusterA` is down
. You can simulate this by shutting down either of the clusters. Let's shut down
`clusterA` in this tutorial.
. Start Logstash with the same configuration file. (This step is not required in real
use cases where Logstash ingests continuously.)
+
[source,sh]
----
### On Logstash server ###
bin/logstash -f multiple_hosts.conf
----

. Observe that all Logstash traffic is redirected to `clusterB` automatically.
+
TIP: You should also redirect all search traffic to the `clusterB` cluster during this time.

. Observe that the two data streams on `clusterB` now contain a different number
of documents (a spot check is shown after the list):
+
* Data streams on `clusterA` (dead):
** 50 documents in `logs-generic-default-replicated_from_clusterb`
** 50 documents in `logs-generic-default`
* Data streams on `clusterB` (alive):
** 50 documents in `logs-generic-default-replicated_from_clustera`
** 150 documents in `logs-generic-default`
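+
TIP: As before, you can spot-check the counts on `clusterB` with the count API:
+
[source,console]
----
### On cluster B ###
GET logs-generic-default/_count
----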


==== Failback when `clusterA` comes back
. You can simulate this by turning `clusterA` back on.
. Observe that data ingested into `clusterB` during `clusterA`'s downtime is
automatically replicated (a way to watch the catch-up is shown after the list):
+
* Data streams on `clusterA`:
** 150 documents in `logs-generic-default-replicated_from_clusterb`
** 50 documents in `logs-generic-default`
* Data streams on `clusterB`:
** 50 documents in `logs-generic-default-replicated_from_clustera`
** 150 documents in `logs-generic-default`
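+
TIP: To watch the follower indices catch up after `clusterA` rejoins, you can poll the {ccr} stats API:
+
[source,console]
----
GET /_ccr/stats
----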

. If you have Logstash running at this time, you will also observe that traffic is
sent to both clusters.

==== Perform update or delete by query
It is possible to update or delete documents, but you can only perform these actions on the leader index.

. First, identify which backing index contains the document you want to update.
+
[source,console]
----
### On either of the clusters ###
GET logs-generic-default*/_search?filter_path=hits.hits._index
{
  "query": {
    "match": {
      "event.sequence": "97"
    }
  }
}
----
+
* If the hit returns `"_index": ".ds-logs-generic-default-replicated_from_clustera-<yyyy.MM.dd>-*"`, then you need to proceed to the next step on `clusterA`.
* If the hit returns `"_index": ".ds-logs-generic-default-replicated_from_clusterb-<yyyy.MM.dd>-*"`, then you need to proceed to the next step on `clusterB`.
* If the hit returns `"_index": ".ds-logs-generic-default-<yyyy.MM.dd>-*"`, then you need to proceed to the next step on the same cluster where you performed the search query.
. Perform the update (or delete) by query:
+
[source,console]
----
### On the cluster identified from the previous step ###
POST logs-generic-default/_update_by_query
{
  "query": {
    "match": {
      "event.sequence": "97"
    }
  },
  "script": {
    "source": "ctx._source.event.original = params.new_event",
    "lang": "painless",
    "params": {
      "new_event": "FOOBAR"
    }
  }
}
----

Review comment (Contributor): Would they not simply use individual updates or deletes instead (maybe through bulk)? I can imagine the query above giving out documents that are on both sides, and they'd have to partition the document _id to use on each cluster.

Reply (Author): Another common request from users is to perform updates or deletes in bi-directional CCR. The purpose of this section is to demonstrate the possibility of doing so. Users are still welcome to bulk delete/update on the leader index as they wish.
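For completeness, a delete by query follows the same pattern (a sketch reusing the example document above; run it on the cluster identified in the previous step):

[source,console]
----
### On the cluster identified from the previous step ###
POST logs-generic-default/_delete_by_query
{
  "query": {
    "match": {
      "event.sequence": "97"
    }
  }
}
----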
1 change: 1 addition & 0 deletions docs/reference/ccr/index.asciidoc
Expand Up @@ -343,3 +343,4 @@ include::getting-started.asciidoc[]
include::managing.asciidoc[]
include::auto-follow.asciidoc[]
include::upgrading.asciidoc[]
include::bi-directional-disaster-recovery.asciidoc[]