Merge branch 'main' into esql-data-type-converter

fang-xing-esql · Mar 16, 2024 · e6a4b09
2 parents 64bc8a6 + 387eb38

Showing 108 changed files with 3,904 additions and 203 deletions.
diff --git a/docs/changelog/101656.yaml b/docs/changelog/101656.yaml
@@ -0,0 +1,5 @@
+pr: 101656
+summary: Adjust interception of requests for specific shard IDs
+area: Authorization
+type: bug
+issues: []
diff --git a/docs/changelog/105709.yaml b/docs/changelog/105709.yaml
@@ -0,0 +1,6 @@
+pr: 105709
+summary: Disable validate when rewrite parameter is sent and the index access control
+  list is non-null
+area: Security
+type: bug
+issues: []
diff --git a/docs/changelog/105714.yaml b/docs/changelog/105714.yaml
@@ -0,0 +1,5 @@
+pr: 105714
+summary: Cross check livedocs for terms aggs when index access control list is non-null
+area: "Aggregations"
+type: bug
+issues: []
diff --git a/docs/changelog/106315.yaml b/docs/changelog/106315.yaml
@@ -0,0 +1,5 @@
+pr: 106315
+summary: Updating the tika version to 2.9.1 in the ingest attachment plugin
+area: Ingest Node
+type: upgrade
+issues: []
diff --git a/docs/changelog/106327.yaml b/docs/changelog/106327.yaml
@@ -0,0 +1,5 @@
+pr: 106327
+summary: Serialize big array vectors
+area: ES|QL
+type: enhancement
+issues: []
diff --git a/docs/changelog/106351.yaml b/docs/changelog/106351.yaml
@@ -0,0 +1,6 @@
+pr: 106351
+summary: "Fix error on sorting unsortable `geo_point` and `cartesian_point`"
+area: ES|QL
+type: bug
+issues:
+ - 106007
diff --git a/docs/changelog/106373.yaml b/docs/changelog/106373.yaml
@@ -0,0 +1,5 @@
+pr: 106373
+summary: Serialize big array blocks
+area: ES|QL
+type: enhancement
+issues: []
diff --git a/docs/reference/esql/esql-get-started.asciidoc b/docs/reference/esql/esql-get-started.asciidoc
@@ -9,6 +9,11 @@ preview::["Do not use {esql} on production environments. This functionality is i
 
 This guide shows how you can use {esql} to query and aggregate your data.
 
+[TIP]
+====
+This getting started guide is also available as an https://github.com/elastic/elasticsearch-labs/blob/main/notebooks/esql/esql-getting-started.ipynb[interactive Python notebook] in the `elasticsearch-labs` GitHub repository.
+====
+
 [discrete]
 [[esql-getting-started-prerequisites]]
 === Prerequisites

diff --git a/docs/reference/esql/esql-limitations.asciidoc b/docs/reference/esql/esql-limitations.asciidoc
@@ -73,6 +73,18 @@ unsupported type is not explicitly used in a query, it is returned with `null`
 values, with the exception of nested fields. Nested fields are not returned at
 all.
 
+[discrete]
+==== Limitations on supported types
+
+Some <<mapping-types,field types>> are not supported in all contexts:
+
+* Spatial types are not supported in the <<esql-sort,SORT>> processing command.
+  Specifying a column of one of these types as a sort parameter will result in an error:
+** `geo_point`
+** `geo_shape`
+** `cartesian_point`
+** `cartesian_shape`
+
 [discrete]
 [[esql-_source-availability]]
 === _source availability
@@ -140,12 +152,6 @@ you query, and query `keyword` sub-fields instead of `text` fields.
 
 {esql} does not support querying time series data streams (TSDS).
 
-[discrete]
-[[esql-limitations-ccs]]
-=== {ccs-cap} is not supported
-
-{esql} does not support {ccs}.
-
 [discrete]
 [[esql-limitations-date-math]]
 === Date math limitations

diff --git a/docs/reference/troubleshooting.asciidoc b/docs/reference/troubleshooting.asciidoc
@@ -56,6 +56,7 @@ fix problems that an {es} deployment might encounter.
 * <<watcher-troubleshooting,Troubleshooting Watcher>>
 * <<troubleshooting-searches,Troubleshooting searches>>
 * <<troubleshooting-shards-capacity-issues,Troubleshooting shards capacity>>
+* <<troubleshooting-unbalanced-cluster,Troubleshooting an unbalanced cluster>>
 * <<remote-clusters-troubleshooting,Troubleshooting remote clusters>>
 
 [discrete]
@@ -135,3 +136,5 @@ include::watcher/troubleshooting.asciidoc[]
 include::troubleshooting/troubleshooting-searches.asciidoc[]
 
 include::troubleshooting/troubleshooting-shards-capacity.asciidoc[]
+
+include::troubleshooting/troubleshooting-unbalanced-cluster.asciidoc[]
diff --git a/docs/reference/troubleshooting/troubleshooting-unbalanced-cluster.asciidoc b/docs/reference/troubleshooting/troubleshooting-unbalanced-cluster.asciidoc
@@ -0,0 +1,74 @@
+[[troubleshooting-unbalanced-cluster]]
+== Troubleshooting an unbalanced cluster
+
+Elasticsearch balances shards across data tiers to achieve a good compromise between:
+
+* shard count
+* disk usage
+* write load (for indices in data streams)
+
+Elasticsearch does not take into account the amount or complexity of search queries when rebalancing shards.
+Search load is instead balanced indirectly, by balancing shard count and disk usage.
+
+There is no guarantee that each of these components will be evenly spread across the nodes.
+For example, some nodes might have fewer shards, or use less disk space,
+but be assigned shards with higher write loads.
+
+Use the <<cat-allocation,cat allocation command>> to list workloads per node:
+
+[source,console]
+--------------------------------------------------
+GET /_cat/allocation?v
+--------------------------------------------------
+// TEST[s/^/PUT test\n{"settings": {"number_of_replicas": 0}}\n/]
+
+The API returns the following response:
+
+[source,text]
+--------------------------------------------------
+shards shards.undesired write_load.forecast disk.indices.forecast disk.indices disk.used disk.avail disk.total disk.percent host      ip        node    node.role
+     1                0                 0.0                  260b         260b    47.3gb     43.4gb    100.7gb           46 127.0.0.1 127.0.0.1 CSUXak2 himrst
+--------------------------------------------------
+// TESTRESPONSE[s/\d+(\.\d+)?[tgmk]?b/\\d+(\\.\\d+)?[tgmk]?b/ s/46/\\d+/]
+// TESTRESPONSE[s/CSUXak2 himrst/.+/ non_json]
+
+This response contains the following information that influences balancing:
+
+* `shards` is the current number of shards allocated to the node
+* `shards.undesired` is the number of shards that need to be moved to other nodes to finish balancing
+* `disk.indices.forecast` is the expected disk usage according to projected shard growth
+* `write_load.forecast` is the projected total write load associated with this node
+
+A cluster is considered balanced when all shards are in their desired locations,
+which means that no further shard movements are planned (all `shards.undesired` values are equal to 0).
+
+Some operations such as node restarting, decommissioning, or changing cluster allocation settings
+are disruptive and might require multiple shards to move in order to rebalance the cluster.
+
+Shard movement order is not deterministic and is mostly determined by how ready the source and target nodes are to move a shard.
+While rebalancing is in progress, some nodes might appear busier than others.
+
+When a shard is allocated to an undesired node, it uses the resources of the current node instead of the target.
+This might cause a hotspot (disk or CPU) when the current node holds multiple shards that have not yet been
+moved to their corresponding targets.
+
+If a cluster takes a long time to finish rebalancing you might find the following log entries:
+[source,text]
+--------------------------------------------------
+[WARN][o.e.c.r.a.a.DesiredBalanceReconciler] [10%] of assigned shards (10/100) are not on their desired nodes, which exceeds the warn threshold of [10%]
+--------------------------------------------------
+This is not concerning as long as the number of such shards is decreasing and the warning appears only occasionally,
+for example after rolling restarts or changing allocation settings.
+
+If the cluster logs this warning repeatedly for an extended period of time (multiple hours),
+it is possible that the desired balance is diverging too far from the current state.
+
+If so, increase the <<shards-rebalancing-heuristics,`cluster.routing.allocation.balance.threshold`>> setting
+to reduce the sensitivity of the algorithm that tries to even out the shard count and disk usage within the cluster.
+
+Then reset the desired balance using the following API call:
+
+[source,console,id=delete-desired-balance-request-example]
+--------------------------------------------------
+DELETE /_internal/desired_balance
+--------------------------------------------------
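
As an illustration of the balance check described in troubleshooting-unbalanced-cluster.asciidoc above, here is a minimal Python sketch (not part of this commit) that reads the text output of `GET /_cat/allocation?v` and reports whether every `shards.undesired` value is 0. The helper names `parse_cat_table`, `undesired_by_node`, and `is_balanced` are hypothetical, and the parser assumes that no cell in the table is empty; rows that do not line up with the header are skipped rather than guessed at.

```python
def parse_cat_table(text):
    """Split a `?v` cat table into one dict per row, keyed by column name."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        cells = line.split()
        # Assumption: every cell is non-empty, so whitespace splitting is safe.
        # Rows whose cell count differs from the header are skipped.
        if len(cells) == len(header):
            rows.append(dict(zip(header, cells)))
    return rows


def undesired_by_node(rows):
    """Map each node name to its `shards.undesired` count."""
    return {row["node"]: int(row["shards.undesired"]) for row in rows}


def is_balanced(rows):
    """True when no node reports shards that still have to move."""
    return all(count == 0 for count in undesired_by_node(rows).values())


# Sample response copied from the documentation added in this diff.
SAMPLE = """\
shards shards.undesired write_load.forecast disk.indices.forecast disk.indices disk.used disk.avail disk.total disk.percent host      ip        node    node.role
     1                0                 0.0                  260b         260b    47.3gb     43.4gb    100.7gb           46 127.0.0.1 127.0.0.1 CSUXak2 himrst
"""

if __name__ == "__main__":
    rows = parse_cat_table(SAMPLE)
    print(undesired_by_node(rows))
    print("balanced" if is_balanced(rows) else "rebalancing in progress")
```

Running the check periodically and watching the per-node counts fall toward 0 matches the guidance above that the warning is not concerning while the number of undesired shards is decreasing.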