Merge branch 'main' into esql-data-type-converter
fang-xing-esql committed Mar 16, 2024
2 parents 64bc8a6 + 387eb38 commit e6a4b09
Showing 108 changed files with 3,904 additions and 203 deletions.
5 changes: 5 additions & 0 deletions docs/changelog/101656.yaml
@@ -0,0 +1,5 @@
pr: 101656
summary: Adjust interception of requests for specific shard IDs
area: Authorization
type: bug
issues: []
6 changes: 6 additions & 0 deletions docs/changelog/105709.yaml
@@ -0,0 +1,6 @@
pr: 105709
summary: Disable validate when rewrite parameter is sent and the index access control list is non-null
area: Security
type: bug
issues: []
5 changes: 5 additions & 0 deletions docs/changelog/105714.yaml
@@ -0,0 +1,5 @@
pr: 105714
summary: Cross check livedocs for terms aggs when index access control list is non-null
area: "Aggregations"
type: bug
issues: []
5 changes: 5 additions & 0 deletions docs/changelog/106315.yaml
@@ -0,0 +1,5 @@
pr: 106315
summary: Updating the tika version to 2.9.1 in the ingest attachment plugin
area: Ingest Node
type: upgrade
issues: []
5 changes: 5 additions & 0 deletions docs/changelog/106327.yaml
@@ -0,0 +1,5 @@
pr: 106327
summary: Serialize big array vectors
area: ES|QL
type: enhancement
issues: []
6 changes: 6 additions & 0 deletions docs/changelog/106351.yaml
@@ -0,0 +1,6 @@
pr: 106351
summary: "Fix error on sorting unsortable `geo_point` and `cartesian_point`"
area: ES|QL
type: bug
issues:
- 106007
5 changes: 5 additions & 0 deletions docs/changelog/106373.yaml
@@ -0,0 +1,5 @@
pr: 106373
summary: Serialize big array blocks
area: ES|QL
type: enhancement
issues: []
5 changes: 5 additions & 0 deletions docs/reference/esql/esql-get-started.asciidoc
@@ -9,6 +9,11 @@ preview::["Do not use {esql} on production environments. This functionality is i

This guide shows how you can use {esql} to query and aggregate your data.

[TIP]
====
This getting started guide is also available as an https://github.com/elastic/elasticsearch-labs/blob/main/notebooks/esql/esql-getting-started.ipynb[interactive Python notebook] in the `elasticsearch-labs` GitHub repository.
====

[discrete]
[[esql-getting-started-prerequisites]]
=== Prerequisites
18 changes: 12 additions & 6 deletions docs/reference/esql/esql-limitations.asciidoc
@@ -73,6 +73,18 @@ unsupported type is not explicitly used in a query, it is returned with `null`
values, with the exception of nested fields. Nested fields are not returned at
all.

[discrete]
==== Limitations on supported types

Some <<mapping-types,field types>> are not supported in all contexts:

* Spatial types are not supported in the <<esql-sort,SORT>> processing command.
Specifying a column of one of these types as a sort parameter will result in an error:
** `geo_point`
** `geo_shape`
** `cartesian_point`
** `cartesian_shape`
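
One way to avoid the error is to check sort columns on the client before building a query. The following Python sketch shows such a guard. It is only an illustration: the helper `unsortable_columns` and the `column_types` mapping are hypothetical and are not part of {esql} or any client library.

```python
# Spatial field types that the list above says SORT does not support.
UNSORTABLE_TYPES = {"geo_point", "geo_shape", "cartesian_point", "cartesian_shape"}


def unsortable_columns(sort_columns, column_types):
    """Return the requested sort columns whose type cannot be used with SORT.

    column_types is an assumed mapping of column name -> field type that the
    caller has built from the index mapping.
    """
    return [col for col in sort_columns if column_types.get(col) in UNSORTABLE_TYPES]


# Hypothetical columns, for illustration only.
column_types = {"location": "geo_point", "name": "keyword", "area": "cartesian_shape"}
bad = unsortable_columns(["name", "location"], column_types)
print(bad)
```

Columns missing from `column_types` are treated as sortable here, which is a simplification.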

[discrete]
[[esql-_source-availability]]
=== _source availability
@@ -140,12 +152,6 @@ you query, and query `keyword` sub-fields instead of `text` fields.

{esql} does not support querying time series data streams (TSDS).

[discrete]
[[esql-limitations-ccs]]
=== {ccs-cap} is not supported

{esql} does not support {ccs}.

[discrete]
[[esql-limitations-date-math]]
=== Date math limitations
3 changes: 3 additions & 0 deletions docs/reference/troubleshooting.asciidoc
@@ -56,6 +56,7 @@ fix problems that an {es} deployment might encounter.
* <<watcher-troubleshooting,Troubleshooting Watcher>>
* <<troubleshooting-searches,Troubleshooting searches>>
* <<troubleshooting-shards-capacity-issues,Troubleshooting shards capacity>>
* <<troubleshooting-unbalanced-cluster,Troubleshooting an unbalanced cluster>>
* <<remote-clusters-troubleshooting,Troubleshooting remote clusters>>

[discrete]
@@ -135,3 +136,5 @@ include::watcher/troubleshooting.asciidoc[]
include::troubleshooting/troubleshooting-searches.asciidoc[]

include::troubleshooting/troubleshooting-shards-capacity.asciidoc[]

include::troubleshooting/troubleshooting-unbalanced-cluster.asciidoc[]
Original file line number Diff line number Diff line change
@@ -0,0 +1,74 @@
[[troubleshooting-unbalanced-cluster]]
== Troubleshooting an unbalanced cluster

Elasticsearch balances shards across data tiers to achieve a good compromise between:

* shard count
* disk usage
* write load (for indices in data streams)

Elasticsearch does not take into account the amount or complexity of search queries when rebalancing shards.
Search load is balanced only indirectly, by balancing shard count and disk usage.

There is no guarantee that each individual component will be evenly spread across the nodes.
For example, some nodes might have fewer shards, or use less disk space,
but be assigned shards with higher write loads.

Use the <<cat-allocation,cat allocation command>> to list workloads per node:

[source,console]
--------------------------------------------------
GET /_cat/allocation?v
--------------------------------------------------
// TEST[s/^/PUT test\n{"settings": {"number_of_replicas": 0}}\n/]

The API returns the following response:

[source,text]
--------------------------------------------------
shards shards.undesired write_load.forecast disk.indices.forecast disk.indices disk.used disk.avail disk.total disk.percent host ip node node.role
1 0 0.0 260b 260b 47.3gb 43.4gb 100.7gb 46 127.0.0.1 127.0.0.1 CSUXak2 himrst
--------------------------------------------------
// TESTRESPONSE[s/\d+(\.\d+)?[tgmk]?b/\\d+(\\.\\d+)?[tgmk]?b/ s/46/\\d+/]
// TESTRESPONSE[s/CSUXak2 himrst/.+/ non_json]
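
To work with this output in a script, the text table can be split into one record per node. The following Python sketch illustrates the idea. The function name `parse_cat_allocation` is hypothetical, and the sketch assumes whitespace-separated columns with no empty cells, which holds for the sample above but might not hold for every row a cluster returns.

```python
# Sample copied from the response shown above.
SAMPLE = """\
shards shards.undesired write_load.forecast disk.indices.forecast disk.indices disk.used disk.avail disk.total disk.percent host ip node node.role
1 0 0.0 260b 260b 47.3gb 43.4gb 100.7gb 46 127.0.0.1 127.0.0.1 CSUXak2 himrst
"""


def parse_cat_allocation(text):
    """Parse whitespace-separated cat allocation output into a list of dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        cells = line.split()
        # Refuse rows with missing cells rather than misaligning columns.
        if len(cells) != len(header):
            raise ValueError(f"expected {len(header)} columns, got {len(cells)}: {line!r}")
        rows.append(dict(zip(header, cells)))
    return rows


rows = parse_cat_allocation(SAMPLE)
print(rows[0]["node"], rows[0]["shards.undesired"])
```

All values stay strings in this sketch. Converting sizes such as `47.3gb` to numbers is left out on purpose.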

This response contains the following information that influences balancing:

* `shards` is the current number of shards allocated to the node
* `shards.undesired` is the number of shards that need to be moved to other nodes to finish balancing
* `disk.indices.forecast` is the expected disk usage according to projected shard growth
* `write_load.forecast` is the projected total write load associated with this node

A cluster is considered balanced when all shards are in their desired locations,
which means that no further shard movements are planned (all `shards.undesired` values are equal to 0).
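
The balance condition above can be expressed as a small check over per-node records. The following Python sketch is an illustration only. The helper names and the sample node names are hypothetical, and it assumes each record carries a numeric `shards.undesired` value as a string.

```python
def pending_moves(rows):
    """Map node name -> number of shards that still need to move off that node."""
    pending = {}
    for row in rows:
        undesired = int(row["shards.undesired"])
        if undesired > 0:
            pending[row["node"]] = undesired
    return pending


def is_balanced(rows):
    """A cluster counts as balanced when every shards.undesired value is 0."""
    return not pending_moves(rows)


# Hypothetical records, shaped like rows of the cat allocation output.
rows = [
    {"node": "node-a", "shards.undesired": "0"},
    {"node": "node-b", "shards.undesired": "3"},
]
print(is_balanced(rows), pending_moves(rows))
```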

Some operations, such as restarting or decommissioning nodes or changing cluster allocation settings,
are disruptive and might require multiple shards to move in order to rebalance the cluster.

The order of shard movements is not deterministic and is mostly determined by how ready the source and target nodes are to move a shard.
While rebalancing is in progress, some nodes might appear busier than others.

When a shard is allocated to an undesired node, it uses the resources of its current node instead of the target node.
This might cause a hotspot (disk or CPU) when the current node holds multiple shards that have not yet been
moved to their corresponding targets.

If a cluster takes a long time to finish rebalancing, you might find the following log entries:
[source,text]
--------------------------------------------------
[WARN][o.e.c.r.a.a.DesiredBalanceReconciler] [10%] of assigned shards (10/100) are not on their desired nodes, which exceeds the warn threshold of [10%]
--------------------------------------------------
This is not concerning as long as the number of such shards is decreasing and the warning appears only occasionally,
for example after rolling restarts or changing allocation settings.
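
To judge whether the number of such shards is decreasing, the counts can be extracted from successive warnings. The following Python sketch illustrates this. The regular expression is an assumption based only on the sample log line above, the helper names are hypothetical, and the first two entries in `LOG` are synthetic variations of that sample rather than real log output.

```python
import re

# Pattern inferred from the single sample line above (an assumption).
PATTERN = re.compile(
    r"\[(?P<percent>\d+(?:\.\d+)?)%\] of assigned shards "
    r"\((?P<undesired>\d+)/(?P<total>\d+)\) are not on their desired nodes"
)


def undesired_counts(log_lines):
    """Extract the undesired shard count from each matching warning, in order."""
    counts = []
    for line in log_lines:
        match = PATTERN.search(line)
        if match:
            counts.append(int(match.group("undesired")))
    return counts


def is_converging(counts):
    """True when the count never increases and ends lower than it started."""
    if len(counts) < 2:
        return False
    never_increases = all(b <= a for a, b in zip(counts, counts[1:]))
    return never_increases and counts[-1] < counts[0]


PREFIX = "[WARN][o.e.c.r.a.a.DesiredBalanceReconciler] "
SUFFIX = " are not on their desired nodes, which exceeds the warn threshold of [10%]"
LOG = [
    PREFIX + "[30%] of assigned shards (30/100)" + SUFFIX,  # synthetic
    PREFIX + "[20%] of assigned shards (20/100)" + SUFFIX,  # synthetic
    PREFIX + "[10%] of assigned shards (10/100)" + SUFFIX,  # sample from above
]
counts = undesired_counts(LOG)
print(counts, is_converging(counts))
```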

If the cluster logs this warning repeatedly for an extended period of time (multiple hours),
it is possible that the desired balance is diverging too far from the current state.

If so, increase the <<shards-rebalancing-heuristics,`cluster.routing.allocation.balance.threshold`>>
to reduce the sensitivity of the algorithm that tries to even out the shard count and disk usage within the cluster.

Then reset the desired balance using the following API call:

[source,console,id=delete-desired-balance-request-example]
--------------------------------------------------
DELETE /_internal/desired_balance
--------------------------------------------------
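
The two remediation steps can be scripted in order. The following Python sketch only builds the requests and does not send them. The helper `rebalance_requests` is hypothetical, the value `2.0` is an arbitrary illustration and not a recommendation, and applying the setting as `persistent` through the cluster update settings API is an assumption about how you want to change it.

```python
import json


def rebalance_requests(threshold):
    """Build (method, path, body) tuples for the two steps, in order."""
    settings_body = {
        "persistent": {
            "cluster.routing.allocation.balance.threshold": threshold,
        }
    }
    return [
        # Step 1: raise the balance threshold.
        ("PUT", "/_cluster/settings", json.dumps(settings_body)),
        # Step 2: reset the desired balance.
        ("DELETE", "/_internal/desired_balance", None),
    ]


requests_to_send = rebalance_requests(2.0)  # illustrative value only
for method, path, body in requests_to_send:
    print(method, path, body or "")
```

Keeping the requests as plain data makes it easy to review them before sending them with the HTTP client of your choice.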