From d5597c9ba6e5e90e9b1c850a7398eeeb1bef71f6 Mon Sep 17 00:00:00 2001
From: James Rodewig
Date: Thu, 5 Sep 2019 08:33:17 -0400
Subject: [PATCH] [DOCS] Reformat index segments API docs (#46345)

---
 docs/reference/cat/segments.asciidoc          |  42 ++--
 docs/reference/indices/segments.asciidoc      | 186 ++++++++++++------
 docs/reference/rest-api/common-parms.asciidoc |  73 ++++++-
 3 files changed, 208 insertions(+), 93 deletions(-)

diff --git a/docs/reference/cat/segments.asciidoc b/docs/reference/cat/segments.asciidoc
index e67f48440ab80..59fefaa309b97 100644
--- a/docs/reference/cat/segments.asciidoc
+++ b/docs/reference/cat/segments.asciidoc
@@ -49,46 +49,40 @@ Valid columns are:
 (Default) IP address of the segment's shard, such as `127.0.1.1`.

 `segment`::
-(Default) Name of the segment, such as `_0`. The segment name is derived from
-the segment generation and used internally to create file names in the directory
-of the shard.
+(Default)
+include::{docdir}/rest-api/common-parms.asciidoc[tag=segment]

 `generation`::
-(Default) Generation number, such as `0`. {es} increments this generation number
-for each segment written. {es} then uses this number to derive the segment name.
+(Default)
+include::{docdir}/rest-api/common-parms.asciidoc[tag=generation]

 `docs.count`::
-(Default) Number of non-deleted documents in the segment, such as `25`. This
-number is based on Lucene documents and may include documents from
-<> fields.
+(Default)
+include::{docdir}/rest-api/common-parms.asciidoc[tag=docs-count]

 `docs.deleted`::
-(Default) Number of deleted documents in the segment, such as `0`. This number
-is based on Lucene documents. {es} reclaims the disk space of deleted Lucene
-documents when a segment is merged.
+(Default)
+include::{docdir}/rest-api/common-parms.asciidoc[tag=docs-deleted]

 `size`::
-(Default) Disk space used by the segment, such as `50kb`.
+(Default)
+include::{docdir}/rest-api/common-parms.asciidoc[tag=segment-size]

 `size.memory`::
-(Default) Bytes of segment data stored in memory for efficient search, such as
-`1264`.
+(Default)
+include::{docdir}/rest-api/common-parms.asciidoc[tag=memory]

 `committed`::
-(Default) If `true`, the segment is committed to disk. Segments committed to
-disk would survive a hard reboot.
-+
-If `false`, the data from uncommitted segments is also stored in the transaction
-log. {es} replays those changes on the next start.
+(Default)
+include::{docdir}/rest-api/common-parms.asciidoc[tag=committed]

 `searchable`::
-(Default) If `true`, the segment is searchable.
-+
-If `false`, likely means the segment is written to disk but has not been
-<>.
+(Default)
+include::{docdir}/rest-api/common-parms.asciidoc[tag=segment-search]

 `version`::
-(Default) Version of Lucene used to write the segment.
+(Default)
+include::{docdir}/rest-api/common-parms.asciidoc[tag=segment-version]

 `compound`::
 (Default) If `true`, the segment is stored in a compound file. This means Lucene
diff --git a/docs/reference/indices/segments.asciidoc b/docs/reference/indices/segments.asciidoc
index bc204a0a4a577..5500fba2d9f40 100644
--- a/docs/reference/indices/segments.asciidoc
+++ b/docs/reference/indices/segments.asciidoc
@@ -1,41 +1,136 @@
 [[indices-segments]]
-=== Indices Segments
+=== Index segments API
+++++
+Index segments
+++++

-Provide low level segments information that a Lucene index (shard level)
-is built with. Allows to be used to provide more information on the
-state of a shard and an index, possibly optimization information, data
-"wasted" on deletes, and so on.
+Returns low-level information about the https://lucene.apache.org/core/[Lucene]
+segments in index shards.

-Endpoints include segments for a specific index:
+[source,console]
+----
+GET /twitter/_segments
+----
+// TEST[setup:twitter]

-[source,js]
+
+[[index-segments-api-request]]
+==== {api-request-title}
+
+`GET //_segments`
+
+`GET /_segments`
+
+`GET /_cat/segments/`
+
+
+[[index-segments-api-path-params]]
+==== {api-path-parms-title}
+
+include::{docdir}/rest-api/common-parms.asciidoc[tag=index]
+
+
+[[index-segments-api-query-params]]
+==== {api-query-parms-title}
+
+include::{docdir}/rest-api/common-parms.asciidoc[tag=allow-no-indices]
+
+include::{docdir}/rest-api/common-parms.asciidoc[tag=expand-wildcards]
++
+Defaults to `open`.
+
+include::{docdir}/rest-api/common-parms.asciidoc[tag=index-ignore-unavailable]
+
+`verbose`::
+experimental:[]
+(Optional, boolean)
+If `true`, the response includes detailed information
+about Lucene's memory usage.
+Defaults to `false`.
+
+
+[[index-segments-api-response-body]]
+==== {api-response-body-title}
+
+``::
+(String)
+include::{docdir}/rest-api/common-parms.asciidoc[tag=segment]
+
+`generation`::
+(Integer)
+include::{docdir}/rest-api/common-parms.asciidoc[tag=generation]
+
+`num_docs`::
+(Integer)
+include::{docdir}/rest-api/common-parms.asciidoc[tag=docs-count]
+
+`deleted_docs`::
+(Integer)
+include::{docdir}/rest-api/common-parms.asciidoc[tag=docs-deleted]
+
+`size_in_bytes`::
+(Integer)
+include::{docdir}/rest-api/common-parms.asciidoc[tag=segment-size]
+
+`memory_in_bytes`::
+(Integer)
+include::{docdir}/rest-api/common-parms.asciidoc[tag=memory]
+
+`committed`::
+(Boolean)
+include::{docdir}/rest-api/common-parms.asciidoc[tag=committed]
+
+`search`::
+(Boolean)
+include::{docdir}/rest-api/common-parms.asciidoc[tag=segment-search]
+
+`version`::
+(String)
+include::{docdir}/rest-api/common-parms.asciidoc[tag=segment-version]
+
+`compound`::
+(Boolean)
+If `true`, Lucene merged all files from the segment
+into a single file to save file descriptors.
+
+`attributes`::
+(Object)
+Contains information about whether high compression was enabled.
+
+
+[[index-segments-api-example]]
+==== {api-examples-title}
+
+
+===== Get segment information for a specific index
+
+[source,console]
 --------------------------------------------------
 GET /test/_segments
 --------------------------------------------------
-// CONSOLE
 // TEST[s/^/PUT test\n{"settings":{"number_of_shards":1, "number_of_replicas": 0}}\nPOST test\/test\?refresh\n{"test": "test"}\n/]
-// TESTSETUP

-For several indices:

-[source,js]
+
+===== Get segment information for several indices
+
+[source,console]
 --------------------------------------------------
 GET /test1,test2/_segments
 --------------------------------------------------
-// CONSOLE
 // TEST[s/^/PUT test1\nPUT test2\n/]

-Or for all indices:

-[source,js]
+
+===== Get segment information for all indices
+
+[source,console]
 --------------------------------------------------
 GET /_segments
 --------------------------------------------------
-// CONSOLE
+// TEST[s/^/PUT test\n{"settings":{"number_of_shards":1, "number_of_replicas": 0}}\nPOST test\/test\?refresh\n{"test": "test"}\n/]

-Response:
+The API returns the following response:

-[source,js]
+[source,console-response]
 --------------------------------------------------
 {
 "_shards": ...
@@ -79,61 +174,23 @@ Response:
 // TESTRESPONSE[s/: (\-)?[0-9]+/: $body.$_path/]
 // TESTRESPONSE[s/7\.0\.0/$body.$_path/]

-_0:: The key of the JSON document is the name of the segment. This name
- is used to generate file names: all files starting with this
- segment name in the directory of the shard belong to this segment.
-
-generation:: A generation number that is basically incremented when needing to
- write a new segment. The segment name is derived from this
- generation number.
-
-num_docs:: The number of non-deleted documents that are stored in this segment.
-
-deleted_docs:: The number of deleted documents that are stored in this segment.
- It is perfectly fine if this number is greater than 0, space is
- going to be reclaimed when this segment gets merged.
-
-size_in_bytes:: The amount of disk space that this segment uses, in bytes.
-
-memory_in_bytes:: Segments need to store some data into memory in order to be
- searchable efficiently. This number returns the number of bytes
- that are used for that purpose. A value of -1 indicates that
- Elasticsearch was not able to compute this number.
-
-committed:: Whether the segment has been sync'ed on disk. Segments that are
- committed would survive a hard reboot. No need to worry in case
- of false, the data from uncommitted segments is also stored in
- the transaction log so that Elasticsearch is able to replay
- changes on the next start.
-
-search:: Whether the segment is searchable. A value of false would most
- likely mean that the segment has been written to disk but no
- refresh occurred since then to make it searchable.
-
-version:: The version of Lucene that has been used to write this segment.
-
-compound:: Whether the segment is stored in a compound file. When true, this
- means that Lucene merged all files from the segment in a single
- one in order to save file descriptors.
-
-attributes:: Contains information about whether high compression was enabled

-[float]
-==== Verbose mode
+===== Verbose mode

-To add additional information that can be used for debugging, use the `verbose` flag.
+To add additional information that can be used for debugging,
+use the `verbose` flag.

-NOTE: The format of the additional detail information is labelled as experimental in Lucene and it may change in the future.
+experimental::[]

-[source,js]
+[source,console]
 --------------------------------------------------
 GET /test/_segments?verbose=true
 --------------------------------------------------
-// CONSOLE
+// TEST[continued]

-Response:
+The API returns the following response:

-[source,js]
+[source,console-response]
 --------------------------------------------------
 {
 ...
@@ -159,5 +216,4 @@ Response:
 ...
 }
 --------------------------------------------------
-// NOTCONSOLE
-//Response is too verbose to be fully shown in documentation, so we just show the relevant bit and don't test the response.
+// TESTRESPONSE[skip:Response is too verbose to be fully shown in documentation, so we just show the relevant bit and don't test the response.]
diff --git a/docs/reference/rest-api/common-parms.asciidoc b/docs/reference/rest-api/common-parms.asciidoc
index 4cfd734df64f8..8b3e57068a38c 100644
--- a/docs/reference/rest-api/common-parms.asciidoc
+++ b/docs/reference/rest-api/common-parms.asciidoc
@@ -42,6 +42,16 @@ tag::bytes[]
 (Optional, <>) Unit used to display byte values.
 end::bytes[]

+tag::committed[]
+If `true`,
+the segments is synced to disk. Segments that are synced can survive a hard reboot.
++
+If `false`,
+the data from uncommitted segments is also stored in
+the transaction log so that Elasticsearch is able to replay
+changes on the next start.
+end::committed[]
+
 tag::default_operator[]
 `default_operator`::
 (Optional, string) The default operator for query string query: AND or OR.
@@ -54,6 +64,18 @@ tag::df[]
 given in the query string.
 end::df[]

+tag::docs-count[]
+Number of non-deleted documents in the segment, such as `25`. This
+number is based on Lucene documents and may include documents from
+<> fields.
+end::docs-count[]
+
+tag::docs-deleted[]
+Number of deleted documents in the segment, such as `0`. This number
+is based on Lucene documents. {es} reclaims the disk space of deleted Lucene
+documents when a segment is merged.
+end::docs-deleted[]
+
 tag::expand-wildcards[]
 `expand_wildcards`::
 +
@@ -75,12 +97,25 @@ Wildcard expressions are not accepted.
 --
 end::expand-wildcards[]

+tag::index-alias-filter[]
+<>
+used to limit the index alias.
++
+If specified,
+the index alias only applies to documents returned by the filter.
+end::index-alias-filter[]
+
 tag::flat-settings[]
 `flat_settings`::
 (Optional, boolean) If `true`, returns settings in flat format. Defaults to
 `false`.
 end::flat-settings[]

+tag::generation[]
+Generation number, such as `0`. {es} increments this generation number
+for each segment written. {es} then uses this number to derive the segment name.
+end::generation[]
+
 tag::http-format[]
 `format`::
 (Optional, string) Short version of the
@@ -199,6 +234,13 @@ tag::max_docs[]
 documents.
 end::max_docs[]

+tag::memory[]
+Bytes of segment data stored in memory for efficient search,
+such as `1264`.
++
+A value of `-1` indicates {es} was unable to compute this number.
+end::memory[]
+
 tag::name[]
 ``::
 (Optional, string) Comma-separated list of alias names to return.
@@ -242,8 +284,8 @@ end::request_cache[]

 tag::requests_per_second[]
 `requests_per_second`::
- (Optional, integer) The throttle for this request in sub-requests per second.
- -1 means no throttle. Defaults to 0.
+(Optional, integer) The throttle for this request in sub-requests per second.
+-1 means no throttle. Defaults to 0.
 end::requests_per_second[]

 tag::routing[]
@@ -276,6 +318,15 @@ tag::scroll_size[]
 Defaults to 100.
 end::scroll_size[]

+tag::segment-search[]
+If `true`,
+the segment is searchable.
++
+If `false`,
+the segment has most likely been written to disk
+but needs a <> to be searchable.
+end::segment-search[]
+
 tag::search_timeout[]
 `search_timeout`::
 (Optional, <> Explicit timeout for each search
@@ -289,12 +340,22 @@ tag::search_type[]
 * `dfs_query_then_fetch`
 end::search_type[]

+tag::segment[]
+Name of the segment, such as `_0`. The segment name is derived from
+the segment generation and used internally to create file names in the directory
+of the shard.
+end::segment[]
+
 tag::settings[]
 `settings`::
 (Optional, <>) Configuration options for the index. See
 <>.
 end::settings[]

+tag::segment-size[]
+Disk space used by the segment, such as `50kb`.
+end::segment-size[]
+
 tag::slices[]
 `slices`::
 (Optional, integer) The number of slices this task should be divided into.
@@ -326,8 +387,8 @@ end::source_includes[]

 tag::stats[]
 `stats`::
- (Optional, string) Specific `tag` of the request for logging and statistical
- purposes.
+(Optional, string) Specific `tag` of the request for logging and statistical
+purposes.
 end::stats[]

 tag::terminate_after[]
@@ -372,6 +433,10 @@ The specified version must match the current version of the document for the
 request to succeed.
 end::doc-version[]

+tag::segment-version[]
+Version of Lucene used to write the segment.
+end::segment-version[]
+
 tag::version_type[]
 `version_type`::
 (Optional, enum) Specific version type: `internal`, `external`,