[DOCS] Streamline analyzer mapping parm def (elastic#51874)
Simplifies the `analyzer` mapping parameter definition to remove
duplicated analysis content and examples.
jrodewig committed Mar 18, 2020
1 parent e5aa090 commit 0732475
Showing 1 changed file with 17 additions and 75 deletions.
92 changes: 17 additions & 75 deletions docs/reference/mapping/params/analyzer.asciidoc
@@ -1,81 +1,23 @@
[[analyzer]]
=== `analyzer`

The values of <<text,`text`>> fields are passed through an
<<analysis,analyzer>> to convert the string into a stream of _tokens_ or
_terms_. For instance, the string `"The quick Brown Foxes."` may, depending
on which analyzer is used, be analyzed to the tokens: `quick`, `brown`,
`fox`. These are the actual terms that are indexed for the field, which makes
it possible to search efficiently for individual words _within_ big blobs of
text.

This analysis process needs to happen not just at index time, but also at
query time: the query string needs to be passed through the same (or a
similar) analyzer so that the terms that it tries to find are in the same
format as those that exist in the index.

Elasticsearch ships with a number of <<analysis-analyzers,pre-defined analyzers>>,
which can be used without further configuration. It also ships with many
<<analysis-charfilters,character filters>>, <<analysis-tokenizers,tokenizers>>,
and <<analysis-tokenfilters>> which can be combined to configure
custom analyzers per index.

Analyzers can be specified per-query, per-field or per-index. At index time,
Elasticsearch will look for an analyzer in this order:

* The `analyzer` defined in the field mapping.
* An analyzer named `default` in the index settings.
* The <<analysis-standard-analyzer,`standard`>> analyzer.

At query time, there are a few more layers:

* The `analyzer` defined in a <<full-text-queries,full-text query>>.
* The `search_analyzer` defined in the field mapping.
* The `analyzer` defined in the field mapping.
* An analyzer named `default_search` in the index settings.
* An analyzer named `default` in the index settings.
* The <<analysis-standard-analyzer,`standard`>> analyzer.
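
As a minimal sketch of the index-level fallbacks in the lists above, the following request defines analyzers named `default` and `default_search` in the index settings. The index name `my_fallback_index` and the choice of the `simple` and `whitespace` analyzers are illustrative assumptions, not recommendations.

[source,console]
--------------------------------------------------
# Hypothetical index name and analyzer choices, for illustration only
PUT /my_fallback_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "default": {
          "type": "simple"
        },
        "default_search": {
          "type": "whitespace"
        }
      }
    }
  }
}
--------------------------------------------------

With settings like these, a `text` field whose mapping sets neither `analyzer` nor `search_analyzer` would fall back to `default` at index time and to `default_search` at query time, unless the query specifies its own analyzer.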

The easiest way to specify an analyzer for a particular field is to define it
in the field mapping, as follows:

[source,console]
--------------------------------------------------
PUT /my_index
{
  "mappings": {
    "properties": {
      "text": { <1>
        "type": "text",
        "fields": {
          "english": { <2>
            "type": "text",
            "analyzer": "english"
          }
        }
      }
    }
  }
}

GET my_index/_analyze <3>
{
  "field": "text",
  "text": "The quick Brown Foxes."
}

GET my_index/_analyze <4>
{
  "field": "text.english",
  "text": "The quick Brown Foxes."
}
--------------------------------------------------

<1> The `text` field uses the default `standard` analyzer.
<2> The `text.english` <<multi-fields,multi-field>> uses the `english` analyzer, which removes stop words and applies stemming.
<3> This returns the tokens: [ `the`, `quick`, `brown`, `foxes` ].
<4> This returns the tokens: [ `quick`, `brown`, `fox` ].

[IMPORTANT]
====
Only <<text,`text`>> fields support the `analyzer` mapping parameter.
====

The `analyzer` parameter specifies the <<analyzer-anatomy,analyzer>> used for
<<analysis,text analysis>> when indexing or searching a `text` field.

Unless overridden with the <<search-analyzer,`search_analyzer`>> mapping
parameter, this analyzer is used for both <<analysis-index-search-time,index and
search analysis>>. See <<specify-analyzer>>.
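
For example, a mapping along the following lines sets both parameters on one field. This is only a sketch of the syntax: the index name `my_search_index`, the field name `title`, and the pairing of the `whitespace` and `simple` analyzers are assumptions chosen for illustration.

[source,console]
--------------------------------------------------
# Hypothetical index, field, and analyzer choices, for illustration only
PUT /my_search_index
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "whitespace",
        "search_analyzer": "simple"
      }
    }
  }
}
--------------------------------------------------

Here the `title` field would be analyzed with `whitespace` at index time and with `simple` at search time. Without the `search_analyzer` line, `whitespace` would be used for both.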

[TIP]
====
We recommend testing analyzers before using them in production. See
<<test-analyzer>>.
====

[[search-quote-analyzer]]
==== `search_quote_analyzer`