>.
-[source,text]
----------------------------
-[ I'm, so, happy ]
----------------------------
-
-[float]
-=== Configuration
+[source,console]
+----
+PUT /my_index
+{
+  "settings": {
+    "analysis": {
+      "analyzer": {
+        "my_analyzer": {
+          "tokenizer": "keyword",
+          "char_filter": [
+            "html_strip"
+          ]
+        }
+      }
+    }
+  }
+}
+----
-The `html_strip` character filter accepts the following parameter:
+[[analysis-htmlstrip-charfilter-configure-parms]]
+==== Configurable parameters
-[horizontal]
`escaped_tags`::
+(Optional, array of strings)
+Array of HTML elements without enclosing angle brackets (`< >`). The filter
+skips these HTML elements when stripping HTML from the text. For example, a
+value of `[ "p" ]` skips the `<p>` HTML element.
- An array of HTML tags which should not be stripped from the original text.
+[[analysis-htmlstrip-charfilter-customize]]
+==== Customize
-[float]
-=== Example configuration
+To customize the `html_strip` filter, duplicate it to create the basis
+for a new custom character filter. You can modify the filter using its
+configurable parameters.
-In this example, we configure the `html_strip` character filter to leave `<b>`
-tags in place:
+The following create index API request
+configures a new custom analyzer using a custom
+`html_strip` filter, `my_custom_html_strip_char_filter`.
+
+The `my_custom_html_strip_char_filter` filter skips the removal of the `<b>`
+HTML element.
[source,console]
-----------------------------
+----
PUT my_index
{
"settings": {
@@ -79,49 +111,20 @@ PUT my_index
"analyzer": {
"my_analyzer": {
"tokenizer": "keyword",
- "char_filter": ["my_char_filter"]
+ "char_filter": [
+ "my_custom_html_strip_char_filter"
+ ]
}
},
"char_filter": {
- "my_char_filter": {
+ "my_custom_html_strip_char_filter": {
"type": "html_strip",
- "escaped_tags": ["b"]
+ "escaped_tags": [
+ "b"
+ ]
}
}
}
}
}
-
-POST my_index/_analyze
-{
- "analyzer": "my_analyzer",
- "text": "<p>I&apos;m so <b>happy</b>!</p>"
-}
-----------------------------
-
-/////////////////////
-
-[source,console-result]
-----------------------------
-{
- "tokens": [
- {
- "token": "\nI'm so <b>happy</b>!\n",
- "start_offset": 0,
- "end_offset": 32,
- "type": "word",
- "position": 0
- }
- ]
-}
-----------------------------
-
-/////////////////////
-
-
-The above example produces the following term:
-
-[source,text]
----------------------------
-[ \nI'm so <b>happy</b>!\n ]
----------------------------
+----
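To make the effect of `escaped_tags` concrete, here is a rough Python sketch of the behaviour described above. It is not Elasticsearch's implementation: the `strip_html` function, the `BLOCK_TAGS` set, and the sample input are assumptions made for illustration, and the regular expression only handles simple, well-formed tags. It assumes, as the removed example output suggests, that stripped block-level tags such as `<p>` become newlines and that HTML entities are decoded.

```python
import html
import re

# Hypothetical, deliberately small subset of block-level elements.
# Assumption for this sketch: a stripped block-level tag becomes a newline.
BLOCK_TAGS = {"p", "div", "br", "li", "h1", "h2"}

# Matches simple opening and closing tags and captures the element name.
TAG_RE = re.compile(r"</?([A-Za-z][A-Za-z0-9]*)[^>]*>")


def strip_html(text, escaped_tags=()):
    """Approximate html_strip: drop tags, except those in escaped_tags."""
    keep = {name.lower() for name in escaped_tags}

    def replace(match):
        name = match.group(1).lower()
        if name in keep:
            # Escaped elements are left in the text untouched.
            return match.group(0)
        return "\n" if name in BLOCK_TAGS else ""

    # Decode entities such as &apos; after the tags have been handled.
    return html.unescape(TAG_RE.sub(replace, text))


sample = "<p>I&apos;m so <b>happy</b>!</p>"
without_escapes = strip_html(sample)
with_b_escaped = strip_html(sample, escaped_tags=["b"])
```

With an empty `escaped_tags`, every tag in the sample is removed; with `["b"]`, the `<b>` element survives while `<p>` is still stripped. The real filter handles far more HTML than this sketch, so treat it only as a mental model for the parameter.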