Commit

Doc review
Signed-off-by: Fanit Kolchina <[email protected]>
kolchfa-aws committed Dec 5, 2024
1 parent 3b96fb3 commit 81e942e
Showing 1 changed file with 4 additions and 4 deletions.
8 changes: 4 additions & 4 deletions _analyzers/tokenizers/uax-url-email.md
@@ -7,11 +7,11 @@ nav_order: 150

# UAX URL email tokenizer

-In addition to normal text, the `uax_url_email` tokenizer is designed to handle URLs, email addresses, and domain names. It is based on the Unicode Text Segmentation algorithm ([UAX #29](https://www.unicode.org/reports/tr29/)), which allows it to tokenize complex text correctly, including URLs and email addresses.
+In addition to regular text, the `uax_url_email` tokenizer is designed to handle URLs, email addresses, and domain names. It is based on the Unicode Text Segmentation algorithm ([UAX #29](https://www.unicode.org/reports/tr29/)), which allows it to correctly tokenize complex text, including URLs and email addresses.

Check failure on line 10 in _analyzers/tokenizers/uax-url-email.md

GitHub Actions / style-job

[vale] reported by reviewdog 🐶 [OpenSearch.SpacingWords] There should be one space between words in 'to correctly'. Raw Output: {"message": "[OpenSearch.SpacingWords] There should be one space between words in 'to correctly'.", "location": {"path": "_analyzers/tokenizers/uax-url-email.md", "range": {"start": {"line": 10, "column": 246}}}, "severity": "ERROR"}

## Example usage

-The following example request creates a new index named `my_index` and configures an analyzer with `uax_url_email` tokenizer:
+The following example request creates a new index named `my_index` and configures an analyzer with a `uax_url_email` tokenizer:

```json
PUT /my_index
@@ -74,11 +74,11 @@ The response contains the generated tokens:
}
```

-## Configuration
+## Parameters

The `uax_url_email` tokenizer can be configured with the following parameter.

Parameter | Required/Optional | Data type | Description
:--- | :--- | :--- | :---
-`max_token_length` | Optional | Integer | Sets the maximum length of the produced token. If this length is exceeded, the token is split into multiple tokens at length configured in `max_token_length`. Default is `255`.
+`max_token_length` | Optional | Integer | Sets the maximum length of the produced token. If this length is exceeded, the token is split into multiple tokens at the length configured in `max_token_length`. Default is `255`.
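
The splitting rule described for `max_token_length` can be illustrated with a minimal sketch. The following Python snippet is not part of OpenSearch and does not call the tokenizer. The function `split_at_max_length` is a hypothetical helper that only mimics the documented behavior for a single plain-text token, assuming the token is cut into consecutive chunks of at most `max_token_length` characters. How the actual tokenizer treats overlong URLs or email addresses may differ:

```python
def split_at_max_length(token: str, max_token_length: int = 255) -> list[str]:
    """Split `token` into consecutive chunks of at most `max_token_length` characters.

    Hypothetical helper for illustration only; it is not part of OpenSearch
    and only approximates the rule described in the parameter table.
    """
    if max_token_length < 1:
        raise ValueError("max_token_length must be a positive integer")
    return [
        token[start:start + max_token_length]
        for start in range(0, len(token), max_token_length)
    ]


if __name__ == "__main__":
    # A 9-character token with max_token_length=4 yields chunks of 4, 4, and 1 characters.
    print(split_at_max_length("tokenizer", 4))  # ['toke', 'nize', 'r']
    # A token within the limit comes back unchanged as a single chunk.
    print(split_at_max_length("text", 255))  # ['text']
```

With the default of `255`, most ordinary words pass through unchanged. A smaller value may be worth considering only when very long tokens are expected in the indexed text.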
