Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ported over some previous commits #3

Closed
wants to merge 2 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
70 changes: 56 additions & 14 deletions docs/alerting/monitors.md
Original file line number Diff line number Diff line change
Expand Up @@ -20,8 +20,8 @@ has_children: false
Term | Definition
:--- | :---
Monitor | A job that runs on a defined schedule and queries OpenSearch. The results of these queries are then used as input for one or more *triggers*.
Trigger | Conditions that, if met, generate *alerts* and can perform some *action*.
Alert | A notification that a monitor's trigger condition has been met.
Trigger | Conditions that, if met, generate *alerts*.
Alert | An event associated with a trigger. When an alert is created, the trigger performs *actions*, which can include sending a notification.
Action | The information that you want the monitor to send out after being triggered. Actions have a *destination*, a message subject, and a message body.
Destination | A reusable location for an action, such as Amazon Chime, Slack, or a webhook URL.

Expand Down Expand Up @@ -256,18 +256,60 @@ if (score > 99) {
}
```


#### Available variables

Variable | Description
:--- | :---
`ctx.results` | An array with one element (i.e. `ctx.results[0]`). Contains the query results. This variable is empty if the trigger was unable to retrieve results. See `ctx.error`.
`ctx.monitor` | Includes `ctx.monitor.name`, `ctx.monitor.type`, `ctx.monitor.enabled`, `ctx.monitor.enabled_time`, `ctx.monitor.schedule`, `ctx.monitor.inputs`, `triggers` and `ctx.monitor.last_update_time`.
`ctx.trigger` | Includes `ctx.trigger.name`, `ctx.trigger.severity`, `ctx.trigger.condition`, and `ctx.trigger.actions`.
`ctx.periodStart` | Unix timestamp for the beginning of the period during which the alert triggered. For example, if a monitor runs every ten minutes, a period might begin at 10:40 and end at 10:50.
`ctx.periodEnd` | The end of the period during which the alert triggered.
`ctx.error` | The error message if the trigger was unable to retrieve results or unable to evaluate the trigger, typically due to a compile error or null pointer exception. Null otherwise.
`ctx.alert` | The current, active alert (if it exists). Includes `ctx.alert.id`, `ctx.alert.version`, and `ctx.alert.isAcknowledged`. Null if no alert is active.
Below are some variables you can include in your message using Mustache templates to see more information about your monitors.

### Available variables

#### Monitor variables

Variable | Data Type | Description
:--- | :--- | :---
`ctx.monitor` | JSON | Includes `ctx.monitor.name`, `ctx.monitor.type`, `ctx.monitor.enabled`, `ctx.monitor.enabled_time`, `ctx.monitor.schedule`, `ctx.monitor.inputs`, `triggers` and `ctx.monitor.last_update_time`.
`ctx.monitor.user` | JSON | Includes information about the user who created the monitor. Includes `ctx.monitor.user.backend_roles` and `ctx.monitor.user.roles`, which are arrays that contain the backend roles and roles assigned to the user. See [alerting security](../security.md/) for more information.
`ctx.monitor.enabled` | Boolean | Whether the monitor is enabled.
`ctx.monitor.enabled_time` | Milliseconds | Unix epoch time of when the monitor was last enabled.
`ctx.monitor.schedule` | JSON | Contains a schedule of how often or when the monitor should run.
`ctx.monitor.schedule.period.interval` | Integer | The interval at which the monitor runs.
`ctx.monitor.schedule.period.unit` | String | The interval's unit of time.
`ctx.monitor.inputs` | Array | An array that contains the indices and definition used to create the monitor.
`ctx.monitor.inputs.search.indices` | Array | An array that contains the indices the monitor observes.
`ctx.monitor.inputs.search.query` | N/A | The definition used to define the monitor.

#### Trigger variables

Variable | Data Type | Description
:--- | :--- | : ---
`ctx.trigger.id` | String | The trigger's ID.
`ctx.trigger.name` | String | The trigger's name.
`ctx.trigger.severity` | String | The trigger's severity.
`ctx.trigger.condition`| JSON | Contains the Painless script used when creating the monitor.
`ctx.trigger.condition.script.source` | String | The language used to define the script. Must be painless.
`ctx.trigger.condition.script.lang` | String | The script used to define the trigger.
`ctx.trigger.actions`| Array | An array with one element that contains information about the action the monitor needs to trigger.

#### Action variables

Variable | Data Type | Description
:--- | :--- | : ---
`ctx.trigger.actions.id` | String | The action's ID.
`ctx.trigger.actions.name` | String | The action's name.
`ctx.trigger.actions.destination_id`| String | The alert destination's ID.
`ctx.trigger.actions.message_template.source` | String | The message to send in the alert.
`ctx.trigger.actions.message_template.lang` | String | The scripting language used to define the message. Must be Mustache.
`ctx.trigger.actions.throttle_enabled` | Boolean | Whether throttling is enabled for this trigger. See [adding actions](#add-actions/) for more information about throttling.
`ctx.trigger.actions.subject_template.source` | String | The message's subject in the alert.
`ctx.trigger.actions.subject_template.lang` | String | The scripting language used to define the subject. Must be mustache.

#### Other variables

Variable | Data Type | Description
:--- | :--- : :---
`ctx.results` | Array | An array with one element (i.e. `ctx.results[0]`). Contains the query results. This variable is empty if the trigger was unable to retrieve results. See `ctx.error`.
`ctx.last_update_time` | Milliseconds | Unix epoch time of when the monitor was last updated.
`ctx.periodStart` | String | Unix timestamp for the beginning of the period during which the alert triggered. For example, if a monitor runs every ten minutes, a period might begin at 10:40 and end at 10:50.
`ctx.periodEnd` | String | The end of the period during which the alert triggered.
`ctx.error` | String | The error message if the trigger was unable to retrieve results or unable to evaluate the trigger, typically due to a compile error or null pointer exception. Null otherwise.
`ctx.alert` | JSON | The current, active alert (if it exists). Includes `ctx.alert.id`, `ctx.alert.version`, and `ctx.alert.isAcknowledged`. Null if no alert is active.


---
Expand Down
108 changes: 105 additions & 3 deletions docs/knn/knn-score-script.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@ has_math: true
# Exact k-NN with Scoring Script
The k-NN plugin implements the OpenSearch score script plugin that you can use to find the exact k-nearest neighbors to a given query point. Using the k-NN score script, you can apply a filter on an index before executing the nearest neighbor search. This is useful for dynamic search cases where the index body may vary based on other conditions. Because this approach executes a brute force search, it does not scale as well as the [Approximate approach](../approximate-knn). In some cases, it may be better to think about refactoring your workflow or index structure to use the Approximate approach instead of this approach.

## Getting started with the score script
## Getting started with the score script for vectors

Similar to approximate nearest neighbor search, in order to use the score script on a body of vectors, you must first create an index with one or more `knn_vector` fields. If you intend to just use the script score approach (and not the approximate approach) `index.knn` can be set to `false` and `index.knn.space_type` does not need to be set. The space type can be chosen during search. See the [spaces section](#spaces) to see what spaces the k-NN score script suppports. Here is an example that creates an index with two `knn_vector` fields:

Expand All @@ -32,8 +32,6 @@ PUT my-knn-index-1
}
```

*Note* -- For binary spaces, such as the Hamming bit space, `type` needs to be either `binary` or `long`. The binary data then needs to be encoded either as a base64 string or as a long (if the data is 64 bits or less).

If you *only* want to use the score script, you can omit `"index.knn": true`. The benefit of this approach is faster indexing speed and lower memory usage, but you lose the ability to perform standard k-NN queries on the index.
{: .tip}

Expand Down Expand Up @@ -172,6 +170,110 @@ GET my-knn-index-2/_search
}
```

## Getting started with the score script for binary data
The k-NN score script also allows you to run k-NN search on your binary data with the Hamming distance space.
In order to use Hamming distance, the field of interest must have either a `binary` or `long` field type. If you're using `binary` type, the data must be a base64-encoded string.

This example shows how to use the Hamming distance space with a `binary` field type:

```json
PUT my-index
{
"mappings": {
"properties": {
"my_binary": {
"type": "binary",
"doc_values": true
},
"color": {
"type": "keyword"
}
}
}
}
```

Then add some documents:

```json
POST _bulk
{ "index": { "_index": "my-index", "_id": "1" } }
{ "my_binary": "SGVsbG8gV29ybGQh", "color" : "RED" }
{ "index": { "_index": "my-index", "_id": "2" } }
{ "my_binary": "ay1OTiBjdXN0b20gc2NvcmluZyE=", "color" : "RED" }
{ "index": { "_index": "my-index", "_id": "3" } }
{ "my_binary": "V2VsY29tZSB0byBrLU5O", "color" : "RED" }
{ "index": { "_index": "my-index", "_id": "4" } }
{ "my_binary": "SSBob3BlIHRoaXMgaXMgaGVscGZ1bA==", "color" : "BLUE" }
{ "index": { "_index": "my-index", "_id": "5" } }
{ "my_binary": "QSBjb3VwbGUgbW9yZSBkb2NzLi4u", "color" : "BLUE" }
{ "index": { "_index": "my-index", "_id": "6" } }
{ "my_binary": "TGFzdCBvbmUh", "color" : "BLUE" }

```

Finally, use the `script_score` query to pre-filter your documents before identifying nearest neighbors:

```json
GET my-index/_search
{
"size": 2,
"query": {
"script_score": {
"query": {
"bool": {
"filter": {
"term": {
"color": "BLUE"
}
}
}
},
"script": {
"lang": "knn",
"source": "knn_score",
"params": {
"field": "my_binary",
"query_value": "U29tZXRoaW5nIEltIGxvb2tpbmcgZm9y",
"space_type": "hammingbit"
}
}
}
}
}
```

Similarly, you can encode your data with the `long` field and run a search:

```json
GET my-long-index/_search
{
"size": 2,
"query": {
"script_score": {
"query": {
"bool": {
"filter": {
"term": {
"color": "BLUE"
}
}
}
},
"script": {
"lang": "knn",
"source": "knn_score",
"params": {
"field": "my_long",
"query_value": 23,
"space_type": "hammingbit"
}
}
}
}
}
```

## Spaces

A space corresponds to the function used to measure the distance between 2 points in order to determine the k-nearest neighbors. From the k-NN perspective, a lower score equates to a closer and better result. This is the opposite of how OpenSearch scores results, where a greater score equates to a better result. We include the conversions to OpenSearch scores in the table below:
Expand Down