Add PPL and SQL section (#1111)
* Merge pull request #1 from Yury-Fridlyand/dev-update-sql-relevance-docs

Update SQL plugin relevance functions documentation.

Co-authored-by: MaxKsyunz <[email protected]>
Signed-off-by: Yury Fridlyand <[email protected]>

* Address PR feedback.

Signed-off-by: Yury Fridlyand <[email protected]>

* Address PR feedback by @joshuali925.

Signed-off-by: Yury Fridlyand <[email protected]>

* Remove PPL page from Observability Plugin. Add link to Observability page. Make some simple formatting changes

Signed-off-by: Naarcha-AWS <[email protected]>

* Reword paragraph

Signed-off-by: Naarcha-AWS <[email protected]>

* Adds SQL and PPL API and other SQL plugin changes

Signed-off-by: Fanit Kolchina <[email protected]>

* Formatting changes

Signed-off-by: Fanit Kolchina <[email protected]>

* Incorporates editorial comments

Signed-off-by: Fanit Kolchina <[email protected]>

Signed-off-by: Yury Fridlyand <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>
Signed-off-by: Fanit Kolchina <[email protected]>
Co-authored-by: Yury Fridlyand <[email protected]>
Co-authored-by: MaxKsyunz <[email protected]>
Co-authored-by: Fanit Kolchina <[email protected]>
(cherry picked from commit c69f860)
Naarcha-AWS authored and github-actions[bot] committed Sep 26, 2022
1 parent de6a8ed commit 644cbb5
Showing 41 changed files with 2,097 additions and 1,357 deletions.
2 changes: 1 addition & 1 deletion _ml-commons-plugin/index.md
@@ -10,7 +10,7 @@ has_toc: false

ML Commons for OpenSearch eases the development of machine learning features by providing a set of common machine learning (ML) algorithms through transport and REST API calls. Those calls choose the right nodes and resources for each ML request and monitor ML tasks to ensure uptime. This allows you to leverage existing open-source ML algorithms and reduce the effort required to develop new ML features.

Interaction with the ML Commons plugin occurs through either the [REST API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api) or [AD]({{site.url}}{{site.baseurl}}/observability-plugin/ppl/commands#ad) and [kmeans]({{site.url}}{{site.baseurl}}/observability-plugin/ppl/commands#kmeans) Piped Processing Language (PPL) commands.
Interaction with the ML Commons plugin occurs through either the [REST API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api) or [`ad`]({{site.url}}{{site.baseurl}}/search-plugins/sql/ppl/functions#ad) and [`kmeans`]({{site.url}}{{site.baseurl}}/search-plugins/sql/ppl/functions#kmeans) Piped Processing Language (PPL) commands.

Models [trained]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api#train-model) through the ML Commons plugin support model-based algorithms such as kmeans. After you've trained a model until it meets your precision requirements, you can safely apply it to [predict]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api#predict) new data.

4 changes: 2 additions & 2 deletions _observability-plugin/app-analytics.md
@@ -18,7 +18,7 @@ To get started, choose **Observability** in OpenSearch Dashboards, and then choo
2. Enter a name for your application and optionally add a description.
3. Do at least one of the following:

- Use [PPL]({{site.url}}{{site.baseurl}}/observability-plugin/ppl/index) to specify the base query.
- Use [PPL]({{site.url}}{{site.baseurl}}/search-plugins/sql/ppl/index) to specify the base query.

You can't change the base query after the application is created.
{: .note }
@@ -31,7 +31,7 @@ You can't change the base query after the application is created.
### Create a visualization

1. Choose the **Log Events** tab.
1. Use [PPL]({{site.url}}{{site.baseurl}}/observability-plugin/ppl/index) to build upon your base query.
1. Use [PPL]({{site.url}}{{site.baseurl}}/search-plugins/sql/ppl/index) to build upon your base query.
1. Choose the **Visualizations** tab to see your visualizations.
1. Expand the **Save** dropdown menu, enter a name for your visualization, then choose **Save**.

4 changes: 2 additions & 2 deletions _observability-plugin/event-analytics.md
@@ -6,7 +6,7 @@ nav_order: 10

# Event analytics

Event analytics in observability is where you can use [Piped Processing Language]({{site.url}}{{site.baseurl}}/observability-plugin/ppl/index) (PPL) queries to build and view different visualizations of your data.
Event analytics in Observability is where you can use [Piped Processing Language]({{site.url}}{{site.baseurl}}/search-plugins/sql/ppl/index) (PPL) queries to build and view different visualizations of your data.

## Get started with event analytics

@@ -24,7 +24,7 @@ source = opensearch_dashboards_sample_data_logs | fields host | stats count()

By default, Dashboards shows results from the last 15 minutes of your data. To see data from a different timeframe, use the date and time selector.

For more information about building PPL queries, see [Piped Processing Language]({{site.url}}{{site.baseurl}}/observability-plugin/ppl/index).
For more information about building PPL queries, see [Piped Processing Language]({{site.url}}{{site.baseurl}}/search-plugins/sql/ppl/index).

## Save a visualization

3 changes: 1 addition & 2 deletions _observability-plugin/index.md
@@ -5,7 +5,6 @@ nav_order: 1
has_children: false
redirect_from:
- /observability-plugin/
- /observability-plugin/
---

# About Observability
@@ -16,7 +15,7 @@ Observability is collection of plugins and applications that let you visualize d

Your experience of exploring data might differ, but if you're new to exploring data to create visualizations, we recommend trying a workflow like the following:

1. Explore data over a certain timeframe using [Piped Processing Language]({{site.url}}{{site.baseurl}}/observability-plugin/ppl/index).
1. Explore data within a certain timeframe using [Piped Processing Language]({{site.url}}{{site.baseurl}}/search-plugins/sql/ppl/index).
2. Use [event analytics]({{site.url}}{{site.baseurl}}/observability-plugin/event-analytics) to turn data-driven events into visualizations.
![Sample Event Analytics View]({{site.url}}{{site.baseurl}}/images/event-analytics.png)
3. Create [operational panels]({{site.url}}{{site.baseurl}}/observability-plugin/operational-panels) and add visualizations to compare data the way you like.
2 changes: 1 addition & 1 deletion _observability-plugin/operational-panels.md
@@ -6,7 +6,7 @@ nav_order: 30

# Operational panels

Operational panels in OpenSearch Dashboards are collections of visualizations generated using [Piped Processing Language]({{site.url}}{{site.baseurl}}/observability-plugin/ppl/index) (PPL) queries.
Operational panels in OpenSearch Dashboards are collections of visualizations generated using [Piped Processing Language]({{site.url}}{{site.baseurl}}/search-plugins/sql/ppl/index) (PPL) queries.

## Get started with operational panels

36 changes: 0 additions & 36 deletions _observability-plugin/ppl/datatypes.md

This file was deleted.

24 changes: 0 additions & 24 deletions _observability-plugin/ppl/endpoint.md

This file was deleted.

10 changes: 0 additions & 10 deletions _observability-plugin/ppl/functions.md

This file was deleted.

71 changes: 0 additions & 71 deletions _observability-plugin/ppl/protocol.md

This file was deleted.

49 changes: 0 additions & 49 deletions _observability-plugin/ppl/settings.md

This file was deleted.

2 changes: 1 addition & 1 deletion _opensearch/data-streams.md
@@ -262,4 +262,4 @@ You can use wildcards to delete more than one data stream.

We recommend deleting data from a data stream using an ISM policy.

You can also use [asynchronous search]({{site.url}}{{site.baseurl}}/search-plugins/async/index/) and [SQL]({{site.url}}{{site.baseurl}}/search-plugins/sql/index/) and [PPL]({{site.url}}{{site.baseurl}}/observability-plugin/ppl/index/) to query your data stream directly. You can also use the security plugin to define granular permissions on the data stream name.
You can also use [asynchronous search]({{site.url}}{{site.baseurl}}/search-plugins/async/index/), [SQL]({{site.url}}{{site.baseurl}}/search-plugins/sql/index/), and [PPL]({{site.url}}{{site.baseurl}}/search-plugins/sql/ppl/index/) to query your data stream directly. You can also use the security plugin to define granular permissions for the data stream name.
69 changes: 54 additions & 15 deletions _search-plugins/sql/aggregations.md
@@ -1,52 +1,91 @@
---
layout: default
title: Aggregation Functions
title: Aggregate Functions
parent: SQL
nav_order: 11
---

# Aggregation functions
# Aggregate functions

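Aggregate functions operate on a set of values and return a single result for that set. Use the `GROUP BY` clause to divide the rows into subsets so that the function is applied to each subset separately.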

OpenSearch supports the following aggregate functions:

Function | Description
:--- | :---
AVG | Returns the average of the results.
COUNT | Returns the number of results.
SUM | Returns the sum of the results.
MIN | Returns the minimum of the results.
MAX | Returns the maximum of the results.
VAR_POP or VARIANCE | Returns the population variance of the results after discarding nulls.
VAR_SAMP | Returns the sample variance of the results after discarding nulls.
STD or STDDEV | Returns the population standard deviation of the results. Returns 0 when it has only one row of results.
STDDEV_POP | Returns the population standard deviation of the results.
STDDEV_SAMP | Returns the sample standard deviation of the results. Returns null when it has only one row of results.


The examples below reference an `accounts` table. You can try out the examples by indexing the following documents into OpenSearch using the bulk index operation:

```json
PUT accounts/_bulk?refresh
{"index":{"_id":"1"}}
{"account_number":1,"balance":39225,"firstname":"Amber","lastname":"Duke","age":32,"gender":"M","address":"880 Holmes Lane","employer":"Pyrami","email":"[email protected]","city":"Brogan","state":"IL","acct_open_date":"2008-01-23"}
{"index":{"_id":"6"}}
{"account_number":6,"balance":5686,"firstname":"Hattie","lastname":"Bond","age":36,"gender":"M","address":"671 Bristol Street","employer":"Netagy","email":"[email protected]","city":"Dante","state":"TN","acct_open_date":"2008-06-07"}
{"index":{"_id":"13"}}
{"account_number":13,"balance":32838,"firstname":"Nanette","lastname":"Bates","age":28,"gender":"F","address":"789 Madison Street","employer":"Quility","email":"[email protected]","city":"Nogal","state":"VA","acct_open_date":"2010-04-11"}
{"index":{"_id":"18"}}
{"account_number":18,"balance":4180,"firstname":"Dale","lastname":"Adams","age":33,"gender":"M","address":"467 Hutchinson Court","email":"[email protected]","city":"Orick","state":"MD","acct_open_date":"2022-11-05"}
```
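
Several functions in the table above come in population and sample variants. The following Python sketch illustrates the difference using the `age` values of the four sample documents. It uses Python's standard `statistics` module as a stand-in and is not how the SQL plugin computes these values; the variable names are chosen here for illustration only.

```python
import statistics

# Ages of the four sample accounts indexed above.
ages = [32, 36, 28, 33]

# Population variants divide by n (compare VAR_POP and STDDEV_POP).
var_pop = statistics.pvariance(ages)
stddev_pop = statistics.pstdev(ages)

# Sample variants divide by n - 1 (compare VAR_SAMP and STDDEV_SAMP).
var_samp = statistics.variance(ages)
stddev_samp = statistics.stdev(ages)

print(var_pop, var_samp)
print(stddev_pop, stddev_samp)

# With a single row, the population standard deviation is 0,
# while the sample standard deviation is undefined.
single_pop = statistics.pstdev([32])
try:
    single_samp = statistics.stdev([32])
except statistics.StatisticsError:
    single_samp = None  # stands in for SQL NULL in this sketch
print(single_pop, single_samp)
```

The single-row case mirrors the table: a population statistic over one row is 0, whereas `STDDEV_SAMP` is listed as returning null.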



## Group By

The `GROUP BY` clause accepts an identifier, an ordinal, or an expression.

### Identifier

The following query returns the gender and average age of customers in the `accounts` index and groups the results by gender:

```sql
SELECT gender, sum(age) FROM accounts GROUP BY gender;
SELECT gender, avg(age) FROM accounts GROUP BY gender;
```

| gender | sum (age)
| gender | avg(age)
:--- | :---
F | 28 |
M | 101 |
F | 28.0 |
M | 33.666666666666664 |
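
As a rough illustration of what grouping by an identifier does, the following Python sketch buckets the sample documents by `gender` and averages `age` within each bucket. It only mimics the query's result on the sample data and makes no claim about how OpenSearch executes it; the variable names are hypothetical.

```python
from collections import defaultdict

# Only the fields the query touches, copied from the sample documents.
accounts = [
    {"gender": "M", "age": 32},
    {"gender": "M", "age": 36},
    {"gender": "F", "age": 28},
    {"gender": "M", "age": 33},
]

# GROUP BY gender: collect the ages that belong to each gender.
ages_by_gender = defaultdict(list)
for account in accounts:
    ages_by_gender[account["gender"]].append(account["age"])

# avg(age): one average per group.
avg_age = {
    gender: sum(ages) / len(ages)
    for gender, ages in sorted(ages_by_gender.items())
}
print(avg_age)
```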

### Ordinal

The following query returns the gender and average age of customers in the `accounts` index. It groups the results by the first column of the result set, which in this case is `gender`:

```sql
SELECT gender, sum(age) FROM accounts GROUP BY 1;
SELECT gender, avg(age) FROM accounts GROUP BY 1;
```

| gender | avg(age)
:--- | :---
F | 28 |
M | 101 |
F | 28.0 |
M | 33.666666666666664 |
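
To check that an ordinal refers to a position in the select list, the following sketch runs both forms of the query locally. It assumes SQLite (through Python's built-in `sqlite3` module) as a stand-in engine, so it demonstrates the general SQL behavior rather than the OpenSearch SQL plugin itself. The `ORDER BY` clauses are added here only to make the comparison deterministic.

```python
import sqlite3

# Gender and age of the four sample accounts.
rows = [("M", 32), ("M", 36), ("F", 28), ("M", 33)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (gender TEXT, age INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", rows)

# Group by the column name.
by_name = conn.execute(
    "SELECT gender, avg(age) FROM accounts GROUP BY gender ORDER BY gender"
).fetchall()

# Group by ordinal 1, that is, the first column of the select list.
by_ordinal = conn.execute(
    "SELECT gender, avg(age) FROM accounts GROUP BY 1 ORDER BY 1"
).fetchall()

print(by_ordinal == by_name)
conn.close()
```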

### Expression

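The following query returns the absolute value of the account number and the average age of customers in the `accounts` index. It groups the results by the expression `abs(account_number)`: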

```sql
SELECT abs(account_number), sum(age) FROM accounts GROUP BY abs(account_number);
SELECT abs(account_number), avg(age) FROM accounts GROUP BY abs(account_number);
```

| abs(account_number) | sum (age)
| abs(account_number) | avg(age)
:--- | :---
| 1 | 32 |
| 13 | 28 |
| 18 | 33 |
| 6 | 36 |
| 1 | 32.0 |
| 13 | 28.0 |
| 18 | 33.0 |
| 6 | 36.0 |
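
Grouping by an expression can be tried locally in the same spirit. The sketch below again assumes SQLite through Python's `sqlite3` module as a stand-in engine, not the OpenSearch SQL plugin. Because the sample account numbers are distinct and positive, each group contains exactly one row. The added `ORDER BY` sorts the keys numerically, so the row order may differ from the table above.

```python
import sqlite3

# Account number and age of the four sample accounts.
rows = [(1, 32), (6, 36), (13, 28), (18, 33)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (account_number INTEGER, age INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", rows)

# The grouping key is the value of the expression, not a stored column.
by_expression = conn.execute(
    "SELECT abs(account_number), avg(age) FROM accounts "
    "GROUP BY abs(account_number) ORDER BY abs(account_number)"
).fetchall()

print(by_expression)
conn.close()
```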

## Aggregation
