Add PPL and SQL section (#1111)

* Merge pull request #1 from Yury-Fridlyand/dev-update-sql-relevance-docs

  Update SQL plugin relevance functions documentation.

  Co-authored-by: MaxKsyunz <[email protected]>
  Signed-off-by: Yury Fridlyand <[email protected]>

* Address PR feedback.

  Signed-off-by: Yury Fridlyand <[email protected]>

* Address PR feedback by @joshuali925.

  Signed-off-by: Yury Fridlyand <[email protected]>

* Remove PPL page from Observability Plugin. Add link to Observability page. Make some simple formatting changes

  Signed-off-by: Naarcha-AWS <[email protected]>

* Reword paragraph

  Signed-off-by: Naarcha-AWS <[email protected]>

* Adds SQL and PPL API and other SQL plugin changes

  Signed-off-by: Fanit Kolchina <[email protected]>

* Formatting changes

  Signed-off-by: Fanit Kolchina <[email protected]>

* Incorporates editorial comments

  Signed-off-by: Fanit Kolchina <[email protected]>

Signed-off-by: Yury Fridlyand <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>
Signed-off-by: Fanit Kolchina <[email protected]>
Co-authored-by: Yury Fridlyand <[email protected]>
Co-authored-by: MaxKsyunz <[email protected]>
Co-authored-by: Fanit Kolchina <[email protected]>
(cherry picked from commit c69f860)
opensearch-project · Sep 26, 2022 · 644cbb5
1 parent de6a8ed
commit 644cbb5

Showing 41 changed files with 2,097 additions and 1,357 deletions.
diff --git a/_ml-commons-plugin/index.md b/_ml-commons-plugin/index.md
@@ -10,7 +10,7 @@ has_toc: false
 
 ML Commons for OpenSearch eases the development of machine learning features by providing a set of common machine learning (ML) algorithms through transport and REST API calls. Those calls choose the right nodes and resources for each ML request and monitor ML tasks to ensure uptime. This allows you to leverage existing open-source ML algorithms and reduce the effort required to develop new ML features.
 
-Interaction with the ML Commons plugin occurs through either the [REST API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api) or [AD]({{site.url}}{{site.baseurl}}/observability-plugin/ppl/commands#ad) and [kmeans]({{site.url}}{{site.baseurl}}/observability-plugin/ppl/commands#kmeans) Piped Processing Language (PPL) commands.
+Interaction with the ML Commons plugin occurs through either the [REST API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api) or [`ad`]({{site.url}}{{site.baseurl}}/search-plugins/sql/ppl/functions#ad) and [`kmeans`]({{site.url}}{{site.baseurl}}/search-plugins/sql/ppl/functions#kmeans) Piped Processing Language (PPL) commands.
 
 Models [trained]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api#train-model) through the ML Commons plugin support model-based algorithms such as kmeans. After you've trained a model enough so that it meets your precision requirements, you can apply the model to [predict]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api#predict) new data safely. 
 

diff --git a/_observability-plugin/app-analytics.md b/_observability-plugin/app-analytics.md
@@ -18,7 +18,7 @@ To get started, choose **Observability** in OpenSearch Dashboards, and then choo
 2. Enter a name for your application and optionally add a description.
 3. Do at least one of the following:
 
-- Use [PPL]({{site.url}}{{site.baseurl}}/observability-plugin/ppl/index) to specify the base query.
+- Use [PPL]({{site.url}}{{site.baseurl}}/search-plugins/sql/ppl/index) to specify the base query.
 
 You can't change the base query after the application is created.
 {: .note }
@@ -31,7 +31,7 @@ You can't change the base query after the application is created.
 ### Create a visualization
 
 1. Choose the **Log Events** tab.
-1. Use [PPL]({{site.url}}{{site.baseurl}}/observability-plugin/ppl/index) to build upon your base query.
+1. Use [PPL]({{site.url}}{{site.baseurl}}/search-plugins/sql/ppl/index) to build upon your base query.
 1. Choose the **Visualizations** tab to see your visualizations.
 1. Expand the **Save** dropdown menu, enter a name for your visualization, then choose **Save**.
 

diff --git a/_observability-plugin/event-analytics.md b/_observability-plugin/event-analytics.md
@@ -6,7 +6,7 @@ nav_order: 10
 
 # Event analytics
 
-Event analytics in observability is where you can use [Piped Processing Language]({{site.url}}{{site.baseurl}}/observability-plugin/ppl/index) (PPL) queries to build and view different visualizations of your data.
+Event analytics in Observability is where you can use [Piped Processing Language]({{site.url}}{{site.baseurl}}/search-plugins/sql/ppl/index) (PPL) queries to build and view different visualizations of your data.
 
 ## Get started with event analytics
 
@@ -24,7 +24,7 @@ source = opensearch_dashboards_sample_data_logs | fields host | stats count()
 
 By default, Dashboards shows results from the last 15 minutes of your data. To see data from a different timeframe, use the date and time selector.
 
-For more information about building PPL queries, see [Piped Processing Language]({{site.url}}{{site.baseurl}}/observability-plugin/ppl/index).
+For more information about building PPL queries, see [Piped Processing Language]({{site.url}}{{site.baseurl}}/search-plugins/sql/ppl/index).
 
 ## Save a visualization
 

diff --git a/_observability-plugin/index.md b/_observability-plugin/index.md
@@ -5,7 +5,6 @@ nav_order: 1
 has_children: false
 redirect_from:
   - /observability-plugin/
-  - /observability-plugin/
 ---
 
 # About Observability
@@ -16,7 +15,7 @@ Observability is collection of plugins and applications that let you visualize d
 
 Your experience of exploring data might differ, but if you're new to exploring data to create visualizations, we recommend trying a workflow like the following:
 
-1. Explore data over a certain timeframe using [Piped Processing Language]({{site.url}}{{site.baseurl}}/observability-plugin/ppl/index).
+1. Explore data within a certain timeframe using [Piped Processing Language]({{site.url}}{{site.baseurl}}/search-plugins/sql/ppl/index).
 2. Use [event analytics]({{site.url}}{{site.baseurl}}/observability-plugin/event-analytics) to turn data-driven events into visualizations.
   ![Sample Event Analytics View]({{site.url}}{{site.baseurl}}/images/event-analytics.png)
 3. Create [operational panels]({{site.url}}{{site.baseurl}}/observability-plugin/operational-panels) and add visualizations to compare data the way you like.

diff --git a/_observability-plugin/operational-panels.md b/_observability-plugin/operational-panels.md
@@ -6,7 +6,7 @@ nav_order: 30
 
 # Operational panels
 
-Operational panels in OpenSearch Dashboards are collections of visualizations generated using [Piped Processing Language]({{site.url}}{{site.baseurl}}/observability-plugin/ppl/index) (PPL) queries.
+Operational panels in OpenSearch Dashboards are collections of visualizations generated using [Piped Processing Language]({{site.url}}{{site.baseurl}}/search-plugins/sql/ppl/index) (PPL) queries.
 
 ## Get started with operational panels
 

diff --git a/_observability-plugin/ppl/datatypes.md b/_observability-plugin/ppl/datatypes.md
diff --git a/_observability-plugin/ppl/endpoint.md b/_observability-plugin/ppl/endpoint.md
diff --git a/_observability-plugin/ppl/functions.md b/_observability-plugin/ppl/functions.md
diff --git a/_observability-plugin/ppl/protocol.md b/_observability-plugin/ppl/protocol.md
diff --git a/_observability-plugin/ppl/settings.md b/_observability-plugin/ppl/settings.md
diff --git a/_opensearch/data-streams.md b/_opensearch/data-streams.md
@@ -262,4 +262,4 @@ You can use wildcards to delete more than one data stream.
 
 We recommend deleting data from a data stream using an ISM policy.
 
-You can also use [asynchronous search]({{site.url}}{{site.baseurl}}/search-plugins/async/index/) and [SQL]({{site.url}}{{site.baseurl}}/search-plugins/sql/index/) and [PPL]({{site.url}}{{site.baseurl}}/observability-plugin/ppl/index/) to query your data stream directly. You can also use the security plugin to define granular permissions on the data stream name.
+You can also use [asynchronous search]({{site.url}}{{site.baseurl}}/search-plugins/async/index/), [SQL]({{site.url}}{{site.baseurl}}/search-plugins/sql/index/), and [PPL]({{site.url}}{{site.baseurl}}/search-plugins/sql/ppl/index/) to query your data stream directly. You can also use the security plugin to define granular permissions for the data stream name.
diff --git a/_search-plugins/sql/aggregations.md b/_search-plugins/sql/aggregations.md
@@ -1,52 +1,91 @@
 ---
 layout: default
-title: Aggregation Functions
+title: Aggregate Functions
 parent: SQL
 nav_order: 11
 ---
 
-# Aggregation functions
+# Aggregate functions
 
 Aggregate functions use the `GROUP BY` clause to group sets of values into subsets.
 
+OpenSearch supports the following aggregate functions:
+
+Function | Description
+:--- | :---
+AVG | Returns the average of the results.
+COUNT | Returns the number of results.
+SUM | Returns the sum of the results.
+MIN | Returns the minimum of the results.
+MAX | Returns the maximum of the results.
+VAR_POP or VARIANCE | Returns the population variance of the results after discarding nulls.
+VAR_SAMP | Returns the sample variance of the results after discarding nulls.
+STD or STDDEV | Returns the sample standard deviation of the results. Returns 0 when it has only one row of results.
+STDDEV_POP | Returns the population standard deviation of the results.
+STDDEV_SAMP | Returns the sample standard deviation of the results. Returns null when it has only one row of results.
+
+
+The examples below reference an `accounts` table. You can try out the examples by indexing the following documents into OpenSearch using the bulk index operation:
+
+```json
+PUT accounts/_bulk?refresh
+{"index":{"_id":"1"}}
+{"account_number":1,"balance":39225,"firstname":"Amber","lastname":"Duke","age":32,"gender":"M","address":"880 Holmes Lane","employer":"Pyrami","email":"[email protected]","city":"Brogan","state":"IL","acct_open_date":"2008-01-23"}
+{"index":{"_id":"6"}}
+{"account_number":6,"balance":5686,"firstname":"Hattie","lastname":"Bond","age":36,"gender":"M","address":"671 Bristol Street","employer":"Netagy","email":"[email protected]","city":"Dante","state":"TN","acct_open_date":"2008-06-07"}
+{"index":{"_id":"13"}}
+{"account_number":13,"balance":32838,"firstname":"Nanette","lastname":"Bates","age":28,"gender":"F","address":"789 Madison Street","employer":"Quility","email":"[email protected]","city":"Nogal","state":"VA","acct_open_date":"2010-04-11"}
+{"index":{"_id":"18"}}
+{"account_number":18,"balance":4180,"firstname":"Dale","lastname":"Adams","age":33,"gender":"M","address":"467 Hutchinson Court","email":"[email protected]","city":"Orick","state":"MD","acct_open_date":"2022-11-05"}
+```
+
+
+
 ## Group By
 
 Use the `GROUP BY` clause with an identifier, ordinal, or expression.
 
 ### Identifier
 
+The following query returns the gender and average age of customers in the `accounts` index and groups the results by gender:
+
 ```sql
-SELECT gender, sum(age) FROM accounts GROUP BY gender;
+SELECT gender, avg(age) FROM accounts GROUP BY gender;
 ```
 
-| gender | sum (age)
+| gender | avg(age)
 :--- | :---
-F | 28 |
-M | 101 |
+F | 28.0  |
+M | 33.666666666666664 |
 
 ### Ordinal
 
+The following query returns the gender and average age of customers in the `accounts` index. It groups the results by the first column of the result set, which in this case is `gender`:
+
 ```sql
-SELECT gender, sum(age) FROM accounts GROUP BY 1;
+SELECT gender, avg(age) FROM accounts GROUP BY 1;
 ```
 
-| gender | sum (age)
+| gender | avg(age)
 :--- | :---
-F | 28 |
-M | 101 |
+F | 28.0  |
+M | 33.666666666666664 |
 
 ### Expression
 
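+The following query returns the absolute value of each account number and the average age of customers in the `accounts` index, and groups the results by the absolute value of the account number: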
+
 ```sql
-SELECT abs(account_number), sum(age) FROM accounts GROUP BY abs(account_number);
+SELECT abs(account_number), avg(age) FROM accounts GROUP BY abs(account_number);
 ```
 
-| abs(account_number) | sum (age)
+| abs(account_number) | avg(age)
 :--- | :---
-| 1  | 32  |
-| 13 | 28  |
-| 18 | 33  |
-| 6  | 36  |
+| 1  | 32.0  |
+| 13 | 28.0  |
+| 18 | 33.0  |
+| 6  | 36.0  |
 
 ## Aggregation
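
Review note: as a sanity check on the result tables in the `aggregations.md` hunk above, the following Python sketch recomputes the `avg(age)` groupings from the four sample documents in the bulk request. It is only an illustration of what `GROUP BY` with `avg` is expected to produce, not how the SQL plugin evaluates the query; the helper name `group_avg_age` is hypothetical, and the documents are reduced to the three fields the examples use.

```python
from collections import defaultdict
from statistics import mean

# The four sample documents from the bulk request, reduced to the
# fields that the GROUP BY examples reference.
accounts = [
    {"account_number": 1, "gender": "M", "age": 32},
    {"account_number": 6, "gender": "M", "age": 36},
    {"account_number": 13, "gender": "F", "age": 28},
    {"account_number": 18, "gender": "M", "age": 33},
]


def group_avg_age(rows, key):
    """Group rows by key(row) and return {group: average age}.

    Hypothetical helper that mirrors `SELECT <key>, avg(age) ... GROUP BY <key>`.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row["age"])
    return {group: mean(ages) for group, ages in groups.items()}


# Mirrors: SELECT gender, avg(age) FROM accounts GROUP BY gender;
by_gender = group_avg_age(accounts, lambda r: r["gender"])

# Mirrors: SELECT abs(account_number), avg(age) FROM accounts
#          GROUP BY abs(account_number);
by_abs_number = group_avg_age(accounts, lambda r: abs(r["account_number"]))

print(by_gender)
print(by_abs_number)
```

The male group averages 101 / 3, which matches the `33.666666666666664` shown in the Identifier and Ordinal tables. Every account number is distinct, so the Expression example yields one row per account, each with that account's own age.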