diff --git a/_ml-commons-plugin/index.md b/_ml-commons-plugin/index.md index 12142c2065..209b31a3f1 100644 --- a/_ml-commons-plugin/index.md +++ b/_ml-commons-plugin/index.md @@ -10,7 +10,7 @@ has_toc: false ML Commons for OpenSearch eases the development of machine learning features by providing a set of common machine learning (ML) algorithms through transport and REST API calls. Those calls choose the right nodes and resources for each ML request and monitors ML tasks to ensure uptime. This allows you to leverage existing open-source ML algorithms and reduce the effort required to develop new ML features. -Interaction with the ML Commons plugin occurs through either the [REST API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api) or [AD]({{site.url}}{{site.baseurl}}/observability-plugin/ppl/commands#ad) and [kmeans]({{site.url}}{{site.baseurl}}/observability-plugin/ppl/commands#kmeans) Piped Processing Language (PPL) commands. +Interaction with the ML Commons plugin occurs through either the [REST API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api) or [`ad`]({{site.url}}{{site.baseurl}}/search-plugins/sql/ppl/functions#ad) and [`kmeans`]({{site.url}}{{site.baseurl}}/search-plugins/sql/ppl/functions#kmeans) Piped Processing Language (PPL) commands. Models [trained]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api#train-model) through the ML Commons plugin support model-based algorithms such as kmeans. After you've trained a model enough so that it meets your precision requirements, you can apply the model to [predict]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api#predict) new data safely. diff --git a/_observability-plugin/app-analytics.md b/_observability-plugin/app-analytics.md index 2bb1d2fdfd..03a3dfa07a 100644 --- a/_observability-plugin/app-analytics.md +++ b/_observability-plugin/app-analytics.md @@ -18,7 +18,7 @@ To get started, select the Menu button on the upper left corner of the OpenSearc 2. 
Enter a name for your application and optionally add a description. 3. Do at least one of the following: -- Use [PPL]({{site.url}}{{site.baseurl}}/observability-plugin/ppl/index) to specify the base query. +- Use [PPL]({{site.url}}{{site.baseurl}}/search-plugins/sql/ppl/index) to specify the base query. You can't change the base query after the application is created. {: .note } @@ -31,7 +31,7 @@ You can't change the base query after the application is created. ### Create a visualization 1. Choose the **Log Events** tab. -1. Use [PPL]({{site.url}}{{site.baseurl}}/observability-plugin/ppl/index) to build upon your base query. +1. Use [PPL]({{site.url}}{{site.baseurl}}/search-plugins/sql/ppl/index) to build upon your base query. 1. Choose the **Visualizations** tab to see your visualizations. 1. Expand the **Save** dropdown menu, enter a name for your visualization, then choose **Save**. diff --git a/_observability-plugin/event-analytics.md b/_observability-plugin/event-analytics.md index 184fb484f5..030315eb28 100644 --- a/_observability-plugin/event-analytics.md +++ b/_observability-plugin/event-analytics.md @@ -6,7 +6,7 @@ nav_order: 10 # Event analytics -Event analytics in observability is where you can use [Piped Processing Language]({{site.url}}{{site.baseurl}}/observability-plugin/ppl/index) (PPL) queries to build and view different visualizations of your data. +Event analytics in Observability is where you can use [Piped Processing Language]({{site.url}}{{site.baseurl}}/search-plugins/sql/ppl/index) (PPL) queries to build and view different visualizations of your data. ## Get started with event analytics @@ -24,7 +24,7 @@ source = opensearch_dashboards_sample_data_logs | fields host | stats count() By default, Dashboards shows results from the last 15 minutes of your data. To see data from a different timeframe, use the date and time selector. 
-For more information about building PPL queries, see [Piped Processing Language]({{site.url}}{{site.baseurl}}/observability-plugin/ppl/index). +For more information about building PPL queries, see [Piped Processing Language]({{site.url}}{{site.baseurl}}/search-plugins/sql/ppl/index). ## Save a visualization diff --git a/_observability-plugin/index.md b/_observability-plugin/index.md index b7d05fe9ca..304cf2dbbf 100644 --- a/_observability-plugin/index.md +++ b/_observability-plugin/index.md @@ -5,7 +5,6 @@ nav_order: 1 has_children: false redirect_from: - /observability-plugin/ - - /observability-plugin/ --- # About Observability @@ -16,7 +15,7 @@ Observability is collection of plugins and applications that let you visualize d Your experience of exploring data might differ, but if you're new to exploring data to create visualizations, we recommend trying a workflow like the following: -1. Explore data over a certain timeframe using [Piped Processing Language]({{site.url}}{{site.baseurl}}/observability-plugin/ppl/index). +1. Explore data within a certain timeframe using [Piped Processing Language]({{site.url}}{{site.baseurl}}/search-plugins/sql/ppl/index). 2. Use [event analytics]({{site.url}}{{site.baseurl}}/observability-plugin/event-analytics) to turn data-driven events into visualizations. ![Sample Event Analytics View]({{site.url}}{{site.baseurl}}/images/event-analytics.png) 3. Create [operational panels]({{site.url}}{{site.baseurl}}/observability-plugin/operational-panels) and add visualizations to compare data the way you like. 
diff --git a/_observability-plugin/operational-panels.md b/_observability-plugin/operational-panels.md index 9578d7cd6e..8b8db539a4 100644 --- a/_observability-plugin/operational-panels.md +++ b/_observability-plugin/operational-panels.md @@ -6,7 +6,7 @@ nav_order: 30 # Operational panels -Operational panels in OpenSearch Dashboards are collections of visualizations generated using [Piped Processing Language]({{site.url}}{{site.baseurl}}/observability-plugin/ppl/index) (PPL) queries. +Operational panels in OpenSearch Dashboards are collections of visualizations generated using [Piped Processing Language]({{site.url}}{{site.baseurl}}/search-plugins/sql/ppl/index) (PPL) queries. ## Get started with operational panels diff --git a/_observability-plugin/ppl/datatypes.md b/_observability-plugin/ppl/datatypes.md deleted file mode 100644 index 910d96b312..0000000000 --- a/_observability-plugin/ppl/datatypes.md +++ /dev/null @@ -1,36 +0,0 @@ ---- -layout: default -title: Data Types -parent: Piped processing language -nav_order: 6 ---- - - -# Data types - -The following table shows the data types supported by the PPL plugin and how each one maps to OpenSearch and SQL data types: - -PPL Type | OpenSearch Type | SQL Type -:--- | :--- | :--- -boolean | boolean | BOOLEAN -byte | byte | TINYINT -byte | short | SMALLINT -integer | integer | INTEGER -long | long | BIGINT -float | float | REAL -float | half_float | FLOAT -float | scaled_float | DOUBLE -double | double | DOUBLE -string | keyword | VARCHAR -text | text | VARCHAR -timestamp | date | TIMESTAMP -ip | ip | VARCHAR -timestamp | date | TIMESTAMP -binary | binary | VARBINARY -struct | object | STRUCT -array | nested | STRUCT - -In addition to this list, the PPL plugin also supports the `datetime` type, though it doesn't have a corresponding mapping with OpenSearch. -To use a function without a corresponding mapping, you must explicitly convert the data type to one that does. 
- -The PPL plugin supports all SQL date and time types. To learn more, see [SQL Data Types]({{site.url}}{{site.baseurl}}/search-plugins/sql/datatypes/). diff --git a/_observability-plugin/ppl/endpoint.md b/_observability-plugin/ppl/endpoint.md deleted file mode 100644 index f4d693faed..0000000000 --- a/_observability-plugin/ppl/endpoint.md +++ /dev/null @@ -1,24 +0,0 @@ ---- -layout: default -title: Endpoint -parent: Piped processing language -nav_order: 1 ---- - -# Endpoint -Introduced 1.0 -{: .label .label-purple } - -To send a query request to PPL plugin, use the HTTP POST request. -We recommend a POST request because it doesn't have any length limit and it allows you to pass other parameters to the plugin for other functionality. - -Use the `_explain` endpoint for query translation and troubleshooting. - -## Request Format - -To use the PPL plugin with your own applications, send requests to `_plugins/_ppl`, with your query in the request body: - -```json -curl -H 'Content-Type: application/json' -X POST localhost:9200/_plugins/_ppl \ -... -d '{"query" : "source=accounts | fields firstname, lastname"}' -``` diff --git a/_observability-plugin/ppl/functions.md b/_observability-plugin/ppl/functions.md deleted file mode 100644 index 11ea712485..0000000000 --- a/_observability-plugin/ppl/functions.md +++ /dev/null @@ -1,10 +0,0 @@ ---- -layout: default -title: Functions -parent: Piped processing language -nav_order: 10 ---- - -# Functions - -The PPL plugin supports all SQL functions. To learn more, see [SQL Functions]({{site.url}}{{site.baseurl}}/search-plugins/sql/functions/). diff --git a/_observability-plugin/ppl/protocol.md b/_observability-plugin/ppl/protocol.md deleted file mode 100644 index 3636801f13..0000000000 --- a/_observability-plugin/ppl/protocol.md +++ /dev/null @@ -1,71 +0,0 @@ ---- -layout: default -title: Protocol -parent: Piped processing language -nav_order: 2 ---- - -# Protocol - -The PPL plugin provides responses in JDBC format. 
The JDBC format is widely used because it provides schema information and more functionality such as pagination. Besides JDBC driver, various clients can benefit from the detailed and well formatted response. - -## Response Format - -The body of HTTP POST request can take a few more additional fields with the PPL query: - -```json -curl -H 'Content-Type: application/json' -X POST localhost:9200/_plugins/_ppl \ -... -d '{"query" : "source=accounts | fields firstname, lastname"}' -``` - -The following example shows a normal response where the schema includes a field name and its type and datarows includes the result set: - -```json -{ - "schema": [ - { - "name": "firstname", - "type": "string" - }, - { - "name": "lastname", - "type": "string" - } - ], - "datarows": [ - [ - "Amber", - "Duke" - ], - [ - "Hattie", - "Bond" - ], - [ - "Nanette", - "Bates" - ], - [ - "Dale", - "Adams" - ] - ], - "total": 4, - "size": 4 -} -``` - -If any error occurred, error message and the cause will be returned instead: - -```json -curl -H 'Content-Type: application/json' -X POST localhost:9200/_plugins/_ppl \ -... -d '{"query" : "source=unknown | fields firstname, lastname"}' -{ - "error": { - "reason": "Error occurred in OpenSearch engine: no such index [unknown]", - "details": "org.opensearch.index.IndexNotFoundException: no such index [unknown]\nFor more details, please send request for Json format to see the raw response from opensearch engine.", - "type": "IndexNotFoundException" - }, - "status": 404 -} -``` diff --git a/_observability-plugin/ppl/settings.md b/_observability-plugin/ppl/settings.md deleted file mode 100644 index 8dd7476b64..0000000000 --- a/_observability-plugin/ppl/settings.md +++ /dev/null @@ -1,49 +0,0 @@ ---- -layout: default -title: Settings -parent: Piped processing language -nav_order: 3 ---- - -# Settings - -The PPL plugin adds a few settings to the standard OpenSearch cluster settings. 
Most are dynamic, so you can change the default behavior of the plugin without restarting your cluster. - -You can update these settings like any other cluster setting: - -```json -PUT _cluster/settings -{ - "transient": { - "plugins": { - "ppl": { - "enabled": "false" - } - } - } -} -``` - -Similarly, you can also update the settings by sending request to the plugin setting endpoint `_plugins/_query/settings` : -```json -PUT _plugins/_query/settings -{ - "transient": { - "plugins": { - "ppl": { - "enabled": "false" - } - } - } -} -``` - -Requests to `_plugins/_ppl` include index names in the request body, so they have the same access policy considerations as the `bulk`, `mget`, and `msearch` operations. If you set the `rest.action.multi.allow_explicit_index` parameter to `false`, the PPL plugin is disabled. - -You can specify the settings shown in the following table: - -Setting | Description | Default -:--- | :--- | :--- -`plugins.ppl.enabled` | Change to `false` to disable the PPL component. | True -`plugins.query.memory_limit` | Set heap memory usage limit. If a query crosses this limit, it's terminated. | 85% -`plugins.query.size_limit` | Set the maximum number of results that you want to see. This impacts the accuracy of aggregation operations. For example, if you have 1000 documents in an index, by default, only 200 documents are extracted from the index for aggregation. | 200 diff --git a/_opensearch/data-streams.md b/_opensearch/data-streams.md index 69ee1b1321..0dc094fbe2 100644 --- a/_opensearch/data-streams.md +++ b/_opensearch/data-streams.md @@ -262,4 +262,4 @@ You can use wildcards to delete more than one data stream. We recommend deleting data from a data stream using an ISM policy. -You can also use [asynchronous search]({{site.url}}{{site.baseurl}}/search-plugins/async/index/) and [SQL]({{site.url}}{{site.baseurl}}/search-plugins/sql/index/) and [PPL]({{site.url}}{{site.baseurl}}/observability-plugin/ppl/index/) to query your data stream directly. 
You can also use the security plugin to define granular permissions on the data stream name. +You can also use [asynchronous search]({{site.url}}{{site.baseurl}}/search-plugins/async/index/), [SQL]({{site.url}}{{site.baseurl}}/search-plugins/sql/index/), and [PPL]({{site.url}}{{site.baseurl}}/search-plugins/sql/ppl/index/) to query your data stream directly. You can also use the security plugin to define granular permissions for the data stream name. diff --git a/_search-plugins/sql/aggregations.md b/_search-plugins/sql/aggregations.md index 688ec1b962..0b0137e1c6 100644 --- a/_search-plugins/sql/aggregations.md +++ b/_search-plugins/sql/aggregations.md @@ -1,52 +1,91 @@ --- layout: default -title: Aggregation Functions +title: Aggregate Functions parent: SQL nav_order: 11 --- -# Aggregation functions +# Aggregate functions Aggregate functions use the `GROUP BY` clause to group sets of values into subsets. +OpenSearch supports the following aggregate functions: + +Function | Description +:--- | :--- +AVG | Returns the average of the results. +COUNT | Returns the number of results. +SUM | Returns the sum of the results. +MIN | Returns the minimum of the results. +MAX | Returns the maximum of the results. +VAR_POP or VARIANCE | Returns the population variance of the results after discarding nulls. +VAR_SAMP | Returns the sample variance of the results after discarding nulls. +STD or STDDEV | Returns the sample standard deviation of the results. Returns 0 when it has only one row of results. +STDDEV_POP | Returns the population standard deviation of the results. +STDDEV_SAMP | Returns the sample standard deviation of the results. Returns null when it has only one row of results. + + +The examples below reference an `accounts` table. 
You can try out the examples by indexing the following documents into OpenSearch using the bulk index operation: + +```json +```json +PUT accounts/_bulk?refresh +{"index":{"_id":"1"}} +{"account_number":1,"balance":39225,"firstname":"Amber","lastname":"Duke","age":32,"gender":"M","address":"880 Holmes Lane","employer":"Pyrami","email":"amberduke@pyrami.com","city":"Brogan","state":"IL","acct_open_date":"2008-01-23"} +{"index":{"_id":"6"}} +{"account_number":6,"balance":5686,"firstname":"Hattie","lastname":"Bond","age":36,"gender":"M","address":"671 Bristol Street","employer":"Netagy","email":"hattiebond@netagy.com","city":"Dante","state":"TN","acct_open_date":"2008-06-07"} +{"index":{"_id":"13"}} +{"account_number":13,"balance":32838,"firstname":"Nanette","lastname":"Bates","age":28,"gender":"F","address":"789 Madison Street","employer":"Quility","email":"nanettebates@quility.com","city":"Nogal","state":"VA","acct_open_date":"2010-04-11"} +{"index":{"_id":"18"}} +{"account_number":18,"balance":4180,"firstname":"Dale","lastname":"Adams","age":33,"gender":"M","address":"467 Hutchinson Court","email":"daleadams@boink.com","city":"Orick","state":"MD","acct_open_date":"2022-11-05"} +``` + + + ## Group By Use the `GROUP BY` clause as an identifier, ordinal, or expression. ### Identifier +The following query returns the gender and average age of customers in the `accounts` index and groups the results by gender: + ```sql -SELECT gender, sum(age) FROM accounts GROUP BY gender; +SELECT gender, avg(age) FROM accounts GROUP BY gender; ``` -| gender | sum (age) +| gender | avg(age) :--- | :--- -F | 28 | -M | 101 | +F | 28.0 | +M | 33.666666666666664 | ### Ordinal +The following query returns the gender and average age of customers in the `accounts` index. 
It groups the results by the first column of the result set, which in this case is `gender`: + ```sql -SELECT gender, sum(age) FROM accounts GROUP BY 1; +SELECT gender, avg(age) FROM accounts GROUP BY 1; ``` | gender | sum (age) :--- | :--- -F | 28 | -M | 101 | +F | 28.0 | +M | 33.666666666666664 | ### Expression +The following query returns the average age of customers in the `accounts` index, grouped by the absolute value of `account_number`: + ```sql -SELECT abs(account_number), sum(age) FROM accounts GROUP BY abs(account_number); +SELECT abs(account_number), avg(age) FROM accounts GROUP BY abs(account_number); ``` -| abs(account_number) | sum (age) +| abs(account_number) | avg(age) :--- | :--- -| 1 | 32 | -| 13 | 28 | -| 18 | 33 | -| 6 | 36 | +| 1 | 32.0 | +| 13 | 28.0 | +| 18 | 33.0 | +| 6 | 36.0 | ## Aggregation diff --git a/_search-plugins/sql/cli.md b/_search-plugins/sql/cli.md index c2b16f7c5e..a91c0ca05b 100644 --- a/_search-plugins/sql/cli.md +++ b/_search-plugins/sql/cli.md @@ -1,23 +1,24 @@ --- layout: default -title: SQL CLI -parent: SQL -nav_order: 2 +title: SQL and PPL CLI +parent: SQL and PPL +nav_order: 3 --- -# SQL CLI +# SQL and PPL CLI -SQL CLI is a stand-alone Python application that you can launch with the `opensearchsql` command. +The SQL and PPL command line interface (CLI) is a standalone Python application that you can launch with the `opensearchsql` command. -Install the SQL plugin to your OpenSearch instance, run the CLI using MacOS or Linux, and connect to any valid OpenSearch end-point. +To use the SQL and PPL CLI, install the SQL plugin on your OpenSearch instance, run the CLI on macOS or Linux, and connect to any valid OpenSearch endpoint. 
![SQL CLI]({{site.url}}{{site.baseurl}}/images/cli.gif) ## Features -SQL CLI has the following features: +The SQL and PPL CLI has the following features: - Multi-line input +- PPL support - Autocomplete for SQL syntax and index names - Syntax highlighting - Formatted output: @@ -33,26 +34,16 @@ SQL CLI has the following features: Launch your local OpenSearch instance and make sure you have the SQL plugin installed. -To install the SQL CLI: - -1. We suggest you install and activate a python3 virtual environment to avoid changing your local environment: -``` -pip install virtualenv -virtualenv venv -cd venv -source ./bin/activate -``` - -2. Install the CLI: -``` +1. Install the CLI: +```console pip3 install opensearchsql ``` The SQL CLI only works with Python 3. {: .note } -3. To launch the CLI, run: -``` +2. To launch the CLI, run: +```console opensearchsql https://localhost:9200 --username admin --password admin ``` By default, the `opensearchsql` command connects to http://localhost:9200. @@ -71,25 +62,41 @@ For a list of all available configurations, see [clirc](https://github.com/opens ## Using the CLI -1. Save the sample [accounts test data](https://github.com/opensearch-project/sql/blob/main/doctest/test_data/accounts.json) file. - -1. Index the sample data. +1. Run the CLI tool. If your cluster runs with the default security settings, use the following command: +```console +opensearchsql --username admin --password admin https://localhost:9200 ``` -curl -H "Content-Type: application/x-ndjson" -POST https://localhost:9200/data/_bulk -u 'admin:admin' --insecure --data-binary "@accounts.json" +If your cluster runs without security, run: +```console +opensearchsql ``` -1. Run a sample SQL command: -``` +2. Run a sample SQL command: +```sql SELECT * FROM accounts; ``` By default, you see a maximum output of 200 rows. To show more results, add a `LIMIT` clause with the desired value. +To exit the CLI tool, select **Ctrl+D**. 
+{: .tip } + +## Using the CLI with PPL + +1. Run the CLI by specifying the query language: +```console +opensearchsql -l ppl +``` + +2. Execute a PPL query: +```sql +source=accounts | fields firstname, lastname +``` + ## Query options -Run a single query with the following options: +Run a single query with the following command line options: -- `--help`: Help page for options - `-q`: Follow by a single query - `-f`: Specify JDBC or raw format output - `-v`: Display data vertically @@ -97,6 +104,7 @@ Run a single query with the following options: ## CLI options +- `--help`: Help page for options - `-l`: Query language option. Available options are `sql` and `ppl`. Default is `sql` - `-p`: Always use pager to display output - `--clirc`: Provide path for the configuration file diff --git a/_search-plugins/sql/datatypes.md b/_search-plugins/sql/datatypes.md index a11fa233dd..c2eb4e3860 100644 --- a/_search-plugins/sql/datatypes.md +++ b/_search-plugins/sql/datatypes.md @@ -1,8 +1,8 @@ --- layout: default title: Data Types -parent: SQL -nav_order: 73 +parent: SQL and PPL +nav_order: 7 --- # Data types diff --git a/_search-plugins/sql/endpoints.md b/_search-plugins/sql/endpoints.md deleted file mode 100644 index 95affd3c00..0000000000 --- a/_search-plugins/sql/endpoints.md +++ /dev/null @@ -1,226 +0,0 @@ ---- -layout: default -title: Endpoint -parent: SQL -nav_order: 13 ---- - - -# Endpoint -Introduced 1.0 -{: .label .label-purple } - -To send query request to SQL plugin, you can either use a request -parameter in HTTP GET or request body by HTTP POST request. POST request -is recommended because it doesn't have length limitation and allows for -other parameters passed to plugin for other functionality such as -prepared statement. And also the explain endpoint is used very often for -query translation and troubleshooting. - -## GET - -### Description - -You can send HTTP GET request with your query embedded in URL parameter. 
- -### Example - -SQL query: - -```console ->> curl -H 'Content-Type: application/json' -X GET localhost:9200/_plugins/_sql?sql=SELECT * FROM accounts -``` - -## POST - -### Description - -You can also send HTTP POST request with your query in request body. - -### Example - -SQL query: - -```console ->> curl -H 'Content-Type: application/json' -X POST localhost:9200/_plugins/_sql -d '{ - "query" : "SELECT * FROM accounts" -}' -``` - -## Explain - -### Description - -To translate your query, send it to explain endpoint. The explain output -is OpenSearch domain specific language (DSL) in JSON format. You can -just copy and paste it to your console to run it against OpenSearch -directly. - -### Example - -Explain query: - -```console ->> curl -H 'Content-Type: application/json' -X POST localhost:9200/_plugins/_sql/_explain -d '{ - "query" : "SELECT firstname, lastname FROM accounts WHERE age > 20" -}' -``` - -Explain: - -```json -{ - "from": 0, - "size": 200, - "query": { - "bool": { - "filter": [{ - "bool": { - "must": [{ - "range": { - "age": { - "from": 20, - "to": null, - "include_lower": false, - "include_upper": true, - "boost": 1.0 - } - } - }], - "adjust_pure_negative": true, - "boost": 1.0 - } - }], - "adjust_pure_negative": true, - "boost": 1.0 - } - }, - "_source": { - "includes": [ - "firstname", - "lastname" - ], - "excludes": [] - } -} -``` - - -## Cursor - -### Description - -To get back a paginated response, use the `fetch_size` parameter. The value of `fetch_size` should be greater than 0. The default value is 1,000. A value of 0 will fallback to a non-paginated response. - -The `fetch_size` parameter is only supported for the JDBC response format. 
-{: .note } - - -### Example - -SQL query: - -```console ->> curl -H 'Content-Type: application/json' -X POST localhost:9200/_plugins/_sql -d '{ - "fetch_size" : 5, - "query" : "SELECT firstname, lastname FROM accounts WHERE age > 20 ORDER BY state ASC" -}' -``` - -Result set: - -```json -{ - "schema": [ - { - "name": "firstname", - "type": "text" - }, - { - "name": "lastname", - "type": "text" - } - ], - "cursor": "d:eyJhIjp7fSwicyI6IkRYRjFaWEo1UVc1a1JtVjBZMmdCQUFBQUFBQUFBQU1XZWpkdFRFRkZUMlpTZEZkeFdsWnJkRlZoYnpaeVVRPT0iLCJjIjpbeyJuYW1lIjoiZmlyc3RuYW1lIiwidHlwZSI6InRleHQifSx7Im5hbWUiOiJsYXN0bmFtZSIsInR5cGUiOiJ0ZXh0In1dLCJmIjo1LCJpIjoiYWNjb3VudHMiLCJsIjo5NTF9", - "total": 956, - "datarows": [ - [ - "Cherry", - "Carey" - ], - [ - "Lindsey", - "Hawkins" - ], - [ - "Sargent", - "Powers" - ], - [ - "Campos", - "Olsen" - ], - [ - "Savannah", - "Kirby" - ] - ], - "size": 5, - "status": 200 -} -``` - -To fetch subsequent pages, use the `cursor` from last response: - -```console ->> curl -H 'Content-Type: application/json' -X POST localhost:9200/_plugins/_sql -d '{ - "cursor": "d:eyJhIjp7fSwicyI6IkRYRjFaWEo1UVc1a1JtVjBZMmdCQUFBQUFBQUFBQU1XZWpkdFRFRkZUMlpTZEZkeFdsWnJkRlZoYnpaeVVRPT0iLCJjIjpbeyJuYW1lIjoiZmlyc3RuYW1lIiwidHlwZSI6InRleHQifSx7Im5hbWUiOiJsYXN0bmFtZSIsInR5cGUiOiJ0ZXh0In1dLCJmIjo1LCJpIjoiYWNjb3VudHMiLCJsIjo5NTF9" -}' -``` - -The result only has the `fetch_size` number of `datarows` and `cursor`. -The last page has only `datarows` and no `cursor`. -The `datarows` can have more than the `fetch_size` number of records in case the nested fields are flattened. 
- -```json -{ - "cursor": "d:eyJhIjp7fSwicyI6IkRYRjFaWEo1UVc1a1JtVjBZMmdCQUFBQUFBQUFBQU1XZWpkdFRFRkZUMlpTZEZkeFdsWnJkRlZoYnpaeVVRPT0iLCJjIjpbeyJuYW1lIjoiZmlyc3RuYW1lIiwidHlwZSI6InRleHQifSx7Im5hbWUiOiJsYXN0bmFtZSIsInR5cGUiOiJ0ZXh0In1dLCJmIjo1LCJpIjoiYWNjb3VudHMabcde12345", - "datarows": [ - [ - "Abbey", - "Karen" - ], - [ - "Chen", - "Ken" - ], - [ - "Ani", - "Jade" - ], - [ - "Peng", - "Hu" - ], - [ - "John", - "Doe" - ] - ] -} -``` - -The `cursor` context is automatically cleared on the last page. -To explicitly clear cursor context, use the `_plugins/_sql/close endpoint` operation. - -```console ->> curl -H 'Content-Type: application/json' -X POST localhost:9200/_plugins/_sql/close -d '{ - "cursor": "d:eyJhIjp7fSwicyI6IkRYRjFaWEo1UVc1a1JtVjBZMmdCQUFBQUFBQUFBQU1XZWpkdFRFRkZUMlpTZEZkeFdsWnJkRlZoYnpaeVVRPT0iLCJjIjpbeyJuYW1lIjoiZmlyc3RuYW1lIiwidHlwZSI6InRleHQifSx7Im5hbWUiOiJsYXN0bmFtZSIsInR5cGUiOiJ0ZXh0In1dLCJmIjo1LCJpIjoiYWNjb3VudHMiLCJsIjo5NTF9" -}' -``` - -#### Sample response - -```json -{"succeeded":true} -``` diff --git a/_search-plugins/sql/full-text.md b/_search-plugins/sql/full-text.md new file mode 100644 index 0000000000..459cd39105 --- /dev/null +++ b/_search-plugins/sql/full-text.md @@ -0,0 +1,490 @@ +--- +layout: default +title: Full-Text Search +parent: SQL and PPL +nav_order: 11 +--- + +# Full-text search + +Use SQL commands for full-text search. The SQL plugin supports a subset of full-text queries available in OpenSearch. + +To learn about full-text queries in OpenSearch, see [Full-text queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/). + +## Match + +Use the `MATCH` function to search documents that match a `string`, `number`, `date`, or `boolean` value for a given field. 
+ +### Syntax + +```sql +match(field_expression, query_expression[, option=<option_value>]*) +``` + +You can specify the following options in any order: + +- `analyzer` +- `auto_generate_synonyms_phrase` +- `fuzziness` +- `max_expansions` +- `prefix_length` +- `fuzzy_transpositions` +- `fuzzy_rewrite` +- `lenient` +- `operator` +- `minimum_should_match` +- `zero_terms_query` +- `boost` + +For parameter descriptions and supported values, see the `match` query [documentation]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/#match). + +### Example 1: Search the `message` field for the text "this is a test": + +```json +GET my_index/_search +{ + "query": { + "match": { + "message": "this is a test" + } + } +} +``` + +*SQL query:* +```sql +SELECT message FROM my_index WHERE match(message, "this is a test") +``` +*PPL query:* +```ppl +SOURCE=my_index | WHERE match(message, "this is a test") | FIELDS message +``` + +### Example 2: Search the `message` field with the `operator` parameter: + +```json +GET my_index/_search +{ + "query": { + "match": { + "message": { + "query": "this is a test", + "operator": "and" + } + } + } +} +``` + +*SQL query:* +```sql +SELECT message FROM my_index WHERE match(message, "this is a test", operator='and') +``` +*PPL query:* +```ppl +SOURCE=my_index | WHERE match(message, "this is a test", operator='and') | FIELDS message +``` + +### Example 3: Search the `message` field with the `operator` and `zero_terms_query` parameters: + +```json +GET my_index/_search +{ + "query": { + "match": { + "message": { + "query": "to be or not to be", + "operator": "and", + "zero_terms_query": "all" + } + } + } +} +``` + +*SQL query:* +```sql +SELECT message FROM my_index WHERE match(message, "to be or not to be", operator='and', zero_terms_query='all') +``` +*PPL query:* +```sql +SOURCE=my_index | WHERE match(message, "to be or not to be", operator='and', zero_terms_query='all') | FIELDS message +``` + +## Multi-match + +To search for text in 
multiple fields, 
use the `MULTI_MATCH` function. This function maps to the `multi_match` query in OpenSearch and returns the documents that match a provided text, number, date, or boolean value in a given field or fields. + +### Syntax + +The `MULTI_MATCH` function lets you *boost* certain fields using the **^** character. Boosts are multipliers that weigh matches in one field more heavily than matches in other fields. You can specify the fields in double quotes, in single quotes, in backticks, or unquoted. Use a star (``"*"``) to search all fields. The star symbol should be quoted. + +```sql +multi_match([field_expression+], query_expression[, option=<option_value>]*) +``` + +The weight is optional and is specified after the field name. It can be delimited by the caret character (`^`) or by white space, as shown in the following examples: + +```sql +multi_match(["Tags" ^ 2, 'Title' 3.4, `Body`, Comments ^ 0.3], ...) +multi_match(["*"], ...) +``` + +You can specify the following options for `MULTI_MATCH` in any order: + +- `analyzer` +- `auto_generate_synonyms_phrase` +- `cutoff_frequency` +- `fuzziness` +- `fuzzy_transpositions` +- `lenient` +- `max_expansions` +- `minimum_should_match` +- `operator` +- `prefix_length` +- `tie_breaker` +- `type` +- `slop` +- `zero_terms_query` +- `boost` + +For parameter descriptions and supported values, see the `multi_match` query [documentation]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/#multi-match). 
+ +### Example: REST API search for `Dale` in either the `firstname` or `lastname` field: + +```json +GET accounts/_search +{ + "query": { + "multi_match": { + "query": "Dale", + "fields": [ "*name" ] + } + } +} +``` +You can run the same search from *SQL* using the `multi_match` function: +```sql +SELECT firstname, lastname +FROM accounts +WHERE multi_match(['*name'], 'Dale') +``` +or from *PPL* using the `multi_match` function: +```sql +SOURCE=accounts | WHERE multi_match(['*name'], 'Dale') | fields firstname, lastname +``` + +| firstname | lastname +:--- | :--- +Dale | Adams + +## Query string + +To split text based on operators, use the `QUERY_STRING` function. The `QUERY_STRING` function supports logical connectives, wildcards, regular expressions, and proximity search. +This function maps to the `query_string` query in OpenSearch and returns the documents that match a provided text, number, date, or boolean value in a given field or fields. + +### Syntax + +The `QUERY_STRING` function has syntax similar to `MATCH_QUERY` and lets you *boost* certain fields using the **^** character. Boosts are multipliers that weigh matches in one field more heavily than matches in other fields. You can specify the fields in double quotes, in single quotes, in backticks, or unquoted. Use a star (``"*"``) to search all fields. The star symbol should be quoted. + +```sql +query_string([field_expression+], query_expression[, option=<option_value>]*) +``` + +The weight is optional and is specified after the field name. It can be delimited by the caret character (`^`) or by white space, as shown in the following examples: + +```sql +query_string(["Tags" ^ 2, 'Title' 3.4, `Body`, Comments ^ 0.3], ...) +query_string(["*"], ...) 
+``` + +You can specify the following options for `QUERY_STRING` in any order: + +- `analyzer` +- `allow_leading_wildcard` +- `analyze_wildcard` +- `auto_generate_synonyms_phrase_query` +- `boost` +- `default_operator` +- `enable_position_increments` +- `fuzziness` +- `fuzzy_rewrite` +- `escape` +- `fuzzy_max_expansions` +- `fuzzy_prefix_length` +- `fuzzy_transpositions` +- `lenient` +- `max_determinized_states` +- `minimum_should_match` +- `quote_analyzer` +- `phrase_slop` +- `quote_field_suffix` +- `rewrite` +- `type` +- `tie_breaker` +- `time_zone` + +For parameter descriptions and supported values, see the `query_string` query [documentation]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/#query-string). + +### Example of using `query_string` in SQL and PPL queries + +The REST API search request + +```json +GET accounts/_search +{ + "query": { + "query_string": { + "query": "Lane Street", + "fields": [ "address" ] + } + } +} +``` + +could be called from *SQL* + +```sql +SELECT account_number, address +FROM accounts +WHERE query_string(['address'], 'Lane Street', default_operator='OR') +``` + +or from *PPL* + +```sql +SOURCE=accounts | WHERE query_string(['address'], 'Lane Street', default_operator='OR') | fields account_number, address +``` + +| account_number | address +:--- | :--- +1 | 880 Holmes Lane +6 | 671 Bristol Street +13 | 789 Madison Street + +## Match phrase + +To search for exact phrases, use the `MATCHPHRASE` or `MATCH_PHRASE` function. 
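As a quick, hedged illustration of phrase matching, the query below looks for the words `Holmes Lane` in that order in the `address` field. It assumes the sample `accounts` index used on this page; the `slop=1` option is an illustrative addition that tolerates one position of movement between the terms.

```sql
SELECT account_number, address
FROM accounts
WHERE match_phrase(address, 'Holmes Lane', slop=1)
```

Unlike a plain match query, a phrase query does not match documents that contain only one of the words.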
+ +### Syntax + +```sql +matchphrasequery(field_expression, query_expression) +matchphrase(field_expression, query_expression[, option=]*) +match_phrase(field_expression, query_expression[, option=]*) +``` + +The `MATCHPHRASE`/`MATCH_PHRASE` functions let you specify the following options in any order: + +- `analyzer` +- `slop` +- `zero_terms_query` +- `boost` + +For parameter descriptions and supported values, see the `match_phrase` query [documentation]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/#match-phrase). + +### Example of using `match_phrase` in SQL and PPL queries + +The REST API search request +```json +GET accounts/_search +{ + "query": { + "match_phrase": { + "address": { + "query": "880 Holmes Lane" + } + } + } +} +``` +could be called from *SQL* +```sql +SELECT account_number, address +FROM accounts +WHERE match_phrase(address, '880 Holmes Lane') +``` +or *PPL* +```sql +SOURCE=accounts | WHERE match_phrase(address, '880 Holmes Lane') | FIELDS account_number, address +``` + +| account_number | address +:--- | :--- +1 | 880 Holmes Lane + + +## Simple query string + +The `simple_query_string` function maps to the `simple_query_string` query in OpenSearch. It returns the documents that match a provided text, number, date, or boolean value with a given field or fields. +The **^** character lets you *boost* certain fields. Boosts are multipliers that weigh matches in one field more heavily than matches in other fields. + +### Syntax + +The syntax lets you specify the fields in double quotes, single quotes, surrounded by backticks, or unquoted. Use a star (``"*"``) to search all fields. The star symbol must be quoted. + +```sql +simple_query_string([field_expression+], query_expression[, option=]*) +``` + +The weight is optional and is specified after the field name. It can be delimited by the caret character (`^`) or by white space. 
See the following examples: + +```sql +simple_query_string(["Tags" ^ 2, 'Title' 3.4, `Body`, Comments ^ 0.3], ...) +simple_query_string(["*"], ...) +``` + +You can specify the following options for `SIMPLE_QUERY_STRING` in any order: + +- `analyze_wildcard` +- `analyzer` +- `auto_generate_synonyms_phrase_query` +- `boost` +- `default_operator` +- `flags` +- `fuzzy_max_expansions` +- `fuzzy_prefix_length` +- `fuzzy_transpositions` +- `lenient` +- `minimum_should_match` +- `quote_field_suffix` + +For parameter descriptions and supported values, see the `simple_query_string` query [documentation]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/#simple-query-string). + +### Example of using `simple_query_string` in SQL and PPL queries + +The REST API search request +```json +GET accounts/_search +{ + "query": { + "simple_query_string": { + "query": "Lane Street", + "fields": [ "address" ] + } + } +} +``` +could be called from *SQL* +```sql +SELECT account_number, address +FROM accounts +WHERE simple_query_string(['address'], 'Lane Street', default_operator='OR') +``` +or from *PPL* +```sql +SOURCE=accounts | WHERE simple_query_string(['address'], 'Lane Street', default_operator='OR') | fields account_number, address +``` + +| account_number | address +:--- | :--- +1 | 880 Holmes Lane +6 | 671 Bristol Street +13 | 789 Madison Street + +## Match phrase prefix + +To search for phrases by a given prefix, use the `MATCH_PHRASE_PREFIX` function, which makes a prefix query out of the last term in the query string. 
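To make the "last term as a prefix" behavior concrete, here is a small sketch. It assumes a hypothetical `books` index with `author` and `title` text fields; the `max_expansions=10` option is an illustrative choice that caps how many indexed terms the trailing prefix may expand to.

```sql
SELECT author, title
FROM books
WHERE match_phrase_prefix(author, 'Alan Alex', max_expansions=10)
```

Here `Alan` must match as a whole term, while `Alex` is treated as the beginning of a term, so an author value such as `Alan Alexander Milne` would be a candidate match.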
+ +### Syntax + +```sql +match_phrase_prefix(field_expression, query_expression[, option=]*) +``` + +The `MATCH_PHRASE_PREFIX` function lets you specify the following options in any order: + +- `analyzer` +- `slop` +- `max_expansions` +- `zero_terms_query` +- `boost` + +For parameter descriptions and supported values, see the `match_phrase_prefix` query [documentation]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/#match-phrase-prefix). + +### Example of using `match_phrase_prefix` in SQL and PPL queries + +The REST API search request +```json +GET books/_search +{ + "query": { + "match_phrase_prefix": { + "author": { + "query": "Alexander Mil" + } + } + } +} +``` +could be called from *SQL* +```sql +SELECT author, title +FROM books +WHERE match_phrase_prefix(author, 'Alexander Mil') +``` +or *PPL* +```sql +source=books | where match_phrase_prefix(author, 'Alexander Mil') | fields author, title +``` + +| author | title +:--- | :--- +Alan Alexander Milne | The House at Pooh Corner +Alan Alexander Milne | Winnie-the-Pooh + + +## Match boolean prefix + +Use the `match_bool_prefix` function to search for documents in which a given field matches the query terms, with the last term treated as a prefix. + +### Syntax + +```sql +match_bool_prefix(field_expression, query_expression[, option=]*) +``` + +The `MATCH_BOOL_PREFIX` function lets you specify the following options in any order: + +- `minimum_should_match` +- `fuzziness` +- `prefix_length` +- `max_expansions` +- `fuzzy_transpositions` +- `fuzzy_rewrite` +- `boost` +- `analyzer` +- `operator` + +For parameter descriptions and supported values, see the `match_bool_prefix` query [documentation]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/#match-boolean-prefix). 
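The following sketch shows how one of the listed options might be passed. It assumes the sample `accounts` index used on this page; `minimum_should_match=2` is an illustrative value that asks for both the full term and the trailing prefix to match.

```sql
SELECT firstname, address
FROM accounts
WHERE match_bool_prefix(address, 'Bristol Stre', minimum_should_match=2)
```

Without the option, a document matching only one of the two clauses can still be returned, so adding it is one way to narrow the result set.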
+ +### Example of using `match_bool_prefix` in SQL and PPL queries: + +The REST API search request +```json +GET accounts/_search +{ + "query": { + "match_bool_prefix": { + "address": { + "query": "Bristol Stre" + } + } + } +} +``` +could be called from *SQL* +```sql +SELECT firstname, address +FROM accounts +WHERE match_bool_prefix(address, 'Bristol Stre') +``` +or *PPL* +```sql +source=accounts | where match_bool_prefix(address, 'Bristol Stre') | fields firstname, address +``` + +| firstname | address +:--- | :--- +Hattie | 671 Bristol Street +Nanette | 789 Madison Street diff --git a/_search-plugins/sql/functions.md b/_search-plugins/sql/functions.md index 215300fba3..6604ecdd21 100644 --- a/_search-plugins/sql/functions.md +++ b/_search-plugins/sql/functions.md @@ -1,7 +1,7 @@ --- layout: default title: Functions -parent: SQL +parent: SQL and PPL nav_order: 10 --- @@ -10,9 +10,9 @@ nav_order: 10 You must enable fielddata in the document mapping for most string functions to work properly. The specification shows the return type of the function with a generic type `T` as the argument. -For example, `abs(number T) -> T` means that the function `abs` accepts a numerical argument of type `T`, which could be any sub-type of the `number` type, and it returns the actual type of `T` as the return type. +For example, `abs(number T) -> T` means that the function `abs` accepts a numerical argument of type `T`, which could be any subtype of the `number` type, and it returns the actual type of `T` as the return type. -The SQL plugin supports the following functions. +The SQL plugin supports the following common functions shared across the SQL and PPL languages. 
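To illustrate the `abs(number T) -> T` notation and the idea of a function shared by both languages, the sketch below applies `abs` to a numeric field, first in SQL and then in PPL. It assumes the sample `accounts` index with a numeric `balance` field; the `abs_balance` alias is a hypothetical name chosen for this example.

```sql
SELECT abs(balance) AS abs_balance FROM accounts LIMIT 1
```

```sql
source=accounts | eval abs_balance = abs(balance) | fields abs_balance | head 1
```

Because the argument type is carried through to the result, a `long` field such as `balance` would be expected to produce a `long` value.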
## Mathematical @@ -131,3 +131,7 @@ Function | Specification | Example if | `if(boolean, es_type, es_type) -> es_type` | `SELECT if(false, 0, 1) FROM my-index LIMIT 1`, `SELECT if(true, 0, 1) FROM my-index LIMIT 1` ifnull | `ifnull(es_type, es_type) -> es_type` | `SELECT ifnull('hello', 1) FROM my-index LIMIT 1`, `SELECT ifnull(null, 1) FROM my-index LIMIT 1` isnull | `isnull(es_type) -> integer` | `SELECT isnull(null) FROM my-index LIMIT 1`, `SELECT isnull(1) FROM my-index LIMIT 1` + +## Relevance-based search (full-text search) + +These functions are only available in the `WHERE` clause. For their descriptions and usage examples in SQL and PPL, see [Full-text search]({{site.url}}{{site.baseurl}}/search-plugins/sql/full-text/). diff --git a/_observability-plugin/ppl/identifiers.md b/_search-plugins/sql/identifiers.md similarity index 95% rename from _observability-plugin/ppl/identifiers.md rename to _search-plugins/sql/identifiers.md index aac5b744de..4e214a34a9 100644 --- a/_observability-plugin/ppl/identifiers.md +++ b/_search-plugins/sql/identifiers.md @@ -1,8 +1,8 @@ --- layout: default title: Identifiers -parent: Piped processing language -nav_order: 7 +parent: SQL and PPL +nav_order: 6 --- @@ -28,7 +28,7 @@ For regular identifiers, you can use the name without any back tick or escape ch In this example, `source`, `fields`, `account_number`, `firstname`, and `lastname` are all identifiers. Out of these, the `source` field is a reserved identifier. 
```sql -source=accounts | fields account_number, firstname, lastname; +SELECT account_number, firstname, lastname FROM accounts; ``` | account_number | firstname | lastname | diff --git a/_search-plugins/sql/index.md b/_search-plugins/sql/index.md index a3c1ee95bd..852ab65578 100644 --- a/_search-plugins/sql/index.md +++ b/_search-plugins/sql/index.md @@ -1,6 +1,6 @@ --- layout: default -title: SQL +title: SQL and PPL nav_order: 38 has_children: true has_toc: false @@ -8,69 +8,10 @@ redirect_from: - /search-plugins/sql/ --- -# SQL +# SQL and PPL OpenSearch SQL lets you write queries in SQL rather than the [OpenSearch query domain-specific language (DSL)]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text). If you're already familiar with SQL and don't want to learn the query DSL, this feature is a great option. - -## Workbench - -The easiest way to get familiar with the SQL plugin is to use **Query Workbench** in OpenSearch Dashboards to test various queries. To learn more, see [Workbench]({{site.url}}{{site.baseurl}}/search-plugins/sql/workbench/). 
- -![OpenSearch Dashboards SQL UI plugin]({{site.url}}{{site.baseurl}}/images/sql.png) - - -## REST API - -To use the SQL plugin with your own applications, send requests to `_plugins/_sql`: - -```json -POST _plugins/_sql -{ - "query": "SELECT * FROM my-index LIMIT 50" -} -``` - -Here’s how core SQL concepts map to OpenSearch: - -SQL | OpenSearch -:--- | :--- -Table | Index -Row | Document -Column | Field - -You can query multiple indices by listing them or using wildcards: - -```json -POST _plugins/_sql -{ - "query": "SELECT * FROM my-index1,myindex2,myindex3 LIMIT 50" -} - -POST _plugins/_sql -{ - "query": "SELECT * FROM my-index* LIMIT 50" -} -``` - -For a sample [curl](https://curl.haxx.se/) command, try: - -```bash -curl -XPOST https://localhost:9200/_plugins/_sql -u 'admin:admin' -k -H 'Content-Type: application/json' -d '{"query": "SELECT * FROM opensearch_dashboards_sample_data_flights LIMIT 10"}' -``` - -By default, queries return data in JDBC format, but you can also return data in standard OpenSearch JSON, CSV, or raw formats: - -```json -POST _plugins/_sql?format=json|csv|raw -{ - "query": "SELECT * FROM my-index LIMIT 50" -} -``` - -See the rest of this guide for detailed information on request parameters, settings, supported operations, tools, and more. - - ## Contributing To get involved and help us improve the SQL plugin, see the [development guide](https://github.com/opensearch-project/sql/blob/main/DEVELOPER_GUIDE.rst) for instructions on setting up your development environment and building the project. 
diff --git a/_search-plugins/sql/limitation.md b/_search-plugins/sql/limitation.md index 8a3e8a825b..74e4119aa4 100644 --- a/_search-plugins/sql/limitation.md +++ b/_search-plugins/sql/limitation.md @@ -1,61 +1,17 @@ --- layout: default title: Limitations -parent: SQL -nav_order: 18 +parent: SQL and PPL +nav_order: 99 --- # Limitations The SQL plugin has the following limitations: -## SELECT FROM WHERE - -### Select literal is not supported - -The select literal expression is not supported. For example, `Select 1` is not supported. - - -### Where clause does not support arithmetic operations - -The `WHERE` clause does not support expressions. For example, `SELECT FlightNum FROM opensearch_dashboards_sample_data_flights where (AvgTicketPrice + 100) <= 1000` is not supported. - - -### Aggregation over expression is not supported - -You can only apply aggregation on fields, aggregations can't accept an expression as a parameter. For example, `avg(log(age))` is not supported. - - -### Conflict type in multiple index query - -Queries using wildcard index fail if the index has the field with a conflict type. -For example, if you have two indices with field `a`: - -``` -POST conflict_index_1/_doc/ -{ - "a": { - "b": 1 - } -} - -POST conflict_index_2/_doc/ -{ - "a": { - "b": 1, - "c": 2 - } -} -``` - -Then, the query fails because of the field mapping conflict. The query `SELECT * FROM conflict_index*` also fails for the same reason. 
- -```sql -Error occurred in OpenSearch engine: Different mappings are not allowed for the same field[a]: found [{properties:{b:{type:long},c:{type:long}}}] and [{properties:{b:{type:long}}}] ", - "details": "com.amazon.opensearch.sql.rewriter.matchtoterm.VerificationException: Different mappings are not allowed for the same field[a]: found [{properties:{b:{type:long},c:{type:long}}}] and [{properties:{b:{type:long}}}] \nFor more details, please send request for Json format to see the raw response from opensearch engine.", - "type": "VerificationException -``` +## Aggregation over expression is not supported +You can only apply aggregation to fields. Aggregations cannot accept an expression as a parameter. For example, `avg(log(age))` is not supported. ## Subquery in the FROM clause @@ -76,10 +32,10 @@ But, if the outer query has `GROUP BY` or `ORDER BY`, then it's not supported. The `join` query does not support aggregations on the joined result. For example, e.g. `SELECT depo.name, avg(empo.age) FROM empo JOIN depo WHERE empo.id == depo.id GROUP BY depo.name` is not supported. - ## Pagination only supports basic queries The pagination query enables you to get back paginated responses. + Currently, the pagination only supports basic queries. For example, the following query returns the data with cursor id. ```json @@ -116,3 +72,23 @@ The response in JDBC format with cursor id. ``` The query with `aggregation` and `join` does not support pagination for now. + +## Query processing engines + +The SQL plugin has two query processing engines, `V1` and `V2`. Most of the features are supported by both engines, but only the new engine is actively being developed. A query that is first executed on the `V2` engine falls back to the `V1` engine in case of failure. If a query is supported in `V2` but not included in `V1`, the query will fail with an error response. + +### V1 engine limitations + +* The select literal expression without `FROM` clause is not supported. 
For example, `SELECT 1` is not supported. +* The `WHERE` clause does not support expressions. For example, `SELECT FlightNum FROM opensearch_dashboards_sample_data_flights where (AvgTicketPrice + 100) <= 1000` is not supported. +* Most [relevancy search functions]({{site.url}}{{site.baseurl}}/search-plugins/sql/full-text/) are implemented in the `V2` engine only. + +Such queries are successfully executed by the `V2` engine unless they have `V1`-specific functions. You will likely never meet these limitations. + +### V2 engine limitations + +* The [cursor feature](#pagination-only-supports-basic-queries) is supported by the `V1` engine only. +For support of `cursor`/`pagination` in the `V2` engine, track [GitHub issue #656](https://github.com/opensearch-project/sql/issues/656). +* The `V2` engine does not track query execution time, so slow queries are not reported. +* The `V2` query engine not only runs queries in the OpenSearch engine but also supports post-processing for complicated queries. Accordingly, the explain output is no longer pure OpenSearch domain-specific language (DSL) but also includes query plan information from the `V2` query engine. +* The `V2` engine does not support [`SCORE_QUERY`]({{site.url}}{{site.baseurl}}/search-plugins/sql/sql/functions#score-query) and [`WILDCARD_QUERY`]({{site.url}}{{site.baseurl}}/search-plugins/sql/sql/functions#wildcard-query) functions. 
diff --git a/_search-plugins/sql/monitoring.md b/_search-plugins/sql/monitoring.md index b8f6478acc..9c4d1b8049 100644 --- a/_search-plugins/sql/monitoring.md +++ b/_search-plugins/sql/monitoring.md @@ -1,8 +1,8 @@ --- layout: default title: Monitoring -parent: SQL -nav_order: 15 +parent: SQL and PPL +nav_order: 95 --- # Monitoring diff --git a/_observability-plugin/ppl/commands.md b/_search-plugins/sql/ppl/functions.md similarity index 74% rename from _observability-plugin/ppl/commands.md rename to _search-plugins/sql/ppl/functions.md index 98e05887ed..9ae2e6cf6e 100644 --- a/_observability-plugin/ppl/commands.md +++ b/_search-plugins/sql/ppl/functions.md @@ -1,69 +1,14 @@ --- layout: default title: Commands -parent: Piped processing language -nav_order: 4 +parent: PPL - Piped Processing Language +grand_parent: SQL and PPL +nav_order: 2 --- - # Commands -Start a PPL query with a `search` command to reference a table to search from. You can have the commands that follow in any order. - -In the following example, the `search` command refers to an `accounts` index as the source, then uses `fields` and `where` commands for the conditions: - -```sql -search source=accounts -| where age > 18 -| fields firstname, lastname -``` - -In the below examples, we represent required arguments in angle brackets `< >` and optional arguments in square brackets `[ ]`. -{: .note } - -## search - -Use the `search` command to retrieve a document from an index. You can only use the `search` command as the first command in the PPL query. - -### Syntax - -```sql -search source= [boolean-expression] -``` - -Field | Description | Required -:--- | :--- |:--- -`search` | Specify search keywords. | Yes -`index` | Specify which index to query from. | No -`bool-expression` | Specify an expression that evaluates to a boolean value. 
| No - -*Example 1*: Get all documents - -To get all documents from the `accounts` index: - -```sql -search source=accounts; -``` - -| account_number | firstname | address | balance | gender | city | employer | state | age | email | lastname | -:--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- -| 1 | Amber | 880 Holmes Lane | 39225 | M | Brogan | Pyrami | IL | 32 | amberduke@pyrami.com | Duke -| 6 | Hattie | 671 Bristol Street | 5686 | M | Dante | Netagy | TN | 36 | hattiebond@netagy.com | Bond -| 13 | Nanette | 789 Madison Street | 32838 | F | Nogal | Quility | VA | 28 | null | Bates -| 18 | Dale | 467 Hutchinson Court | 4180 | M | Orick | null | MD | 33 | daleadams@boink.com | Adams - -*Example 2*: Get documents that match a condition - -To get all documents from the `accounts` index that have either `account_number` equal to 1 or have `gender` as `F`: - -```sql -search source=accounts account_number=1 or gender=\"F\"; -``` - -| account_number | firstname | address | balance | gender | city | employer | state | age | email | lastname | -:--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- -| 1 | Amber | 880 Holmes Lane | 39225 | M | Brogan | Pyrami | IL | 32 | amberduke@pyrami.com | Duke | -| 13 | Nanette | 789 Madison Street | 32838 | F | Nogal | Quility | VA | 28 | null | Bates | +`PPL` supports all [`SQL` common]({{site.url}}{{site.baseurl}}/search-plugins/sql/functions/) functions, including [relevance search]({{site.url}}{{site.baseurl}}/search-plugins/sql/full-text/), but also introduces few more functions (called `commands`) which are available in `PPL` only. ## dedup @@ -82,7 +27,7 @@ Field | Description | Type | Required | Default `consecutive` | If true, remove only consecutive events with duplicate combinations of values. | `Boolean` | No | False `field-list` | Specify a comma-delimited field list. At least one field is required. 
| `String` or comma-separated list of strings | Yes | - -*Example 1*: Dedup by one field +**Example 1: Dedup by one field** To remove duplicate documents with the same gender: @@ -96,7 +41,7 @@ search source=accounts | dedup gender | fields account_number, gender; 13 | F -*Example 2*: Keep two duplicate documents +**Example 2: Keep two duplicate documents** To keep two duplicate documents with the same gender: @@ -110,7 +55,7 @@ search source=accounts | dedup 2 gender | fields account_number, gender; 6 | M 13 | F -*Example 3*: Keep or ignore an empty field by default +**Example 3: Keep or ignore an empty field by default** To keep two duplicate documents with a `null` field value: @@ -137,7 +82,7 @@ search source=accounts | dedup email | fields account_number, email; 6 | hattiebond@netagy.com 18 | daleadams@boink.com -*Example 4*: Dedup of consecutive documents +**Example 4: Dedup of consecutive documents** To remove duplicates of consecutive documents: @@ -170,7 +115,7 @@ Field | Description | Required `field` | If a field name does not exist, a new field is added. If the field name already exists, it's overwritten. | Yes `expression` | Specify any supported expression. | Yes -*Example 1*: Create a new field +**Example 1: Create a new field** To create a new `doubleAge` field for each document. `doubleAge` is the result of `age` multiplied by 2: @@ -200,7 +145,7 @@ search source=accounts | eval age = age + 1 | fields age; | 29 | 34 -*Example 3*: Create a new field with a field defined with the `eval` command +**Example 3: Create a new field with a field defined with the `eval` command** To create a new field `ddAge`. `ddAge` is the result of `doubleAge` multiplied by 2, where `doubleAge` is defined in the `eval` command: @@ -235,7 +180,7 @@ Field | Description | Required | Default `index` | Plus (+) keeps only fields specified in the field list. Minus (-) removes all fields specified in the field list. 
| No | + `field list` | Specify a comma-delimited list of fields. | Yes | No default -*Example 1*: Select specified fields from result +**Example 1: Select specified fields from result** To get `account_number`, `firstname`, and `lastname` fields from a search result: @@ -250,7 +195,7 @@ search source=accounts | fields account_number, firstname, lastname; | 13 | Nanette | Bates | 18 | Dale | Adams -*Example 2*: Remove specified fields from a search result +**Example 2: Remove specified fields from a search result** To remove the `account_number` field from the search results: @@ -283,7 +228,7 @@ regular-expression | The regular expression used to extract new fields from the The regular expression is used to match the whole text field of each document with Java regex engine. Each named capture group in the expression will become a new ``STRING`` field. -*Example 1*: Create new field +**Example 1: Create new field** The example shows how to create new field `host` for each document. `host` will be the hostname after `@` in `email` field. Parsing a null field will return an empty string. @@ -315,7 +260,7 @@ fetched rows / total rows = 4/4 | Madison Street | Hutchinson Court -*Example 3*: Filter and sort be casted parsed field +**Example 3: Filter and sort be casted parsed field** The example shows how to sort street numbers that are higher than 500 in address field. @@ -354,7 +299,7 @@ Field | Description | Required `source-field` | The name of the field that you want to rename. | Yes `target-field` | The name you want to rename to. 
| Yes -*Example 1*: Rename one field +**Example 1: Rename one field** Rename the `account_number` field as `an`: @@ -369,7 +314,7 @@ search source=accounts | rename account_number as an | fields an; | 13 | 18 -*Example 2*: Rename multiple fields +**Example 2: Rename multiple fields** Rename the `account_number` field as `an` and `employer` as `emp`: @@ -404,7 +349,7 @@ Field | Description | Required | Default `[+|-]` | Use plus [+] to sort by ascending order and minus [-] to sort by descending order. | No | Ascending order `sort-field` | Specify the field that you want to sort by. | Yes | - -*Example 1*: Sort by one field +**Example 1: Sort by one field** To sort all documents by the `age` field in ascending order: @@ -419,7 +364,7 @@ search source=accounts | sort age | fields account_number, age; | 18 | 33 | 6 | 36 -*Example 2*: Sort by one field and return all results +**Example 2: Sort by one field and return all results** To sort all documents by the `age` field in ascending order and specify count as 0 to get back all results: @@ -434,7 +379,7 @@ search source=accounts | sort 0 age | fields account_number, age; | 18 | 33 | 6 | 36 -*Example 3*: Sort by one field in descending order +**Example 3: Sort by one field in descending order** To sort all documents by the `age` field in descending order: @@ -449,7 +394,7 @@ search source=accounts | sort - age | fields account_number, age; | 1 | 32 | 13 | 28 -*Example 4*: Specify the number of sorted documents to return +**Example 4: Specify the number of sorted documents to return** To sort all documents by the `age` field in ascending order and specify count as 2 to get back two results: @@ -462,7 +407,7 @@ search source=accounts | sort 2 age | fields account_number, age; | 13 | 28 | 1 | 32 -*Example 5*: Sort by multiple fields +**Example 5: Sort by multiple fields** To sort all documents by the `gender` field in ascending order and `age` field in descending order: @@ -503,7 +448,7 @@ Field | Description | Required | 
Default `aggregation` | Specify a statistical aggregation function. The argument of this function must be a field. | Yes | 1000 `by-clause` | Specify one or more fields to group the results by. If not specified, the `stats` command returns only one row, which is the aggregation over the entire result set. | No | - -*Example 1*: Calculate the average value of a field +**Example 1: Calculate the average value of a field** To calculate the average `age` of all documents: @@ -515,7 +460,7 @@ search source=accounts | stats avg(age); :--- | | 32.25 -*Example 2*: Calculate the average value of a field by group +**Example 2: Calculate the average value of a field by group** To calculate the average age grouped by gender: @@ -528,7 +473,7 @@ search source=accounts | stats avg(age) by gender; | F | 28.0 | M | 33.666666666666664 -*Example 3*: Calculate the average and sum of a field by group +**Example 3: Calculate the average and sum of a field by group** To calculate the average and sum of age grouped by gender: @@ -541,7 +486,7 @@ search source=accounts | stats avg(age), sum(age) by gender; | F | 28 | 28 | M | 33.666666666666664 | 101 -*Example 4*: Calculate the maximum value of a field +**Example 4: Calculate the maximum value of a field** To calculate the maximum age: @@ -553,7 +498,7 @@ search source=accounts | stats max(age); :--- | | 36 -*Example 5*: Calculate the maximum and minimum value of a field by group +**Example 5: Calculate the maximum and minimum value of a field by group** To calculate the maximum and minimum age values grouped by gender: @@ -580,7 +525,7 @@ Field | Description | Required :--- | :--- |:--- `bool-expression` | An expression that evaluates to a boolean value. 
| No -*Example 1*: Filter result set with a condition +**Example: Filter result set with a condition** To get all documents from the `accounts` index where `account_number` is 1 or gender is `F`: @@ -607,7 +552,7 @@ Field | Description | Required | Default :--- | :--- |:--- `N` | Specify the number of results to return. | No | 10 -*Example 1*: Get the first 10 results +**Example 1: Get the first 10 results** To get the first 10 results: @@ -621,7 +566,7 @@ search source=accounts | fields firstname, age | head; | Hattie | 36 | Nanette | 28 -*Example 2*: Get the first N results +**Example 2: Get the first N results** To get the first two results: @@ -654,7 +599,7 @@ Field | Description | Required `field-list` | Specify a comma-delimited list of field names. | No `by-clause` | Specify one or more fields to group the results by. | No -*Example 1*: Find the least common values in a field +**Example 1: Find the least common values in a field** To find the least common values of gender: @@ -667,7 +612,7 @@ search source=accounts | rare gender; | F | M -*Example 2*: Find the least common values grouped by gender +**Example 2: Find the least common values grouped by gender** To find the least common age grouped by gender: @@ -701,7 +646,7 @@ Field | Description | Default `field-list` | Specify a comma-delimited list of field names. | - `by-clause` | Specify one or more fields to group the results by. 
| - -*Example 1*: Find the most common values in a field +**Example 1: Find the most common values in a field** To find the most common genders: @@ -714,7 +659,7 @@ search source=accounts | top gender; | M | F -*Example 2*: Find the most common value in a field +**Example 2: Find the most common value in a field** To find the most common gender: @@ -726,7 +671,7 @@ search source=accounts | top 1 gender; :--- | | M -*Example 2*: Find the most common values grouped by gender +**Example 3: Find the most common values grouped by gender** To find the most common age grouped by gender: @@ -743,100 +688,11 @@ search source=accounts | top 1 age by gender; The `top` command is not rewritten to OpenSearch DSL, it is only executed on the coordination node. -## match - -Use the `match` command to search documents that match a `string`, `number`, `date`, or `boolean` value for a given field. - -### Syntax - -```sql -match(field_expression, query_expression[, option=]*) -``` - -You can specify the following options: - -- `analyzer` -- `auto_generate_synonyms_phrase` -- `fuzziness` -- `max_expansions` -- `prefix_length` -- `fuzzy_transpositions` -- `fuzzy_rewrite` -- `lenient` -- `operator` -- `minimum_should_match` -- `zero_terms_query` -- `boost` - -*Example 1*: Search the `message` field: - -```json -GET my_index/_search -{ - "query": { - "match": { - "message": "this is a test" - } - } -} -``` - -PPL query: - -```sql -search source=my_index | match field=message query="this is a test" -``` - -*Example 2*: Search the `message` field with the `operator` parameter: - -```json -GET my_index/_search -{ - "query": { - "match": { - "message": { - "query": "this is a test", - "operator": "and" - } - } - } -} -``` - -PPL query: - -```sql -search source=my_index | match field=message query="this is a test" operator=and -``` - -*Example 3*: Search the `message` field with the `operator` and `zero_terms_query` parameters: - -```json -GET my_index/_search -{ - "query": { - "match": { - 
"message": { - "query": "to be or not to be", - "operator": "and", - "zero_terms_query": "all" - } - } - } -} -``` - -PPL query: - -```ppl -search source=my_index | where match(message, "this is a test", operator=and, zero_terms_query=all) -``` - ## ad -The `ad` command applies the Random Cut Forest (RCF) algorithm in the ML Commons plugin on the search result returned by a PPL command. Based on the input, the plugin uses two types of RCF algorithms: fixed in time RCF for processing time-series data and batch RCF for processing non-time-series data. +The `ad` command applies the Random Cut Forest (RCF) algorithm in the [ML Commons plugin]({{site.url}}{{site.baseurl}}/ml-commons-plugin/index/) on the search result returned by a PPL command. Based on the input, the plugin uses two types of RCF algorithms: fixed in time RCF for processing time-series data and batch RCF for processing non-time-series data. -### Fixed In Time RCF For Time-series Data Command Syntax +### Syntax: Fixed In Time RCF For Time-series Data Command ```sql ad @@ -848,7 +704,7 @@ Field | Description | Required `time_decay` | Specifies how much of the recent past to consider when computing an anomaly score. The default value is 0.001. | No `time_field` | Specifies the time filed for RCF to use as time-series data. Must be either a long value, such as the timestamp in miliseconds, or a string value in "yyyy-MM-dd HH:mm:ss".| Yes -### Batch RCF for Non-time-series Data Command Syntax +### Syntax: Batch RCF for Non-time-series Data Command ```sql ad @@ -859,7 +715,7 @@ Field | Description | Required `shingle_size` | A consecutive sequence of the most recent records. The default value is 8. | No `time_decay` | Specifies how much of the recent past to consider when computing an anomaly score. The default value is 0.001. 
| No -*Example 1*: Detecting events in New York City from taxi ridership data with time-series data +**Example 1: Detecting events in New York City from taxi ridership data with time-series data** The example trains an RCF model and uses the model to detect anomalies in the time-series ridership data. @@ -873,7 +729,7 @@ value | timestamp | score | anomaly_grade :--- | :--- |:--- | :--- 10844.0 | 1404172800000 | 0.0 | 0.0 -*Example 2*: Detecting events in New York City from taxi ridership data with non-time-series data +**Example 2: Detecting events in New York City from taxi ridership data with non-time-series data** PPL query: @@ -889,7 +745,7 @@ value | score | anomalous The kmeans command applies the ML Commons plugin's kmeans algorithm to the provided PPL command's search results. -## Syntax +### Syntax ```sql kmeans @@ -897,7 +753,7 @@ kmeans For `cluster-number`, enter the number of clusters you want to group your data points into. -*Example* +**Example: Group Iris data** The example shows how to classify three Iris species (Iris setosa, Iris virginica, and Iris versicolor) based on the combination of four features measured from each sample: the length and the width of the sepals and petals. 
diff --git a/_observability-plugin/ppl/index.md b/_search-plugins/sql/ppl/index.md similarity index 91% rename from _observability-plugin/ppl/index.md rename to _search-plugins/sql/ppl/index.md index e3c9724a2c..de6ee0f84e 100644 --- a/_observability-plugin/ppl/index.md +++ b/_search-plugins/sql/ppl/index.md @@ -1,14 +1,17 @@ --- layout: default -title: Piped processing language -nav_order: 40 +title: PPL – Piped Processing Language +parent: SQL and PPL +nav_order: 5 has_children: true has_toc: false redirect_from: - - /search-plugins/ppl/ + - /search-plugins/sql/ppl + - /search-plugins/ppl + - /observability-plugin/ppl --- -# Piped Processing Language +# PPL – Piped Processing Language Piped Processing Language (PPL) is a query language that lets you use pipe (`|`) syntax to explore, discover, and query data stored in OpenSearch. @@ -42,7 +45,7 @@ Go to **Query Workbench** and select **PPL**. The following example returns `firstname` and `lastname` fields for documents in an `accounts` index with `age` greater than 18: -```json +```sql search source=accounts | where age > 18 | fields firstname, lastname diff --git a/_search-plugins/sql/ppl/syntax.md b/_search-plugins/sql/ppl/syntax.md new file mode 100644 index 0000000000..9ccd701fb5 --- /dev/null +++ b/_search-plugins/sql/ppl/syntax.md @@ -0,0 +1,71 @@ +--- +layout: default +title: Syntax +parent: PPL – Piped Processing Language +grand_parent: SQL and PPL +nav_order: 1 +--- + +# PPL syntax + +Every PPL query starts with the `search` command. It specifies the index to search and retrieve documents from. Subsequent commands can follow in any order. + +Currently, `PPL` supports only one `search` command, which can be omitted to simplify the query. +{: .note } + +## Syntax + +```sql +search source=<index> [boolean-expression] +source=<index> [boolean-expression] +``` + +Field | Description | Required +:--- | :--- |:--- +`search` | Specifies search keywords. | Yes +`index` | Specifies which index to query from. 
| No +`boolean-expression` | Specifies an expression that evaluates to a Boolean value. | No + +## Examples + +**Example 1: Search through the accounts index** + +In the following example, the `search` command refers to an `accounts` index as the source and uses `fields` and `where` commands for the conditions: + +```sql +search source=accounts +| where age > 18 +| fields firstname, lastname +``` + +In the following examples, angle brackets `< >` enclose required arguments and square brackets `[ ]` enclose optional arguments. +{: .note } + + +**Example 2: Get all documents** + +To get all documents from the `accounts` index, specify it as the `source`: + +```sql +search source=accounts; +``` + +| account_number | firstname | address | balance | gender | city | employer | state | age | email | lastname | +:--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- +| 1 | Amber | 880 Holmes Lane | 39225 | M | Brogan | Pyrami | IL | 32 | amberduke@pyrami.com | Duke +| 6 | Hattie | 671 Bristol Street | 5686 | M | Dante | Netagy | TN | 36 | hattiebond@netagy.com | Bond +| 13 | Nanette | 789 Madison Street | 32838 | F | Nogal | Quility | VA | 28 | null | Bates +| 18 | Dale | 467 Hutchinson Court | 4180 | M | Orick | null | MD | 33 | daleadams@boink.com | Adams + +**Example 3: Get documents that match a condition** + +To get all documents from the `accounts` index that either have `account_number` equal to 1 or have `gender` as `F`, use the following query: + +```sql +search source=accounts account_number=1 or gender="F"; +``` + +| account_number | firstname | address | balance | gender | city | employer | state | age | email | lastname | +:--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- +| 1 | Amber | 880 Holmes Lane | 39225 | M | Brogan | Pyrami | IL | 32 | amberduke@pyrami.com | Duke | +| 13 | Nanette | 789 Madison Street | 32838 | F | Nogal | Quility | VA | 28 | null | Bates | diff --git a/_search-plugins/sql/protocol.md 
b/_search-plugins/sql/protocol.md deleted file mode 100644 index 1c62308513..0000000000 --- a/_search-plugins/sql/protocol.md +++ /dev/null @@ -1,331 +0,0 @@ ---- -layout: default -title: Protocol -parent: SQL -nav_order: 14 ---- - -# Protocol - -For the protocol, SQL plugin provides multiple response formats for -different purposes while the request format is same for all. Among them -JDBC format is widely used because it provides schema information and -more functionality such as pagination. Besides JDBC driver, various -clients can benefit from the detailed and well formatted response. - -## Request Format - -### Description - -The body of HTTP POST request can take a few more other fields with SQL -query. - -### Example 1 - -Use `filter` to add more conditions to -OpenSearch DSL directly. - -SQL query: - -```console ->> curl -H 'Content-Type: application/json' -X POST localhost:9200/_plugins/_sql -d '{ - "query" : "SELECT firstname, lastname, balance FROM accounts", - "filter" : { - "range" : { - "balance" : { - "lt" : 10000 - } - } - } -}' -``` - -Explain: - -```json -{ - "from": 0, - "size": 200, - "query": { - "bool": { - "filter": [{ - "bool": { - "filter": [{ - "range": { - "balance": { - "from": null, - "to": 10000, - "include_lower": true, - "include_upper": false, - "boost": 1.0 - } - } - }], - "adjust_pure_negative": true, - "boost": 1.0 - } - }], - "adjust_pure_negative": true, - "boost": 1.0 - } - }, - "_source": { - "includes": [ - "firstname", - "lastname", - "balance" - ], - "excludes": [] - } -} -``` - -### Example 2 - -Use `parameters` for actual parameter value -in prepared SQL query. 
- -SQL query: - -```console ->> curl -H 'Content-Type: application/json' -X POST localhost:9200/_plugins/_sql -d '{ - "query": "SELECT * FROM accounts WHERE age = ?", - "parameters": [{ - "type": "integer", - "value": 30 - }] -}' -``` - -Explain: - -```json -{ - "from": 0, - "size": 200, - "query": { - "bool": { - "filter": [{ - "bool": { - "must": [{ - "term": { - "age": { - "value": 30, - "boost": 1.0 - } - } - }], - "adjust_pure_negative": true, - "boost": 1.0 - } - }], - "adjust_pure_negative": true, - "boost": 1.0 - } - } -} - -``` -## JDBC Format - -### Description - -By default, the plugin returns the JDBC standard format. This format -is provided for JDBC driver and clients that need both schema and -result set well formatted. - -### Example 1 - -Here is an example for normal response. The -`schema` includes field name and its type -and `datarows` includes the result set. - -SQL query: - -```console ->> curl -H 'Content-Type: application/json' -X POST localhost:9200/_plugins/_sql -d '{ - "query" : "SELECT firstname, lastname, age FROM accounts ORDER BY age LIMIT 2" -}' -``` - -Result set: - -```json -{ - "schema": [{ - "name": "firstname", - "type": "text" - }, - { - "name": "lastname", - "type": "text" - }, - { - "name": "age", - "type": "long" - } - ], - "total": 4, - "datarows": [ - [ - "Nanette", - "Bates", - 28 - ], - [ - "Amber", - "Duke", - 32 - ] - ], - "size": 2, - "status": 200 -} -``` - -### Example 2 - -If any error occurred, error message and the cause will be returned -instead. 
- -SQL query: - -```console ->> curl -H 'Content-Type: application/json' -X POST localhost:9200/_plugins/_sql -d '{ - "query" : "SELECT unknown FROM accounts" -}' -``` - -Result set: - -```json -{ - "error": { - "reason": "Invalid SQL query", - "details": "Field [unknown] cannot be found or used here.", - "type": "SemanticAnalysisException" - }, - "status": 400 -} -``` - -## OpenSearch DSL - -### Description - -The `json` format returns original response from OpenSearch in -JSON. Because this is the native response from OpenSearch, extra -efforts are needed to parse and interpret it. - -### Example - -SQL query: - -```console ->> curl -H 'Content-Type: application/json' -X POST localhost:9200/_plugins/_sql?format=json -d '{ - "query" : "SELECT firstname, lastname, age FROM accounts ORDER BY age LIMIT 2" -}' -``` - -Result set: - -```json -{ - "_shards": { - "total": 5, - "failed": 0, - "successful": 5, - "skipped": 0 - }, - "hits": { - "hits": [{ - "_index": "accounts", - "_type": "account", - "_source": { - "firstname": "Nanette", - "age": 28, - "lastname": "Bates" - }, - "_id": "13", - "sort": [ - 28 - ], - "_score": null - }, - { - "_index": "accounts", - "_type": "account", - "_source": { - "firstname": "Amber", - "age": 32, - "lastname": "Duke" - }, - "_id": "1", - "sort": [ - 32 - ], - "_score": null - } - ], - "total": { - "value": 4, - "relation": "eq" - }, - "max_score": null - }, - "took": 100, - "timed_out": false -} -``` - -## CSV Format - -### Description - -You can also use CSV format to download result set as CSV. 
- -### Example - -SQL query: - -```console ->> curl -H 'Content-Type: application/json' -X POST localhost:9200/_plugins/_sql?format=csv -d '{ - "query" : "SELECT firstname, lastname, age FROM accounts ORDER BY age" -}' -``` - -Result set: - -```text -firstname,lastname,age -Nanette,Bates,28 -Amber,Duke,32 -Dale,Adams,33 -Hattie,Bond,36 -``` - -## Raw Format - -### Description - -Additionally raw format can be used to pipe the result to other command -line tool for post processing. - -### Example - -SQL query: - -```console ->> curl -H 'Content-Type: application/json' -X POST localhost:9200/_plugins/_sql?format=raw -d '{ - "query" : "SELECT firstname, lastname, age FROM accounts ORDER BY age" -}' -``` - -Result set: - -```text -Nanette|Bates|28 -Amber|Duke|32 -Dale|Adams|33 -Hattie|Bond|36 -``` diff --git a/_search-plugins/sql/response-formats.md b/_search-plugins/sql/response-formats.md new file mode 100644 index 0000000000..e27d45847c --- /dev/null +++ b/_search-plugins/sql/response-formats.md @@ -0,0 +1,283 @@ +--- +layout: default +title: Response formats +parent: SQL and PPL +nav_order: 2 +--- + +# Response formats + +The SQL plugin provides the `jdbc`, `csv`, `raw`, and `json` response formats that are useful for different purposes. The `jdbc` format is widely used because it provides the schema information and adds more functionality, such as pagination. Besides the JDBC driver, various clients can benefit from a detailed and well-formatted response. + +## JDBC format + +By default, the SQL plugin returns the response in the standard JDBC format. This format is provided for the JDBC driver and clients that need both the schema and the result set to be well formatted. 
+ +#### Sample request + +The following query does not specify the response format, so the format is set to `jdbc`: + +```json +POST _plugins/_sql +{ + "query" : "SELECT firstname, lastname, age FROM accounts ORDER BY age LIMIT 2" +} +``` + +#### Sample response + +In the response, the `schema` contains the field names and types, and the `datarows` field contains the result set: + +```json +{ + "schema": [{ + "name": "firstname", + "type": "text" + }, + { + "name": "lastname", + "type": "text" + }, + { + "name": "age", + "type": "long" + } + ], + "total": 4, + "datarows": [ + [ + "Nanette", + "Bates", + 28 + ], + [ + "Amber", + "Duke", + 32 + ] + ], + "size": 2, + "status": 200 +} +``` + +If an error of any type occurs, OpenSearch returns the error message. + +The following query searches for a non-existent field `unknown`: + +```json +POST /_plugins/_sql +{ + "query" : "SELECT unknown FROM accounts" +} +``` + +The response contains the error message and the cause of the error: + +```json +{ + "error": { + "reason": "Invalid SQL query", + "details": "Field [unknown] cannot be found or used here.", + "type": "SemanticAnalysisException" + }, + "status": 400 +} +``` + +## OpenSearch DSL JSON format + +If you set the format to `json`, the original OpenSearch response is returned in JSON format. Because this is the native response from OpenSearch, extra effort is needed to parse and interpret it. 
+ +#### Sample request + +The following query sets the response format to `json`: + +```json +POST _plugins/_sql?format=json +{ + "query" : "SELECT firstname, lastname, age FROM accounts ORDER BY age LIMIT 2" +} +``` + +#### Sample response + +The response is the original response from OpenSearch: + +```json +{ + "_shards": { + "total": 5, + "failed": 0, + "successful": 5, + "skipped": 0 + }, + "hits": { + "hits": [{ + "_index": "accounts", + "_type": "account", + "_source": { + "firstname": "Nanette", + "age": 28, + "lastname": "Bates" + }, + "_id": "13", + "sort": [ + 28 + ], + "_score": null + }, + { + "_index": "accounts", + "_type": "account", + "_source": { + "firstname": "Amber", + "age": 32, + "lastname": "Duke" + }, + "_id": "1", + "sort": [ + 32 + ], + "_score": null + } + ], + "total": { + "value": 4, + "relation": "eq" + }, + "max_score": null + }, + "took": 100, + "timed_out": false +} +``` + +## CSV format + +You can also specify to return results in CSV format. + +#### Sample request + +```json +POST /_plugins/_sql?format=csv +{ + "query" : "SELECT firstname, lastname, age FROM accounts ORDER BY age" +} +``` + +#### Sample response + +```text +firstname,lastname,age +Nanette,Bates,28 +Amber,Duke,32 +Dale,Adams,33 +Hattie,Bond,36 +``` +### Sanitizing results in CSV format + +By default, OpenSearch sanitizes header cells (field names) and data cells (field contents) according to the following rules: + +- If a cell starts with `+`, `-`, `=` , or `@`, the sanitizer inserts a single quote (`'`) at the start of the cell. +- If a cell contains one or more commas (`,`), the sanitizer surrounds the cell with double quotes (`"`). 
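The two sanitizing rules above can be sketched in a few lines of Python. This is only an illustration of the rules as stated, not the SQL plugin's actual implementation: the names `sanitize_csv_cell` and `sanitize_csv_row` are hypothetical, and the sketch assumes that the prefix rule is applied before the quoting rule.

```python
# Illustrative sketch of the CSV sanitizing rules described above.
# Not the SQL plugin's implementation; all names here are hypothetical.

SPECIAL_PREFIXES = ("+", "-", "=", "@")


def sanitize_csv_cell(cell):
    """Apply the two documented rules to one header or data cell."""
    # Rule 1: prefix a single quote if the cell starts with +, -, =, or @.
    if cell.startswith(SPECIAL_PREFIXES):
        cell = "'" + cell
    # Rule 2: surround the cell with double quotes if it contains a comma.
    if "," in cell:
        cell = '"' + cell + '"'
    return cell


def sanitize_csv_row(cells):
    """Sanitize every cell in a row and join the cells with commas."""
    return ",".join(sanitize_csv_cell(c) for c in cells)


if __name__ == "__main__":
    print(sanitize_csv_row(["+firstname", "=lastname", "address"]))
    print(sanitize_csv_row(["@Bond", "671 Bristol Street, Dente, TN"]))
```

Cells that match neither rule pass through unchanged, so ordinary values such as `Bond` are not altered.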
+ +### Example + +The following query indexes a document with cells that either start with special characters or contain commas: + +```json +PUT /userdata/_doc/1?refresh=true +{ + "+firstname": "-Hattie", + "=lastname": "@Bond", + "address": "671 Bristol Street, Dente, TN" +} +``` + +You can use the query below to request results in CSV format: + +```json +POST /_plugins/_sql?format=csv +{ + "query" : "SELECT * FROM userdata" +} +``` + +In the response, cells that start with special characters are prefixed with `'`. The cell that has commas is surrounded with quotation marks: + +```text +'+firstname,'=lastname,address +'Hattie,'@Bond,"671 Bristol Street, Dente, TN" +``` + +To skip sanitizing, set the `sanitize` query parameter to `false`: + +```json +POST /_plugins/_sql?format=csv&sanitize=false +{ + "query" : "SELECT * FROM userdata" +} +``` + +The response contains the results in the original CSV format: + +```text +=lastname,address,+firstname +@Bond,"671 Bristol Street, Dente, TN",-Hattie +``` + +## Raw format + +You can use the raw format to pipe the results to other command line tools for post-processing. + +#### Sample request + +```json +POST /_plugins/_sql?format=raw +{ + "query" : "SELECT firstname, lastname, age FROM accounts ORDER BY age" +} +``` + +#### Sample response + +```text +Nanette|Bates|28 +Amber|Duke|32 +Dale|Adams|33 +Hattie|Bond|36 +``` + +By default, OpenSearch sanitizes results in `raw` format according to the following rule: + +- If a data cell contains one or more pipe characters (`|`), the sanitizer surrounds the cell with double quotes. 
+ +### Example + +The following query indexes a document with pipe characters (`|`) in its fields: + +```json +PUT /userdata/_doc/1?refresh=true +{ + "+firstname": "|Hattie", + "=lastname": "Bond|", + "|address": "671 Bristol Street| Dente| TN" +} +``` + +You can use the query below to request results in `raw` format: + +```json +POST /_plugins/_sql?format=raw +{ + "query" : "SELECT * FROM userdata" +} +``` + +The query returns cells with the `|` character surrounded by quotation marks: + +```text +"|address"|=lastname|+firstname +"671 Bristol Street| Dente| TN"|"Bond|"|"|Hattie" +``` \ No newline at end of file diff --git a/_search-plugins/sql/settings.md b/_search-plugins/sql/settings.md index 93832a8cd8..6967111c83 100644 --- a/_search-plugins/sql/settings.md +++ b/_search-plugins/sql/settings.md @@ -1,13 +1,15 @@ --- layout: default title: Settings -parent: SQL -nav_order: 16 +parent: SQL and PPL +nav_order: 77 --- # Settings -The SQL plugin adds a few settings to the standard OpenSearch cluster settings. Most are dynamic, so you can change the default behavior of the plugin without restarting your cluster. +The SQL plugin adds a few settings to the standard OpenSearch cluster settings. Most are dynamic, so you can change the default behavior of the plugin without restarting your cluster. + +It is possible to independently disable processing of `PPL` or `SQL` queries. 
You can update these settings like any other cluster setting: @@ -20,7 +22,23 @@ PUT _cluster/settings } ``` -Similarly, you can also update the settings by sending the request to the plugin setting endpoint `_plugins/_query/setting`: +Alternatively, you can use the following request format: + +```json +PUT _cluster/settings +{ + "transient": { + "plugins": { + "ppl": { + "enabled": "false" + } + } + } +} +``` + +Similarly, you can update the settings by sending a request to the `_plugins/_query/settings` endpoint: + ```json PUT _plugins/_query/settings { @@ -30,10 +48,31 @@ PUT _plugins/_query/settings } ``` +Alternatively, you can use the following request format: + +```json +PUT _plugins/_query/settings +{ + "transient": { + "plugins": { + "ppl": { + "enabled": "false" + } + } + } +} +``` + +Requests to the `_plugins/_ppl` and `_plugins/_sql` endpoints include index names in the request body, so they have the same access policy considerations as the `bulk`, `mget`, and `msearch` operations. Setting the `rest.action.multi.allow_explicit_index` parameter to `false` disables both the `SQL` and `PPL` endpoints. +{: .note} + +# Available settings + Setting | Default | Description :--- | :--- | :--- -`plugins.sql.enabled` | True | Change to `false` to disable the plugin. -`plugins.sql.slowlog` | 2 seconds | Configure the time limit (in seconds) for slow queries. The plugin logs slow queries as `Slow query: elapsed=xxx (ms)` in `opensearch.log`. -`plugins.sql.cursor.keep_alive` | 1 minute | This value configures how long the cursor context is kept open. Cursor contexts are resource heavy, so we recommend a low value. -`plugins.query.memory_limit` | 85% | This setting configures the heap memory usage limit for the circuit breaker of the query engine. -`plugins.query.size_limit` | 200 | The setting sets the default size of index that the query engine fetches from OpenSearch. +`plugins.sql.enabled` | True | Change to `false` to disable the `SQL` support in the plugin. 
+`plugins.ppl.enabled` | True | Change to `false` to disable the `PPL` support in the plugin. +`plugins.sql.slowlog` | 2 seconds | Configures the time limit (in seconds) for slow queries. The plugin logs slow queries as `Slow query: elapsed=xxx (ms)` in `opensearch.log`. +`plugins.sql.cursor.keep_alive` | 1 minute | Configures how long the cursor context is kept open. Cursor contexts are resource intensive, so we recommend a low value. +`plugins.query.memory_limit` | 85% | Configures the heap memory usage limit for the circuit breaker of the query engine. +`plugins.query.size_limit` | 200 | Sets the default size of index that the query engine fetches from OpenSearch. diff --git a/_search-plugins/sql/sql-full-text.md b/_search-plugins/sql/sql-full-text.md deleted file mode 100644 index 641fd27d6b..0000000000 --- a/_search-plugins/sql/sql-full-text.md +++ /dev/null @@ -1,205 +0,0 @@ ---- -layout: default -title: Full-Text Search -parent: SQL -nav_order: 8 ---- - -# Full-text search - -Use SQL commands for full-text search. The SQL plugin supports a subset of the full-text queries available in OpenSearch. - -To learn about full-text queries in OpenSearch, see [Full-text queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/). - -## Match - -Use the `match` command to search documents that match a `string`, `number`, `date`, or `boolean` value for a given field. 
- -### Syntax - -```sql -match(field_expression, query_expression[, option=]*) -``` - -You can specify the following options: - -- `analyzer` -- `auto_generate_synonyms_phrase` -- `fuzziness` -- `max_expansions` -- `prefix_length` -- `fuzzy_transpositions` -- `fuzzy_rewrite` -- `lenient` -- `operator` -- `minimum_should_match` -- `zero_terms_query` -- `boost` - -*Example 1*: Search the `message` field: - -```json -GET my_index/_search -{ - "query": { - "match": { - "message": "this is a test" - } - } -} -``` - -SQL query: - -```sql -SELECT message FROM my_index WHERE match(message, "this is a test") -``` - -*Example 2*: Search the `message` field with the `operator` parameter: - -```json -GET my_index/_search -{ - "query": { - "match": { - "message": { - "query": "this is a test", - "operator": "and" - } - } - } -} -``` - -SQL query: - -```sql -SELECT message FROM my_index WHERE match(message, "this is a test", operator=and) -``` - -*Example 3*: Search the `message` field with the `operator` and `zero_terms_query` parameters: - -```json -GET my_index/_search -{ - "query": { - "match": { - "message": { - "query": "to be or not to be", - "operator": "and", - "zero_terms_query": "all" - } - } - } -} -``` - -SQL query: - -```sql -SELECT message FROM my_index WHERE match(message, "this is a test", operator=and, zero_terms_query=all) -``` - -To search for text in a single field, use `MATCHQUERY` or `MATCH_QUERY` functions. - -Pass in your search query and the field name that you want to search against. - -```sql -SELECT account_number, address -FROM accounts -WHERE MATCH_QUERY(address, 'Holmes') -``` - -Alternate syntax: - -```sql -SELECT account_number, address -FROM accounts -WHERE address = MATCH_QUERY('Holmes') -``` - - -| account_number | address -:--- | :--- -1 | 880 Holmes Lane - - -## Multi match - -To search for text in multiple fields, use `MULTI_MATCH`, `MULTIMATCH`, or `MULTIMATCHQUERY` functions. 
- -For example, search for `Dale` in either the `firstname` or `lastname` fields: - - -```sql -SELECT firstname, lastname -FROM accounts -WHERE MULTI_MATCH('query'='Dale', 'fields'='*name') -``` - - -| firstname | lastname -:--- | :--- -Dale | Adams - - -## Query string - -To split text based on operators, use the `QUERY` function. - - -```sql -SELECT account_number, address -FROM accounts -WHERE QUERY('address:Lane OR address:Street') -``` - - -| account_number | address -:--- | :--- -1 | 880 Holmes Lane -6 | 671 Bristol Street -13 | 789 Madison Street - - -The `QUERY` function supports logical connectives, wildcard, regex, and proximity search. - - -## Match phrase - -To search for exact phrases, use `MATCHPHRASE`, `MATCH_PHRASE`, or `MATCHPHRASEQUERY` functions. - - -```sql -SELECT account_number, address -FROM accounts -WHERE MATCH_PHRASE(address, '880 Holmes Lane') -``` - - -| account_number | address -:--- | :--- -1 | 880 Holmes Lane - - -## Score query - -To return a relevance score along with every matching document, use `SCORE`, `SCOREQUERY`, or `SCORE_QUERY` functions. - -You need to pass in two arguments. The first is the `MATCH_QUERY` expression. The second is an optional floating point number to boost the score (default value is 1.0). - - -```sql -SELECT account_number, address, _score -FROM accounts -WHERE SCORE(MATCH_QUERY(address, 'Lane'), 0.5) OR - SCORE(MATCH_QUERY(address, 'Street'), 100) -ORDER BY _score -``` - - -| account_number | address | score -:--- | :--- | :--- -1 | 880 Holmes Lane | 0.5 -6 | 671 Bristol Street | 100 -13 | 789 Madison Street | 100 diff --git a/_search-plugins/sql/sql-ppl-api.md b/_search-plugins/sql/sql-ppl-api.md new file mode 100644 index 0000000000..a90b4582ce --- /dev/null +++ b/_search-plugins/sql/sql-ppl-api.md @@ -0,0 +1,521 @@ +--- +layout: default +title: SQL/PPL API +parent: SQL and PPL +nav_order: 1 +--- + +# SQL/PPL API + +Use the SQL and PPL API to send queries to the SQL plugin. 
Use the `_sql` endpoint to send queries in SQL, and the `_ppl` endpoint to send queries in PPL. For both of these, you can also use the `_explain` endpoint to translate your query into [OpenSearch domain-specific language]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/) (DSL) or to troubleshoot errors. + +--- + +#### Table of contents +- TOC +{:toc} + + +--- + +## Query API + +Introduced 1.0 +{: .label .label-purple } + +Sends an SQL/PPL query to the SQL plugin. You can pass the format for the response as a query parameter. + +### Query parameters + +Parameter | Data Type | Description +:--- | :--- | :--- +[format]({{site.url}}{{site.baseurl}}/search-plugins/sql/response-formats/) | String | The format for the response. The `_sql` endpoint supports `jdbc`, `csv`, `raw`, and `json` formats. The `_ppl` endpoint supports `jdbc`, `csv`, and `raw` formats. Default is `jdbc`. +sanitize | Boolean | Specifies whether to escape special characters in the results. See [Response formats]({{site.url}}{{site.baseurl}}/search-plugins/sql/response-formats/) for more information. Default is `true`. + +### Request fields + +Field | Data Type | Description +:--- | :--- | :--- +query | String | The query to be executed. Required. +[filter](#filtering-results) | JSON object | The filter for the results. Optional. +[fetch_size](#paginating-results) | integer | The number of results to return in one response. Used for paginating results. Default is 1,000. Optional. Only supported for the `jdbc` response format. 
+ +#### Sample request + +```json +POST /_plugins/_sql +{ + "query" : "SELECT * FROM accounts" +} +``` + +#### Sample response + +The response contains the schema and the results: + +```json +{ + "schema": [ + { + "name": "account_number", + "type": "long" + }, + { + "name": "firstname", + "type": "text" + }, + { + "name": "address", + "type": "text" + }, + { + "name": "balance", + "type": "long" + }, + { + "name": "gender", + "type": "text" + }, + { + "name": "city", + "type": "text" + }, + { + "name": "employer", + "type": "text" + }, + { + "name": "state", + "type": "text" + }, + { + "name": "age", + "type": "long" + }, + { + "name": "email", + "type": "text" + }, + { + "name": "lastname", + "type": "text" + } + ], + "datarows": [ + [ + 1, + "Amber", + "880 Holmes Lane", + 39225, + "M", + "Brogan", + "Pyrami", + "IL", + 32, + "amberduke@pyrami.com", + "Duke" + ], + [ + 6, + "Hattie", + "671 Bristol Street", + 5686, + "M", + "Dante", + "Netagy", + "TN", + 36, + "hattiebond@netagy.com", + "Bond" + ], + [ + 13, + "Nanette", + "789 Madison Street", + 32838, + "F", + "Nogal", + "Quility", + "VA", + 28, + "nanettebates@quility.com", + "Bates" + ], + [ + 18, + "Dale", + "467 Hutchinson Court", + 4180, + "M", + "Orick", + null, + "MD", + 33, + "daleadams@boink.com", + "Adams" + ] + ], + "total": 4, + "size": 4, + "status": 200 +} +``` + +### Response fields + +Field | Data Type | Description +:--- | :--- | :--- +schema | Array | Specifies the field names and types for all fields. +datarows | 2D array | An array of results. Each result represents one matching row (document). +total | Integer | The total number of rows (documents) in the index. +size | Integer | The number of results to return in one response. +status | Integer | The HTTP response status OpenSearch returns after running the query. + +## Explain API + +The SQL plugin has an `explain` feature that shows how a query is executed against OpenSearch, which is useful for debugging and development. 
A POST request to the `_plugins/_sql/_explain` or `_plugins/_ppl/_explain` endpoint returns [OpenSearch domain-specific language]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/) (DSL) in JSON format, explaining the query. +You can execute the explain API operation either in command line using `curl` or in the Dashboards console, like in the example below. + +#### Sample explain request for an SQL query + +```json +POST _plugins/_sql/_explain +{ + "query": "SELECT firstname, lastname FROM accounts WHERE age > 20" +} +``` + +#### Sample SQL query explain response + +```json +{ + "root": { + "name": "ProjectOperator", + "description": { + "fields": "[firstname, lastname]" + }, + "children": [ + { + "name": "OpenSearchIndexScan", + "description": { + "request": """OpenSearchQueryRequest(indexName=accounts, sourceBuilder={"from":0,"size":200,"timeout":"1m","query":{"range":{"age":{"from":20,"to":null,"include_lower":false,"include_upper":true,"boost":1.0}}},"_source":{"includes":["firstname","lastname"],"excludes":[]},"sort":[{"_doc":{"order":"asc"}}]}, searchDone=false)""" + }, + "children": [] + } + ] + } +} +``` + +#### Sample explain request for a PPL query + +```json +POST _plugins/_ppl/_explain +{ + "query" : "source=accounts | fields firstname, lastname" +} +``` + +#### Sample PPL query explain response + +```json +{ + "root": { + "name": "ProjectOperator", + "description": { + "fields": "[firstname, lastname]" + }, + "children": [ + { + "name": "OpenSearchIndexScan", + "description": { + "request": """OpenSearchQueryRequest(indexName=accounts, sourceBuilder={"from":0,"size":200,"timeout":"1m","_source":{"includes":["firstname","lastname"],"excludes":[]}}, searchDone=false)""" + }, + "children": [] + } + ] + } +} +``` + +For queries that require post-processing, the `explain` response includes a query plan in addition to the OpenSearch DSL. For those queries that don't require post processing, you can see a complete DSL. 
+ +## Paginating results + +To get back a paginated response, use the `fetch_size` parameter. The value of `fetch_size` should be greater than 0. The default value is 1,000. A value of 0 will fall back to a non-paginated response. + +The `fetch_size` parameter is only supported for the `jdbc` response format. +{: .note } + +### Example + +The following request contains an SQL query and specifies to return five results at a time: + +```json +POST _plugins/_sql/ +{ + "fetch_size" : 5, + "query" : "SELECT firstname, lastname FROM accounts WHERE age > 20 ORDER BY state ASC" +} +``` + +The response contains all the fields that a query without `fetch_size` would contain, and a `cursor` field that is used to retrieve subsequent pages of results: + +```json +{ + "schema": [ + { + "name": "firstname", + "type": "text" + }, + { + "name": "lastname", + "type": "text" + } + ], + "cursor": "d:eyJhIjp7fSwicyI6IkRYRjFaWEo1UVc1a1JtVjBZMmdCQUFBQUFBQUFBQU1XZWpkdFRFRkZUMlpTZEZkeFdsWnJkRlZoYnpaeVVRPT0iLCJjIjpbeyJuYW1lIjoiZmlyc3RuYW1lIiwidHlwZSI6InRleHQifSx7Im5hbWUiOiJsYXN0bmFtZSIsInR5cGUiOiJ0ZXh0In1dLCJmIjo1LCJpIjoiYWNjb3VudHMiLCJsIjo5NTF9", + "total": 956, + "datarows": [ + [ + "Cherry", + "Carey" + ], + [ + "Lindsey", + "Hawkins" + ], + [ + "Sargent", + "Powers" + ], + [ + "Campos", + "Olsen" + ], + [ + "Savannah", + "Kirby" + ] + ], + "size": 5, + "status": 200 +} +``` + +To fetch subsequent pages, use the `cursor` from the previous response: + +```json +POST /_plugins/_sql +{ + "cursor": "d:eyJhIjp7fSwicyI6IkRYRjFaWEo1UVc1a1JtVjBZMmdCQUFBQUFBQUFBQU1XZWpkdFRFRkZUMlpTZEZkeFdsWnJkRlZoYnpaeVVRPT0iLCJjIjpbeyJuYW1lIjoiZmlyc3RuYW1lIiwidHlwZSI6InRleHQifSx7Im5hbWUiOiJsYXN0bmFtZSIsInR5cGUiOiJ0ZXh0In1dLCJmIjo1LCJpIjoiYWNjb3VudHMiLCJsIjo5NTF9" +} +``` + +The next response contains only the `datarows` of the results and a new `cursor`. 
+ +```json +{ + "cursor": "d:eyJhIjp7fSwicyI6IkRYRjFaWEo1UVc1a1JtVjBZMmdCQUFBQUFBQUFBQU1XZWpkdFRFRkZUMlpTZEZkeFdsWnJkRlZoYnpaeVVRPT0iLCJjIjpbeyJuYW1lIjoiZmlyc3RuYW1lIiwidHlwZSI6InRleHQifSx7Im5hbWUiOiJsYXN0bmFtZSIsInR5cGUiOiJ0ZXh0In1dLCJmIjo1LCJpIjoiYWNjb3VudHMabcde12345", + "datarows": [ + [ + "Abbey", + "Karen" + ], + [ + "Chen", + "Ken" + ], + [ + "Ani", + "Jade" + ], + [ + "Peng", + "Hu" + ], + [ + "John", + "Doe" + ] + ] +} +``` + +The `datarows` can contain more than `fetch_size` records when nested fields are flattened. +{: .note } + +The last page of results has only `datarows` and no `cursor`. The `cursor` context is automatically cleared on the last page. + +To explicitly clear the cursor context, use the `_plugins/_sql/close` endpoint operation: + +```json +POST /_plugins/_sql/close +{ + "cursor": "d:eyJhIjp7fSwicyI6IkRYRjFaWEo1UVc1a1JtVjBZMmdCQUFBQUFBQUFBQU1XZWpkdFRFRkZUMlpTZEZkeFdsWnJkRlZoYnpaeVVRPT0iLCJjIjpbeyJuYW1lIjoiZmlyc3RuYW1lIiwidHlwZSI6InRleHQifSx7Im5hbWUiOiJsYXN0bmFtZSIsInR5cGUiOiJ0ZXh0In1dLCJmIjo1LCJpIjoiYWNjb3VudHMiLCJsIjo5NTF9" +} +``` + +The response is an acknowledgement from OpenSearch: + +```json +{"succeeded":true} +``` + +## Filtering results + +You can use the `filter` parameter to add more conditions to the OpenSearch DSL directly. + +The following SQL query returns the names and account balances of all customers. The results are then filtered to contain only those customers with a balance of less than $10,000.
+ +```json +POST /_plugins/_sql/ +{ + "query" : "SELECT firstname, lastname, balance FROM accounts", + "filter" : { + "range" : { + "balance" : { + "lt" : 10000 + } + } + } +} +``` + +The response contains the matching results: + +```json +{ + "schema": [ + { + "name": "firstname", + "type": "text" + }, + { + "name": "lastname", + "type": "text" + }, + { + "name": "balance", + "type": "long" + } + ], + "total": 2, + "datarows": [ + [ + "Hattie", + "Bond", + 5686 + ], + [ + "Dale", + "Adams", + 4180 + ] + ], + "size": 2, + "status": 200 +} +``` + +You can use the Explain API to see how this query is executed against OpenSearch: + +```json +POST /_plugins/_sql/_explain +{ + "query" : "SELECT firstname, lastname, balance FROM accounts", + "filter" : { + "range" : { + "balance" : { + "lt" : 10000 + } + } + } +} +``` + +The response contains the Boolean query in OpenSearch DSL that corresponds to the query above: + +```json +{ + "from": 0, + "size": 200, + "query": { + "bool": { + "filter": [{ + "bool": { + "filter": [{ + "range": { + "balance": { + "from": null, + "to": 10000, + "include_lower": true, + "include_upper": false, + "boost": 1.0 + } + } + }], + "adjust_pure_negative": true, + "boost": 1.0 + } + }], + "adjust_pure_negative": true, + "boost": 1.0 + } + }, + "_source": { + "includes": [ + "firstname", + "lastname", + "balance" + ], + "excludes": [] + } +} +``` + +## Using parameters + +You can use the `parameters` field to pass parameter values to a prepared SQL query.
+ +The following explain operation uses an SQL query with an `age` parameter: + +```json +POST /_plugins/_sql/_explain +{ + "query": "SELECT * FROM accounts WHERE age = ?", + "parameters": [{ + "type": "integer", + "value": 30 + }] +} +``` + +The response contains the Boolean query in OpenSearch DSL that corresponds to the SQL query above: + +```json +{ + "from": 0, + "size": 200, + "query": { + "bool": { + "filter": [{ + "bool": { + "must": [{ + "term": { + "age": { + "value": 30, + "boost": 1.0 + } + } + }], + "adjust_pure_negative": true, + "boost": 1.0 + } + }], + "adjust_pure_negative": true, + "boost": 1.0 + } + } +} + +``` diff --git a/_search-plugins/sql/sql/aggregations.md b/_search-plugins/sql/sql/aggregations.md new file mode 100644 index 0000000000..0fbb8b5ec5 --- /dev/null +++ b/_search-plugins/sql/sql/aggregations.md @@ -0,0 +1,149 @@ +--- +layout: default +title: Aggregation Functions +parent: SQL +grand_parent: SQL and PPL +nav_order: 11 +--- + +# Aggregation functions + +Aggregate functions use the `GROUP BY` clause to group sets of values into subsets. + +## Group By + +Use the `GROUP BY` clause as an identifier, ordinal, or expression. + +### Identifier + +```sql +SELECT gender, sum(age) FROM accounts GROUP BY gender; +``` + +| gender | sum (age) +:--- | :--- +F | 28 | +M | 101 | + +### Ordinal + +```sql +SELECT gender, sum(age) FROM accounts GROUP BY 1; +``` + +| gender | sum (age) +:--- | :--- +F | 28 | +M | 101 | + +### Expression + +```sql +SELECT abs(account_number), sum(age) FROM accounts GROUP BY abs(account_number); +``` + +| abs(account_number) | sum (age) +:--- | :--- +| 1 | 32 | +| 13 | 28 | +| 18 | 33 | +| 6 | 36 | + +## Aggregation + +Use aggregations as a select, expression, or an argument of an expression. 
+ +### Select + +```sql +SELECT gender, sum(age) FROM accounts GROUP BY gender; +``` + +| gender | sum (age) +:--- | :--- +F | 28 | +M | 101 | + +### Argument + +```sql +SELECT gender, sum(age) * 2 as sum2 FROM accounts GROUP BY gender; +``` + +| gender | sum2 +:--- | :--- +F | 56 | +M | 202 | + +### Expression + +```sql +SELECT gender, sum(age * 2) as sum2 FROM accounts GROUP BY gender; +``` + +| gender | sum2 +:--- | :--- +F | 56 | +M | 202 | + +### COUNT + +The `COUNT` function accepts arguments such as `*` or a literal like `1`. +These forms have the following meanings: + +- `COUNT(field)` - Counts only the input rows in which the given field (or expression) is not null or missing. +- `COUNT(*)` - Counts the number of all its input rows. +- `COUNT(1)` (same as `COUNT(*)`) - Counts any non-null literal. + +## Having + +Use the `HAVING` clause to filter out aggregated values. + +### HAVING with GROUP BY + +You can use aggregate expressions or their aliases defined in a `SELECT` clause in a `HAVING` condition. + +We recommend using non-aggregate expressions in the `WHERE` clause, although you can also use them in a `HAVING` clause. + +The aggregations in a `HAVING` clause are not necessarily the same as those in the select list. As an extension to the SQL standard, you're not restricted to using identifiers only in the `GROUP BY` list. +For example: + +```sql +SELECT gender, sum(age) +FROM accounts +GROUP BY gender +HAVING sum(age) > 100; +``` + +| gender | sum (age) +:--- | :--- +M | 101 | + +Here's another example of using an alias in a `HAVING` condition. + +```sql +SELECT gender, sum(age) AS s +FROM accounts +GROUP BY gender +HAVING s > 100; +``` + +| gender | s +:--- | :--- +M | 101 | + +If an identifier is ambiguous, for example, present both as a select alias and as an index field, the alias takes precedence.
In this case, the identifier is replaced with the expression aliased in the `SELECT` clause. + +### HAVING without GROUP BY + +You can use a `HAVING` clause without the `GROUP BY` clause. This is useful because aggregations are not supported in a `WHERE` clause: + +```sql +SELECT 'Total of age > 100' +FROM accounts +HAVING sum(age) > 100; +``` + +| Total of age > 100 | +:--- | +Total of age > 100 | diff --git a/_search-plugins/sql/basic.md b/_search-plugins/sql/sql/basic.md similarity index 99% rename from _search-plugins/sql/basic.md rename to _search-plugins/sql/sql/basic.md index 967ef6d3cd..7b2ffefba0 100644 --- a/_search-plugins/sql/basic.md +++ b/_search-plugins/sql/sql/basic.md @@ -2,6 +2,7 @@ layout: default title: Basic Queries parent: SQL +grand_parent: SQL and PPL nav_order: 5 --- diff --git a/_search-plugins/sql/complex.md b/_search-plugins/sql/sql/complex.md similarity index 99% rename from _search-plugins/sql/complex.md rename to _search-plugins/sql/sql/complex.md index 70a69bbd40..1b3987cf11 100644 --- a/_search-plugins/sql/complex.md +++ b/_search-plugins/sql/sql/complex.md @@ -2,6 +2,7 @@ layout: default title: Complex Queries parent: SQL +grand_parent: SQL and PPL nav_order: 6 --- diff --git a/_search-plugins/sql/delete.md b/_search-plugins/sql/sql/delete.md similarity index 98% rename from _search-plugins/sql/delete.md rename to _search-plugins/sql/sql/delete.md index 397c47ed47..050b8643e4 100644 --- a/_search-plugins/sql/delete.md +++ b/_search-plugins/sql/sql/delete.md @@ -2,6 +2,7 @@ layout: default title: Delete parent: SQL +grand_parent: SQL and PPL nav_order: 12 --- diff --git a/_search-plugins/sql/sql/functions.md b/_search-plugins/sql/sql/functions.md new file mode 100755 index 0000000000..180acfb21b --- /dev/null +++ b/_search-plugins/sql/sql/functions.md @@ -0,0 +1,225 @@ +--- +layout: default +title: Functions +parent: SQL +grand_parent: SQL and PPL +nav_order: 7 +--- + +# Functions + +The SQL language supports all SQL plugin [common
functions]({{site.url}}{{site.baseurl}}/search-plugins/sql/functions/), including [relevance search]({{site.url}}{{site.baseurl}}/search-plugins/sql/full-text/), but also introduces a few function synonyms, which are available in SQL only. +These synonyms are provided by the `V1` engine. For more information, see [Limitations]({{site.url}}{{site.baseurl}}/search-plugins/sql/limitation). + +## Match query + +The `MATCHQUERY` and `MATCH_QUERY` functions are synonyms for the [`MATCH`]({{site.url}}{{site.baseurl}}/search-plugins/sql/full-text#match) relevance function. They don't accept additional arguments but provide an alternate syntax. + +### Syntax + +To use `matchquery` or `match_query`, pass in your search query and the field name that you want to search against: + +```sql +match_query(field_expression, query_expression[, option=]*) +matchquery(field_expression, query_expression[, option=]*) +field_expression = match_query(query_expression[, option=]*) +field_expression = matchquery(query_expression[, option=]*) +``` + +You can specify the following options in any order: + +- `analyzer` +- `boost` + +### Example + +You can use `MATCHQUERY` to replace `MATCH`: + +```sql +SELECT account_number, address +FROM accounts +WHERE MATCHQUERY(address, 'Holmes') +``` + +Alternatively, you can use `MATCH_QUERY` to replace `MATCH`: + +```sql +SELECT account_number, address +FROM accounts +WHERE address = MATCH_QUERY('Holmes') +``` + +The results contain documents in which the address contains "Holmes": + +| account_number | address +:--- | :--- +1 | 880 Holmes Lane + +## Multi-match + +There are three synonyms for [`MULTI_MATCH`]({{site.url}}{{site.baseurl}}/search-plugins/sql/full-text#multi-match), each with a slightly different syntax. They accept a query string and a fields list with weights. They can also accept additional optional parameters. 
+ +### Syntax + +```sql +multimatch('query'=query_expression[, 'fields'=field_expression][, option=]*) +multi_match('query'=query_expression[, 'fields'=field_expression][, option=]*) +multimatchquery('query'=query_expression[, 'fields'=field_expression][, option=]*) +``` + +The `fields` parameter is optional and can contain a single field or a comma-separated list (whitespace characters are not allowed). The weight for each field is optional and is specified after the field name. It should be delimited by the caret character (`^`) without whitespace. + +### Example + +The following fragments show the `fields` parameter of a multi-match query with a field list and with a single field: + +```sql +multi_match('fields' = "Tags^2,Title^3.4,Body,Comments^0.3", ...) +multi_match('fields' = "Title", ...) +``` + +You can specify the following options in any order: + +- `analyzer` +- `boost` +- `slop` +- `type` +- `tie_breaker` +- `operator` + +## Query string + +The `QUERY` function is a synonym for [`QUERY_STRING`]({{site.url}}{{site.baseurl}}/search-plugins/sql/full-text#query-string). + +### Syntax + +```sql +query('query'=query_expression[, 'fields'=field_expression][, option=]*) +``` + +The `fields` parameter is optional and can contain a single field or a comma-separated list (whitespace characters are not allowed). The weight for each field is optional and is specified after the field name. It should be delimited by the caret character (`^`) without whitespace. + +### Example + +The following fragments show the `fields` parameter of a query string query with a field list and with a single field: + +```sql +query('fields' = "Tags^2,Title^3.4,Body,Comments^0.3", ...) +query('fields' = "Tags", ...) +``` + +You can specify the following options in any order: + +- `analyzer` +- `boost` +- `slop` +- `default_field` + +### Example of using `query_string` in SQL and PPL queries + +The following is a sample REST API search request in OpenSearch DSL.
+ +```json +GET accounts/_search +{ + "query": { + "query_string": { + "query": "Lane Street", + "fields": [ "address" ] + } + } +} +``` + +The request above is equivalent to the following `query` function: + +```sql +SELECT account_number, address +FROM accounts +WHERE query('address:Lane OR address:Street') +``` + +The results contain addresses that contain "Lane" or "Street": + +| account_number | address +:--- | :--- +1 | 880 Holmes Lane +6 | 671 Bristol Street +13 | 789 Madison Street + +## Match phrase + +The `MATCHPHRASEQUERY` function is a synonym for [`MATCH_PHRASE`]({{site.url}}{{site.baseurl}}/search-plugins/sql/full-text#query-string). + +### Syntax + +```sql +matchphrasequery(query_expression, field_expression[, option=]*) +``` + +You can specify the following options in any order: + +- `analyzer` +- `boost` +- `slop` + +## Score query + +To return a relevance score along with every matching document, use the `SCORE`, `SCOREQUERY`, or `SCORE_QUERY` functions. + +### Syntax + +The `SCORE` function accepts two arguments. The first argument is the [`MATCH_QUERY`](#match-query) expression. The second argument is an optional floating-point number to boost the score (the default value is 1.0): + +```sql +SCORE(match_query_expression, score) +SCOREQUERY(match_query_expression, score) +SCORE_QUERY(match_query_expression, score) +``` + +### Example + +The following example uses the `SCORE` function to boost the documents' scores: + +```sql +SELECT account_number, address, _score +FROM accounts +WHERE SCORE(MATCH_QUERY(address, 'Lane'), 0.5) OR + SCORE(MATCH_QUERY(address, 'Street'), 100) +ORDER BY _score +``` + +The results contain matches with corresponding scores: + +| account_number | address | score +:--- | :--- | :--- +1 | 880 Holmes Lane | 0.5 +6 | 671 Bristol Street | 100 +13 | 789 Madison Street | 100 + +## Wildcard query + +To search documents by a given wildcard, use the `WILDCARDQUERY` or `WILDCARD_QUERY` functions.
+ +### Syntax + +```sql +wildcardquery(field_expression, query_expression[, boost=]) +wildcard_query(field_expression, query_expression[, boost=]) +``` + +### Example + +The following example uses a wildcard query: + +```sql +SELECT account_number, address +FROM accounts +WHERE wildcard_query(address, '*Holmes*'); +``` + +The results contain documents that match the wildcard expression: + +| account_number | address +:--- | :--- +1 | 880 Holmes Lane diff --git a/_search-plugins/sql/sql/index.md b/_search-plugins/sql/sql/index.md new file mode 100644 index 0000000000..705f3be244 --- /dev/null +++ b/_search-plugins/sql/sql/index.md @@ -0,0 +1,76 @@ +--- +layout: default +title: SQL +parent: SQL and PPL +nav_order: 4 +has_children: true +has_toc: false +redirect_from: + - /search-plugins/sql/sql +--- + +# SQL + +## Workbench + +The easiest way to get familiar with the SQL plugin is to use **Query Workbench** in OpenSearch Dashboards to test various queries. To learn more, see [Workbench]({{site.url}}{{site.baseurl}}/search-plugins/sql/workbench/). + +![OpenSearch Dashboards SQL UI plugin]({{site.url}}{{site.baseurl}}/images/sql.png) + +## SQL and OpenSearch terminology + +Here’s how core SQL concepts map to OpenSearch: + +SQL | OpenSearch +:--- | :--- +Table | Index +Row | Document +Column | Field + +## REST API + +For a complete REST API reference for the SQL plugin, see [SQL/PPL API]({{site.url}}{{site.baseurl}}/search-plugins/sql/sql-ppl-api). 
+ +To use the SQL plugin with your own applications, send requests to the `_plugins/_sql` endpoint: + +```json +POST _plugins/_sql +{ + "query": "SELECT * FROM my-index LIMIT 50" +} +``` + +You can query multiple indexes by using a comma-separated list: + +```json +POST _plugins/_sql +{ + "query": "SELECT * FROM my-index1,myindex2,myindex3 LIMIT 50" +} +``` + +You can also specify an index pattern with a wildcard expression: + +```json +POST _plugins/_sql +{ + "query": "SELECT * FROM my-index* LIMIT 50" +} +``` + +To run the above query in the command line, use the [curl](https://curl.haxx.se/) command: + +```bash +curl -XPOST https://localhost:9200/_plugins/_sql -u 'admin:admin' -k -H 'Content-Type: application/json' -d '{"query": "SELECT * FROM my-index* LIMIT 50"}' +``` + +You can specify the [response format]({{site.url}}{{site.baseurl}}/search-plugins/sql/response-formats) as JDBC, standard OpenSearch JSON, CSV, or raw. By default, queries return data in JDBC format. The following query sets the format to JSON: + +```json +POST _plugins/_sql?format=json +{ + "query": "SELECT * FROM my-index LIMIT 50" +} +``` + +See the rest of this guide for more information about request parameters, settings, supported operations, and tools. \ No newline at end of file diff --git a/_search-plugins/sql/jdbc.md b/_search-plugins/sql/sql/jdbc.md similarity index 61% rename from _search-plugins/sql/jdbc.md rename to _search-plugins/sql/sql/jdbc.md index fa9c80d2e2..36b080ccae 100644 --- a/_search-plugins/sql/jdbc.md +++ b/_search-plugins/sql/sql/jdbc.md @@ -2,6 +2,7 @@ layout: default title: JDBC Driver parent: SQL +grand_parent: SQL and PPL nav_order: 71 --- @@ -10,3 +11,7 @@ nav_order: 71 The Java Database Connectivity (JDBC) driver lets you integrate OpenSearch with your favorite business intelligence (BI) applications. For information on downloading and using the JAR file, see [the SQL repository on GitHub](https://github.com/opensearch-project/sql/tree/master/sql-jdbc). 
+ +## Connecting to Tableau + +To connect to Tableau, follow the detailed instructions in the [GitHub repository](https://github.com/opensearch-project/sql/blob/main/bi-connectors/TableauConnector/README.md). diff --git a/_search-plugins/sql/metadata.md b/_search-plugins/sql/sql/metadata.md similarity index 99% rename from _search-plugins/sql/metadata.md rename to _search-plugins/sql/sql/metadata.md index 5ba108e480..b9def42f5a 100644 --- a/_search-plugins/sql/metadata.md +++ b/_search-plugins/sql/sql/metadata.md @@ -2,6 +2,7 @@ layout: default title: Metadata Queries parent: SQL +grand_parent: SQL and PPL nav_order: 9 --- diff --git a/_search-plugins/sql/odbc.md b/_search-plugins/sql/sql/odbc.md similarity index 71% rename from _search-plugins/sql/odbc.md rename to _search-plugins/sql/sql/odbc.md index 4cfee69ca5..8815d2814e 100644 --- a/_search-plugins/sql/odbc.md +++ b/_search-plugins/sql/sql/odbc.md @@ -2,6 +2,7 @@ layout: default title: ODBC Driver parent: SQL +grand_parent: SQL and PPL nav_order: 72 --- @@ -9,9 +10,7 @@ nav_order: 72 The Open Database Connectivity (ODBC) driver is a read-only ODBC driver for Windows and macOS that lets you connect business intelligence (BI) and data visualization applications like [Tableau](https://github.com/opensearch-project/sql/blob/main/sql-odbc/docs/user/tableau_support.md), [Microsoft Excel](https://github.com/opensearch-project/sql/blob/main/sql-odbc/docs/user/microsoft_excel_support.md), and [Power BI](https://github.com/opensearch-project/sql/blob/main/sql-odbc/docs/user/power_bi_support.md) to the SQL plugin. -For information on downloading and using the JAR file, see [the SQL repository on GitHub](https://github.com/opensearch-project/sql/tree/main/sql-odbc). - -{% comment %} +For information on downloading and using the driver, see [the SQL repository on GitHub](https://github.com/opensearch-project/sql/tree/main/sql-odbc). 
## Specifications @@ -23,8 +22,8 @@ The following operating systems are supported: Operating System | Version :--- | :--- -Windows | Windows 10 -macOS | Catalina 10.15.4 and Mojave 10.14.6 +Windows | Windows 10, Windows 11 +macOS | Catalina 10.15.4, Mojave 10.14.6, Big Sur 11.6.7, Monterey 12.4 ## Concepts @@ -46,13 +45,13 @@ To install the driver, download the bundled distribution installer from [here](h The installer is unsigned and shows a security dialog. Choose **More info** and **Run anyway**. -1. Choose **Next** to proceed with the installation. +2. Choose **Next** to proceed with the installation. -1. Accept the agreement, and choose **Next**. +3. Accept the agreement, and choose **Next**. -1. The installer comes bundled with documentation and useful resources files to connect with various BI tools (for example, a `.tdc` file for Tableau). You can choose to keep or remove these resources. Choose **Next**. +4. The installer comes bundled with documentation and useful resource files to connect to various BI tools (for example, a `.tdc` file for Tableau). You can choose to keep or remove these resources. Choose **Next**. -1. Choose **Install** and **Finish**. +5. Choose **Install** and **Finish**. The following connection information is set up as part of the default DSN: @@ -73,13 +72,13 @@ Before installing the ODBC Driver on macOS, install the iODBC Driver Manager. The installer is unsigned and shows a security dialog. Right-click on the installer and choose **Open**. -1. Choose **Continue** several times to proceed with the installation. +2. Choose **Continue** several times to proceed with the installation. -1. Choose the **Destination** to install the driver files. +3. Choose the **Destination** to install the driver files. -1. The installer comes bundled with documentation and useful resources files to connect with various BI tools (for example, a `.tdc` file for Tableau). You can choose to keep or remove these resources. Choose **Continue**. +4. 
The installer comes bundled with documentation and useful resource files to connect to various BI tools (for example, a `.tdc` file for Tableau). You can choose to keep or remove these resources. Choose **Continue**. -1. Choose **Install** and **Close**. +5. Choose **Install** and **Close**. Currently, the DSN is not set up as part of the installation and needs to be configured manually. First, open `iODBC Administrator`: @@ -90,19 +89,19 @@ sudo /Applications/iODBC/iODBC\ Administrator64.app/Contents/MacOS/iODBC\ Admini This command gives the application permissions to save the driver and DSN configurations. 1. Choose **ODBC Drivers** tab. -1. Choose **Add a Driver** and fill in the following details: +2. Choose **Add a Driver** and fill in the following details: - **Description of the Driver**: Enter the driver name that you used for the ODBC connection (for example, OpenSearch SQL ODBC Driver). - **Driver File Name**: Enter the path to the driver file (default: `/bin/libopensearchsqlodbc.dylib`). - **Setup File Name**: Enter the path to the setup file (default: `/bin/libopensearchsqlodbc.dylib`). -1. Choose the user driver. -1. Choose **OK** to save the options. -1. Choose the **User DSN** tab. -1. Select **Add**. -1. Choose the driver that you added above. -1. For **Data Source Name (DSN)**, enter the name of the DSN used to store connection options (for example, OpenSearch SQL ODBC DSN). -1. For **Comment**, add an optional comment. -1. Add key-value pairs by using the `+` button. We recommend the following options for a default local OpenSearch installation: +3. Choose the user driver. +4. Choose **OK** to save the options. +5. Choose the **User DSN** tab. +6. Select **Add**. +7. Choose the driver that you added above. +8. For **Data Source Name (DSN)**, enter the name of the DSN used to store connection options (for example, OpenSearch SQL ODBC DSN). +9. For **Comment**, add an optional comment. +10. Add key-value pairs by using the `+` button.
We recommend the following options for a default local OpenSearch installation: - **Host**: `localhost` - OpenSearch server endpoint - **Port**: `9200` - The server port - **Auth**: `NONE` - The authentication mode @@ -111,8 +110,8 @@ This command gives the application permissions to save the driver and DSN config - **ResponseTimeout**: `10` - The number of seconds to wait for a response from the server - **UseSSL**: `0` - Do not use SSL for connections -1. Choose **OK** to save the DSN configuration. -1. Choose **OK** to exit the iODBC Administrator. +11. Choose **OK** to save the DSN configuration. +12. Choose **OK** to exit the iODBC Administrator. ## Customizing the ODBC driver @@ -166,13 +165,12 @@ Option | Description | Type | Default Option | Description | Type | Default :--- | :--- -`LogLevel` | Severity level for driver logs. | one of `ES_OFF`, `ES_FATAL`, `ES_ERROR`, `ES_INFO`, `ES_DEBUG`, `ES_TRACE`, `ES_ALL` | `ES_WARNING` +`LogLevel` | Severity level for driver logs. | `LOG_OFF`, `LOG_FATAL`, `LOG_ERROR`, `LOG_INFO`, `LOG_DEBUG`, `LOG_TRACE`, or `LOG_ALL` | `LOG_WARNING` `LogOutput` | Location for storing driver logs. | `string` | `WIN: C:\`, `MAC: /tmp` You need administrative privileges to change the logging options. {: .note } - ## Connecting to Tableau Pre-requisites: @@ -183,13 +181,14 @@ Pre-requisites: 1. Start Tableau. Under the **Connect** section, go to **To a Server** and choose **Other Databases (ODBC)**. -1. In the **DSN drop-down**, select the OpenSearch DSN you set up in the previous set of steps. The options you added will be automatically filled into the **Connection Attributes**. +2. In the **DSN drop-down**, select the OpenSearch DSN you set up in the previous set of steps. The options you added will be automatically filled in under the **Connection Attributes**. -1. Select **Sign In**. After a few seconds, Tableau connects to your OpenSearch server. Once connected, you will directed to **Datasource** window. 
The **Database** will be already populated with name of the OpenSearch cluster. +3. Select **Sign In**. After a few seconds, Tableau connects to your OpenSearch server. Once connected, you will be directed to the **Datasource** window. The **Database** will already be populated with the name of the OpenSearch cluster. To list all the indices, click the search icon under **Table**. -1. Start playing with data by dragging table to connection area. Choose **Update Now** or **Automatically Update** to populate table data. +4. Start experimenting with data by dragging the table to the connection area. Choose **Update Now** or **Automatically Update** to populate the table data. +For more detailed instructions, see the [GitHub repository](https://github.com/opensearch-project/sql/blob/main/sql-odbc/docs/user/tableau_support.md). ### Troubleshooting @@ -203,4 +202,6 @@ This is most likely due to OpenSearch server not running on **host** and **post* Confirm **host** and **post** are correct and OpenSearch server is running with OpenSearch SQL plugin. Also make sure `.tdc` that was downloaded with the installer is copied correctly to `/Documents/My Tableau Repository/Datasources` directory. -{% endcomment %} +## Connecting to Microsoft Power BI + +Follow the [installation instructions](https://github.com/opensearch-project/sql/blob/main/bi-connectors/PowerBIConnector/README.md) and the [configuration instructions](https://github.com/opensearch-project/sql/blob/main/bi-connectors/PowerBIConnector/power_bi_support.md) published in the GitHub repository.
diff --git a/_search-plugins/sql/partiql.md b/_search-plugins/sql/sql/partiql.md similarity index 99% rename from _search-plugins/sql/partiql.md rename to _search-plugins/sql/sql/partiql.md index 1dfadb817d..16caa95e66 100644 --- a/_search-plugins/sql/partiql.md +++ b/_search-plugins/sql/sql/partiql.md @@ -2,7 +2,8 @@ layout: default title: JSON Support parent: SQL -nav_order: 7 +grand_parent: SQL and PPL +nav_order: 8 --- # JSON Support diff --git a/_search-plugins/sql/troubleshoot.md b/_search-plugins/sql/troubleshoot.md index d2d16f633f..678588d773 100644 --- a/_search-plugins/sql/troubleshoot.md +++ b/_search-plugins/sql/troubleshoot.md @@ -1,8 +1,8 @@ --- layout: default title: Troubleshooting -parent: SQL -nav_order: 17 +parent: SQL and PPL +nav_order: 88 --- # Troubleshooting diff --git a/_search-plugins/sql/workbench.md b/_search-plugins/sql/workbench.md index 18d593f3b4..4a97befa4c 100644 --- a/_search-plugins/sql/workbench.md +++ b/_search-plugins/sql/workbench.md @@ -2,6 +2,7 @@ layout: default title: Query Workbench parent: SQL +grand_parent: SQL and PPL nav_order: 1 ---