This repository has been archived by the owner on Aug 2, 2022. It is now read-only.

Enable new SQL query engine (#989)
* Enable new SQL engine

* Rename new engine configure method

* Fix broken IT

* Update doc

* Update doc
dai-chen authored Jan 26, 2021
1 parent 43f41bd commit c5ea315
Showing 17 changed files with 96 additions and 68 deletions.
18 changes: 1 addition & 17 deletions README.md
@@ -28,23 +28,7 @@ Please refer to the [SQL Language Reference Manual](./docs/user/index.rst), [Pip

## Experimental

Recently we have been actively improving our query engine, primarily for better correctness and extensibility. Behind the scenes, the new enhanced query engine already handles query processing for the newly released Piped Processing Language. Meanwhile, integration with the SQL language is also under way. To try out the new query engine with SQL, simply run the command to enable it via the [plugin setting](https://github.com/opendistro-for-elasticsearch/sql/blob/develop/docs/user/admin/settings.rst#opendistro-sql-engine-new-enabled). In a future release, it will be enabled by default and nothing will be required on your side. Please stay tuned for updates on our progress and its exciting new features.

Here is a list of documentation for features only available in this improved SQL query engine. Please follow the instructions above to enable it before trying out the example queries in these docs:

* [Identifiers](./docs/user/general/identifiers.rst): support for identifier names with special characters
* [Data types](./docs/user/general/datatypes.rst): new data types such as date time and interval
* [Expressions](./docs/user/dql/expressions.rst): new expression system that can represent and evaluate complex expressions
* [SQL functions](./docs/user/dql/functions.rst): many more string and date functions added
* [Basic queries](./docs/user/dql/basics.rst)
  * Ordering by Aggregate Functions section
  * NULLS FIRST/LAST in section Specifying Order for Null
* [Aggregations](./docs/user/dql/aggregations.rst): aggregation over expressions and other features
* [Complex queries](./docs/user/dql/complex.rst)
  * Improvement on Subqueries in FROM clause
* [Window functions](./docs/user/dql/window.rst): ranking and aggregate window function support

To avoid impact on your side, normally you won't see any difference in the query response. If you want to check whether and why your query falls back to the old SQL engine, please explain your query and check the Elasticsearch log for "Request is falling back to old SQL engine due to ...".
Recently we have been actively improving our query engine, primarily for better correctness and extensibility. Behind the scenes, the new enhanced engine already supports the newly released Piped Processing Language. However, it was experimental and disabled by default for SQL query processing. With the most important features and full testing complete, we are now ready to promote it to our default SQL query engine. Please find more details in [An Introduction to the New SQL Query Engine](/docs/dev/NewSQLEngine.md).


## Setup
73 changes: 73 additions & 0 deletions docs/dev/NewSQLEngine.md
@@ -0,0 +1,73 @@
# An Introduction to the New SQL Query Engine

---
## 1. Motivations

The current SQL query engine gives users basic query capability through familiar SQL rather than the complex Elasticsearch DSL. Building on NLPchina ES-SQL, it has gained many new features, such as a semantic analyzer, semi-structured data query support, and Hash Join. However, as we looked into more advanced SQL features, challenges started to emerge, especially in terms of correctness and extensibility (see [Attributions](../attributions.md)). After careful consideration, we decided to develop a new query engine to address the problems encountered so far.


---
## 2. What's New

With the architecture and extensibility significantly improved, the new query engine introduces the following SQL features:

* [Identifiers](/docs/user/general/identifiers.rst): Support for identifier names with special characters
* [Data types](/docs/user/general/datatypes.rst): New data types such as date time and interval
* [Expressions](/docs/user/dql/expressions.rst): New expression system that can represent and evaluate complex expressions
* [SQL functions](/docs/user/dql/functions.rst): Many more string and date functions added
* [Basic queries](/docs/user/dql/basics.rst)
  * Ordering by Aggregate Functions section
  * NULLS FIRST/LAST in section Specifying Order for Null
* [Aggregations](/docs/user/dql/aggregations.rst):
  * Aggregation over expression
  * Selective aggregation by FILTER function
* [Complex queries](/docs/user/dql/complex.rst)
  * Improvement on Subqueries in FROM clause
* [Window functions](/docs/user/dql/window.rst)
  * Ranking window functions
  * Aggregate window functions

As for correctness, besides full coverage by unit and integration tests, we developed a new comparison test framework that checks query results against other databases. Please find more details in [Testing](./Testing.md).


---
## 3. What's Changed

### 3.1 Breaking Changes

Because the implementation has changed internally, you can expect Explain output in a different format. For the query protocol, there are slight changes to the values of two fields in the default response format:

* **Schema**: Previously the `name` and `alias` values differed between queries. For consistency, `name` is now always the original text, and `alias` is the alias defined in the SELECT clause, or absent if none is defined.
* **Total**: The `total` field used to represent how many documents matched in total, no matter how many were returned (indicated by the `size` field). However, this field becomes meaningless because of post-processing of the DSL response in the new query engine. Thus, for now the `total` value is always the same as the `size` field.
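
The two changes above can be pictured with a small sketch. It assumes a response shaped like the default format, with the `schema`, `total` and `size` fields described here. The `type` and `datarows` fields, the sample rows, and both helper functions are hypothetical names used only for illustration; this is not the plugin's implementation.

```python
from typing import Any, Dict, List, Optional


def schema_entry(name: str, column_type: str, alias: Optional[str] = None) -> Dict[str, str]:
    """Build one schema entry as described above.

    `name` is always the original expression text; `alias` is present only
    when the SELECT clause defines one.
    """
    entry = {"name": name, "type": column_type}  # `type` is an assumed field
    if alias is not None:
        entry["alias"] = alias
    return entry


def build_response(schema: List[Dict[str, str]], datarows: List[List[Any]]) -> Dict[str, Any]:
    """Assemble a response in which `total` always equals `size`."""
    size = len(datarows)
    return {"schema": schema, "datarows": datarows, "total": size, "size": size}


# SELECT NULLIF(lastname, 'unknown') AS name FROM <index>  (rows are made up)
column = schema_entry("NULLIF(lastname, 'unknown')", "keyword", alias="name")
response = build_response([column], [["Duke"], ["Bond"]])
```

A column without an alias, such as `schema_entry("age", "long")`, simply has no `alias` key in this sketch.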

### 3.2 Limitations

You can find all the limitations in [Limitations](/docs/user/limitations/limitations.rst). Queries that use these unsupported features are forwarded to the old query engine by a fallback mechanism. To avoid impact on your side, normally you won't see any difference in the query response. If you want to check whether and why your query falls back to the old SQL engine, please explain your query and check the Elasticsearch log for "Request is falling back to old SQL engine due to ...".

In brief, the following common features are not yet supported in the new query engine:

* **Cursor**: request with `fetch_size` parameter
* **JSON response format**: will not be supported anymore in the new engine
* **Nested field query**: including support for object field and nested field queries
* **JOINs**: including all types of join queries
* **Elasticsearch functions**: full-text search, metric and bucket functions
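
One way to see why queries fell back is to scan the Elasticsearch log for the message quoted above. The sketch below is only an illustration: the log line layout and the reasons in the sample are made up, and `fallback_reasons` is a hypothetical helper. Only the marker text comes from this document.

```python
from typing import Iterable, List

# Marker text quoted in this document; whatever follows it is taken as the reason.
FALLBACK_MARKER = "Request is falling back to old SQL engine due to"


def fallback_reasons(log_lines: Iterable[str]) -> List[str]:
    """Return the text following the fallback marker for each matching log line."""
    reasons = []
    for line in log_lines:
        index = line.find(FALLBACK_MARKER)
        if index != -1:
            reasons.append(line[index + len(FALLBACK_MARKER):].strip())
    return reasons


# Hypothetical log lines; a real Elasticsearch log will look different.
sample_log = [
    "[INFO ] node started",
    "[INFO ] Request is falling back to old SQL engine due to example reason one",
    "[INFO ] Request is falling back to old SQL engine due to example reason two",
]
reasons = fallback_reasons(sample_log)
```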

### 3.3 What If Something Goes Wrong

Don't panic! You can easily roll back to the old query engine by changing a plugin setting. Simply run the command to disable the new engine via the [plugin setting](/docs/user/admin/settings.rst#opendistro-sql-engine-new-enabled). As with other cluster setting changes, there is no need to restart Elasticsearch; the change takes effect on the next incoming query. Afterwards, please report the issue to us.
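
As a rough sketch of that rollback, the request can also be built programmatically. The endpoint, setting key and `transient` body follow the example in `settings.rst`. The `http://` scheme on `localhost:9200` and the helper name `build_engine_toggle_request` are assumptions for illustration. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_engine_toggle_request(
    enabled: bool, host: str = "http://localhost:9200"
) -> urllib.request.Request:
    """Build (without sending) a PUT request that turns the new SQL engine on or off."""
    body = {
        "transient": {
            # The settings example passes the flag as a string, so do the same here.
            "opendistro.sql.engine.new.enabled": "true" if enabled else "false"
        }
    }
    return urllib.request.Request(
        url=f"{host}/_opendistro/_sql/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Roll back to the old engine.
rollback_request = build_engine_toggle_request(enabled=False)
```

Passing `rollback_request` to `urllib.request.urlopen` would send it to a running cluster.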


---
## 4. How It's Implemented

If you're interested in the new query engine, please find more details in the [Developer Guide](../developing.rst), [Architecture](./Architecture.md) and other docs in the dev folder.


---
## 5. What's Next

As mentioned in section 3.2 Limitations, some very popular SQL features are not yet supported in the new query engine. In particular, the following items are high priority on our roadmap:

1. Object/Nested field queries
2. JOIN support
3. Elasticsearch functions
6 changes: 3 additions & 3 deletions docs/user/admin/settings.rst
@@ -518,7 +518,7 @@ Description

We are migrating existing functionality to a new query engine under development. Users can choose to enable the new engine if interested, or disable it if any issue is found.

1. The default value is false.
1. The default value is true.
2. This setting is node scope.
3. This setting can be updated dynamically.

@@ -532,7 +532,7 @@ SQL query::

>> curl -H 'Content-Type: application/json' -X PUT localhost:9200/_opendistro/_sql/settings -d '{
"transient" : {
"opendistro.sql.engine.new.enabled" : "true"
"opendistro.sql.engine.new.enabled" : "false"
}
}'

@@ -546,7 +546,7 @@ Result set::
"sql" : {
"engine" : {
"new" : {
"enabled" : "true"
"enabled" : "false"
}
}
}
3 changes: 0 additions & 3 deletions doctest/test_docs.py
@@ -203,7 +203,4 @@ def load_tests(loader, suite, ignore):
# randomize order of tests to make sure they don't depend on each other
random.shuffle(tests)

# prepend a temporary doc to enable new engine so new SQL docs followed can pass
tests.insert(0, doc_suite('../docs/user/dql/newsql.rst'))

return DocTests(tests)
6 changes: 3 additions & 3 deletions integ-test/build.gradle
@@ -81,6 +81,9 @@ integTest {
systemProperty "user", System.getProperty("user")
systemProperty "password", System.getProperty("password")

// Disable new SQL engine
systemProperty 'enableNewEngine', 'false'

// Set default query size limit
systemProperty 'defaultQuerySizeLimit', '10000'

@@ -109,9 +112,6 @@ task integTestWithNewEngine(type: RestIntegTestTask) {
systemProperty "user", System.getProperty("user")
systemProperty "password", System.getProperty("password")

// Enable new SQL engine
systemProperty 'enableNewEngine', 'true'

// Set default query size limit
systemProperty 'defaultQuerySizeLimit', '10000'

@@ -21,7 +21,6 @@
import com.amazon.opendistroforelasticsearch.sql.legacy.utils.StringUtils;
import java.io.IOException;
import org.junit.Assume;
import org.junit.Ignore;
import org.junit.Test;

public class OrdinalAliasRewriterIT extends SQLIntegTestCase {
@@ -87,7 +87,7 @@ public void setUpIndices() throws Exception {
initClient();
}

enableNewQueryEngine();
configureNewQueryEngine();
resetQuerySizeLimit();
init();
}
@@ -147,15 +147,15 @@ public static void cleanUpIndices() throws IOException {
wipeAllClusterSettings();
}

private void enableNewQueryEngine() throws IOException {
private void configureNewQueryEngine() throws IOException {
boolean isEnabled = isNewQueryEngineEabled();
if (isEnabled) {
com.amazon.opendistroforelasticsearch.sql.util.TestUtils.enableNewQueryEngine(client());
if (!isEnabled) {
com.amazon.opendistroforelasticsearch.sql.util.TestUtils.disableNewQueryEngine(client());
}
}

protected boolean isNewQueryEngineEabled() {
return Boolean.parseBoolean(System.getProperty("enableNewEngine", "false"));
return Boolean.parseBoolean(System.getProperty("enableNewEngine", "true"));
}

protected void setQuerySizeLimit(Integer limit) throws IOException {
@@ -23,7 +23,6 @@
import com.amazon.opendistroforelasticsearch.sql.common.utils.StringUtils;
import com.amazon.opendistroforelasticsearch.sql.legacy.SQLIntegTestCase;
import com.amazon.opendistroforelasticsearch.sql.legacy.TestsConstants;
import com.amazon.opendistroforelasticsearch.sql.util.TestUtils;
import com.google.common.io.Resources;
import java.io.IOException;
import java.net.URI;
@@ -39,7 +38,6 @@ public class AdminIT extends SQLIntegTestCase {
@Override
public void init() throws Exception {
super.init();
TestUtils.enableNewQueryEngine(client());
loadIndex(Index.ACCOUNT);
}

@@ -21,13 +21,16 @@
import static com.amazon.opendistroforelasticsearch.sql.data.model.ExprValueUtils.LITERAL_TRUE;
import static com.amazon.opendistroforelasticsearch.sql.legacy.TestsConstants.TEST_INDEX_ACCOUNT;
import static com.amazon.opendistroforelasticsearch.sql.legacy.TestsConstants.TEST_INDEX_BANK_WITH_NULL_VALUES;
import static com.amazon.opendistroforelasticsearch.sql.util.MatcherUtils.*;
import static com.amazon.opendistroforelasticsearch.sql.util.MatcherUtils.hitAny;
import static com.amazon.opendistroforelasticsearch.sql.util.MatcherUtils.kvInt;
import static com.amazon.opendistroforelasticsearch.sql.util.MatcherUtils.rows;
import static com.amazon.opendistroforelasticsearch.sql.util.MatcherUtils.schema;
import static com.amazon.opendistroforelasticsearch.sql.util.MatcherUtils.verifyDataRows;
import static com.amazon.opendistroforelasticsearch.sql.util.MatcherUtils.verifySchema;
import static org.hamcrest.Matchers.equalTo;

import java.io.IOException;

import com.amazon.opendistroforelasticsearch.sql.legacy.SQLIntegTestCase;
import com.amazon.opendistroforelasticsearch.sql.util.TestUtils;
import java.io.IOException;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.common.xcontent.LoggingDeprecationHandler;
import org.elasticsearch.common.xcontent.NamedXContentRegistry;
@@ -36,16 +39,13 @@
import org.elasticsearch.common.xcontent.XContentType;
import org.elasticsearch.search.SearchHits;
import org.json.JSONObject;

import org.junit.Assume;
import org.junit.Test;

public class ConditionalIT extends SQLIntegTestCase {

@Override
public void init() throws Exception {
super.init();
TestUtils.enableNewQueryEngine(client());
loadIndex(Index.ACCOUNT);
loadIndex(Index.BANK_WITH_NULL_VALUES);
}
@@ -62,8 +62,6 @@ public void ifnullShouldPassJDBC() throws IOException {

@Test
public void ifnullWithNullInputTest() {
Assume.assumeTrue(isNewQueryEngineEabled());

JSONObject response = new JSONObject(executeQuery(
"SELECT IFNULL(null, firstname) as IFNULL1 ,"
+ " IFNULL(firstname, null) as IFNULL2 ,"
@@ -83,7 +81,6 @@ public void ifnullWithNullInputTest() {

@Test
public void ifnullWithMissingInputTest() {
Assume.assumeTrue(isNewQueryEngineEabled());
JSONObject response = new JSONObject(executeQuery(
"SELECT IFNULL(balance, 100) as IFNULL1, "
+ " IFNULL(200, balance) as IFNULL2, "
@@ -103,7 +100,6 @@ public void ifnullWithMissingInputTest() {

@Test
public void nullifShouldPassJDBC() throws IOException {
Assume.assumeTrue(isNewQueryEngineEabled());
JSONObject response = executeJdbcRequest(
"SELECT NULLIF(lastname, 'unknown') AS name FROM " + TEST_INDEX_ACCOUNT);
assertEquals("NULLIF(lastname, \'unknown\')", response.query("/schema/0/name"));
@@ -113,7 +109,6 @@ public void nullifShouldPassJDBC() throws IOException {

@Test
public void nullifWithNotNullInputTestOne(){
Assume.assumeTrue(isNewQueryEngineEabled());
JSONObject response = new JSONObject(executeQuery(
"SELECT NULLIF(firstname, 'Amber JOHnny') as testnullif "
+ "FROM " + TEST_INDEX_BANK_WITH_NULL_VALUES
@@ -128,7 +123,6 @@ public void nullifWithNotNullInputTestOne(){

@Test
public void nullifWithNullInputTest() {
Assume.assumeTrue(isNewQueryEngineEabled());
JSONObject response = new JSONObject(executeQuery(
"SELECT NULLIF(1/0, 123) as nullif1 ,"
+ " NULLIF(123, 1/0) as nullif2 ,"
@@ -147,7 +141,6 @@ public void nullifWithNullInputTest() {

@Test
public void isnullShouldPassJDBC() throws IOException {
Assume.assumeTrue(isNewQueryEngineEabled());
JSONObject response = executeJdbcRequest(
"SELECT ISNULL(lastname) AS name FROM " + TEST_INDEX_ACCOUNT);
assertEquals("ISNULL(lastname)", response.query("/schema/0/name"));
@@ -169,7 +162,6 @@ public void isnullWithNotNullInputTest() throws IOException {

@Test
public void isnullWithNullInputTest() {
Assume.assumeTrue(isNewQueryEngineEabled());
JSONObject response = new JSONObject(executeQuery(
"SELECT ISNULL(1/0) as ISNULL1 ,"
+ " ISNULL(firstname) as ISNULL2 "
@@ -198,7 +190,6 @@ public void isnullWithMathExpr() throws IOException{

@Test
public void ifShouldPassJDBC() throws IOException {
Assume.assumeTrue(isNewQueryEngineEabled());
JSONObject response = executeJdbcRequest(
"SELECT IF(2 > 0, \'hello\', \'world\') AS name FROM " + TEST_INDEX_ACCOUNT);
assertEquals("IF(2 > 0, \'hello\', \'world\')", response.query("/schema/0/name"));
@@ -208,7 +199,6 @@ public void ifShouldPassJDBC() throws IOException {

@Test
public void ifWithTrueAndFalseCondition() throws IOException {
Assume.assumeTrue(isNewQueryEngineEabled());
JSONObject response = new JSONObject(executeQuery(
"SELECT IF(2 < 0, firstname, lastname) as IF0, "
+ " IF(2 > 0, firstname, lastname) as IF1, "
@@ -25,7 +25,6 @@

import com.amazon.opendistroforelasticsearch.sql.common.utils.StringUtils;
import com.amazon.opendistroforelasticsearch.sql.legacy.SQLIntegTestCase;
import com.amazon.opendistroforelasticsearch.sql.util.TestUtils;
import java.io.IOException;
import java.util.Locale;
import org.elasticsearch.client.Request;
@@ -39,7 +38,6 @@ public class DateTimeFunctionIT extends SQLIntegTestCase {
@Override
public void init() throws Exception {
super.init();
TestUtils.enableNewQueryEngine(client());
loadIndex(Index.BANK);
}

@@ -21,7 +21,6 @@
import static org.hamcrest.Matchers.is;

import com.amazon.opendistroforelasticsearch.sql.legacy.RestIntegTestCase;
import com.amazon.opendistroforelasticsearch.sql.util.TestUtils;
import java.io.IOException;
import java.util.Locale;
import java.util.function.Function;
@@ -31,7 +30,6 @@
import org.elasticsearch.client.ResponseException;
import org.junit.Ignore;
import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.ExpectedException;

/**
@@ -48,7 +46,6 @@ public class ExpressionIT extends RestIntegTestCase {
@Override
protected void init() throws Exception {
super.init();
TestUtils.enableNewQueryEngine(client());
}

public ResponseExceptionAssertion expectResponseException() {
@@ -24,7 +24,6 @@
import static com.amazon.opendistroforelasticsearch.sql.util.TestUtils.getResponseBody;

import com.amazon.opendistroforelasticsearch.sql.legacy.SQLIntegTestCase;
import com.amazon.opendistroforelasticsearch.sql.util.TestUtils;
import java.io.IOException;
import java.util.Locale;
import org.elasticsearch.client.Request;
@@ -38,7 +37,6 @@ public class MathematicalFunctionIT extends SQLIntegTestCase {
@Override
public void init() throws Exception {
super.init();
TestUtils.enableNewQueryEngine(client());
loadIndex(Index.BANK);
}

@@ -20,7 +20,6 @@

import com.amazon.opendistroforelasticsearch.sql.legacy.SQLIntegTestCase;
import com.amazon.opendistroforelasticsearch.sql.legacy.metrics.MetricName;
import com.amazon.opendistroforelasticsearch.sql.util.TestUtils;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
@@ -38,7 +37,6 @@ public class MetricsIT extends SQLIntegTestCase {
@Override
protected void init() throws Exception {
loadIndex(Index.BANK);
TestUtils.enableNewQueryEngine(client());
}

@Test