diff --git a/docs/dev/SemanticAnalysis.md b/docs/dev/SemanticAnalysis.md new file mode 100644 index 0000000000..44c07d65ba --- /dev/null +++ b/docs/dev/SemanticAnalysis.md @@ -0,0 +1,329 @@ +# Semantic Analysis + +--- +## 1.Overview + +Previously SQL plugin didn't do semantic analysis, for example field doesn't exist, function call on field with wrong type etc. So it had to rely on Elasticsearch engine to perform the "check" which is actual execution. +This led to bad user experience because of missing careful verification, cost of actual execution and confusing error message. In this work, we built a new semantic analyzer based on the new ANTLR generated parser introduced recently. +With the new semantic analyzer, we manage to perform various verification in terms of meaning of the query and return clear and helpful message to user for troubleshoot. + +So in this work our initial goal is to add capability to perform basic semantic analysis including: + + 1. Check field name and found if any typo. + 2. Check if function is in use with correct arguments. + 3. Apart from basic check above, it would be nice to do some simple check for JOIN, subquery and multi-query etc. + +For both exception, we want to return useful message to customer and even suggest to change the wrong symbol to possibly right one. + +--- +## 2.Use Cases + +Firstly, you could go through the following examples of semantic check with our new analyzer. In the use cases, you should be able to get a taste of what benefits the new semantic analyzer is bring to us: + +### 2.1 Field Name Typo + +``` +POST _opendistro/_sql +{ + "query": "SELECT balace FROM accounts" +} + +{ + "error": { + "reason": "Invalid SQL query", + "details": "Field [balace] cannot be found or used here. Did you mean [balance]?", + "type": "SemanticAnalysisException" + }, + "status": 400 +} +``` + +### 2.2 Function Call on Incompatible Field Type + +``` +POST _opendistro/_sql +{ + "query": "SELECT * FROM accounts WHERE SUBSTRING(balance, 0, 1) = 'test'" +} + +{ + "error": { + "reason": "Invalid SQL query", + "details": "Function [SUBSTRING] cannot work with [LONG, INTEGER, INTEGER]. Usage: SUBSTRING(STRING T, INTEGER, INTEGER) -> T", + "type": "SemanticAnalysisException" + }, + "status": 400 +} +``` + +### 2.3 An index Join Non-nested Field + +``` +POST _opendistro/_sql +{ + "query": "SELECT * FROM accounts a, a.firstname" +} + +{ + "error": { + "reason": "Invalid SQL query", + "details": "Operator [JOIN] cannot work with [INDEX, TEXT]. Usage: Please join index with other index or its nested field.", + "type": "SemanticAnalysisException" + }, + "status": 400 +} +``` + +### 2.4 Wrong Reference in Subquery + +``` +POST _opendistro/_sql +{ + "query": "SELECT * FROM accounts a WHERE EXISTS (SELECT * FROM accounts b WHERE b.address LIKE 'Seattle') AND b.age > 10" +} + +{ + "error": { + "reason": "Invalid SQL query", + "details": "Field [b.age] cannot be found or used here. Did you mean [a.age]?", + "type": "SemanticAnalysisException" + }, + "status": 400 +} +``` + +### 2.5 Operator Use on Incompatible Field Type + +``` +POST _opendistro/_sql +{ + "query": "SELECT * FROM accounts WHERE lastname IS FALSE" +} + +{ + "error": { + "reason": "Invalid SQL query", + "details": "Operator [IS] cannot work with [TEXT, BOOLEAN]. Usage: Please use compatible types from each side.", + "type": "SemanticAnalysisException" + }, + "status": 400 +} +``` + +### 2.6 Subquery Return Incompatible Type + +``` +POST _opendistro/_sql +{ + "query": "SELECT * FROM accounts WHERE lastname IN (SELECT age FROM accounts)" +} + +{ + "error": { + "reason": "Invalid SQL query", + "details": "Operator [IN] cannot work with [TEXT, LONG]. Usage: Please return field(s) of compatible type from each query.", + "type": "SemanticAnalysisException" + }, + "status": 400 +} +``` + +### 2.7 Multi-query On Incompatible Type + +``` +POST _opendistro/_sql +{ + "query": "SELECT balance FROM accounts UNION ALL SELECT city FROM accounts" +} + +{ + "error": { + "reason": "Invalid SQL query", + "details": "Operator [UNION] cannot work with [LONG, TEXT]. Usage: Please return field(s) of compatible type from each query.", + "type": "SemanticAnalysisException" + }, + "status": 400 +} +``` + +--- +## 3.High Level Design + +The semantic analyzer consists of 2 core components: semantic context and type system. + +### 3.1 Semantic Context + +Semantic context manages Environment in a stack for scope management for nested subquery. Precisely, an Environment is the storage of symbols associated with its type. To perform analysis, looking up in the environments in the semantic context is first step. Only after determining the symbol exists and what's its type, further analysis like type checking can happen. We use the same terminology in compiler theory: + + * **Define**: stores symbol name along with its attribute (type only for now) to the current environment in context. + * **Resolve**: looks up symbol name to find its attribute associated. + +To avoid naming conflict, we need to introduce one more layer inside each environment - namespace. For example, it's supposed to be allowed to have field name or alias in SQL query which has the same name as built-in function such as `SUM`. To implement this, we divide each environment into 3 namespaces for better management: + + * **Field namespace**: field name loaded from index mapping. + * **Function namespace**: built-in function names. + * **Operator namespace**: mainly comparison operator such as `=`, `>`, `IS` etc. + +Here is a simple diagram showing what Semantic Context looks like at runtime: + +![What Semantic Context Looks Like](img/what-semantic-context-looks-like.png) + +### 3.2 Type System + +Type system allows for type check for all symbols present in the SQL query. First of all, we need to define what is type. Typically, type consists of 2 kinds: + + * **Base type**: is based on Elasticsearch data type and organized into hierarchy with "class" type internally. + * **ES data type**: For example, INTEGER and LONG belongs to NUMBER, TEXT and KEYWORD belongs to STRING. + * **ES index**: we also have specific type for index, index pattern and nested field. + * **Type expression**: is expression of multiple base type as argument type along with a constructor, for example, array constructor can construct integer to integer array, struct constructor can construct couple of base type into a new struct type. Similarly, function and comparison operator accepts arguments and generate result type. + * **Function**: including scalar and aggregate function in SQL standard as well as functions for Elasticsearch. + * **Operator**: including comparison operator (=, <, >), set operator (UNION, MINUS) and join operator (JOIN). + +But support for only simple type is not sufficient. The following special types needs to be covered: + + * **Generic type**: when we say `LOG(NUMBER) -> NUMBER` actually we want to return whatever input type (INTEGER, FLOAT) specifically instead of NUMBER. + * **Vararg**: for example, function `CONCAT` can apply to arbitrary number of strings. + * **Overloading**: function like `LOG` can have multiple specifications, one for not specifying base and another for base. + * **Named argument**: most functions for Elasticsearch feature use named argument, for example `TOPHITS('size'=3,'age'='desc'`. + * **Optional argument**: essentially this is same as function overloading. + +Currently we can add support for Generic Type and Overloading. To clarify, function specification involved in generic type like `SUBSTRING(func(T(STRING), INTEGER, INTEGER).to(T))` is similar as that in Java ` T Substring(T, int, int)`. +As for other unsupported feature, we use empty specification to convey the function exists in our type system but we want to skip type check for now. + +### 3.3 String Similarity + +Apart from the core components above, another interesting part is the string similarity algorithm that provides suggestion when semantic analysis error occurred. Currently we make use of classic edit distance algorithm in Lucene library to guess a similar symbol when we suspect user put wrong name. That's why you can see the "Did you mean XXX?" in the examples in Use Cases section. + + +--- +## 4.Detailed Design + +Basically the components of semantic analysis works together in the manner shown in the following diagram: + +![How Semantic Analysis Works](img/how-semantic-analysis-works.png) + +### 4.1 Parse Tree Visitor + +Parse Tree Visitor that walks through the SQL parse tree is driver of the whole semantic analysis process. So this section tries to make clear how Semantic Context, Type System and other parts work together by providing an example. +Suppose an index `accounts` has mapping as below: + +``` +"mappings": { + "account": { + "properties": { + "age": { + "type": "integer" + }, + "city": { + "type": "keyword" + }, + "birthday": { + "type": "date" + }, + "employer": { + "type": "text", + "fields": { + "keyword": { + "type": "keyword", + "ignore_above": 256 + } + } + }, + "projects": { + "type": "nested", + "properties": { + "members": { + "type": "nested", + "properties": { + "name": { + "type": "text" + } + } + }, + "active": { + "type": "boolean" + } + } + }, + "manager": { + "properties": { + "name": { + "type": "text" + } + } + } + } + } +} +``` + +Firstly, visitor needs to enforce the visiting order of SQL query. Because some clause like FROM is essentially the definition of symbol, it is required to be visited before other clause such as WHERE which is the resolution of symbol. Currently the visiting process is being performed in the following order: + + 1. **FROM**: define all symbols in index mapping in context for later resolution + 2. **WHERE** + 3. **SELECT**: the reason why SELECT visiting is so early is alias in SELECT could be used in GROUP BY, ex. SELECT SUBSTRING(city) substr ... GROUP BY substr + 4. **GROUP BY** + 5. **HAVING** + 6. **ORDER BY** + 7. **LIMIT** + +### 4.2 Context Initialization + +This part is done in `ESMappingLoader` visitor each of whose visit methods runs ahead of `TypeChecker`. Take query `SELECT * FROM accounts a, a.projects p WHERE age > 20 AND p.active IS TRUE` for example. After visiting the FROM clause, the context completes the initialization with symbol well defined as follows: + +``` + # field names without alias prefix because alias is optional + age -> INTEGER + city -> KEYWORD + birthday -> DATE + employer -> TEXT + employer.keyword -> KEYWORD + projects -> NESTED + projects.active -> BOOLEAN + projects.members -> NESTED + projects.members.name -> TEXT + manager -> OBJECT + manager.name -> TEXT + + # field names with alias prefix + a.age -> INTEGER + a.city -> KEYWORD + a.birthday -> DATE + a.employer -> TEXT + a.employer.keyword -> KEYWORD + a.projects -> NESTED + a.projects.active -> BOOLEAN + a.projects.members -> NESTED + a.projects.members.name -> TEXT + a.manager -> OBJECT + a.manager.name -> TEXT + + # nested field names with nested field alias prefix + p -> NESTED + p.active -> BOOLEAN + p.members -> NESTED + p.members.name -> TEXT +``` + +And then when we meet symbol in WHERE clause or elsewhere, we resolve the symbol, ex. `age` and `p.active`, in the context and identify its type. If not found, semantic analysis will end up throwing exception with root cause and suggestion. + +### 4.3 Type Checking + +The trivial field name symbol resolution is very straightforward. Let's take a look at how type checking works for function and operator. + + * **Leaf node**: simply resolve the symbol and return its type. Leaf node includes constant literal (number, boolean), function name, operator name etc. + * **Internal node**: + 1. **Define alias if any**: internal node such as `FROM` and `SELECT` may define alias for index or field. For example, SELECT AVG(age) AS avg or SELECT * FROM accounts a. + 2. **Synthesize types**: types returned from leaf node needs to be synthesized to a single type as result. Precisely, synthesize here means reduce multiple types into one by applying `construct` method defined by each type in our type system. + +![How Types Synthesized](img/how-types-synthesized.png) + +--- +## 5.What's Next + +Although we read many literature and other open source code for reference, the semantic analysis introduced today is far from being mature. For example, we don't have semantic check for only field in GROUP BY can be used in SELECT without aggregate function wrapped. +Beside improvement on semantic analyzer itself, there are other things we can benefit from: + + 1. **A `HELP` command**: which gets information in type system so customer doesn't necessarily learn what's right until they fail their query. + 2. **Correctness testing**: Generate test cases from grammar. Those cases can be used either for finding gaps between grammar/semantic and our backend code or for performing correctness testing by comparing with other based database. + 3. **Symbol table**: is useful for the entire process from semantic analysis here to logical and physical planning. So it should be either kept (flat to single table or avoid real popping when exit visit query) or annotated into Abstract Syntax Tree and pass to backend. \ No newline at end of file diff --git a/docs/dev/img/how-semantic-analysis-works.png b/docs/dev/img/how-semantic-analysis-works.png new file mode 100644 index 0000000000..f6c2a503dd Binary files /dev/null and b/docs/dev/img/how-semantic-analysis-works.png differ diff --git a/docs/dev/img/how-types-synthesized.png b/docs/dev/img/how-types-synthesized.png new file mode 100644 index 0000000000..957cf3da26 Binary files /dev/null and b/docs/dev/img/how-types-synthesized.png differ diff --git a/docs/dev/img/what-semantic-context-looks-like.png b/docs/dev/img/what-semantic-context-looks-like.png new file mode 100644 index 0000000000..8f932b3544 Binary files /dev/null and b/docs/dev/img/what-semantic-context-looks-like.png differ diff --git a/src/main/antlr/OpenDistroSqlLexer.g4 b/src/main/antlr/OpenDistroSqlLexer.g4 index f54869202e..3486b4af22 100644 --- a/src/main/antlr/OpenDistroSqlLexer.g4 +++ b/src/main/antlr/OpenDistroSqlLexer.g4 @@ -173,7 +173,9 @@ EXTENDED_STATS: 'EXTENDED_STATS'; FIELD: 'FIELD'; FILTER: 'FILTER'; GEO_BOUNDING_BOX: 'GEO_BOUNDING_BOX'; +GEO_CELL: 'GEO_CELL'; GEO_DISTANCE: 'GEO_DISTANCE'; +GEO_DISTANCE_RANGE: 'GEO_DISTANCE_RANGE'; GEO_INTERSECTS: 'GEO_INTERSECTS'; GEO_POLYGON: 'GEO_POLYGON'; HISTOGRAM: 'HISTOGRAM'; diff --git a/src/main/antlr/OpenDistroSqlParser.g4 b/src/main/antlr/OpenDistroSqlParser.g4 index c27c07c8a0..d4fe1b75e2 100644 --- a/src/main/antlr/OpenDistroSqlParser.g4 +++ b/src/main/antlr/OpenDistroSqlParser.g4 @@ -220,7 +220,7 @@ tableName ; fullColumnName - : uid (dottedId dottedId? )? + : uid dottedId* ; uid @@ -316,7 +316,6 @@ aggregateWindowedFunction scalarFunctionName : functionNameBase - | SUBSTRING | TRIM ; functionArgs @@ -345,7 +344,7 @@ expression predicate : predicate NOT? IN '(' (selectStatement | expressions) ')' #inPredicate | predicate IS nullNotnull #isNullPredicate - | left=predicate comparisonOperator right=predicate #binaryComparasionPredicate + | left=predicate comparisonOperator right=predicate #binaryComparisonPredicate | predicate NOT? BETWEEN predicate AND predicate #betweenPredicate | predicate NOT? LIKE predicate #likePredicate | predicate NOT? regex=REGEXP predicate #regexpPredicate @@ -399,16 +398,16 @@ keywordsCanBeId functionNameBase : esFunctionNameBase - | ABS | ASIN | ATAN | CBRT | CEIL | CONCAT | CONCAT_WS + | ABS | ASIN | ATAN | ATAN2 | CBRT | CEIL | CONCAT | CONCAT_WS | COS | COSH | DATE_FORMAT | DEGREES | E | EXP | EXPM1 | FLOOR | LOG | LOG10 | LOG2 | LOWER - | PI | POW | RADIANS | RANDOM | RINT - | SIN | SINH | TAN | UPPER | YEAR + | PI | POW | RADIANS | RANDOM | RINT | ROUND + | SIN | SINH | SQRT | SUBSTRING | TAN | TRIM | UPPER | YEAR ; esFunctionNameBase : DATE_HISTOGRAM | DAY_OF_MONTH | DAY_OF_YEAR | DAY_OF_WEEK | EXCLUDE - | EXTENDED_STATS | FILTER | GEO_BOUNDING_BOX | GEO_DISTANCE | GEO_INTERSECTS + | EXTENDED_STATS | FILTER | GEO_BOUNDING_BOX | GEO_CELL | GEO_DISTANCE | GEO_DISTANCE_RANGE | GEO_INTERSECTS | GEO_POLYGON | INCLUDE | IN_TERMS | HISTOGRAM | HOUR_OF_DAY | MATCHPHRASE | MATCH_PHRASE | MATCHQUERY | MATCH_QUERY | MINUTE_OF_DAY | MINUTE_OF_HOUR | MISSING | MONTH_OF_YEAR | MULTIMATCH | MULTI_MATCH | NESTED diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/OpenDistroSqlAnalyzer.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/OpenDistroSqlAnalyzer.java index 6dfd788e48..fdafb65702 100644 --- a/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/OpenDistroSqlAnalyzer.java +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/OpenDistroSqlAnalyzer.java @@ -17,28 +17,80 @@ import com.amazon.opendistroforelasticsearch.sql.antlr.parser.OpenDistroSqlLexer; import com.amazon.opendistroforelasticsearch.sql.antlr.parser.OpenDistroSqlParser; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.scope.SemanticContext; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.visitor.ESMappingLoader; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.visitor.SemanticAnalyzer; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.visitor.TypeChecker; import com.amazon.opendistroforelasticsearch.sql.antlr.syntax.CaseInsensitiveCharStream; import com.amazon.opendistroforelasticsearch.sql.antlr.syntax.SyntaxAnalysisErrorListener; +import com.amazon.opendistroforelasticsearch.sql.antlr.visitor.AntlrSqlParseTreeVisitor; +import com.amazon.opendistroforelasticsearch.sql.antlr.visitor.EarlyExitAnalysisException; +import com.amazon.opendistroforelasticsearch.sql.esdomain.LocalClusterState; import org.antlr.v4.runtime.CommonTokenStream; import org.antlr.v4.runtime.Lexer; import org.antlr.v4.runtime.tree.ParseTree; +import org.apache.logging.log4j.LogManager; +import org.apache.logging.log4j.Logger; /** * Entry point for ANTLR generated parser to perform strict syntax and semantic analysis. */ public class OpenDistroSqlAnalyzer { + private static final Logger LOG = LogManager.getLogger(); + + /** Original sql query */ + private final SqlAnalysisConfig config; + + public OpenDistroSqlAnalyzer(SqlAnalysisConfig config) { + this.config = config; + } + + public void analyze(String sql, LocalClusterState clusterState) { + // Perform analysis for SELECT only for now because of extra code changes required for SHOW/DESCRIBE. + if (!isSelectStatement(sql) || !config.isAnalyzerEnabled()) { + return; + } + + try { + analyzeSemantic( + analyzeSyntax(sql), + clusterState + ); + } catch (EarlyExitAnalysisException e) { + // Expected if configured so log on debug level to avoid always logging stack trace + LOG.debug("Analysis exits early and will skip remaining process", e); + } + } + /** - * Generate parse tree for the query to perform syntax and semantic analysis. + * Build lexer and parser to perform syntax analysis only. * Runtime exception with clear message is thrown for any verification error. * - * @param sql original query + * @return parse tree + */ + public ParseTree analyzeSyntax(String sql) { + OpenDistroSqlParser parser = createParser(createLexer(sql)); + parser.addErrorListener(new SyntaxAnalysisErrorListener()); + return parser.root(); + } + + /** + * Perform semantic analysis based on syntax analysis output - parse tree. + * + * @param tree parse tree + * @param clusterState cluster state required for index mapping query */ - public void analyze(String sql) { - analyzeSemantic( - analyzeSyntax( - createParser( - createLexer(sql)))); + public void analyzeSemantic(ParseTree tree, LocalClusterState clusterState) { + tree.accept(new AntlrSqlParseTreeVisitor<>(createAnalyzer(clusterState))); + } + + /** Factory method for semantic analyzer to help assemble all required components together */ + private SemanticAnalyzer createAnalyzer(LocalClusterState clusterState) { + SemanticContext context = new SemanticContext(); + ESMappingLoader mappingLoader = new ESMappingLoader(context, clusterState, config.getAnalysisThreshold()); + TypeChecker typeChecker = new TypeChecker(context, config.isFieldSuggestionEnabled()); + return new SemanticAnalyzer(mappingLoader, typeChecker); } private OpenDistroSqlParser createParser(Lexer lexer) { @@ -51,13 +103,10 @@ private OpenDistroSqlLexer createLexer(String sql) { new CaseInsensitiveCharStream(sql)); } - private ParseTree analyzeSyntax(OpenDistroSqlParser parser) { - parser.addErrorListener(new SyntaxAnalysisErrorListener()); - return parser.root(); - } - - private void analyzeSemantic(ParseTree tree) { - //TODO: implement semantic analysis in next stage + private boolean isSelectStatement(String sql) { + int endOfFirstWord = sql.indexOf(' '); + String firstWord = sql.substring(0, endOfFirstWord > 0 ? endOfFirstWord : sql.length()); + return "SELECT".equalsIgnoreCase(firstWord); } } diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/SimilarSymbols.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/SimilarSymbols.java new file mode 100644 index 0000000000..4efd55c707 --- /dev/null +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/SimilarSymbols.java @@ -0,0 +1,72 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr; + +import org.apache.lucene.search.spell.LevenshteinDistance; +import org.apache.lucene.search.spell.StringDistance; + +import java.util.Collection; +import java.util.Collections; +import java.util.Comparator; +import java.util.Optional; + +/** + * String similarity for finding most similar string. + */ +public class SimilarSymbols { + + /** LevenshteinDistance instance is basically a math util which is supposed to be thread safe */ + private static final StringDistance ALGORITHM = new LevenshteinDistance(); + + /** Symbol candidate list from which to pick one as most similar symbol to a target */ + private final Collection candidates; + + public SimilarSymbols(Collection candidates) { + this.candidates = Collections.unmodifiableCollection(candidates); + } + + /** + * Find most similar string in candidates by calculating similarity distance + * among target and candidate strings. + * + * @param target string to match + * @return most similar string to the target + */ + public String mostSimilarTo(String target) { + Optional closest = candidates.stream(). + map(candidate -> new SymbolDistance(candidate, target)). + max(Comparator.comparing(SymbolDistance::similarity)); + if (closest.isPresent()) { + return closest.get().candidate; + } + return target; + } + + /** Distance (similarity) between 2 symbols. This class is mainly for Java 8 stream comparator API */ + private static class SymbolDistance { + private final String candidate; + private final String target; + + private SymbolDistance(String candidate, String target) { + this.candidate = candidate; + this.target = target; + } + + public float similarity() { + return ALGORITHM.getDistance(candidate, target); + } + } +} \ No newline at end of file diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/SqlAnalysisConfig.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/SqlAnalysisConfig.java new file mode 100644 index 0000000000..7981006d0f --- /dev/null +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/SqlAnalysisConfig.java @@ -0,0 +1,60 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr; + +/** + * Configuration for SQL analysis. + */ +public class SqlAnalysisConfig { + + /** Is entire analyzer enabled to perform the analysis */ + private final boolean isAnalyzerEnabled; + + /** Is suggestion enabled for field name typo */ + private final boolean isFieldSuggestionEnabled; + + /** Skip entire analysis for index mapping larger than this threhold */ + private final int analysisThreshold; + + public SqlAnalysisConfig(boolean isAnalyzerEnabled, + boolean isFieldSuggestionEnabled, + int analysisThreshold) { + this.isAnalyzerEnabled = isAnalyzerEnabled; + this.isFieldSuggestionEnabled = isFieldSuggestionEnabled; + this.analysisThreshold = analysisThreshold; + } + + public boolean isAnalyzerEnabled() { + return isAnalyzerEnabled; + } + + public boolean isFieldSuggestionEnabled() { + return isFieldSuggestionEnabled; + } + + public int getAnalysisThreshold() { + return analysisThreshold; + } + + @Override + public String toString() { + return "SqlAnalysisConfig{" + + "isAnalyzerEnabled=" + isAnalyzerEnabled + + ", isFieldSuggestionEnabled=" + isFieldSuggestionEnabled + + ", analysisThreshold=" + analysisThreshold + + '}'; + } +} diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalysisException.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalysisException.java new file mode 100644 index 0000000000..3ef29d5e5b --- /dev/null +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalysisException.java @@ -0,0 +1,29 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic; + +import com.amazon.opendistroforelasticsearch.sql.antlr.SqlAnalysisException; + +/** + * Exception for semantic analysis + */ +public class SemanticAnalysisException extends SqlAnalysisException { + + public SemanticAnalysisException(String message) { + super(message); + } + +} diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/scope/Environment.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/scope/Environment.java new file mode 100644 index 0000000000..533d7e6b3d --- /dev/null +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/scope/Environment.java @@ -0,0 +1,104 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic.scope; + +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.Type; + +import java.util.HashMap; +import java.util.Map; +import java.util.Optional; + +/** + * Environment for symbol and its attribute (type) in the current scope + */ +public class Environment { + + private final Environment parent; + + private final SymbolTable symbolTable; + + public Environment(Environment parent) { + this.parent = parent; + this.symbolTable = new SymbolTable(); + } + + /** + * Define symbol with the type + * @param symbol symbol to define + * @param type type + */ + public void define(Symbol symbol, Type type) { + symbolTable.store(symbol, type); + } + + /** + * Resolve symbol in the environment + * @param symbol symbol to look up + * @return type if exist + */ + public Optional resolve(Symbol symbol) { + Optional type = Optional.empty(); + for (Environment cur = this; cur != null; cur = cur.parent) { + type = cur.symbolTable.lookup(symbol); + if (type.isPresent()) { + break; + } + } + return type; + } + + /** + * Resolve symbol definitions by a prefix. + * @param prefix a prefix of symbol + * @return all symbols with types that starts with the prefix + */ + public Map resolveByPrefix(Symbol prefix) { + Map typeByName = new HashMap<>(); + for (Environment cur = this; cur != null; cur = cur.parent) { + typeByName.putAll(cur.symbolTable.lookupByPrefix(prefix)); + } + return typeByName; + } + + /** + * Resolve all symbols in the namespace. + * @param namespace a namespace + * @return all symbols in the namespace + */ + public Map resolveAll(Namespace namespace) { + Map result = new HashMap<>(); + for (Environment cur = this; cur != null; cur = cur.parent) { + // putIfAbsent ensures inner most definition will be used (shadow outers) + cur.symbolTable.lookupAll(namespace).forEach(result::putIfAbsent); + } + return result; + } + + /** Current environment is root and no any symbol defined */ + public boolean isEmpty(Namespace namespace) { + for (Environment cur = this; cur != null; cur = cur.parent) { + if (!cur.symbolTable.isEmpty(namespace)) { + return false; + } + } + return true; + } + + public Environment getParent() { + return parent; + } + +} diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/scope/Namespace.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/scope/Namespace.java new file mode 100644 index 0000000000..014ddc46dd --- /dev/null +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/scope/Namespace.java @@ -0,0 +1,38 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic.scope; + +/** + * Namespace of symbol to avoid naming conflict + */ +public enum Namespace { + + FIELD_NAME("Field"), + FUNCTION_NAME("Function"), + OPERATOR_NAME("Operator"); + + private final String name; + + Namespace(String name) { + this.name = name; + } + + @Override + public String toString() { + return name; + } + +} diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/scope/SemanticContext.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/scope/SemanticContext.java new file mode 100644 index 0000000000..8ed28cd563 --- /dev/null +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/scope/SemanticContext.java @@ -0,0 +1,57 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic.scope; + +import java.util.Objects; + +/** + * Semantic context responsible for environment chain (stack) management and everything required for analysis. + * This context should be shared by different stages in future, particularly + * from semantic analysis to logical planning to physical planning. + */ +public class SemanticContext { + + /** Environment stack for symbol scope management */ + private Environment environment = new Environment(null); + + /** + * Push a new environment + */ + public void push() { + environment = new Environment(environment); + } + + /** + * Return current environment + * @return current environment + */ + public Environment peek() { + return environment; + } + + /** + * Pop up current environment from environment chain + * @return current environment (before pop) + */ + public Environment pop() { + Objects.requireNonNull(environment, "Fail to pop context due to no environment present"); + + Environment curEnv = environment; + environment = curEnv.getParent(); + return curEnv; + } + +} diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/scope/Symbol.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/scope/Symbol.java new file mode 100644 index 0000000000..17486215be --- /dev/null +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/scope/Symbol.java @@ -0,0 +1,45 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic.scope; + +/** + * Symbol in the scope + */ +public class Symbol { + + private final Namespace namespace; + + private final String name; + + public Symbol(Namespace namespace, String name) { + this.namespace = namespace; + this.name = name; + } + + public Namespace getNamespace() { + return namespace; + } + + public String getName() { + return name; + } + + @Override + public String toString() { + return namespace + " [" + name + "]"; + } + +} diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/scope/SymbolTable.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/scope/SymbolTable.java new file mode 100644 index 0000000000..234d759f11 --- /dev/null +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/scope/SymbolTable.java @@ -0,0 +1,93 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic.scope; + +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.Type; + +import java.util.EnumMap; +import java.util.Map; +import java.util.NavigableMap; +import java.util.Optional; +import java.util.TreeMap; + +import static java.util.Collections.emptyMap; +import static java.util.Collections.emptyNavigableMap; + +/** + * Symbol table for symbol definition and resolution. + */ +public class SymbolTable { + + /** Two-dimension hash table to manage symbols with type in different namespace */ + private Map> tableByNamespace = new EnumMap<>(Namespace.class); + + /** + * Store symbol with the type. Create new map for namespace for the first time. + * @param symbol symbol to define + * @param type symbol type + */ + public void store(Symbol symbol, Type type) { + tableByNamespace.computeIfAbsent( + symbol.getNamespace(), + ns -> new TreeMap<>() + ).put(symbol.getName(), type); + } + + /** + * Look up symbol in the namespace map. + * @param symbol symbol to look up + * @return symbol type which is optional + */ + public Optional lookup(Symbol symbol) { + Map table = tableByNamespace.get(symbol.getNamespace()); + Type type = null; + if (table != null) { + type = table.get(symbol.getName()); + } + return Optional.ofNullable(type); + } + + /** + * Look up symbols by a prefix. + * @param prefix a symbol prefix + * @return symbols starting with the prefix + */ + public Map lookupByPrefix(Symbol prefix) { + NavigableMap table = tableByNamespace.get(prefix.getNamespace()); + if (table != null) { + return table.subMap(prefix.getName(), prefix.getName() + Character.MAX_VALUE); + } + return emptyMap(); + } + + /** + * Look up all symbols in the namespace. + * @param namespace a namespace + * @return all symbols in the namespace map + */ + public Map lookupAll(Namespace namespace) { + return tableByNamespace.getOrDefault(namespace, emptyNavigableMap()); + } + + /** + * Check if namespace map in empty (none definition) + * @param namespace a namespace + * @return true for empty + */ + public boolean isEmpty(Namespace namespace) { + return tableByNamespace.getOrDefault(namespace, emptyNavigableMap()).isEmpty(); + } +} diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/Type.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/Type.java new file mode 100644 index 0000000000..28a73d141e --- /dev/null +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/Type.java @@ -0,0 +1,90 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types; + +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.SemanticAnalysisException; +import com.amazon.opendistroforelasticsearch.sql.antlr.visitor.Reducible; +import com.amazon.opendistroforelasticsearch.sql.utils.StringUtils; + +import java.util.List; +import java.util.stream.Collectors; + +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.TYPE_ERROR; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.UNKNOWN; + +/** + * Type interface which represents any type of symbol in the SQL. + */ +public interface Type extends Reducible { + + /** + * Hide generic type ugliness and error check here in one place. + */ + @SuppressWarnings("unchecked") + @Override + default T reduce(List others) { + List actualArgTypes = (List) others; + Type result = construct(actualArgTypes); + if (result != TYPE_ERROR) { + return (T) result; + } + + // Generate error message by current type name, argument types and usage of current type + // For example, 'Function [LOG] cannot work with [TEXT, INTEGER]. Usage: LOG(NUMBER) -> NUMBER + String actualArgTypesStr; + if (actualArgTypes.isEmpty()) { + actualArgTypesStr = ""; + } else { + actualArgTypesStr = actualArgTypes.stream(). + map(Type::usage). + collect(Collectors.joining(", ")); + } + + throw new SemanticAnalysisException( + StringUtils.format("%s cannot work with [%s]. Usage: %s", + this, actualArgTypesStr, usage())); + } + + /** + * Type descriptive name + * @return name + */ + String getName(); + + /** + * Check if current type is compatible with other of same type. + * @param other other type + * @return true if compatible + */ + default boolean isCompatible(Type other) { + return other == UNKNOWN || this == other; + } + + /** + * Construct a new type by applying current constructor on other types. + * Constructor is a generic conception that could be function, operator, join etc. + * + * @param others other types + * @return a new type as result + */ + Type construct(List others); + + /** + * Return typical usage of current type + * @return usage string + */ + String usage(); +} diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/TypeExpression.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/TypeExpression.java new file mode 100644 index 0000000000..f80d1f6945 --- /dev/null +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/TypeExpression.java @@ -0,0 +1,131 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types; + +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.special.Generic; +import com.amazon.opendistroforelasticsearch.sql.utils.StringUtils; + +import java.util.Arrays; +import java.util.List; +import java.util.function.Function; +import java.util.stream.Collectors; + +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.TYPE_ERROR; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.UNKNOWN; + +/** + * Type expression representing specification(s) of constructor such as function, operator etc. + * Type expression has to be an interface with default methods because most subclass needs to be Enum. + */ +public interface TypeExpression extends Type { + + @Override + default Type construct(List actualArgs) { + TypeExpressionSpec[] specifications = specifications(); + if (specifications.length == 0) { + // Empty spec means type check for this type expression is not implemented yet. + // Return this to be compatible with everything. + return UNKNOWN; + } + + // Create a temp specification for compatibility check. + TypeExpressionSpec actualSpec = new TypeExpressionSpec(); + actualSpec.argTypes = actualArgs.toArray(new Type[0]); + + // Perform compatibility check between actual spec (argument types) and expected. + // If found any compatible spec, it means actual spec is legal and thus apply to get result type. + // Ex. Actual=[INTEGER], Specs=[NUMBER->NUMBER], [STRING->NUMBER]. So first spec matches and return NUMBER. + for (TypeExpressionSpec spec : specifications) { + if (spec.isCompatible(actualSpec)) { + return spec.constructFunc.apply(actualArgs.toArray(new Type[0])); + } + } + return TYPE_ERROR; + } + + @Override + default String usage() { + return Arrays.stream(specifications()). + map(spec -> getName() + spec). + collect(Collectors.joining(" or ")); + } + + /** + * Each type expression may be overloaded and include multiple specifications. + * @return all valid specifications or empty which means not implemented yet + */ + TypeExpressionSpec[] specifications(); + + /** + * A specification is combination of a construct function and arg types + * for a type expression (represent a constructor) + */ + class TypeExpressionSpec { + Type[] argTypes; + Function constructFunc; + + public TypeExpressionSpec map(Type... args) { + this.argTypes = args; + return this; + } + + public TypeExpressionSpec to(Function constructFunc) { + // Required for generic type to replace placeholder ex.T with actual position in argument list. + // So construct function of generic type can return binding type finally. + this.constructFunc = Generic.specialize(constructFunc, argTypes); + return this; + } + + /** Return a base type no matter what's the arg types + Mostly this is used for empty arg types */ + public TypeExpressionSpec to(Type returnType) { + this.constructFunc = x -> returnType; + return this; + } + + public boolean isCompatible(TypeExpressionSpec otherSpec) { + Type[] expectArgTypes = this.argTypes; + Type[] actualArgTypes = otherSpec.argTypes; + + // Check if arg numbers exactly match + if (expectArgTypes.length != actualArgTypes.length) { + return false; + } + + // Check if all arg types are compatible + for (int i = 0; i < expectArgTypes.length; i++) { + if (!expectArgTypes[i].isCompatible(actualArgTypes[i])) { + return false; + } + } + return true; + } + + @Override + public String toString() { + String argTypesStr = Arrays.stream(argTypes). + map(Type::usage). + collect(Collectors.joining(", ")); + + // Only show generic type name in return value for clarity + Type returnType = constructFunc.apply(argTypes); + String returnTypeStr = (returnType instanceof Generic) ? returnType.getName() : returnType.usage(); + + return StringUtils.format("(%s) -> %s", argTypesStr, returnTypeStr); + } + } + +} diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/base/BaseType.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/base/BaseType.java new file mode 100644 index 0000000000..80d59186dc --- /dev/null +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/base/BaseType.java @@ -0,0 +1,36 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base; + +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.Type; + +import java.util.List; + +/** + * Base type interface + */ +public interface BaseType extends Type { + + @Override + default Type construct(List others) { + return this; + } + + @Override + default String usage() { + return getName(); + } +} diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/base/ESDataType.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/base/ESDataType.java new file mode 100644 index 0000000000..1be5440866 --- /dev/null +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/base/ESDataType.java @@ -0,0 +1,127 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base; + +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.Type; +import com.google.common.collect.ImmutableMap; + +import java.util.Map; + +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESIndex.IndexType.NESTED_FIELD; +import static com.amazon.opendistroforelasticsearch.sql.utils.StringUtils.toUpper; + +/** + * Base type hierarchy based on Elasticsearch data type + */ +public enum ESDataType implements BaseType { + + TYPE_ERROR, + UNKNOWN, + + SHORT, LONG, + INTEGER(SHORT, LONG), + FLOAT(INTEGER), + DOUBLE(FLOAT), + NUMBER(DOUBLE), + + KEYWORD, + TEXT(KEYWORD), + STRING(TEXT), + + DATE_NANOS, + DATE(DATE_NANOS, STRING), + + BOOLEAN, + + OBJECT, NESTED, + COMPLEX(OBJECT, NESTED), + + GEO_POINT, + + ES_TYPE( + NUMBER, + //STRING, move to under DATE because DATE is compatible + DATE, + BOOLEAN, + COMPLEX, + GEO_POINT + ); + + + /** + * Java Enum's valueOf() may thrown "enum constant not found" exception. + * And Java doesn't provide a contains method. + * So this static map is necessary for check and efficiency. + */ + private static final Map ALL_BASE_TYPES; + static { + ImmutableMap.Builder builder = new ImmutableMap.Builder<>(); + for (ESDataType type : ESDataType.values()) { + builder.put(type.name(), type); + } + ALL_BASE_TYPES = builder.build(); + } + + public static ESDataType typeOf(String str) { + return ALL_BASE_TYPES.getOrDefault(toUpper(str), UNKNOWN); + } + + /** Parent of current base type */ + private ESDataType parent; + + ESDataType(ESDataType... compatibleTypes) { + for (ESDataType subType : compatibleTypes) { + subType.parent = this; + } + } + + @Override + public String getName() { + return name(); + } + + /** + * For base type, compatibility means this (current type) is ancestor of other + * in the base type hierarchy. + */ + @Override + public boolean isCompatible(Type other) { + // Skip compatibility check if type is unknown + if (this == UNKNOWN || other == UNKNOWN) { + return true; + } + + if (!(other instanceof ESDataType)) { + // Nested data type is compatible with nested index type for type expression use + if (other instanceof ESIndex && ((ESIndex) other).type() == NESTED_FIELD) { + return isCompatible(NESTED); + } + return false; + } + + // One way compatibility: parent base type is compatible with children + ESDataType cur = (ESDataType) other; + while (cur != null && cur != this) { + cur = cur.parent; + } + return cur != null; + } + + @Override + public String toString() { + return "ES Data Type [" + getName() + "]"; + } +} diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/base/ESIndex.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/base/ESIndex.java new file mode 100644 index 0000000000..b9b42654f1 --- /dev/null +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/base/ESIndex.java @@ -0,0 +1,80 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base; + +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.Type; + +import java.util.Objects; + +/** + * Index type is not Enum because essentially each index is a brand new type. + */ +public class ESIndex implements BaseType { + + public enum IndexType { + INDEX, NESTED_FIELD, INDEX_PATTERN + } + + private final String indexName; + private final IndexType indexType; + + public ESIndex(String indexName, IndexType indexType) { + this.indexName = indexName; + this.indexType = indexType; + } + + public IndexType type() { + return indexType; + } + + @Override + public String getName() { + return indexName; + } + + @Override + public boolean isCompatible(Type other) { + return equals(other); + } + + @Override + public String usage() { + return indexType.name(); + } + + @Override + public String toString() { + return indexType + " [" + indexName + "]"; + } + + @Override + public boolean equals(Object o) { + if (this == o) { + return true; + } + if (o == null || getClass() != o.getClass()) { + return false; + } + ESIndex index = (ESIndex) o; + return Objects.equals(indexName, index.indexName) + && indexType == index.indexType; + } + + @Override + public int hashCode() { + return Objects.hash(indexName, indexType); + } +} diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/function/AggregateFunction.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/function/AggregateFunction.java new file mode 100644 index 0000000000..6d99c046a0 --- /dev/null +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/function/AggregateFunction.java @@ -0,0 +1,63 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.function; + +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.Type; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.TypeExpression; + +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.ES_TYPE; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.INTEGER; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.NUMBER; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.special.Generic.T; + +/** + * Aggregate function + */ +public enum AggregateFunction implements TypeExpression { + COUNT( + func().to(INTEGER), // COUNT(*) + func(ES_TYPE).to(INTEGER) + ), + MAX(func(T(NUMBER)).to(T)), + MIN(func(T(NUMBER)).to(T)), + AVG(func(T(NUMBER)).to(T)), + SUM(func(T(NUMBER)).to(T)); + + private TypeExpressionSpec[] specifications; + + AggregateFunction(TypeExpressionSpec... specifications) { + this.specifications = specifications; + } + + @Override + public String getName() { + return name(); + } + + @Override + public TypeExpressionSpec[] specifications() { + return specifications; + } + + private static TypeExpressionSpec func(Type... argTypes) { + return new TypeExpressionSpec().map(argTypes); + } + + @Override + public String toString() { + return "Function [" + name() + "]"; + } +} diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/function/ESScalarFunction.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/function/ESScalarFunction.java new file mode 100644 index 0000000000..4b8f2a9e45 --- /dev/null +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/function/ESScalarFunction.java @@ -0,0 +1,111 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.function; + +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.Type; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.TypeExpression; + +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.BOOLEAN; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.DATE; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.GEO_POINT; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.INTEGER; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.NUMBER; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.STRING; + +/** + * Elasticsearch special scalar functions + */ +public enum ESScalarFunction implements TypeExpression { + + DATE_HISTOGRAM(), // this is aggregate function + DAY_OF_MONTH(func(DATE).to(INTEGER)), + DAY_OF_YEAR(func(DATE).to(INTEGER)), + DAY_OF_WEEK(func(DATE).to(INTEGER)), + EXCLUDE(), // can only be used in SELECT? + EXTENDED_STATS(), // need confirm + FIELD(), // couldn't find test cases related + FILTER(), + GEO_BOUNDING_BOX(func(GEO_POINT, NUMBER, NUMBER, NUMBER, NUMBER).to(BOOLEAN)), + GEO_CELL(), // optional arg or overloaded spec is required. + GEO_DISTANCE(func(GEO_POINT, STRING, NUMBER, NUMBER).to(BOOLEAN)), + GEO_DISTANCE_RANGE(func(GEO_POINT, STRING, NUMBER, NUMBER).to(BOOLEAN)), + GEO_INTERSECTS(), //? + GEO_POLYGON(), // varargs is required for 2nd arg + HISTOGRAM(), // same as date_histogram + HOUR_OF_DAY(func(DATE).to(INTEGER)), + INCLUDE(), // same as exclude + IN_TERMS(), // varargs + MATCHPHRASE( + func(STRING, STRING).to(BOOLEAN), + func(STRING).to(STRING) + ), //slop arg is optional + MATCH_PHRASE(MATCHPHRASE.specifications()), + MATCHQUERY( + func(STRING, STRING).to(BOOLEAN), + func(STRING).to(STRING) + ), + MATCH_QUERY(MATCHQUERY.specifications()), + MINUTE_OF_DAY(func(DATE).to(INTEGER)), // or long? + MINUTE_OF_HOUR(func(DATE).to(INTEGER)), + MONTH_OF_YEAR(func(DATE).to(INTEGER)), + MULTIMATCH(), // kw arguments + MULTI_MATCH(MULTIMATCH.specifications()), + NESTED(), // overloaded + PERCENTILES(), //? + REGEXP_QUERY(), //? + REVERSE_NESTED(), // need overloaded + QUERY(func(STRING).to(BOOLEAN)), + RANGE(), // aggregate function + SCORE(), // semantic problem? + SECOND_OF_MINUTE(func(DATE).to(INTEGER)), + STATS(), + TERM(), // semantic problem + TERMS(), // semantic problem + TOPHITS(), // only available in SELECT + WEEK_OF_YEAR(func(DATE).to(INTEGER)), + WILDCARDQUERY( + func(STRING, STRING).to(BOOLEAN), + func(STRING).to(STRING) + ), + WILDCARD_QUERY(WILDCARDQUERY.specifications()); + + + private final TypeExpressionSpec[] specifications; + + ESScalarFunction(TypeExpressionSpec... specifications) { + this.specifications = specifications; + } + + @Override + public String getName() { + return name(); + } + + @Override + public TypeExpressionSpec[] specifications() { + return specifications; + } + + private static TypeExpressionSpec func(Type... argTypes) { + return new TypeExpressionSpec().map(argTypes); + } + + @Override + public String toString() { + return "Function [" + name() + "]"; + } + +} diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/function/ScalarFunction.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/function/ScalarFunction.java new file mode 100644 index 0000000000..ea0f164982 --- /dev/null +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/function/ScalarFunction.java @@ -0,0 +1,106 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.function; + +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.Type; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.TypeExpression; + +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.DATE; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.DOUBLE; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.INTEGER; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.NUMBER; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.STRING; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.special.Generic.T; + +/** + * Scalar SQL function + */ +public enum ScalarFunction implements TypeExpression { + + ABS(func(T(NUMBER)).to(T)), // translate to Java: T ABS(T) + ASIN(func(T(NUMBER)).to(T)), + ATAN(func(T(NUMBER)).to(T)), + ATAN2(func(T(NUMBER)).to(T)), + CBRT(func(T(NUMBER)).to(T)), + CEIL(func(T(NUMBER)).to(T)), + CONCAT(), // TODO: varargs support required + CONCAT_WS(), + COS(func(T(NUMBER)).to(T)), + COSH(func(T(NUMBER)).to(T)), + DATE_FORMAT( + func(DATE, STRING).to(STRING), + func(DATE, STRING, STRING).to(STRING) + ), + DEGREES(func(T(NUMBER)).to(T)), + E(func().to(DOUBLE)), + EXP(func(T(NUMBER)).to(T)), + EXPM1(func(T(NUMBER)).to(T)), + FLOOR(func(T(NUMBER)).to(T)), + LOG( + func(T(NUMBER)).to(T), + func(T(NUMBER), NUMBER).to(T) + ), + LOG2(func(T(NUMBER)).to(T)), + LOG10(func(T(NUMBER)).to(T)), + LOWER( + func(T(STRING)).to(T), + func(T(STRING), STRING).to(T) + ), + PI(func().to(DOUBLE)), + POW( + func(T(NUMBER)).to(T), + func(T(NUMBER), NUMBER).to(T) + ), + RADIANS(func(T(NUMBER)).to(T)), + RANDOM(func(T(NUMBER)).to(T)), + RINT(func(T(NUMBER)).to(T)), + ROUND(func(T(NUMBER)).to(T)), + SIN(func(T(NUMBER)).to(T)), + SINH(func(T(NUMBER)).to(T)), + SQRT(func(T(NUMBER)).to(T)), + SUBSTRING(func(T(STRING), INTEGER, INTEGER).to(T)), + TAN(func(T(NUMBER)).to(T)), + UPPER( + func(T(STRING)).to(T), + func(T(STRING), STRING).to(T) + ), + YEAR(func(DATE).to(INTEGER)); + + private final TypeExpressionSpec[] specifications; + + ScalarFunction(TypeExpressionSpec... specifications) { + this.specifications = specifications; + } + + @Override + public String getName() { + return name(); + } + + @Override + public TypeExpressionSpec[] specifications() { + return specifications; + } + + private static TypeExpressionSpec func(Type... argTypes) { + return new TypeExpressionSpec().map(argTypes); + } + + @Override + public String toString() { + return "Function [" + name() + "]"; + } +} diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/operator/ComparisonOperator.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/operator/ComparisonOperator.java new file mode 100644 index 0000000000..90b96f5cc1 --- /dev/null +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/operator/ComparisonOperator.java @@ -0,0 +1,74 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.operator; + +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.Type; + +import java.util.List; + +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.BOOLEAN; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.TYPE_ERROR; + +/** + * Type for comparison operator + */ +public enum ComparisonOperator implements Type { + + EQUAL("="), + NOT_EQUAL("<>"), + NOT_EQUAL2("!="), + GREATER_THAN(">"), + GREATER_THAN_OR_EQUAL_TO(">="), + SMALLER_THAN("<"), + SMALLER_THAN_OR_EQUAL_TO("<="), + IS("IS"); + + /** Actual name representing the operator */ + private final String name; + + ComparisonOperator(String name) { + this.name = name; + } + + @Override + public String getName() { + return name; + } + + @Override + public Type construct(List actualArgs) { + if (actualArgs.size() != 2) { + return TYPE_ERROR; + } + + Type leftType = actualArgs.get(0); + Type rightType = actualArgs.get(1); + if (leftType.isCompatible(rightType) || rightType.isCompatible(leftType)) { + return BOOLEAN; + } + return TYPE_ERROR; + } + + @Override + public String usage() { + return "Please use compatible types from each side."; + } + + @Override + public String toString() { + return "Operator [" + getName() + "]"; + } +} diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/operator/JoinOperator.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/operator/JoinOperator.java new file mode 100644 index 0000000000..ffba99ca60 --- /dev/null +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/operator/JoinOperator.java @@ -0,0 +1,57 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.operator; + +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.Type; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESIndex; + +import java.util.List; +import java.util.Optional; + +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.TYPE_ERROR; + +/** + * Join operator + */ +public enum JoinOperator implements Type { + JOIN; + + @Override + public String getName() { + return name(); + } + + @Override + public Type construct(List others) { + Optional isAnyNonIndexType = others.stream(). + filter(type -> !(type instanceof ESIndex)). + findAny(); + if (isAnyNonIndexType.isPresent()) { + return TYPE_ERROR; + } + return others.get(0); + } + + @Override + public String usage() { + return "Please join index with other index or its nested field."; + } + + @Override + public String toString() { + return "Operator [" + getName() + "]"; + } +} diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/operator/SetOperator.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/operator/SetOperator.java new file mode 100644 index 0000000000..5503c31b16 --- /dev/null +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/operator/SetOperator.java @@ -0,0 +1,65 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.operator; + +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.Type; + +import java.util.List; + +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.TYPE_ERROR; + +/** + * Set operator between queries. + */ +public enum SetOperator implements Type { + UNION, + MINUS, + IN; + + @Override + public String getName() { + return name(); + } + + @Override + public Type construct(List others) { + if (others.size() < 2) { + throw new IllegalStateException(""); + } + + // Compare each type and return anyone for now if pass + for (int i = 0; i < others.size() - 1; i++) { + Type type1 = others.get(i); + Type type2 = others.get(i + 1); + + // Do it again as in Product because single base type won't be wrapped in Product + if (!type1.isCompatible(type2) && !type2.isCompatible(type1)) { + return TYPE_ERROR; + } + } + return others.get(0); + } + + @Override + public String usage() { + return "Please return field(s) of compatible type from each query."; + } + + @Override + public String toString() { + return "Operator [" + getName() + "]"; + } +} diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/special/Generic.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/special/Generic.java new file mode 100644 index 0000000000..64f6025ad3 --- /dev/null +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/special/Generic.java @@ -0,0 +1,100 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.special; + +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.Type; +import com.amazon.opendistroforelasticsearch.sql.utils.StringUtils; + +import java.util.Arrays; +import java.util.List; +import java.util.function.Function; + +/** + * Generic type for more precise type expression + */ +public class Generic implements Type { + + /** Generic type placeholder namespace */ + private enum Name { T } + + /** Construct function to find generic type in argument list with same name */ + public static final Function T = types -> findSameGenericType(Name.T, types); + + /** Generic type name */ + private final Name name; + + /** Actual type binding to current generic type */ + private final Type binding; + + public Generic(Name name, Type type) { + this.name = name; + this.binding = type; + } + + public static Type T(Type type) { + return new Generic(Name.T, type); + } + + /** + * Return a function for replacing generic type in argument list with binding type. + * Ex. after T instance found in argument list [T(NUMBER), STRING], create function to return actualTypes[0] + * + * @param func function for finding generic type in argument list (namely, function T above) + * @param actualArgTypes actual argument types + */ + public static Function specialize(Function func, + Type[] actualArgTypes) { + if (func != T) { + return func; + } + + Type genericType = func.apply(actualArgTypes); + int genericTypeIndex = Arrays.asList(actualArgTypes).indexOf(genericType); + return actualTypes -> actualTypes[genericTypeIndex]; + } + + /** Find placeholder in argument list, ex. in [T(NUMBER), STRING] -> T, return instance at first T */ + private static Type findSameGenericType(Name name, Type[] types) { + return Arrays.stream(types). + filter(type -> type instanceof Generic). + filter(type -> ((Generic) type).name == name). + findFirst(). + orElseThrow(() -> new IllegalStateException(StringUtils.format( + "Type definition is wrong. Could not unbind generic type [%s] in type list %s.", + name, types)) + ); + } + + @Override + public String getName() { + return this.name.name(); + } + + @Override + public boolean isCompatible(Type other) { + return binding.isCompatible(other); + } + + @Override + public Type construct(List others) { + return binding.construct(others); + } + + @Override + public String usage() { + return binding.usage() + " " + name; + } +} diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/special/Product.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/special/Product.java new file mode 100644 index 0000000000..653bd3e1c0 --- /dev/null +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/special/Product.java @@ -0,0 +1,81 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.special; + +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.Type; + +import java.util.Collections; +import java.util.List; +import java.util.stream.Collectors; + +/** + * Combination of multiple types, ex. function arguments + */ +public class Product implements Type { + + private final List types; + + public Product(List itemTypes) { + types = Collections.unmodifiableList(itemTypes); + } + + @Override + public String getName() { + return "Product of types " + types; + } + + @Override + public boolean isCompatible(Type other) { + if (!(other instanceof Product)) { + return false; + } + + Product otherProd = (Product) other; + if (types.size() != otherProd.types.size()) { + return false; + } + + for (int i = 0; i < types.size(); i++) { + Type type = types.get(i); + Type otherType = otherProd.types.get(i); + if (!isCompatibleEitherWay(type, otherType)) { + return false; + } + } + return true; + } + + @Override + public Type construct(List others) { + return this; + } + + @Override + public String usage() { + if (types.isEmpty()) { + return "(*)"; + } + return types.stream(). + map(Type::usage). + collect(Collectors.joining(", ", "(", ")")); + } + + /** Perform two-way compatibility check here which is different from normal type expression */ + private boolean isCompatibleEitherWay(Type type1, Type type2) { + return type1.isCompatible(type2) || type2.isCompatible(type1); + } + +} diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/visitor/ESMappingLoader.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/visitor/ESMappingLoader.java new file mode 100644 index 0000000000..34c1ebc801 --- /dev/null +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/visitor/ESMappingLoader.java @@ -0,0 +1,210 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic.visitor; + +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.scope.Environment; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.scope.Namespace; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.scope.SemanticContext; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.scope.Symbol; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.Type; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESIndex; +import com.amazon.opendistroforelasticsearch.sql.antlr.visitor.EarlyExitAnalysisException; +import com.amazon.opendistroforelasticsearch.sql.antlr.visitor.GenericSqlParseTreeVisitor; +import com.amazon.opendistroforelasticsearch.sql.esdomain.LocalClusterState; +import com.amazon.opendistroforelasticsearch.sql.esdomain.mapping.FieldMappings; +import com.amazon.opendistroforelasticsearch.sql.esdomain.mapping.IndexMappings; +import com.amazon.opendistroforelasticsearch.sql.utils.StringUtils; + +import java.util.Map; + +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESIndex.IndexType.INDEX; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESIndex.IndexType.NESTED_FIELD; + +/** + * Load index and nested field mapping into semantic context + */ +public class ESMappingLoader implements GenericSqlParseTreeVisitor { + + /** Semantic context shared in the semantic analysis process */ + private final SemanticContext context; + + /** Local cluster state for mapping query */ + private final LocalClusterState clusterState; + + /** Threshold to decide if continue the analysis */ + private final int threshold; + + public ESMappingLoader(SemanticContext context, LocalClusterState clusterState, int threshold) { + this.context = context; + this.clusterState = clusterState; + this.threshold = threshold; + } + + /* + * Suppose index 'accounts' includes 'name', 'age' and nested field 'projects' + * which includes 'name' and 'active'. + * + * 1. Define itself: + * ----- new definitions ----- + * accounts -> INDEX + * + * 2. Define without alias no matter if alias given: + * 'accounts' -> INDEX + * ----- new definitions ----- + * 'name' -> TEXT + * 'age' -> INTEGER + * 'projects' -> NESTED + * 'projects.name' -> KEYWORD + * 'projects.active' -> BOOLEAN + */ + @Override + public Type visitIndexName(String indexName) { + if (isNotNested(indexName)) { + defineIndexType(indexName); + loadAllFieldsWithType(indexName); + } + return defaultValue(); + } + + @Override + public void visitAs(String alias, Type type) { + if (!(type instanceof ESIndex)) { + return; + } + + ESIndex index = (ESIndex) type; + String indexName = type.getName(); + + if (index.type() == INDEX) { + String aliasName = alias.isEmpty() ? indexName : alias; + defineAllFieldNamesByAppendingAliasPrefix(indexName, aliasName); + } else if (index.type() == NESTED_FIELD) { + if (!alias.isEmpty()) { + defineNestedFieldNamesByReplacingWithAlias(indexName, alias); + } + } // else Do nothing for index pattern + } + + private void defineIndexType(String indexName) { + environment().define(new Symbol(Namespace.FIELD_NAME, indexName), new ESIndex(indexName, INDEX)); + } + + private void loadAllFieldsWithType(String indexName) { + FieldMappings mappings = getFieldMappings(indexName); + mappings.flat(this::defineFieldName); + } + + /* + * 3.1 Define with alias if given: ex."SELECT * FROM accounts a". + * 'accounts' -> INDEX + * 'name' -> TEXT + * 'age' -> INTEGER + * 'projects' -> NESTED + * 'projects.name' -> KEYWORD + * 'projects.active' -> BOOLEAN + * ----- new definitions ----- + * ['a' -> INDEX] -- this is done in semantic analyzer + * 'a.name' -> TEXT + * 'a.age' -> INTEGER + * 'a.projects' -> NESTED + * 'a.projects.name' -> KEYWORD + * 'a.projects.active' -> BOOLEAN + * + * 3.2 Otherwise define by index full name: ex."SELECT * FROM account" + * 'accounts' -> INDEX + * 'name' -> TEXT + * 'age' -> INTEGER + * 'projects' -> NESTED + * 'projects.name' -> KEYWORD + * 'projects.active' -> BOOLEAN + * ----- new definitions ----- + * 'accounts.name' -> TEXT + * 'accounts.age' -> INTEGER + * 'accounts.projects' -> NESTED + * 'accounts.projects.name' -> KEYWORD + * 'accounts.projects.active' -> BOOLEAN + */ + private void defineAllFieldNamesByAppendingAliasPrefix(String indexName, String alias) { + FieldMappings mappings = getFieldMappings(indexName); + mappings.flat((fieldName, type) -> defineFieldName(alias + "." + fieldName, type)); + } + + /* + * 3.3 Define with alias if given: ex."SELECT * FROM accounts a, a.project p" + * 'accounts' -> INDEX + * 'name' -> TEXT + * 'age' -> INTEGER + * 'projects' -> NESTED + * 'projects.name' -> KEYWORD + * 'projects.active' -> BOOLEAN + * 'a.name' -> TEXT + * 'a.age' -> INTEGER + * 'a.projects' -> NESTED + * 'a.projects.name' -> KEYWORD + * 'a.projects.active' -> BOOLEAN + * ----- new definitions ----- + * ['p' -> NESTED] -- this is done in semantic analyzer + * 'p.name' -> KEYWORD + * 'p.active' -> BOOLEAN + */ + private void defineNestedFieldNamesByReplacingWithAlias(String nestedFieldName, String alias) { + Map typeByFullName = environment().resolveByPrefix( + new Symbol(Namespace.FIELD_NAME, nestedFieldName)); + typeByFullName.forEach( + (fieldName, fieldType) -> defineFieldName(fieldName.replace(nestedFieldName, alias), fieldType) + ); + } + + /** + * Check if index name is NOT nested, for example. return true for index 'accounts' or '.kibana' + * but return false for nested field name 'a.projects'. + */ + private boolean isNotNested(String indexName) { + return indexName.indexOf('.', 1) == -1; // taking care of .kibana + } + + private FieldMappings getFieldMappings(String indexName) { + IndexMappings indexMappings = clusterState.getFieldMappings(new String[]{indexName}); + FieldMappings fieldMappings = indexMappings.firstMapping().firstMapping(); + + int size = fieldMappings.data().size(); + if (size > threshold) { + throw new EarlyExitAnalysisException(StringUtils.format( + "Index [%s] has [%d] fields more than threshold [%d]", indexName, size, threshold)); + } + return fieldMappings; + } + + private void defineFieldName(String fieldName, String type) { + if ("NESTED".equalsIgnoreCase(type)) { + defineFieldName(fieldName, new ESIndex(fieldName, NESTED_FIELD)); + } else { + defineFieldName(fieldName, ESDataType.typeOf(type)); + } + } + + private void defineFieldName(String fieldName, Type type) { + Symbol symbol = new Symbol(Namespace.FIELD_NAME, fieldName); + if (!environment().resolve(symbol).isPresent()) { // TODO: why? add test for name shadow + environment().define(symbol, type); + } + } + + private Environment environment() { + return context.peek(); + } +} diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/visitor/SemanticAnalyzer.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/visitor/SemanticAnalyzer.java new file mode 100644 index 0000000000..699bd29a40 --- /dev/null +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/visitor/SemanticAnalyzer.java @@ -0,0 +1,126 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic.visitor; + +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.Type; +import com.amazon.opendistroforelasticsearch.sql.antlr.visitor.GenericSqlParseTreeVisitor; + +import java.util.List; + +/** + * Main visitor implementation to drive the entire semantic analysis. + */ +public class SemanticAnalyzer implements GenericSqlParseTreeVisitor { + + private final ESMappingLoader mappingLoader; + + private final TypeChecker typeChecker; + + public SemanticAnalyzer(ESMappingLoader mappingLoader, TypeChecker typeChecker) { + this.mappingLoader = mappingLoader; + this.typeChecker = typeChecker; + } + + @Override + public void visitRoot() { + mappingLoader.visitRoot(); + typeChecker.visitRoot(); + } + + @Override + public void visitQuery() { + mappingLoader.visitQuery(); + typeChecker.visitQuery(); + } + + @Override + public void endVisitQuery() { + mappingLoader.endVisitQuery(); + typeChecker.endVisitQuery(); + } + + @Override + public Type visitSelect(List itemTypes) { + mappingLoader.visitSelect(itemTypes); + return typeChecker.visitSelect(itemTypes); + } + + @Override + public void visitAs(String alias, Type type) { + mappingLoader.visitAs(alias, type); + typeChecker.visitAs(alias, type); + } + + @Override + public Type visitIndexName(String indexName) { + mappingLoader.visitIndexName(indexName); + return typeChecker.visitIndexName(indexName); + } + + @Override + public Type visitFieldName(String fieldName) { + mappingLoader.visitFieldName(fieldName); + return typeChecker.visitFieldName(fieldName); + } + + @Override + public Type visitFunctionName(String funcName) { + mappingLoader.visitFunctionName(funcName); + return typeChecker.visitFunctionName(funcName); + } + + @Override + public Type visitOperator(String opName) { + mappingLoader.visitOperator(opName); + return typeChecker.visitOperator(opName); + } + + @Override + public Type visitString(String text) { + mappingLoader.visitString(text); + return typeChecker.visitString(text); + } + + @Override + public Type visitInteger(String text) { + mappingLoader.visitInteger(text); + return typeChecker.visitInteger(text); + } + + @Override + public Type visitFloat(String text) { + mappingLoader.visitFloat(text); + return typeChecker.visitFloat(text); + } + + @Override + public Type visitBoolean(String text) { + mappingLoader.visitBoolean(text); + return typeChecker.visitBoolean(text); + } + + @Override + public Type visitDate(String text) { + mappingLoader.visitDate(text); + return typeChecker.visitDate(text); + } + + @Override + public Type defaultValue() { + mappingLoader.defaultValue(); + return typeChecker.defaultValue(); + } +} diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/visitor/TypeChecker.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/visitor/TypeChecker.java new file mode 100644 index 0000000000..3d317409e6 --- /dev/null +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/visitor/TypeChecker.java @@ -0,0 +1,215 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic.visitor; + +import com.amazon.opendistroforelasticsearch.sql.antlr.SimilarSymbols; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.SemanticAnalysisException; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.scope.Environment; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.scope.Namespace; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.scope.SemanticContext; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.scope.Symbol; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.Type; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.TypeExpression; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.function.AggregateFunction; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.function.ESScalarFunction; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.function.ScalarFunction; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.operator.ComparisonOperator; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.operator.JoinOperator; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.operator.SetOperator; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.special.Product; +import com.amazon.opendistroforelasticsearch.sql.antlr.visitor.GenericSqlParseTreeVisitor; +import com.amazon.opendistroforelasticsearch.sql.utils.StringUtils; + +import java.util.List; +import java.util.Optional; +import java.util.Set; + +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.UNKNOWN; + +/** + * SQL semantic analyzer that determines if a syntactical correct query is meaningful. + */ +public class TypeChecker implements GenericSqlParseTreeVisitor { + + private static final Type NULL_TYPE = new Type() { + @Override + public String getName() { + return "NULL"; + } + + @Override + public boolean isCompatible(Type other) { + throw new IllegalStateException("Compatibility check on NULL type with " + other); + } + + @Override + public Type construct(List others) { + throw new IllegalStateException("Construct operation on NULL type with " + others); + } + + @Override + public String usage() { + throw new IllegalStateException("Usage print operation on NULL type"); + } + }; + + /** Semantic context for symbol scope management */ + private final SemanticContext context; + + /** Should suggestion provided. Disabled by default for security concern. */ + private final boolean isSuggestEnabled; + + public TypeChecker(SemanticContext context) { + this.context = context; + this.isSuggestEnabled = false; + } + + public TypeChecker(SemanticContext context, boolean isSuggestEnabled) { + this.context = context; + this.isSuggestEnabled = isSuggestEnabled; + } + + @Override + public void visitRoot() { + defineFunctionNames(ScalarFunction.values()); + defineFunctionNames(ESScalarFunction.values()); + defineFunctionNames(AggregateFunction.values()); + defineOperatorNames(ComparisonOperator.values()); + defineOperatorNames(SetOperator.values()); + defineOperatorNames(JoinOperator.values()); + } + + @Override + public void visitQuery() { + context.push(); + } + + @Override + public void endVisitQuery() { + context.pop(); + } + + @Override + public Type visitSelect(List itemTypes) { + if (itemTypes.size() == 1) { + return itemTypes.get(0); + } + // Return product for empty (SELECT *) and #items > 1 + return new Product(itemTypes); + } + + @Override + public void visitAs(String alias, Type type) { + defineFieldName(alias, type); + } + + @Override + public Type visitIndexName(String indexName) { + return resolve(new Symbol(Namespace.FIELD_NAME, indexName)); + } + + @Override + public Type visitFieldName(String fieldName) { + // Bypass hidden fields which is not present in mapping, ex. _id, _type. + if (fieldName.startsWith("_")) { + return UNKNOWN; + } + // Ignore case for function/operator though field name is case sensitive + return resolve(new Symbol(Namespace.FIELD_NAME, fieldName)); + } + + @Override + public Type visitFunctionName(String funcName) { + return resolve(new Symbol(Namespace.FUNCTION_NAME, StringUtils.toUpper(funcName))); + } + + @Override + public Type visitOperator(String opName) { + return resolve(new Symbol(Namespace.OPERATOR_NAME, StringUtils.toUpper(opName))); + } + + @Override + public Type visitString(String text) { + return ESDataType.STRING; + } + + @Override + public Type visitInteger(String text) { + return ESDataType.INTEGER; + } + + @Override + public Type visitFloat(String text) { + return ESDataType.FLOAT; + } + + @Override + public Type visitBoolean(String text) { + // "IS [NOT] MISSING" can be used on any data type + return "MISSING".equalsIgnoreCase(text) ? UNKNOWN : ESDataType.BOOLEAN; + } + + @Override + public Type visitDate(String text) { + return ESDataType.DATE; + } + + @Override + public Type defaultValue() { + return NULL_TYPE; + } + + private void defineFieldName(String fieldName, Type type) { + Symbol symbol = new Symbol(Namespace.FIELD_NAME, fieldName); + if (!environment().resolve(symbol).isPresent()) { + environment().define(symbol, type); + } + } + + private void defineFunctionNames(TypeExpression[] expressions) { + for (TypeExpression expr : expressions) { + environment().define(new Symbol(Namespace.FUNCTION_NAME, expr.getName()), expr); + } + } + + private void defineOperatorNames(Type[] expressions) { + for (Type expr : expressions) { + environment().define(new Symbol(Namespace.OPERATOR_NAME, expr.getName()), expr); + } + } + + private Type resolve(Symbol symbol) { + Optional type = environment().resolve(symbol); + if (type.isPresent()) { + return type.get(); + } + + String errorMsg = StringUtils.format("%s cannot be found or used here.", symbol); + + if (isSuggestEnabled || symbol.getNamespace() != Namespace.FIELD_NAME) { + Set allSymbolsInScope = environment().resolveAll(symbol.getNamespace()).keySet(); + String suggestedWord = new SimilarSymbols(allSymbolsInScope).mostSimilarTo(symbol.getName()); + errorMsg += StringUtils.format(" Did you mean [%s]?", suggestedWord); + } + throw new SemanticAnalysisException(errorMsg); + } + + private Environment environment() { + return context.peek(); + } + +} diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/syntax/SyntaxAnalysisErrorListener.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/syntax/SyntaxAnalysisErrorListener.java index cbf9d09b9a..2011072f46 100644 --- a/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/syntax/SyntaxAnalysisErrorListener.java +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/syntax/SyntaxAnalysisErrorListener.java @@ -37,7 +37,7 @@ public void syntaxError(Recognizer recognizer, Object offendingSymbol, Token offendingToken = (Token) offendingSymbol; String query = tokens.getText(); - throw new SqlSyntaxAnalysisException( + throw new SyntaxAnalysisException( StringUtils.format( "Failed to parse query due to offending symbol [%s] at: '%s' <--- HERE... More details: %s", getOffendingText(offendingToken), diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/syntax/SqlSyntaxAnalysisException.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/syntax/SyntaxAnalysisException.java similarity index 86% rename from src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/syntax/SqlSyntaxAnalysisException.java rename to src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/syntax/SyntaxAnalysisException.java index 55b7b3d7de..3fa056b25b 100644 --- a/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/syntax/SqlSyntaxAnalysisException.java +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/syntax/SyntaxAnalysisException.java @@ -20,9 +20,9 @@ /** * Exception for syntax analysis */ -public class SqlSyntaxAnalysisException extends SqlAnalysisException { +public class SyntaxAnalysisException extends SqlAnalysisException { - public SqlSyntaxAnalysisException(String message) { + public SyntaxAnalysisException(String message) { super(message); } } diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/visitor/AntlrSqlParseTreeVisitor.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/visitor/AntlrSqlParseTreeVisitor.java new file mode 100644 index 0000000000..a33345ba39 --- /dev/null +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/visitor/AntlrSqlParseTreeVisitor.java @@ -0,0 +1,374 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.visitor; + +import com.amazon.opendistroforelasticsearch.sql.antlr.parser.OpenDistroSqlParser.FunctionArgsContext; +import com.amazon.opendistroforelasticsearch.sql.antlr.parser.OpenDistroSqlParser.InnerJoinContext; +import com.amazon.opendistroforelasticsearch.sql.antlr.parser.OpenDistroSqlParser.QuerySpecificationContext; +import com.amazon.opendistroforelasticsearch.sql.antlr.parser.OpenDistroSqlParser.SelectColumnElementContext; +import com.amazon.opendistroforelasticsearch.sql.antlr.parser.OpenDistroSqlParser.TableNamePatternContext; +import com.amazon.opendistroforelasticsearch.sql.antlr.parser.OpenDistroSqlParserBaseVisitor; +import org.antlr.v4.runtime.ParserRuleContext; +import org.antlr.v4.runtime.tree.ParseTree; + +import java.util.ArrayList; +import java.util.Arrays; +import java.util.List; +import java.util.stream.Collectors; + +import static com.amazon.opendistroforelasticsearch.sql.antlr.parser.OpenDistroSqlParser.AggregateWindowedFunctionContext; +import static com.amazon.opendistroforelasticsearch.sql.antlr.parser.OpenDistroSqlParser.AtomTableItemContext; +import static com.amazon.opendistroforelasticsearch.sql.antlr.parser.OpenDistroSqlParser.BinaryComparisonPredicateContext; +import static com.amazon.opendistroforelasticsearch.sql.antlr.parser.OpenDistroSqlParser.BooleanLiteralContext; +import static com.amazon.opendistroforelasticsearch.sql.antlr.parser.OpenDistroSqlParser.ComparisonOperatorContext; +import static com.amazon.opendistroforelasticsearch.sql.antlr.parser.OpenDistroSqlParser.ConstantContext; +import static com.amazon.opendistroforelasticsearch.sql.antlr.parser.OpenDistroSqlParser.DecimalLiteralContext; +import static com.amazon.opendistroforelasticsearch.sql.antlr.parser.OpenDistroSqlParser.FromClauseContext; +import static com.amazon.opendistroforelasticsearch.sql.antlr.parser.OpenDistroSqlParser.FullColumnNameContext; +import static com.amazon.opendistroforelasticsearch.sql.antlr.parser.OpenDistroSqlParser.FunctionNameBaseContext; +import static com.amazon.opendistroforelasticsearch.sql.antlr.parser.OpenDistroSqlParser.InPredicateContext; +import static com.amazon.opendistroforelasticsearch.sql.antlr.parser.OpenDistroSqlParser.IsExpressionContext; +import static com.amazon.opendistroforelasticsearch.sql.antlr.parser.OpenDistroSqlParser.MinusSelectContext; +import static com.amazon.opendistroforelasticsearch.sql.antlr.parser.OpenDistroSqlParser.OuterJoinContext; +import static com.amazon.opendistroforelasticsearch.sql.antlr.parser.OpenDistroSqlParser.PredicateContext; +import static com.amazon.opendistroforelasticsearch.sql.antlr.parser.OpenDistroSqlParser.RootContext; +import static com.amazon.opendistroforelasticsearch.sql.antlr.parser.OpenDistroSqlParser.ScalarFunctionCallContext; +import static com.amazon.opendistroforelasticsearch.sql.antlr.parser.OpenDistroSqlParser.SelectElementsContext; +import static com.amazon.opendistroforelasticsearch.sql.antlr.parser.OpenDistroSqlParser.SelectExpressionElementContext; +import static com.amazon.opendistroforelasticsearch.sql.antlr.parser.OpenDistroSqlParser.SelectFunctionElementContext; +import static com.amazon.opendistroforelasticsearch.sql.antlr.parser.OpenDistroSqlParser.SimpleTableNameContext; +import static com.amazon.opendistroforelasticsearch.sql.antlr.parser.OpenDistroSqlParser.StringLiteralContext; +import static com.amazon.opendistroforelasticsearch.sql.antlr.parser.OpenDistroSqlParser.TableAndTypeNameContext; +import static com.amazon.opendistroforelasticsearch.sql.antlr.parser.OpenDistroSqlParser.TableSourceBaseContext; +import static com.amazon.opendistroforelasticsearch.sql.antlr.parser.OpenDistroSqlParser.TableSourceItemContext; +import static com.amazon.opendistroforelasticsearch.sql.antlr.parser.OpenDistroSqlParser.TableSourcesContext; +import static com.amazon.opendistroforelasticsearch.sql.antlr.parser.OpenDistroSqlParser.UdfFunctionCallContext; +import static com.amazon.opendistroforelasticsearch.sql.antlr.parser.OpenDistroSqlParser.UidContext; +import static com.amazon.opendistroforelasticsearch.sql.antlr.parser.OpenDistroSqlParser.UnionSelectContext; +import static java.util.Collections.emptyList; +import static java.util.Collections.singleton; + +/** + * ANTLR parse tree visitor to drive the analysis process. + */ +public class AntlrSqlParseTreeVisitor extends OpenDistroSqlParserBaseVisitor { + + /** Generic visitor to perform the real action on parse tree */ + private final GenericSqlParseTreeVisitor visitor; + + public AntlrSqlParseTreeVisitor(GenericSqlParseTreeVisitor visitor) { + this.visitor = visitor; + } + + @Override + public T visitRoot(RootContext ctx) { + visitor.visitRoot(); + return super.visitRoot(ctx); + } + + @Override + public T visitUnionSelect(UnionSelectContext ctx) { + T union = visitor.visitOperator("UNION"); + return reduce(union, + asList( + ctx.querySpecification(), + ctx.unionStatement() + ) + ); + } + + @Override + public T visitMinusSelect(MinusSelectContext ctx) { + T minus = visitor.visitOperator("MINUS"); + return reduce(minus, asList(ctx.querySpecification(), ctx.minusStatement())); + } + + @Override + public T visitInPredicate(InPredicateContext ctx) { + T in = visitor.visitOperator("IN"); + PredicateContext field = ctx.predicate(); + ParserRuleContext subquery = (ctx.selectStatement() != null) ? ctx.selectStatement() : ctx.expressions(); + return reduce(in, Arrays.asList(field, subquery)); + } + + @Override + public T visitTableSources(TableSourcesContext ctx) { + if (ctx.tableSource().size() < 2) { + return super.visitTableSources(ctx); + } + T commaJoin = visitor.visitOperator("JOIN"); + return reduce(commaJoin, ctx.tableSource()); + } + + @Override + public T visitTableSourceBase(TableSourceBaseContext ctx) { + if (ctx.joinPart().isEmpty()) { + return super.visitTableSourceBase(ctx); + } + T join = visitor.visitOperator("JOIN"); + return reduce(join, asList(ctx.tableSourceItem(), ctx.joinPart())); + } + + @Override + public T visitInnerJoin(InnerJoinContext ctx) { + return visitJoin(ctx.children, ctx.tableSourceItem()); + } + + @Override + public T visitOuterJoin(OuterJoinContext ctx) { + return visitJoin(ctx.children, ctx.tableSourceItem()); + } + + /** + * Enforce visit order because ANTLR is generic and unaware. + * + * Visiting order is: + * FROM + * => WHERE + * => SELECT + * => GROUP BY + * => HAVING + * => ORDER BY + * => LIMIT + */ + @Override + public T visitQuerySpecification(QuerySpecificationContext ctx) { + visitor.visitQuery(); + + // Always visit FROM clause first to define symbols + FromClauseContext fromClause = ctx.fromClause(); + visit(fromClause.tableSources()); + + if (fromClause.whereExpr != null) { + visit(fromClause.whereExpr); + } + + // Note visit GROUP BY and HAVING later than SELECT for alias definition + T result = visitSelectElements(ctx.selectElements()); + fromClause.groupByItem().forEach(this::visit); + if (fromClause.havingExpr != null) { + visit(fromClause.havingExpr); + } + + if (ctx.orderByClause() != null) { + visitOrderByClause(ctx.orderByClause()); + } + if (ctx.limitClause() != null) { + visitLimitClause(ctx.limitClause()); + } + + visitor.endVisitQuery(); + return result; + } + + /** Visit here instead of tableName because we need alias */ + @Override + public T visitAtomTableItem(AtomTableItemContext ctx) { + String alias = (ctx.alias == null) ? "" : ctx.alias.getText(); + T result = visit(ctx.tableName()); + visitor.visitAs(alias, result); + return result; + } + + @Override + public T visitSimpleTableName(SimpleTableNameContext ctx) { + return visitor.visitIndexName(ctx.getText()); + } + + @Override + public T visitTableNamePattern(TableNamePatternContext ctx) { + throw new EarlyExitAnalysisException("Exit when meeting index pattern"); + } + + @Override + public T visitTableAndTypeName(TableAndTypeNameContext ctx) { + return visitor.visitIndexName(ctx.uid(0).getText()); + } + + @Override + public T visitFullColumnName(FullColumnNameContext ctx) { + return visitor.visitFieldName(ctx.getText()); + } + + @Override + public T visitUdfFunctionCall(UdfFunctionCallContext ctx) { + String funcName = ctx.fullId().getText(); + T func = visitor.visitFunctionName(funcName); + return reduce(func, ctx.functionArgs()); + } + + // This check should be able to accomplish in grammar + @Override + public T visitScalarFunctionCall(ScalarFunctionCallContext ctx) { + T func = visit(ctx.scalarFunctionName()); + return reduce(func, ctx.functionArgs()); + } + + @Override + public T visitSelectElements(SelectElementsContext ctx) { + return visitor.visitSelect(ctx.selectElement(). + stream(). + map(this::visit). + collect(Collectors.toList())); + } + + @Override + public T visitSelectColumnElement(SelectColumnElementContext ctx) { + return visitSelectItem(ctx.fullColumnName(), ctx.uid()); + } + + @Override + public T visitSelectFunctionElement(SelectFunctionElementContext ctx) { + return visitSelectItem(ctx.functionCall(), ctx.uid()); + } + + @Override + public T visitSelectExpressionElement(SelectExpressionElementContext ctx) { + return visitSelectItem(ctx.expression(), ctx.uid()); + } + + @Override + public T visitAggregateWindowedFunction(AggregateWindowedFunctionContext ctx) { + String funcName = ctx.getChild(0).getText(); + T func = visitor.visitFunctionName(funcName); + return reduce(func, ctx.functionArg()); + } + + @Override + public T visitFunctionNameBase(FunctionNameBaseContext ctx) { + return visitor.visitFunctionName(ctx.getText()); + } + + @Override + public T visitBinaryComparisonPredicate(BinaryComparisonPredicateContext ctx) { + if (isNamedArgument(ctx)) { // Essentially named argument is assign instead of comparison + return defaultResult(); + } + + T op = visit(ctx.comparisonOperator()); + return reduce(op, Arrays.asList(ctx.left, ctx.right)); + } + + @Override + public T visitIsExpression(IsExpressionContext ctx) { + T op = visitor.visitOperator("IS"); + return op.reduce(Arrays.asList( + visit(ctx.predicate()), + visitor.visitBoolean(ctx.testValue.getText())) + ); + } + + @Override + public T visitComparisonOperator(ComparisonOperatorContext ctx) { + return visitor.visitOperator(ctx.getText()); + } + + @Override + public T visitConstant(ConstantContext ctx) { + if (ctx.REAL_LITERAL() != null) { + return visitor.visitFloat(ctx.getText()); + } + if (ctx.dateType != null) { + return visitor.visitDate(ctx.getText()); + } + return super.visitConstant(ctx); + } + + @Override + public T visitStringLiteral(StringLiteralContext ctx) { + return visitor.visitString(ctx.getText()); + } + + @Override + public T visitDecimalLiteral(DecimalLiteralContext ctx) { + return visitor.visitInteger(ctx.getText()); + } + + @Override + public T visitBooleanLiteral(BooleanLiteralContext ctx) { + return visitor.visitBoolean(ctx.getText()); + } + + @Override + protected T defaultResult() { + return visitor.defaultValue(); + } + + @Override + protected T aggregateResult(T aggregate, T nextResult) { + if (nextResult != defaultResult()) { // Simply return non-default value for now + return nextResult; + } + return aggregate; + } + + /** Named argument, ex. TOPHITS('size'=3), is under FunctionArgs -> Predicate */ + private boolean isNamedArgument(BinaryComparisonPredicateContext ctx) { + return ctx.getParent() != null && ctx.getParent().getParent() != null + && ctx.getParent().getParent() instanceof FunctionArgsContext; + } + + /** Enforce visiting result of table instead of ON clause as result */ + private T visitJoin(List children, TableSourceItemContext tableCtx) { + T result = defaultResult(); + for (ParseTree child : children) { + if (child == tableCtx) { + result = visit(tableCtx); + } else { + visit(child); + } + } + return result; + } + + /** Visit select items for type check and alias definition */ + private T visitSelectItem(ParserRuleContext item, UidContext uid) { + T result = visit(item); + if (uid != null) { + visitor.visitAs(uid.getText(), result); + } + return result; + } + + private T reduce(T reducer, ParserRuleContext ctx) { + return reduce(reducer, (ctx == null) ? emptyList() : ctx.children); + } + + /** Make constructor apply arguments and return result type */ + private T reduce(T reducer, List nodes) { + List args; + if (nodes == null) { + args = emptyList(); + } else { + args = nodes.stream(). + map(this::visit). + filter(type -> type != defaultResult()). + collect(Collectors.toList()); + } + return reducer.reduce(args); + } + + /** Combine an item and a list of items to a single list */ + private + List asList(Node1 first, List rest) { + + List result = new ArrayList<>(singleton(first)); + result.addAll(rest); + return result; + } + +} diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/visitor/EarlyExitAnalysisException.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/visitor/EarlyExitAnalysisException.java new file mode 100644 index 0000000000..c9ee1d0289 --- /dev/null +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/visitor/EarlyExitAnalysisException.java @@ -0,0 +1,26 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.visitor; + +/** + * Exit visitor early due to some reason. + */ +public class EarlyExitAnalysisException extends RuntimeException { + + public EarlyExitAnalysisException(String message) { + super(message); + } +} diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/visitor/GenericSqlParseTreeVisitor.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/visitor/GenericSqlParseTreeVisitor.java new file mode 100644 index 0000000000..e9af6e1500 --- /dev/null +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/visitor/GenericSqlParseTreeVisitor.java @@ -0,0 +1,77 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.visitor; + +import java.util.List; + +/** + * Generic parse tree visitor without dependency on concrete parse tree class. + */ +public interface GenericSqlParseTreeVisitor { + + default void visitRoot() {} + + default void visitQuery() {} + + default void endVisitQuery() {} + + default T visitSelect(List items) { + return defaultValue(); + } + + default void visitAs(String alias, T type) {} + + default T visitIndexName(String indexName) { + return defaultValue(); + } + + default T visitFieldName(String fieldName) { + return defaultValue(); + } + + default T visitFunctionName(String funcName) { + return defaultValue(); + } + + default T visitOperator(String opName) { + return defaultValue(); + } + + default T visitString(String text) { + return defaultValue(); + } + + default T visitInteger(String text) { + return defaultValue(); + } + + default T visitFloat(String text) { + return defaultValue(); + } + + default T visitBoolean(String text) { + return defaultValue(); + } + + default T visitDate(String text) { + return defaultValue(); + } + + default T defaultValue() { + return null; + } + +} diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/visitor/Reducible.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/visitor/Reducible.java new file mode 100644 index 0000000000..7fd986cd45 --- /dev/null +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/antlr/visitor/Reducible.java @@ -0,0 +1,32 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.visitor; + +import java.util.List; + +/** + * Abstraction for anything that can be reduced and used by {@link AntlrSqlParseTreeVisitor}. + */ +public interface Reducible { + + /** + * Reduce current and others to generate a new one + * @param others others + * @return reduction + */ + T reduce(List others); + +} diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/esdomain/LocalClusterState.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/esdomain/LocalClusterState.java index a957f2fbe7..190c827c06 100644 --- a/src/main/java/com/amazon/opendistroforelasticsearch/sql/esdomain/LocalClusterState.java +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/esdomain/LocalClusterState.java @@ -15,29 +15,22 @@ package com.amazon.opendistroforelasticsearch.sql.esdomain; +import com.amazon.opendistroforelasticsearch.sql.esdomain.mapping.IndexMappings; import com.amazon.opendistroforelasticsearch.sql.plugin.SqlSettings; -import com.carrotsearch.hppc.cursors.ObjectObjectCursor; import com.google.common.cache.Cache; import com.google.common.cache.CacheBuilder; -import com.google.common.collect.ImmutableMap; import org.apache.logging.log4j.LogManager; import org.apache.logging.log4j.Logger; import org.elasticsearch.action.support.IndicesOptions; import org.elasticsearch.cluster.ClusterState; import org.elasticsearch.cluster.metadata.IndexNameExpressionResolver; -import org.elasticsearch.cluster.metadata.MappingMetaData; -import org.elasticsearch.cluster.metadata.MetaData; import org.elasticsearch.cluster.service.ClusterService; -import org.elasticsearch.common.collect.ImmutableOpenMap; import org.elasticsearch.common.collect.Tuple; import org.elasticsearch.common.settings.Setting; import org.elasticsearch.index.IndexNotFoundException; -import org.json.JSONObject; import java.io.IOException; import java.util.Arrays; -import java.util.Collection; -import java.util.HashMap; import java.util.List; import java.util.Map; import java.util.Objects; @@ -46,7 +39,6 @@ import java.util.function.Function; import java.util.function.Predicate; -import static java.util.Collections.emptyMap; import static org.elasticsearch.common.settings.Settings.EMPTY; /** @@ -250,263 +242,4 @@ private List sortToList(T[] array) { return Arrays.asList(array); } - /** - * Mappings interface to provide default implementation (minimal set of Map methods) for subclass in hierarchy. - * - * @param Type of nested mapping - */ - public interface Mappings { - - default boolean has(String name) { - return data().containsKey(name); - } - - default Collection allNames() { - return data().keySet(); - } - - default T mapping(String name) { - return data().get(name); - } - - default T firstMapping() { - return allMappings().iterator().next(); - } - - default Collection allMappings() { - return data().values(); - } - - default boolean isEmpty() { - return data().isEmpty(); - } - - Map data(); - } - - /** - * Index mappings in the cluster. - *

- * Sample: - * indexMappings: { - * 'accounts': typeMappings1, - * 'logs': typeMappings2 - * } - *

- * Difference between response of getMapping/clusterState and getFieldMapping: - *

- * 1) MappingMetadata: - * ((Map) ((Map) (mapping.get("bank").get("account").sourceAsMap().get("properties"))).get("balance")).get("type") - *

- * 2) FieldMetadata: - * ((Map) client.admin().indices().getFieldMappings(request).actionGet().mappings().get("bank") - * .get("account").get("balance").sourceAsMap().get("balance")).get("type") - */ - public static class IndexMappings implements Mappings { - - public static final IndexMappings EMPTY = new IndexMappings(); - - /** - * Mapping from Index name to mappings of all Types in it - */ - private final Map indexMappings; - - public IndexMappings() { - this.indexMappings = emptyMap(); - } - - public IndexMappings(MetaData metaData) { - this.indexMappings = buildMappings(metaData.indices(), - indexMetaData -> new TypeMappings(indexMetaData.getMappings())); - } - - public IndexMappings(ImmutableOpenMap> mappings) { - this.indexMappings = buildMappings(mappings, TypeMappings::new); - } - - @Override - public Map data() { - return indexMappings; - } - - @Override - public boolean equals(Object o) { - if (this == o) { - return true; - } - if (o == null || getClass() != o.getClass()) { - return false; - } - IndexMappings that = (IndexMappings) o; - return Objects.equals(indexMappings, that.indexMappings); - } - - @Override - public int hashCode() { - return Objects.hash(indexMappings); - } - - @Override - public String toString() { - return "IndexMappings{" + indexMappings + '}'; - } - } - - /** - * Type mappings in a specific index. - *

- * Sample: - * typeMappings: { - * '_doc': fieldMappings - * } - */ - public static class TypeMappings implements Mappings { - - /** - * Mapping from Type name to mappings of all Fields in it - */ - private final Map typeMappings; - - public TypeMappings(ImmutableOpenMap mappings) { - typeMappings = buildMappings(mappings, FieldMappings::new); - } - - @Override - public Map data() { - return typeMappings; - } - - @Override - public boolean equals(Object o) { - if (this == o) { - return true; - } - if (o == null || getClass() != o.getClass()) { - return false; - } - TypeMappings that = (TypeMappings) o; - return Objects.equals(typeMappings, that.typeMappings); - } - - @Override - public int hashCode() { - return Objects.hash(typeMappings); - } - - @Override - public String toString() { - return "TypeMappings{" + typeMappings + '}'; - } - } - - /** - * Field mappings in a specific type. - *

- * Sample: - * fieldMappings: { - * 'properties': { - * 'balance': { - * 'type': long - * }, - * 'age': { - * 'type': integer - * }, - * 'state': { - * 'type': text, - * } - * 'name': { - * 'type': text, - * 'fields': { - * 'keyword': { - * 'type': keyword, - * 'ignore_above': 256 - * } - * } - * } - * } - * } - */ - @SuppressWarnings("unchecked") - public static class FieldMappings implements Mappings> { - - private static final String PROPERTIES = "properties"; - - /** - * Mapping from field name to its type - */ - private final Map fieldMappings; - - public FieldMappings(MappingMetaData mappings) { - fieldMappings = mappings.sourceAsMap(); - } - - public FieldMappings(Map> mapping) { - Map finalMapping = new HashMap<>(); - finalMapping.put(PROPERTIES, mapping); - fieldMappings = finalMapping; - } - - @Override - public boolean has(String path) { - return mapping(path) != null; - } - - /** - * Different from default implementation that search mapping for path is required - */ - @Override - public Map mapping(String path) { - Map mapping = fieldMappings; - for (String name : path.split("\\.")) { - if (mapping == null || !mapping.containsKey(PROPERTIES)) { - return null; - } - - mapping = (Map) - ((Map) mapping.get(PROPERTIES)).get(name); - } - return mapping; - } - - @Override - public Map> data() { - // Is this assumption true? Is it possible mapping of field is NOT a Map? - return (Map>) fieldMappings.get(PROPERTIES); - } - - @Override - public boolean equals(Object o) { - if (this == o) { - return true; - } - if (o == null || getClass() != o.getClass()) { - return false; - } - FieldMappings that = (FieldMappings) o; - return Objects.equals(fieldMappings, that.fieldMappings); - } - - @Override - public int hashCode() { - return Objects.hash(fieldMappings); - } - - @Override - public String toString() { - return "FieldMappings" + new JSONObject(fieldMappings).toString(2); - } - - } - - /** - * Convert ES ImmutableOpenMap to JDK Map by applying function: U func(T) - */ - private static Map buildMappings(ImmutableOpenMap mappings, Function func) { - ImmutableMap.Builder builder = ImmutableMap.builder(); - for (ObjectObjectCursor mapping : mappings) { - builder.put(mapping.key, func.apply(mapping.value)); - } - return builder.build(); - } - } diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/esdomain/mapping/FieldMappings.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/esdomain/mapping/FieldMappings.java new file mode 100644 index 0000000000..f21a320d4c --- /dev/null +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/esdomain/mapping/FieldMappings.java @@ -0,0 +1,157 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.esdomain.mapping; + +import org.elasticsearch.cluster.metadata.MappingMetaData; +import org.json.JSONObject; + +import java.util.HashMap; +import java.util.Map; +import java.util.Objects; +import java.util.Optional; +import java.util.function.BiConsumer; + +/** + * Field mappings in a specific type. + *

+ * Sample: + * fieldMappings: { + * 'properties': { + * 'balance': { + * 'type': long + * }, + * 'age': { + * 'type': integer + * }, + * 'state': { + * 'type': text, + * } + * 'name': { + * 'type': text, + * 'fields': { + * 'keyword': { + * 'type': keyword, + * 'ignore_above': 256 + * } + * } + * } + * } + * } + */ +@SuppressWarnings("unchecked") +public class FieldMappings implements Mappings> { + + private static final String PROPERTIES = "properties"; + + /** + * Mapping from field name to its type + */ + private final Map fieldMappings; + + public FieldMappings(MappingMetaData mappings) { + fieldMappings = mappings.sourceAsMap(); + } + + public FieldMappings(Map> mapping) { + Map finalMapping = new HashMap<>(); + finalMapping.put(PROPERTIES, mapping); + fieldMappings = finalMapping; + } + + @Override + public boolean has(String path) { + return mapping(path) != null; + } + + /** + * Different from default implementation that search mapping for path is required + */ + @Override + public Map mapping(String path) { + Map mapping = fieldMappings; + for (String name : path.split("\\.")) { + if (mapping == null || !mapping.containsKey(PROPERTIES)) { + return null; + } + + mapping = (Map) + ((Map) mapping.get(PROPERTIES)).get(name); + } + return mapping; + } + + @Override + public Map> data() { + // Is this assumption true? Is it possible mapping of field is NOT a Map? + return (Map>) fieldMappings.get(PROPERTIES); + } + + public void flat(BiConsumer func) { + flatMappings(data(), Optional.empty(), func); + } + + @SuppressWarnings("unchecked") + private void flatMappings(Map> mappings, + Optional path, + BiConsumer func) { + mappings.forEach( + (fieldName, mapping) -> { + String fullFieldName = path.map(s -> s + "." + fieldName).orElse(fieldName); + String type = (String) mapping.getOrDefault("type", "object"); + func.accept(fullFieldName, type); + + if (mapping.containsKey("fields")) { + ((Map>) mapping.get("fields")).forEach( + (innerFieldName, innerMapping) -> + func.accept(fullFieldName + "." + innerFieldName, + (String) innerMapping.getOrDefault("type", "object")) + ); + } + + if (mapping.containsKey("properties")) { + flatMappings( + (Map>) mapping.get("properties"), + Optional.of(fullFieldName), + func + ); + } + } + ); + } + + @Override + public boolean equals(Object o) { + if (this == o) { + return true; + } + if (o == null || getClass() != o.getClass()) { + return false; + } + FieldMappings that = (FieldMappings) o; + return Objects.equals(fieldMappings, that.fieldMappings); + } + + @Override + public int hashCode() { + return Objects.hash(fieldMappings); + } + + @Override + public String toString() { + return "FieldMappings" + new JSONObject(fieldMappings).toString(2); + } + +} diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/esdomain/mapping/IndexMappings.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/esdomain/mapping/IndexMappings.java new file mode 100644 index 0000000000..283ba3ff30 --- /dev/null +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/esdomain/mapping/IndexMappings.java @@ -0,0 +1,93 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.esdomain.mapping; + +import org.elasticsearch.cluster.metadata.MappingMetaData; +import org.elasticsearch.cluster.metadata.MetaData; +import org.elasticsearch.common.collect.ImmutableOpenMap; + +import java.util.Map; +import java.util.Objects; + +import static java.util.Collections.emptyMap; + +/** + * Index mappings in the cluster. + *

+ * Sample: + * indexMappings: { + * 'accounts': typeMappings1, + * 'logs': typeMappings2 + * } + *

+ * Difference between response of getMapping/clusterState and getFieldMapping: + *

+ * 1) MappingMetadata: + * ((Map) ((Map) (mapping.get("bank").get("account").sourceAsMap().get("properties"))).get("balance")).get("type") + *

+ * 2) FieldMetadata: + * ((Map) client.admin().indices().getFieldMappings(request).actionGet().mappings().get("bank") + * .get("account").get("balance").sourceAsMap().get("balance")).get("type") + */ +public class IndexMappings implements Mappings { + + public static final IndexMappings EMPTY = new IndexMappings(); + + /** + * Mapping from Index name to mappings of all Types in it + */ + private final Map indexMappings; + + public IndexMappings() { + this.indexMappings = emptyMap(); + } + + public IndexMappings(MetaData metaData) { + this.indexMappings = buildMappings(metaData.indices(), + indexMetaData -> new TypeMappings(indexMetaData.getMappings())); + } + + public IndexMappings(ImmutableOpenMap> mappings) { + this.indexMappings = buildMappings(mappings, TypeMappings::new); + } + + @Override + public Map data() { + return indexMappings; + } + + @Override + public boolean equals(Object o) { + if (this == o) { + return true; + } + if (o == null || getClass() != o.getClass()) { + return false; + } + IndexMappings that = (IndexMappings) o; + return Objects.equals(indexMappings, that.indexMappings); + } + + @Override + public int hashCode() { + return Objects.hash(indexMappings); + } + + @Override + public String toString() { + return "IndexMappings{" + indexMappings + '}'; + } +} diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/esdomain/mapping/Mappings.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/esdomain/mapping/Mappings.java new file mode 100644 index 0000000000..156832e47f --- /dev/null +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/esdomain/mapping/Mappings.java @@ -0,0 +1,69 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.esdomain.mapping; + +import com.carrotsearch.hppc.cursors.ObjectObjectCursor; +import com.google.common.collect.ImmutableMap; +import org.elasticsearch.common.collect.ImmutableOpenMap; + +import java.util.Collection; +import java.util.Map; +import java.util.function.Function; + +/** + * Mappings interface to provide default implementation (minimal set of Map methods) for subclass in hierarchy. + * + * @param Type of nested mapping + */ +public interface Mappings { + + default boolean has(String name) { + return data().containsKey(name); + } + + default Collection allNames() { + return data().keySet(); + } + + default T mapping(String name) { + return data().get(name); + } + + default T firstMapping() { + return allMappings().iterator().next(); + } + + default Collection allMappings() { + return data().values(); + } + + default boolean isEmpty() { + return data().isEmpty(); + } + + Map data(); + + /** + * Convert ES ImmutableOpenMap to JDK Map by applying function: Y func(X) + */ + default Map buildMappings(ImmutableOpenMap mappings, Function func) { + ImmutableMap.Builder builder = ImmutableMap.builder(); + for (ObjectObjectCursor mapping : mappings) { + builder.put(mapping.key, func.apply(mapping.value)); + } + return builder.build(); + } +} diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/esdomain/mapping/TypeMappings.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/esdomain/mapping/TypeMappings.java new file mode 100644 index 0000000000..00f7b849c0 --- /dev/null +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/esdomain/mapping/TypeMappings.java @@ -0,0 +1,69 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.esdomain.mapping; + +import org.elasticsearch.cluster.metadata.MappingMetaData; +import org.elasticsearch.common.collect.ImmutableOpenMap; + +import java.util.Map; +import java.util.Objects; + +/** + * Type mappings in a specific index. + *

+ * Sample: + * typeMappings: { + * '_doc': fieldMappings + * } + */ +public class TypeMappings implements Mappings { + + /** + * Mapping from Type name to mappings of all Fields in it + */ + private final Map typeMappings; + + public TypeMappings(ImmutableOpenMap mappings) { + typeMappings = buildMappings(mappings, FieldMappings::new); + } + + @Override + public Map data() { + return typeMappings; + } + + @Override + public boolean equals(Object o) { + if (this == o) { + return true; + } + if (o == null || getClass() != o.getClass()) { + return false; + } + TypeMappings that = (TypeMappings) o; + return Objects.equals(typeMappings, that.typeMappings); + } + + @Override + public int hashCode() { + return Objects.hash(typeMappings); + } + + @Override + public String toString() { + return "TypeMappings{" + typeMappings + '}'; + } +} diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/plugin/RestSqlAction.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/plugin/RestSqlAction.java index aa61c69937..aab8dcc9d5 100644 --- a/src/main/java/com/amazon/opendistroforelasticsearch/sql/plugin/RestSqlAction.java +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/plugin/RestSqlAction.java @@ -16,6 +16,8 @@ package com.amazon.opendistroforelasticsearch.sql.plugin; import com.alibaba.druid.sql.parser.ParserException; +import com.amazon.opendistroforelasticsearch.sql.antlr.OpenDistroSqlAnalyzer; +import com.amazon.opendistroforelasticsearch.sql.antlr.SqlAnalysisConfig; import com.amazon.opendistroforelasticsearch.sql.antlr.SqlAnalysisException; import com.amazon.opendistroforelasticsearch.sql.esdomain.LocalClusterState; import com.amazon.opendistroforelasticsearch.sql.exception.SQLFeatureDisabledException; @@ -53,6 +55,9 @@ import java.util.function.Predicate; import java.util.regex.Pattern; +import static com.amazon.opendistroforelasticsearch.sql.plugin.SqlSettings.QUERY_ANALYSIS_ENABLED; +import static com.amazon.opendistroforelasticsearch.sql.plugin.SqlSettings.QUERY_ANALYSIS_SEMANTIC_SUGGESTION; +import static com.amazon.opendistroforelasticsearch.sql.plugin.SqlSettings.QUERY_ANALYSIS_SEMANTIC_THRESHOLD; import static com.amazon.opendistroforelasticsearch.sql.plugin.SqlSettings.SQL_ENABLED; import static org.elasticsearch.rest.RestStatus.BAD_REQUEST; import static org.elasticsearch.rest.RestStatus.OK; @@ -133,6 +138,8 @@ private static void logAndPublishMetrics(final Exception e) { private static QueryAction explainRequest(final NodeClient client, final SqlRequest sqlRequest) throws SQLFeatureNotSupportedException, SqlParseException { + performAnalysis(sqlRequest.getSql()); + final QueryAction queryAction = new SearchDao(client).explain(sqlRequest.getSql()); queryAction.setSqlRequest(sqlRequest); return queryAction; @@ -193,4 +200,16 @@ private boolean isSQLFeatureEnabled() { boolean isSqlEnabled = LocalClusterState.state().getSettingValue(SQL_ENABLED); return allowExplicitIndex && isSqlEnabled; } + + private static void performAnalysis(String sql) { + LocalClusterState clusterState = LocalClusterState.state(); + SqlAnalysisConfig config = new SqlAnalysisConfig( + clusterState.getSettingValue(QUERY_ANALYSIS_ENABLED), + clusterState.getSettingValue(QUERY_ANALYSIS_SEMANTIC_SUGGESTION), + clusterState.getSettingValue(QUERY_ANALYSIS_SEMANTIC_THRESHOLD) + ); + + OpenDistroSqlAnalyzer analyzer = new OpenDistroSqlAnalyzer(config); + analyzer.analyze(sql, clusterState); + } } diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/plugin/SqlSettings.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/plugin/SqlSettings.java index 9487f3df9f..0cd53cc5d7 100644 --- a/src/main/java/com/amazon/opendistroforelasticsearch/sql/plugin/SqlSettings.java +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/plugin/SqlSettings.java @@ -39,6 +39,8 @@ public class SqlSettings { public static final String SQL_ENABLED = "opendistro.sql.enabled"; public static final String QUERY_SLOWLOG = "opendistro.sql.query.slowlog"; public static final String QUERY_ANALYSIS_ENABLED = "opendistro.sql.query.analysis.enabled"; + public static final String QUERY_ANALYSIS_SEMANTIC_SUGGESTION = "opendistro.sql.query.analysis.semantic.suggestion"; + public static final String QUERY_ANALYSIS_SEMANTIC_THRESHOLD = "opendistro.sql.query.analysis.semantic.threshold"; public static final String METRICS_ROLLING_WINDOW = "opendistro.sql.metrics.rollingwindow"; public static final String METRICS_ROLLING_INTERVAL = "opendistro.sql.metrics.rollinginterval"; @@ -48,7 +50,15 @@ public SqlSettings() { Map> settings = new HashMap<>(); settings.put(SQL_ENABLED, Setting.boolSetting(SQL_ENABLED, true, NodeScope, Dynamic)); settings.put(QUERY_SLOWLOG, Setting.intSetting(QUERY_SLOWLOG, 2, NodeScope, Dynamic)); - settings.put(QUERY_ANALYSIS_ENABLED, Setting.boolSetting(QUERY_ANALYSIS_ENABLED, true, NodeScope, Dynamic)); + + // Settings for new ANTLR query analyzer + settings.put(QUERY_ANALYSIS_ENABLED, Setting.boolSetting( + QUERY_ANALYSIS_ENABLED, true, NodeScope, Dynamic)); + settings.put(QUERY_ANALYSIS_SEMANTIC_SUGGESTION, Setting.boolSetting( + QUERY_ANALYSIS_SEMANTIC_SUGGESTION, false, NodeScope, Dynamic)); + settings.put(QUERY_ANALYSIS_SEMANTIC_THRESHOLD, Setting.intSetting( + QUERY_ANALYSIS_SEMANTIC_THRESHOLD, 200, NodeScope, Dynamic)); + settings.put(METRICS_ROLLING_WINDOW, Setting.longSetting(METRICS_ROLLING_WINDOW, 3600L, 2L, NodeScope, Dynamic)); settings.put(METRICS_ROLLING_INTERVAL, Setting.longSetting(METRICS_ROLLING_INTERVAL, 60L, 1L, diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/query/ESActionFactory.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/query/ESActionFactory.java index dccecd9128..8e3b8703c3 100644 --- a/src/main/java/com/amazon/opendistroforelasticsearch/sql/query/ESActionFactory.java +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/query/ESActionFactory.java @@ -26,12 +26,10 @@ import com.alibaba.druid.sql.parser.SQLExprParser; import com.alibaba.druid.sql.parser.SQLStatementParser; import com.alibaba.druid.sql.parser.Token; -import com.amazon.opendistroforelasticsearch.sql.antlr.OpenDistroSqlAnalyzer; import com.amazon.opendistroforelasticsearch.sql.domain.Delete; import com.amazon.opendistroforelasticsearch.sql.domain.IndexStatement; import com.amazon.opendistroforelasticsearch.sql.domain.JoinSelect; import com.amazon.opendistroforelasticsearch.sql.domain.Select; -import com.amazon.opendistroforelasticsearch.sql.esdomain.LocalClusterState; import com.amazon.opendistroforelasticsearch.sql.exception.SqlParseException; import com.amazon.opendistroforelasticsearch.sql.executor.ElasticResultHandler; import com.amazon.opendistroforelasticsearch.sql.executor.QueryActionElasticExecutor; @@ -57,7 +55,6 @@ import java.util.List; import static com.amazon.opendistroforelasticsearch.sql.domain.IndexStatement.StatementType; -import static com.amazon.opendistroforelasticsearch.sql.plugin.SqlSettings.QUERY_ANALYSIS_ENABLED; public class ESActionFactory { @@ -76,9 +73,6 @@ public static QueryAction create(Client client, String sql) throws SqlParseExcep switch (getFirstWord(sql)) { case "SELECT": - // Perform analysis for SELECT only for now because of extra code changes required for SHOW/DESCRIBE. - performAnalysisIfEnabled(sql); - SQLQueryExpr sqlExpr = (SQLQueryExpr) toSqlExpr(sql); RewriteRuleExecutor ruleExecutor = RewriteRuleExecutor.builder() @@ -172,12 +166,6 @@ private static boolean isJoin(SQLQueryExpr sqlExpr, String sql) { && ((SQLJoinTableSource) query.getFrom()).getJoinType() != SQLJoinTableSource.JoinType.COMMA; } - private static void performAnalysisIfEnabled(String sql) { - if (LocalClusterState.state().getSettingValue(QUERY_ANALYSIS_ENABLED)) { - new OpenDistroSqlAnalyzer().analyze(sql); - } - } - private static SQLExpr toSqlExpr(String sql) { SQLExprParser parser = new ElasticSqlExprParser(sql); SQLExpr expr = parser.expr(); diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/rewriter/matchtoterm/TermFieldRewriter.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/rewriter/matchtoterm/TermFieldRewriter.java index 4722b1846d..28d0372d82 100644 --- a/src/main/java/com/amazon/opendistroforelasticsearch/sql/rewriter/matchtoterm/TermFieldRewriter.java +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/rewriter/matchtoterm/TermFieldRewriter.java @@ -33,6 +33,8 @@ import com.alibaba.druid.sql.dialect.mysql.visitor.MySqlASTVisitorAdapter; import com.alibaba.druid.sql.parser.ParserException; import com.amazon.opendistroforelasticsearch.sql.esdomain.LocalClusterState; +import com.amazon.opendistroforelasticsearch.sql.esdomain.mapping.FieldMappings; +import com.amazon.opendistroforelasticsearch.sql.esdomain.mapping.IndexMappings; import org.elasticsearch.client.Client; import org.json.JSONObject; @@ -44,9 +46,6 @@ import java.util.Set; import java.util.stream.Collectors; -import static com.amazon.opendistroforelasticsearch.sql.esdomain.LocalClusterState.FieldMappings; -import static com.amazon.opendistroforelasticsearch.sql.esdomain.LocalClusterState.IndexMappings; - /** * Visitor to rewrite AST (abstract syntax tree) for supporting term_query in WHERE and IN condition * Simple changing the matchQuery() to termQuery() will not work when mapping is both text and keyword diff --git a/src/main/java/com/amazon/opendistroforelasticsearch/sql/rewriter/matchtoterm/TermFieldScope.java b/src/main/java/com/amazon/opendistroforelasticsearch/sql/rewriter/matchtoterm/TermFieldScope.java index 2396c6dd2e..bcae8e593a 100644 --- a/src/main/java/com/amazon/opendistroforelasticsearch/sql/rewriter/matchtoterm/TermFieldScope.java +++ b/src/main/java/com/amazon/opendistroforelasticsearch/sql/rewriter/matchtoterm/TermFieldScope.java @@ -15,8 +15,8 @@ package com.amazon.opendistroforelasticsearch.sql.rewriter.matchtoterm; -import com.amazon.opendistroforelasticsearch.sql.esdomain.LocalClusterState.FieldMappings; -import com.amazon.opendistroforelasticsearch.sql.esdomain.LocalClusterState.IndexMappings; +import com.amazon.opendistroforelasticsearch.sql.esdomain.mapping.FieldMappings; +import com.amazon.opendistroforelasticsearch.sql.esdomain.mapping.IndexMappings; import java.util.HashMap; import java.util.Map; diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/SymbolSimilarityTest.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/SymbolSimilarityTest.java new file mode 100644 index 0000000000..825f2a2317 --- /dev/null +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/SymbolSimilarityTest.java @@ -0,0 +1,65 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr; + +import org.junit.Assert; +import org.junit.Test; + +import java.util.Arrays; +import java.util.List; + +import static java.util.Collections.emptyList; +import static java.util.Collections.singletonList; + +/** + * Test cases for symbol similarity + */ +public class SymbolSimilarityTest { + + @Test + public void noneCandidateShouldReturnTargetStringItself() { + String target = "test"; + String mostSimilarSymbol = new SimilarSymbols(emptyList()).mostSimilarTo(target); + Assert.assertEquals(target, mostSimilarSymbol); + } + + @Test + public void singleCandidateShouldReturnTheOnlyCandidate() { + String target = "test"; + String candidate = "hello"; + String mostSimilarSymbol = new SimilarSymbols(singletonList(candidate)).mostSimilarTo(target); + Assert.assertEquals(candidate, mostSimilarSymbol); + } + + @Test + public void twoCandidatesShouldReturnMostSimilarCandidate() { + String target = "test"; + String mostSimilar = "tests"; + List candidates = Arrays.asList("hello", mostSimilar); + String mostSimilarSymbol = new SimilarSymbols(candidates).mostSimilarTo(target); + Assert.assertEquals(mostSimilar, mostSimilarSymbol); + } + + @Test + public void manyCandidatesShouldReturnMostSimilarCandidate() { + String target = "test"; + String mostSimilar = "tests"; + List candidates = Arrays.asList("hello", mostSimilar, "world"); + String mostSimilarSymbol = new SimilarSymbols(candidates).mostSimilarTo(target); + Assert.assertEquals(mostSimilar, mostSimilarSymbol); + } + +} diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/SyntaxAnalysisTest.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/SyntaxAnalysisTest.java index 324f8fb673..25e83ea361 100644 --- a/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/SyntaxAnalysisTest.java +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/SyntaxAnalysisTest.java @@ -15,7 +15,7 @@ package com.amazon.opendistroforelasticsearch.sql.antlr; -import com.amazon.opendistroforelasticsearch.sql.antlr.syntax.SqlSyntaxAnalysisException; +import com.amazon.opendistroforelasticsearch.sql.antlr.syntax.SyntaxAnalysisException; import org.hamcrest.Matchers; import org.junit.Rule; import org.junit.Test; @@ -36,8 +36,7 @@ public class SyntaxAnalysisTest { @Rule public ExpectedException exception = ExpectedException.none(); - private OpenDistroSqlAnalyzer analyzer = new OpenDistroSqlAnalyzer(); - + private OpenDistroSqlAnalyzer analyzer = new OpenDistroSqlAnalyzer(new SqlAnalysisConfig(true, true, 1000)); /** In reality exception occurs before reaching new parser for now */ @Test @@ -136,7 +135,7 @@ public void arithmeticExpressionInWhereClauseShouldPass() { } private void expectValidationFailWithErrorMessage(String query, String... messages) { - exception.expect(SqlSyntaxAnalysisException.class); + exception.expect(SyntaxAnalysisException.class); exception.expectMessage(allOf(Arrays.stream(messages). map(Matchers::containsString). collect(toList()))); @@ -144,6 +143,6 @@ private void expectValidationFailWithErrorMessage(String query, String... messag } private void validate(String sql) { - analyzer.analyze(sql); + analyzer.analyzeSyntax(sql); } } diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalyzerAggregateFunctionTest.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalyzerAggregateFunctionTest.java new file mode 100644 index 0000000000..be89e428ec --- /dev/null +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalyzerAggregateFunctionTest.java @@ -0,0 +1,165 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic; + +import org.junit.Ignore; +import org.junit.Test; + +/** + * Semantic analysis test for aggregate functions. + */ +public class SemanticAnalyzerAggregateFunctionTest extends SemanticAnalyzerTestBase { + + @Ignore("To be implemented") + @Test(expected = SemanticAnalysisException.class) + public void useAggregateFunctionInWhereClauseShouldFail() { + validate("SELECT * FROM semantics WHERE AVG(balance) > 10000"); + } + + @Test + public void useAggregateFunctionInSelectClauseShouldPass() { + validate( + "SELECT" + + " city," + + " COUNT(*)," + + " MAX(age)," + + " MIN(balance)," + + " AVG(manager.salary)," + + " SUM(balance)" + + "FROM semantics " + + "GROUP BY city"); + } + + @Test + public void useAggregateFunctionInSelectClauseWithoutGroupByShouldPass() { + validate( + "SELECT" + + " COUNT(*)," + + " MAX(age)," + + " MIN(balance)," + + " AVG(manager.salary)," + + " SUM(balance)" + + "FROM semantics"); + } + + @Test + public void countFunctionCallOnAnyFieldShouldPass() { + validate( + "SELECT" + + " COUNT(address)," + + " COUNT(age)," + + " COUNT(birthday)," + + " COUNT(location)," + + " COUNT(manager.address)," + + " COUNT(employer)" + + "FROM semantics"); + } + + @Test + public void maxFunctionCallOnTextFieldShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT MAX(address) FROM semantics", + "Function [MAX] cannot work with [TEXT].", + "Usage: MAX(NUMBER T) -> T" + ); + } + + @Test + public void minFunctionCallOnDateFieldShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT MIN(birthday) FROM semantics", + "Function [MIN] cannot work with [DATE].", + "Usage: MIN(NUMBER T) -> T" + ); + } + + @Test + public void avgFunctionCallOnBooleanFieldShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT AVG(p.active) FROM semantics s, s.projects p", + "Function [AVG] cannot work with [BOOLEAN].", + "Usage: AVG(NUMBER T) -> T" + ); + } + + @Test + public void sumFunctionCallOnBooleanFieldShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT SUM(city) FROM semantics", + "Function [SUM] cannot work with [KEYWORD].", + "Usage: SUM(NUMBER T) -> T" + ); + } + + @Test + public void useAvgFunctionCallAliasInHavingClauseShouldPass() { + validate("SELECT city, AVG(age) AS avg FROM semantics GROUP BY city HAVING avg > 10"); + } + + @Test + public void useAvgAndMaxFunctionCallAliasInHavingClauseShouldPass() { + validate( + "SELECT city, AVG(age) AS avg, MAX(balance) AS bal FROM semantics " + + "GROUP BY city HAVING avg > 10 AND bal > 10000" + ); + } + + @Test + public void useAvgFunctionCallWithoutAliasInHavingShouldPass() { + validate("SELECT city, AVG(age) FROM semantics GROUP BY city HAVING AVG(age) > 10"); + } + + @Test + public void useDifferentAggregateFunctionInHavingClauseShouldPass() { + validate("SELECT city, AVG(age) FROM semantics GROUP BY city HAVING COUNT(*) > 10 AND SUM(balance) <= 10000"); + } + + @Test + public void useAvgFunctionCallAliasInOrderByClauseShouldPass() { + validate("SELECT city, AVG(age) AS avg FROM semantics GROUP BY city ORDER BY avg"); + } + + @Test + public void useAvgFunctionCallAliasInGroupByAndOrderByClauseShouldPass() { + validate("SELECT SUBSTRING(address, 0, 3) AS add FROM semantics GROUP BY add ORDER BY add"); + } + + @Test + public void useColumnNameAliasInOrderByClauseShouldPass() { + validate("SELECT age AS a, AVG(balance) FROM semantics GROUP BY age ORDER BY a"); + } + + @Test + public void useExpressionAliasInOrderByClauseShouldPass() { + validate("SELECT age + 1 AS a FROM semantics GROUP BY age ORDER BY a"); + } + + @Test + public void useAvgFunctionCallWithTextFieldInHavingClauseShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT city FROM semantics GROUP BY city HAVING AVG(address) > 10", + "Function [AVG] cannot work with [TEXT].", + "Usage: AVG(NUMBER T) -> T" + ); + } + + @Test + public void useCountFunctionCallWithNestedFieldShouldPass() { + validate("SELECT * FROM semantics s, s.projects p GROUP BY city HAVING COUNT(p) > 1"); + validate("SELECT * FROM semantics s, s.projects p, p.members m GROUP BY city HAVING COUNT(m) > 1"); + } + +} diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalyzerBasicTest.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalyzerBasicTest.java new file mode 100644 index 0000000000..ce3a54cbd8 --- /dev/null +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalyzerBasicTest.java @@ -0,0 +1,607 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic; + +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.scope.Namespace; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.scope.SemanticContext; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.scope.Symbol; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.Type; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESIndex; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.visitor.ESMappingLoader; +import com.amazon.opendistroforelasticsearch.sql.esdomain.LocalClusterState; +import org.junit.Assert; +import org.junit.Before; +import org.junit.Test; + +import java.util.Map; +import java.util.Optional; + +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.BOOLEAN; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.DATE; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.DOUBLE; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.GEO_POINT; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.INTEGER; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.KEYWORD; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.LONG; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.OBJECT; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.TEXT; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.UNKNOWN; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESIndex.IndexType.INDEX; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESIndex.IndexType.NESTED_FIELD; +import static org.hamcrest.Matchers.aMapWithSize; +import static org.hamcrest.Matchers.allOf; +import static org.hamcrest.Matchers.hasEntry; +import static org.junit.Assert.assertThat; + +/** + * Semantic analysis test cases focused on basic scope building logic which is the cornerstone of analysis followed. + * The low abstraction here enumerating all present field names in each test case is intentional for better demonstration. + */ +public class SemanticAnalyzerBasicTest extends SemanticAnalyzerTestBase { + + private SemanticContext context; + + private ESMappingLoader analyzer; + + @Before + public void setUp() { + context = new SemanticContext(); + analyzer = new ESMappingLoader(context, LocalClusterState.state(), 1000); + } + + @Test + public void contextShouldIncludeAllFieldsAfterVisitingIndexNameInFromClause() { + analyzer.visitIndexName("semantics"); + + Map typeByName = context.peek().resolveAll(Namespace.FIELD_NAME); + assertThat( + typeByName, + allOf( + aMapWithSize(21), + hasEntry("semantics", (Type) new ESIndex("semantics", INDEX)), + hasEntry("address", TEXT), + hasEntry("age", INTEGER), + hasEntry("balance", DOUBLE), + hasEntry("city", KEYWORD), + hasEntry("birthday", DATE), + hasEntry("location", GEO_POINT), + hasEntry("new_field", UNKNOWN), + hasEntry("field with spaces", TEXT), + hasEntry("employer", TEXT), + hasEntry("employer.keyword", KEYWORD), + hasEntry("projects", (Type) new ESIndex("projects", NESTED_FIELD)), + hasEntry("projects.active", BOOLEAN), + hasEntry("projects.release", DATE), + hasEntry("projects.members", (Type) new ESIndex("projects.members", NESTED_FIELD)), + hasEntry("projects.members.name", TEXT), + hasEntry("manager", OBJECT), + hasEntry("manager.name", TEXT), + hasEntry("manager.name.keyword", KEYWORD), + hasEntry("manager.address", KEYWORD), + hasEntry("manager.salary", LONG) + ) + ); + + analyzer.visitAs("", new ESIndex("semantics", INDEX)); + typeByName = context.peek().resolveAll(Namespace.FIELD_NAME); + assertThat( + typeByName, + allOf( + aMapWithSize(41), + hasEntry("semantics", (Type) new ESIndex("semantics", INDEX)), + hasEntry("address", TEXT), + hasEntry("age", INTEGER), + hasEntry("balance", DOUBLE), + hasEntry("city", KEYWORD), + hasEntry("birthday", DATE), + hasEntry("location", GEO_POINT), + hasEntry("new_field", UNKNOWN), + hasEntry("field with spaces", TEXT), + hasEntry("employer", TEXT), + hasEntry("employer.keyword", KEYWORD), + hasEntry("projects", (Type) new ESIndex("projects", NESTED_FIELD)), + hasEntry("projects.active", BOOLEAN), + hasEntry("projects.release", DATE), + hasEntry("projects.members", (Type) new ESIndex("projects.members", NESTED_FIELD)), + hasEntry("projects.members.name", TEXT), + hasEntry("manager", OBJECT), + hasEntry("manager.name", TEXT), + hasEntry("manager.name.keyword", KEYWORD), + hasEntry("manager.address", KEYWORD), + hasEntry("manager.salary", LONG), + // These are also valid identifier in SQL + hasEntry("semantics.address", TEXT), + hasEntry("semantics.age", INTEGER), + hasEntry("semantics.balance", DOUBLE), + hasEntry("semantics.city", KEYWORD), + hasEntry("semantics.birthday", DATE), + hasEntry("semantics.location", GEO_POINT), + hasEntry("semantics.new_field", UNKNOWN), + hasEntry("semantics.field with spaces", TEXT), + hasEntry("semantics.employer", TEXT), + hasEntry("semantics.employer.keyword", KEYWORD), + hasEntry("semantics.projects", (Type) new ESIndex("semantics.projects", NESTED_FIELD)), + hasEntry("semantics.projects.active", BOOLEAN), + hasEntry("semantics.projects.release", DATE), + hasEntry("semantics.projects.members", (Type) new ESIndex("semantics.projects.members", NESTED_FIELD)), + hasEntry("semantics.projects.members.name", TEXT), + hasEntry("semantics.manager", OBJECT), + hasEntry("semantics.manager.name", TEXT), + hasEntry("semantics.manager.name.keyword", KEYWORD), + hasEntry("semantics.manager.address", KEYWORD), + hasEntry("semantics.manager.salary", LONG) + ) + ); + } + + @Test + public void contextShouldIncludeAllFieldsPrefixedByIndexAliasAfterVisitingIndexNameWithAliasInFromClause() { + ESIndex indexType = new ESIndex("semantics", INDEX); + analyzer.visitIndexName("semantics"); + analyzer.visitAs("s", indexType); + + Map typeByName = context.peek().resolveAll(Namespace.FIELD_NAME); + assertThat( + typeByName, + allOf( + aMapWithSize(41), + hasEntry("semantics", (Type) indexType), + // These are also valid because alias is optional in SQL + hasEntry("address", TEXT), + hasEntry("age", INTEGER), + hasEntry("balance", DOUBLE), + hasEntry("city", KEYWORD), + hasEntry("birthday", DATE), + hasEntry("location", GEO_POINT), + hasEntry("new_field", UNKNOWN), + hasEntry("field with spaces", TEXT), + hasEntry("employer", TEXT), + hasEntry("employer.keyword", KEYWORD), + hasEntry("projects", (Type) new ESIndex("projects", NESTED_FIELD)), + hasEntry("projects.active", BOOLEAN), + hasEntry("projects.release", DATE), + hasEntry("projects.members", (Type) new ESIndex("projects.members", NESTED_FIELD)), + hasEntry("projects.members.name", TEXT), + hasEntry("manager", OBJECT), + hasEntry("manager.name", TEXT), + hasEntry("manager.name.keyword", KEYWORD), + hasEntry("manager.address", KEYWORD), + hasEntry("manager.salary", LONG), + // These are valid because of alias specified + hasEntry("s.address", TEXT), + hasEntry("s.age", INTEGER), + hasEntry("s.balance", DOUBLE), + hasEntry("s.city", KEYWORD), + hasEntry("s.birthday", DATE), + hasEntry("s.location", GEO_POINT), + hasEntry("s.new_field", UNKNOWN), + hasEntry("s.field with spaces", TEXT), + hasEntry("s.employer", TEXT), + hasEntry("s.employer.keyword", KEYWORD), + hasEntry("s.projects", (Type) new ESIndex("s.projects", NESTED_FIELD)), + hasEntry("s.projects.active", BOOLEAN), + hasEntry("s.projects.release", DATE), + hasEntry("s.projects.members", (Type) new ESIndex("s.projects.members", NESTED_FIELD)), + hasEntry("s.projects.members.name", TEXT), + hasEntry("s.manager", OBJECT), + hasEntry("s.manager.name", TEXT), + hasEntry("s.manager.name.keyword", KEYWORD), + hasEntry("s.manager.address", KEYWORD), + hasEntry("s.manager.salary", LONG) + ) + ); + } + + @Test + public void contextShouldIncludeSameFieldsAfterVisitingNestedFieldWithoutAliasInFromClause() { + ESIndex indexType = new ESIndex("semantics", INDEX); + analyzer.visitIndexName("semantics"); + analyzer.visitAs("s", indexType); + analyzer.visitIndexName("s.projects"); + analyzer.visitAs("", new ESIndex("s.projects", NESTED_FIELD)); + + Map typeByName = context.peek().resolveAll(Namespace.FIELD_NAME); + assertThat( + typeByName, + allOf( + aMapWithSize(41), + hasEntry("semantics", (Type) indexType), + // These are also valid because alias is optional in SQL + hasEntry("address", TEXT), + hasEntry("age", INTEGER), + hasEntry("balance", DOUBLE), + hasEntry("city", KEYWORD), + hasEntry("birthday", DATE), + hasEntry("location", GEO_POINT), + hasEntry("new_field", UNKNOWN), + hasEntry("field with spaces", TEXT), + hasEntry("employer", TEXT), + hasEntry("employer.keyword", KEYWORD), + hasEntry("projects", (Type) new ESIndex("projects", NESTED_FIELD)), + hasEntry("projects.active", BOOLEAN), + hasEntry("projects.release", DATE), + hasEntry("projects.members", (Type) new ESIndex("projects.members", NESTED_FIELD)), + hasEntry("projects.members.name", TEXT), + hasEntry("manager", OBJECT), + hasEntry("manager.name", TEXT), + hasEntry("manager.name.keyword", KEYWORD), + hasEntry("manager.address", KEYWORD), + hasEntry("manager.salary", LONG), + // These are valid because of alias specified + hasEntry("s.address", TEXT), + hasEntry("s.age", INTEGER), + hasEntry("s.balance", DOUBLE), + hasEntry("s.city", KEYWORD), + hasEntry("s.birthday", DATE), + hasEntry("s.location", GEO_POINT), + hasEntry("s.new_field", UNKNOWN), + hasEntry("s.field with spaces", TEXT), + hasEntry("s.employer", TEXT), + hasEntry("s.employer.keyword", KEYWORD), + hasEntry("s.projects", (Type) new ESIndex("s.projects", NESTED_FIELD)), + hasEntry("s.projects.active", BOOLEAN), + hasEntry("s.projects.release", DATE), + hasEntry("s.projects.members", (Type) new ESIndex("s.projects.members", NESTED_FIELD)), + hasEntry("s.projects.members.name", TEXT), + hasEntry("s.manager", OBJECT), + hasEntry("s.manager.name", TEXT), + hasEntry("s.manager.name.keyword", KEYWORD), + hasEntry("s.manager.address", KEYWORD), + hasEntry("s.manager.salary", LONG) + ) + ); + } + + @Test + public void contextShouldIncludeMoreFieldsPrefixedByNestedFieldAliasAfterVisitingNestedFieldWithAliasInFromClause() { + ESIndex indexType = new ESIndex("semantics", INDEX); + analyzer.visitIndexName("semantics"); + analyzer.visitAs("s", indexType); + analyzer.visitIndexName("s.projects"); + analyzer.visitAs("p", new ESIndex("s.projects", NESTED_FIELD)); + + Map typeByName = context.peek().resolveAll(Namespace.FIELD_NAME); + assertThat( + typeByName, + allOf( + aMapWithSize(46), + // These are also valid because alias is optional in SQL + hasEntry("semantics", (Type) indexType), + // These are also valid because alias is optional in SQL + hasEntry("address", TEXT), + hasEntry("age", INTEGER), + hasEntry("balance", DOUBLE), + hasEntry("city", KEYWORD), + hasEntry("birthday", DATE), + hasEntry("location", GEO_POINT), + hasEntry("new_field", UNKNOWN), + hasEntry("field with spaces", TEXT), + hasEntry("employer", TEXT), + hasEntry("employer.keyword", KEYWORD), + hasEntry("projects", (Type) new ESIndex("projects", NESTED_FIELD)), + hasEntry("projects.active", BOOLEAN), + hasEntry("projects.release", DATE), + hasEntry("projects.members", (Type) new ESIndex("projects.members", NESTED_FIELD)), + hasEntry("projects.members.name", TEXT), + hasEntry("manager", OBJECT), + hasEntry("manager.name", TEXT), + hasEntry("manager.name.keyword", KEYWORD), + hasEntry("manager.address", KEYWORD), + hasEntry("manager.salary", LONG), + // These are valid because of alias specified + hasEntry("s.address", TEXT), + hasEntry("s.age", INTEGER), + hasEntry("s.balance", DOUBLE), + hasEntry("s.city", KEYWORD), + hasEntry("s.birthday", DATE), + hasEntry("s.location", GEO_POINT), + hasEntry("s.new_field", UNKNOWN), + hasEntry("s.field with spaces", TEXT), + hasEntry("s.employer", TEXT), + hasEntry("s.employer.keyword", KEYWORD), + hasEntry("s.projects", (Type) new ESIndex("s.projects", NESTED_FIELD)), + hasEntry("s.projects.active", BOOLEAN), + hasEntry("s.projects.release", DATE), + hasEntry("s.projects.members", (Type) new ESIndex("s.projects.members", NESTED_FIELD)), + hasEntry("s.projects.members.name", TEXT), + hasEntry("s.manager", OBJECT), + hasEntry("s.manager.name", TEXT), + hasEntry("s.manager.name.keyword", KEYWORD), + hasEntry("s.manager.address", KEYWORD), + hasEntry("s.manager.salary", LONG), + // Valid because of nested field alias specified + hasEntry("p", (Type) new ESIndex("s.projects", NESTED_FIELD)), + hasEntry("p.active", BOOLEAN), + hasEntry("p.release", DATE), + hasEntry("p.members", (Type) new ESIndex("s.projects.members", NESTED_FIELD)), + hasEntry("p.members.name", TEXT) + ) + ); + } + + @Test + public void contextShouldIncludeMoreFieldsPrefixedByNestedFieldAliasAfterVisitingDeepNestedFieldWithAliasInFromClause() { + ESIndex indexType = new ESIndex("semantics", INDEX); + analyzer.visitIndexName("semantics"); + analyzer.visitAs("s", indexType); + analyzer.visitIndexName("s.projects.members"); + analyzer.visitAs("m", new ESIndex("s.projects.members", NESTED_FIELD)); + + Map typeByName = context.peek().resolveAll(Namespace.FIELD_NAME); + + assertThat( + typeByName, + allOf( + aMapWithSize(43), + hasEntry("semantics", (Type) indexType), + // These are also valid because alias is optional in SQL + hasEntry("address", TEXT), + hasEntry("age", INTEGER), + hasEntry("balance", DOUBLE), + hasEntry("city", KEYWORD), + hasEntry("birthday", DATE), + hasEntry("location", GEO_POINT), + hasEntry("new_field", UNKNOWN), + hasEntry("field with spaces", TEXT), + hasEntry("employer", TEXT), + hasEntry("employer.keyword", KEYWORD), + hasEntry("projects", (Type) new ESIndex("projects", NESTED_FIELD)), + hasEntry("projects.active", BOOLEAN), + hasEntry("projects.release", DATE), + hasEntry("projects.members", (Type) new ESIndex("projects.members", NESTED_FIELD)), + hasEntry("projects.members.name", TEXT), + hasEntry("manager", OBJECT), + hasEntry("manager.name", TEXT), + hasEntry("manager.name.keyword", KEYWORD), + hasEntry("manager.address", KEYWORD), + hasEntry("manager.salary", LONG), + // These are valid because of alias specified + hasEntry("s.address", TEXT), + hasEntry("s.age", INTEGER), + hasEntry("s.balance", DOUBLE), + hasEntry("s.city", KEYWORD), + hasEntry("s.birthday", DATE), + hasEntry("s.location", GEO_POINT), + hasEntry("s.new_field", UNKNOWN), + hasEntry("s.field with spaces", TEXT), + hasEntry("s.employer", TEXT), + hasEntry("s.employer.keyword", KEYWORD), + hasEntry("s.projects", (Type) new ESIndex("s.projects", NESTED_FIELD)), + hasEntry("s.projects.active", BOOLEAN), + hasEntry("s.projects.release", DATE), + hasEntry("s.projects.members", (Type) new ESIndex("s.projects.members", NESTED_FIELD)), + hasEntry("s.projects.members.name", TEXT), + hasEntry("s.manager", OBJECT), + hasEntry("s.manager.name", TEXT), + hasEntry("s.manager.name.keyword", KEYWORD), + hasEntry("s.manager.address", KEYWORD), + hasEntry("s.manager.salary", LONG), + // Valid because of deep nested field alias specified + hasEntry("m", (Type) new ESIndex("s.projects.members", NESTED_FIELD)), + hasEntry("m.name", TEXT) + ) + ); + } + + @Test + public void contextShouldIncludeMoreFieldsPrefixedByNestedFieldAliasAfterVisitingAllNestedFieldsWithAliasInFromClause() { + ESIndex indexType = new ESIndex("semantics", INDEX); + analyzer.visitIndexName("semantics"); + analyzer.visitAs("s", indexType); + analyzer.visitIndexName("s.projects"); + analyzer.visitAs("p", new ESIndex("s.projects", NESTED_FIELD)); + analyzer.visitIndexName("s.projects.members"); + analyzer.visitAs("m", new ESIndex("s.projects.members", NESTED_FIELD)); + + Map typeByName = context.peek().resolveAll(Namespace.FIELD_NAME); + assertThat( + typeByName, + allOf( + aMapWithSize(48), + hasEntry("semantics", (Type) indexType), + // These are also valid because alias is optional in SQL + hasEntry("address", TEXT), + hasEntry("age", INTEGER), + hasEntry("balance", DOUBLE), + hasEntry("city", KEYWORD), + hasEntry("birthday", DATE), + hasEntry("location", GEO_POINT), + hasEntry("new_field", UNKNOWN), + hasEntry("field with spaces", TEXT), + hasEntry("employer", TEXT), + hasEntry("employer.keyword", KEYWORD), + hasEntry("projects", (Type) new ESIndex("projects", NESTED_FIELD)), + hasEntry("projects.active", BOOLEAN), + hasEntry("projects.release", DATE), + hasEntry("projects.members", (Type) new ESIndex("projects.members", NESTED_FIELD)), + hasEntry("projects.members.name", TEXT), + hasEntry("manager", OBJECT), + hasEntry("manager.name", TEXT), + hasEntry("manager.name.keyword", KEYWORD), + hasEntry("manager.address", KEYWORD), + hasEntry("manager.salary", LONG), + // These are valid because of alias specified + hasEntry("s.address", TEXT), + hasEntry("s.age", INTEGER), + hasEntry("s.balance", DOUBLE), + hasEntry("s.city", KEYWORD), + hasEntry("s.birthday", DATE), + hasEntry("s.location", GEO_POINT), + hasEntry("s.new_field", UNKNOWN), + hasEntry("s.field with spaces", TEXT), + hasEntry("s.employer", TEXT), + hasEntry("s.employer.keyword", KEYWORD), + hasEntry("s.projects", (Type) new ESIndex("s.projects", NESTED_FIELD)), + hasEntry("s.projects.active", BOOLEAN), + hasEntry("s.projects.release", DATE), + hasEntry("s.projects.members", (Type) new ESIndex("s.projects.members", NESTED_FIELD)), + hasEntry("s.projects.members.name", TEXT), + hasEntry("s.manager", OBJECT), + hasEntry("s.manager.name", TEXT), + hasEntry("s.manager.name.keyword", KEYWORD), + hasEntry("s.manager.address", KEYWORD), + hasEntry("s.manager.salary", LONG), + // Valid because of nested field alias specified + hasEntry("p", (Type) new ESIndex("s.projects", NESTED_FIELD)), + hasEntry("p.active", BOOLEAN), + hasEntry("p.release", DATE), + hasEntry("p.members", (Type) new ESIndex("s.projects.members", NESTED_FIELD)), + hasEntry("p.members.name", TEXT), + // Valid because of deep nested field alias specified + hasEntry("m", (Type) new ESIndex("s.projects.members", NESTED_FIELD)), + hasEntry("m.name", TEXT) + ) + ); + } + + @Test + public void contextShouldIncludeMoreFieldsPrefixedByNestedFieldAliasAfterVisitingNestedFieldWithAliasInSubqueryFromClause() { + ESIndex indexType = new ESIndex("semantics", INDEX); + analyzer.visitIndexName("semantics"); + analyzer.visitAs("s", indexType); + + context.push(); + analyzer.visitIndexName("s.projects"); + analyzer.visitAs("p", new ESIndex("s.projects", NESTED_FIELD)); + + Map typeByName = context.peek().resolveAll(Namespace.FIELD_NAME); + assertThat( + typeByName, + allOf( + aMapWithSize(46), + // These are also valid because alias is optional in SQL + hasEntry("semantics", (Type) indexType), + // These are also valid because alias is optional in SQL + hasEntry("address", TEXT), + hasEntry("age", INTEGER), + hasEntry("balance", DOUBLE), + hasEntry("city", KEYWORD), + hasEntry("birthday", DATE), + hasEntry("location", GEO_POINT), + hasEntry("new_field", UNKNOWN), + hasEntry("field with spaces", TEXT), + hasEntry("employer", TEXT), + hasEntry("employer.keyword", KEYWORD), + hasEntry("projects", (Type) new ESIndex("projects", NESTED_FIELD)), + hasEntry("projects.active", BOOLEAN), + hasEntry("projects.release", DATE), + hasEntry("projects.members", (Type) new ESIndex("projects.members", NESTED_FIELD)), + hasEntry("projects.members.name", TEXT), + hasEntry("manager", OBJECT), + hasEntry("manager.name", TEXT), + hasEntry("manager.name.keyword", KEYWORD), + hasEntry("manager.address", KEYWORD), + hasEntry("manager.salary", LONG), + // These are valid because of alias specified + hasEntry("s.address", TEXT), + hasEntry("s.age", INTEGER), + hasEntry("s.balance", DOUBLE), + hasEntry("s.city", KEYWORD), + hasEntry("s.birthday", DATE), + hasEntry("s.location", GEO_POINT), + hasEntry("s.new_field", UNKNOWN), + hasEntry("s.field with spaces", TEXT), + hasEntry("s.employer", TEXT), + hasEntry("s.employer.keyword", KEYWORD), + hasEntry("s.projects", (Type) new ESIndex("s.projects", NESTED_FIELD)), + hasEntry("s.projects.active", BOOLEAN), + hasEntry("s.projects.release", DATE), + hasEntry("s.projects.members", (Type) new ESIndex("s.projects.members", NESTED_FIELD)), + hasEntry("s.projects.members.name", TEXT), + hasEntry("s.manager", OBJECT), + hasEntry("s.manager.name", TEXT), + hasEntry("s.manager.name.keyword", KEYWORD), + hasEntry("s.manager.address", KEYWORD), + hasEntry("s.manager.salary", LONG), + // Valid because of nested field alias specified + hasEntry("p", (Type) new ESIndex("s.projects", NESTED_FIELD)), + hasEntry("p.active", BOOLEAN), + hasEntry("p.release", DATE), + hasEntry("p.members", (Type) new ESIndex("s.projects.members", NESTED_FIELD)), + hasEntry("p.members.name", TEXT) + ) + ); + + context.pop(); + typeByName = context.peek().resolveAll(Namespace.FIELD_NAME); + assertThat( + typeByName, + allOf( + aMapWithSize(41), + hasEntry("semantics", (Type) indexType), + // These are also valid because alias is optional in SQL + hasEntry("address", TEXT), + hasEntry("age", INTEGER), + hasEntry("balance", DOUBLE), + hasEntry("city", KEYWORD), + hasEntry("birthday", DATE), + hasEntry("location", GEO_POINT), + hasEntry("new_field", UNKNOWN), + hasEntry("field with spaces", TEXT), + hasEntry("employer", TEXT), + hasEntry("employer.keyword", KEYWORD), + hasEntry("projects", (Type) new ESIndex("projects", NESTED_FIELD)), + hasEntry("projects.active", BOOLEAN), + hasEntry("projects.release", DATE), + hasEntry("projects.members", (Type) new ESIndex("projects.members", NESTED_FIELD)), + hasEntry("projects.members.name", TEXT), + hasEntry("manager", OBJECT), + hasEntry("manager.name", TEXT), + hasEntry("manager.name.keyword", KEYWORD), + hasEntry("manager.address", KEYWORD), + hasEntry("manager.salary", LONG), + // These are valid because of alias specified + hasEntry("s.address", TEXT), + hasEntry("s.age", INTEGER), + hasEntry("s.balance", DOUBLE), + hasEntry("s.city", KEYWORD), + hasEntry("s.birthday", DATE), + hasEntry("s.location", GEO_POINT), + hasEntry("s.new_field", UNKNOWN), + hasEntry("s.field with spaces", TEXT), + hasEntry("s.employer", TEXT), + hasEntry("s.employer.keyword", KEYWORD), + hasEntry("s.projects", (Type) new ESIndex("s.projects", NESTED_FIELD)), + hasEntry("s.projects.active", BOOLEAN), + hasEntry("s.projects.release", DATE), + hasEntry("s.projects.members", (Type) new ESIndex("s.projects.members", NESTED_FIELD)), + hasEntry("s.projects.members.name", TEXT), + hasEntry("s.manager", OBJECT), + hasEntry("s.manager.name", TEXT), + hasEntry("s.manager.name.keyword", KEYWORD), + hasEntry("s.manager.address", KEYWORD), + hasEntry("s.manager.salary", LONG) + ) + ); + } + + @Test + public void fieldWithUnknownEsTypeShouldPass() { + analyzer.visitIndexName("semantics"); + Optional type = context.peek().resolve(new Symbol(Namespace.FIELD_NAME, "new_field")); + Assert.assertTrue(type.isPresent()); + Assert.assertSame(UNKNOWN, type.get()); + } + + @Test + public void fieldWithSpacesInNameShouldPass() { + analyzer.visitIndexName("semantics"); + Optional type = context.peek().resolve(new Symbol(Namespace.FIELD_NAME, "field with spaces")); + Assert.assertTrue(type.isPresent()); + Assert.assertSame(TEXT, type.get()); + } + +} diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalyzerConfigTest.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalyzerConfigTest.java new file mode 100644 index 0000000000..fdcacdbffc --- /dev/null +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalyzerConfigTest.java @@ -0,0 +1,79 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic; + +import com.amazon.opendistroforelasticsearch.sql.antlr.OpenDistroSqlAnalyzer; +import com.amazon.opendistroforelasticsearch.sql.antlr.SqlAnalysisConfig; +import com.amazon.opendistroforelasticsearch.sql.esdomain.LocalClusterState; +import org.junit.Rule; +import org.junit.Test; +import org.junit.rules.ExpectedException; + +import static org.hamcrest.Matchers.allOf; +import static org.hamcrest.Matchers.containsString; +import static org.hamcrest.Matchers.not; + +/** + * Test cases for semantic analysis configuration + */ +public class SemanticAnalyzerConfigTest extends SemanticAnalyzerTestBase { + + @Rule + public ExpectedException exceptionWithoutSuggestion = ExpectedException.none(); + + @Test + public void noAnalysisShouldPerformForNonSelectStatement() { + String sql = "DELETE FROM semantics WHERE age12 = 123"; + expectValidationPassWithConfig(sql, new SqlAnalysisConfig(true, true, 1000)); + } + + @Test + public void noAnalysisShouldPerformIfDisabledAnalysis() { + String sql = "SELECT * FROM semantics WHERE age12 = 123"; + expectValidationFailWithErrorMessages(sql, "Field [age12] cannot be found or used here."); + expectValidationPassWithConfig(sql, new SqlAnalysisConfig(false, true, 1000)); + } + + @Test + public void noFieldNameSuggestionIfDisabledSuggestion() { + String sql = "SELECT * FROM semantics WHERE age12 = 123"; + expectValidationFailWithErrorMessages(sql, + "Field [age12] cannot be found or used here.", + "Did you mean [age]?"); + + exceptionWithoutSuggestion.expect(SemanticAnalysisException.class); + exceptionWithoutSuggestion.expectMessage( + allOf( + containsString("Field [age12] cannot be found or used here"), + not(containsString("Did you mean")) + ) + ); + new OpenDistroSqlAnalyzer(new SqlAnalysisConfig(true, false, 1000)). + analyze(sql, LocalClusterState.state()); + } + + @Test + public void noAnalysisShouldPerformIfIndexMappingIsLargerThanThreshold() { + String sql = "SELECT * FROM semantics WHERE test = 123"; + expectValidationFailWithErrorMessages(sql, "Field [test] cannot be found or used here."); + expectValidationPassWithConfig(sql, new SqlAnalysisConfig(true, true, 1)); + } + + private void expectValidationPassWithConfig(String sql, SqlAnalysisConfig config) { + new OpenDistroSqlAnalyzer(config).analyze(sql, LocalClusterState.state()); + } + +} diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalyzerESScalarFunctionTest.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalyzerESScalarFunctionTest.java new file mode 100644 index 0000000000..8a63a36a82 --- /dev/null +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalyzerESScalarFunctionTest.java @@ -0,0 +1,66 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic; + +import org.junit.Ignore; +import org.junit.Test; + +/** + * Semantic analysis test for Elaticsearch special scalar functions + */ +public class SemanticAnalyzerESScalarFunctionTest extends SemanticAnalyzerTestBase { + + @Test + public void dateFunctionCallWithDateInSelectClauseShouldPass() { + validate("SELECT DAY_OF_MONTH(birthday) FROM semantics"); + validate("SELECT DAY_OF_WEEK(birthday) FROM semantics"); + validate("SELECT DAY_OF_YEAR(birthday) FROM semantics"); + validate("SELECT MINUTE_OF_DAY(birthday) FROM semantics"); + validate("SELECT MINUTE_OF_HOUR(birthday) FROM semantics"); + validate("SELECT MONTH_OF_YEAR(birthday) FROM semantics"); + validate("SELECT WEEK_OF_YEAR(birthday) FROM semantics"); + } + + @Test + public void dateFunctionCallWithDateInWhereClauseShouldPass() { + validate("SELECT * FROM semantics WHERE DAY_OF_MONTH(birthday) = 1"); + validate("SELECT * FROM semantics WHERE DAY_OF_WEEK(birthday) = 1"); + validate("SELECT * FROM semantics WHERE DAY_OF_YEAR(birthday) = 1"); + validate("SELECT * FROM semantics WHERE MINUTE_OF_DAY(birthday) = 1"); + validate("SELECT * FROM semantics WHERE MINUTE_OF_HOUR(birthday) = 1"); + validate("SELECT * FROM semantics WHERE MONTH_OF_YEAR(birthday) = 1"); + validate("SELECT * FROM semantics WHERE WEEK_OF_YEAR(birthday) = 1"); + } + + @Test + public void geoFunctionCallWithGeoPointInWhereClauseShouldPass() { + validate("SELECT * FROM semantics WHERE GEO_BOUNDING_BOX(location, 100.0, 1.0, 101, 0.0)"); + validate("SELECT * FROM semantics WHERE GEO_DISTANCE(location, '1km', 100.5, 0.500001)"); + validate("SELECT * FROM semantics WHERE GEO_DISTANCE_RANGE(location, '1km', 100.5, 0.500001)"); + } + + @Test + public void fullTextMatchFunctionCallWithStringInWhereClauseShouldPass() { + validate("SELECT * FROM semantics WHERE MATCH_PHRASE(address, 'Seattle')"); + validate("SELECT * FROM semantics WHERE MATCHPHRASE(employer, 'Seattle')"); + validate("SELECT * FROM semantics WHERE MATCH_QUERY(manager.name, 'Seattle')"); + validate("SELECT * FROM semantics WHERE MATCHQUERY(manager.name, 'Seattle')"); + validate("SELECT * FROM semantics WHERE QUERY('Seattle')"); + validate("SELECT * FROM semantics WHERE WILDCARD_QUERY(manager.name, 'Sea*')"); + validate("SELECT * FROM semantics WHERE WILDCARDQUERY(manager.name, 'Sea*')"); + } + +} diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalyzerFromClauseTest.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalyzerFromClauseTest.java new file mode 100644 index 0000000000..0a3935a2fb --- /dev/null +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalyzerFromClauseTest.java @@ -0,0 +1,194 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic; + +import org.junit.Ignore; +import org.junit.Test; + +/** + * Semantic analyzer tests for FROM clause, including parse single index, multiple indices, + * index + (deep) nested field and multiple statements like UNION/MINUS etc. Basically, we + * need to make sure the environment be set up properly so that semantic analysis followed + * can be performed correctly. + */ +public class SemanticAnalyzerFromClauseTest extends SemanticAnalyzerTestBase { + + @Ignore("IndexNotFoundException should be thrown from ES API directly") + @Test + public void nonExistingIndexNameShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT * FROM semantics1" + ); + } + + @Test + public void useIndexPatternShouldSkipAllCheck() { + validate("SELECT abc FROM semant* WHERE def = 1"); + } + + @Test + public void useIndexAndIndexPatternShouldSkipAllCheck() { + validate("SELECT abc FROM semantics, semant* WHERE def = 1"); + } + + /** + * As shown below, there are multiple cases for alias: + * 1. Alias is not present: either use full index name as prefix or not. + * 2. Alias is present: either use alias as prefix or not. Full index name is illegal. + */ + @Test + public void indexNameAliasShouldBeOptional() { + validate("SELECT address FROM semantics"); + validate("SELECT address FROM semantics s"); + validate("SELECT * FROM semantics WHERE semantics.address LIKE 'Seattle'"); + } + + @Test + public void useFullIndexNameShouldFailIfAliasIsPresent() { + expectValidationFailWithErrorMessages( + "SELECT * FROM semantics s WHERE semantics.address LIKE 'Seattle'", + "Field [semantics.address] cannot be found or used here", + "Did you mean [s.manager.address]?" + ); + } + + @Test + public void invalidIndexNameAliasInFromClauseShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT * FROM semantics s, a.projects p", + "Field [a.projects] cannot be found or used here", + "Did you mean [s.projects]?" + ); + } + + @Test + public void invalidIndexNameAliasInWhereClauseShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT * FROM semantics s WHERE a.balance = 10000", + "Field [a.balance] cannot be found or used here", + "Did you mean [s.balance]?" + ); + } + + @Test + public void invalidIndexNameAliasInGroupByClauseShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT * FROM semantics s GROUP BY a.balance", + "Field [a.balance] cannot be found or used here", + "Did you mean [s.balance]?" + ); + } + + @Test + public void invalidIndexNameAliasInHavingClauseShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT * FROM semantics s HAVING COUNT(a.balance) > 5", + "Field [a.balance] cannot be found or used here", + "Did you mean [s.balance]?" + ); + } + + @Test + public void invalidIndexNameAliasInOrderByClauseShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT * FROM semantics s ORDER BY a.balance", + "Field [a.balance] cannot be found or used here", + "Did you mean [s.balance]?" + ); + } + + @Test + public void invalidIndexNameAliasInOnClauseShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT * FROM semantics sem JOIN semantic tic ON sem.age = t.age", + "Field [t.age] cannot be found or used here", + "Did you mean [tic.age]?" + ); + } + + @Test + public void nonNestedFieldInFromClauseShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT * FROM semantics s, s.manager m", + "Operator [JOIN] cannot work with [INDEX, OBJECT]." + ); + } + + @Test + public void nonExistingNestedFieldInFromClauseShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT * FROM semantics s, s.project p", + "Field [s.project] cannot be found or used here", + "Did you mean [s.projects]?" + ); + } + + @Ignore("Need to figure out a better way to detect naming conflict") + @Test + public void duplicateIndexNameAliasInFromClauseShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT * FROM semantics s, s.projects s", + "Field [s] is conflicting with field of same name defined by other index" + ); + } + + @Ignore("Need to figure out a better way to detect naming conflict") + @Test + public void duplicateFieldNameFromDifferentIndexShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT * FROM semantics INNER JOIN semantics", + "is conflicting with field of same name defined by other index" + ); + } + + @Test + public void validIndexNameAliasShouldPass() { + validate("SELECT * FROM semantics s, s.projects p"); + validate("SELECT * FROM semantics s WHERE s.balance = 10000"); + } + + @Test + public void indexNameWithTypeShouldPass() { + validate("SELECT * FROM semantics/docs WHERE balance = 10000"); + validate("SELECT * FROM semantics/docs s WHERE s.balance = 10000"); + validate("SELECT * FROM semantics/docs s, s.projects p WHERE p.active IS TRUE"); + } + + @Test + public void noIndexAliasShouldPass() { + validate("SELECT * FROM semantics"); + validate("SELECT * FROM semantics, semantics.projects"); + } + + @Test + public void regularJoinShouldPass() { + validate("SELECT * FROM semantics s1, semantics s2"); + validate("SELECT * FROM semantics s1 JOIN semantics s2"); + validate("SELECT * FROM semantics s1 LEFT JOIN semantics s2 ON s1.balance = s2.balance"); + } + + @Test + public void deepNestedFieldInFromClauseShouldPass() { + validate("SELECT * FROM semantics s, s.projects p, p.members m"); + } + + @Test + public void duplicateFieldNameFromDifferentStatementShouldPass() { + validate("SELECT age FROM semantics UNION SELECT age FROM semantic"); + validate("SELECT s.age FROM semantics s UNION SELECT s.age FROM semantic s"); + } + +} diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalyzerIdentifierTest.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalyzerIdentifierTest.java new file mode 100644 index 0000000000..f004aef50f --- /dev/null +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalyzerIdentifierTest.java @@ -0,0 +1,158 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic; + +import org.junit.Ignore; +import org.junit.Test; + +/** + * Semantic analyzer tests for identifier + */ +public class SemanticAnalyzerIdentifierTest extends SemanticAnalyzerTestBase { + + @Ignore("To be implemented") + @Test + public void duplicateFieldAliasInSelectClauseShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT age a, COUNT(*) a FROM semantics s, a.projects p", + "Field [a.projects] cannot be found or used here" + ); + } + + @Test + public void fieldWithDifferentCaseInSelectClauseShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT Age a FROM semantics", + "Field [Age] cannot be found or used here", + "Did you mean [age]?" + ); + } + + @Test + public void useHiddenFieldShouldPass() { + validate("SELECT _score FROM semantics WHERE _id = 1 AND _type = '_doc'"); + } + + @Ignore("Need to remove single quote or back ticks") + @Test + public void useFieldNameWithSpaceShouldPass() { + validate("SELECT ['field with spaces'] FROM semantics"); + validate("SELECT `field with spaces` FROM semantics"); + } + + @Test + public void nonExistingFieldNameInSelectClauseShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT age1 FROM semantics s", + "Field [age1] cannot be found or used here.", + "Did you mean [age]?" + ); + } + + @Test + public void invalidIndexAliasInFromClauseShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT * FROM semantics s, a.projects p", + "Field [a.projects] cannot be found or used here.", + "Did you mean [s.projects]?" + ); + } + + @Test + public void nonExistingFieldNameInWhereClauseShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT * FROM semantics s WHERE s.balce = 10000", + "Field [s.balce] cannot be found or used here.", + "Did you mean [s.balance]?" + ); + } + + @Test + public void nonExistingFieldNameInGroupByClauseShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT * FROM semantics s GROUP BY s.balce", + "Field [s.balce] cannot be found or used here.", + "Did you mean [s.balance]?" + ); + } + + @Test + public void nonExistingFieldNameInHavingClauseShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT * FROM semantics s HAVING COUNT(s.balce) > 5", + "Field [s.balce] cannot be found or used here.", + "Did you mean [s.balance]?" + ); + } + + @Test + public void nonExistingFieldNameInOrderByClauseShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT * FROM semantics s ORDER BY s.balce", + "Field [s.balce] cannot be found or used here.", + "Did you mean [s.balance]?" + ); + } + + @Test + public void nonExistingFieldNameInFunctionShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT * FROM semantics s WHERE LOG(s.balce) = 1", + "Field [s.balce] cannot be found or used here.", + "Did you mean [s.balance]?" + ); + } + + @Test + public void nonExistingNestedFieldNameInWhereClauseShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT * FROM semantics s, s.projects p, p.members m WHERE m.nam = 'John'", + "Field [m.nam] cannot be found or used here.", + "Did you mean [m.name]?" + ); + } + + @Test + public void nonExistingNestedFieldNameInFunctionShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT * FROM semantics WHERE nested(projects.actives) = TRUE", + "Field [projects.actives] cannot be found or used here.", + "Did you mean [projects.active]?" + ); + } + + @Test + public void useKeywordInMultiFieldShouldPass() { + validate("SELECT employer.keyword FROM semantics WHERE employer.keyword LIKE 'AWS' GROUP BY employer.keyword"); + validate("SELECT * FROM semantics s WHERE s.manager.name.keyword LIKE 'John'"); + } + + @Test + public void useDeepNestedFieldNameShouldPass() { + validate("SELECT p.* FROM semantics s, s.projects p WHERE p IS NULL"); + validate("SELECT p.active FROM semantics s, s.projects p WHERE p.active = TRUE"); + validate("SELECT m.name FROM semantics s, s.projects p, p.members m WHERE m.name = 'John'"); + } + + @Test + public void useConstantLiteralInSelectClauseShouldPass() { + validate("SELECT 1 FROM semantics"); + validate("SELECT 2.0 FROM semantics"); + //validate("SELECT 'test' FROM semantics"); TODO: why 'test' goes to fullColumnName that can be string literal + validate("SELECT TRUE FROM semantics"); + } + +} diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalyzerMultiQueryTest.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalyzerMultiQueryTest.java new file mode 100644 index 0000000000..eec3480046 --- /dev/null +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalyzerMultiQueryTest.java @@ -0,0 +1,104 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic; + +import org.junit.Ignore; +import org.junit.Test; + +/** + * Semantic analyzer tests for multi query like UNION and MINUS + */ +public class SemanticAnalyzerMultiQueryTest extends SemanticAnalyzerTestBase { + + @Test + public void unionDifferentResultTypeOfTwoQueriesShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT balance FROM semantics UNION SELECT address FROM semantics", + "Operator [UNION] cannot work with [DOUBLE, TEXT]." + ); + } + + @Test + public void unionDifferentNumberOfResultTypeOfTwoQueriesShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT balance FROM semantics UNION SELECT balance, age FROM semantics", + "Operator [UNION] cannot work with [DOUBLE, (DOUBLE, INTEGER)]." + ); + } + + @Test + public void minusDifferentResultTypeOfTwoQueriesShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT p.active FROM semantics s, s.projects p MINUS SELECT address FROM semantics", + "Operator [MINUS] cannot work with [BOOLEAN, TEXT]." + ); + } + + @Test + public void unionSameResultTypeOfTwoQueriesShouldPass() { + validate("SELECT balance FROM semantics UNION SELECT balance FROM semantics"); + } + + @Test + public void unionCompatibleResultTypeOfTwoQueriesShouldPass() { + validate("SELECT balance FROM semantics UNION SELECT age FROM semantics"); + validate("SELECT address FROM semantics UNION ALL SELECT city FROM semantics"); + } + + @Test + public void minusSameResultTypeOfTwoQueriesShouldPass() { + validate("SELECT s.projects.active FROM semantics s UNION SELECT p.active FROM semantics s, s.projects p"); + } + + @Test + public void minusCompatibleResultTypeOfTwoQueriesShouldPass() { + validate("SELECT address FROM semantics MINUS SELECT manager.name.keyword FROM semantics"); + } + + @Test + public void unionSelectStarWithExtraFieldOfTwoQueriesShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT * FROM semantics UNION SELECT *, city FROM semantics", + "Operator [UNION] cannot work with [(*), KEYWORD]." + ); + } + + @Test + public void minusSelectStarWithExtraFieldOfTwoQueriesShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT *, address, balance FROM semantics MINUS SELECT * FROM semantics", + "Operator [MINUS] cannot work with [(TEXT, DOUBLE), (*)]." + ); + } + + @Test + public void unionSelectStarOfTwoQueriesShouldPass() { + validate("SELECT * FROM semantics UNION SELECT * FROM semantics"); + validate("SELECT *, age FROM semantics UNION SELECT *, balance FROM semantics"); + } + + @Test + public void unionSelectFunctionCallWithSameReturnTypeOfTwoQueriesShouldPass() { + validate("SELECT LOG(balance) FROM semantics UNION SELECT ABS(age) FROM semantics"); + } + + @Ignore("* is empty and ignored in product of select items for now") + @Test + public void unionSelectFieldWithExtraStarOfTwoQueriesShouldFail() { + expectValidationFailWithErrorMessages("SELECT age FROM semantics UNION SELECT *, age FROM semantics"); + } + +} diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalyzerOperatorTest.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalyzerOperatorTest.java new file mode 100644 index 0000000000..6376330a52 --- /dev/null +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalyzerOperatorTest.java @@ -0,0 +1,82 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic; + +import org.junit.Test; + +/** + * Semantic analysis test cases for operator + */ +public class SemanticAnalyzerOperatorTest extends SemanticAnalyzerTestBase { + + @Test + public void compareNumberIsBooleanShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT * FROM semantics WHERE age IS FALSE", + "Operator [IS] cannot work with [INTEGER, BOOLEAN]." + ); + } + + @Test + public void compareTextIsNotBooleanShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT * FROM semantics WHERE address IS NOT TRUE", + "Operator [IS] cannot work with [TEXT, BOOLEAN]." + ); + } + + @Test + public void compareNumberEqualsToStringShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT * FROM semantics WHERE balance = 'test'", + "Operator [=] cannot work with [DOUBLE, STRING]." + ); + } + + @Test + public void compareSubstringFunctionCallEqualsToNumberShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT * FROM semantics WHERE SUBSTRING(address, 0, 3) = 1", + "Operator [=] cannot work with [TEXT, INTEGER]." + ); + } + + @Test + public void compareLogAndAbsFunctionCallWithIntegerSmallerThanStringShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT * FROM semantics WHERE LOG(ABS(age)) < 'test'", + "Operator [<] cannot work with [INTEGER, STRING]." + ); + } + + @Test + public void compareDoubleWithIntegerShouldPass() { + validate("SELECT * FROM semantics WHERE balance >= 1000"); + validate("SELECT * FROM semantics WHERE balance <> 1000"); + validate("SELECT * FROM semantics WHERE balance != 1000"); + } + + @Test + public void compareDateWithStringShouldPass() { + validate("SELECT * FROM semantics WHERE birthday = '2019-09-30'"); + } + + @Test + public void namedArgumentShouldSkipOperatorTypeCheck() { + validate("SELECT TOPHITS('size'=3, age='desc') FROM semantics GROUP BY city"); + } + +} diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalyzerScalarFunctionTest.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalyzerScalarFunctionTest.java new file mode 100644 index 0000000000..fab03ab182 --- /dev/null +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalyzerScalarFunctionTest.java @@ -0,0 +1,263 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic; + +import org.junit.Test; + +/** + * Semantic analysis tests for scalar function. + */ +public class SemanticAnalyzerScalarFunctionTest extends SemanticAnalyzerTestBase { + + @Test + public void unsupportedScalarFunctionCallInSelectClauseShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT NOW() FROM semantics", + "Function [NOW] cannot be found or used here." + ); + } + + @Test + public void unsupportedScalarFunctionCallInWhereClauseShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT * FROM semantics WHERE LOG100(balance) = 1", + "Function [LOG100] cannot be found or used here.", + "Did you mean [LOG10]?" + ); + } + + @Test + public void scalarFunctionCallWithLessArgumentsInWhereClauseShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT * FROM semantics WHERE LOG() = 1", + "Function [LOG] cannot work with [].", + "Usage: LOG(NUMBER T) -> T" + ); + } + + @Test + public void scalarFunctionCallWithMoreArgumentsInWhereClauseShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT * FROM semantics WHERE LOG(age, city) = 1", + "Function [LOG] cannot work with [INTEGER, KEYWORD].", + "Usage: LOG(NUMBER T) -> T" + ); + } + + @Test + public void logFunctionCallWithOneNestedInSelectClauseShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT LOG(projects) FROM semantics", + "Function [LOG] cannot work with [NESTED_FIELD].", + "Usage: LOG(NUMBER T) -> T" + ); + } + + @Test + public void logFunctionCallWithOneTextInWhereClauseShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT * FROM semantics WHERE LOG(city) = 1", + "Function [LOG] cannot work with [KEYWORD].", + "Usage: LOG(NUMBER T) -> T" + ); + } + + @Test + public void logFunctionCallWithOneNumberShouldPass() { + validate("SELECT LOG(age) FROM semantics"); + validate("SELECT * FROM semantics s WHERE LOG(s.balance) = 1000"); + validate("SELECT LOG(s.manager.salary) FROM semantics s"); + } + + @Test + public void logFunctionCallInDifferentCaseShouldPass() { + validate("SELECT log(age) FROM semantics"); + validate("SELECT Log(age) FROM semantics"); + validate("SELECT loG(age) FROM semantics"); + } + + @Test + public void logFunctionCallWithUnknownFieldShouldPass() { + validate("SELECT LOG(new_field) FROM semantics"); + } + + @Test + public void substringWithLogFunctionCallWithUnknownFieldShouldPass() { + validate("SELECT SUBSTRING(LOG(new_field), 0, 1) FROM semantics"); + } + + @Test + public void logFunctionCallWithResultOfAbsFunctionCallWithOneNumberShouldPass() { + validate("SELECT LOG(ABS(age)) FROM semantics"); + } + + @Test + public void logFunctionCallWithMoreNestedFunctionCallWithOneNumberShouldPass() { + validate("SELECT LOG(ABS(SQRT(balance))) FROM semantics"); + } + + @Test + public void substringFunctionCallWithResultOfAnotherSubstringAndAbsFunctionCallShouldPass() { + validate("SELECT SUBSTRING(SUBSTRING(city, ABS(age), 1), 2, ABS(1)) FROM semantics"); + } + + @Test + public void substringFunctionCallWithResultOfMathFunctionCallShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT SUBSTRING(LOG(balance), 2, 3) FROM semantics", + "Function [SUBSTRING] cannot work with [DOUBLE, INTEGER, INTEGER].", + "Usage: SUBSTRING(STRING T, INTEGER, INTEGER) -> T" + ); + } + + @Test + public void logFunctionCallWithResultOfSubstringFunctionCallShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT LOG(SUBSTRING(address, 0, 1)) FROM semantics", + "Function [LOG] cannot work with [TEXT].", + "Usage: LOG(NUMBER T) -> T or LOG(NUMBER T, NUMBER) -> T" + ); + } + + @Test + public void allSupportedMathFunctionCallInSelectClauseShouldPass() { + validate( + "SELECT" + + " ABS(age), " + + " ASIN(age), " + + " ATAN(age), " + + " ATAN2(age), " + + " CBRT(age), " + + " CEIL(age), " + + " COS(age), " + + " COSH(age), " + + " DEGREES(age), " + + " EXP(age), " + + " EXPM1(age), " + + " FLOOR(age), " + + " LOG(age), " + + " LOG2(age), " + + " LOG10(age), " + + " POW(age), " + + " RADIANS(age), " + + " RINT(age), " + + " ROUND(age), " + + " SIN(age), " + + " SINH(age), " + + " SQRT(age), " + + " TAN(age) " + + "FROM semantics" + ); + } + + @Test + public void allSupportedMathFunctionCallInWhereClauseShouldPass() { + validate( + "SELECT * FROM semantics WHERE " + + " ABS(age) = 1 AND " + + " ASIN(age) = 1 AND " + + " ATAN(age) = 1 AND " + + " ATAN2(age) = 1 AND " + + " CBRT(age) = 1 AND " + + " CEIL(age) = 1 AND " + + " COS(age) = 1 AND " + + " COSH(age) = 1 AND " + + " DEGREES(age) = 1 AND " + + " EXP(age) = 1 AND " + + " EXPM1(age) = 1 AND " + + " FLOOR(age) = 1 AND " + + " LOG(age) = 1 AND " + + " LOG2(age) = 1 AND " + + " LOG10(age) = 1 AND " + + " POW(age) = 1 AND " + + " RADIANS(age) = 1 AND " + + " RINT(age) = 1 AND " + + " ROUND(age) = 1 AND " + + " SIN(age) = 1 AND " + + " SINH(age) = 1 AND " + + " SQRT(age) = 1 AND " + + " TAN(age) = 1 " + ); + } + + @Test + public void allSupportedConstantsUseInSelectClauseShouldPass() { + validate( + "SELECT " + + " E(), " + + " PI() " + + "FROM semantics" + ); + } + + @Test + public void allSupportedConstantsUseInWhereClauseShouldPass() { + validate( + "SELECT * FROM semantics WHERE " + + " E() > 1 OR " + + " PI() > 1" + ); + } + + @Test + public void allSupportedStringFunctionCallInSelectClauseShouldPass() { + validate( + "SELECT * FROM semantics WHERE " + + " SUBSTRING(city, 0, 3) = 'Sea' AND " + + " UPPER(city) = 'SEATTLE' AND " + + " LOWER(city) = 'seattle'" + ); + } + + @Test + public void allSupportedStringFunctionCallInWhereClauseShouldPass() { + validate( + "SELECT" + + " SUBSTRING(city, 0, 3), " + + " UPPER(address), " + + " LOWER(manager.name) " + + "FROM semantics " + ); + } + + @Test + public void dateFormatFunctionCallWithNumberShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT DATE_FORMAT(balance, 'yyyy-MM') FROM semantics", + "Function [DATE_FORMAT] cannot work with [DOUBLE, STRING].", + "Usage: DATE_FORMAT(DATE, STRING) -> STRING or DATE_FORMAT(DATE, STRING, STRING) -> STRING" + ); + } + + @Test + public void allSupportedDateFunctionCallShouldPass() { + validate( + "SELECT date_format(birthday, 'yyyy-MM') " + + "FROM semantics " + + "WHERE date_format(birthday, 'yyyy-MM') > '1980-01' " + + "GROUP BY date_format(birthday, 'yyyy-MM') " + + "ORDER BY date_format(birthday, 'yyyy-MM') DESC" + ); + } + + @Test + public void concatRequiresVarargSupportShouldPassAnyway() { + validate("SELECT CONCAT('aaa') FROM semantics"); + validate("SELECT CONCAT('aaa', 'bbb') FROM semantics"); + validate("SELECT CONCAT('aaa', 'bbb', 123) FROM semantics"); + } + +} diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalyzerSubqueryTest.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalyzerSubqueryTest.java new file mode 100644 index 0000000000..1a19f77c1c --- /dev/null +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalyzerSubqueryTest.java @@ -0,0 +1,108 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic; + +import org.junit.Test; + +/** + * Semantic analysis test for subquery + */ +public class SemanticAnalyzerSubqueryTest extends SemanticAnalyzerTestBase { + + @Test + public void useExistClauseOnNestedFieldShouldPass() { + validate( + "SELECT * FROM semantics AS s WHERE EXISTS " + + " ( SELECT * FROM s.projects AS p WHERE p.active IS TRUE ) " + + " AND s.age > 10" + ); + } + + @Test + public void useNotExistClauseOnNestedFieldShouldPass() { + validate( + "SELECT * FROM semantics AS s WHERE NOT EXISTS " + + " ( SELECT * FROM s.projects AS p WHERE p.active IS TRUE ) " + + " AND s.age > 10" + ); + } + + @Test + public void useInClauseOnAgeWithIntegerLiteralListShouldPass() { + validate("SELECT * FROM semantics WHERE age IN (30, 40)"); + } + + @Test + public void useAliasInSubqueryShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT * FROM semantics s WHERE EXISTS (SELECT * FROM s.projects p) AND p.active IS TRUE", + "Field [p.active] cannot be found or used here.", + "Did you mean [projects.active]?" + ); + } + + @Test + public void useInClauseWithIncompatibleFieldTypesShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT * FROM semantics s WHERE age IN (SELECT p.active FROM s.projects p)", + "Operator [IN] cannot work with [INTEGER, BOOLEAN]." + ); + } + + @Test + public void useInClauseWithCompatibleFieldTypesShouldPass() { + validate("SELECT * FROM semantics s WHERE address IN (SELECT city FROM s.projects p)"); + } + + @Test + public void useNotInClauseWithCompatibleFieldTypesShouldPass() { + validate("SELECT * FROM semantics s WHERE address NOT IN (SELECT city FROM s.projects p)"); + } + + @Test + public void useInClauseWithCompatibleConstantShouldPass() { + validate("SELECT * FROM semantics WHERE age IN (10, 20, 30)"); + validate("SELECT * FROM semantics WHERE city IN ('Seattle', 'Bellevue')"); + validate("SELECT * FROM semantics WHERE birthday IN ('2000-01-01', '2010-01-01')"); + } + + @Test + public void useInClauseWithIncompatibleConstantShouldPass() { + expectValidationFailWithErrorMessages( + "SELECT * FROM semantics s WHERE age IN ('abc', 'def')", + "Operator [IN] cannot work with [INTEGER, STRING]." + ); + } + + @Test + public void useInClauseWithSelectStarShouldFail() { + expectValidationFailWithErrorMessages( + "SELECT * FROM semantics s WHERE address IN (SELECT * FROM s.projects p)", + "Operator [IN] cannot work with [TEXT, (*)]" + ); + } + + @Test + public void useExistsClauseWithSelectStarShouldPass() { + validate("SELECT * FROM semantics s WHERE EXISTS (SELECT * FROM s.projects p)"); + } + + @Test + public void useExistsClauseWithSelectConstantShouldPass() { + validate("SELECT * FROM semantics s WHERE EXISTS (SELECT 1 FROM s.projects p)"); + } + +} diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalyzerTestBase.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalyzerTestBase.java new file mode 100644 index 0000000000..4c37fe9447 --- /dev/null +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalyzerTestBase.java @@ -0,0 +1,75 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic; + +import com.amazon.opendistroforelasticsearch.sql.antlr.OpenDistroSqlAnalyzer; +import com.amazon.opendistroforelasticsearch.sql.antlr.SqlAnalysisConfig; +import com.amazon.opendistroforelasticsearch.sql.esdomain.LocalClusterState; +import com.google.common.base.Charsets; +import com.google.common.io.Resources; +import org.hamcrest.Matchers; +import org.junit.AfterClass; +import org.junit.BeforeClass; +import org.junit.Rule; +import org.junit.rules.ExpectedException; + +import java.io.IOException; +import java.net.URL; +import java.util.Arrays; + +import static com.amazon.opendistroforelasticsearch.sql.util.CheckScriptContents.mockLocalClusterState; +import static java.util.stream.Collectors.toList; +import static org.hamcrest.Matchers.allOf; + +/** + * Test cases for semantic analysis focused on semantic check which was missing in the past. + */ +public abstract class SemanticAnalyzerTestBase { + + private static final String TEST_MAPPING_FILE = "mappings/semantics.json"; + + /** public accessor is required by @Rule annotation */ + @Rule + public ExpectedException exception = ExpectedException.none(); + + private OpenDistroSqlAnalyzer analyzer = new OpenDistroSqlAnalyzer(new SqlAnalysisConfig(true, true, 1000)); + + @SuppressWarnings("UnstableApiUsage") + @BeforeClass + public static void init() throws IOException { + URL url = Resources.getResource(TEST_MAPPING_FILE); + String mappings = Resources.toString(url, Charsets.UTF_8); + LocalClusterState.state(null); + mockLocalClusterState(mappings); + } + + @AfterClass + public static void cleanUp() { + LocalClusterState.state(null); + } + + protected void expectValidationFailWithErrorMessages(String query, String... messages) { + exception.expect(SemanticAnalysisException.class); + exception.expectMessage(allOf(Arrays.stream(messages). + map(Matchers::containsString). + collect(toList()))); + validate(query); + } + + protected void validate(String sql) { + analyzer.analyze(sql, LocalClusterState.state()); + } +} diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalyzerTests.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalyzerTests.java new file mode 100644 index 0000000000..530db020d5 --- /dev/null +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/SemanticAnalyzerTests.java @@ -0,0 +1,40 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic; + +import org.junit.runner.RunWith; +import org.junit.runners.Suite; + +/** + * Semantic analyzer test suite to prepare mapping and avoid load from file every time. + * But Gradle seems not work well with suite. So move common logic to test base class + * and keep this for quick testing in IDE. + */ +@RunWith(Suite.class) +@Suite.SuiteClasses({ + SemanticAnalyzerBasicTest.class, + SemanticAnalyzerConfigTest.class, + SemanticAnalyzerFromClauseTest.class, + SemanticAnalyzerIdentifierTest.class, + SemanticAnalyzerScalarFunctionTest.class, + SemanticAnalyzerESScalarFunctionTest.class, + SemanticAnalyzerAggregateFunctionTest.class, + SemanticAnalyzerOperatorTest.class, + SemanticAnalyzerSubqueryTest.class, + SemanticAnalyzerMultiQueryTest.class, +}) +public class SemanticAnalyzerTests { +} diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/scope/EnvironmentTest.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/scope/EnvironmentTest.java new file mode 100644 index 0000000000..410fbfd718 --- /dev/null +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/scope/EnvironmentTest.java @@ -0,0 +1,173 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic.scope; + +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.Type; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESIndex; +import org.junit.Assert; +import org.junit.Test; + +import java.util.Map; + +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.BOOLEAN; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.DATE; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.KEYWORD; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.OBJECT; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.TEXT; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESIndex.IndexType.NESTED_FIELD; +import static org.hamcrest.Matchers.aMapWithSize; +import static org.hamcrest.Matchers.allOf; +import static org.hamcrest.Matchers.hasEntry; +import static org.junit.Assert.assertThat; + +/** + * Test cases for environment + */ +public class EnvironmentTest { + + /** Use context class for push/pop */ + private SemanticContext context = new SemanticContext(); + + @Test + public void defineFieldSymbolInDifferentEnvironmentsShouldBeAbleToResolve() { + // Root environment + Symbol birthday = new Symbol(Namespace.FIELD_NAME, "s.birthday"); + environment().define(birthday, DATE); + Assert.assertTrue(environment().resolve(birthday).isPresent()); + + // New environment 1 + context.push(); + Symbol city = new Symbol(Namespace.FIELD_NAME, "s.city"); + environment().define(city, KEYWORD); + Assert.assertTrue(environment().resolve(birthday).isPresent()); + Assert.assertTrue(environment().resolve(city).isPresent()); + + // New environment 2 + context.push(); + Symbol manager = new Symbol(Namespace.FIELD_NAME, "s.manager"); + environment().define(manager, OBJECT); + Assert.assertTrue(environment().resolve(birthday).isPresent()); + Assert.assertTrue(environment().resolve(city).isPresent()); + Assert.assertTrue(environment().resolve(manager).isPresent()); + } + + @Test + public void defineFieldSymbolInDifferentEnvironmentsShouldNotAbleToResolveOncePopped() { + // Root environment + Symbol birthday = new Symbol(Namespace.FIELD_NAME, "s.birthday"); + environment().define(birthday, DATE); + + // New environment + context.push(); + Symbol city = new Symbol(Namespace.FIELD_NAME, "s.city"); + Symbol manager = new Symbol(Namespace.FIELD_NAME, "s.manager"); + environment().define(city, OBJECT); + environment().define(manager, OBJECT); + Assert.assertTrue(environment().resolve(birthday).isPresent()); + Assert.assertTrue(environment().resolve(city).isPresent()); + Assert.assertTrue(environment().resolve(manager).isPresent()); + + context.pop(); + Assert.assertFalse(environment().resolve(city).isPresent()); + Assert.assertFalse(environment().resolve(manager).isPresent()); + Assert.assertTrue(environment().resolve(birthday).isPresent()); + } + + @Test + public void defineFieldSymbolInDifferentEnvironmentsShouldBeAbleToResolveByPrefix() { + // Root environment + Symbol birthday = new Symbol(Namespace.FIELD_NAME, "s.birthday"); + environment().define(birthday, DATE); + + // New environment 1 + context.push(); + Symbol city = new Symbol(Namespace.FIELD_NAME, "s.city"); + environment().define(city, KEYWORD); + + // New environment 2 + context.push(); + Symbol manager = new Symbol(Namespace.FIELD_NAME, "s.manager"); + environment().define(manager, OBJECT); + + Map typeByName = environment().resolveByPrefix(new Symbol(Namespace.FIELD_NAME, "s")); + assertThat( + typeByName, + allOf( + aMapWithSize(3), + hasEntry("s.birthday", DATE), + hasEntry("s.city", KEYWORD), + hasEntry("s.manager", OBJECT) + ) + ); + } + + @Test + public void defineFieldSymbolShouldBeAbleToResolveAll() { + environment().define(new Symbol(Namespace.FIELD_NAME, "s.projects"), new ESIndex("s.projects", NESTED_FIELD)); + environment().define(new Symbol(Namespace.FIELD_NAME, "s.projects.release"), DATE); + environment().define(new Symbol(Namespace.FIELD_NAME, "s.projects.active"), BOOLEAN); + environment().define(new Symbol(Namespace.FIELD_NAME, "s.address"), TEXT); + environment().define(new Symbol(Namespace.FIELD_NAME, "s.city"), KEYWORD); + environment().define(new Symbol(Namespace.FIELD_NAME, "s.manager.name"), TEXT); + + Map typeByName = environment().resolveAll(Namespace.FIELD_NAME); + assertThat( + typeByName, + allOf( + aMapWithSize(6), + hasEntry("s.projects", (Type) new ESIndex("s.projects", NESTED_FIELD)), + hasEntry("s.projects.release", DATE), + hasEntry("s.projects.active", BOOLEAN), + hasEntry("s.address", TEXT), + hasEntry("s.city", KEYWORD), + hasEntry("s.manager.name", TEXT) + ) + ); + } + + @Test + public void defineFieldSymbolInDifferentEnvironmentsShouldBeAbleToResolveAll() { + // Root environment + Symbol birthday = new Symbol(Namespace.FIELD_NAME, "s.birthday"); + environment().define(birthday, DATE); + + // New environment 1 + context.push(); + Symbol city = new Symbol(Namespace.FIELD_NAME, "s.city"); + environment().define(city, KEYWORD); + + // New environment 2 + context.push(); + Symbol manager = new Symbol(Namespace.FIELD_NAME, "s.manager"); + environment().define(manager, OBJECT); + + Map typeByName = environment().resolveAll(Namespace.FIELD_NAME); + assertThat( + typeByName, + allOf( + aMapWithSize(3), + hasEntry("s.birthday", DATE), + hasEntry("s.city", KEYWORD), + hasEntry("s.manager", OBJECT) + ) + ); + } + + private Environment environment() { + return context.peek(); + } + +} diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/scope/SemanticContextTest.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/scope/SemanticContextTest.java new file mode 100644 index 0000000000..dd2f2beea1 --- /dev/null +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/scope/SemanticContextTest.java @@ -0,0 +1,53 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic.scope; + +import org.junit.Assert; +import org.junit.Test; + +/** + * Test cases for semantic context + */ +public class SemanticContextTest { + + private final SemanticContext context = new SemanticContext(); + + @Test + public void rootEnvironmentShouldBeThereInitially() { + Assert.assertNotNull( + "Didn't find root environment. Context is NOT supposed to be empty initially", + context.peek() + ); + } + + @Test + public void pushAndPopEnvironmentShouldPass() { + context.push(); + context.pop(); + } + + @Test + public void popRootEnvironmentShouldPass() { + context.pop(); + } + + @Test(expected = NullPointerException.class) + public void popEmptyEnvironmentStackShouldFail() { + context.pop(); + context.pop(); + } + +} diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/scope/SymbolTableTest.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/scope/SymbolTableTest.java new file mode 100644 index 0000000000..98ab8bff85 --- /dev/null +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/scope/SymbolTableTest.java @@ -0,0 +1,99 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic.scope; + +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.Type; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.TypeExpression; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESIndex; +import org.junit.Assert; +import org.junit.Test; + +import java.util.Map; +import java.util.Optional; + +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.BOOLEAN; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.DATE; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.KEYWORD; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.NUMBER; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.TEXT; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESIndex.IndexType.NESTED_FIELD; +import static org.hamcrest.Matchers.aMapWithSize; +import static org.hamcrest.Matchers.allOf; +import static org.hamcrest.Matchers.hasEntry; +import static org.junit.Assert.assertThat; + +/** + * Test cases for symbol table + */ +public class SymbolTableTest { + + private final SymbolTable symbolTable = new SymbolTable(); + + @Test + public void defineFieldSymbolShouldBeAbleToResolve() { + defineSymbolShouldBeAbleToResolve(new Symbol(Namespace.FIELD_NAME, "birthday"), DATE); + } + + @Test + public void defineFunctionSymbolShouldBeAbleToResolve() { + String funcName = "LOG"; + Type expectedType = new TypeExpression() { + @Override + public String getName() { + return "Temp type expression with [NUMBER] -> NUMBER specification"; + } + + @Override + public TypeExpressionSpec[] specifications() { + return new TypeExpressionSpec[] { + new TypeExpressionSpec().map(NUMBER).to(NUMBER) + }; + } + }; + Symbol symbol = new Symbol(Namespace.FUNCTION_NAME, funcName); + defineSymbolShouldBeAbleToResolve(symbol, expectedType); + } + + @Test + public void defineFieldSymbolShouldBeAbleToResolveByPrefix() { + symbolTable.store(new Symbol(Namespace.FIELD_NAME, "s.projects"), new ESIndex("s.projects", NESTED_FIELD)); + symbolTable.store(new Symbol(Namespace.FIELD_NAME, "s.projects.release"), DATE); + symbolTable.store(new Symbol(Namespace.FIELD_NAME, "s.projects.active"), BOOLEAN); + symbolTable.store(new Symbol(Namespace.FIELD_NAME, "s.address"), TEXT); + symbolTable.store(new Symbol(Namespace.FIELD_NAME, "s.city"), KEYWORD); + symbolTable.store(new Symbol(Namespace.FIELD_NAME, "s.manager.name"), TEXT); + + Map typeByName = symbolTable.lookupByPrefix(new Symbol(Namespace.FIELD_NAME, "s.projects")); + assertThat( + typeByName, + allOf( + aMapWithSize(3), + hasEntry("s.projects", (Type) new ESIndex("s.projects", NESTED_FIELD)), + hasEntry("s.projects.release", DATE), + hasEntry("s.projects.active", BOOLEAN) + ) + ); + } + + private void defineSymbolShouldBeAbleToResolve(Symbol symbol, Type expectedType) { + symbolTable.store(symbol, expectedType); + + Optional actualType = symbolTable.lookup(symbol); + Assert.assertTrue(actualType.isPresent()); + Assert.assertEquals(expectedType, actualType.get()); + } + +} diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/BaseTypeTest.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/BaseTypeTest.java new file mode 100644 index 0000000000..493dcc8518 --- /dev/null +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/BaseTypeTest.java @@ -0,0 +1,116 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types; + +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESIndex; +import org.junit.Ignore; +import org.junit.Test; + +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.BOOLEAN; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.DATE; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.DOUBLE; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.ES_TYPE; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.FLOAT; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.INTEGER; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.KEYWORD; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.LONG; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.NESTED; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.NUMBER; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.SHORT; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.STRING; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.TEXT; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.UNKNOWN; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESIndex.IndexType.NESTED_FIELD; +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertFalse; +import static org.junit.Assert.assertTrue; + +/** + * Test base type compatibility + */ +public class BaseTypeTest { + + @Test + public void unknownTypeNameShouldReturnUnknown() { + assertEquals(UNKNOWN, ESDataType.typeOf("this_is_a_new_es_type_we_arent_aware")); + } + + @Test + public void typeOfShouldIgnoreCase() { + assertEquals(INTEGER, ESDataType.typeOf("Integer")); + } + + @Test + public void sameBaseTypeShouldBeCompatible() { + assertTrue(INTEGER.isCompatible(INTEGER)); + assertTrue(BOOLEAN.isCompatible(BOOLEAN)); + } + + @Test + public void parentBaseTypeShouldBeCompatibleWithSubBaseType() { + assertTrue(NUMBER.isCompatible(DOUBLE)); + assertTrue(DOUBLE.isCompatible(FLOAT)); + assertTrue(FLOAT.isCompatible(INTEGER)); + assertTrue(INTEGER.isCompatible(SHORT)); + assertTrue(INTEGER.isCompatible(LONG)); + assertTrue(STRING.isCompatible(TEXT)); + assertTrue(STRING.isCompatible(KEYWORD)); + assertTrue(DATE.isCompatible(STRING)); + } + + @Test + public void ancestorBaseTypeShouldBeCompatibleWithSubBaseType() { + assertTrue(NUMBER.isCompatible(LONG)); + assertTrue(NUMBER.isCompatible(DOUBLE)); + assertTrue(DOUBLE.isCompatible(INTEGER)); + assertTrue(INTEGER.isCompatible(SHORT)); + assertTrue(INTEGER.isCompatible(LONG)); + } + + @Ignore("Two way compatibility is not necessary") + @Test + public void subBaseTypeShouldBeCompatibleWithParentBaseType() { + assertTrue(KEYWORD.isCompatible(STRING)); + } + + @Test + public void nonRelatedBaseTypeShouldNotBeCompatible() { + assertFalse(SHORT.isCompatible(TEXT)); + assertFalse(DATE.isCompatible(BOOLEAN)); + } + + @Test + public void unknownBaseTypeShouldBeCompatibleWithAnyBaseType() { + assertTrue(UNKNOWN.isCompatible(INTEGER)); + assertTrue(UNKNOWN.isCompatible(KEYWORD)); + assertTrue(UNKNOWN.isCompatible(BOOLEAN)); + } + + @Test + public void anyBaseTypeShouldBeCompatibleWithUnknownBaseType() { + assertTrue(LONG.isCompatible(UNKNOWN)); + assertTrue(TEXT.isCompatible(UNKNOWN)); + assertTrue(DATE.isCompatible(UNKNOWN)); + } + + @Test + public void nestedIndexTypeShouldBeCompatibleWithNestedDataType() { + assertTrue(NESTED.isCompatible(new ESIndex("test", NESTED_FIELD))); + assertTrue(ES_TYPE.isCompatible(new ESIndex("test", NESTED_FIELD))); + } + +} diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/GenericTypeTest.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/GenericTypeTest.java new file mode 100644 index 0000000000..7611f8882e --- /dev/null +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/GenericTypeTest.java @@ -0,0 +1,60 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types; + +import org.junit.Test; + +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.INTEGER; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.KEYWORD; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.LONG; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.NUMBER; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.TEXT; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.TYPE_ERROR; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.function.ScalarFunction.LOG; +import static java.util.Collections.singletonList; +import static org.junit.Assert.assertEquals; + +/** + * Generic type test + */ +public class GenericTypeTest { + + @Test + public void passNumberArgToLogShouldReturnNumber() { + assertEquals(NUMBER, LOG.construct(singletonList(NUMBER))); + } + + @Test + public void passIntegerArgToLogShouldReturnInteger() { + assertEquals(INTEGER, LOG.construct(singletonList(INTEGER))); + } + + @Test + public void passLongArgToLogShouldReturnLong() { + assertEquals(LONG, LOG.construct(singletonList(LONG))); + } + + @Test + public void passTextArgToLogShouldReturnTypeError() { + assertEquals(TYPE_ERROR, LOG.construct(singletonList(TEXT))); + } + + @Test + public void passKeywordArgToLogShouldReturnTypeError() { + assertEquals(TYPE_ERROR, LOG.construct(singletonList(KEYWORD))); + } + +} diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/ProductTypeTest.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/ProductTypeTest.java new file mode 100644 index 0000000000..e1013ee320 --- /dev/null +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/ProductTypeTest.java @@ -0,0 +1,83 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types; + +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.special.Product; +import org.junit.Assert; +import org.junit.Test; + +import java.util.Arrays; + +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.BOOLEAN; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.INTEGER; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.KEYWORD; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.NUMBER; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.STRING; +import static java.util.Collections.singletonList; + +/** + * Test cases fro product type + */ +public class ProductTypeTest { + + @Test + public void singleSameTypeInTwoProductsShouldPass() { + Product product1 = new Product(singletonList(INTEGER)); + Product product2 = new Product(singletonList(INTEGER)); + Assert.assertTrue(product1.isCompatible(product2)); + Assert.assertTrue(product2.isCompatible(product1)); + } + + @Test + public void singleCompatibleTypeInTwoProductsShouldPass() { + Product product1 = new Product(singletonList(NUMBER)); + Product product2 = new Product(singletonList(INTEGER)); + Assert.assertTrue(product1.isCompatible(product2)); + Assert.assertTrue(product2.isCompatible(product1)); + } + + @Test + public void twoCompatibleTypesInTwoProductsShouldPass() { + Product product1 = new Product(Arrays.asList(NUMBER, KEYWORD)); + Product product2 = new Product(Arrays.asList(INTEGER, STRING)); + Assert.assertTrue(product1.isCompatible(product2)); + Assert.assertTrue(product2.isCompatible(product1)); + } + + @Test + public void incompatibleTypesInTwoProductsShouldFail() { + Product product1 = new Product(singletonList(BOOLEAN)); + Product product2 = new Product(singletonList(STRING)); + Assert.assertFalse(product1.isCompatible(product2)); + Assert.assertFalse(product2.isCompatible(product1)); + } + + @Test + public void compatibleButDifferentTypeNumberInTwoProductsShouldFail() { + Product product1 = new Product(Arrays.asList(KEYWORD, INTEGER)); + Product product2 = new Product(singletonList(STRING)); + Assert.assertFalse(product1.isCompatible(product2)); + Assert.assertFalse(product2.isCompatible(product1)); + } + + @Test + public void baseTypeShouldBeIncompatibleWithProductType() { + Product product = new Product(singletonList(INTEGER)); + Assert.assertFalse(INTEGER.isCompatible(product)); + Assert.assertFalse(product.isCompatible(INTEGER)); + } + +} diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/TypeExpressionTest.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/TypeExpressionTest.java new file mode 100644 index 0000000000..0f39b2c1cd --- /dev/null +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/semantic/types/TypeExpressionTest.java @@ -0,0 +1,89 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types; + +import org.junit.Test; + +import java.util.Arrays; + +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.BOOLEAN; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.DATE; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.DOUBLE; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.GEO_POINT; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.INTEGER; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.NUMBER; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.STRING; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.TYPE_ERROR; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.UNKNOWN; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.special.Generic.T; +import static org.junit.Assert.assertEquals; + +/** + * Test cases for default implementation methods in interface TypeExpression + */ +public class TypeExpressionTest { + + private final TypeExpression test123 = new TypeExpression() { + + @Override + public String getName() { + return "TEST123"; + } + + @Override + public TypeExpressionSpec[] specifications() { + return new TypeExpressionSpec[] { + new TypeExpressionSpec().map(T(NUMBER)).to(T), + new TypeExpressionSpec().map(STRING, BOOLEAN).to(DATE) + }; + } + }; + + @Test + public void emptySpecificationShouldAlwaysReturnUnknown() { + TypeExpression expr = new TypeExpression() { + @Override + public TypeExpressionSpec[] specifications() { + return new TypeExpressionSpec[0]; + } + + @Override + public String getName() { + return "Temp type expression with empty specification"; + } + }; + assertEquals(UNKNOWN, expr.construct(Arrays.asList(NUMBER))); + assertEquals(UNKNOWN, expr.construct(Arrays.asList(STRING, BOOLEAN))); + assertEquals(UNKNOWN, expr.construct(Arrays.asList(INTEGER, DOUBLE, GEO_POINT))); + } + + @Test + public void compatibilityCheckShouldPassIfAnySpecificationCompatible() { + assertEquals(DOUBLE, test123.construct(Arrays.asList(DOUBLE))); + assertEquals(DATE, test123.construct(Arrays.asList(STRING, BOOLEAN))); + } + + @Test + public void compatibilityCheckShouldFailIfNoSpecificationCompatible() { + assertEquals(TYPE_ERROR, test123.construct(Arrays.asList(BOOLEAN))); + } + + @Test + public void usageShouldPrintAllSpecifications() { + assertEquals("TEST123(NUMBER T) -> T or TEST123(STRING, BOOLEAN) -> DATE", test123.usage()); + } + +} diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/visitor/AntlrSqlParseTreeVisitorTest.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/visitor/AntlrSqlParseTreeVisitorTest.java new file mode 100644 index 0000000000..c85f7b957f --- /dev/null +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/antlr/visitor/AntlrSqlParseTreeVisitorTest.java @@ -0,0 +1,86 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.antlr.visitor; + +import com.amazon.opendistroforelasticsearch.sql.antlr.OpenDistroSqlAnalyzer; +import com.amazon.opendistroforelasticsearch.sql.antlr.SqlAnalysisConfig; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.scope.SemanticContext; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.Type; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.special.Product; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.visitor.TypeChecker; +import com.amazon.opendistroforelasticsearch.sql.antlr.visitor.AntlrSqlParseTreeVisitor; +import org.antlr.v4.runtime.tree.ParseTree; +import org.junit.Assert; +import org.junit.Test; + +import java.util.Arrays; + +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.DATE; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.INTEGER; +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.base.ESDataType.UNKNOWN; +import static java.util.Collections.emptyList; + +/** + * Test cases for AntlrSqlParseTreeVisitor + */ +public class AntlrSqlParseTreeVisitorTest { + + private TypeChecker analyzer = new TypeChecker(new SemanticContext()) { + @Override + public Type visitIndexName(String indexName) { + return null; // avoid querying mapping on null LocalClusterState + } + + @Override + public Type visitFieldName(String fieldName) { + switch (fieldName) { + case "age": return INTEGER; + case "birthday": return DATE; + default: return UNKNOWN; + } + } + }; + + @Test + public void selectNumberShouldReturnNumberAsQueryVisitingResult() { + Type result = visit("SELECT age FROM test"); + Assert.assertSame(result, INTEGER); + } + + @Test + public void selectNumberAndDateShouldReturnProductOfThemAsQueryVisitingResult() { + Type result = visit("SELECT age, birthday FROM test"); + Assert.assertTrue(result instanceof Product ); + Assert.assertTrue(result.isCompatible(new Product(Arrays.asList(INTEGER, DATE)))); + } + + @Test + public void selectStarShouldReturnEmptyProductAsQueryVisitingResult() { + Type result = visit("SELECT * FROM test"); + Assert.assertTrue(result instanceof Product); + Assert.assertTrue(result.isCompatible(new Product(emptyList()))); + } + + private ParseTree createParseTree(String sql) { + return new OpenDistroSqlAnalyzer(new SqlAnalysisConfig(true, true, 1000)).analyzeSyntax(sql); + } + + private Type visit(String sql) { + ParseTree parseTree = createParseTree(sql); + return parseTree.accept(new AntlrSqlParseTreeVisitor<>(analyzer)); + } + +} diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/esdomain/mapping/FieldMappingsTest.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/esdomain/mapping/FieldMappingsTest.java new file mode 100644 index 0000000000..10df1dc486 --- /dev/null +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/esdomain/mapping/FieldMappingsTest.java @@ -0,0 +1,83 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.esdomain.mapping; + +import com.amazon.opendistroforelasticsearch.sql.esdomain.LocalClusterState; +import com.google.common.base.Charsets; +import com.google.common.io.Resources; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; + +import java.io.IOException; +import java.net.URL; +import java.util.HashMap; +import java.util.Map; + +import static com.amazon.opendistroforelasticsearch.sql.util.CheckScriptContents.mockLocalClusterState; +import static org.hamcrest.MatcherAssert.assertThat; +import static org.hamcrest.Matchers.aMapWithSize; +import static org.hamcrest.Matchers.allOf; +import static org.hamcrest.Matchers.hasEntry; + +/** + * Test for FieldMappings class + */ +public class FieldMappingsTest { + + private static final String TEST_MAPPING_FILE = "mappings/field_mappings.json"; + + @Before + public void setUp() throws IOException { + URL url = Resources.getResource(TEST_MAPPING_FILE); + String mappings = Resources.toString(url, Charsets.UTF_8); + mockLocalClusterState(mappings); + } + + @After + public void cleanUp() { + LocalClusterState.state(null); + } + + @Test + public void flatFieldMappingsShouldIncludeFieldsOnAllLevels() { + IndexMappings indexMappings = LocalClusterState.state().getFieldMappings(new String[]{"field_mappings"}); + FieldMappings fieldMappings = indexMappings.firstMapping().firstMapping(); + + Map typeByFieldName = new HashMap<>(); + fieldMappings.flat(typeByFieldName::put); + assertThat( + typeByFieldName, + allOf( + aMapWithSize(13), + hasEntry("address", "text"), + hasEntry("age", "integer"), + hasEntry("employer", "text"), + hasEntry("employer.raw", "text"), + hasEntry("employer.keyword", "keyword"), + hasEntry("projects", "nested"), + hasEntry("projects.active", "boolean"), + hasEntry("projects.members", "nested"), + hasEntry("projects.members.name", "text"), + hasEntry("manager", "object"), + hasEntry("manager.name", "text"), + hasEntry("manager.name.keyword", "keyword"), + hasEntry("manager.address", "keyword") + ) + ); + } + +} diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/esintgtest/AggregationIT.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/esintgtest/AggregationIT.java index 0c898b0642..3215936d40 100644 --- a/src/test/java/com/amazon/opendistroforelasticsearch/sql/esintgtest/AggregationIT.java +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/esintgtest/AggregationIT.java @@ -451,7 +451,7 @@ public void countGroupByRange() throws IOException { public void countGroupByDateTest() throws IOException { String result = explainQuery(String.format("select insert_time from %s group by date_histogram" + - "(field='insert_time','interval'='1.5h','format'='yyyy-MM','min_doc_count'=5) ", TEST_INDEX_ONLINE)); + "('field'='insert_time','interval'='1.5h','format'='yyyy-MM','min_doc_count'=5) ", TEST_INDEX_ONLINE)); Assert.assertThat(result.replaceAll("\\s+", ""), containsString("{\"date_histogram\":{\"field\":\"insert_time\",\"format\":\"yyyy-MM\"," + "\"interval\":\"1.5h\",\"offset\":0,\"order\":{\"_key\":\"asc\"},\"keyed\":false," + @@ -461,7 +461,7 @@ public void countGroupByDateTest() throws IOException { @Test public void countGroupByDateTestWithAlias() throws IOException { String result = explainQuery(String.format("select insert_time from %s group by date_histogram" + - "(field='insert_time','interval'='1.5h','format'='yyyy-MM','alias'='myAlias')", TEST_INDEX_ONLINE)); + "('field'='insert_time','interval'='1.5h','format'='yyyy-MM','alias'='myAlias')", TEST_INDEX_ONLINE)); Assert.assertThat(result.replaceAll("\\s+",""), containsString("myAlias\":{\"date_histogram\":{\"field\":\"insert_time\"," + "\"format\":\"yyyy-MM\",\"interval\":\"1.5h\"")); @@ -508,7 +508,7 @@ public void topHitTest() throws IOException { @Test public void topHitTest_WithInclude() throws IOException { - String query = String.format("select topHits('size'=3,age='desc',include=age) from %s/account group by gender", + String query = String.format("select topHits('size'=3,age='desc','include'=age) from %s/account group by gender", TEST_INDEX_ACCOUNT); JSONObject result = executeQuery(query); JSONObject gender = getAggregation(result, "gender"); @@ -1109,9 +1109,10 @@ public void docsReturnedTestWithDocsHint() throws Exception { Assert.assertThat(getHits(result).length(), equalTo(10)); } + @Ignore("There is not any text field in the index. Need fix later") @Test public void termsWithScript() throws Exception { - String query = String.format("select count(*), avg(number) from %s group by terms('alias'='asdf'," + + String query = String.format("select count(*), avg(all_client) from %s group by terms('alias'='asdf'," + " substring(field, 0, 1)), date_histogram('alias'='time', 'field'='timestamp', " + "'interval'='20d ', 'format'='yyyy-MM-dd') limit 1000", TEST_INDEX_ONLINE); String result = explainQuery(query); @@ -1122,21 +1123,21 @@ public void termsWithScript() throws Exception { @Test public void groupByScriptedDateHistogram() throws Exception { - String query = String.format("select count(*), avg(number) from %s group by date_histogram('alias'='time'," + - " ceil(timestamp), 'interval'='20d ', 'format'='yyyy-MM-dd') limit 1000" , TEST_INDEX_ONLINE); + String query = String.format("select count(*), avg(all_client) from %s group by date_histogram('alias'='time'," + + " ceil(all_client), 'interval'='20d ', 'format'='yyyy-MM-dd') limit 1000" , TEST_INDEX_ONLINE); String result = explainQuery(query); - Assert.assertThat(result, containsString("Math.ceil(doc['timestamp'].value);")); + Assert.assertThat(result, containsString("Math.ceil(doc['all_client'].value);")); Assert.assertThat(result, containsString("\"script\":{\"source\"")); } @Test public void groupByScriptedHistogram() throws Exception { - String query = String.format("select count(*) from %s group by histogram('alias'='field', pow(field,1))", + String query = String.format("select count(*) from %s group by histogram('alias'='all_field', pow(all_client,1))", TEST_INDEX_ONLINE); String result = explainQuery(query); - Assert.assertThat(result, containsString("Math.pow(doc['field'].value, 1)")); + Assert.assertThat(result, containsString("Math.pow(doc['all_client'].value, 1)")); Assert.assertThat(result, containsString("\"script\":{\"source\"")); } diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/esintgtest/ExplainIT.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/esintgtest/ExplainIT.java index bcc1291bac..566872a1c5 100644 --- a/src/test/java/com/amazon/opendistroforelasticsearch/sql/esintgtest/ExplainIT.java +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/esintgtest/ExplainIT.java @@ -69,8 +69,8 @@ public void aggregationQuery() throws IOException { String expectedOutput = Files.toString(new File(expectedOutputFilePath), StandardCharsets.UTF_8) .replaceAll("\r",""); - String result = explainQuery(String.format("SELECT a, CASE WHEN gender='0' then 'aaa' else 'bbb'end a2345," + - "count(c) FROM %s GROUP BY terms('field'='a','execution_hint'='global_ordinals'),a2345", + String result = explainQuery(String.format("SELECT address, CASE WHEN gender='0' then 'aaa' else 'bbb'end a2345," + + "count(age) FROM %s GROUP BY terms('field'='address','execution_hint'='global_ordinals'),a2345", TEST_INDEX_ACCOUNT)); Assert.assertThat(result.replaceAll("\\s+",""), equalTo(expectedOutput.replaceAll("\\s+",""))); } @@ -84,7 +84,7 @@ public void explainScriptValue() throws IOException { .replaceAll("\r",""); String result = explainQuery(String.format("SELECT case when gender is null then 'aaa' " + - "else gender end test , cust_code FROM %s", TEST_INDEX_ACCOUNT)); + "else gender end test , account_number FROM %s", TEST_INDEX_ACCOUNT)); Assert.assertThat(result.replaceAll("\\s+",""), equalTo(expectedOutput.replaceAll("\\s+",""))); } @@ -96,8 +96,8 @@ public void betweenScriptValue() throws IOException { String expectedOutput = Files.toString(new File(expectedOutputFilePath), StandardCharsets.UTF_8) .replaceAll("\r",""); - String result = explainQuery(String.format("SELECT case when value between 100 and 200 then 'aaa' " + - "else value end test, cust_code FROM %s", TEST_INDEX_ACCOUNT)); + String result = explainQuery(String.format("SELECT case when balance between 100 and 200 then 'aaa' " + + "else balance end test, account_number FROM %s", TEST_INDEX_ACCOUNT)); Assert.assertThat(result.replaceAll("\\s+",""), equalTo(expectedOutput.replaceAll("\\s+",""))); } @@ -158,9 +158,9 @@ public void multiMatchQuery() throws IOException { String expectedOutput = Files.toString(new File(expectedOutputFilePath), StandardCharsets.UTF_8) .replaceAll("\r", ""); - String result = explainQuery(String.format("SELECT * FROM %s WHERE q=multimatch(query='this is a test'," + - "fields='subject^3,message',analyzer='standard',type='best_fields',boost=1.0," + - "slop=0,tie_breaker=0.3,operator='and')", TEST_INDEX_ACCOUNT)); + String result = explainQuery(String.format("SELECT * FROM %s WHERE multimatch('query'='this is a test'," + + "'fields'='subject^3,message','analyzer'='standard','type'='best_fields','boost'=1.0," + + "'slop'=0,'tie_breaker'=0.3,'operator'='and')", TEST_INDEX_ACCOUNT)); Assert.assertThat(result.replaceAll("\\s+", ""), equalTo(expectedOutput.replaceAll("\\s+", ""))); } @@ -172,17 +172,17 @@ public void termsIncludeExcludeExplainTest() throws IOException { final String expected2 = "\"include\":[\"honda\",\"mazda\"],\"exclude\":[\"jensen\",\"rover\"]"; final String expected3 = "\"include\":{\"partition\":0,\"num_partitions\":20}"; - String result = explainQuery(queryPrefix + " terms(field='correspond_brand_name',size='10'," + - "alias='correspond_brand_name',include='\\\".*sport.*\\\"',exclude='\\\"water_.*\\\"')"); + String result = explainQuery(queryPrefix + " terms('field'='correspond_brand_name','size'='10'," + + "'alias'='correspond_brand_name','include'='\\\".*sport.*\\\"','exclude'='\\\"water_.*\\\"')"); Assert.assertThat(result, containsString(expected1)); - result = explainQuery(queryPrefix + "terms(field='correspond_brand_name',size='10'," + - "alias='correspond_brand_name',include='[\\\"mazda\\\", \\\"honda\\\"]'," + - "exclude='[\\\"rover\\\", \\\"jensen\\\"]')"); + result = explainQuery(queryPrefix + "terms('field'='correspond_brand_name','size'='10'," + + "'alias'='correspond_brand_name','include'='[\\\"mazda\\\", \\\"honda\\\"]'," + + "'exclude'='[\\\"rover\\\", \\\"jensen\\\"]')"); Assert.assertThat(result, containsString(expected2)); - result = explainQuery(queryPrefix + "terms(field='correspond_brand_name',size='10'," + - "alias='correspond_brand_name',include='{\\\"partition\\\":0,\\\"num_partitions\\\":20}')"); + result = explainQuery(queryPrefix + "terms('field'='correspond_brand_name','size'='10'," + + "'alias'='correspond_brand_name','include'='{\\\"partition\\\":0,\\\"num_partitions\\\":20}')"); Assert.assertThat(result, containsString(expected3)); } diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/esintgtest/GetEndpointQueryIT.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/esintgtest/GetEndpointQueryIT.java index b910533ccd..fc657aa24e 100644 --- a/src/test/java/com/amazon/opendistroforelasticsearch/sql/esintgtest/GetEndpointQueryIT.java +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/esintgtest/GetEndpointQueryIT.java @@ -41,7 +41,7 @@ public void unicodeTermInQuery() throws IOException { // NOTE: There are unicode characters in name, not just whitespace. final String name = "盛虹"; - final String query = String.format(Locale.ROOT, "SELECT id, firstname FROM %s " + + final String query = String.format(Locale.ROOT, "SELECT _id, firstname FROM %s " + "WHERE firstname=matchQuery('%s') LIMIT 2", TEST_INDEX_ACCOUNT, name); final JSONObject result = executeQueryWithGetRequest(query); diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/esintgtest/MethodQueryIT.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/esintgtest/MethodQueryIT.java index 01bdd02f90..ff1f6aa59e 100644 --- a/src/test/java/com/amazon/opendistroforelasticsearch/sql/esintgtest/MethodQueryIT.java +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/esintgtest/MethodQueryIT.java @@ -45,7 +45,7 @@ protected void init() throws Exception { @Test public void queryTest() throws IOException { final String result = explainQuery(String.format(Locale.ROOT, - "select address from %s where q= query('address:880 Holmes Lane') limit 3", + "select address from %s where query('address:880 Holmes Lane') limit 3", TestsConstants.TEST_INDEX_ACCOUNT)); Assert.assertThat(result, containsString("query_string\":{\"query\":\"address:880 Holmes Lane")); diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/esintgtest/PrettyFormatResponseIT.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/esintgtest/PrettyFormatResponseIT.java index 6521070bcd..eda5aecc95 100644 --- a/src/test/java/com/amazon/opendistroforelasticsearch/sql/esintgtest/PrettyFormatResponseIT.java +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/esintgtest/PrettyFormatResponseIT.java @@ -118,6 +118,7 @@ public void selectNames() throws IOException { assertContainsData(getDataRows(response), nameFields); } + @Ignore("Semantic analysis takes care of this") @Test public void selectWrongField() throws IOException { JSONObject response = executeQuery( diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/esintgtest/QueryAnalysisIT.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/esintgtest/QueryAnalysisIT.java index 5ca911957a..ac7e468f59 100644 --- a/src/test/java/com/amazon/opendistroforelasticsearch/sql/esintgtest/QueryAnalysisIT.java +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/esintgtest/QueryAnalysisIT.java @@ -15,18 +15,29 @@ package com.amazon.opendistroforelasticsearch.sql.esintgtest; -import com.amazon.opendistroforelasticsearch.sql.antlr.syntax.SqlSyntaxAnalysisException; +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.SemanticAnalysisException; +import com.amazon.opendistroforelasticsearch.sql.antlr.syntax.SyntaxAnalysisException; import com.amazon.opendistroforelasticsearch.sql.exception.SqlParseException; import com.amazon.opendistroforelasticsearch.sql.utils.StringUtils; +import org.elasticsearch.action.index.IndexRequest; +import org.elasticsearch.action.index.IndexResponse; +import org.elasticsearch.client.Request; import org.elasticsearch.client.Response; import org.elasticsearch.client.ResponseException; +import org.elasticsearch.client.RestClient; +import org.elasticsearch.rest.RestStatus; +import org.elasticsearch.test.ESIntegTestCase; import org.junit.Assert; import org.junit.Test; import java.io.IOException; import static com.amazon.opendistroforelasticsearch.sql.plugin.SqlSettings.QUERY_ANALYSIS_ENABLED; +import static com.amazon.opendistroforelasticsearch.sql.plugin.SqlSettings.QUERY_ANALYSIS_SEMANTIC_SUGGESTION; +import static com.amazon.opendistroforelasticsearch.sql.plugin.SqlSettings.QUERY_ANALYSIS_SEMANTIC_THRESHOLD; +import static org.elasticsearch.common.xcontent.XContentType.JSON; import static org.elasticsearch.rest.RestStatus.BAD_REQUEST; +import static org.elasticsearch.rest.RestStatus.OK; import static org.hamcrest.Matchers.containsString; import static org.hamcrest.Matchers.equalTo; @@ -42,33 +53,177 @@ protected void init() throws Exception { @Test public void missingFromClauseShouldThrowSyntaxException() { - queryShouldThrowSyntaxException( - "SELECT 1" - ); + queryShouldThrowSyntaxException("SELECT 1"); } @Test public void unsupportedOperatorShouldThrowSyntaxException() { queryShouldThrowSyntaxException( - "SELECT *", - "FROM elasticsearch-sql_test_index_bank", - "WHERE age <=> 1" + "SELECT * FROM elasticsearch-sql_test_index_bank WHERE age <=> 1" ); } @Test - public void unsupportedOperatorShouldThrowOtherExceptionIfAnalyzerDisabled() { + public void unsupportedOperatorShouldSkipAnalysisAndThrowOtherExceptionIfAnalyzerDisabled() { runWithClusterSetting( new ClusterSetting("transient", QUERY_ANALYSIS_ENABLED, "false"), () -> queryShouldThrowException( - SqlParseException.class, - "SELECT *", - "FROM elasticsearch-sql_test_index_bank", - "WHERE age <=> 1" + "SELECT * FROM elasticsearch-sql_test_index_bank WHERE age <=> 1", + SqlParseException.class + ) + ); + } + + @Test + public void suggestionForWrongFieldNameShouldBeProvidedIfSuggestionEnabled() { + runWithClusterSetting( + new ClusterSetting("transient", QUERY_ANALYSIS_SEMANTIC_SUGGESTION, "true"), + () -> queryShouldThrowSemanticException( + "SELECT * FROM elasticsearch-sql_test_index_bank b WHERE a.balance = 1000", + "Field [a.balance] cannot be found or used here.", + "Did you mean [b.balance]?" ) ); } + @Test + public void wrongFieldNameShouldPassIfIndexMappingIsVeryLarge() { + runWithClusterSetting( + new ClusterSetting("transient", QUERY_ANALYSIS_SEMANTIC_THRESHOLD, "5"), + () -> queryShouldPassAnalysis("SELECT * FROM elasticsearch-sql_test_index_bank WHERE age123 = 1") + ); + } + + @Test + public void useNewAddedFieldShouldPass() throws Exception { + // 1.Make sure new add fields not there originally + String query = "SELECT salary FROM elasticsearch-sql_test_index_bank WHERE education = 'PhD'"; + queryShouldThrowSemanticException(query, "Field [education] cannot be found or used here."); + + // 2.Index an document with fields not present in mapping previously + String docWithNewFields = "{\"account_number\":12345,\"education\":\"PhD\",\"salary\": \"10000\"}"; + IndexResponse resp = client().index(new IndexRequest().index("elasticsearch-sql_test_index_bank"). + source(docWithNewFields, JSON)).get(); + Assert.assertEquals(RestStatus.CREATED, resp.status()); + + // 3.Same query should pass + executeQuery(query); + } + + @Test + public void nonExistingFieldNameShouldThrowSemanticException() { + queryShouldThrowSemanticException( + "SELECT * FROM elasticsearch-sql_test_index_bank WHERE balance1 = 1000", + "Field [balance1] cannot be found or used here." + //"Did you mean [balance]?" + ); + } + + @Test + public void nonExistingIndexAliasShouldThrowSemanticException() { + queryShouldThrowSemanticException( + "SELECT * FROM elasticsearch-sql_test_index_bank b WHERE a.balance = 1000", + "Field [a.balance] cannot be found or used here." + //"Did you mean [b.balance]?" + ); + } + + @Test + public void indexJoinNonNestedFieldShouldThrowSemanticException() { + queryShouldThrowSemanticException( + "SELECT * FROM elasticsearch-sql_test_index_bank b1, b1.firstname f1", + "Operator [JOIN] cannot work with [INDEX, TEXT]." + ); + } + + @Test + public void scalarFunctionCallWithTypoInNameShouldThrowSemanticException() { + queryShouldThrowSemanticException( + "SELECT * FROM elasticsearch-sql_test_index_bank WHERE ABSa(age) = 1", + "Function [ABSA] cannot be found or used here.", + "Did you mean [ABS]?" + ); + } + + @Test + public void scalarFunctionCallWithWrongTypeArgumentShouldThrowSemanticException() { + queryShouldThrowSemanticException( + "SELECT * FROM elasticsearch-sql_test_index_bank WHERE LOG(lastname) = 1", + "Function [LOG] cannot work with [KEYWORD].", + "Usage: LOG(NUMBER T) -> T or LOG(NUMBER T, NUMBER) -> T" + ); + } + + @Test + public void aggregateFunctionCallWithWrongNumberOfArgumentShouldThrowSemanticException() { + queryShouldThrowSemanticException( + "SELECT city FROM elasticsearch-sql_test_index_bank GROUP BY city HAVING MAX(age, birthdate) > 1", + "Function [MAX] cannot work with [INTEGER, DATE].", + "Usage: MAX(NUMBER T) -> T" + ); + } + + @Test + public void aggregateFunctionCallWithWrongScalarFunctionCallShouldThrowSemanticException() { + queryShouldThrowSemanticException( + "SELECT MAX(LOG(firstname)) FROM elasticsearch-sql_test_index_bank GROUP BY city", + "Function [LOG] cannot work with [TEXT]." + ); + } + + @Test + public void compareIntegerFieldWithBooleanShouldThrowSemanticException() { + queryShouldThrowSemanticException( + "SELECT * FROM elasticsearch-sql_test_index_bank b WHERE b.age IS FALSE", + "Operator [IS] cannot work with [INTEGER, BOOLEAN].", + "Usage: Please use compatible types from each side." + ); + } + + @Test + public void compareNumberFieldWithStringShouldThrowSemanticException() { + queryShouldThrowSemanticException( + "SELECT * FROM elasticsearch-sql_test_index_bank b WHERE b.age >= 'test'", + "Operator [>=] cannot work with [INTEGER, STRING].", + "Usage: Please use compatible types from each side." + ); + } + + @Test + public void compareLogFunctionCallWithNumberFieldWithStringShouldThrowSemanticException() { + queryShouldThrowSemanticException( + "SELECT * FROM elasticsearch-sql_test_index_bank b WHERE LOG(b.balance) != 'test'", + "Operator [!=] cannot work with [LONG, STRING].", + "Usage: Please use compatible types from each side." + ); + } + + @Test + public void unionNumberFieldWithStringShouldThrowSemanticException() { + queryShouldThrowSemanticException( + "SELECT age FROM elasticsearch-sql_test_index_bank" + + " UNION SELECT address FROM elasticsearch-sql_test_index_bank", + "Operator [UNION] cannot work with [INTEGER, TEXT]." + ); + } + + @Test + public void minusBooleanFieldWithDateShouldThrowSemanticException() { + queryShouldThrowSemanticException( + "SELECT male FROM elasticsearch-sql_test_index_bank" + + " MINUS SELECT birthdate FROM elasticsearch-sql_test_index_bank", + "Operator [MINUS] cannot work with [BOOLEAN, DATE]." + ); + } + + @Test + public void useInClauseWithIncompatibleFieldTypesShouldFail() { + queryShouldThrowSemanticException( + "SELECT * FROM elasticsearch-sql_test_index_bank WHERE male " + + " IN (SELECT 1 FROM elasticsearch-sql_test_index_bank)", + "Operator [IN] cannot work with [BOOLEAN, INTEGER]." + ); + } /** Run the query with cluster setting changed and cleaned after complete */ private void runWithClusterSetting(ClusterSetting setting, Runnable query) { @@ -91,20 +246,26 @@ private void runWithClusterSetting(ClusterSetting setting, Runnable query) { } } - private void queryShouldThrowSyntaxException(String... clauses) { - queryShouldThrowException(SqlSyntaxAnalysisException.class, clauses); + private void queryShouldThrowSyntaxException(String query, String... expectedMsgs) { + queryShouldThrowException(query, SyntaxAnalysisException.class, expectedMsgs); + } + + private void queryShouldThrowSemanticException(String query, String... expectedMsgs) { + queryShouldThrowException(query, SemanticAnalysisException.class, expectedMsgs); } - private void queryShouldThrowException(Class exceptionType, String... clauses) { - String query = String.join(" ", clauses); + private void queryShouldThrowException(String query, Class exceptionType, String... expectedMsgs) { try { - explainQuery(query); + executeQuery(query); Assert.fail("Expected ResponseException, but none was thrown for query: " + query); } catch (ResponseException e) { ResponseAssertion assertion = new ResponseAssertion(e.getResponse()); assertion.assertStatusEqualTo(BAD_REQUEST.getStatus()); assertion.assertBodyContains("\"type\": \"" + exceptionType.getSimpleName() + "\""); + for (String msg : expectedMsgs) { + assertion.assertBodyContains(msg); + } } catch (IOException e) { throw new IllegalStateException( @@ -112,6 +273,22 @@ private void queryShouldThrowException(Class exceptionType, String... cla } } + private void queryShouldPassAnalysis(String query) { + String endpoint = "/_opendistro/_sql?"; + String requestBody = makeRequest(query); + Request sqlRequest = new Request("POST", endpoint); + sqlRequest.setJsonEntity(requestBody); + + try { + RestClient restClient = ESIntegTestCase.getRestClient(); + Response response = restClient.performRequest(sqlRequest); + ResponseAssertion assertion = new ResponseAssertion(response); + assertion.assertStatusEqualTo(OK.getStatus()); + } catch (IOException e) { + throw new IllegalStateException("Unexpected IOException raised for query: " + query); + } + } + private static class ResponseAssertion { private final Response response; private final String body; diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/esintgtest/QueryFunctionsIT.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/esintgtest/QueryFunctionsIT.java index d131736841..009f32d9b8 100644 --- a/src/test/java/com/amazon/opendistroforelasticsearch/sql/esintgtest/QueryFunctionsIT.java +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/esintgtest/QueryFunctionsIT.java @@ -174,7 +174,7 @@ public void multiMatchQuerySingleField() throws IOException { query( "SELECT firstname", FROM_ACCOUNTS, - "WHERE MULTI_MATCH(query='Ayers', fields='firstname')" + "WHERE MULTI_MATCH('query'='Ayers', 'fields'='firstname')" ), hits( hasValueForFields("Ayers", "firstname") @@ -188,7 +188,7 @@ public void multiMatchQueryWildcardField() throws IOException { query( "SELECT firstname, lastname", FROM_ACCOUNTS, - "WHERE MULTI_MATCH(query='Bradshaw', fields='*name')" + "WHERE MULTI_MATCH('query'='Bradshaw', 'fields'='*name')" ), hits( hasValueForFields("Bradshaw", "firstname", "lastname") diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/esintgtest/QueryIT.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/esintgtest/QueryIT.java index 500b849b79..db95d54b2c 100644 --- a/src/test/java/com/amazon/opendistroforelasticsearch/sql/esintgtest/QueryIT.java +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/esintgtest/QueryIT.java @@ -201,6 +201,7 @@ public void selectSpecificFields() throws IOException { } } + @Ignore("Will fix this in issue https://github.com/opendistro-for-elasticsearch/sql/issues/121") @Test public void selectFieldWithSpace() throws IOException { String[] arr = new String[] {"test field"}; @@ -487,6 +488,7 @@ public void notBetweenTest() throws IOException { } } + @Ignore("Semantic analysis failed because 'age' doesn't exist.") @Test public void inTest() throws IOException { JSONObject response = executeQuery( @@ -521,7 +523,7 @@ public void inTermsTestWithIdentifiersTreatedLikeStrings() throws IOException { JSONObject response = executeQuery( String.format(Locale.ROOT, "SELECT name " + "FROM %s/gotCharacters " + - "WHERE name.firstname = IN_TERMS(daenerys,eddard) " + + "WHERE name.firstname = IN_TERMS('daenerys','eddard') " + "LIMIT 1000", TestsConstants.TEST_INDEX_GAME_OF_THRONES)); @@ -589,7 +591,7 @@ public void termQueryWithStringIdentifier() throws IOException { JSONObject response = executeQuery( String.format(Locale.ROOT, "SELECT name " + "FROM %s/gotCharacters " + - "WHERE name.firstname = term(brandon) " + + "WHERE name.firstname = term('brandon') " + "LIMIT 1000", TestsConstants.TEST_INDEX_GAME_OF_THRONES)); @@ -1433,7 +1435,7 @@ public void nestedOnInTermsQuery() throws IOException { JSONObject response = executeQuery( String.format(Locale.ROOT, "SELECT * " + "FROM %s/nestedType " + - "WHERE nested(message.info) = IN_TERMS(a, b)", + "WHERE nested(message.info) = IN_TERMS('a', 'b')", TestsConstants.TEST_INDEX_NESTED_TYPE)); Assert.assertEquals(3, getTotalHits(response)); diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/esintgtest/TermQueryExplainIT.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/esintgtest/TermQueryExplainIT.java index 6b8bbd00e6..96c706b591 100644 --- a/src/test/java/com/amazon/opendistroforelasticsearch/sql/esintgtest/TermQueryExplainIT.java +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/esintgtest/TermQueryExplainIT.java @@ -130,7 +130,7 @@ public void testIdenticalMappings() throws IOException { String result = explainQuery( "SELECT firstname, birthdate, state " + "FROM elasticsearch-sql_test_index_bank, elasticsearch-sql_test_index_bank_two " + - "WHERE state = 'WA' OR male = 'true'" + "WHERE state = 'WA' OR male = true" ); assertThat(result, containsString("term")); assertThat(result, containsString("state.keyword")); @@ -142,7 +142,7 @@ public void testIdenticalMappingsWithTypes() throws IOException { String result = explainQuery( "SELECT firstname, birthdate, state " + "FROM elasticsearch-sql_test_index_bank/account, elasticsearch-sql_test_index_bank_two/account_two " + - "WHERE state = 'WA' OR male = 'true'" + "WHERE state = 'WA' OR male = true" ); assertThat(result, containsString("term")); assertThat(result, containsString("state.keyword")); @@ -155,7 +155,7 @@ public void testIdenticalMappingsWithPartialType() throws IOException { String result = explainQuery( "SELECT firstname, birthdate, state " + "FROM elasticsearch-sql_test_index_bank/account, elasticsearch-sql_test_index_bank_two " + - "WHERE state = 'WA' OR male = 'true'" + "WHERE state = 'WA' OR male = true" ); assertThat(result, containsString("term")); assertThat(result, containsString("state.keyword")); @@ -189,7 +189,7 @@ public void testTextAndKeywordAppendsKeywordAlias() throws IOException { @Test public void testBooleanFieldNoKeywordAlias() throws IOException { - String result = explainQuery("SELECT * FROM elasticsearch-sql_test_index_bank WHERE male = 'false'"); + String result = explainQuery("SELECT * FROM elasticsearch-sql_test_index_bank WHERE male = false"); assertThat(result, containsString("term")); assertThat(result, not(containsString("male."))); } diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/unittest/JSONRequestTest.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/unittest/JSONRequestTest.java index 96caa82a01..efe1fd2d28 100644 --- a/src/test/java/com/amazon/opendistroforelasticsearch/sql/unittest/JSONRequestTest.java +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/unittest/JSONRequestTest.java @@ -262,9 +262,9 @@ public void searchSanity() throws IOException { @Test public void aggregationQuery() throws IOException { String result = explain(String.format("{\"query\":\"" + - "SELECT a, CASE WHEN gender='0' THEN 'aaa' ELSE 'bbb' END AS a2345, count(c) " + + "SELECT address, CASE WHEN gender='0' THEN 'aaa' ELSE 'bbb' END AS a2345, count(age) " + "FROM %s " + - "GROUP BY terms('field'='a','execution_hint'='global_ordinals'), a2345\"}", TestsConstants.TEST_INDEX_ACCOUNT)); + "GROUP BY terms('field'='address','execution_hint'='global_ordinals'), a2345\"}", TestsConstants.TEST_INDEX_ACCOUNT)); String expectedOutput = Files.toString( new File(getResourcePath() + "src/test/resources/expectedOutput/aggregation_query_explain.json"), StandardCharsets.UTF_8) .replaceAll("\r", ""); diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/unittest/LocalClusterStateTest.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/unittest/LocalClusterStateTest.java index ae890818b7..0c1859eeca 100644 --- a/src/test/java/com/amazon/opendistroforelasticsearch/sql/unittest/LocalClusterStateTest.java +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/unittest/LocalClusterStateTest.java @@ -16,9 +16,9 @@ package com.amazon.opendistroforelasticsearch.sql.unittest; import com.amazon.opendistroforelasticsearch.sql.esdomain.LocalClusterState; -import com.amazon.opendistroforelasticsearch.sql.esdomain.LocalClusterState.FieldMappings; -import com.amazon.opendistroforelasticsearch.sql.esdomain.LocalClusterState.IndexMappings; -import com.amazon.opendistroforelasticsearch.sql.esdomain.LocalClusterState.TypeMappings; +import com.amazon.opendistroforelasticsearch.sql.esdomain.mapping.FieldMappings; +import com.amazon.opendistroforelasticsearch.sql.esdomain.mapping.IndexMappings; +import com.amazon.opendistroforelasticsearch.sql.esdomain.mapping.TypeMappings; import com.amazon.opendistroforelasticsearch.sql.esintgtest.TestsConstants; import com.amazon.opendistroforelasticsearch.sql.plugin.SqlSettings; import org.elasticsearch.cluster.ClusterChangedEvent; diff --git a/src/test/resources/expectedOutput/aggregation_query_explain.json b/src/test/resources/expectedOutput/aggregation_query_explain.json index 975e1a25f0..9675b2b5be 100644 --- a/src/test/resources/expectedOutput/aggregation_query_explain.json +++ b/src/test/resources/expectedOutput/aggregation_query_explain.json @@ -3,14 +3,14 @@ "size" : 0, "_source" : { "includes" : [ - "a", + "address", "script", "COUNT" ], "excludes" : [ ] }, "stored_fields" : [ - "a", + "address", "a2345" ], "script_fields" : { @@ -23,9 +23,9 @@ } }, "aggregations" : { - "terms(field=a,execution_hint=global_ordinals)" : { + "terms(field=address,execution_hint=global_ordinals)" : { "terms" : { - "field" : "a", + "field" : "address", "size" : 10, "min_doc_count" : 1, "shard_min_doc_count" : 0, @@ -61,9 +61,9 @@ ] }, "aggregations" : { - "COUNT(c)" : { + "COUNT(age)" : { "value_count" : { - "field" : "c" + "field" : "age" } } } diff --git a/src/test/resources/expectedOutput/between_query.json b/src/test/resources/expectedOutput/between_query.json index a208651a4f..e3610f2dc1 100644 --- a/src/test/resources/expectedOutput/between_query.json +++ b/src/test/resources/expectedOutput/between_query.json @@ -3,14 +3,14 @@ "size" : 200, "_source" : { "includes" : [ - "cust_code" + "account_number" ], "excludes" : [ ] }, "script_fields" : { "test" : { "script" : { - "source" : "if((doc['value'].value >= 100 && doc['value'].value <=200)){'aaa'} else {doc['value'].value}", + "source" : "if((doc['balance'].value >= 100 && doc['balance'].value <=200)){'aaa'} else {doc['balance'].value}", "lang" : "painless" }, "ignore_failure" : false diff --git a/src/test/resources/expectedOutput/script_value.json b/src/test/resources/expectedOutput/script_value.json index c105e62d19..3c03baccff 100644 --- a/src/test/resources/expectedOutput/script_value.json +++ b/src/test/resources/expectedOutput/script_value.json @@ -3,7 +3,7 @@ "size" : 200, "_source" : { "includes" : [ - "cust_code" + "account_number" ], "excludes" : [ ] }, diff --git a/src/test/resources/mappings/field_mappings.json b/src/test/resources/mappings/field_mappings.json new file mode 100644 index 0000000000..bf059d0ae9 --- /dev/null +++ b/src/test/resources/mappings/field_mappings.json @@ -0,0 +1,72 @@ +{ + "semantics": { + "mappings": { + "account": { + "properties": { + "address": { + "type": "text" + }, + "age": { + "type": "integer" + }, + "employer": { + "type": "text", + "fields": { + "raw": { + "type": "text", + "ignore_above": 256 + }, + "keyword": { + "type": "keyword", + "ignore_above": 256 + } + } + }, + "projects": { + "type": "nested", + "properties": { + "members": { + "type": "nested", + "properties": { + "name": { + "type": "text" + } + } + }, + "active": { + "type": "boolean" + } + } + }, + "manager": { + "properties": { + "name": { + "type": "text", + "fields": { + "keyword": { + "type": "keyword", + "ignore_above": 256 + } + } + }, + "address": { + "type": "keyword" + } + } + } + } + } + }, + "settings": { + "index": { + "number_of_shards": 5, + "number_of_replicas": 0, + "version": { + "created": "6050399" + } + } + }, + "mapping_version": "1", + "settings_version": "1" + } +} \ No newline at end of file diff --git a/src/test/resources/mappings/semantics.json b/src/test/resources/mappings/semantics.json new file mode 100644 index 0000000000..46de8f2eaa --- /dev/null +++ b/src/test/resources/mappings/semantics.json @@ -0,0 +1,92 @@ +{ + "field_mappings": { + "mappings": { + "account": { + "properties": { + "address": { + "type": "text" + }, + "age": { + "type": "integer" + }, + "balance": { + "type": "double" + }, + "city": { + "type": "keyword" + }, + "birthday": { + "type": "date" + }, + "location": { + "type": "geo_point" + }, + "new_field": { + "type": "some_new_es_type_outside_type_system" + }, + "field with spaces": { + "type": "text" + }, + "employer": { + "type": "text", + "fields": { + "keyword": { + "type": "keyword", + "ignore_above": 256 + } + } + }, + "projects": { + "type": "nested", + "properties": { + "members": { + "type": "nested", + "properties": { + "name": { + "type": "text" + } + } + }, + "active": { + "type": "boolean" + }, + "release": { + "type": "date" + } + } + }, + "manager": { + "properties": { + "name": { + "type": "text", + "fields": { + "keyword": { + "type": "keyword", + "ignore_above": 256 + } + } + }, + "address": { + "type": "keyword" + }, + "salary": { + "type": "long" + } + } + } + } + } + }, + "settings": { + "index": { + "number_of_shards": 5, + "number_of_replicas": 0, + "version": { + "created": "6050399" + } + } + }, + "mapping_version": "1", + "settings_version": "1" + } +} \ No newline at end of file