Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implementation for simple_query_string. #61

Closed
Show file tree
Hide file tree
Changes from 16 commits
Commits
Show all changes
43 commits
Select commit Hold shift + click to select a range
a422774
Support MATCH_BOOL_PREFIX in SQL parser.
May 17, 2022
838da0f
Add an interface for PPL and SQL parsers.
May 17, 2022
5b346ea
Implementation for `simple_query_string`.
Yury-Fridlyand May 17, 2022
f57429c
Implement Parser interface on PPLSyntaxParser.
May 18, 2022
e80a334
Add base class for all SQL tests.
May 18, 2022
531f31c
Add match_bool_prefix to list of known functions.
May 18, 2022
1b85a9e
Add match_bool_prefix to DSL.
May 18, 2022
dd4f77b
Add MatchBoolPrefixQuery query builder.
May 18, 2022
1ba94d9
Add MatchBoolPrefixQuery to FilterQueryBuilder.
May 18, 2022
a51c0ca
Fix code style.
Yury-Fridlyand May 18, 2022
1855e39
Address PR comments.
May 19, 2022
4321fc1
100% unit test coverage for RelevanceQuery class.
May 19, 2022
92087d9
Add integration test for match_bool_prefix in SQL.
May 19, 2022
c8a5da2
Renaming to make code clearer.
Yury-Fridlyand May 19, 2022
94748b5
Fix parser to accept strings only for field.
Yury-Fridlyand May 19, 2022
d5def08
Fix tests according to the parser update.
Yury-Fridlyand May 19, 2022
3a36a0f
Add DSL test.
Yury-Fridlyand May 19, 2022
2b4a99b
Remaning to fit code standard.
Yury-Fridlyand May 19, 2022
84d84e1
Typo fix.
Yury-Fridlyand May 19, 2022
d379af2
Typo fix for `c8a5da2`.
Yury-Fridlyand May 19, 2022
56ead1b
Enable MatchBoolPrefixParserTest tests since they now pass.
May 20, 2022
bea74fe
Update parser syntax. Rework on field list as on 2D array.
Yury-Fridlyand May 20, 2022
4ae2375
Next iteration on `simple_query_string`.
Yury-Fridlyand May 25, 2022
ebf563c
Simplify processing of `RelevanceFieldList`.
Yury-Fridlyand May 26, 2022
a149dd7
Add user documentation for match_bool_prefix.
May 27, 2022
5ccadd5
Merge from upstream/main and resolve conflicts.
May 30, 2022
b879b03
ANTLR: Extend list of relenvace function arguments.
Yury-Fridlyand May 31, 2022
a7b0a8c
Small optimization in AST processing. Code clean-up. Added argument c…
Yury-Fridlyand May 31, 2022
4e43ccd
Add parser tests.
Yury-Fridlyand May 31, 2022
a641212
Add rest unit and DSL tests.
Yury-Fridlyand May 31, 2022
b2639e3
Add integration tests.
Yury-Fridlyand May 31, 2022
f7923ee
Add doctest.
Yury-Fridlyand May 31, 2022
33edff4
Update test notes.
Yury-Fridlyand May 31, 2022
7c3ff71
Merge remote-tracking branch 'origin/opensearch-project-main' into de…
Yury-Fridlyand Jun 1, 2022
b3bf2c8
Fix legacy test compilation errors, gradle task `:legacy:compileTestJ…
Yury-Fridlyand Jun 1, 2022
928faf5
Fix merge errors.
Yury-Fridlyand Jun 1, 2022
5541d95
Code style.
Yury-Fridlyand Jun 1, 2022
de179c3
Rework `SimpleQueryStringQuery` to use the `RelevanceFunction` interf…
Yury-Fridlyand Jun 1, 2022
a9d5c19
Add more unit tests and test notes.
Yury-Fridlyand Jun 1, 2022
65ec516
Fix merge errors in legacy test. Revert `b3bf2c8`.
Yury-Fridlyand Jun 1, 2022
71bf20c
Minor optimizations according to PR feedback.
Yury-Fridlyand Jun 1, 2022
048984c
Fix flakyness of regular `Map` (`HashMap`), by replacing it with `Imm…
Yury-Fridlyand Jun 2, 2022
cba5749
Revert merge with `opensearch-project-main`, commit `7c3ff71`.
Yury-Fridlyand Jun 2, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
package org.opensearch.sql.common.antlr;

import org.antlr.v4.runtime.tree.ParseTree;

public interface Parser {
ParseTree parse(String query);
}
12 changes: 12 additions & 0 deletions core/src/main/java/org/opensearch/sql/expression/DSL.java
Original file line number Diff line number Diff line change
Expand Up @@ -654,6 +654,18 @@ public FunctionExpression match(Expression... args) {
return compile(BuiltinFunctionName.MATCH, args);
}

public FunctionExpression match_phrase(Expression... args) {
return compile(BuiltinFunctionName.MATCH_PHRASE, args);
}

public FunctionExpression match_bool_prefix(Expression... args) {
return compile(BuiltinFunctionName.MATCH_BOOL_PREFIX, args);
}

private FunctionExpression compile(BuiltinFunctionName bfn, Expression... args) {
return (FunctionExpression) repository.compile(bfn.getName(), Arrays.asList(args.clone()));
}

public FunctionExpression simple_query_string(Expression... args) {
return (FunctionExpression) repository
.compile(BuiltinFunctionName.SIMPLE_QUERY_STRING.getName(), Arrays.asList(args.clone()));
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -187,6 +187,9 @@ public enum BuiltinFunctionName {
* Relevance Function.
*/
MATCH(FunctionName.of("match")),
MATCH_PHRASE(FunctionName.of("match_phrase")),
MATCHPHRASE(FunctionName.of("matchphrase")),
MATCH_BOOL_PREFIX(FunctionName.of("match_bool_prefix")),
SIMPLE_QUERY_STRING(FunctionName.of("simple_query_string")),

/**
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -8,13 +8,15 @@
import static org.opensearch.sql.data.type.ExprCoreType.STRING;
import static org.opensearch.sql.data.type.ExprCoreType.STRUCT;

import com.google.common.collect.ImmutableList;
import com.google.common.collect.ImmutableMap;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import lombok.experimental.UtilityClass;
import org.apache.commons.lang3.tuple.ImmutablePair;
import org.opensearch.sql.data.model.ExprValue;
import org.opensearch.sql.data.type.ExprCoreType;
import org.opensearch.sql.data.type.ExprType;
Expand All @@ -29,27 +31,44 @@ public class OpenSearchFunctions {
public static final int MATCH_MAX_NUM_PARAMETERS = 12;
public static final int MATCH_PHRASE_MAX_NUM_PARAMETERS = 3;
public static final int MIN_NUM_PARAMETERS = 2;
public static final int SIMPLE_QUERY_STRING_MAX_NUM_PARAMETERS = 12;

/**
* Add functions specific to OpenSearch to repository.
*/
public void register(BuiltinFunctionRepository repository) {
repository.register(match_bool_prefix());
repository.register(match());
repository.register(simple_query_string());
// Register MATCHPHRASE as MATCH_PHRASE as well for backwards
// compatibility.
repository.register(match_phrase(BuiltinFunctionName.MATCH_PHRASE));
repository.register(match_phrase(BuiltinFunctionName.MATCHPHRASE));
}

private static FunctionResolver match_bool_prefix() {
FunctionName name = BuiltinFunctionName.MATCH_BOOL_PREFIX.getName();
// TODO: Create different resolver more suited for relevance functions.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we do this?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is caused by the merge. Fixed in cba5749, the merge was reverted.

return new FunctionResolver(name,
ImmutableMap.<FunctionSignature, FunctionBuilder>builder()
.put(new FunctionSignature(name, ImmutableList.of(STRING, STRING)),
args -> new OpenSearchFunction(name, args))
.build());
}

private static FunctionResolver match() {
FunctionName funcName = BuiltinFunctionName.MATCH.getName();
// At most field, query, and all optional parameters
final int matchMaxNumParameters = 14;
return getRelevanceFunctionResolver(funcName, matchMaxNumParameters, STRING);
return getRelevanceFunctionResolver(funcName, MATCH_MAX_NUM_PARAMETERS, STRING);
}

private static FunctionResolver match_phrase(BuiltinFunctionName matchPhrase) {
FunctionName funcName = matchPhrase.getName();
return getRelevanceFunctionResolver(funcName, MATCH_PHRASE_MAX_NUM_PARAMETERS, STRING);
}

private static FunctionResolver simple_query_string() {
FunctionName funcName = BuiltinFunctionName.SIMPLE_QUERY_STRING.getName();
// At most field, query, and all optional parameters
final int simpleQueryStringMaxNumParameters = 12;
return getRelevanceFunctionResolver(funcName, simpleQueryStringMaxNumParameters, STRUCT);
return getRelevanceFunctionResolver(funcName, SIMPLE_QUERY_STRING_MAX_NUM_PARAMETERS, STRUCT);
}

private static FunctionResolver getRelevanceFunctionResolver(
Expand All @@ -59,11 +78,12 @@ private static FunctionResolver getRelevanceFunctionResolver(
}

private static Map<FunctionSignature, FunctionBuilder> getRelevanceFunctionSignatureMap(
FunctionName funcName, int maxNumParameters, ExprCoreType firstArgType) {
final int minNumParameters = 1;
FunctionName funcName, int numOptionalParameters, ExprCoreType firstArgType) {
FunctionBuilder buildFunction = args -> new OpenSearchFunction(funcName, args);
var signatureMapBuilder = ImmutableMap.<FunctionSignature, FunctionBuilder>builder();
for (int numParameters = minNumParameters; numParameters <= maxNumParameters; numParameters++) {
for (int numParameters = MIN_NUM_PARAMETERS - 1;
numParameters <= MIN_NUM_PARAMETERS + numOptionalParameters;
numParameters++) {
List<ExprType> args = new ArrayList<>(Collections.nCopies(numParameters, STRING));
args.add(0, firstArgType);
signatureMapBuilder.put(new FunctionSignature(funcName, args), buildFunction);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -342,6 +342,17 @@ public void named_non_parse_expression() {
assertAnalyzeEqual(DSL.ref("string_field", STRING), qualifiedName("string_field"));
}

@Test
void match_bool_prefix_expression() {
assertAnalyzeEqual(
dsl.match_bool_prefix(
dsl.namedArgument("field", DSL.literal("fieldA")),
dsl.namedArgument("query", DSL.literal("sample query"))),
AstDSL.function("match_bool_prefix",
AstDSL.unresolvedArg("field", stringLiteral("fieldA")),
AstDSL.unresolvedArg("query", stringLiteral("sample query"))));
}

@Test
void visit_span() {
assertAnalyzeEqual(
Expand Down Expand Up @@ -394,8 +405,8 @@ void simple_query_string_expression_with_params() {
}

@Test
// Test is disabled because DSL and AstDSL expressions always has fields in the different order
// regardless of how they are specified
// Test is disabled because DSL and AstDSL expressions sometimes has fields in the different order
// regardless of how they are specified. Test passes and fails in a flaky behavior.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unfortunate.
Can we raise a follow-up ticket to resolve this in a timely manner?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 048984c.

@Disabled
void simple_query_string_expression_two_fields() {
assertAnalyzeEqual(
Expand All @@ -406,8 +417,8 @@ void simple_query_string_expression_two_fields() {
dsl.namedArgument("query", DSL.literal("sample query"))),
AstDSL.function("simple_query_string",
AstDSL.unresolvedArg("fields", new RelevanceFieldList(Map.of(
stringLiteral("field2"), floatLiteral(.3F),
stringLiteral("field1"), floatLiteral(1.F)))),
stringLiteral("field1"), floatLiteral(1.F),
stringLiteral("field2"), floatLiteral(.3F)))),
AstDSL.unresolvedArg("query", stringLiteral("sample query"))));
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@
import static org.opensearch.sql.data.type.ExprCoreType.BOOLEAN;

import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import org.junit.jupiter.api.Test;
import org.opensearch.sql.data.model.ExprTupleValue;
Expand Down
64 changes: 60 additions & 4 deletions docs/user/dql/functions.rst
Original file line number Diff line number Diff line change
Expand Up @@ -2183,8 +2183,6 @@ Example with only ``field`` and ``query`` expressions, and all other parameters
| Bates | 789 Madison Street |
+------------+--------------------+



Another example to show how to set custom values for the optional parameters::

os> SELECT lastname FROM accounts WHERE match(firstname, 'Hattie', operator='AND', boost=2.0);
Expand Down Expand Up @@ -2237,8 +2235,6 @@ Example with only ``fields`` and ``query`` expressions, and all other parameters
| Nanette | Bates | Nogal | 789 Madison Street |
+-------------+------------+--------+--------------------+



Another example to show how to set custom values for the optional parameters::

os> select firstname, lastname, city, address from accounts where simple_query_string(['firstname', city ^ 2], 'Amber Nogal', analyzer=keyword, default_operator='AND');
Expand All @@ -2247,3 +2243,63 @@ Another example to show how to set custom values for the optional parameters::
| firstname | lastname | city | address |
|-------------+------------+--------+-----------|
+-------------+------------+--------+-----------+

MATCH_PHRASE
------------

Description
>>>>>>>>>>>

``match_phrase(field_expression, query_expression[, option=<option_value>]*)``

The match_phrase function maps to the match_phrase query used in search engine, to return the documents that match a provided text with a given field. Available parameters include:

- analyzer
- slop
- zero_terms_query

For backward compatibility, matchphrase is also supported and mapped to match_phrase query as well.

Example with only ``field`` and ``query`` expressions, and all other parameters are set default values::

os> SELECT author, title FROM books WHERE match_phrase(author, 'Alexander Milne');
fetched rows / total rows = 2/2
+----------------------+--------------------------+
| author | title |
|----------------------+--------------------------|
| Alan Alexander Milne | The House at Pooh Corner |
| Alan Alexander Milne | Winnie-the-Pooh |
+----------------------+--------------------------+

Another example to show how to set custom values for the optional parameters::

os> SELECT author, title FROM books WHERE match_phrase(author, 'Alan Milne', slop = 2);
fetched rows / total rows = 2/2
+----------------------+--------------------------+
| author | title |
|----------------------+--------------------------|
| Alan Alexander Milne | The House at Pooh Corner |
| Alan Alexander Milne | Winnie-the-Pooh |
+----------------------+--------------------------+


MATCH_BOOL_PREFIX
-----------------

Description
>>>>>>>>>>>

``match_bool_prefix(field_expression, query_expression)``

The match_bool_prefix function maps to the match_bool_prefix query in the search engine. match_bool_prefix creates a match query from all but the last term in the query string. The last term is used to create a prefix query.

Example with only ``field`` and ``query`` expressions, and all other parameters are set default values::

os> SELECT firstname, address FROM accounts WHERE match_bool_prefix(address, 'Bristol Stre');
fetched rows / total rows = 2/2
+-------------+--------------------+
| firstname | address |
|-------------+--------------------|
| Hattie | 671 Bristol Street |
| Nanette | 789 Madison Street |
+-------------+--------------------+
40 changes: 38 additions & 2 deletions docs/user/ppl/functions/relevance.rst
Original file line number Diff line number Diff line change
Expand Up @@ -99,8 +99,6 @@ Example with only ``fields`` and ``query`` expressions, and all other parameters
| Nanette | Bates | Nogal | 789 Madison Street |
+-------------+------------+--------+--------------------+



Another example to show how to set custom values for the optional parameters::

os> source=accounts | where simple_query_string(['firstname', city ^ 2], 'Amber Nogal', analyzer=keyword, default_operator='AND') | fields firstname, lastname, city, address;
Expand All @@ -110,6 +108,44 @@ Another example to show how to set custom values for the optional parameters::
|-------------+------------+--------+-----------|
+-------------+------------+--------+-----------+

MATCH_PHRASE
------------

Description
>>>>>>>>>>>

``match_phrase(field_expression, query_expression[, option=<option_value>]*)``

The match_phrase function maps to the match_phrase query used in search engine, to return the documents that match a provided text with a given field. Available parameters include:

- analyzer
- slop
- zero_terms_query

For backward compatibility, matchphrase is also supported and mapped to match_phrase query as well.

Example with only ``field`` and ``query`` expressions, and all other parameters are set default values::

os> source=books | where match_phrase(author, 'Alexander Milne') | fields author, title
fetched rows / total rows = 2/2
+----------------------+--------------------------+
| author | title |
|----------------------+--------------------------|
| Alan Alexander Milne | The House at Pooh Corner |
| Alan Alexander Milne | Winnie-the-Pooh |
+----------------------+--------------------------+

Another example to show how to set custom values for the optional parameters::

os> source=books | where match_phrase(author, 'Alan Milne', slop = 2) | fields author, title
fetched rows / total rows = 2/2
+----------------------+--------------------------+
| author | title |
|----------------------+--------------------------|
| Alan Alexander Milne | The House at Pooh Corner |
| Alan Alexander Milne | Winnie-the-Pooh |
+----------------------+--------------------------+

Limitations
>>>>>>>>>>>

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -480,7 +480,7 @@ public void castStatementInWhereClauseLessThanConstantTest() {
public void castStatementInWhereClauseDatetimeCastTest() {
JSONObject response = executeJdbcRequest("SELECT date_keyword FROM "
+ TestsConstants.TEST_INDEX_DATE
+ " WHERE (CAST(date_keyword AS DATETIME) = \'2014-08-19T07:09:13.434Z\')");
+ " WHERE (CAST(date_keyword AS DATETIME) = '2014-08-19T07:09:13.434Z')");

String schema_result = "{\"name\":\"date_keyword\",\"type\":\"keyword\"}";
assertEquals(response.getJSONArray("schema").get(0).toString(), schema_result);
Expand Down Expand Up @@ -704,7 +704,7 @@ public void ifFuncShouldPassJDBC() {
JSONObject response = executeJdbcRequest(
"SELECT IF(age > 30, 'True', 'False') AS Ages FROM " + TEST_INDEX_ACCOUNT
+ " WHERE age IS NOT NULL GROUP BY Ages");
assertEquals("IF(age > 30, \'True\', \'False\')", response.query("/schema/0/name"));
assertEquals("IF(age > 30, 'True', 'False')", response.query("/schema/0/name"));
assertEquals("Ages", response.query("/schema/0/alias"));
assertEquals("keyword", response.query("/schema/0/type"));
}
Expand Down Expand Up @@ -742,7 +742,7 @@ public void ifnullShouldPassJDBC() throws IOException {
JSONObject response = executeJdbcRequest(
"SELECT IFNULL(lastname, 'unknown') AS name FROM " + TEST_INDEX_ACCOUNT
+ " GROUP BY name");
assertEquals("IFNULL(lastname, \'unknown\')", response.query("/schema/0/name"));
assertEquals("IFNULL(lastname, 'unknown')", response.query("/schema/0/name"));
assertEquals("name", response.query("/schema/0/alias"));
assertEquals("keyword", response.query("/schema/0/type"));
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -45,7 +45,7 @@ public void ifnullShouldPassJDBC() throws IOException {
JSONObject response = executeJdbcRequest(
"SELECT IFNULL(lastname, 'unknown') AS name FROM " + TEST_INDEX_ACCOUNT
+ " GROUP BY name");
assertEquals("IFNULL(lastname, \'unknown\')", response.query("/schema/0/name"));
assertEquals("IFNULL(lastname, 'unknown')", response.query("/schema/0/name"));
assertEquals("name", response.query("/schema/0/alias"));
assertEquals("keyword", response.query("/schema/0/type"));
}
Expand Down Expand Up @@ -92,7 +92,7 @@ public void ifnullWithMissingInputTest() {
public void nullifShouldPassJDBC() throws IOException {
JSONObject response = executeJdbcRequest(
"SELECT NULLIF(lastname, 'unknown') AS name FROM " + TEST_INDEX_ACCOUNT);
assertEquals("NULLIF(lastname, \'unknown\')", response.query("/schema/0/name"));
assertEquals("NULLIF(lastname, 'unknown')", response.query("/schema/0/name"));
assertEquals("name", response.query("/schema/0/alias"));
assertEquals("keyword", response.query("/schema/0/type"));
}
Expand Down Expand Up @@ -181,8 +181,8 @@ public void isnullWithMathExpr() throws IOException{
@Test
public void ifShouldPassJDBC() throws IOException {
JSONObject response = executeJdbcRequest(
"SELECT IF(2 > 0, \'hello\', \'world\') AS name FROM " + TEST_INDEX_ACCOUNT);
assertEquals("IF(2 > 0, \'hello\', \'world\')", response.query("/schema/0/name"));
"SELECT IF(2 > 0, 'hello', 'world') AS name FROM " + TEST_INDEX_ACCOUNT);
assertEquals("IF(2 > 0, 'hello', 'world')", response.query("/schema/0/name"));
assertEquals("name", response.query("/schema/0/alias"));
assertEquals("keyword", response.query("/schema/0/type"));
}
Expand Down
Loading