Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQL: Add support for FROZEN indices #41558

Merged
merged 11 commits into from
May 9, 2019
Merged

Conversation

costin
Copy link
Member

@costin costin commented Apr 25, 2019

Allow querying of FROZEN indices both through dedicated SQL grammar
extension:

SELECT field FROM FROZEN index

and also through driver configuration parameter, namely:

index.include.frozen: true/false

Fix #39390
Fix #39377

Allow querying of FROZEN indices both through dedicated SQL grammar
extension:
> SELECT field FROM FROZEN index
and also through driver configuration parameter, namely:
> index.include.frozen: true/false

Fix elastic#39390
Fix elastic#39377
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search

@costin costin requested a review from bpintea April 26, 2019 07:43
try (Connection csv = csvConnection(testCase); Connection es = esJdbc()) {
executeAndAssert(csv, es);
}
try (Connection csv = csvConnection(testCase); Connection es = esJdbc()) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@matriv The previous hack was unnecessary - the conditional should be addressed in connectionProperties and further more, esJdbc() actually calls esJdbc(connectionProperties()) so the code was not only duplicated but also identical in results.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get how this is addressed only for a specific file, in this case: time.csv-spec.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not but then again, it wasn't before.
In other words, the if ("time".equals(groupName)) does the right thing but calling esJdbc(connectionProperties()) is the same as calling esJdbc() meaning the if and else branch end up calling the same code path.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not the case, if you check again there are 2 different connectionProperties() involved there. and the one for the time sets the timezone to UTC where the other uses a random timezone.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@costin Did you check this ^^ ?

test_alias_emp |VIEW
test_emp |BASE TABLE
test_emp_copy |BASE TABLE
name | type | native
Copy link
Member Author

@costin costin Apr 26, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still thinking whether the third column should be native or something else (storage doesn't fit either) and mapping seems inappropriate.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like native but how about native_type ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think native_type is more clear...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

somewhat redundant - type, native type - plus should it be native_type or native type? A single word is much preferred.

Copy link
Contributor

@matriv matriv May 3, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imho, the name of a previous column (type) is completely irrelevant to the name of the next one, and I believe it's more clear to have it as native_type. And I would for sure vote for the _ instead of a space: .

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opted for kind - one word, small (4 letters like the rest) and fairly accurate.

@@ -33,12 +33,14 @@
private static final ObjectParser<SqlQueryRequest, Void> PARSER = objectParser(SqlQueryRequest::new);
static final ParseField COLUMNAR = new ParseField("columnar");
static final ParseField FIELD_MULTI_VALUE_LENIENCY = new ParseField("field_multi_value_leniency");
static final ParseField INDEX_INCLUDE_FROZEN = new ParseField("index_include_frozen");
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bpintea this is a new parameter that drivers can pass in to allow all queries inside one connection to include frozen indices. If not, the user would have to manually mention this through INCLUDE FROZEN and FROZEN.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opened elastic/elasticsearch-sql-odbc#151. Note that the parameter is false by default.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bpintea thanks - I think there's an area that needs more exploring, namely SYS commands in particular that of types. Since a frozen index is still mapped as a BASE TABLE, the server needs to properly take the setting into account but I have to double check whether that's the case - just something to keep in mind.

| SHOW TABLES (tableLike=likePattern | tableIdent=tableIdentifier)? #showTables
| SHOW COLUMNS (FROM | IN) (tableLike=likePattern | tableIdent=tableIdentifier) #showColumns
| (DESCRIBE | DESC) (tableLike=likePattern | tableIdent=tableIdentifier) #showColumns
| SHOW TABLES (INCLUDE FROZEN)? (tableLike=likePattern | tableIdent=tableIdentifier)? #showTables
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the declarative approach for frozen which is useful when not using index.include.frozen (again . are used by drivers, _ is used in rest).

| (DESCRIBE | DESC) (tableLike=likePattern | tableIdent=tableIdentifier) #showColumns
| SHOW TABLES (INCLUDE FROZEN)? (tableLike=likePattern | tableIdent=tableIdentifier)? #showTables
| SHOW COLUMNS (INCLUDE FROZEN)? (FROM | IN) (tableLike=likePattern | tableIdent=tableIdentifier) #showColumns
| (DESCRIBE | DESC) (INCLUDE FROZEN)? (tableLike=likePattern | tableIdent=tableIdentifier) #showColumns
| SHOW FUNCTIONS (likePattern)? #showFunctions
| SHOW SCHEMAS #showSchemas
| SYS TABLES (CATALOG clusterLike=likePattern)?
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notice the SYS commands do not require INCLUDE FROZEN since these are meant only for drivers in which case the option is set through the configuration.

@@ -408,14 +444,14 @@ public void resolveAsSeparateMappings(String indexWildcard, String javaRegex, Ac
/**
* Assemble an index-based mapping from the field caps (which is field based) by looking at the indices associated with
* each field.
*/
*/
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like formatting got affected in merging -will follow up.

Copy link
Contributor

@matriv matriv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great overall! Left some minor comments.

Could we have a test also for the translate api on a frozen index?

name:s | type:s
test_alias |VIEW
test_alias_emp |VIEW
name:s | type:s | native:s
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor, maybe take the opportunity and align the outputs in the test results.

@@ -23,9 +24,9 @@
public static class PreAnalysis {
public static final PreAnalysis EMPTY = new PreAnalysis(emptyList());

public final List<TableIdentifier> indices;
public final List<Tuple<TableIdentifier, Boolean>> indices;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add the frozen as a property in TableIdentifier instead, maybe even rename it to TableInfo.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TableIdentifier is mapped to parsing and has a well defined scope - frozen is a property of a table not of its id (which is just another property).
Since in the future we might have other properties (potentially remote from CCS), I've retrofitted the tuple into a TableInfo class inside the analysis package.


public static final EnumSet<IndexType> VALID = EnumSet.of(INDEX, ALIAS);
public static final EnumSet<IndexType> VALID = EnumSet.of(STANDARD_INDEX, ALIAS, FROZEN_INDEX);
public static final EnumSet<IndexType> VALID_WO_FROZEN = EnumSet.of(STANDARD_INDEX, ALIAS);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor suggestion: I would choose to name it plain VALID and the other VALID_WITH_FROZEN.

// and create a MultiTypeField

for (Entry<String, FieldCapabilities> type : types.entrySet()) {
// build the error message
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You've already spotted it, but just a reminder to revert the formatting changes.

Copy link
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM in general. Left few comments.
Documentation will be added later?

test_alias_emp |VIEW
test_emp |BASE TABLE
test_emp_copy |BASE TABLE
name | type | native
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think native_type is more clear...


private final String toSql;
private final String toNative;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a bit more descriptive as toNativeType?

GetIndexRequest indexRequest = new GetIndexRequest()
.local(true)
.indices(indices)
.features(Feature.SETTINGS)
.includeDefaults(false)
.indicesOptions(INDICES_ONLY_OPTIONS);

// if frozen indices are request, make sure to update the request accordingly
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: if frozen indices are request**ed**

SearchRequest search = client.prepareSearch(indices)
// always track total hits accurately
.setTrackTotalHits(true)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why removing this one?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch - merging error.

@@ -449,7 +464,8 @@ public boolean equals(Object obj) {
&& Objects.equals(fields, other.fields)
&& Objects.equals(aliases, other.aliases)
&& Objects.equals(sort, other.sort)
&& Objects.equals(limit, other.limit);
&& Objects.equals(limit, other.limit) && Objects.equals(trackHits, other.trackHits)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Each equals on its own line, to have a consistent approach?

@costin
Copy link
Member Author

costin commented May 3, 2019

LGTM in general. Left few comments.
Documentation will be added later?

Yes, I'd like to keep it separate since it would be easy to iterate over it in the CI without running the whole suite.

@costin
Copy link
Member Author

costin commented May 6, 2019

Currently blocked by #41795

Copy link
Contributor

@matriv matriv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@costin
Copy link
Member Author

costin commented May 9, 2019

run elasticsearch-ci/1

@costin costin merged commit 2445a93 into elastic:master May 9, 2019
@costin costin deleted the sql/frozen-indices branch May 9, 2019 21:27
imotov added a commit that referenced this pull request May 13, 2019
costin added a commit that referenced this pull request May 23, 2019
Allow querying of FROZEN indices both through dedicated SQL grammar
extension:
> SELECT field FROM FROZEN index
and also through driver configuration parameter, namely:
> index.include.frozen: true/false

Fix #39390
Fix #39377

(cherry picked from commit 2445a93)
matriv pushed a commit to matriv/elasticsearch that referenced this pull request May 24, 2019
Allow querying of FROZEN indices both through dedicated SQL grammar
extension:
> SELECT field FROM FROZEN index
and also through driver configuration parameter, namely:
> index.include.frozen: true/false

Fix elastic#39390
Fix elastic#39377

(cherry picked from commit 2445a93)
@costin costin added v7.3.0 and removed v7.2.0 labels May 25, 2019
gurkankaymak pushed a commit to gurkankaymak/elasticsearch that referenced this pull request May 27, 2019
Allow querying of FROZEN indices both through dedicated SQL grammar
extension:
> SELECT field FROM FROZEN index
and also through driver configuration parameter, namely:
> index.include.frozen: true/false

Fix elastic#39390
Fix elastic#39377
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SQL: Support Frozen indices Support for ignore_throttled parameter in /_xpack/sql request
6 participants