-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SQL: Add support for FROZEN indices #41558
Conversation
Allow querying of FROZEN indices both through dedicated SQL grammar extension: > SELECT field FROM FROZEN index and also through driver configuration parameter, namely: > index.include.frozen: true/false Fix elastic#39390 Fix elastic#39377
Pinging @elastic/es-search |
try (Connection csv = csvConnection(testCase); Connection es = esJdbc()) { | ||
executeAndAssert(csv, es); | ||
} | ||
try (Connection csv = csvConnection(testCase); Connection es = esJdbc()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@matriv The previous hack was unnecessary - the conditional should be addressed in connectionProperties
and further more, esJdbc()
actually calls esJdbc(connectionProperties())
so the code was not only duplicated but also identical in results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get how this is addressed only for a specific file, in this case: time.csv-spec
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not but then again, it wasn't before.
In other words, the if ("time".equals(groupName)) does the right thing but calling esJdbc(connectionProperties()) is the same as calling esJdbc() meaning the if and else branch end up calling the same code path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not the case, if you check again there are 2 different connectionProperties()
involved there. and the one for the time
sets the timezone to UTC where the other uses a random timezone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@costin Did you check this ^^ ?
test_alias_emp |VIEW | ||
test_emp |BASE TABLE | ||
test_emp_copy |BASE TABLE | ||
name | type | native |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still thinking whether the third column should be native
or something else (storage
doesn't fit either) and mapping
seems inappropriate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like native
but how about native_type
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think native_type
is more clear...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
somewhat redundant - type, native type - plus should it be native_type
or native type
? A single word is much preferred.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Imho, the name of a previous column (type
) is completely irrelevant to the name of the next one, and I believe it's more clear to have it as native_type
. And I would for sure vote for the _
instead of a space:
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I opted for kind
- one word, small (4 letters like the rest) and fairly accurate.
@@ -33,12 +33,14 @@ | |||
private static final ObjectParser<SqlQueryRequest, Void> PARSER = objectParser(SqlQueryRequest::new); | |||
static final ParseField COLUMNAR = new ParseField("columnar"); | |||
static final ParseField FIELD_MULTI_VALUE_LENIENCY = new ParseField("field_multi_value_leniency"); | |||
static final ParseField INDEX_INCLUDE_FROZEN = new ParseField("index_include_frozen"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bpintea this is a new parameter that drivers can pass in to allow all queries inside one connection to include frozen indices. If not, the user would have to manually mention this through INCLUDE FROZEN
and FROZEN
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened elastic/elasticsearch-sql-odbc#151. Note that the parameter is false
by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bpintea thanks - I think there's an area that needs more exploring, namely SYS
commands in particular that of types. Since a frozen index is still mapped as a BASE TABLE
, the server needs to properly take the setting into account but I have to double check whether that's the case - just something to keep in mind.
| SHOW TABLES (tableLike=likePattern | tableIdent=tableIdentifier)? #showTables | ||
| SHOW COLUMNS (FROM | IN) (tableLike=likePattern | tableIdent=tableIdentifier) #showColumns | ||
| (DESCRIBE | DESC) (tableLike=likePattern | tableIdent=tableIdentifier) #showColumns | ||
| SHOW TABLES (INCLUDE FROZEN)? (tableLike=likePattern | tableIdent=tableIdentifier)? #showTables |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the declarative approach for frozen which is useful when not using index.include.frozen
(again .
are used by drivers, _
is used in rest).
| (DESCRIBE | DESC) (tableLike=likePattern | tableIdent=tableIdentifier) #showColumns | ||
| SHOW TABLES (INCLUDE FROZEN)? (tableLike=likePattern | tableIdent=tableIdentifier)? #showTables | ||
| SHOW COLUMNS (INCLUDE FROZEN)? (FROM | IN) (tableLike=likePattern | tableIdent=tableIdentifier) #showColumns | ||
| (DESCRIBE | DESC) (INCLUDE FROZEN)? (tableLike=likePattern | tableIdent=tableIdentifier) #showColumns | ||
| SHOW FUNCTIONS (likePattern)? #showFunctions | ||
| SHOW SCHEMAS #showSchemas | ||
| SYS TABLES (CATALOG clusterLike=likePattern)? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Notice the SYS
commands do not require INCLUDE FROZEN
since these are meant only for drivers in which case the option is set through the configuration.
@@ -408,14 +444,14 @@ public void resolveAsSeparateMappings(String indexWildcard, String javaRegex, Ac | |||
/** | |||
* Assemble an index-based mapping from the field caps (which is field based) by looking at the indices associated with | |||
* each field. | |||
*/ | |||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like formatting got affected in merging -will follow up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great overall! Left some minor comments.
Could we have a test also for the translate api on a frozen index?
name:s | type:s | ||
test_alias |VIEW | ||
test_alias_emp |VIEW | ||
name:s | type:s | native:s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor, maybe take the opportunity and align the outputs in the test results.
@@ -23,9 +24,9 @@ | |||
public static class PreAnalysis { | |||
public static final PreAnalysis EMPTY = new PreAnalysis(emptyList()); | |||
|
|||
public final List<TableIdentifier> indices; | |||
public final List<Tuple<TableIdentifier, Boolean>> indices; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would add the frozen
as a property in TableIdentifier
instead, maybe even rename it to TableInfo
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TableIdentifier is mapped to parsing and has a well defined scope - frozen is a property of a table not of its id (which is just another property).
Since in the future we might have other properties (potentially remote from CCS), I've retrofitted the tuple into a TableInfo
class inside the analysis package.
|
||
public static final EnumSet<IndexType> VALID = EnumSet.of(INDEX, ALIAS); | ||
public static final EnumSet<IndexType> VALID = EnumSet.of(STANDARD_INDEX, ALIAS, FROZEN_INDEX); | ||
public static final EnumSet<IndexType> VALID_WO_FROZEN = EnumSet.of(STANDARD_INDEX, ALIAS); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor suggestion: I would choose to name it plain VALID
and the other VALID_WITH_FROZEN
.
// and create a MultiTypeField | ||
|
||
for (Entry<String, FieldCapabilities> type : types.entrySet()) { | ||
// build the error message |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You've already spotted it, but just a reminder to revert the formatting changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM in general. Left few comments.
Documentation will be added later?
test_alias_emp |VIEW | ||
test_emp |BASE TABLE | ||
test_emp_copy |BASE TABLE | ||
name | type | native |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think native_type
is more clear...
|
||
private final String toSql; | ||
private final String toNative; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a bit more descriptive as toNativeType
?
GetIndexRequest indexRequest = new GetIndexRequest() | ||
.local(true) | ||
.indices(indices) | ||
.features(Feature.SETTINGS) | ||
.includeDefaults(false) | ||
.indicesOptions(INDICES_ONLY_OPTIONS); | ||
|
||
// if frozen indices are request, make sure to update the request accordingly |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: if frozen indices are request**ed**
SearchRequest search = client.prepareSearch(indices) | ||
// always track total hits accurately | ||
.setTrackTotalHits(true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why removing this one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch - merging error.
@@ -449,7 +464,8 @@ public boolean equals(Object obj) { | |||
&& Objects.equals(fields, other.fields) | |||
&& Objects.equals(aliases, other.aliases) | |||
&& Objects.equals(sort, other.sort) | |||
&& Objects.equals(limit, other.limit); | |||
&& Objects.equals(limit, other.limit) && Objects.equals(trackHits, other.trackHits) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Each equals
on its own line, to have a consistent approach?
Yes, I'd like to keep it separate since it would be easy to iterate over it in the CI without running the whole suite. |
Currently blocked by #41795 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
run elasticsearch-ci/1 |
Allow querying of FROZEN indices both through dedicated SQL grammar extension: > SELECT field FROM FROZEN index and also through driver configuration parameter, namely: > index.include.frozen: true/false Fix elastic#39390 Fix elastic#39377 (cherry picked from commit 2445a93)
Allow querying of FROZEN indices both through dedicated SQL grammar extension: > SELECT field FROM FROZEN index and also through driver configuration parameter, namely: > index.include.frozen: true/false Fix elastic#39390 Fix elastic#39377
Allow querying of FROZEN indices both through dedicated SQL grammar
extension:
and also through driver configuration parameter, namely:
Fix #39390
Fix #39377