Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New Elasticsearch query runner #4293

Open
arikfr opened this issue Oct 27, 2019 · 6 comments · Fixed by lumonus/redash#1
Open

New Elasticsearch query runner #4293

arikfr opened this issue Oct 27, 2019 · 6 comments · Fixed by lumonus/redash#1

Comments

@arikfr
Copy link
Member

arikfr commented Oct 27, 2019

Our current Elasticsearch support is in a very bad state, it has issues with aggregated queries, nested objects, and new versions of Elasticsearch. Also the codebase is in a very poor state not allowing for easy fixes.

We are going to re-implement the connector, addressing these issues and making room for future improvements.

Implementation notes:

  • Unless there will be request for it we will only support Elasticsearch's API and not the Kibana flavor.
  • This will be a new query runner. The current ones will be marked as deprecated, but existing queries using them will keep using them.
  • We will have support for SQL queries on top of Elasticsearch. There seem to be two flavors of it: community version and the X-Pack version. We should support both. Worth considering having the different SQL connectors as separate types from the ones that use the API directly (but reuse code, of course).

Previous issues:

Previous attempts at addressing these issues:

@arikfr
Copy link
Member Author

arikfr commented Oct 27, 2019

Worth noting that the new implementation will at least have tests for the query parsing logic to make future changes easier.

@NicolasLM
Copy link
Contributor

  1. Since the ES API is quite easily accessible with HTTP requests, I guess we don't want to add the dependency to a more specific Python ES client, keeping it the same way as the current ES query runner. Am I right?
  2. Do we want to support nested fields and nested aggregations the same way it's done for the MongoDB query runner? Nested objects get flattened and the name of the column uses a delimiter like . to indicate that it originally comes from a nested object. Note that this is the approach taken out of the box by the X-Pack SQL wrapper.
  3. Same questions regarding documents with different fields in the same collection. It seems that the MongoDB runner looks at the first and last documents and takes fields from there while X-Pack SQL shows all fields possible defined in the collection for each document (with the value null if the document does not have a specific field).

@arikfr
Copy link
Member Author

arikfr commented Nov 6, 2019

  1. Yes, I had the same thoughts. From what I've seen for our needs the Elasticsearch clients don't offer much.

  2. Considering this is what X-Pack SQL does and it's consistent with what we do with other connectors, it makes sense to keep it. I would also consider reusing the code between Mongo and ES.

  3. What you described happens when we query MongoDB for schema. But in the query result we pick up columns from all the documents (see parse_results). We should follow similar pattern.

NicolasLM added a commit that referenced this issue Nov 23, 2019
The new elasticsearch query runner is split into three:

- A runner supporting the newest versions of ES,
  aggregation, nested aggregations and nested fields.
- A runner for the SQL OpenDistro flavor
- A runner for the SQL X-Pack flavor

Fixes #4293
@arikfr arikfr modified the milestones: v9, Next Apr 14, 2020
@vkuznet
Copy link

vkuznet commented Jul 8, 2020

Hi, can you describe the overall status of the new ES pipeline, and give us some timeline for its completeness/readiness for production. We are experience problems with ES queries and we're awaiting when new implementation will be in place.

@itssimon
Copy link

Same here, we're desperately waiting for the new ES connector. Is there any way I can contribute to getting it across the line?

susodapop pushed a commit that referenced this issue Jul 13, 2022
The new elasticsearch query runner is split into three:

- A runner supporting the newest versions of ES,
  aggregation, nested aggregations and nested fields.
- A runner for the SQL OpenDistro flavor
- A runner for the SQL X-Pack flavor

Fixes #4293
@mavencode01
Copy link

@susodapop @arikfr
Please is this available in the latest stable docker tag: 10.1.0.b50633 ?

docker pull redash/redash:10.1.0.b50633

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants