Skip to content
This repository has been archived by the owner on Aug 2, 2022. It is now read-only.

Tableau: <table>.<column> syntax not supported #213

Closed
jordanw-bq opened this issue Oct 9, 2019 · 12 comments
Closed

Tableau: <table>.<column> syntax not supported #213

jordanw-bq opened this issue Oct 9, 2019 · 12 comments
Assignees
Labels
BI integration Issues for integration with BI tools enhancement New feature or request

Comments

@jordanw-bq
Copy link
Contributor

jordanw-bq commented Oct 9, 2019

Using Docker image amazon/opendistro-for-elasticsearch:1.2.0
Dataset: Flights sample data (loaded through Kibana)
SQL Plugin version: 1.2.0.0
Tableau version: 2019.3

Tableau generated SQL (MySQL dialect): SELECT `kibana_sample_data_flights`.`Carrier` from `kibana_sample_data_flights` GROUP BY 1
Generated SQL without quotes (#212): SELECT kibana_sample_data_flights.Carrier from kibana_sample_data_flights GROUP BY 1

Result: No data

{
  "schema": [],
  "total": 10000,
  "datarows": [
    [],
    [],
    [],
    [],
...
    []
  ],
  "size": 50,
  "status": 200
}

Expected Behaviour: kibana_sample_data_flights.Carrier should be recognized as a qualified column name (column Carrier from the kibana_sample_data_flights table), rather than a full column name (column kibana_sample_data_flights.Carrier from the kibana_sample_data_flights table).

Impact: Since this syntax is used for every Tableau query, all queries to the Elasticsearch data source will fail. This means that the JDBC driver cannot currently be used with Tableau.

@dai-chen dai-chen added the BI integration Issues for integration with BI tools label Oct 9, 2019
@dai-chen dai-chen self-assigned this Oct 9, 2019
@dai-chen
Copy link
Member

dai-chen commented Oct 9, 2019

Thanks for reporting! I can reproduce the issue from my end. Looking into the code.

@dai-chen
Copy link
Member

dai-chen commented Oct 9, 2019

There seems 2 functionality gaps for this issue in our code:

  1. Our code is unable to work with full table name as alias. The new added ANTLR parser recently can perform analysis and work without issue in this case. But the old backend code requires changes.
  2. Placeholder in GROUP BY or ORDER BY cannot be recognized and handled correctly. (GROUP BY 1 means group by the first column regardless of its name).

I'm trying to figure out workaround for both. Thanks.

@dai-chen dai-chen added the enhancement New feature or request label Oct 9, 2019
@jordanw-bq
Copy link
Contributor Author

Thanks for the update, Chen!

@dai-chen
Copy link
Member

dai-chen commented Oct 9, 2019

@jordanw-bq I'm going to open a separate issue for support for ordinal in GROUP BY and ORDER BY clause. Want to confirm with you first to make sure this feature is required by the integration with Tableau. Is this kind of SQL query (with ordinal) generated by Tableau in certain case? Or it's always the case for GROUP BY and ORDER BY?

@jordanw-bq
Copy link
Contributor Author

@dai-chen From our investigation so far, most if not all queries generated by Tableau have a GROUP BY clause.

@dai-chen
Copy link
Member

@dai-chen From our investigation so far, most if not all queries generated by Tableau have a GROUP BY clause.

Just to clarify, I meant GROUP BY 1 or ORDER BY 1 which use ordinal instead of column name.

@dai-chen
Copy link
Member

I opened a separate issue #215 for this. Please let us know if you found this support is actually not required for Tableau.

@dai-chen
Copy link
Member

dai-chen commented Oct 17, 2019

I tried to apply our existing alias generate rewriter to all queries (previously only to sub query) or use table name as table alias if alias not present. However, both approaches will break many test cases and need more patches elsewhere because:

  1. Once you set table alias, you need to prefix field name in query too. Because most our code makes that assumption.
  2. During DSL generation, you need to remove the table alias prefix. Otherwise you will generate wrong DSL with table alias in it (which ES doesn't recognize it).

Based on the troubles I've met, I'm trying to do this in the opposite way - remove table alias prefix for field name in query. This seems a much easier patch for the old codebase and can be done in preprocessing manner without touching many places.

Will finish the coding and send out PR once issue #212 be merged.

@abbashus
Copy link
Contributor

This change should take care when multiple tables are passed as part of simple query, because then there would be multiple aliases.
SELECT a.firstname, b.lastname FROM table1 a, table2 b.

I created a separate issue #221 .

@dai-chen
Copy link
Member

The fix here seems able to fix issue #99 too with minor tweak. Rather than remove table name prefix only when table alias is not present, we may be able to extend the scope of new rewriter to remove table alias prefix for all simple queries too. Making the code changes and verifying if any special case.

@dai-chen
Copy link
Member

As Abbas reminded, issue #175 may be able to be fixed altogether. Adding new IT to verify.

@dai-chen
Copy link
Member

This issue is already fixed but need to add IT to double check once #212.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
BI integration Issues for integration with BI tools enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants