add pyexasol datasource, ensure that integer dont overflow in javascript #4378

stefan-mees · 2019-11-20T15:42:37Z

What type of PR is this? (check all applicable)

New Query Runner (Data Source)

Description

Adds Exasol as a data source to Redash. We have been running our own fork with this connector for a while and would like to contribute it back to the mainline.
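
The overflow mentioned in the title comes from JavaScript numbers being IEEE 754 doubles, which represent integers exactly only up to 2^53 - 1. A minimal sketch of one way to guard against it, assuming a pyexasol-style mapper that receives the raw value and a column description dict with `type` and `scale` keys. The function name and the choice to return oversized values as strings are illustrative, not necessarily what this PR does:

```python
# Largest integer a JavaScript Number can represent exactly (2**53 - 1).
JS_MAX_SAFE_INTEGER = 2**53 - 1


def js_safe_mapper(val, data_type):
    """Hypothetical mapper: convert DECIMAL values, keep oversized integers as strings."""
    if val is None:
        return None
    if data_type.get("type") == "DECIMAL":
        if data_type.get("scale", 0) == 0:
            number = int(val)
            # Integers outside the safe range stay strings so the browser
            # does not silently round them.
            return number if abs(number) <= JS_MAX_SAFE_INTEGER else str(val)
        return float(val)
    # Everything else (including DATE / TIMESTAMP strings) passes through unchanged.
    return val
```

With this sketch, `"9007199254740993"` in a scale-0 DECIMAL column stays a string, while `"42"` becomes the integer `42`.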

arikfr

Thanks! It's great we can finally add Exasol to the list of supported databases without using the more complex setup of the ODBC driver.

I've added some comments ⬇

requirements.txt

redash/query_runner/exasol.py

arikfr · 2019-11-20T15:57:05Z

requirements_all_ds.txt

@@ -32,3 +32,5 @@ phoenixdb==0.7
 certifi>=2019.9.11
 pydgraph==2.0.2
 azure-kusto-data==0.0.35
+pyexasol==0.9.1
+python-rapidjson==0.8.0


Did you notice any performance difference when using this?

stefan-mees · 2019-11-21

It's recommended in the pyexasol best practices; I have not done any benchmarking on this myself.

https://github.com/exasol/pyexasol/blob/master/docs/BEST_PRACTICES.md#consider-faster-json-parsing-libraries
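
For reference, pyexasol picks its JSON library through the `json_lib` connection option. A hedged sketch that falls back to the standard library when `python-rapidjson` is not installed; the helper name is hypothetical, and the actual `connect()` call is left as a comment because it needs a live Exasol instance:

```python
import importlib.util


def pick_json_lib():
    """Return 'rapidjson' if the package is importable, else the stdlib 'json'."""
    return "rapidjson" if importlib.util.find_spec("rapidjson") else "json"


connect_options = {"json_lib": pick_json_lib()}
# Assumed usage: pyexasol.connect(dsn=..., user=..., password=..., **connect_options)
```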

arikfr

Looks good 👍 Will merge once CI passes.

littleK0i · 2019-12-13T10:44:54Z

@stefan-mees, @arikfr, hi guys,

PyEXASOL creator here. Thank you for the pull request!

Let's see if we can improve this code a little bit:

Regarding the rapidjson library: it improves JSON decoding performance by ~10-30% in the ideal case. But in this query runner the response is encoded to JSON again by redash.utils.json_dumps, so the serialisation overhead is doubled or even tripled. In this case rapidjson won't help much, so it may not be worth adding as a mandatory dependency.
Regarding _exasol_type_mapper: please note that creating datetime.date and datetime.datetime objects is quite expensive. If the result set has a lot of DATE / TIMESTAMP columns, this may take more CPU than all other steps combined. We don't really need these objects anyway, since they are serialised back into strings before the result set is returned. Maybe it's possible to simplify the mapper or even avoid using it altogether.
It might be worth adding more configurable parameters (full reference). At the very least we need compression (it should be False in the cloud and within the same data center, and True for a remote data center or a WiFi connection), encryption (security), socket_timeout, schema, and http_proxy.
Please note that having separate host and port parameters might be a bad idea in the long run. Currently it works fine, but starting from Exasol 7.0+ we'll be able to create multiple read-only clusters, and each of them might have a different port. The DSN might look like this: cluster1exa1..10:8563,cluster2exa1..10:8564. It might be a good idea to keep this parameter as a pure dsn and pass it directly to pyexasol.connect() without concatenation.
It is advised to add the /*snapshot execution*/ prefix to this SQL query:

        SELECT
            COLUMN_SCHEMA,
            COLUMN_TABLE,
            COLUMN_NAME
        FROM EXA_ALL_COLUMNS

This is a new feature described in https://www.exasol.com/support/browse/IDEA-476. It helps to prevent locks while retrieving large amounts of metadata.
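
Applied to the schema query, the hint is just a leading comment. A sketch, assuming the query lives in a Python string (the constant name is hypothetical):

```python
# The leading comment is the snapshot execution hint; the rest is the original query.
SCHEMA_QUERY = """/*snapshot execution*/ SELECT
    COLUMN_SCHEMA,
    COLUMN_TABLE,
    COLUMN_NAME
FROM EXA_ALL_COLUMNS
"""
```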

The error variable does not seem to be populated by exceptions. It might be worth adding a basic except block that catches exceptions and returns a text message to Redash. Please note that PyEXASOL exceptions are instances of pyexasol.ExaException, but sometimes we may get an OSError from the underlying WebSocket / socket networking layer. If the default error messages are too verbose for Redash, you may set verbose_error=False to get only the text without extra information.
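
The points about a pure `dsn`, extra options and error handling could be sketched together as below. The option names (`dsn`, `schema`, `compression`, `encryption`, `socket_timeout`, `http_proxy`, `verbose_error`) are pyexasol connection options; the helper names and the shape of the configuration dict are assumptions for illustration. The driver's own base exception class is passed in by the caller rather than hard-coded, so the exact class name should be checked against the installed pyexasol version.

```python
OPTIONAL_KEYS = ("schema", "compression", "encryption", "socket_timeout", "http_proxy")


def build_connect_kwargs(configuration):
    """Translate a (hypothetical) Redash-style configuration dict into connect() kwargs."""
    kwargs = {
        # Passed through untouched, e.g. "cluster1exa1..10:8563,cluster2exa1..10:8564".
        "dsn": configuration["dsn"],
        "user": configuration["user"],
        "password": configuration["password"],
        # Shorter error messages for the UI.
        "verbose_error": False,
    }
    # Only forward optional settings that were actually configured.
    for key in OPTIONAL_KEYS:
        if configuration.get(key) is not None:
            kwargs[key] = configuration[key]
    return kwargs


def run_safely(execute, driver_errors=()):
    """Call execute() and return (result, error_message), query-runner style."""
    try:
        return execute(), None
    except (OSError, *driver_errors) as exc:
        # OSError covers failures from the socket / WebSocket layer;
        # driver_errors is where the pyexasol base exception would go.
        return None, str(exc)
```

In a real runner the connection would presumably be opened with `pyexasol.connect(**build_connect_kwargs(configuration))`, and the query execution wrapped in `run_safely`.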

I would patch it myself, but we do not currently use Redash at Badoo, so I cannot test those changes in production over the long term.

Thank you! Have a nice day.

arikfr mentioned this pull request Nov 20, 2019

feat(QueryRunner): add exasol #2133

Closed

arikfr reviewed Nov 20, 2019

add pyexasol datasource, ensure that integer dont overflow in javascript

1673093

stefan-mees force-pushed the feat/add_pyexasol branch from 25798fc to 1673093 Compare November 21, 2019 09:47

stefan-mees requested a review from arikfr November 21, 2019 09:49

weekly-digest bot mentioned this pull request Nov 25, 2019

Weekly Digest (18 November, 2019 - 25 November, 2019) #4398

Closed

arikfr approved these changes Nov 27, 2019

Merge branch 'master' into feat/add_pyexasol

300ac92

arikfr merged commit e82373a into getredash:master Nov 27, 2019

weekly-digest bot mentioned this pull request Dec 2, 2019

Weekly Digest (25 November, 2019 - 2 December, 2019) #4417

Closed
