-
Notifications
You must be signed in to change notification settings - Fork 14.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add duckdb as DataSource - Fixes #14563 #19265
Conversation
needs the forked version of [duckdb-engine](https://github.com/alitrack/duckdb_engine)
update _time_grain_expressions
"P1Y": "DATE_TRUNC('year', {col})", | ||
} | ||
|
||
custom_errors: Dict[Pattern[str], Tuple[str, SupersetErrorType, Dict[str, Any]]] = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this be better solved on the duckdb-engine side?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you would have a better judgment call on that than me; however, this looks to be more or less copy-pasted from the equivalent SQLite datasource - https://github.com/apache/superset/blob/master/superset/db_engine_specs/sqlite.py#L61
so presumably fine for now at least?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the contribution!
@classmethod | ||
def get_all_datasource_names( | ||
cls, database: "Database", datasource_type: str | ||
) -> List[utils.DatasourceName]: | ||
schemas = database.get_all_schema_names( | ||
cache=database.schema_cache_enabled, | ||
cache_timeout=database.schema_cache_timeout, | ||
force=True, | ||
) | ||
schema = schemas[0] | ||
if datasource_type == "table": | ||
return database.get_all_table_names_in_schema( | ||
schema=schema, | ||
force=True, | ||
cache=database.table_cache_enabled, | ||
cache_timeout=database.table_cache_timeout, | ||
) | ||
if datasource_type == "view": | ||
return database.get_all_view_names_in_schema( | ||
schema=schema, | ||
force=True, | ||
cache=database.table_cache_enabled, | ||
cache_timeout=database.table_cache_timeout, | ||
) | ||
raise Exception(f"Unsupported datasource_type: {datasource_type}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part looks like it can directly inherit the Super Class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, good catch -
https://github.com/apache/superset/blob/master/superset/db_engine_specs/base.py#L825
incidentally, looks like the SQLite datasource follows the same pattern
https://github.com/apache/superset/blob/master/superset/db_engine_specs/sqlite.py#L74
I don't think it would be wise to clean that up in this PR but might be a quick win.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
made the change here - rwhaling@9fa02d7
I forgot that I can't point an existing PR to a new branch, might need to close this one - will wait for any additional feedback first though. Still working on getting set up for local testing as well.
progress - got a local development environment mostly working, screenshots above. Working through some bugs still Will do a little more debugging, might close this and open a new PR from a branch off my own fork. |
Got it working on a new branch, still kind of hackish - going to close this and open a fresh PR, apologies. |
SUMMARY
Adding DuckDB as a datasource, as it's SQLAlchemy dialect appears to have matured a good deal
Mause/duckdb_engine#219
To be clear, this is @alitrack's hard work, not mine, just opening up a PR to keep momentum, and working on testing it myself. Haven't set up for local dev on superset before, so might take me a few days.
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
Almost works! Stuck on this error - I think the issue is the implementation of `do_ping()` in the duckdb dialect needs to be tweaked a bit, but struggling to get a local duckdb_engine working.
TESTING INSTRUCTIONS
duckdb_engine
installed:$ pip install duckdb-engine
ADDITIONAL INFORMATION