refactor(duckdb): use .sql instead of .execute in performance-sensitive locations #8669

Merged: 2 commits, Mar 17, 2024

jcrist (Member) commented Mar 15, 2024

In #6875 we switched to using con.sql(sql) instead of con.execute(sql) when retrieving an Arrow table of results. The former is more efficient than the latter because the .sql API lets the query planner take advantage of the known output type, which can result in more efficient execution.

This optimization was dropped in the the-epic-split refactor (note: it seems less necessary since duckdb 0.9.0, although they say a performance difference is still expected). We add the optimization back here and apply it to all locations where we return in-memory data.

We deliberately keep duckdb_con.execute rather than switching to duckdb_con.sql in the public-facing raw_sql API, since:

  • The object returned by con.sql isn't strictly compatible with the DB-API
  • In the case of operations that don't return results (e.g. DDL), con.sql returns None

Fixes #8631.

jcrist (Member, Author) commented Mar 15, 2024

Oh joy - segfaults on CI, but I can't reproduce them locally: https://github.com/ibis-project/ibis/actions/runs/8302276853/job/22724239870?pr=8669 👀 👀.

Debugging on CI it is then.

cpcloud (Member) commented Mar 15, 2024

cpcloud (Member) left a comment

LGTM, thanks!

@cpcloud cpcloud added this to the 9.0 milestone Mar 17, 2024
@cpcloud cpcloud added the performance (Issues related to ibis's performance) and duckdb (The DuckDB backend) labels Mar 17, 2024
@cpcloud cpcloud enabled auto-merge (squash) March 17, 2024 11:08
@cpcloud cpcloud merged commit aa6aa0c into ibis-project:main Mar 17, 2024
80 of 82 checks passed
@jcrist jcrist deleted the duckdb-use-con-sql branch March 17, 2024 13:11
Linked issue (successfully merging this pull request may close it): perf: should we be using con.sql(), not con.execute()?
2 participants