`dbt show --limit` will limit the number of rows of data previewed on the CLI. Currently, `dbt show` applies a `limit` after it has already run the query and loaded the results into memory (dbt-core/core/dbt/task/show.py, lines 20 to 22 in 121fa57).

I would expect the query templated by `dbt show` to include `limit {limit}`.
Otherwise, dbt could run a query that returns 10k rows, load all of them into memory (in an `agate.Table`), and only then filter down to the default (5). My sense is that this risks significant slowdowns and OOM issues.
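To make the concern concrete, here is a hypothetical sketch (not dbt's actual code; the names `limit_in_memory` and `limit_in_sql` are made up) contrasting slicing after the fetch with pushing the limit into the SQL:

```python
def limit_in_memory(rows, limit=5):
    # What dbt show effectively does today: the full result set is
    # already in client memory before we slice it down to `limit` rows.
    return rows[:limit]

def limit_in_sql(compiled_sql, limit=5):
    # What this issue proposes: wrap the compiled query so the warehouse
    # itself returns at most `limit` rows, bounding transfer and memory.
    return f"select * from ({compiled_sql}) as _dbt_show_subq limit {limit}"

rows = list(range(10_000))            # stand-in for a 10k-row result set
print(len(limit_in_memory(rows)))     # 5 rows survive, but all 10k were fetched
print(limit_in_sql("select * from my_model"))
```

The second form does strictly less work everywhere downstream of the warehouse, which is why templating the limit into the query is preferable to filtering the `agate.Table` afterward.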
Acceptance criteria:

- Instead of just directly executing `compiled_node.compiled_code`, template it into a subquery that includes `limit`.
- In order to write the subquery in a cross-database-compatible way, we should do one of the following:
  - Use a (Jinja) macro or materialization to template the subquery with `limit`.
  - Use an adapter (Python) method — much easier to implement, and probably sufficient.
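A minimal sketch of the adapter-method option, assuming a hypothetical `get_show_sql` method on the base adapter (the method name, signature, and subquery alias are illustrative, not dbt's actual implementation):

```python
from typing import Optional

class BaseAdapter:
    """Stand-in for dbt's base adapter; only the sketched method is shown."""

    def get_show_sql(self, compiled_code: str, limit: Optional[int]) -> str:
        # Wrap the compiled SQL in a subquery so the warehouse, not the
        # client, enforces the row limit for `dbt show`.
        if limit is None:
            return compiled_code
        return (
            f"select * from (\n{compiled_code}\n) as model_limit_subq "
            f"limit {limit}"
        )

adapter = BaseAdapter()
print(adapter.get_show_sql("select id from my_model", 5))
```

An adapter with a cheaper preview path (e.g. BigQuery) could override this one method, rather than the show task special-casing each warehouse.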
Out of scope: supporting BigQuery's no-cost "preview" API (`bq head`), which would require a diff check / cache validation on the logical vs. applied state of this model (logical = current SQL, applied = most recently materialized). Let's keep thinking about whether there are ways to support that in the future. Update: opened a separate issue for this: #7391.
github-actions bot changed the title "dbt show should include --limit in compiled query" → "[CT-2428] dbt show should include --limit in compiled query" on Apr 18, 2023.
@jtcohen6 Per BLG, we'd actually like to implement this in the base adapter (option 2). It strikes a good balance between doing all the things (supporting `bq head` and any other adapter-specific logic we might want to use) and doing the quick-and-dirty version where we edit the SQL just for the show command. It's been pointed accordingly.