[CT-1034] [Bug] Unexpected behavior when chaining methods on dbt-ref'ed/sourced dataframes #5646
Comments
@lostmygithubaccount I looked into this issue a bit and noticed that this is related to how we traverse the parsed AST. Note to myself: some example regex for
Without having tested at all locally, I'd guess that the regex approach risks being significantly slower, and potentially liable to abuse, versus using Python's AST. I'd lean slightly in the direction of just documenting that
As the one who experienced and reported the issue, I personally think that it's preferable to give a technical solution here.
@jtcohen6 thinking again, it should be possible to do all of this using the AST NodeVisitor. When I get to it, if it is really tricky to support syntax like
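For illustration, here is a minimal sketch of the `ast.NodeVisitor` idea mentioned above, using only the standard library. It is not dbt-core's actual parser: the class name `DbtCallVisitor`, the helper `find_dbt_calls`, and the sample model source are all hypothetical, and the sketch assumes every argument to `dbt.ref` / `dbt.source` is a plain literal.

```python
import ast


class DbtCallVisitor(ast.NodeVisitor):
    """Hypothetical visitor: collect dbt.ref(...) / dbt.source(...) calls at any depth."""

    def __init__(self):
        self.calls = []  # list of (function name, list of literal args)

    def visit_Call(self, node):
        func = node.func
        if (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and func.value.id == "dbt"
            and func.attr in ("ref", "source")
        ):
            # Assumption: arguments are literals; literal_eval raises ValueError otherwise.
            args = [ast.literal_eval(arg) for arg in node.args]
            self.calls.append((func.attr, args))
        # Keep descending so a dbt.ref(...) buried inside a method chain is still visited.
        self.generic_visit(node)


def find_dbt_calls(source):
    """Parse model source code and return every dbt.ref / dbt.source call found."""
    visitor = DbtCallVisitor()
    visitor.visit(ast.parse(source))
    return visitor.calls


# Placeholder model source, including the chained call from this report.
MODEL_SOURCE = '''
def model(dbt, session):
    df = dbt.ref("my_model").limit(100)
    other = dbt.source("raw", "events")
    return df
'''
```

In a chained expression such as `dbt.ref("my_model").limit(100)`, the outermost call node is `.limit(...)`, and the `dbt.ref(...)` call sits one level down as the value its attribute is looked up on. Calling `generic_visit` from `visit_Call` is what lets the visitor reach it.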
@elongl definitely agree that'd be ideal, but to @jtcohen6's point we should be conscious of the implications. will defer to @ChenyuLInx to further investigate the options and performance/other technical impacts before we make a call. @elongl appreciate you reporting this! it is something that's going to trip people up and good to catch early in the beta
As part of estimation, we're including a very basic benchmark of a full AST parse (as opposed to the top-level parse we do currently).
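A very basic benchmark along these lines could look like the sketch below. What "top-level" means here is an assumption: `shallow_scan` only inspects the direct value of each statement in a function body, which may not match what dbt-core actually does. All function names and the sample source are hypothetical, and the timings depend entirely on the machine and the model source.

```python
import ast
import timeit

# Placeholder model source with one plain and one chained dbt.ref call.
MODEL_SOURCE = '''
def model(dbt, session):
    a = dbt.ref("model_a")
    b = dbt.ref("model_b").limit(100)
    return a.join(b)
'''


def is_dbt_call(node):
    """True if node is a call of the form dbt.ref(...) or dbt.source(...)."""
    return (
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Attribute)
        and isinstance(node.func.value, ast.Name)
        and node.func.value.id == "dbt"
        and node.func.attr in ("ref", "source")
    )


def shallow_scan(tree):
    """Assumed 'top-level' scan: look only at the direct value of each statement."""
    found = 0
    for func in tree.body:
        if isinstance(func, ast.FunctionDef):
            for stmt in func.body:
                if is_dbt_call(getattr(stmt, "value", None)):
                    found += 1
    return found


def full_scan(tree):
    """Full traversal: visit every node in the tree."""
    return sum(1 for node in ast.walk(tree) if is_dbt_call(node))


def benchmark(number=1000):
    """Time both scans over the same pre-parsed tree; returns seconds per strategy."""
    tree = ast.parse(MODEL_SOURCE)
    return {
        "shallow": timeit.timeit(lambda: shallow_scan(tree), number=number),
        "full": timeit.timeit(lambda: full_scan(tree), number=number),
    }
```

Calling `benchmark()` returns the total seconds spent in each strategy. On this sample the shallow scan misses the chained `dbt.ref("model_b").limit(100)` call, while the full walk finds both references.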
Is this a new bug in dbt-core?
Current Behavior
Community Slack report here:
https://getdbt.slack.com/archives/C03QUA7DWCW/p1660223963454249
I reproduced easily -- the first run had `df = df.limit(100)` line uncommented and no chained call on the prior line; the second run is what's shown in the screenshot:
Expected Behavior
Would not expect method chaining to cause errors like this on `dbt.ref(...)` or `dbt.source(...)`. We should clearly document this behavior and give a better error message if there's some underlying reason why we can't easily fix this; otherwise we should fix it.
Steps To Reproduce
See screenshot above. Basically add a `.limit(N)` to your reference call, e.g. `df = dbt.ref(model_name).limit(100)`.
Relevant log output
Environment
Which database adapter are you using with dbt?
snowflake
Additional Context
cc: @ChenyuLInx
No response
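Since the screenshot is not reproduced here, the two variants described under Current Behavior and Steps To Reproduce can be sketched as a Python model. The model name `"my_model"` is a placeholder, and the un-chained form is the one that appears to have run without the error in the first reproduction run.

```python
def model(dbt, session):
    # Un-chained form: reference first, then transform on a separate line.
    df = dbt.ref("my_model")  # "my_model" is a placeholder model name
    df = df.limit(100)

    # Chained form reported in this issue as causing the unexpected behavior:
    # df = dbt.ref("my_model").limit(100)

    return df
```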