Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sqlfmt breaks up depends_on statements across multiple lines, causing an error in parsing #628

Closed
ryantimjohn opened this issue Aug 27, 2024 · 7 comments
Labels
bug Something isn't working

Comments

@ryantimjohn
Copy link
Contributor

ryantimjohn commented Aug 27, 2024

Describe the bug
sqlfmt breaks up depends_on statements across multiple lines, causing an error in parsing.

To Reproduce

Take a sufficiently long depends_on statement from the top a sql file in a dbt project, for instance:

-- depends_on: {{ ref('xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx') }}

sqlfmt will break it across multiple lines, like so:

-- depends_on: {{
-- ref('xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx') }}

dbt is unable to correctly parse this:
bad operand type for unary -: 'BigQueryRelation'

Expected behavior
sqlfmt should leave depends_on statements on a single line so they correctly parse, no matter what the length.

Additional context
sqlfmt, version 0.23.2

package shandy-sqlfmt 0.23.2, installed using Python 3.11.3
    - sqlfmt
    - sqlfmt_primer
@tconbeer
Copy link
Owner

I can't reproduce this, with or without jinjafmt. I get:

~/open/sqlfmt main ≡  3.11 ❯ sqlfmt -                                                                                                                          10:17:25
-- depends_on: {{ref('xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx') }}
-- depends_on:
-- {{ref('xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx') }}


1 file formatted.
0 files left unchanged.
- formatted.
~/open/sqlfmt main ≡  3.11 ❯ sqlfmt --no-jinjafmt -
-- depends_on: {{ref('xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx') }}
-- depends_on:
-- {{ref('xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx') }}


1 file formatted.
0 files left unchanged.
- formatted.

(where the whole jinja tag gets moved to the second line, but that should be perfectly valid for dbt)

@ryantimjohn
Copy link
Contributor Author

I can't reproduce this, with or without jinjafmt. I get:

~/open/sqlfmt main ≡  3.11 ❯ sqlfmt -                                                                                                                          10:17:25
-- depends_on: {{ref('xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx') }}
-- depends_on:
-- {{ref('xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx') }}


1 file formatted.
0 files left unchanged.
- formatted.
~/open/sqlfmt main ≡  3.11 ❯ sqlfmt --no-jinjafmt -
-- depends_on: {{ref('xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx') }}
-- depends_on:
-- {{ref('xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx') }}


1 file formatted.
0 files left unchanged.
- formatted.

(where the whole jinja tag gets moved to the second line, but that should be perfectly valid for dbt)

This is so bizarre, don't know why we're getting such different results!

@ryantimjohn
Copy link
Contributor Author

@tconbeer so sorry, I made a copy-paste error, try formatting this:

-- depends_on: {{ ref('xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx') }}

Repository owner deleted a comment Aug 27, 2024
@tconbeer tconbeer added bug Something isn't working and removed cannot reproduce labels Aug 27, 2024
@tconbeer
Copy link
Owner

yeah ok that did it. thanks for the report.

@ryantimjohn
Copy link
Contributor Author

Happy to help with a PR if you can point me to where I can work on it!

@tconbeer
Copy link
Owner

Thanks. It's this method:

@classmethod
def _split_before(cls, text: str, max_length: int) -> Iterator[str]:
"""
When rendering very long comments, we try to split them at the desired line
length and wrap them onto multiple lines. This method takes the contents of
a comment (without the marker) and a maximum length, and splits the original
text at whitespace, yielding each split as a stringd
"""
if len(text) < max_length:
yield text.rstrip()
else:
for idx, char in enumerate(reversed(text[:max_length])):
if char.isspace():
yield text[: max_length - idx].rstrip()
yield from cls._split_before(text[max_length - idx :], max_length)
break
else: # no spaces in the comment
yield text.rstrip()

Probably best to check the contents of the comment for a jinja tag ({{...}} or {%...%} pair) and skip splitting the comment if we find one.

ryantimjohn added a commit to ryantimjohn/sqlfmt that referenced this issue Nov 12, 2024
tconbeer pushed a commit that referenced this issue Nov 12, 2024
* fix sqlfmt breaks up depends_on statements across multiple lines, causing an error in parsing #628

* add test to not split comments with jinja tags
@tconbeer
Copy link
Owner

closed by #642

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants