Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reformatting a string containing both single and double quotes breaks down #4838

Closed
2 tasks done
kgutwin opened this issue Aug 19, 2024 · 1 comment
Closed
2 tasks done
Labels
bug Invalid compiler output or panic

Comments

@kgutwin
Copy link
Collaborator

kgutwin commented Aug 19, 2024

What happened?

If you use prqlc fmt on a triple-quoted string, typically it strips the triple-quotes and gives back a string with escape sequences for things like newlines.

let foo = """first
second"""

formats to let foo = "first\nsecond".

If you also include a set of double quotes in the triple-quoted string, the formatting output quotes the string in single quotes.

let foo = """first
and "second" here"""

formats to let foo = 'first\nand "second" here'

But if you include both single and double quotes in the triple-quoted string, the formatting output goes off the rails, using two double quotes to quote the string:

let foo = """first's
and "second" here"""

formats to let foo = ""first's\nand "second" here"" -- which is invalid PRQL.

This is also not limited to triple-quoted string input -- if you try to format an escaped version of the string, you get the same output.

let foo = 'first\'s\nand "second" here'

also formats to let foo = ""first's\nand "second" here"".

Perhaps prqlc fmt should emit triple-quoted strings when both single and double quotes are in the string?

PRQL input

let foo = """formatting's "failing" here"""
from employees

SQL output

let foo = ""formatting's "failing" here""

from employees

# Generated by PRQL compiler version:0.13.0-25-g53f747c1 (https://prql-lang.org)

Expected SQL output

let foo = """formatting's "failing" here"""

from employees

# Generated by PRQL compiler version:0.13.0-25-g53f747c1 (https://prql-lang.org)

MVCE confirmation

  • Minimal example
  • New issue

Anything else?

No response

@kgutwin kgutwin added the bug Invalid compiler output or panic label Aug 19, 2024
max-sixty added a commit to max-sixty/prql that referenced this issue Aug 24, 2024
max-sixty added a commit to max-sixty/prql that referenced this issue Aug 25, 2024
@max-sixty
Copy link
Member

This I think is now fixed, by #4850. Thanks for the issue @kgutwin .

I added a couple of TODOs in the code around more minor issues (we could open a new issue if this is salient to anyone). Copying below for reference. I also would like to add some more tests for escapes in raw strings; any bug reports there are very welcome.


TODO: one problem here is that we don't remember whether the original
string contained an actual line break or contained an \n string,
because we immediately normalize both to \n. This means that when
we format the PRQL, we can't retain the original. I think three ways of
resolving this:

  • Have different tokens in the lexer and parser; normalize at the
    parsing stage, and then use the token in the lexer for writing out
    the formatted PRQL. Literals are one of the only data structures we
    retain between the lexer and parser. (note that this requires the
    current effort to use tokens from the lexer as part of prqlc fmt;
    ongoing as of 2024-08)
  • Don't normalize at all, and then normalize when we use the string.
    I think this might be viable and maybe easy, but is a bit less
    elegant; the parser is designed to normalize this sort of thing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Invalid compiler output or panic
Projects
None yet
Development

No branches or pull requests

2 participants