Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invalid escaping of {staged_files} #786

Open
stevebeauge opened this issue Jul 25, 2024 · 16 comments
Open

Invalid escaping of {staged_files} #786

stevebeauge opened this issue Jul 25, 2024 · 16 comments
Labels
bug Something isn't working

Comments

@stevebeauge
Copy link

stevebeauge commented Jul 25, 2024

🔧 Summary

If file names contains some utf8 (I guess) characters, lefthook fails because of invalid escaping.

error replacing {staged_files}: CreateFile C:\code\repro\lefthook-wrong-escape\"files\KO \303\251.json": La syntaxe du nom de fichier, de répertoire ou de volume est incorrecte.

(part of the message is in French, probably the system error lefthook trapped).

Lefthook version

1.7.7

Running on Windows 10. Both reproed using PowerShell 7 and legacy cmd console.

Steps to reproduce

Clone this repo: https://github.com/stevebeauge/repro-lefthook-wrong-escape

git clone https://github.com/stevebeauge/repro-lefthook-wrong-escape
cd repro-lefthook-wrong-escape
pnpm i

Edit the file files/KO é.json. Set whatever you want in the file. If you are using a powershell console you can run

Get-Date | ConvertTo-Json | Set-Content -LiteralPath '.\files\KO é.json' -Encoding utf8 

Then try to commit the changes:

git add .
git commit -m "Something"

Expected results

It should work whatever the file name is (assuming it's a valid filename for the system).

Actual results

It fails because utf8 character is escaped, but the escape character is itself not properly escaped:

error replacing {staged_files}: CreateFile C:\code\repro\lefthook-wrong-escape"files\KO \303\251.json": La syntaxe du nom de fichier, de répertoire ou de volume est incorrecte.

Possible Solution

Logs / Screenshots

Running pnpm lefthook run pre-commit -v in a powershell console generates this logs:

│ [lefthook] cmd: [git rev-parse --show-toplevel]
│ [lefthook] err: <nil>
│ [lefthook] out: C:/code/repro/lefthook-wrong-escape

│ [lefthook] cmd: [git rev-parse --git-path hooks]
│ [lefthook] err: <nil>
│ [lefthook] out: .git/hooks

│ [lefthook] cmd: [git rev-parse --git-path info]
│ [lefthook] err: <nil>
│ [lefthook] out: .git/info

│ [lefthook] cmd: [git rev-parse --git-dir]
│ [lefthook] err: <nil>
│ [lefthook] out: .git

│ [lefthook] cmd: [git hash-object -t tree /dev/null]
│ [lefthook] err: <nil>
│ [lefthook] out: 4b825dc642cb6eb9a060e54bf8d69288fbee4904

╭──────────────────────────────────────╮
│ 🥊 lefthook v1.7.7  hook: pre-commit │
╰──────────────────────────────────────╯
│ [lefthook] cmd: [git status --short --porcelain]
│ [lefthook] dir: C:/code/repro/lefthook-wrong-escape
│ [lefthook] err: <nil>
│ [lefthook] out: M  "files/KO \303\251.json"

│ [lefthook] cmd: [git diff --name-only --cached --diff-filter=ACMR]
│ [lefthook] dir: C:/code/repro/lefthook-wrong-escape
│ [lefthook] err: <nil>
│ [lefthook] out: "files/KO \303\251.json"

│  prettier (skip) error replacing {staged_files}: CreateFile C:\code\repro\lefthook-wrong-escape\"files\KO \303\251.json": La syntaxe du nom de fichier, de répertoire ou de volume est incorrecte.
│ [lefthook] cmd: [git stash list]
│ [lefthook] dir: C:/code/repro/lefthook-wrong-escape
│ [lefthook] err: <nil>
│ [lefthook] out:

  ────────────────────────────────────
summary: (done in 0.28 seconds)
🥊  prettier: error replacing {staged_files}: CreateFile C:\code\repro\lefthook-wrong-escape\"files\KO \303\251.json": La syntaxe du nom de fichier, de répertoire ou de volume est incorrecte.
╭

As a comparison, modifying files/ok.json is working. It's verbose output is:

│ [lefthook] cmd: [git rev-parse --show-toplevel]
│ [lefthook] err: <nil>
│ [lefthook] out: C:/code/repro/lefthook-wrong-escape

│ [lefthook] cmd: [git rev-parse --git-path hooks]
│ [lefthook] err: <nil>
│ [lefthook] out: .git/hooks

│ [lefthook] cmd: [git rev-parse --git-path info]
│ [lefthook] err: <nil>
│ [lefthook] out: .git/info

│ [lefthook] cmd: [git rev-parse --git-dir]
│ [lefthook] err: <nil>
│ [lefthook] out: .git

│ [lefthook] cmd: [git hash-object -t tree /dev/null]
│ [lefthook] err: <nil>
│ [lefthook] out: 4b825dc642cb6eb9a060e54bf8d69288fbee4904

╭──────────────────────────────────────╮
│ 🥊 lefthook v1.7.7  hook: pre-commit │
╰──────────────────────────────────────╯
│ [lefthook] cmd: [git status --short --porcelain]
│ [lefthook] dir: C:/code/repro/lefthook-wrong-escape
│ [lefthook] err: <nil>
│ [lefthook] out: M  files/ok.json

│ [lefthook] cmd: [git diff --name-only --cached --diff-filter=ACMR]
│ [lefthook] dir: C:/code/repro/lefthook-wrong-escape
│ [lefthook] err: <nil>
│ [lefthook] out: files/ok.json

│ [lefthook] files before filters:
[files/ok.json]
│ [lefthook] files after filters:
[files/ok.json]
│ [lefthook] files after escaping:
[files/ok.json]
│ [lefthook] executing: pnpm prettier files/ok.json
┃  prettier ❯

"2024-07-25T18:29:38.9212089+02:00"

│ [lefthook] cmd: [git stash list]
│ [lefthook] dir: C:/code/repro/lefthook-wrong-escape
│ [lefthook] err: <nil>
│ [lefthook] out:

  ────────────────────────────────────
summary: (done in 1.38 seconds)
✔️  prettier
@stevebeauge stevebeauge added the bug Something isn't working label Jul 25, 2024
@mrexox
Copy link
Member

mrexox commented Jul 26, 2024

Hey! Thank you for submitting this issue. I couldn't reproduce the issue on my macOS, so I assume this is something OS-related. I will try to figure it out.

@mrexox
Copy link
Member

mrexox commented Jul 26, 2024

From my perspective it looks like git command returns the file list in different encoding.

Could you try setting the default encoding for Powershell to UTF-8 (if not set yet)?

I found some useful options here: https://stackoverflow.com/questions/57131654/using-utf-8-encoding-chcp-65001-in-command-prompt-windows-powershell-window/57134096#57134096

@stevebeauge
Copy link
Author

It is already on utf8 code page (65001).

Which git command lefthook runs? Can I "test" the single comment to see how it behaves?

@stevebeauge
Copy link
Author

image

@mrexox
Copy link
Member

mrexox commented Jul 26, 2024

Yes, it's git diff --name-only --cached --diff-filter=ACMR. So, it returns the same corrupted result when executed explicitly, right?

@mrexox
Copy link
Member

mrexox commented Jul 26, 2024

@stevebeauge
Copy link
Author

Ok, found a workaround.

git config core.quotepath false
git status

Returns:

Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        modified:   files/KO é.json

Restoring the config

git config core.quotepath true
git status

Returns:

Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        modified:   "files/KO \303\251.json"

I close the issue. It's not related to lefthook, but to git actually. Thanks for the help

@stevebeauge
Copy link
Author

stevebeauge commented Jul 26, 2024

Sorry, I reopen the issue.

The git config I set remove the escaping, but also the surrounding quote, which breaks lefthook

Without quotepath set to false: modified: "files/KO \303\251.json"

With quotepath set to true (which is the git default): modified: files/KO é.json. No quote around the file name.

This leads to lefthook error:

│ [lefthook] cmd: [git rev-parse --show-toplevel]
│ [lefthook] err: <nil>
│ [lefthook] out: C:/code/repro/lefthook-wrong-escape

│ [lefthook] cmd: [git rev-parse --git-path hooks]
│ [lefthook] err: <nil>
│ [lefthook] out: .git/hooks

│ [lefthook] cmd: [git rev-parse --git-path info]
│ [lefthook] err: <nil>
│ [lefthook] out: .git/info

│ [lefthook] cmd: [git rev-parse --git-dir]
│ [lefthook] err: <nil>
│ [lefthook] out: .git

│ [lefthook] cmd: [git hash-object -t tree /dev/null]
│ [lefthook] err: <nil>
│ [lefthook] out: 4b825dc642cb6eb9a060e54bf8d69288fbee4904

╭──────────────────────────────────────╮
│ 🥊 lefthook v1.7.7  hook: pre-commit │
╰──────────────────────────────────────╯
│ [lefthook] cmd: [git status --short --porcelain]
│ [lefthook] dir: C:/code/repro/lefthook-wrong-escape
│ [lefthook] err: <nil>
│ [lefthook] out: M  "files/KO é.json"

│ [lefthook] cmd: [git diff --name-only --cached --diff-filter=ACMR]
│ [lefthook] dir: C:/code/repro/lefthook-wrong-escape
│ [lefthook] err: <nil>
│ [lefthook] out: files/KO é.json

│ [lefthook] files before filters:
[files/KO é.json]
│ [lefthook] files after filters:
[files/KO é.json]
│ [lefthook] files after escaping:
['files/KO é.json']
│ [lefthook] executing: pnpm prettier 'files/KO é.json'
⠋ waiting: prettier[error] No files matching the pattern were found: "'files/KO".
[error] No files matching the pattern were found: "é.json'".
┃  prettier ❯


│ [lefthook] cmd: [git stash list]
│ [lefthook] dir: C:/code/repro/lefthook-wrong-escape
│ [lefthook] err: <nil>
│ [lefthook] out:

  ────────────────────────────────────
summary: (done in 1.33 seconds)
🥊  prettier

I'm not sure how lefthook works internally, but I wonder if it run a git command and analyses the results, or if it uses any package. If it works using the former, maybe there's something to do to be more reliable ?

@stevebeauge stevebeauge reopened this Jul 26, 2024
@stevebeauge
Copy link
Author

I don't know if it can solve the issue, but git diff has a -z flag that does not encode the file name:

git diff --name-only --cached --diff-filter=ACMR                                                                                                                                                            
"files/KO \303\251.json"

git diff --name-only --cached --diff-filter=ACMR -z                                                                                                                                                         
files/KO é.json^@

The final char is a NUL char. So what about adding the flag to lefthook, take the output minus the last NUL char and adding surrounding quotes?

@mrexox
Copy link
Member

mrexox commented Jul 26, 2024

Could you try wrapping the {staged_files} in double quotes? Like this:

pre-commit:
  commands:
    lint:
      run: yarn lint "{staged_files}"

Of single quotes if double quotes don't work. This quoting should wrap each filename with the quotes.

@stevebeauge
Copy link
Author

If I both wrap the {staged_files} with quotes and set git config core.quotepath false, it works as expected.

However, I don't feel comfortable with this solution. Git options are not committed as part of the source code. It requires every developer to set the option, and will eventually lead to issues or conflict with other tools that may rely on git commands (I'd bet turborepo or NX are using same technics to detect affected projects).

I never works with go and I have no idea of the impact/complexity, but I'd wish lefthook take the normalization into account.

From the git doc

Path names are encoded in UTF-8 normalization form C.

Maybe there's some lib in go that can revert the normalization back to plain utf8 encoding ?

(🤫 https://chatgpt.com/share/448c4b2f-c19d-4532-a57c-c7c0d9573a44)

This would create a bullet proof and cross platform way to read the file names.

@mrexox
Copy link
Member

mrexox commented Jul 26, 2024

For the last issue the problem is that Windows doesn't work well with single quotes (as far as I know), and considers 'files/KO é.json' as two files: 'files/KO and é.json'. So, this is not the encoding issue here.

But I like your idea for normalization of the output. It makes sense. I'll put it on my radar 👍

@nicky1038
Copy link

I've also bumped into an issue with wrong escaping of {staged_files}, but it was of another kind.

If file name contains square brackets ([ or ]), it will be additionally wrapped with single quotes, making file name invalid for eslint, for example.

@mrexox
Copy link
Member

mrexox commented Jul 30, 2024

@nicky1038 , in this case use "{staged_files}" in your configuration, so every file is wrapped in double quotes when calling with eslint

@nicky1038
Copy link

@nicky1038 , in this case use "{staged_files}" in your configuration, so every file is wrapped in double quotes when calling with eslint

@mrexox Thank you for the quick response, the suggested solution works like a charm! And sorry for not reading docs before asking questions 😶

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants
@stevebeauge @nicky1038 @mrexox and others