Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

.env files with UTF16-LE BOM are ignored #14564

Open
7 tasks done
shamrin opened this issue Oct 9, 2023 · 6 comments
Open
7 tasks done

.env files with UTF16-LE BOM are ignored #14564

shamrin opened this issue Oct 9, 2023 · 6 comments

Comments

@shamrin
Copy link
Contributor

shamrin commented Oct 9, 2023

Describe the bug

Vite does not correctly handle .env files with BOM. For example, it happens if the file is created with Windows echo command:

echo 'VITE_VAR=1' >> .env.development.local

With this file import.meta.env.VITE_VAR expands to undefined.

Reproduction: https://stackblitz.com/edit/vitejs-vite-m3as94?file=main.js

Instead of a silent error I would expect one of two things:

  1. Correct expansion to '1' (best)
  2. Error while trying to process .env.development.local (better)

Debugging

$ xxd .env.development.local
00000000: fffe 5600 4900 5400 4500 5f00 5600 4100  ..V.I.T.E._.V.A.
00000010: 5200 3d00 3100 0d00 0a00                 R.=.1.....

Outside of Windows, the file can be created with the Python script:

with open('.env.development.local', 'wb') as f:
    f.write(b'\xff\xfe')
    f.write('VITE_VAR=1\r\n'.encode('utf-16-le'))

Related issue

motdotla/dotenv#445

Reproduction

https://stackblitz.com/edit/vitejs-vite-m3as94?file=main.js

Steps to reproduce

No response

System Info

System:
    OS: Linux 5.0 undefined
    CPU: (8) x64 Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
    Memory: 0 Bytes / 0 Bytes
    Shell: 1.0 - /bin/jsh
  Binaries:
    Node: 18.18.0 - /usr/local/bin/node
    Yarn: 1.22.19 - /usr/local/bin/yarn
    npm: 9.4.2 - /usr/local/bin/npm
    pnpm: 8.6.12 - /usr/local/bin/pnpm
  npmPackages:
    vite: ^4.4.8 => 4.4.11 

Used Package Manager

npm

Logs

No response

Validations

@bluwy
Copy link
Member

bluwy commented Oct 10, 2023

I guess the fix is that we have to try loading the .env file with multiple tries of encodings here:

return Object.entries(parse(fs.readFileSync(filePath)))

And that would slightly hurt startup time, so I'm not really sure about fixing this. I think they should be strictly utf-8, similar to how Vite reads any file in general, e.g.

export function loadFallbackPlugin(): Plugin {
return {
name: 'vite:load-fallback',
async load(id) {
try {
// if we don't add `await` here, we couldn't catch the error in readFile
return await fsp.readFile(cleanUrl(id), 'utf-8')
} catch (e) {
return fsp.readFile(id, 'utf-8')
}
},
}
}

@shamrin
Copy link
Contributor Author

shamrin commented Oct 15, 2023

@bluwy Yes, trying multiple encodings seems unreliable and potentially slow. Enforcing utf-8 seems like a good solution. However, fsp.readFile does not fail with this UTF-16 BOM file. Here is what happens:

import fsp from 'node:fs/promises'

try {
  const r = await fsp.readFile('./.env.development.local', 'utf-8')
  console.log(r)
  console.log(Array.from(r).map(b => b.charCodeAt(0)))
} catch (e) {
  console.log(e)
}

Output:

��VITE_VAR=1

[
  65533, 65533, 86,  0, 73,
      0,    84,  0, 69,  0,
     95,     0, 86,  0, 65,
      0,    82,  0, 61,  0,
     49,     0, 13,  0, 10,
      0
]

It looks like a fundamental limitation with readFile() call. It simply ignores decoding errors and replaces weirdness with U+FFFD � replacement character. Also keeps invisible '\0' characters when dealing with UTF-16.

I think it could be solved if we decode with util.TextDecoder:

import fsp from 'node:fs/promises'
import util from 'node:util'

const decoder = new util.TextDecoder('utf-8', {fatal: true})
const buffer = await fsp.readFile('./.env.development.local')
try {
  console.log(decoder.decode(buffer))
} catch (e) {
  console.log(e)
}

The above correctly fails with TypeError for the broken .env file:

TypeError: The encoded data was not valid for encoding utf-8
...
{
  code: 'ERR_ENCODING_INVALID_ENCODED_DATA'
}

@bluwy
Copy link
Member

bluwy commented Oct 16, 2023

I'm still not sure if it's worth fixing, wouldn't that also affect perf? There's a lot of tools today that reads files with utf-8 and they would silently fail too, we're only fixing a portion of the issue.

@shamrin
Copy link
Contributor Author

shamrin commented Oct 16, 2023

@bluwy I believe it's worth fixing.

  1. I was not the first one who stumbled upon Windows echo command default encoding. Examples: dotenv, pipenv, phpdotenv.
  2. It's not the first encoding-related problem that was fixed in Vite: https://github.com/vitejs/vite/issues?q=is%3Aissue+bom+is%3Aclosed
  3. It gives a bad impression when trying Vite for the first time on Windows (when the rest of the team is on Mac or Linux).

I haven't found measurable performance difference. Half of the time TextDecoder was faster:

$ npm run benchmark
 ✓ test/encoding.bench.ts (2) 1309ms
     name                              hz     min      max    mean     p75     p99    p995    p999     rme  samples
   · readFile                   32,921.34  0.0100  12.5700  0.0304  0.0250  0.1150  0.1600  1.7350  ±8.41%    16461  
   · readFile with TextDecoder  36,706.53  0.0100  11.7250  0.0272  0.0200  0.1100  0.1450  1.4550  ±8.11%    18354   fastest


 BENCH  Summary

  readFile with TextDecoder - test/encoding.bench.ts > 
    1.11x faster than readFile

https://stackblitz.com/edit/vitest-dev-vitest-oqhnfg?file=test%2Fencoding.bench.ts

@sapphi-red
Copy link
Member

FWIW fs.readFile(, 'utf8') has a fast path (nodejs/node#48658).

Even if we fixed every place in our code base, there'd be plenty of codes that we depend on that assumes the file is encoded in UTF-8. I don't think we can fix this.

@shamrin
Copy link
Contributor Author

shamrin commented Nov 7, 2023

@sapphi-red nice find! However, Vite does not supply "utf8" argument when loading env files. I did a benchmark only to show there would be no measurable difference between current implementation (that silently ignores decoding errors) and a correct one (that tells users about incorrect encoding).

I would argue broken env handling on Windows is common enough to warrant a fix for this specific case (not general). I’ve posted some links in my previous comment.

I think It would be very nice if Vite would not fail silently and in a hard-to-debug manner.

How does Vite contribution flow work? Is it a good idea to get an approval from a maintainer first?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants