Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enforce latin-1 encoding in source #2672

Closed
wants to merge 2 commits into from

Conversation

Ethan-Arrowood
Copy link
Collaborator

Related to: nodejs/node#51605

this replaces all with ' and the ä with a in a licensing comment.

Based on my research, the licensing comment change does not impact its validity. It would actually be more appropriate to have a complete License file along with the code instead of having it inlined.

I also tried out the --minify-whitespace flag for esbuild. This would remove comments from node:build, but also remove all other whitespace. There does not seem to be an esbuild configuration option that only removes comments. We could consider another tool for that so we can retain other whitespace.

Copy link
Member

@KhafraDev KhafraDev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we remove those comments it'll make debugging fetch 10x harder.

We can no longer copy and paste these comments and find where they originate from in the spec (I do this frequently when updating the impl. or fixing bugs). Since esbuild should be removing comments, it's likely a bug there. We can alternatively add a post processing script that strips these only in the bundled file.

I'm very much a -1 against this change. It'll make maintaining fetch harder for little benefit.

@Ethan-Arrowood
Copy link
Collaborator Author

Not a bug in esbuild: evanw/esbuild#3117 (comment)

@Ethan-Arrowood
Copy link
Collaborator Author

Just tried it and changing the character has no effect on search/find on the spec 😄

@KhafraDev
Copy link
Member

something must have changed with the site then, it definitely did not work in the past...

@Ethan-Arrowood
Copy link
Collaborator Author

But I hear you. I agree too. I think this is an unnecessary change. I think we should minify the build and use source maps for proper stack traces. Not sure how Node.js internal deps handles that so I have to look into it further.

@KhafraDev
Copy link
Member

Internal deps are kinda a nightmare: they get their own version of require that doesn't work well with multiple files (which is why undici is bundled unfortunately).

@KhafraDev
Copy link
Member

KhafraDev commented Jan 30, 2024

The issue with the require we get there is that it only prepends 'internal' to the path, which means we'd have to replace every require in undici with require('undici' + oldPath) (which then gets resolved to internal + undici + oldPath which should theoretically work?). I did look into it a while back, when we had the --keep-names issue with the bundle, but quickly found out it was definitely the correct answer to bundle.

I don't think this change should land unless it's enforced, and at that point we should just npm run build:node && npx removeComments undici-fetch.js or whatever.

@codecov-commenter
Copy link

Codecov Report

Attention: 207 lines in your changes are missing coverage. Please review.

Comparison is base (e39a632) 85.54% compared to head (c984325) 85.35%.
Report is 254 commits behind head on main.

Files Patch % Lines
lib/fetch/index.js 73.17% 55 Missing ⚠️
lib/fetch/util.js 48.45% 50 Missing ⚠️
lib/handler/RetryHandler.js 74.35% 30 Missing ⚠️
lib/cache/cache.js 12.12% 29 Missing ⚠️
lib/api/readable.js 83.92% 9 Missing ⚠️
lib/core/diagnostics.js 84.74% 9 Missing ⚠️
lib/eventsource/eventsource.js 96.09% 5 Missing ⚠️
lib/fetch/headers.js 90.38% 5 Missing ⚠️
lib/client.js 95.52% 3 Missing ⚠️
lib/compat/dispatcher-weakref.js 57.14% 3 Missing ⚠️
... and 7 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2672      +/-   ##
==========================================
- Coverage   85.54%   85.35%   -0.20%     
==========================================
  Files          76       84       +8     
  Lines        6858     7540     +682     
==========================================
+ Hits         5867     6436     +569     
- Misses        991     1104     +113     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@KhafraDev
Copy link
Member

KhafraDev commented Jan 30, 2024

@Ethan-Arrowood I made an alternate PR that converts the bundled file to latin1 only

@joyeecheung
Copy link
Member

joyeecheung commented Jan 31, 2024

they get their own version of require that doesn't work well with multiple files (which is why undici is bundled unfortunately).

I am pretty sure that can be special cased for undici, though I am not sure what undici uses as module ids if not bundled. The internal require will always be just minimal so nothing fancy like module path monkey patching would ever be supported, but prepending certain prefixes when the request comes from under internal/deps/undici wouldn't be that hard.

@mcollina
Copy link
Member

mcollina commented Feb 9, 2024

@Ethan-Arrowood @KhafraDev what's the status of this? There are conflicts now.

@Ethan-Arrowood
Copy link
Collaborator Author

I think @KhafraDev preferred another methodology. Since he's modifying this source more than I am these days, I'm okay deferring to his preference 😄

@Uzlopak Uzlopak deleted the enforce-latin-1-encoding branch February 21, 2024 22:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants