Dynamic routes cause a "ChunkLoadError" when using the app router in a dockerized Next.js 13 SPA #54008

Closed
1 task done
eide-1 opened this issue Aug 14, 2023 · 66 comments · Fixed by #56187
Assignees
Labels
bug Issue was opened via the bug report template. linear: next Confirmed issue that is tracked by the Next.js team.

Comments

@eide-1

eide-1 commented Aug 14, 2023

Verify canary release

  • I verified that the issue exists in the latest Next.js canary release

Provide environment information

Operating System:
      Platform: linux
      Arch: x64
      Version: #85-Ubuntu SMP Fri Jul 7 15:25:09 UTC 2023
    Binaries:
      Node: 18.15.0
      npm: 9.5.0
      Yarn: N/A
      pnpm: N/A
    Relevant Packages:
      next: 13.4.15
      eslint-config-next: N/A
      react: 18.2.0
      react-dom: 18.2.0
      typescript: N/A
    Next.js Config:
      output: N/A

Which area(s) of Next.js are affected? (leave empty if unsure)

App Router

Link to the code that reproduces this issue or a replay of the bug

https://github.com/eide-1/sample-project

To Reproduce

Steps to Reproduce:

  1. Run the project in a production environment by following the instructions in the README file
  2. Click on the "Go to Dynamic Route" button. You will be presented with an application error
  3. Right-click anywhere on the webpage and click "Inspect". The browser console will display the "ChunkLoadError"

Describe the Bug

I am currently working on upgrading an existing Next.js SPA that uses the pages router to the app router. The project is deployed using Docker and Nginx. However, I noticed that dynamic routes cause a "ChunkLoadError" when the SPA is deployed in a production environment. All pages (including dynamic routes) are functional in a development environment.

The dynamic route in question uses the "use client" directive. I noticed that if I remove the "use client" directive and any component or hook that requires it from the page, I can get it to render properly in production mode.

I created a sample project that uses the same tech stack as my current SPA to demonstrate the issue and I included the steps to reproduce the issue below.

I also found the following Stack Overflow question detailing a similar issue to the one I described above: In Next.js 13 using App Router, why can't I export dynamic routes with "use client"?

Expected Behavior

Dynamic routes should work in the production environment just like they used to under the pages router without causing a ChunkLoadError.

Which browser are you using? (if relevant)

Microsoft Edge 115.0.1901.203 (64-bit)

How are you deploying your application? (if relevant)

Docker/Nginx

NEXT-1641

@eide-1 eide-1 added the bug Issue was opened via the bug report template. label Aug 14, 2023
@hungcrush

This comment was marked as off-topic.

@danielle-o3h

danielle-o3h commented Aug 17, 2023

I am seeing the same issue in our Docker environments. Reverting to version 13.4.9 is a temporary fix.

Any URL containing [ or ] (%5B, %5D) returns a 404 error, and that includes the generated JS files for any dynamic page.

@hykelvinlee42

I am seeing the same issue in our Docker environments. Reverting to version 13.4.9 is a temporary fix.

Any URL containing [ or ] (%5B, %5D) returns a 404 error, and that includes the generated JS files for any dynamic page.

I can confirm this solution works. The issue has been occurring since version 13.4.13-canary.16.

@harryoppa

This comment was marked as off-topic.

@Baukaalm

This comment was marked as off-topic.

NicolasDuboisToulouse pushed a commit to NicolasDuboisToulouse/a-tes-souhaits that referenced this issue Aug 19, 2023
@bryceAebi

This comment was marked as off-topic.

@bryceAebi

bryceAebi commented Aug 20, 2023

I have more or less the same setup and the same issue (only dynamic routes), but I am only experiencing the error when I have Google Search Console "live test" my production URL, which makes this difficult to debug.

I'm not able to reproduce the error in production via Chrome Beta.

@scottpaulin

I am experiencing similar issues with Google Search Console "live test" on production URL for a site deployed on Vercel. It makes it difficult to get pages indexed in Google Search.

It seems a little non-deterministic for me. Sometimes Google Search Console "live test" does not work on a page. Yesterday Google Search Console "live test" failed twice on a page then worked the third time.

[email protected] and app router deployed on Vercel

@bryceAebi

bryceAebi commented Aug 20, 2023

@scottpaulin I realized that my issue is not limited to my dynamic routes. It affects a bunch of my pages. The Live Test fails about two thirds of the time. But I do think my pages are getting indexed as I can find them using Google Search and my production pages load just fine when I navigate to them in the browser. Perhaps it's an issue with the Live Test crawler erroneously thinking that a certain chunk is unnecessary and not loading it. I'm hoping it's not actually impacting SEO...

@scottpaulin

@bryceAebi yea I have seen it in a static route too.

Google also (eventually) indexes my pages. I think Google is recrawling pages every few days and eventually has a successful render.

Ahrefs is showing lots of errors about failed metadata etc, I'm guessing it's because the render fails.

Yea not sure if seo is being affected. Fingers crossed it isn't. Pretty scary issue for sites that rely on search traffic for revenue.

I am seeing a bunch of http status 0 errors in datadog. Might be unrelated though.

@marwinlewis

This comment was marked as off-topic.

@LucaNerlich

LucaNerlich commented Aug 23, 2023

I've tested many versions, and 13.4.12 is the last working one. As soon as I upgrade Next.js to any higher version, dynamic routing breaks in interesting ways:

A 404 ChunkLoadError in a Docker environment, and no content and no error at all for non-Docker deployments.

Super weird and very annoying. Is this Unix-only?

I am unable to reproduce this locally with any version; even the latest works fine. But as soon as I deploy to an Ubuntu cloud server, the router is broken again.

This affects both app and pages router.

@marwinlewis

I've tested many versions, and 13.4.12 is the last working one. As soon as I upgrade Next.js to any higher version, dynamic routing breaks in interesting ways.

A 404 ChunkLoadError in a Docker environment, and no content and no error at all for non-Docker deployments. Super weird and very annoying. Is this Unix-only? I am unable to reproduce this locally with any version; even the latest works fine. But as soon as I deploy to an Ubuntu cloud server, the router is broken again.

This affects both app and pages router.

Yes, locally even the production version works fine for the latest version. The problem occurs only after deployment to a unix server.

@shehi

shehi commented Aug 24, 2023

I develop and build on Docker env - so both npm run dev and npm run build are Dockerized all the time. Dockerized Linux (host = Ubuntu, guest/containers = debian).

I am seeing this issue on PROD builds (i.e. npm run build), starting v13.4.13 (works fine in v13.4.12). Testing every single release from v13.4.13 to v13.4.19 - all fail.

Dev env (again, Dockerized npm run dev) works just fine on the latest NextJs.

@domosedov

This comment was marked as off-topic.

@domosedov

This comment was marked as off-topic.

@pwestern

For what it's worth, I have also had the same issue, and downgrading to 13.4.12 resolves it.

@shehi

shehi commented Aug 28, 2023 via email

@janpaepke

Guys this caused me a lot of head scratching. Thanks for this issue!

For the record (host OS: OSX):

  • local dev env (npm run dev) works
  • local build start (npm run build && npm start) works
  • local dockerized run (docker build, start container, forward port) WORKS!
  • deployed dockerized run - BREAKS

The most confusing part to me was that even running in docker worked for me locally.
It led me down a rabbit hole investigating whether there are some weird ingress configurations on our cluster.

Whenever I deployed it, on the production version it would show:

"Application error: a client-side exception has occurred (see the browser console for more information)."
In the console I could then see that all the js chunks that are in dynamic routes could not be loaded.
I even checked in the docker container to confirm that the files are there (.next/static/chunks/app/[dynamicBit]).

It turned out to be this.

  • Downgrading to 13.4.12 fixed this
  • Upgrading to 13.4.20-canary.18 did NOT fix this.

Hope this saves someone some head-scratching.

I will investigate some more what might be going on here.

@domosedov

13.4.12 - Latest stable version

@janpaepke

I have narrowed this down further: the breaking change must have been introduced in v13.4.13-canary.0, the first version where this occurs.

Looking at the changes, nothing stood out and unfortunately I lack the time for further analysis.
Maybe this is a good starting point for someone, though.

Until then, we'll (have to) stick with 13.4.12

@poorscousertommy8

poorscousertommy8 commented Sep 7, 2023

Same issue here.
I also hope that the Next.js team is working on it. We are currently stuck on version 13.4.12.
We don't use Docker; we run on a Windows Server with iisnode.

@beverloo

beverloo commented Sep 7, 2023

I've also been having this issue, and can reproduce Jan's observation that this broke between v13.4.12 and v13.4.13-canary.0, which contains the following commits:

v13.4.12...v13.4.13-canary.0

Another observation is the following: given a dynamic route parameter [slug],

  • "static/chunks/app/events/[slug]/layout-dd5472667df98622.js" exists on disk and is included in the build manifest,
  • "/_next/static/chunks/app/events/%5Bslug%5D/layout-dd5472667df98622.js" is requested by the browser, but 404s,
  • "/_next/static/chunks/app/events/%255Bslug%255D/layout-dd5472667df98622.js" works just fine.

Since the file exists on disk without any form of encoding, which also was the case in v13.4.12, that makes me think this likely is caused by a change in the included server.
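
The three request paths above differ only in how many times the brackets were percent-encoded. As a minimal sketch (the path is shortened for illustration and is not the exact chunk name from the report), JavaScript's built-in encodeURI and decodeURIComponent show the relationship, and why a doubly encoded URL can still resolve when something in front of the server decodes the path once:

```typescript
// Shortened, illustrative chunk path; not the full path from the report.
const onDisk = "/_next/static/chunks/app/events/[slug]/layout.js";

// encodeURI escapes "[" and "]" but leaves "/" alone.
const encodedOnce = encodeURI(onDisk);
// Encoding again escapes the "%" signs themselves ("%" becomes "%25").
const encodedTwice = encodeURI(encodedOnce);

// One round of decoding undoes exactly one round of encoding.
const decodedOnce = decodeURIComponent(encodedTwice);
const decodedTwice = decodeURIComponent(decodedOnce);

console.log(encodedOnce);  // /_next/static/chunks/app/events/%5Bslug%5D/layout.js
console.log(encodedTwice); // /_next/static/chunks/app/events/%255Bslug%255D/layout.js
console.log(decodedOnce === encodedOnce, decodedTwice === onDisk); // true true
```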

This commit changes how files are resolved, particularly the diffs in packages/next/src/server/lib/router-utils/. I see some encoding and decoding going on in there, but nothing stands out as the cause. There are significant logic differences in filesystem.ts between dev and prod behaviour, and an interesting (but likely unrelated) clause on :560-565 that seems to guard against this type of mismatch:

// In dev, we ensure encoded paths match
// decoded paths on the filesystem so check
// that variation as well
const tempItemPath = decodeURIComponent(curItemPath)
fsPath = path.posix.join(itemsRoot, tempItemPath)
found = await fileExists(fsPath, FileType.File)

I'm not really having any luck getting local changes to end up in the docker image, will try that again later...

@beverloo

beverloo commented Sep 8, 2023

I've confirmed that 1398de9 is the cause of the issue.

Looking at packages/next/src/server/lib/router-utils/filesystem.ts, the check on line 517 fails because curItemPath is not found in items (for type=nextStaticFolder). The state at this point is:

let curItemPath = "/_next/static/chunks/app/events/[slug]/page-8f8f16ccd387e59d.js";
let curDecodedItemPath = "/_next/static/chunks/app/events/[slug]/page-8f8f16ccd387e59d.js";
let items = [
    // ...
    "/_next/static/chunks/app/events/%5Bslug%5D/page-8f8f16ccd387e59d.js",
];

The entry in items was encoded on line 181:

// ensure filename is encoded
nextStaticFolderItems.add(
  path.posix.join('/_next/static', encodeURI(file))
)

If I remove the encodeURI call the app works properly and this issue is solved. I see that files in the public directory are similarly encoded on :147; I can reproduce this exact issue in that directory using a file called [hello].txt, which 404s in v13.4.13-canary.0, but works when I remove that encodeURI call too.
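
The mismatch described above (an encoded entry in the set, a decoded path used for the lookup) can be reproduced in isolation. The following is only a sketch of one tolerant lookup strategy, not the change that was made in Next.js: `hasStaticItem` and `safeDecode` are hypothetical helpers, and the `Set` is a stand-in for the real `items` collection.

```typescript
// Hypothetical stand-in for the set of known static files.
const items = new Set<string>([
  "/_next/static/chunks/app/events/%5Bslug%5D/page-8f8f16ccd387e59d.js",
]);

// Decode a path, falling back to the raw input on malformed escapes.
function safeDecode(p: string): string {
  try {
    return decodeURIComponent(p);
  } catch {
    return p;
  }
}

// Try the path as received, fully decoded, and re-encoded, so the lookup
// succeeds whether or not a proxy decoded the URL on the way in.
function hasStaticItem(known: Set<string>, requestPath: string): boolean {
  const decoded = safeDecode(requestPath);
  const candidates = [requestPath, decoded, encodeURI(decoded)];
  return candidates.some((candidate) => known.has(candidate));
}

const decodedRequest = "/_next/static/chunks/app/events/[slug]/page-8f8f16ccd387e59d.js";
console.log(items.has(decodedRequest));            // false: the strict lookup misses
console.log(hasStaticItem(items, decodedRequest)); // true: the tolerant lookup matches
```

Checking several spellings trades a little extra work per request for independence from whatever the proxy did to the URL; whether that trade-off suits the real router is a separate question.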

Hey @ijjk - I hope you don't mind the mention - are you sure that the encodeURI calls are necessary here? In other words, could it be that the change you made to test/integration/i18n-support/test/shared.js hides an actual failure?

[edit]
I also confirmed that the issue still exists in v13.4.20-canary.23, and that this commit in my fork fixes it. I haven't yet been able to run the NextJS test suite to see if anything else breaks...

@beverloo

Final message from me, apologies for triple posting. While this definitely is a regression in NextJS, I found the underlying root cause for my case: nginx was messing with the URL.

I use nginx to terminate TLS and reverse proxy the requests. When the proxy_pass rule ends with a slash, it gets rewritten. When it doesn't end with a slash, the original request URL is passed through. (Obviously 🙄...)

I fixed it by changing my location rule from this:

location / {
	proxy_pass http://localhost:3001/;  # <-- ends with a slash
	proxy_set_header Host $host;
	proxy_set_header X-Real-IP $remote_addr;
	proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}

to the following, after which the latest NextJS canary works just fine:

location / {
	proxy_pass http://localhost:3001;  # <-- does not end with a slash
	proxy_set_header Host $host;
	proxy_set_header X-Real-IP $remote_addr;
	proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}

beverloo added a commit to beverloo/volunteer-manager that referenced this issue Sep 10, 2023
This was blocked on a regression in NextJS, which previously compensated for rather surprising behaviour from nginx.

vercel/next.js#54008 (comment)
@janpaepke

As to Next's standalone server: apparently it has its own URL rewrite issues, which have surfaced here. Nginx fix doesn't fix it of course, but it fixes our use case, where Next server is replaced by Nginx. I hope I could explain my point.

@shehi yes but I'd like to focus on the actual issue which is an obvious regression.

There is also a problem with different letter case, "[slug]" works, but [slugCase] doesn't.

@AmirL , the same as @poorscousertommy8 I have this issue even with all lowercase, [lang] to be specific.
This is implementing i18n with the app router, roughly following the official example.

We have mentioned the issue to several maintainers now and it is clear we need their support.
Let's make it more likely that they will look into this issue by trying to limit additional comments to those which might shed further light on a potential solution.

@poorscousertommy8

This comment was marked as off-topic.

@bielinskiadrian

This comment was marked as off-topic.

@JavierMartinz

JavierMartinz commented Sep 20, 2023

I'm also having this issue, and I'm not using a trailing slash in my nginx reverse proxy config.

Buggy config

  location /_next {
    proxy_pass              http://localhost:3000/_next;
    proxy_http_version      1.1;
    proxy_set_header        Upgrade $http_upgrade;
    proxy_set_header        Connection 'upgrade';
    proxy_set_header        Host $host;
    proxy_cache_bypass      $http_upgrade;
  }

  location /_next/static {
    proxy_pass              http://localhost:3000/_next/static;
    proxy_http_version      1.1;
    proxy_set_header        Upgrade $http_upgrade;
    proxy_set_header        Connection 'upgrade';
    proxy_set_header        Host $host;
    proxy_cache_bypass      $http_upgrade;
  }

I just found the fix, after reading this https://serverfault.com/a/463932

Working config

  location /_next {
    proxy_pass              http://localhost:3000;
    proxy_http_version      1.1;
    proxy_set_header        Upgrade $http_upgrade;
    proxy_set_header        Connection 'upgrade';
    proxy_set_header        Host $host;
    proxy_cache_bypass      $http_upgrade;
  }

  location /_next/static {
    proxy_pass              http://localhost:3000;
    proxy_http_version      1.1;
    proxy_set_header        Upgrade $http_upgrade;
    proxy_set_header        Connection 'upgrade';
    proxy_set_header        Host $host;
    proxy_cache_bypass      $http_upgrade;
  }

@balazsorban44 balazsorban44 added the linear: next Confirmed issue that is tracked by the Next.js team. label Sep 20, 2023
@poorscousertommy8

This comment was marked as off-topic.

@jlalmes

jlalmes commented Sep 25, 2023

Been tearing my hair out over this bug today 😭. Was introduced in v13.4.13-canary.0 by @ijjk in this PR #53029.

The problem is that when servers such as NGINX (& Apache) reverse proxy a request (using proxy_pass) they decode the URL before forwarding the request (StackOverflow). NextJS has always handled this, but it has been broken since v13.4.13-canary.0.

200 ✅ `/_next/static/chunks/pages/%5Bparent%5D/%5Bchild%5D-hash.js`
404 ❌ `/_next/static/chunks/pages/[parent]/[child]-hash.js` # URL transformed by NGINX

FWIW, this one-liner patch resolves the issue:

[Screenshot of the one-line patch, 2023-09-25 at 23:09:29]

cc. #54008 (comment)
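
To check which spelling of a chunk URL a given deployment accepts, the encoded and literal forms can be requested side by side. The sketch below is an illustration only: `pathVariants`, `probe`, and `StatusFetcher` are hypothetical names, and the HTTP client is injected so that any client can be used (for example the global `fetch` in Node 18+). Some clients or intermediaries may rewrite the path before it is sent, so the result is worth confirming in server logs.

```typescript
// The three spellings of a chunk path that appear in this thread.
function pathVariants(literalPath: string): [string, string][] {
  const encoded = encodeURI(literalPath);
  return [
    ["literal", literalPath],
    ["encoded", encoded],
    ["doubleEncoded", encodeURI(encoded)],
  ];
}

// Minimal shape of an HTTP client: resolves to an object with a status code.
type StatusFetcher = (url: string) => Promise<{ status: number }>;

// Request every variant and collect the HTTP status for each.
async function probe(
  origin: string,
  literalPath: string,
  fetcher: StatusFetcher
): Promise<Record<string, number>> {
  const statuses: Record<string, number> = {};
  for (const [name, path] of pathVariants(literalPath)) {
    const response = await fetcher(origin + path);
    statuses[name] = response.status;
  }
  return statuses;
}
```

A call such as `probe(origin, "/_next/static/chunks/pages/[parent]/[child]-hash.js", (url) => fetch(url))` run once against the Next.js server directly and once through the proxy shows where the two paths start to diverge.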

@janpaepke

janpaepke commented Sep 26, 2023

@jlalmes thanks for your insights. Would you mind adding a pr for this?

@balazsorban44
Member

Hi everyone, we are looking into this!

The best way to help is to provide a minimal reproduction instead of "same issue", etc. comments. It should be clear from your reproduction if you are using nginx, standalone (output: "standalone"), Docker, or a combination of these. You don't need to comment "it's still happening on x" as this information in itself is not helpful if we cannot see your code/config, instead provide a URL to a public reproduction. 🙏

Note: Please refrain from tagging maintainers for visibility, the issue is already tracked, so we are aware.

@omarmciver
Contributor

@balazsorban44

Minified version of the problem here: https://github.com/omarmciver/ui-example-web

Spin it up with the VS Code Dev Containers extension. It includes a simple nginx config.

The README.md gives instructions on how to reproduce the problem, and how to fix it by not using client-side rendering.

Using 13.5.3

@omarmciver
Contributor

TL;DR 🚀

If you're using a configuration like this:

location /next {
    proxy_pass http://ui-example-web:3000/next;
}

Change it to:

location /next {
    proxy_pass http://ui-example-web:3000$request_uri;
}

⚠️ You may also need to add a DNS resolver directive if you don't already have one:

resolver 127.0.0.11 valid=30s;

If this helped you, consider buying me a coffee.


Full Explanation 📖

The Issue 🐛

When NGINX is given a proxy_pass value that includes a URI (anything after the protocol, hostname and port), it always decodes the URI before forwarding the request.

This leads Next.js to receive a decoded request URI for a static chunk, like so:

http://localhost:3000/next/_next/static/chunks/app/things/[thingId]/page-a4112d9d01403386.js

It (or webpack?) expects to receive it like so:

http://localhost:3000/next/_next/static/chunks/app/things/%5BthingId%5D/page-a4112d9d01403386.js

The Consequence ❌

Next.js doesn't handle that decoded URI well and returns a 404 error. This issue is specific to next start (it doesn't happen with next dev) and occurs only if your dynamic route has a "use client" directive.

The Solution 💡

The workaround is to use NGINX variables. The $request_uri variable contains everything after the original host and port from the incoming request. By appending $request_uri at the end of proxy_pass, NGINX won't decode the URI.
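
The difference between the two proxy_pass forms can be modelled in a few lines. This is a deliberately simplified sketch, not nginx's implementation: `forwardedPath` is a hypothetical function, `decodeURIComponent` stands in for nginx's URI normalization (which does more than decoding), and the `$request_uri` form is modelled the same way as a proxy_pass without a URI part, on the assumption that both forward the client's original spelling.

```typescript
// Simplified model of the path nginx forwards upstream.
// proxyPassUri is the URI part of proxy_pass ("/next" in "http://host:3000/next"),
// or null when proxy_pass has no URI part or ends in $request_uri.
function forwardedPath(
  rawRequestPath: string,
  location: string,
  proxyPassUri: string | null
): string {
  if (proxyPassUri === null) {
    // The request is passed through in the form the client sent it.
    return rawRequestPath;
  }
  // With a URI part, the normalized (decoded) path is used, and the part
  // matching the location is replaced by the proxy_pass URI.
  const normalized = decodeURIComponent(rawRequestPath);
  return proxyPassUri + normalized.slice(location.length);
}

// Illustrative request path with a shortened chunk name.
const raw = "/next/_next/static/chunks/app/things/%5BthingId%5D/page.js";

console.log(forwardedPath(raw, "/next", "/next"));
// /next/_next/static/chunks/app/things/[thingId]/page.js
console.log(forwardedPath(raw, "/next", null));
// /next/_next/static/chunks/app/things/%5BthingId%5D/page.js
```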

Final Thoughts 💭

Understanding both NGINX and Next.js behaviors is crucial. I use NGINX in development and NGINX ingress controller in a Kubernetes production environment. The same config change applies to both.

Note: When I switched to this new proxy_pass configuration, I had to add a DNS resolver directive to my NGINX config.


Feel free to share your thoughts and experiences. If this helped you, I'm still looking for help buying coffee.

@janpaepke

janpaepke commented Sep 28, 2023

thanks @omarmciver for your detailed analysis.

I would like to point out one thing from it in particular, because I suspect it applies to everyone here who, like me, thought this couldn't be related to nginx because they're not using it.

Initially I thought "Well this again doesn't concern me, because I'm not using nginx, but the nextjs standalone server."
But that quickly changed, when I read this in your text:

I use NGINX in development and NGINX ingress controller in a Kubernetes production environment.

I don't use nginx in development, but it turns out we DO use an nginx ingress controller in the kubernetes production environment.
I hadn't checked the nginx config on it, because I didn't configure it - I actually think it might have been auto-configured.

The bad news is: I have checked the config now and there is no special configuration for /next and no proxy_pass with trailing slashes anywhere, so there's no easy workaround for me. It's also worth noting, that our current configuration worked up until 13.4.12

The good news is: We may have found the origin. Next needs to handle the encoding/decoding of the chunk URIs properly. It also means my initial gut feel pointing towards our ingress controller, which I mentioned in my initial error report, was probably correct.

@poorscousertommy8

Sorry Balázs for the tagging.
I would love to provide a minimal reproduction, but I'm having a hard time with it because, as with my colleagues who use nginx, the problem is probably between Next.js and my Windows Server (as already mentioned, I use iisnode).

What I do is relatively simple to explain: I run the setup according to the Next.js website (npx create-next-app@latest, including the App Router) and create a folder in the app directory with a dynamic route ([...pagename]). After deployment, loading the chunks fails.

@omarmciver
Contributor

@janpaepke I can share with you the Kubernetes ingress solution a little later today. I don't know what, if any, blocker there is for nextjs handling a decoded uri. I might look at that previous commit that was referenced as introducing the issue.

@omarmciver
Contributor

@janpaepke and others.
I didn't spend time on my AKS ingress config. Instead, I managed to get up and running with debugging nextjs and I've got a fix in a draft PR.

PR #56187

It will remain in draft until I get time to write a test to support this.

@alexrabin

alexrabin commented Sep 28, 2023

@omarmciver Thank you!

@jviall

jviall commented Sep 28, 2023

@janpaepke and others. I didn't spend time on my AKS ingress config. Instead, I managed to get up and running with debugging nextjs and I've got a fix in a draft PR.

PR #56187

It will remain in draft until I get time to write a test to support this.

Tysm for your efforts and contributions!!

@omarmciver
Contributor

Still working on getting a test in place, then hopefully working my way to the top of the maintainers review list. Man, they have a lot of work!

@poorscousertommy8

For all users who use windows and iisnode, there is a solution #54325 (comment).

@huozhi huozhi self-assigned this Oct 2, 2023
@janpaepke

For all followers of this: There's movement in @omarmciver's PR. Looks like this might be solved soon. 🥳

@huozhi
Member

huozhi commented Oct 3, 2023

This is fixed in the latest patch release, 13.5.4. Please upgrade to the new version. Thanks, y'all, for digging deep on this issue 🙏
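
For anyone auditing several projects, the version range reported in this thread (first broken in v13.4.13-canary.0, fixed in 13.5.4) can be checked with a small helper. This is a sketch with hypothetical function names. It compares only the major.minor.patch part, so canaries of 13.4.13 count as affected, while canaries of 13.5.4 are treated as fixed, which the thread does not confirm.

```typescript
// Parse "13.4.13-canary.0" or "v13.5.4" into [major, minor, patch].
// Prerelease tags are ignored (an assumption of this sketch).
function coreVersion(version: string): [number, number, number] {
  const match = /^v?(\d+)\.(\d+)\.(\d+)/.exec(version);
  if (!match) {
    throw new Error(`Unrecognized version: ${version}`);
  }
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// Negative if a < b, zero if equal, positive if a > b.
function compareCore(
  a: [number, number, number],
  b: [number, number, number]
): number {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) {
      return a[i] - b[i];
    }
  }
  return 0;
}

// True for versions from 13.4.13 (including its canaries) up to, but not including, 13.5.4.
function isInReportedRange(version: string): boolean {
  const v = coreVersion(version);
  return compareCore(v, [13, 4, 13]) >= 0 && compareCore(v, [13, 5, 4]) < 0;
}

console.log(isInReportedRange("13.4.12")); // false
console.log(isInReportedRange("13.5.3"));  // true
console.log(isInReportedRange("13.5.4"));  // false
```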

@mick-feller

I can confirm that this patch worked. I upgraded to 13.5.5 and all is good again!

@rihards-simanovics

People, please stop spamming. This is an issue, not a discussion, and I only want to get updates when there is progress. If it works for you, just 👍 the first message that says it started working or that the issue got fixed.

@vercel vercel locked as resolved and limited conversation to collaborators Oct 14, 2023