Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Chunk file output not deterministic #478

Open
samcic opened this issue Dec 23, 2021 · 11 comments · May be fixed by #479
Open

Chunk file output not deterministic #478

samcic opened this issue Dec 23, 2021 · 11 comments · May be fixed by #479

Comments

@samcic
Copy link

samcic commented Dec 23, 2021

Repro steps (run on a Windows 10 with node 14.16.1):

  • npm install -g [email protected]
  • ember new ember-auto-import-chunk-demo
  • Update "ember-auto-import": "^2.2.4" in package.json
  • Add "webpack": "^5.65.0" to package.json
  • npm install
  • ember build -prod
  • Run ember build -prod again to see if the output in dist is deterministic

Expected result: The chunk files in dist remain unchanged.

Actual result: Some of the chunk files in dist have different content each time a build is done.

The main culprit seems to be the chunk which is named chunk.143 (at least that's how it's named on my machine). Every time a build happens, the integer ids change:

image

The usual app- and vendor- js files, as well as one of the chunk files, do indeed have a deterministic output.

Is this expected? From the perspective of optimizing cache usage for end-user browsers, it seems sub-optimal.

If it is indeed unexpected, can it be fixed with a custom webpack config? I looked into the deterministic option for chunk ids in webpack, but it seems that should be default (and setting it explicitly didn't change the behavior).

I'd be happy to investigate and prepare a contribution, however I could use a bit of a pointer in the correct direction from someone who's familiar with

@ef4
Copy link
Collaborator

ef4 commented Dec 23, 2021

Yes, this seems like a good thing to take care of in the defaults settings. I suspect it's optimization.moduleIds that needs to be set to "deterministic".

Could you try that out and PR it into our defaults if it solves the problem?

Thanks.

@samcic
Copy link
Author

samcic commented Dec 23, 2021

Thanks @ef4 !

So I added:

  let app = new EmberApp(defaults, {
    autoImport: {
      webpack: {
        optimization: {
          moduleIds: 'deterministic'
        }
      }
    }
  });

...however it doesn't change the behavior unfortunately. Are those integers indeed "module ids", or are they something else?

@ef4
Copy link
Collaborator

ef4 commented Dec 23, 2021

They look like moduleIds to me, but I could be wrong.

Double check that your setting is really getting through to webpack. Maybe some other bug is preventing it from getting merged into the webpack config correctly.

If you change this setting and diff the output from before, does anything change? Something should if the setting is having any effect.

@samcic
Copy link
Author

samcic commented Dec 23, 2021

@ef4 Thanks heaps for the guidance. I enabled debug mode for ember-auto-import* because I saw that the add-on dumps the config used:

image

It seems clear that the option is passed through to webpack based on this, but indeed, nothing else changes regardless of whether I have this option set or not. That is, these module ids always differ from one build to the next, regardless of the option, but no other asset files change (app- and vendor- files are unchanged as expected, even if I compare them across one build with the option enabled and one build without).

I guess I'll start looking into ways to enable debug logs on webpack. If you have any other ideas, let me know!

@samcic
Copy link
Author

samcic commented Dec 23, 2021

Interestingly, changing the option to something else like "named" (instead of "deterministic") has an effect and indeed changes the culprit integers to names. Will keep digging...

@samcic
Copy link
Author

samcic commented Dec 23, 2021

Ok, getting closer here. I noticed when using the "named" option that the module names themselves change every build, based on what looks like broccoli behavior:

image

Based on the webpack docs, deterministic module ids are based on hashing the module name, so if the module name isn't deterministic then determinism is obviously lost.

Will now start digging to see if the broccoli names can be made deterministic.

@samcic
Copy link
Author

samcic commented Dec 23, 2021

So using the "size" option, i.e.

    autoImport: {
      webpack: {
        optimization: {
          moduleIds: 'size'
        }
      },
    },

...solves this issue in the sense that the multiple successive builds of the same unchanged source files give the same output files.

I don't think this approach offers the same "long term caching" benefits of the "deterministic" option documented by webpack, but for my use-case the problem is solved (I just needed repeat builds to be deterministic).

@ef4 What do you think...is "size" a suitable default in order to resolve the issue (if so, I can create a PR), or do you perhaps know how to make the broccoli-style module names from my screenshot above deterministic so the "deterministic" webpack option could work?

@ef4
Copy link
Collaborator

ef4 commented Dec 24, 2021

Nice investigating, this is important:

the module names themselves change every build, based on what looks like broccoli behavior

I think that's what we should fix. It's true that if we let broccoli pick the temp directory it will be non-deterministic. But we don't have to let it pick, we can make our own. For the same reason, Embroider generates a stable tempdir per project, and that's what it shows to webpack.

@samcic
Copy link
Author

samcic commented Dec 26, 2021

@ef4 Thanks heaps for the comments. I agree that it makes sense to resolve the core issue. I'm happy to spend time to make a contribution, but after taking browsing this repo, the broccoli repo and the Embroider repo, I'm struggling to find how to control which temp directory gets used. Could you perhaps point me in the right direction?

@ef4
Copy link
Collaborator

ef4 commented Dec 27, 2021

I just thought of a possible alternative solution that might be better, which you could test. I don't think we currently set webpack's context. If we set it to the this.stagingDir, that might make all the paths relative to there, which would mean that the location of the stagingDir on the local machine wouldn't influence production bundle names, which would be good.

If that doesn't work, we can continue down the path of taking control of the tempdir name. This is where we set the directory:

https://github.com/ef4/ember-auto-import/blob/6fc0e75aa0e663ca40296100507258133571be0c/packages/ember-auto-import/ts/webpack.ts#L123

this.cachePath is a temp dir that broccoli created for us because we passed needsCache to the constructor:

https://github.com/ef4/ember-auto-import/blob/6fc0e75aa0e663ca40296100507258133571be0c/packages/ember-auto-import/ts/webpack.ts#L94

Instead of using this broccoli-provided temp dir, we would make our own instead. It should get a stable name based off the app we're building, so that it's the same across rebuilds. Embroider does a similar thing here by hashing the filesystem path to the app that is being developed:

https://github.com/embroider-build/embroider/blob/6a2e220f08db8558af8366e9d186d2f5df86953f/packages/compat/src/default-pipeline.ts#L37

@samcic
Copy link
Author

samcic commented Dec 27, 2021

@ef4 Yep bingo 👍 . Setting context: stagingDir and using the moduleIds: "named" option shows that the module names are now shorter and not dependent on any temp directory:

image

Setting moduleIds: "deterministic" (or removing the config in ember-cli-build.js entirely, i.e. the default behavior), then gives deterministic three-digit module ids persistently across repeated builds. That is, builds will be deterministic for the default Ember app configuration with this change, which is presumably what we want.

I'll go ahead and create a PR...

@samcic samcic linked a pull request Dec 27, 2021 that will close this issue
davidtaylorhq added a commit to discourse/discourse that referenced this issue Nov 7, 2022
This commit works around a couple of issues in `ember-auto-import` which were causing non-deterministic chunk filenames and content. Deterministic output should improve cache-reusability across deploys.

Refs:
embroider-build/ember-auto-import#519
embroider-build/ember-auto-import#548
embroider-build/ember-auto-import#478
davidtaylorhq added a commit to discourse/discourse that referenced this issue Nov 7, 2022
This commit works around a couple of issues in `ember-auto-import` which were causing non-deterministic chunk filenames and content. Deterministic output should improve cache-reusability across deploys.

Refs:
embroider-build/ember-auto-import#519
embroider-build/ember-auto-import#548
embroider-build/ember-auto-import#478
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants