Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Instant Search: Builds probably shouldn't include cachebusters in the file names. #19752

Closed
georgestephanis opened this issue May 5, 2021 · 11 comments · Fixed by #20926 or #21748
Closed
Assignees
Labels
Build [Feature] Search For all things related to Search [Pri] High [Status] Needs Author Reply We would need you to make some changes or provide some more details about your PR. Thank you! [Type] Enhancement Changes to an existing feature — removing, adding, or changing parts of it

Comments

@georgestephanis
Copy link
Member

Steps to reproduce the issue

  1. Have a full page caching system
  2. Update Jetpack
  3. Previously cached pages still enqueue old filenames, which now 404.
  4. Sad christmas.

What I expected

Bundle filenames to be consistent across versions -- using cachebusters in the query string instead as per standard WordPress practice.

What happened instead

The filenames changed, resulting in 404s.

@georgestephanis georgestephanis added [Type] Bug When a feature is broken and / or not performing as intended [Feature] Search For all things related to Search Build Instant Search labels May 5, 2021
@georgestephanis
Copy link
Member Author

Worth noting that this has occurred in hosting environments where Jetpack is updated from a filesystem side, not through WordPress, so there aren't (to my knowledge) any actions firing that would flush the caches.

@jeherve jeherve added [Pri] Normal [Type] Enhancement Changes to an existing feature — removing, adding, or changing parts of it and removed [Type] Bug When a feature is broken and / or not performing as intended labels May 5, 2021
@bluefuton bluefuton added this to the jetpack/9.8 milestone May 10, 2021
@kangzj kangzj assigned kangzj and unassigned kangzj May 11, 2021
@jsnmoon
Copy link
Contributor

jsnmoon commented May 13, 2021

@georgestephanis:

Currently, all entry point assets enqueued via PHP rely purely on query string cache-busting and do not include cache-busting hashes in their filenames (e.g. jp-search-main.bundle.js and jp-search-ie11-polyfill-loader.bundle.js). Assets asynchronously loaded by those entry points, however, do include hashes in their filenames (e.g. jp-search.chunk-main-payload-[hash].css and jp-search.chunk-main-payload-[hash].js).

These async asset names are only referenced within entry point assets. Their hashed filenames should never directly appear in the DOM, so I'm curious as to how a full-page caching system might enqueue old filenames.

Which files were being incorrectly enqueued? If it's one of the hash-named assets, then it's likely that we're failing to bust the cache for one of the entry point assets using query string chachebusters.

You mentioned that this occurred in an environment where "Jetpack [was] updated from a filesystem side". During that process, was JETPACK__VERSION bumped? If not, the output of Jetpack_Search_Helpers::get_asset_version would not cache-bust jp-search-main.bundle.js.

Also, how might I go about setting up a full-page caching system for my test environment? Is there a plugin you'd recommend?

@jsnmoon jsnmoon self-assigned this May 13, 2021
@georgestephanis
Copy link
Member Author

I'll provide more info shortly, but given that my assumption is that the stub js was loaded with it's hard references to the cachebusted in filename files, and then the upgrade happened before it loaded the cachebusted files into the browser, so it 404'ed

And then the full page cache kept pulling from the browser cache for the stub which kept 404'ing when it tried to pull the deleted outdated files.

If that makes sense

@samiff samiff added the [Status] Needs Author Reply We would need you to make some changes or provide some more details about your PR. Thank you! label May 19, 2021
@samiff samiff removed this from the jetpack/9.8 milestone May 20, 2021
@kangzj kangzj self-assigned this May 26, 2021
@deke207
Copy link

deke207 commented Jul 6, 2021

Noting:
We ran into this issue today after Jetpack update 9.9. Oddly, the files initially just started loading on the site very slow, then eventually returned a 404 status. Cache/CDN purge was needed to clear the older [hashed] files.

image

@mdrovdahl
Copy link

mdrovdahl commented Nov 3, 2021

RE-opening this issue as we encountered today what appears to be the same issue on another Team 51 site a few hours after a Jetpack release.

Apologies for not capturing the exact JS error, but it was something like "can't fetch chunk" which caused the request to the public-api.wordpress.com endpoint to not fire. Similar to above, a cache clear and cdn purge resolved it.

@mdrovdahl mdrovdahl reopened this Nov 3, 2021
@eoinsav
Copy link

eoinsav commented Nov 10, 2021

Search stopped working again on our site after upgrading to Jetpack 10.4-a.3

@kraftbj
Copy link
Contributor

kraftbj commented Nov 11, 2021

@Automattic/jetpack-search and @Automattic/ground-control -- Can we please prioritize getting this one resolved? Based on what @mdrovdahl is reporting, this is effectively breaking Instant Search with every new release on impacted hosts, which includes Pressable (and potentially all Atomic?) every week.

@georgestephanis
Copy link
Member Author

My big question -- and this is probably a @jsnmoon question -- is what benefit does this give by changing the filename of an asset with every release, instead of dropping a cachebuster in the get string? Why does it have to be this way and it deserves so much work and workaround, rather than just ... not cachebusting a filename?

@anomiex
Copy link
Contributor

anomiex commented Nov 12, 2021

@Automattic/jetpack-search and @Automattic/ground-control -- Can we please prioritize getting this one resolved? Based on what @mdrovdahl is reporting, this is effectively breaking Instant Search with every new release on impacted hosts, which includes Pressable (and potentially all Atomic?) every week.

It's not clear to me what exactly is breaking in the first place, or whether whatever is breaking would just break in more subtle ways once the requested change is made. But anyway it's already on my list in p9dueE-3MG-p2.

Note the first time they tried to resolve this it wound up causing #21349. Some of the work in p9dueE-3MG-p2 involves fixing things so that doesn't happen again when we re-try it.

@georgestephanis
Copy link
Member Author

georgestephanis commented Nov 12, 2021

It's not clear to me what exactly is breaking in the first place

So, my current understanding of the situation is this.

There's the HTML, which loads the MAIN JS BUNDLE, which in turn loads HASHED JS CHUNK.

The issue arises when Jetpack is updated, and then the error state becomes --

HTML loads an OLD MAIN JS BUNDLE (potentially from a CDN that hasn't updated the file change, more likely from a browser cache), which tries to load OLD HASHED JS CHUNK, which 404's and fails out because that file on the server suddenly no longer exists, rather than just having different content.

Why precisely does the browser have the OLD MAIN JS BUNDLE in its cache instead of grabbing a new version, but not rely on a browser cached version of OLD HASHED JS CHUNK? That seems to be the main question that folks have been intent on debugging and solving, but at this point I'm concerned that's more of an intellectual exercise that is incredibly hard to get a full answer from, and in the interim the problem keeps arising.

Part of me wonders if the JS files are being cached differently by the browser depending on if they're being loaded from the html directly, or loaded from another JS file -- which would also mean the html and the cached js files are on different TLDs, but both JS files are a common cdn domain. I genuinely can't speak to that, but it strikes me as a possibility if someone with more familiarity with browser caching could weigh in.

From my understanding -- and please, please correct me if i am wrong on this -- the linked issue with the library choking on a css file ending with a cachebuster get string and not a file extension feels much more like something that should be accounted for upstream in the rtl library, because assets with get strings isn't an uncommon thing. And not hashing the filename would mean that at worst, the browser/cdn/server would just be able to fail back on the prior version's asset, which would solve the issue of the 404 erroring completely -- as it seems failing to an older version of the file is better than the whole thing exploding.

edit: apologies if I'm rambly, just wanted to brain dump as much context as I could manage in hope that some of is is useful to glean some understanding from.

@anomiex
Copy link
Contributor

anomiex commented Nov 12, 2021

HTML loads an OLD MAIN JS BUNDLE (potentially from a CDN that hasn't updated the file change, more likely from a browser cache),

If the PHP is properly setting the version hash provided by @wordpress/dependency-extraction-webpack-plugin, then either the CDN or browser in your example is broken.

Another possibility is if someone manages to request the just-deployed PHP that serves the new hash and then fetch the JS file before the deployment has updated that too.

From my understanding -- and please, please correct me if i am wrong on this -- the linked issue with the library choking on a css file ending with a cachebuster get string and not a file extension feels much more like something that should be accounted for upstream in the rtl library,

You're not wrong. But someone had to actually fix it for it to be fixed, and we didn't want to leave everything in a broken state in the mean time.

because assets with get strings isn't an uncommon thing.

It's not all that common in Webpack, apparently. 🤷

And not hashing the filename would mean that at worst, the browser/cdn/server would just be able to fail back on the prior version's asset, which would solve the issue of the 404 erroring completely -- as it seems failing to an older version of the file is better than the whole thing exploding.

On the other hand, it might still break if the old caller got the new bundle (or vice versa) because of your broken browser or CDN. It's just more likely to break in weird ways instead of with a 404.

edit: apologies if I'm rambly, just wanted to brain dump as much context as I could manage in hope that some of is is useful to glean some understanding from.

No problem, thanks for the info.

anomiex added a commit that referenced this issue Nov 15, 2021
We renamed them to `.min.js` to avoid a broken wpcom minifier. But now
that we have `Assets::register_script()` to automatically add
`?minify=false` to them, we can put them back at the `.js` names.

Also we can again remove the hashes from the chunk filenames now that
`@automattic/webpack-rtl-plugin` is fixed.

This should fix #19752, and get us closer on #21343. We've got other
files being processed by gulp instead of webpack that may need attention
(or maybe just conversion to being processed by Webpack) before we can
say #21343 is complete.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Build [Feature] Search For all things related to Search [Pri] High [Status] Needs Author Reply We would need you to make some changes or provide some more details about your PR. Thank you! [Type] Enhancement Changes to an existing feature — removing, adding, or changing parts of it
Projects
None yet