Benchmark PnP runtime overhead #1817
I just did this on a very small-scale example, and there is already a significant impact from Yarn berry that I did not expect: https://github.com/crubier/yarn-benchmarks. For example, on a simple create-react-app + Storybook project, building Storybook takes 23s with Yarn berry vs 12s with npm. I think we suffer from this internally: on our big monorepo with many storybooks, the storybooks take a total of 90min to build; reducing that to 45min would already help a lot.
@crubier Have you tried checking whether the impact in Yarn berry is due to zip compression, i.e.
@larixer No, I didn't know about this option. Let me try it now.
BTW, what is the reason for compressing the cache? Is it only to save space on the hard drive?
@crubier Yep, to save space on the hard drive or in the Git repo.
Setting it helps a little: for example, Storybook goes from 21s–25s to 19s–24s.
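For reference, a sketch of how this could be configured in a project's `.yarnrc.yml` (assuming the `compressionLevel` setting discussed later in the thread; `0` disables compression entirely):

```yaml
# .yarnrc.yml — sketch: trade disk/repo space for faster installs and runtime
compressionLevel: 0
```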
That's not a good comparison; what you actually want to compare is PnP vs node_modules in v2.
I think the issue raised by @remorses here is a very good idea. The runtime performance impact of Yarn on real-life benchmarks has been very overlooked, and now might be the time to start thinking about it a bit more. For my part, for example, I don't care about saving a few GB on my disk if it means getting faster performance. An example of runtime performance being overlooked was #973, which was thankfully mostly resolved (on our monorepo there is now "only" a delay of 7s before each yarn command, instead of 5 minutes). What surprised me in this little benchmark is that Yarn berry seems to have a 150–200% performance impact on many real-life scripts.
And yes, I include PnP when I say Yarn v2. The point of this issue is mostly about PnP, I think. I could add Yarn v1 with PnP and Yarn v2 without PnP to the benchmark.
Your install benchmark isn't useful the way it's currently done, since it doesn't take into account whether the archives are already in the cache or whether the lockfile has already been generated. If you need an example of a good install benchmark, here is ours; the results are here.
That wouldn't be useful, as Yarn 1's PnP was just an experiment and not production-ready.
Yes, Yarn 2 with the
I know about that, of course; I ran the benchmark only after installing in each folder (and the benchmark is run several times in a row without cleanup). So it's a "reinstall" benchmark, not a "first install" benchmark.
npm and Yarn v1 are irrelevant IMO; what we're interested in is the performance overhead of Yarn v2's PnP mode vs node_modules. npm and Yarn v1 have hoisting bugs which can give them unfair advantages (v2, as far as we know, doesn't).
That's a false statement; I've done optimizations on real-world use cases, for example:
You can see all the performance-related PRs I've done here: https://github.com/yarnpkg/berry/pulls?q=is%3Apr+author%3Amerceyz+perf+is%3Aclosed
The only case our benchmarks don't cover is the runtime overhead of PnP vs node_modules; the rest we already cover: https://p.datadoghq.eu/sb/d2wdprp9uki7gfks-c562c42f4dfd0ade4885690fa719c818
@merceyz "Overlooked" does not mean "completely forgotten": in the past 3 months you made at least 2 PRs that each improved performance by 100% on certain use cases. That is very good and going in the right direction, but it also shows that there are still wide margins for improvement. As I mentioned, I also made a PR 7 months ago (#998) which improved runtime performance by 2500% on large monorepos. There again, there was a large margin for improvement. My only point is that we should indeed start measuring this runtime performance, because it is now becoming the limiting factor for us internally.
Comparing npm with Yarn berry without PnP indeed gives similar results. So yes, as expected, we can compare Yarn PnP vs Yarn node_modules, and we get:
@remorses Out of curiosity, where does
It comes from
It is in
@remorses Indeed, this algorithm is very inefficient because of
Regarding the initial topic: yep, we currently track install times but not run times. It would be a good idea to have a dashboard for runtimes as well 👍 Also, we should put the dashboard link somewhere on the website 🤔
I updated my benchmark repo to start running benchmarks on monorepos at runtime: https://github.com/crubier/yarn-benchmarks. Here are some quick results:
That's a 4x performance penalty for doing the same thing, but in a monorepo (without importing dependencies from the monorepo, just by being in the monorepo), and using PnP (because it doesn't work without it anyway, probably Storybook's fault again).
This seems to match what @remorses says; the 4x performance hit seems consistent with the 72% of the time spent in
So the good news is that this particular problem seems fixable! I think I can help, but the file in question lacks documentation, so it's hard to know where to start.
@larixer Do you intend to open a PR improving the performance of
@paul-soporan To be fair, I'm busy now with
This worries me a bit :)
Then
The output of
Storybook doesn't always work in monorepos; it depends on the hoisting layout you get. PR here: storybookjs/storybook#11753
@larixer Indeed, removing After removing
So to recap: removing
@crubier Looks like we have another bottleneck, then.
I opened a PR for the
@crubier I think it should be all fine; this method just registers all the locations, so visiting graph nodes exactly once should be enough. You should also run
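The "visit each graph node exactly once" suggestion is the standard visited-set traversal; a generic sketch (the graph shape and names are illustrative, not Yarn's actual internals):

```javascript
// Hedged sketch: traverse a dependency graph, registering each package
// location exactly once. `graph` maps a package name to its dependencies.
function visitOnce(graph, root, register) {
  const seen = new Set();
  const stack = [root];
  while (stack.length > 0) {
    const node = stack.pop();
    if (seen.has(node)) continue; // already registered: skip re-visiting
    seen.add(node);
    register(node);
    for (const dep of graph[node] || []) stack.push(dep);
  }
  return seen.size;
}

// A diamond-shaped graph: without the `seen` set, `d` would be visited twice
// (once via `b` and once via `c`), and worse for deeper graphs.
const graph = { a: ['b', 'c'], b: ['d'], c: ['d'], d: [] };
const visited = [];
visitOnce(graph, 'a', (n) => visited.push(n));
console.log(visited.length); // 4 — each node registered once
```

Without deduplication, the number of visits grows with the number of paths through the graph rather than the number of nodes, which is exactly the kind of blow-up large monorepos hit.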
@crubier One observation that bothers me is that from the same set of source files, Storybook produces output files that differ significantly in size! Compare the output files for
I'm curious why
This is something I have noticed on my internal monorepo as well. Here is the webpack bundle analyzer output, for yarn-node-modules and for yarn-pnp:
The problem seems to be that
@merceyz, since you seem to know a bit about Storybook, maybe you have an idea? Again, this is a default create-react-app + default Storybook.
@crubier I see a problem with Storybook/Webpack not being able to delegate to sb_dll/*_dll.js files under PnP, and thus recompiling all the stuff it needs from node_modules again and again. To check this you can create
Basically: turn off all optimizations, switch to development mode, and generate a single bundle.

```diff
-/***/ "./node_modules/@emotion/core/dist/core.browser.esm.js":
-/*!************************************************************************************************!*\
-  !*** delegated ./@emotion/core/dist/core.browser.esm.js from dll-reference storybook_docs_dll ***!
-  \************************************************************************************************/
+/***/ "./.yarn/$$virtual/@emotion-core-virtual-283d648f44/0/cache/@emotion-core-npm-10.0.35-565218757c-5
+/*!*****************************************************************************************************
+  !*** ./.yarn/$$virtual/@emotion-core-virtual-283d648f44/0/cache/@emotion-core-npm-10.0.35-565218757c-5
+  \*****************************************************************************************************
```

E.g. the node_modules bundle delegates many functions to DLLs, but the PnP bundle compiles all of them into the bundle itself.
@larixer Oh, Storybook is distributing prebuilt DLLs:
Inside the DLLs you can see:
This is a suboptimal decision, because this way the DLLs depend on the hoisting layout, so they will be reused to a lesser degree by any package manager different from the one used to build the DLLs. Even if the package manager is the same but the version differs, the hoisting layout will be different. And of course the node_modules prebuilt DLLs won't be used by PnP at all. They should at least have an option to build the DLLs on the user's side, but functionality like this will probably not be added anytime soon. You can try disabling DLL usage via
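To illustrate why prebuilt DLLs are tied to the hoisting layout: a webpack DLL manifest keys each module by the request path it was built from, so a module that lives at a PnP-style virtual path can never match a manifest built against node_modules. A simplified sketch (the manifest content is made up for illustration, not taken from Storybook's actual DLLs):

```javascript
// Hedged sketch of a DLL manifest, in the shape webpack's DllReferencePlugin
// consumes: `content` maps module requests (as resolved at DLL build time,
// i.e. node_modules-relative paths) to module ids inside the prebuilt bundle.
const manifest = {
  name: 'storybook_docs_dll', // hypothetical DLL name for this example
  content: {
    './@emotion/core/dist/core.browser.esm.js': { id: 1, buildMeta: {} },
  },
};

// A request delegates to the DLL only if it matches a manifest key exactly:
function delegatesToDll(request) {
  return Object.prototype.hasOwnProperty.call(manifest.content, request);
}

console.log(delegatesToDll('./@emotion/core/dist/core.browser.esm.js')); // true
// PnP resolves the same package to a virtual/zip path, which never matches:
console.log(delegatesToDll('./.yarn/$$virtual/@emotion-core-virtual-283d648f44/0/...')); // false
```

This is why the PnP build recompiles everything instead of delegating: every lookup against the prebuilt manifest misses.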
@larixer I just tried, and using
It does not make the
It's a novel caching approach; I wonder if it could be made compatible with PnP, perhaps with some extra Webpack config 🤔 It's probably worth opening an issue on the Storybook repository rather than here, since it's not actionable on our side (let's keep this issue about building a runtime perf dashboard).
@crubier
@arcanis To go back to the idea of a runtime benchmark, here are a few requirements for a good benchmark in my opinion, which I started to implement in my repo https://github.com/crubier/yarn-benchmarks:
The tests above should allow benchmarking the different kinds of overhead we have seen:
Besides the PnP resolution, I would also love to see a benchmark for the patched fs layer.
**What's the problem this PR addresses?**

Cache compression doesn't seem the best option nowadays:
- It adds significant time to installs
- It adds a non-negligible runtime overhead (partially related: #1817)
- It prevents Git from diffing archives when they're modified

Its main advantage is for some amazingly large packages, where storing them inside a Git repository would be impossible (GitHub has a 600MB file limit). But since we changed the cache default to `enableGlobalCache: true`, this isn't relevant by default, so the cache compression should be set to zero.

One potential issue is that it may give the impression that Yarn became worse for global-cache users. We don't believe that's the case (Git can delta uncompressed archives to some extent, making it less expensive to upgrade from one package version to another), but to the average eye it may not look that way.

**How did you fix it?**

This diff sets the new default compression setting to 0. To avoid breaking projects (or at least to avoid making the migration more difficult), I added a rule to automatically set `compressionLevel` (and `enableGlobalCache`) to their 3.x default values when running `yarn install` in a 3.x project. When users are ready, they can remove these settings and re-run the install.

I kept the current compression level on our repository to avoid changing all archives again. I think we'll have to consider purging the repository in some form to migrate it to `compressionLevel: 0`.

**Checklist**

- [x] I have read the [Contributing Guide](https://yarnpkg.com/advanced/contributing).
- [x] I have set the packages that need to be released for my changes to be effective.
- [x] I will check that all automated PR checks pass before the PR gets reviewed.
Describe the user story
I saw that the yarn `pnp.js` file extends some Node.js native functionality like `require`, `path`, and `fs` functions.

To prevent performance degradation, we should create benchmarks and stress tests that measure the runtime overhead, for example creating a bundle with webpack of a big project with at least 50 packages.

Other performance degradations could be in glob searches.

I am also scared that some of these reimplemented functions have time complexity that grows with the number of workspace packages you have; some of us have workspaces with 200 packages.
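The complexity worry can be made concrete by contrasting a per-resolution linear scan over workspaces with a precomputed index (the workspace names and shapes below are made up for illustration):

```javascript
// Hedged sketch: why per-resolution cost must not grow with workspace count.

// Linear scan: O(n) per lookup — 200 workspaces means up to 200 comparisons
// on every single resolution, which multiplies across a whole build.
function findWorkspaceLinear(workspaces, name) {
  return workspaces.find((w) => w.name === name) || null;
}

// Map-based: O(1) per lookup regardless of workspace count, after a one-time
// O(n) indexing pass at startup.
function buildIndex(workspaces) {
  return new Map(workspaces.map((w) => [w.name, w]));
}

const workspaces = Array.from({ length: 200 }, (_, i) => ({ name: `pkg-${i}` }));
const index = buildIndex(workspaces);

console.log(findWorkspaceLinear(workspaces, 'pkg-199').name); // 'pkg-199'
console.log(index.get('pkg-199').name); // 'pkg-199'
```

Both return the same answer; the difference only shows up as workspace count and resolution volume grow, which is exactly what a runtime benchmark on large monorepos would catch.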
Describe the solution you'd like
Implement a benchmark to measure the runtime overhead
Describe the drawbacks of your solution
Describe alternatives you've considered
Additional context
It is still not very clear to me why `yarn` reimplements `fs`; it would be cool to read more about it.