perf(gatsby-plugin-mdx): prevent babel parse step at sourcing time #25437
Merged
Conversation
gatsbot (bot) added the status: triage needed label on Jul 1, 2020
pvdz removed the status: triage needed label on Jul 1, 2020
pvdz force-pushed the skip-mdx-babel-once branch 3 times, most recently from 10daace to c424102 on July 6, 2020
Your pull request can be previewed in Gatsby Cloud: https://build-fcca516d-a972-47c2-87c6-e2a80c0f9051.staging-previews.gtsb.io
The mdx plugin was running its default Babel parsing step every time it was called. At sourcing time it is only called to retrieve the import bindings, and for that we can be much faster by manually processing the import statements. That is what this change does. In a very simple, flat, no-image mdx benchmark, this change cuts sourcing time in half.
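The PR diff itself is not reproduced on this page. As a rough, hypothetical sketch of the idea (not the plugin's actual code), the import bindings can be recovered by scanning the top-level import lines of the MDX source instead of invoking a full Babel parse:

```javascript
// Hypothetical sketch: recover import bindings from MDX source by
// scanning import lines, instead of running a full Babel parse.
// Handles only default imports and simple named imports.
function extractImportBindings(mdxSource) {
  const bindings = [];
  for (const line of mdxSource.split("\n")) {
    const match = line.match(/^import\s+(.+?)\s+from\s+['"]([^'"]+)['"]/);
    if (!match) continue;
    const [, clause, source] = match;
    if (clause.startsWith("{")) {
      // Named imports: split the clause and honor `as` renames.
      for (const name of clause.slice(1, -1).split(",")) {
        const parts = name.trim().split(/\s+as\s+/);
        bindings.push({ name: parts[parts.length - 1], source });
      }
    } else {
      // Default import: the clause is the binding name itself.
      bindings.push({ name: clause.trim(), source });
    }
  }
  return bindings;
}

const mdx = `import Layout from "../components/layout"
import { Box, Grid as G } from "theme-ui"

# Hello
`;
console.log(extractImportBindings(mdx));
// yields bindings for Layout, Box, and G with their module sources
```

A line scan like this is far cheaper than constructing a full AST, which is why skipping the parse at sourcing time pays off; the real plugin handles more import forms than this sketch does.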
pieh reviewed on Jul 7, 2020
Confirmed a canary to build (https://github.com/gatsbyjs/gatsby/pull/25569/checks?check_run_id=845921135)
johno approved these changes on Jul 7, 2020
😻
laurieontech pushed a commit that referenced this pull request on Jul 16, 2020
This change reduces baseline mdx processing time by roughly 30% by skipping a Babel parse step in one of the bootstrap steps.
Benchmark numbers
These are the numbers for a simple, flat, locally sourced, zero-image mdx benchmark where each mdx file is automatically generated and looks like this:
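The generated file itself did not survive extraction. Based on the description below (a header, an import, a link, and two paragraphs of pre-generated garble), a generated page presumably looks something like this, with hypothetical component names and paths:

```mdx
import Layout from "../components/layout"

# Page 42

Pre-generated random garble standing in for a realistic first
paragraph, created ahead of time so it is not part of the timings.

See [another page](/page-43/) for a link.

A second paragraph of pre-generated random garble, also excluded
from the timed portion of the benchmark.
```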
So there's a header, an import, a link, and two paragraphs of random garble (pre-generated, not part of the timings). The benchmark can run with an arbitrary number of pages (`N=1000 yarn bench`), so that's what we do for 1k, 2k, 4k, 8k, and 16k pages. By doubling every step we can detect logarithmic perf problems more easily. While the baseline Gatsby pipeline easily holds for a million pages, for mdx that ceiling is currently a lot lower (that's what I'm trying to improve :p).

The benchmark runs on a dedicated Intel NUC, a headless Linux machine set up specifically for benchmarking, so numbers ought to be fairly consistent relative to other runs.
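Assuming the benchmark reads the page count from the `N` environment variable as the `N=1000 yarn bench` invocation suggests, the five doubling runs can be scripted as:

```shell
# Run the mdx benchmark at doubling page counts (1k..16k).
# Assumes `yarn bench` reads the page count from the N env var.
for N in 1000 2000 4000 8000 16000; do
  echo "running benchmark with N=$N"
  # N="$N" yarn bench   # actual invocation, omitted in this sketch
done
```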
Data
Master is currently roughly on [email protected] and [email protected] (plus whatever is unpublished).
The raw data for both runs can be found in two gists: