perf(turbo-tasks): Call `.shrink_to_fit()` on common collection types when constructing a cell (#72113)

Cell contents are immutable once constructed, so there's no chance that they'll grow in size again. Common collections can be shrunk to avoid storing empty spare capacity in this case (note: if they're already correctly sized, `shrink_to_fit` bails out early).

**Result:** This gives a ~1.4% decrease in top-line peak memory consumption, for a theoretical CPU/Wall time cost that's too small to measure.

**Inspiration:** The inspiration for this was vercel/turborepo#2873, which decreased task storage (not the top-line memory usage?) by ~14%, vercel/turborepo#8657, and other similar optimization PRs.

## Additional Opportunities

- There may be more places (e.g. persistent storage deserialization) where a cell's `SharedReference` is constructed that are not currently captured by this.
- Depending on the library used, deserialization may already construct exact-sized collections.
- As an additional example, manually constructing a `ReadRef` and converting it into a cell skips this optimization because `ReadRef::cell` internally uses the type-erased shared-reference `raw_cell` API, which is incompatible with this optimization. We could special-case that in the `ReadRef::new_owned` constructor (not in `ReadRef::new_arc` though), but nobody should be manually constructing `ReadRef`s.
- We still don't use `shrink_to_fit` on `RcStr` types. Some of these are extended in place (when they have a refcount of 1) with `RcStr::map`, so we probably don't want to be too aggressive about this, to avoid `O(n^2)` time complexity blowups.
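
The trade-off between shrinking once and shrinking eagerly can be illustrated with plain `std` types. This is a minimal sketch under stated assumptions, not code from this PR: `build_then_shrink` and `shrink_every_step` are hypothetical helpers, and `String` stands in for `RcStr`. It counts how many appends find no spare capacity (and so force the buffer to grow) when shrinking once at the end versus after every in-place extension.

```rust
/// Appends every part, then shrinks once at the end, similar to shrinking
/// contents a single time when a cell is constructed.
/// Returns the string and the number of appends that found no spare capacity.
fn build_then_shrink(parts: &[&str]) -> (String, usize) {
    let mut s = String::new();
    let mut grows = 0;
    for part in parts {
        if s.capacity() - s.len() < part.len() {
            grows += 1; // this append has to grow the buffer
        }
        s.push_str(part);
    }
    s.shrink_to_fit();
    (s, grows)
}

/// Same as above, but shrinks after every append. This models being "too
/// aggressive" on a value that is still being extended in place.
fn shrink_every_step(parts: &[&str]) -> (String, usize) {
    let mut s = String::new();
    let mut grows = 0;
    for part in parts {
        if s.capacity() - s.len() < part.len() {
            grows += 1;
        }
        s.push_str(part);
        s.shrink_to_fit(); // throws away the amortized growth headroom
    }
    (s, grows)
}

fn main() {
    let parts = vec!["ab"; 1000];
    let (once, grows_once) = build_then_shrink(&parts);
    let (each, grows_each) = shrink_every_step(&parts);
    assert_eq!(once, each);
    println!("shrink once at the end: {grows_once} grow events");
    println!("shrink after each step: {grows_each} grow events");
}
```

Each grow event may copy the existing contents into a new allocation, so growing on nearly every append is what pushes the total work toward `O(n^2)`.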

## Memory Benchmark Setup

```bash
cd ~/next.js
cargo run -p next-build-test --release -- generate ~/shadcn-ui/apps/www/ > ~/shadcn-ui/apps/www/project_options.json
pnpm pack-next --project ~/shadcn-ui/apps/www/
```

```bash
cd ~/shadcn-ui
pnpm i
cd ~/shadcn-ui/apps/www/
heaptrack ~/next.js/target/release/next-build-test run sequential 1 1 '/sink'
heaptrack --analyze ~/shadcn-ui/apps/www/heaptrack.next-build-test.3604648.zst
```

### Memory Before (canary branch)

First Run:

```
peak heap memory consumption: 3.23G
peak RSS (including heaptrack overhead): 4.75G
```

Second Run:

```
peak heap memory consumption: 3.23G
peak RSS (including heaptrack overhead): 4.75G
```

### Memory After (this PR)

First Run:

```
peak heap memory consumption: 3.18G
peak RSS (including heaptrack overhead): 4.74G
```

Second Run:

```
peak heap memory consumption: 3.19G
peak RSS (including heaptrack overhead): 4.73G
```

This is about a 1.4% decrease in top-line memory consumption.

## Wall Time with `hyperfine` (Counter-Metric)

This is theoretically a time-memory tradeoff, as we'll spend some time `memcpy`ing things into smaller allocations, though in some cases reducing memory usage can improve cache locality, so it's not always obvious.

```
hyperfine --warmup 3 -r 30 --time-unit millisecond '~/next.js/target/release/next-build-test run sequential 1 1 /sink'
```

This benchmark is slow and takes about 30 minutes to run.

Before:

```
Benchmark 1: ~/next.js/target/release/next-build-test run sequential 1 1 /sink
  Time (mean ± σ):     56387.5 ms ± 212.6 ms    [User: 107807.5 ms, System: 9509.8 ms]
  Range (min … max):   55934.4 ms … 56872.9 ms    30 runs
```

After:

```
Benchmark 1: ~/next.js/target/release/next-build-test run sequential 1 1 /sink
  Time (mean ± σ):     56020.9 ms ± 235.4 ms    [User: 107483.8 ms, System: 9371.8 ms]
  Range (min … max):   55478.2 ms … 56563.6 ms    30 runs
```

This is a ~0.65% *reduction* in wall time. This is small enough (<2 standard deviations) to likely just be noise.
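
The percentages quoted in the benchmark sections can be re-derived from the raw numbers. The following is a small sketch of that arithmetic; `percent_decrease` and `mean` are hypothetical helper names, and the inputs are copied from the heaptrack and hyperfine output in this message.

```rust
/// Relative decrease from `before` to `after`, in percent.
fn percent_decrease(before: f64, after: f64) -> f64 {
    (before - after) / before * 100.0
}

/// Arithmetic mean of a non-empty slice.
fn mean(xs: &[f64]) -> f64 {
    xs.iter().sum::<f64>() / xs.len() as f64
}

fn main() {
    // Peak heap memory consumption (in "G" as printed by heaptrack), two runs each.
    let mem = percent_decrease(mean(&[3.23, 3.23]), mean(&[3.18, 3.19]));

    // Mean wall time and standard deviation of the "before" runs, in ms.
    let (before_ms, after_ms, before_sigma) = (56387.5, 56020.9, 212.6);
    let wall = percent_decrease(before_ms, after_ms);
    let sigmas = (before_ms - after_ms) / before_sigma;

    println!("peak heap memory: {mem:.2}% lower");
    println!("wall time: {wall:.2}% lower ({sigmas:.2} standard deviations)");
}
```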

## Wall Time with `turbopack-bench` (Counter-Metric)

```
cargo bench -p turbopack-bench -p turbopack-cli -- "hmr_to_eval/Turbopack CSR"
```

Gives:

```
bench_hmr_to_eval/Turbopack CSR/1000 modules
                        time:   [15.123 ms 15.208 ms 15.343 ms]
                        change: [-0.8471% +0.4882% +1.9719%] (p = 0.55 > 0.05)
                        No change in performance detected.
```

Using https://github.com/bgw/benchmark-scripts/

In practice, it's not really possible to measure changes in wall time <1%, so this is within "noise" territory (as noted in the criterion output).

Closes PACK-3361
Showing 17 changed files with 357 additions and 54 deletions.
```diff
@@ -0,0 +1 @@
+../../turbo-tasks-testing/tests/shrink_to_fit.rs
```
35 changes: 33 additions & 2 deletions
turbopack/crates/turbo-tasks-macros-shared/src/primitive_input.rs
```diff
@@ -1,16 +1,47 @@
+use proc_macro2::Span;
 use syn::{
     parse::{Parse, ParseStream},
-    Result, Type,
+    punctuated::Punctuated,
+    spanned::Spanned,
+    Meta, Result, Token, Type,
 };
 
 #[derive(Debug)]
 pub struct PrimitiveInput {
     pub ty: Type,
+    pub manual_shrink_to_fit: Option<Span>,
 }
 
 impl Parse for PrimitiveInput {
     fn parse(input: ParseStream) -> Result<Self> {
         let ty: Type = input.parse()?;
-        Ok(PrimitiveInput { ty })
+        let mut parsed_input = PrimitiveInput {
+            ty,
+            manual_shrink_to_fit: None,
+        };
+        if input.parse::<Option<Token![,]>>()?.is_some() {
+            let punctuated: Punctuated<Meta, Token![,]> = input.parse_terminated(Meta::parse)?;
+            for meta in punctuated {
+                match (
+                    meta.path()
+                        .get_ident()
+                        .map(ToString::to_string)
+                        .as_deref()
+                        .unwrap_or_default(),
+                    &meta,
+                ) {
+                    ("manual_shrink_to_fit", Meta::Path(_)) => {
+                        parsed_input.manual_shrink_to_fit = Some(meta.span())
+                    }
+                    (_, meta) => {
+                        return Err(syn::Error::new_spanned(
+                            meta,
+                            "unexpected token, expected: \"manual_shrink_to_fit\"",
+                        ))
+                    }
+                }
+            }
+        }
+        Ok(parsed_input)
     }
 }
```
53 changes: 53 additions & 0 deletions
turbopack/crates/turbo-tasks-macros/src/derive/shrink_to_fit_macro.rs
```diff
@@ -0,0 +1,53 @@
+use proc_macro::TokenStream;
+use proc_macro2::TokenStream as TokenStream2;
+use quote::quote;
+use syn::{parse_macro_input, DeriveInput, FieldsNamed, FieldsUnnamed};
+use turbo_tasks_macros_shared::{generate_exhaustive_destructuring, match_expansion};
+
+pub fn derive_shrink_to_fit(input: TokenStream) -> TokenStream {
+    let derive_input = parse_macro_input!(input as DeriveInput);
+    let ident = &derive_input.ident;
+    let (impl_generics, ty_generics, where_clause) = derive_input.generics.split_for_impl();
+
+    let shrink_items = match_expansion(&derive_input, &shrink_named, &shrink_unnamed, &shrink_unit);
+    quote! {
+        impl #impl_generics turbo_tasks::ShrinkToFit for #ident #ty_generics #where_clause {
+            fn shrink_to_fit(&mut self) {
+                #shrink_items
+            }
+        }
+    }
+    .into()
+}
+
+fn shrink_named(_ident: TokenStream2, fields: &FieldsNamed) -> (TokenStream2, TokenStream2) {
+    let (captures, fields_idents) = generate_exhaustive_destructuring(fields.named.iter());
+    (
+        captures,
+        quote! {
+            {#(
+                turbo_tasks::macro_helpers::ShrinkToFitDerefSpecialization::new(
+                    #fields_idents,
+                ).shrink_to_fit();
+            )*}
+        },
+    )
+}
+
+fn shrink_unnamed(_ident: TokenStream2, fields: &FieldsUnnamed) -> (TokenStream2, TokenStream2) {
+    let (captures, fields_idents) = generate_exhaustive_destructuring(fields.unnamed.iter());
+    (
+        captures,
+        quote! {
+            {#(
+                turbo_tasks::macro_helpers::ShrinkToFitDerefSpecialization::new(
+                    #fields_idents,
+                ).shrink_to_fit();
+            )*}
+        },
+    )
+}
+
+fn shrink_unit(_ident: TokenStream2) -> TokenStream2 {
+    quote! { { } }
+}
```
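
To make the shape of the generated code easier to picture, here is a hand-written approximation of what the derive might produce for a struct with named fields. It is a sketch under stated assumptions, not the macro's actual output: the local `ShrinkToFit` trait is a stand-in for `turbo_tasks::ShrinkToFit`, `Module` is a hypothetical type, and the sketch calls the trait directly instead of going through `ShrinkToFitDerefSpecialization`, whose definition is not part of this excerpt.

```rust
/// Local stand-in for `turbo_tasks::ShrinkToFit` (an assumption for this sketch).
trait ShrinkToFit {
    fn shrink_to_fit(&mut self);
}

impl<T> ShrinkToFit for Vec<T> {
    fn shrink_to_fit(&mut self) {
        // Path syntax resolves to the inherent `Vec` method, not this trait method.
        Vec::shrink_to_fit(self);
    }
}

impl ShrinkToFit for String {
    fn shrink_to_fit(&mut self) {
        String::shrink_to_fit(self);
    }
}

/// Hypothetical value type with two growable fields.
struct Module {
    imports: Vec<String>,
    source: String,
}

// Hand-written equivalent of what a derive along these lines could emit.
impl ShrinkToFit for Module {
    fn shrink_to_fit(&mut self) {
        // Exhaustive destructuring: adding a field to `Module` without
        // listing it here becomes a compile error.
        let Module { imports, source } = self;
        ShrinkToFit::shrink_to_fit(imports);
        ShrinkToFit::shrink_to_fit(source);
    }
}

fn main() {
    let mut imports = Vec::with_capacity(100);
    imports.extend(["a".to_string(), "b".to_string(), "c".to_string()]);
    let mut source = String::with_capacity(64);
    source.push_str("export {}");

    let mut module = Module { imports, source };
    module.shrink_to_fit();
    println!(
        "imports capacity: {}, source capacity: {}",
        module.imports.capacity(),
        module.source.capacity()
    );
}
```

Note that this sketch only shrinks the outer `Vec`; it does not recurse into the `String` elements.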
```diff
@@ -0,0 +1 @@
+../../turbo-tasks-testing/tests/shrink_to_fit.rs
```
27 changes: 27 additions & 0 deletions
turbopack/crates/turbo-tasks-testing/tests/shrink_to_fit.rs
```diff
@@ -0,0 +1,27 @@
+#![feature(arbitrary_self_types)]
+#![feature(arbitrary_self_types_pointers)]
+#![allow(clippy::needless_return)] // tokio macro-generated code doesn't respect this
+
+use anyhow::Result;
+use turbo_tasks::Vc;
+use turbo_tasks_testing::{register, run, Registration};
+
+static REGISTRATION: Registration = register!();
+
+#[turbo_tasks::value(transparent)]
+struct Wrapper(Vec<u32>);
+
+#[tokio::test]
+async fn test_shrink_to_fit() -> Result<()> {
+    run(&REGISTRATION, || async {
+        // `Vec::shrink_to_fit` is implicitly called when a cell is constructed.
+        let a: Vc<Wrapper> = Vc::cell(Vec::with_capacity(100));
+        assert_eq!(a.await?.capacity(), 0);
+
+        let b: Vc<Wrapper> = Vc::local_cell(Vec::with_capacity(100));
+        assert_eq!(b.await?.capacity(), 0);
+
+        Ok(())
+    })
+    .await
+}
```
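
The behavior this test checks can be mimicked without turbo-tasks. The following is a minimal sketch, assuming a hypothetical `SketchCell` wrapper around `Arc` that is not part of the turbo-tasks API: the constructor shrinks the collection once, and afterwards the contents are only reachable through a shared reference, so they cannot grow again.

```rust
use std::sync::Arc;

/// Hypothetical immutable cell used only for illustration.
struct SketchCell<T>(Arc<Vec<T>>);

impl<T> SketchCell<T> {
    /// Shrinks the contents once, then freezes them behind an `Arc`.
    fn new(mut contents: Vec<T>) -> Self {
        contents.shrink_to_fit();
        SketchCell(Arc::new(contents))
    }

    /// Read-only access; there is no way to push through this handle.
    fn get(&self) -> &Vec<T> {
        &self.0
    }
}

fn main() {
    // Mirrors the test: an empty `Vec` with spare capacity ends up with none.
    let empty = SketchCell::new(Vec::<u32>::with_capacity(100));
    assert_eq!(empty.get().capacity(), 0);

    // A non-empty `Vec` keeps its elements and drops the unused headroom.
    let mut v = Vec::with_capacity(100);
    v.extend([1u32, 2, 3]);
    let cell = SketchCell::new(v);
    assert_eq!(cell.get().as_slice(), &[1, 2, 3]);
    println!("capacity after construction: {}", cell.get().capacity());
}
```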