Adjacency list optimizations #9444

Merged
merged 23 commits into from
Feb 13, 2024


@lettertwo lettertwo commented Dec 15, 2023

EDIT: Added docs, too! They're in the PR, but easier to read on the branch: https://github.com/parcel-bundler/parcel/blob/adjacency-list-optimizations/docs/AdjacencyList.md


By making two adjustments to Parcel’s AdjacencyList:

  • the memory footprint of Parcel’s three biggest graphs is reduced by ~52%
  • writes are faster by ~5%

For a real world, very large app, this amounts to a ~800MB reduction in size with no regression in startup, build, or shutdown times.

Background

AdjacencyList is already highly optimized for avoiding overhead in message passing. In Parcel, graphs are used by multiple threads, so this implementation of the AdjacencyList stores data external to the JS heap, allowing it to be shared across threads without incurring the overhead of serializing and deserializing what is often a large number of edges. This had a big impact on Parcel’s runtime characteristics (see #6922).
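
As a rough illustration of the sharing idea (not Parcel's actual record layout), edge data can be kept in a SharedArrayBuffer and read through typed array views. The `createEdgeStore` helper and the three-field edge record below are hypothetical.

```javascript
// Hypothetical layout: each edge record is three 32-bit fields [from, to, type].
const FIELDS_PER_EDGE = 3;

// Hypothetical helper, not part of Parcel's API.
function createEdgeStore(edgeCapacity) {
  // A SharedArrayBuffer can be posted to a worker thread without being
  // copied or serialized; both threads then see the same memory.
  const buffer = new SharedArrayBuffer(
    edgeCapacity * FIELDS_PER_EDGE * Uint32Array.BYTES_PER_ELEMENT,
  );
  return {buffer, edges: new Uint32Array(buffer)};
}

const store = createEdgeStore(4);
store.edges.set([0, 1, 1], 0); // write edge 0: node 0 -> node 1, type 1

// A second view over the same buffer (as a worker would create after
// receiving `store.buffer`) observes the write without any copying.
const otherView = new Uint32Array(store.buffer);
```

In a real multi-threaded setup the buffer would be handed to a worker via `postMessage`, and only the small buffer handle crosses the thread boundary.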

However, it’s not all roses; some suboptimal behaviors have been observed, particularly at scale:

  • AdjacencyList uses a lot of memory
    • even though this memory is shared, it’s still surprisingly high for how efficiently the data is stored
  • serialized data is very large on disk
    • in Parcel’s cache, AdjacencyList data makes up a sizable percentage of the total cache size

It turns out that these are two symptoms of the same disease: premature optimization.

Optimization 1: the load factor

Today’s version of AdjacencyList automatically resizes itself as nodes and edges are added. This resizing event occurs when the ratio of edges to capacity meets or exceeds a constant term known as the load factor.

The current implementation uses a load factor of 0.7, which is meant to trigger a resize sooner than absolutely necessary. The intended benefit of this optimization is that collisions in the hashmap are less likely due to there always being at least 30% capacity available.

In hindsight, this may have been premature. Maintaining that much excess capacity at scale is quite expensive, memory-wise. It may yield some overall benefit by amortizing the cost of collisions (as evident in the duration being shorter for really large graphs; see permutation B in the benchmarks), but maximizing the load (a load factor of 1) has a big impact (~833 MB!) on memory footprint (and consequently, cache size).
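
The resize trigger described above can be sketched as a single comparison. This is a minimal sketch; `shouldResize` is a hypothetical name and is not taken from the AdjacencyList source.

```javascript
// Hypothetical helper: a resize is triggered once the ratio of edges to
// capacity meets or exceeds the load factor.
function shouldResize(edgeCount, edgeCapacity, loadFactor) {
  return edgeCount / edgeCapacity >= loadFactor;
}

// With a load factor of 0.7, a table with room for 1000 edges resizes at
// 700 edges, so at least 30% of the capacity is always unused.
const resizesEarly = shouldResize(700, 1000, 0.7);

// With a load factor of 1, the same table is not resized until it is full.
const resizesWhenFull = shouldResize(1000, 1000, 1);
const notYetFull = shouldResize(999, 1000, 1);
```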

Optimization 2: Right-sizing buckets

In the version of AdjacencyList shipped today, space is allocated to accommodate a bucket size of 2. The intention here was that we can avoid excessive resizing by allowing a higher number of collisions in the underlying hashmap before running out of space.

This optimization has also proven to be premature; it turns out that not allocating any extra space to accommodate collisions at all (a bucket size of 1), when combined with a load factor of 1, has roughly equivalent outcomes to the defaults, while still maintaining most of the size benefit of just adjusting the load factor alone. In fact, the benchmarks indicate that these two adjustments yield wins in both memory footprint and read/write/resize performance!
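
As a back-of-the-envelope model (an assumption for illustration, not a measurement and not the actual allocation code), suppose the space reserved per slot scales with the bucket size and a resize happens at the load factor. The best load the table can reach is then roughly the load factor divided by the bucket size.

```javascript
// Hypothetical model: the highest fraction of allocated space that can hold
// data before a resize is triggered.
function maxLoad(loadFactor, bucketSize) {
  return loadFactor / bucketSize;
}

// Defaults described above (load factor 0.7, bucket size 2): well under half
// of the allocated space can ever be in use.
const defaultMaxLoad = maxLoad(0.7, 2);

// Load factor 1 with bucket size 1: all allocated space can be in use.
const tunedMaxLoad = maxLoad(1, 1);
```

Under this model the defaults cap out below 50% load, which is consistent with the benchmark observation later in this description that load was below 50% before the changes.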

The benchmarks

The approach for these benchmarks starts with instrumenting the AdjacencyList to record every write operation that is applied during a production build of a large real world app. The resulting recordings are then played back using a differently instrumented version of AdjacencyList that allows tweaking the parameters of the list’s allocation behaviors.
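
The record-and-replay approach can be sketched as follows. This is a simplified illustration: `FakeList`, `recordWrites`, and `replay` are hypothetical stand-ins, not the instrumentation used for the benchmarks.

```javascript
// Stand-in for an instrumented list; only the write methods matter here.
class FakeList {
  constructor() {
    this.edges = new Set();
  }
  addEdge(from, to, type = 1) {
    this.edges.add(`${from}:${to}:${type}`);
  }
  removeEdge(from, to, type = 1) {
    this.edges.delete(`${from}:${to}:${type}`);
  }
}

// Wrap a list so that every write is appended to `log` before being applied.
function recordWrites(list, log = []) {
  const record = op => (...args) => {
    log.push([op, ...args]);
    return list[op](...args);
  };
  return {log, addEdge: record('addEdge'), removeEdge: record('removeEdge')};
}

// Apply a recorded log, in order, to another (differently configured) list.
function replay(log, list) {
  for (const [op, ...args] of log) list[op](...args);
  return list;
}

const recorder = recordWrites(new FakeList());
recorder.addEdge(0, 1);
recorder.addEdge(1, 2);
recorder.removeEdge(0, 1);

const replayed = replay(recorder.log, new FakeList());
```

Replaying the same log against lists constructed with different parameters keeps the workload identical, so differences in size and duration can be attributed to the parameters alone.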

The below charts show the impact of these changes on the AssetGraph, the BundleGraph, and the RequestGraph.

  • Permutation A is the default parameters, which reflects what is in production today.
  • Permutation B shows the effect of adjusting the load factor to 1.
  • Permutation C shows the effect of combining a bucket size of 1 with the adjusted load factor.

| AssetGraph | BundleGraph | RequestGraph |
| --- | --- | --- |
| nodes: 521,958 (0 unconnected) | nodes: 481,920 (493,589 unconnected) | nodes: 1,381,969 (2,571,265 unconnected) |
| edges: 806,717 (0 deleted) | edges: 6,139,973 (6,462 deleted) | edges: 12,068,253 (0 deleted) |
| [image: Screenshot 2023-12-14 at 3 14 13 PM] | [image: Screenshot 2023-12-14 at 3 14 21 PM] | [image: Screenshot 2023-12-14 at 3 14 28 PM] |

Of particular interest in these results:

  • The load (the ratio of data to capacity) jumps from below 50% across all three graphs to above 90%. This means that, with these changes, almost all of the allocated space is being used for all three graphs (whereas before, more than 50% was going unused).
  • The collision rate remains nearly identical with both changes applied.

Tweaking the resize curve

The AdjacencyList resizes the capacities for nodes and edges differently. For nodes, it simply doubles the capacity at each resize. For edges, it resizes more aggressively early on and then progressively less aggressively, with the growth tapering linearly until an inflection point, after which it also just doubles the capacity at each resize.

These parameters are now exposed for tweaking, but in my testing so far, I haven’t found any combo that is strictly better than the current defaults, so I did not change them.
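
One way to read the edge resize curve described above is as a growth factor that is interpolated linearly from a large value down to 2 as the capacity approaches an inflection point. The sketch below follows that reading; the parameter names and values are illustrative assumptions, not the defaults in the source.

```javascript
// Illustrative values only; assumed for this sketch.
const params = {
  minGrowFactor: 2, // growth factor at and beyond the inflection point
  maxGrowFactor: 8, // growth factor for very small capacities
  peakCapacity: 2 ** 18, // the inflection point
};

// Linear interpolation from a to b, with t clamped to [0, 1].
function lerp(a, b, t) {
  return a + (b - a) * Math.min(1, Math.max(0, t));
}

// Nodes: simply double the capacity at each resize.
function nextNodeCapacity(capacity) {
  return capacity * 2;
}

// Edges: grow aggressively while small, then taper toward doubling.
function nextEdgeCapacity(capacity, p = params) {
  const growFactor = lerp(
    p.maxGrowFactor,
    p.minGrowFactor,
    capacity / p.peakCapacity,
  );
  return Math.round(capacity * growFactor);
}

const halfway = nextEdgeCapacity(2 ** 17); // growth factor 5 at half the peak
const beyondPeak = nextEdgeCapacity(2 ** 20); // plain doubling past the peak
```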

Previously, we were allocating extra space for 'buckets'
to accommodate hash collisions, but this turns out to waste a lot of space
in large graphs.

Additionally, we are no longer allocating space for nodes ahead of time;
now, the nodes array will grow on demand, as edges are added.
This unlocks the ability to resize without creating a new intermediary AdjacencyList.
The (incorrect) assumption was that there should be the same node record count
after a resize of edges, but this is not necessarily the case; if there were
deleted edges before the resize, then there may be node records that will
also be deleted (by virtue of no longer having any edges connected to them)
as part of the resize.

@mattcompiles mattcompiles left a comment


This looks great bud. The write-up is excellent 👏 Just one comment about the lingering TODO.

packages/core/graph/src/AdjacencyList.js (outdated, resolved)
@lettertwo

I used this branch to do a full production build of a big project (the same project used to benchmark the changes), and compared it to a build of the same project using v2, and there were no differences in build output 🎉

There were differences in cache output:

| AssetGraph | BundleGraph | RequestGraph |
| --- | --- | --- |
| nodes: 447,509 (0 unconnected) | nodes: 646,833 (661,425 unconnected) | nodes: 1,234,145 (2,145,889 unconnected) |
| edges: 659,997 | edges: 5,549,467 | edges: 10,253,224 |
| [image: Screenshot 2023-12-22 at 12 52 37 PM] | [image: Screenshot 2023-12-22 at 12 50 59 PM] | [image: Screenshot 2023-12-22 at 12 46 13 PM] |

Duration here is reflecting the time spent in v8’s deserialize function (captured while loading the graphs from cache to dump their stats). Almost all of the cost belongs to the node properties, which are stored in regular JS arrays, not the AdjacencyList.
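
For reference, timing a deserialize call of this kind can be sketched with Node's `v8` module. The `timeDeserialize` helper and the small stand-in graph object are hypothetical; the real cache entries are far larger and structured differently.

```javascript
const v8 = require('node:v8');
const {performance} = require('node:perf_hooks');

// Hypothetical helper: measure the time spent in v8.deserialize.
function timeDeserialize(buffer) {
  const start = performance.now();
  const value = v8.deserialize(buffer);
  return {value, durationMs: performance.now() - start};
}

// Stand-in for a cached graph: node properties in a regular JS array,
// edge data in a typed array.
const cached = v8.serialize({
  nodes: [{id: 'a'}, {id: 'b'}],
  edges: new Uint32Array([0, 1, 1]),
});

const {value, durationMs} = timeDeserialize(cached);
```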

The size recorded here is just the size of the array buffers. For this build, the cumulative size of the graphs on disk dropped from ~3.1GB to ~2.3GB, a savings of ~750MB.

Overall, there is no apparent regression in runtime performance, and the reduction in memory footprint is sizable.

The largest graph deserialized the fastest! This is because, despite having many more nodes and edges than the other graphs, the RequestGraph has relatively few, and simpler, node properties.


@devongovett devongovett left a comment


This is great!

1[Node 1] -- incoming --> a[[edge a]]
1[Node 1] -- incomingReverse --> a[[edge a]]
1[Node 1] -- outgoing --> c[[edge c]]
1[Node 1] -- outgoingReverse --> c[[edge c]]

Should these reverse arrows be pointing the other way? They look the same as the non-reversed arrows right now. Did I misunderstand something?


Is this any better?

graph LR
  subgraph 0[Node 0]
    direction LR
    0o([outgoing]) --- 0oa[[a]] <--> 0ob[[b]] --- 0or([outgoingReverse])
  end

  subgraph 1[Node 1]
    direction LR
    1i([incoming]) --- 1ia[[a]] --- 1ir([incomingReverse])
    1o([outgoing]) --- 1oc[[c]] --- 1or([outgoingReverse])
  end

  subgraph 2[Node 2]
    direction LR
    2i([incoming]) --- 2ib[[b]] <--> 2ic[[c]] --- 2ir([incomingReverse])
  end

na41 -- first out --> ea31
na41 -- last out --> ea31
```


whoa

packages/core/graph/src/AdjacencyList.js (outdated, resolved)
* upstream/v2: (22 commits)
  Add source map support to the inline-require optimizer (#9511)
  [Web Extension] Add content script world property to manifest schema validation (#9510)
  feat: add getCurrentPackageManager (#9505)
  Default Bundler Contributor Notes (#9488)
  rename parentAsset to root for msb config and remove unstable (#9486)
  Macro errors -> v2 (#9501)
  Statically evaluate constants referenced by macros (#9487)
  Multiple css bundles in Entry bundle groups issue (#9023)
  Fix macro issues (#9485)
  Bump follow-redirects from 1.14.7 to 1.15.4 (#9475)
  Revert more CI changes to centos job (#9472)
  Use lightningcss to implement CSS packager (#8492)
  Fixup CI again (#9471)
  Clippy and use napi's Either3 (#9047)
  Upgrade to eslint 8 (#8580)
  Add support for JS macros (#9299)
  Fixup REPL CI (#9467)
  Drop per-pipeline transformation cache (#9459)
  Upgrade some CI actions (#9466)
  REPL (#9365)
  ...