perf(es/minifier): Make DCE analyzer parallel #9865
base: main
d71bf03 to 593d5e9
Any ideas to improve the performance even more?
@@ -1111,6 +1203,87 @@ impl VisitMut for TreeShaker {
}
}

fn merge_data(data: Arc<ThreadLocal<RefCell<Data>>>) -> Data {
Actually, this function is slow, and I want to optimize it further.
}
};
}
}

/// Traverse the graph and subtract usages from `used_names`.
fn subtract_cycles(&mut self) {
But this function call is necessary.
Any ideas to improve the performance even more?
I tried profiling, but it seems to have almost no effect on the current benchmark suite, which contains only libraries. DCE is impactful for real-world apps, though.
CodSpeed Performance Report
Merging #9865 will not alter performance
Description:
The analyzer of the DCE pass can be parallelized in a similar way to a mark-sweep GC. It currently uses `&mut petgraph::DiGraphMap` to find reachable bindings, but we can split the graph creation across multiple threads.

Also, I decided to amortize allocations because `swc_parallel::join` is called an enormous number of times, meaning that it would be slower if I allocated one hashmap in each closure. I'll use the thread_local crate to reuse the hashmap and collect all of them after the traversal is done.
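
For readers who want to see the shape of the idea, here is a minimal sketch using only the standard library. It is not the code in this PR: it swaps in `std::thread::scope` for `swc_parallel::join` and plain per-thread return values for the thread_local crate, and the `Data` fields, `analyze_parallel`, `merge_parts`, and `reachable` are hypothetical names chosen for illustration. Each worker fills its own map without locking, the maps are merged once after the traversal, and a mark phase then walks the merged graph from the roots.

```rust
use std::collections::{HashMap, HashSet};
use std::thread;

/// Hypothetical stand-in for the analyzer's per-thread state:
/// edges from a binding id to the binding ids it references.
#[derive(Default)]
struct Data {
    edges: HashMap<u32, Vec<u32>>,
}

/// Each worker builds its own `Data`, so nothing is shared while scanning.
/// `chunks` is an assumed pre-split list of (from, to) usages.
fn analyze_parallel(chunks: &[Vec<(u32, u32)>]) -> Vec<Data> {
    thread::scope(|s| {
        let handles: Vec<_> = chunks
            .iter()
            .map(|chunk| {
                s.spawn(move || {
                    let mut local = Data::default();
                    for &(from, to) in chunk {
                        local.edges.entry(from).or_default().push(to);
                    }
                    local
                })
            })
            .collect();
        // Collect the per-thread results once all workers are done.
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}

/// Merge the per-thread graphs into one (the role `merge_data` plays above).
fn merge_parts(parts: Vec<Data>) -> Data {
    let mut merged = Data::default();
    for part in parts {
        for (from, mut tos) in part.edges {
            merged.edges.entry(from).or_default().append(&mut tos);
        }
    }
    merged
}

/// Mark phase: everything reachable from `roots` is considered live.
fn reachable(data: &Data, roots: &[u32]) -> HashSet<u32> {
    let mut seen: HashSet<u32> = roots.iter().copied().collect();
    let mut stack: Vec<u32> = roots.to_vec();
    while let Some(node) = stack.pop() {
        if let Some(tos) = data.edges.get(&node) {
            for &to in tos {
                // `insert` returns true only the first time a node is seen.
                if seen.insert(to) {
                    stack.push(to);
                }
            }
        }
    }
    seen
}
```

In this sketch one map is allocated per worker rather than per closure call, which is the amortization the description is after; reusing a thread-local map across many `join` calls, as the PR does, achieves the same effect when the work is split recursively.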