Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(threadpool): expose node worker-threads pool adapter #81

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

feat(threadpool): expose node worker-threads pool adapter #81

wants to merge 1 commit into from

Conversation

kwonoj
Copy link

@kwonoj kwonoj commented Nov 27, 2019

Description

This PR attempts to discuss way to utilize worker_threads in node.js as runner for thread-loader execution.

  • Allow opt-in worker-thread pool

thread-loader option exposes useThreadPool , which will create new ThreadPool instance instead of existing worker pool. New ThreadPool is naive adapter interface to conform existing workerPool behavior, while accepts limited options (workerType, maxWorker) compare to original workerpool. Once new pool is initialized loader-runner will be executed under each worker thread.

  • Non-breaking changes to stabilizing worker thread implementation

Due to the different nature of worker thread to process based thread, replacing existing implementation to worker thread will create major breaking changes from options to some behavior (i.e warmup is no-op). Once it is verified to have near-feature parity / also majority of userbase have node.js supports thread, can flip switch to trigger breaking changes.

What to do

[ ] is this PR makes sense?
[ ] Is it ok to use workerpool to manage workers? or should write own pool instead? or other alternative like jest-worker? (https://github.com/facebook/jest/tree/master/packages/jest-worker) needs custom threadpool to support non-transferrable object (mostly functions) communication between main - worker thread.
[ ] unit test
[ ] basic e2e test allows manual auditing build results

Things to note

  • some of native-module based loader (i.e sass-loader) may not work until underlying native module supports contextAware.
  • after delegating pool logic into workerpool, this module can be very thin wrapper to pool if we can make breaking changes.
  • worker-main proc communication will use a structured clone, so some of object type will not be serialized but this should be similar or better than what current process-based serialization does. Unclear we can use sharedarraybuffer or not, but that should not be blocker for first implementation.

@jsf-clabot
Copy link

jsf-clabot commented Nov 27, 2019

CLA assistant check
All committers have signed the CLA.

@kwonoj
Copy link
Author

kwonoj commented Nov 27, 2019

As what to do section indicates, this PR is not ready to check-in but only serves as initial discussion point.

@kwonoj
Copy link
Author

kwonoj commented Nov 30, 2019

using jest-worker or other existing pool impl won't allow interop point to marshall webpack's loadercontext fn to worker thread. Due to those reason this PR may not work on some loaders, reason impl https://www.npmjs.com/package/webpack-loader-worker uses custom threadpool instead: (https://github.com/kwonoj/webpack-loader-worker/blob/master/src/threadPool.ts)

Still it doesn't require ipc serialization (most POJO can be cloned, only fn requires proxy) serialization logics are much simpler.

@sokra
Copy link
Member

sokra commented Dec 3, 2019

One challenge in the current thread-loader implementation was to efficiently pass Buffers from and to workers. Currently this is done by avoiding all serialization/ipc techniques and passing raw buffers via input/output streams.

I really appreciate a worker_threads implementation, especially as it could leverage a more efficient way of passing Buffers from and to workers.
Via transferList you could move Buffers between workers/main without the need of copy them. https://nodejs.org/api/worker_threads.html#worker_threads_port_postmessage_value_transferlist

This means using a library for abstracting worker_threads, may or may not have drawbacks here. I actually didn't look into your PR in detail, which leads me to the question: Is workerpool able to pass the Buffers from and to workers without copying them? Or might it make more sense to implement a solution on the raw worker_threads module to get these benefits?

@kwonoj
Copy link
Author

kwonoj commented Dec 3, 2019

You may refer https://github.com/kwonoj/webpack-loader-worker for more completed proposal.

To answer your q

  1. No, current proposal relies on structured clone. (Neither workerpool or module I implemented)

  2. Buffer transferring is indeed faster, but it should count on surrounding aspects - when webpack.compiler hands over object, to transfer it via buffer it involves serialization / deserialization from object to buffer and it should be counted as transfer cost (cause there's no way to transfer without serialization). When I did work on other pocs, usually trying to serialize object to buffer always defeated compare to structured clone: only except if original primitive was well designed to reduce those cost. From what I can tell, current loaderContext is complex enough. I may not fully aware loaderContext's detail and it may possible to pass buffer context only between threads, then it may get bit easier. Second thing to think about is transfer requires more explicit worker-to-main sync design, since it is being transferred only one thread can access those value: it may involves atomics or similar, or non-trivial lock-free synchronization.

  3. I do believe transferring buffer is something may need look into, but I think that is a large standalone scope work.

@kwonoj
Copy link
Author

kwonoj commented Dec 3, 2019

As fyi, general perf around structured clone is well explained here: https://dassur.ma/things/is-postmessage-slow/. I think considering other alternative would makes sense if structured clone is proven to slower than current out-of-proc communication.

@kwonoj
Copy link
Author

kwonoj commented Dec 3, 2019

PR now updated to port working poc code from module. (Lot of things are failing, wanted to share poc to kick off discussion to move on actual work)

@kwonoj
Copy link
Author

kwonoj commented Dec 3, 2019

Forgot to mention updated changes from https://github.com/kwonoj/webpack-loader-worker have the possibility to support buffer transfer via comlink.transfer but not being used yet.

@kwonoj
Copy link
Author

kwonoj commented Dec 4, 2019

@sokra and this change shows some example worker -> main proc buffer transfer: kwonoj/webpack-loader-worker#23 As you can see, it doesn't try to serialize whole object but only transfer if transferrable value exists in result - this'll allow plain object benefits from structured clone perf, and have additional gain by not copying raw buffers.

@jsg2021
Copy link

jsg2021 commented Nov 8, 2020

How is the performance? Would this help lift some of the limitations?

@malash
Copy link

malash commented Aug 5, 2021

Hello, any update about this feature ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants