[WIP] Add benchmark scaffolding #29

Closed
wants to merge 44 commits into from
44 commits
a3cd6ba
first pass at benchmarks
Oct 11, 2016
f4abc06
add first pass at batch bench
Oct 11, 2016
e03dd4c
no need for tape
Oct 12, 2016
a472983
add sleep param
Oct 12, 2016
77dc915
properly mock multithreading
Oct 12, 2016
492c554
add expensive function for benchtesting
Oct 13, 2016
0cd364f
add fib arg to bench
Oct 13, 2016
29ee946
latest experiments for hefty benchmarks
Oct 18, 2016
c1989ec
hoist options out of callback to ensure we are only doing c++ work wi…
Oct 18, 2016
cf913b8
do expensive work by default in threadpool
Oct 18, 2016
1fe605e
keep the tests passing
Oct 18, 2016
125a6a7
Merge branch 'master' into bench
Oct 18, 2016
bf6498d
Merge branch 'master' into bench
Oct 19, 2016
3045b62
add more info about bench tests and how to use
Oct 20, 2016
325f1dc
add sleep example
Oct 20, 2016
4457258
specify seconds arg
Oct 20, 2016
1bd440e
update API docs
Oct 20, 2016
ae8dcf4
Add more detail to scenario descriptions
Oct 20, 2016
cf331e3
separate thread behaviour into their own functions
Oct 21, 2016
f8cba48
add tests for new functions
Oct 21, 2016
ad2219b
add contentiousThread function and test
Oct 21, 2016
dce7a3b
add more test coverage and rough draft of mutex lock logic
Oct 21, 2016
d3957ff
use master mason branch
Oct 21, 2016
67643e7
no need for sleep in shout()
Oct 21, 2016
66bf73c
report speed in runs/second - add use strict to test files
Nov 3, 2016
03ff524
benchmark tweaks
Nov 4, 2016
44b653a
deduce last argument as callback rather than using fixed index
Nov 4, 2016
1cd611c
finish work on accepting last arg as callback
Nov 4, 2016
ddbaccb
reuse AfterAsync function
Nov 4, 2016
7b9f85f
require/allow mode to be passed in to dynamically choose async functi…
Nov 4, 2016
38fa572
reduce the busyThreads work slightly
Nov 4, 2016
e52c3ab
fix bench-batch.js usage
Nov 4, 2016
fb1fca8
make the do_contentious_work function do work and lock to demonstrate…
Nov 4, 2016
b847347
fix spelling in comment
Nov 4, 2016
0b4053e
Merge branch 'master' into bench
Nov 4, 2016
5d1f896
add more details about each bench mode
Nov 7, 2016
bee23ec
add mode to docs
Dec 22, 2016
0ce38a5
Merge branch 'master' into bench
Dec 22, 2016
4efd99a
markdown cleanup
Jun 12, 2017
27aee62
Merge branch 'master' into bench
Jun 12, 2017
6276e25
Merge branch 'master' into bench
Jun 12, 2017
7f469bb
in case coverage was never run
Jun 12, 2017
1a39ac9
hopefully Travis is happy with expliclty setting sleep as part of the…
Jun 12, 2017
4970302
re-add needed deps for running bench and fix busythreads example
Jun 12, 2017
21 changes: 20 additions & 1 deletion API.md
@@ -35,9 +35,28 @@ Shout a phrase really loudly by adding an exclamation to the end, asynchronously
**Examples**

```javascript
// Shout
var HW = new HelloWorld();
HW.shout('rawr', {}, function(err, shout) {
if (err) throw err;
console.log(shout); // => 'rawr!'
console.log(shout); // => 'rawr!...and just did a bunch of stuff'
});
```

```javascript
// Shout louder
var HW = new HelloWorld();
HW.shout('rawr', { louder: true }, function(err, shout) {
if (err) throw err;
console.log(shout); // => 'rawr!!!!!'
});
```

```javascript
// Shout then sleep for x seconds
var HW = new HelloWorld();
HW.shout('rawr', { sleep: 2 }, function(err, shout) {
if (err) throw err;
console.log(shout); // => 'rawr! zzzZZZ'
});
```
4 changes: 2 additions & 2 deletions Makefile
@@ -25,8 +25,8 @@ clean:
rm -rf lib/binding
rm -rf build
# remove remains from running 'make coverage'
rm *.profraw
rm *.profdata
rm -f *.profraw
rm -f *.profdata
@echo "run 'make distclean' to also clear node_modules, mason_packages, and .mason directories"

distclean: clean
71 changes: 70 additions & 1 deletion README.md
@@ -87,4 +87,73 @@ The `.travis.yml` file uses the `matrix` to set up each individual job, which sp
install: *setup
script: *test
after_script: *publish
```

# Benchmark Performance

This project includes [bench tests](https://github.com/mapbox/node-cpp-skel/tree/master/test/bench) you can use to experiment with and measure performance. We've included a couple of different scenarios that demonstrate the effects of concurrency and threads within a process or across processes.

For example, you can run:

```
node test/bench/bench-batch.js --iterations 50 --concurrency 10 --mode shout
```

This will run a bunch of calls to HelloWorld's `shout()` function. You can control three things:

- iterations: number of times to call `shout()`
- concurrency: max number of threads the test can utilize, by setting `UV_THREADPOOL_SIZE`. When running the bench-batch test, you can see this number of threads reflected in your [Activity Monitor](https://github.com/springmeyer/profiling-guide#activity-monitorapp-on-os-x)/[htop window](https://hisham.hm/htop/).
- mode: the scenario you'd like to bench. Ex: shout (rename this to basic async function...or something), contentiousThreads, busyThreads...

## Performance scenarios the bench-batch test can demonstrate

### Good scenarios

**Ideally, you want your workers to run your code ~99% of the time.**

These scenarios demonstrate ideal behavior for a healthy node c++ addon. They are what you would expect to see when you've picked a good problem to solve with node.

1. An async function that is CPU intensive and takes a while to finish (expensive creation and querying of a `std::map` and string comparisons). This scenario demonstrates when worker threads are busy doing a lot of work, and the main loop is relatively idle. Depending on how many threads (concurrency) you enable, you may see your CPU% sky-rocket and your cores max out. Yeaahhh!!!

```
node test/bench/bench-batch.js --iterations 100 --concurrency 10 --mode busyThreads
```

If you bump up `--iterations` to 500 and profile in Activity Monitor.app, you'll see the main loop is idle as expected, since the threads are doing all the work. You'll also see the threads busy doing work in the AsyncBusyThreads function 99% of the time :tada:

![screenshot 2016-11-07 11 50 59](https://user-images.githubusercontent.com/1209162/27053695-cce91e9c-4f83-11e7-904b-b717feb065cf.png)

### Bad scenarios

These scenarios demonstrate non-ideal behavior for a node c++ addon. They represent situations to watch out for: they may spell trouble in your code, or indicate that you are trying to solve a problem that is not well suited to node.

#### Contentious Threads (using a mutex lock)

1. An async function where the code running inside the threadpool locks a global mutex and then does expensive work. Only one thread at a time can hold the global mutex, so only one thread can do work at a time. This causes all threads to contend with one another. In this situation, every thread has work queued, but progress is slow because each thread is waiting its turn for the mutex lock. This is called "lock contention".

```
node test/bench/bench-batch.js --iterations 50 --concurrency 10 --mode contentiousThreads
```

If you bump up `--iterations` to 500 and profile in Activity Monitor.app, you'll see the main loop is idle. This is expected because it is only dispatching work to the threads. The threads, however, are all "majority busy" in `psynch_mutexwait` (waiting for a locked mutex): more time is spent waiting than doing the expensive work. One thread grabs the lock and does work while all others wait; then another grabs the released lock and does work while the rest wait. This pattern is all too common and is the reason you want to avoid mutex locks where possible. This is the profiling output of this non-ideal situation:

![](https://cloud.githubusercontent.com/assets/20300/19990905/7e9a677a-a1ee-11e6-8ba2-c63ff63b1a1b.png)

When locks are unavoidable in real-world applications, we would hope that the % of time spent in `psynch_mutexwait` would be very small rather than very big. The real-world optimization would be to either rewrite the code to avoid needing locks or at least to rewrite the code to hold onto a lock for less time (scope the lock more).

#### Sleepy Threads

2. An async function that sleeps in the thread pool. This is a bizarre example since you'd never want to do this in practice. This scenario demonstrates when all worker threads have work (threadpool is full) but the work they are doing is not CPU intensive. This is an antipattern: it does not make sense to push work to the threadpool unless it is CPU intensive. Typically in this situation, the callstack of your process will show your workers spending most of their time in some kind of 'cond_wait' state. To run this scenario, be sure to set the number of seconds you'd like your workers to `--sleep`:

```
node test/bench/bench-batch.js --iterations 50 --concurrency 10 --sleep 1
```

#### Activity Monitor will display a few different kinds of threads:
- main thread (this is the event loop)
- [worker threads (libuv)](https://github.com/libuv/libuv/blob/1a96fe33343f82721ba8bc93adb5a67ddcf70ec4/src/threadpool.c#L64-L104) will include `worker (in node)` in the callstack. These are usually unnamed: `Thread_2206161` (some of these might not actually be running your code)
- V8 WorkerThread: we don't really need to care about these right now. They don't actually run your code.

To learn more about what exactly is happening with threads behind the scenes in Node and how `UV_THREADPOOL_SIZE` is involved, check out [this great blogpost](https://www.future-processing.pl/blog/on-problems-with-threads-in-node-js/).

Feel free to play around with these bench tests, and profile the code to get a better idea of how threading can affect the performance of your code. We are in the process of [adding more benchmarks](https://github.com/mapbox/node-cpp-skel/issues/30) that demonstrate a number of other scenarios.
5 changes: 4 additions & 1 deletion package.json
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,10 @@
"bundledDependencies":["node-pre-gyp"],
"devDependencies": {
"aws-sdk": "^2.4.7",
"tape": "^4.5.1"
"documentation": "^4.0.0-beta5",
"tape": "^4.5.1",
"d3-queue": "^3.0.1",
"minimist": "~1.2.0"
},
"binary": {
"module_name": "hello_world",