
[WIP] Add benchmark scaffolding #29

Status: Closed. Wants to merge 44 commits from the bench branch into master.

Conversation

@GretaCB (Contributor) commented Oct 11, 2016

Per #25

Per chat with @springmeyer:

  • Can we write a test that proves the memory optimization?
  • Node-module-ify @springmeyer's libnew lib for tracking memory allocations (a rough sketch of the idea follows below)
  • Document how to profile (using Activity Monitor). The skel isn't doing heavy operations, but it might be possible to profile during a batch bench run.

cc @mapsam @springmeyer
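For context on the second bullet, here is a minimal, hypothetical sketch of how an allocation-tracking helper could count heap allocations by overriding the global operator new. This only illustrates the idea; it is not @springmeyer's actual libnew code:

// Hypothetical sketch: count heap allocations by overriding global operator new.
// Not the actual libnew implementation, just the general idea.
#include <atomic>
#include <cstdio>
#include <cstdlib>
#include <new>
#include <string>
#include <vector>

static std::atomic<std::size_t> allocation_count{0};

void* operator new(std::size_t size) {
    allocation_count.fetch_add(1, std::memory_order_relaxed);
    if (void* p = std::malloc(size)) return p;
    throw std::bad_alloc();
}

void operator delete(void* p) noexcept { std::free(p); }

int main() {
    std::size_t before = allocation_count.load();
    std::vector<std::string> names;
    for (int i = 0; i < 1000; ++i) {
        // vector growth and heap-backed string buffers both show up in the counter
        names.push_back("benchmark-item-" + std::to_string(i));
    }
    std::printf("heap allocations during loop: %zu\n", allocation_count.load() - before);
    return 0;
}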

@codecov-io commented Oct 11, 2016
Codecov Report

Merging #29 into master will decrease coverage by 0.43%.
The diff coverage is 98.16%.


@@            Coverage Diff             @@
##           master      #29      +/-   ##
==========================================
- Coverage   98.82%   98.38%   -0.44%     
==========================================
  Files           2        2              
  Lines          85      186     +101     
==========================================
+ Hits           84      183      +99     
- Misses          1        3       +2
Impacted Files         Coverage Δ
src/hello_world.hpp    0% <ø> (ø) ⬆️
src/hello_world.cpp    98.91% <98.16%> (-1.09%) ⬇️

Continue to review the full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 59d0092...4970302.

var fs = require('fs');
var path = require('path');
var argv = require('minimist')(process.argv.slice(2));
var tape = require('tape');
@springmeyer (Contributor) commented Oct 12, 2016

@GretaCB I would propose not depending on tape for the benchmark, since it is not critical and does not offer anything too useful here. For the assert below you could use var assert = require('assert').

@GretaCB (Contributor, Author) replied

👍 Awesome, thank you! Just committed the change.

@springmeyer (Contributor) commented

@GretaCB cf913b8 adds a very expensive usage of std::map (because it invokes lots of memory allocation internally in the map, searching of the map, and string comparisons). Now with:

~/projects/node-cpp-skel[bench]$ time node test/bench/bench-batch.js  --iterations 10 --concurrency 10

real    0m3.491s
user    0m11.704s
sys 0m0.535s

I get my CPU usage spiking to > 500%. Running node test/bench/bench-batch.js --iterations 100 --concurrency 10 to keep it going long enough to easily attach in Activity Monitor gives a callstack that is 98% idle in the main event loop (as expected if the threads are doing all the work), with 99.9-100% of the threads reporting busy doing work (:tada:):

[Screenshot: Activity Monitor callstack showing an idle main loop and busy worker threads, 2016-10-18]
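For readers without the diff handy, here is an illustrative sketch (not the actual cf913b8 change) of the kind of std::map workload that is expensive for the reasons described: per-node allocation, repeated tree searches, and string key comparisons.

// Illustrative only: a std::map workload that stresses allocation, searching,
// and string comparisons, similar in spirit to the expensive work described above.
#include <cstdio>
#include <map>
#include <string>

int main() {
    std::map<std::string, int> counts;
    for (int round = 0; round < 100; ++round) {
        for (int i = 0; i < 10000; ++i) {
            std::string key = "key-" + std::to_string(i);
            auto it = counts.find(key);            // O(log n) string comparisons per lookup
            if (it == counts.end()) {
                counts.emplace(std::move(key), 1); // allocates a new tree node
            } else {
                it->second += 1;
            }
        }
    }
    std::printf("distinct keys: %zu\n", counts.size());
    return 0;
}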

@GretaCB (Contributor, Author) commented Oct 21, 2016

Currently working on adding a couple more benchmark scenarios before merging.

  • Add a benchmark to demonstrate the cost of interacting with libuv and the threadpool, i.e. when not to use async functions: the case where the function's work is cheaper than the overhead of dispatching it to the threadpool (a rough sketch follows after this list).
  • Add a bench scenario where the code running inside the threadpool locks a mutex. This certainly happens in node-mbgl and node-mapnik. In that situation all threads are full with work, yet the work is not CPU intensive and it is really slow (assuming lock contention is happening). We can create lock contention by having each thread attempt to acquire a global lock; perf will be horrible.
  • Bump up coverage
  • Update API docs and benchmark docs
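As a rough illustration of the first bullet, here is a sketch of how dispatch overhead can dwarf trivial work. It uses std::async rather than libuv's threadpool, so the absolute numbers are only indicative; the point is that a thread round-trip costs far more than work this cheap.

// Sketch only: dispatch overhead vs. trivial work, using std::async instead of libuv.
#include <chrono>
#include <cstdio>
#include <future>

static int trivial_work(int x) { return x + 1; }

int main() {
    using Clock = std::chrono::steady_clock;
    const int iterations = 1000;
    volatile int sink = 0;

    // Run the trivial work inline.
    auto t0 = Clock::now();
    for (int i = 0; i < iterations; ++i) sink = trivial_work(i);
    auto inline_us = std::chrono::duration_cast<std::chrono::microseconds>(Clock::now() - t0).count();

    // Dispatch the same trivial work to another thread, one task at a time.
    auto t1 = Clock::now();
    for (int i = 0; i < iterations; ++i) {
        sink = std::async(std::launch::async, trivial_work, i).get();
    }
    auto async_us = std::chrono::duration_cast<std::chrono::microseconds>(Clock::now() - t1).count();

    std::printf("inline: %lld us, dispatched: %lld us\n",
                static_cast<long long>(inline_us), static_cast<long long>(async_us));
    return 0;
}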

@springmeyer (Contributor) commented

per chat with @GretaCB - next I'm going to take a look at profiling and tuning a few things in the PR. In particular I'll look at the impl of the mutex lock and make sure there is enough work being done in the async function that locks the global such that we are properly demonstrating (e.g. providing an example you can profile) the kind of thread contention programmers should avoid.

@springmeyer (Contributor) commented Nov 4, 2016

> In particular I'll look at the impl of the mutex lock and make sure there is enough work being done in the async function that locks the global such that we are properly demonstrating (e.g. providing an example you can profile) the kind of thread contention programmers should avoid.

Done in fb1fca8. Now the contentiousThreads demo is properly awful. It can be tested like:

node test/bench/bench-batch.js --iterations 50 --concurrency 10 --mode contentiousThreads
Benchmark speed: 15 runs/s (runs:50 ms:3245 )
Benchmark iterations: 50 concurrency: 10 mode: contentiousThreads

If you bump up --iterations to 500 and profile in Activity Monitor.app you'll see the main loop is idle. This is expected because it is only dispatching work to the threads. The threads, however, are all "majority busy" in psynch_mutexwait (waiting for a locked mutex), since more time is spent waiting than doing the expensive work. This is because one thread grabs the lock and does its work while all the others wait, then another thread grabs the released lock and does its work while the rest wait again. This is all too common and is the reason you don't want to use mutex locks. This is the profiling output of this non-ideal situation:

[Screenshot: Activity Monitor callstack showing threads mostly waiting in psynch_mutexwait, 2016-11-03]

When locks are unavoidable in real-world applications, we would hope that the % of time spent in psynch_mutexwait would be very small rather than very big. The real-world optimization would be to either rewrite the code to avoid needing locks or at least to rewrite the code to hold onto a lock for less time (scope the lock more).
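To make the "scope the lock more" advice concrete, here is a hypothetical sketch (not the PR's contentiousThreads code) contrasting a lock held across the expensive work with a lock scoped to just the shared-state update.

// Hypothetical sketch: coarse vs. scoped locking around expensive work.
#include <cstdio>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

std::mutex global_mutex;
std::map<std::string, int> shared_results;

// CPU-heavy work that touches no shared state.
static int expensive_work(int seed) {
    int v = seed;
    for (int i = 0; i < 2000000; ++i) v = (v * 31 + i) % 1000003;
    return v;
}

// Bad: the lock is held for the whole expensive call, so threads run one at a time
// and the profiler shows them waiting in psynch_mutexwait.
void contended(int id) {
    std::lock_guard<std::mutex> lock(global_mutex);
    shared_results[std::to_string(id)] = expensive_work(id);
}

// Better: do the expensive work unlocked, then lock only to publish the result.
void scoped(int id) {
    int result = expensive_work(id);
    std::lock_guard<std::mutex> lock(global_mutex);
    shared_results[std::to_string(id)] = result;
}

int main() {
    std::vector<std::thread> threads;
    for (int i = 0; i < 10; ++i) threads.emplace_back(scoped, i); // swap in contended to see the stall
    for (auto& t : threads) t.join();
    std::printf("results: %zu\n", shared_results.size());
    return 0;
}

With the scoped variant the threads overlap the expensive work and only serialize the brief map insert; with the contended variant the lock serializes everything, which is the pattern the bench mode above is meant to expose.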

@springmeyer (Contributor) commented Feb 13, 2017

Just looked back at this branch. It's got some great stuff that I think we should merge soon and keep iterating on. My one reservation before merging: I'm slightly uncomfortable with how we are mixing best-practice/simple/hello-world style code with new code demonstrating both advanced and non-ideal scenarios. I think we should split things apart before merging such that:

  • src/hello_world.hpp and src/hello_world.cpp contain only simple hello-world style code: only binding a single standalone function (this is what you would copy if you just want to start writing a module fast and use the skel as a base)
  • src/async_<name>.hpp, src/async_<name>.cpp, and src/async_<name>.readme.md demonstrate common async scenarios in code and have a readme explaining what to care about in each of them (good and bad). (This is what you'd profile, and poke at within the skel, to learn about the nuances of node async performance.)

@GretaCB mentioned this pull request Jun 14, 2017
@springmeyer (Contributor) commented Aug 2, 2017

@GretaCB I just returned here to reflect on next steps. I feel like a good approach would be to split this work into 2 phases:

phase 1

Land a first PR that:

  • Adds a ./bench folder
  • Adds two benchmark scripts to cover the two performance-focused functions currently in master:
    • ./bench/hello_async.bench.js
    • ./bench/hello_object_async.test.js
  • These scripts could be based on a simplified version of the test/bench/bench-batch.js currently in this PR
  • Very simple docs added to the README on how to run the benchmark, plus a comment that it is not real-world yet since the code does not do much, but that the idea is for developers using the skel to adapt it and run it to monitor the performance of the code they add.

^^ this gets the key structure in place to make it easy and fast to start benchmarks for any module that uses skel. This is a great first step.

phase 2

We could revisit adding examples of performance scenarios. However, I feel like this is a really advanced topic best suited outside of node-cpp-skel. The skel is complex enough currently without diving deep on performance. But given performance is critical, it would be great to cover it and benefit from the skel structure. So, here is an idea. Instead of building into the skel directly, we could:

  • Fork skel, and remove lots of unneeded code and docs (like the sync examples)
  • Add back key async performance scenarios
  • Add docs for this
  • Then link to this fork as an example of a skel implementation
  • The fork would be a separate "performance deep dive example using node-cpp-skel"

@springmeyer (Contributor) commented

@GretaCB now that Phase 1 is done in #61, how about closing this ticket? Phase 2 remains, but I feel like my idea is not concrete enough to warrant a ticket. I'm feeling really good with what we have and don't see a major need to ticket more work here. Rather, we'll apply node-cpp-skel, learn about perf issues, and then, at that time, have ideas of things to build back or add to the docs.

@springmeyer deleted the bench branch on July 14, 2018.