Optimize WASM calls #1211

lucksus · 2019-04-03T17:52:38Z

Erasing 81% of run_dna's CPU time

Please check one of the following, relating to the CHANGELOG, which should be updated if relevant

my changes to the code affect some exposed aspect of the developer experience, and I have added an item to relevant 'Added, Fixed, Changed, Removed, Deprecated, or Security' heading under the 'Unreleased' heading of the CHANGELOG, with the format - summary of change [PR#1234](https://github.com/holochain/holochain-rust/pull/1234)
my changes to the code do not affect any exposed aspect of the developer experience

Context

Sub-items of performance optimization for WASM calls:

Main changes

In the old FlameGraph we can see that within run_dna most of the time is spent creating the parsed WASM module from the byte array (wasmi::Module::from_buffer). I first tried holding the whole wasmi::ModuleInstance in the nucleus state, for which I also refactored wasmi in a fork so that ModuleInstance implements Send. I then realized that it makes much more sense to just hold a parsed wasmi::Module inside the DNA, next to the unparsed byte array. run_dna can still create a new instance running in a new thread, but it uses the already parsed Module that it finds in the DNA. I ended up not using my wasmi fork, since the mutex applied there introduced new performance issues for WASM memory access.
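A minimal sketch of the idea described above: parse once, keep the parsed module next to the raw bytes, and let every call build a fresh instance from the shared module. It uses only the standard library. `ParsedModule`, `parse_module`, `Instance` and the shape of `DnaWasm` shown here are hypothetical stand-ins for illustration, not the actual holochain-rust or wasmi types.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for a parsed `wasmi::Module`.
#[derive(Debug)]
pub struct ParsedModule {
    pub byte_len: usize,
}

/// Hypothetical stand-in for the expensive parse step
/// (the role `wasmi::Module::from_buffer` plays in the PR).
fn parse_module(bytes: &[u8]) -> Result<ParsedModule, String> {
    if bytes.starts_with(b"\0asm") {
        Ok(ParsedModule { byte_len: bytes.len() })
    } else {
        Err("invalid WASM: missing magic bytes".to_string())
    }
}

/// Assumed layout: raw bytes plus the module parsed from them.
#[derive(Clone)]
pub struct DnaWasm {
    pub code: Arc<Vec<u8>>,
    module: Option<Arc<ParsedModule>>,
}

impl DnaWasm {
    /// Parses once at construction. Invalid WASM is tolerated here
    /// (module stays `None`) so that the error surfaces at call time.
    pub fn new(code: Vec<u8>) -> Self {
        let module = parse_module(&code).ok().map(Arc::new);
        DnaWasm { code: Arc::new(code), module }
    }

    /// Hands out a cheap handle to the already parsed module.
    pub fn module(&self) -> Option<Arc<ParsedModule>> {
        self.module.clone()
    }
}

/// Hypothetical per-call instance built from the shared module.
pub struct Instance {
    module: Arc<ParsedModule>,
}

/// Creates a new instance in a new thread, reusing the parsed module.
pub fn run_dna(wasm: &DnaWasm) -> Result<usize, String> {
    let module = wasm
        .module()
        .ok_or_else(|| "no valid WASM module in DNA".to_string())?;
    let handle = thread::spawn(move || {
        let instance = Instance { module };
        // A real call would execute a zome function here.
        instance.module.byte_len
    });
    handle.join().map_err(|_| "WASM thread panicked".to_string())
}
```

In this sketch, repeated calls to `run_dna` never re-parse the bytes; only the per-call instance is created anew.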

Also, with this PR DnaWasm::code changes from a plain Vec<u8> to an Arc<Vec<u8>>, which makes DNA cloning much cheaper. Cloning happens quite a bit; it is not yet a major cost, but it is already visible on the FlameGraphs.
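To illustrate why the Arc makes cloning cheaper, here is a small standard-library sketch. `PlainWasm` and `SharedWasm` are hypothetical names used only for this comparison: cloning the first copies every byte into a new buffer, while cloning the second only bumps a reference count and keeps pointing at the same buffer.

```rust
use std::sync::Arc;

/// Old shape (illustrative): each clone copies the whole byte array.
#[derive(Clone)]
pub struct PlainWasm {
    pub code: Vec<u8>,
}

/// New shape (illustrative): each clone shares the same byte array.
#[derive(Clone)]
pub struct SharedWasm {
    pub code: Arc<Vec<u8>>,
}

/// True when both values point at the very same underlying buffer.
pub fn shares_buffer(a: &SharedWasm, b: &SharedWasm) -> bool {
    Arc::ptr_eq(&a.code, &b.code)
}
```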

Alongside those main changes I cleaned up by removing the now unnecessary wasm parameter from run_dna and the functions that call it. The same goes for dna_name, which can be obtained from the DNA, via the context, where needed.

Before

See SVG in SoA tree: https://realtimeboard.com/app/board/o9J_kyiXmFs=/?moveToWidget=3074457346553142160
Module::from_buffer makes up 81% of run_dna's CPU time

After

Outlook

This is a substantial improvement, and we can do more. Instantiating ModuleInstance is still more expensive than actually executing the WASM code, at least in these holochain-basic-chat based profiles. Also, WASM-side initialization like init_globals (since it's called with every new ModuleInstance) accounts for a good part of the CPU time spent. A promising next step could be reusing a fixed pool of ModuleInstances and initializing them from memory snapshots that include the result of init_globals etc.
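The pool idea from the outlook could look roughly like the sketch below. This is not part of the PR; `Instance`, `InstancePool` and the byte-vector "memory" are hypothetical simplifications, and a real version would have to snapshot and restore actual WASM linear memory.

```rust
/// Hypothetical instance whose state is just its linear memory.
pub struct Instance {
    pub memory: Vec<u8>,
}

/// Fixed pool of instances that are reset from one memory snapshot,
/// assumed to be taken after init_globals has run once.
pub struct InstancePool {
    snapshot: Vec<u8>,
    idle: Vec<Instance>,
}

impl InstancePool {
    /// Pre-creates `size` instances, each initialized from the snapshot.
    pub fn new(snapshot: Vec<u8>, size: usize) -> Self {
        let idle = (0..size)
            .map(|_| Instance { memory: snapshot.clone() })
            .collect();
        InstancePool { snapshot, idle }
    }

    /// Takes an idle instance, or `None` if the pool is exhausted.
    pub fn checkout(&mut self) -> Option<Instance> {
        self.idle.pop()
    }

    /// Restores the instance's memory from the snapshot and returns it
    /// to the pool, so the next call starts from the initialized state.
    pub fn give_back(&mut self, mut instance: Instance) {
        instance.memory.clear();
        instance.memory.extend_from_slice(&self.snapshot);
        self.idle.push(instance);
    }

    /// Number of instances currently available.
    pub fn idle_count(&self) -> usize {
        self.idle.len()
    }
}
```

Resetting on return rather than on checkout is a design choice made for this sketch; it keeps checkout as cheap as possible.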

lucksus · 2019-04-05T19:41:44Z

@willemolding, would love to hear how this branch performs in your Holo test setup.

ddd-mtl · 2019-04-05T20:19:44Z

@lucksus I think it's a good idea to cache the result of init_globals but be careful about some of the globals which are not constants and can change value between dna calls. Like agent_latest_hash for example. A 'dirty' flag should do the trick.

thedavidmeister · 2019-04-06T01:25:36Z

yes, i like where this is going and i also have a friendly reminder for future work re: stack based memory not being safe to work with concurrently - we would need to move to wee_alloc to support that i believe

lucksus · 2019-04-06T15:44:20Z

Right, thanks @ddd-mtl. So it actually means that we would need to recreate the initialized memory snapshot after any commit to the source chain. Isn't that very related to the "as at", @thedavidmeister, @artbrock?

ddd-mtl · 2019-04-06T17:09:38Z

@lucksus Not "any" commit, only a commit that updates the agentId. That's what agent_latest_hash is tracking. Unless you were referring to a different global?
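The 'dirty' flag suggested in this thread, with invalidation only on commits that update the agent id, might be sketched as follows. All types here (`Globals`, `Commit`, `GlobalsCache`) are hypothetical and only illustrate the caching rule; they are not the holochain-rust data structures.

```rust
/// Hypothetical subset of the values init_globals would produce.
#[derive(Clone, Debug, PartialEq)]
pub struct Globals {
    pub dna_name: String,
    pub agent_latest_hash: String,
}

/// Hypothetical commit kinds; only AgentId changes agent_latest_hash.
pub enum Commit {
    AgentId,
    AppEntry,
}

/// Caches the globals and rebuilds them only when marked dirty.
pub struct GlobalsCache {
    cached: Option<Globals>,
    dirty: bool,
    pub rebuilds: usize,
}

impl GlobalsCache {
    pub fn new() -> Self {
        GlobalsCache { cached: None, dirty: false, rebuilds: 0 }
    }

    /// Marks the cache dirty only for commits that update the agent id.
    pub fn on_commit(&mut self, commit: &Commit) {
        if let Commit::AgentId = commit {
            self.dirty = true;
        }
    }

    /// Returns the cached globals, running `init` only when the cache
    /// is empty or has been invalidated.
    pub fn get(&mut self, init: impl FnOnce() -> Globals) -> &Globals {
        if self.dirty || self.cached.is_none() {
            self.cached = Some(init());
            self.dirty = false;
            self.rebuilds += 1;
        }
        self.cached.as_ref().expect("cache was just filled")
    }
}
```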

thedavidmeister · 2019-04-07T01:31:58Z

@lucksus yeah well the summary of that is that the head commit at the start of the function call invalidates the result of the function call if the head has moved by the end of the function call

that is something else that we can avoid thinking about atm with sequential function calls but would need to be considered across concurrent calls

well.... perhaps, we might find benefit in adding "as at" to actions in the action loop... but that's another discussion as we're not doing that yet
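The "as at" rule summarized in this comment can be sketched as a wrapper that records the chain head before the call and rejects the result if the head has moved by the end. Representing the head as a `u64` counter in an `AtomicU64` is an assumption made to keep the example small (a real chain head would be a hash), and `call_as_at` and `StaleHead` are hypothetical names.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Error returned when the head moved while the call was running.
#[derive(Debug, PartialEq)]
pub struct StaleHead {
    pub started_at: u64,
    pub ended_at: u64,
}

/// Runs `call` and accepts its result only if the chain head is the
/// same at the end of the call as it was at the start.
pub fn call_as_at<T>(
    head: &AtomicU64,
    call: impl FnOnce() -> T,
) -> Result<T, StaleHead> {
    let started_at = head.load(Ordering::SeqCst);
    let result = call();
    let ended_at = head.load(Ordering::SeqCst);
    if started_at == ended_at {
        Ok(result)
    } else {
        Err(StaleHead { started_at, ended_at })
    }
}
```

With sequential calls the check always passes; it only matters once another call can commit concurrently.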

zippy

awesome!

lucksus and others added 27 commits April 3, 2019 19:41

Extract wasmi_factor out of run_dna

62a9097

Add Ribosomes to state and use forked WASMi that has Send ModuleInstance

5257891

Error log instead of bailing during initialization when WASM is invalid

6048805

Because we have tests that provide invalid WASM

In run_dna, allocate Ribosome from nucleus state

529f9bd

Instead of creating a new one

Only reuse WASM modules and recreate instances from those

3a3b36b

rustfmt

3361ee5

Rename: Ribsome -> module, since it's just that now

be765eb

Merge branch 'develop' into optimize-wasm-calls

bc3f398

Rename wasmi_factory to wasm_module_factory

f7a6c54

We only need one Module per zome

2d11144

Rename ModuleMutex -> ModuleArc since the Mutex is gone

0631311

Arc'ed WASM binary in DNA

b12c02f

Get WASM module from DnaWasm

2238576

rustfmt

bd8b7f5

Remove parameter wasm from run_dna, which is unused by now.

aaa9586

Bubble up removal of wasm parameter.

7505290

Remove dna_name from wasm call data and get it from context where needed.

43bd3ff

Merge branch 'optimize-wasm-calls' of https://github.com/holochain/holochain-rust into optimize-wasm-calls # Conflicts: # core/src/nucleus/ribosome/run_dna.rs

51942fd

Bubble up removal of dna_name and wasm parameters.

795aaaa

Switch back to mainstream wasmi

592380a

rustfmt

918a028

Pin wasmi version again

ce19ad4

rustdocs

701466d

wasm factories rename

991e585

rustfmt

bc28e71

Changelog

2685f3d

Merge branch 'optimize-wasm-calls' of https://github.com/holochain/holochain-rust into optimize-wasm-calls

33172b0

lucksus changed the title ~~[WIP] Optimize WASM calls~~ Optimize WASM calls Apr 5, 2019

lucksus marked this pull request as ready for review April 5, 2019 18:39

Merge branch 'develop' into optimize-wasm-calls

4d7829c

Fix comment

6369d2e

lucksus requested review from zippy, thedavidmeister and willemolding April 5, 2019 19:40

thedavidmeister approved these changes Apr 6, 2019

zippy approved these changes Apr 8, 2019

zippy merged commit 7056f34 into develop Apr 8, 2019

zippy deleted the optimize-wasm-calls branch October 4, 2019 18:28
