Skip to content
This repository has been archived by the owner on Feb 3, 2023. It is now read-only.

Optimize WASM calls #1211

Merged
merged 29 commits into from
Apr 8, 2019
Merged

Optimize WASM calls #1211

merged 29 commits into from
Apr 8, 2019

Conversation

lucksus
Copy link
Collaborator

@lucksus lucksus commented Apr 3, 2019

Erasing 81% of run_dna's CPU time

Please check one of the following, relating to the CHANGELOG, which should be updated if relevant

  • my changes to the code affect some exposed aspect of the developer experience, and I have added an item to relevant 'Added, Fixed, Changed, Removed, Deprecated, or Security' heading under the 'Unreleased' heading of the CHANGELOG, with the format - summary of change [PR#1234](https://github.com/holochain/holochain-rust/pull/1234)
  • my changes to the code do not affect any exposed aspect of the developer experience

Context

Sub-items of performance optimization for WASM calls:

Screenshot 2019-04-05 at 20 42 13

Main changes

In the old FlameGraph we can see that within run_dna most time is spent creating the parsed WASM module from the byte array (wasmi::Module::from_buffer). After first trying to hold the whole wasmi::ModuleInstance in the nucleus state, for which I also refactored wasmi in a fork so that ModuleInstance implements Send, I realized that it makes much more sense to just hold a parsed wasmi::Module inside the DNA, next to the unparsed byte array. We can then still have run_dna create a new instance running in a new thread, but we use the already present Module that we find in the DNA. I ended up not using my wasmi fork since the mutex applied there introduced new performance issues for WASM memory access.

Also, with this PR DnaWasm::code changes from being a plain Vec<u8> to an Arc<Vec<u8>> which makes DNA cloning (which happens quite a bit, not yet a major thing but already visible on the FlameGraphs) much cheaper.

Next to those main changes I cleaned up by removing the now unnecessary parameter wasm from run_dna and the functions that call it. Same with dna_name which can be get from the DNA, via the context, where needed.

Before

Screenshot 2019-04-05 at 20 41 26

See SVG in SoA tree: https://realtimeboard.com/app/board/o9J_kyiXmFs=/?moveToWidget=3074457346553142160
Module::from_buffer makes up 81% for run_dna

After

Screenshot 2019-04-05 at 20 41 32

Outlook

This is a substantial improvement, and we can do more. Instantiating ModuleInstance is still more expensive than actually executing the WASM code - at least in these holochain-basic-chat based profiles. Also, WASM-side initialization like init_globals (since it's called with every new ModuleInstance) makes up for a good part of spent CPU time. A promising next step could be re-purposing a fixed pool of ModuleInstances and initializing them from memory snapshots that include the result of init_globals etc.

lucksus and others added 27 commits April 3, 2019 19:41
Because we have tests that provide invalid WASM
…lochain-rust into optimize-wasm-calls

# Conflicts:
#	core/src/nucleus/ribosome/run_dna.rs
@lucksus lucksus changed the title [WIP] Optimize WASM calls Optimize WASM calls Apr 5, 2019
@lucksus lucksus marked this pull request as ready for review April 5, 2019 18:39
@lucksus
Copy link
Collaborator Author

lucksus commented Apr 5, 2019

@willemolding, would love to hear how this branch performs in your Holo test setup.

@ddd-mtl
Copy link
Contributor

ddd-mtl commented Apr 5, 2019

@lucksus I think its a good idea to cache the result of init_globals but be careful about some of the globals which are not constants and can change value between dna calls. Like agent_latest_hash for example. A 'dirty' flag should do the trick.

@thedavidmeister
Copy link
Contributor

yes, i like where this is going and i also have a friendly reminder for future work re: stack based memory not being safe to work with concurrently - we would need to move to wee_alloc to support that i believe

@lucksus
Copy link
Collaborator Author

lucksus commented Apr 6, 2019

Right, thanks @ddd-mtl. So it actually means that we would need to recreate the initialized memory snapshot after any commit to the source chain. Isn't that very related to the "as at", @thedavidmeister, @artbrock?

@ddd-mtl
Copy link
Contributor

ddd-mtl commented Apr 6, 2019

@lucksus Not "any" commit, only a commit that updates the agentId. Thats what agent_latest_hash is tracking. Unless you were referring to a different global?

@thedavidmeister
Copy link
Contributor

@lucksus yeha well the summary of that is that the head commit at the start of the function call invalidates the result of the function call if the head has moved by the end of the function call

that is something else that we can avoid thinking about atm with sequential function calls but would need to be considered across concurrent calls

well.... perhaps, we might find benefit in adding "as at" to actions in the action loop... but that's another discussion as we're not doing that yet

Copy link
Member

@zippy zippy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome!

@zippy zippy merged commit 7056f34 into develop Apr 8, 2019
@zippy zippy deleted the optimize-wasm-calls branch October 4, 2019 18:28
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants