Because we have tests that provide invalid WASM
Instead of creating a new one
…lochain-rust into optimize-wasm-calls # Conflicts: # core/src/nucleus/ribosome/run_dna.rs
…lochain-rust into optimize-wasm-calls
@willemolding, would love to hear how this branch performs in your Holo test setup.
@lucksus I think it's a good idea to cache the result of
yes, i like where this is going and i also have a friendly reminder for future work re: stack based memory not being safe to work with concurrently - we would need to move to
Right, thanks @ddd-mtl. So it actually means that we would need to recreate the initialized memory snapshot after any commit to the source chain. Isn't that very related to the "as at", @thedavidmeister, @artbrock?
@lucksus Not "any" commit, only a commit that updates the agentId. That's what
@lucksus yeah, well, the summary of that is that the head commit at the start of the function call invalidates the result of the function call if the head has moved by the end of the function call. That is something else that we can avoid thinking about atm with sequential function calls, but it would need to be considered across concurrent calls. Perhaps we might find benefit in adding "as at" to actions in the action loop, but that's another discussion as we're not doing that yet.
awesome!
Erasing 81% of run_dna's CPU time
Please check one of the following, relating to the CHANGELOG, which should be updated if relevant
- summary of change [PR#1234](https://github.com/holochain/holochain-rust/pull/1234)
Context
Sub-items of performance optimization for WASM calls:
Main changes
In the old FlameGraph we can see that within `run_dna` most time is spent creating the parsed WASM module from the byte array (`wasmi::Module::from_buffer`). After first trying to hold the whole `wasmi::ModuleInstance` in the nucleus state, for which I also refactored wasmi in a fork so that `ModuleInstance` implements `Send`, I realized that it makes much more sense to just hold a parsed `wasmi::Module` inside the DNA, next to the unparsed byte array. We can then still have `run_dna` create a new instance running in a new thread, but we use the already present Module that we find in the DNA. I ended up not using my wasmi fork since the mutex applied there introduced new performance issues for WASM memory access.

Also, with this PR `DnaWasm::code` changes from being a plain `Vec<u8>` to an `Arc<Vec<u8>>`, which makes DNA cloning (which happens quite a bit; not yet a major cost, but already visible on the FlameGraphs) much cheaper.

Next to those main changes I cleaned up by removing the now unnecessary parameter `wasm` from `run_dna` and the functions that call it. Same with `dna_name`, which can be obtained from the DNA, via the context, where needed.

Before
See SVG in SoA tree: https://realtimeboard.com/app/board/o9J_kyiXmFs=/?moveToWidget=3074457346553142160
`Module::from_buffer` makes up 81% of `run_dna`'s CPU time.
After
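For readers who want to see the shape of the change in code, here is a minimal, std-only sketch of the idea described under Main changes: keep the parsed module next to the raw bytes and put both behind `Arc` so that cloning the DNA does not copy or re-parse anything. `ParsedModule`, `parse_module`, `DnaWasm::new` and `DnaWasm::module` are hypothetical stand-ins and not the actual code in this PR; in particular `parse_module` only mimics the role of `wasmi::Module::from_buffer`, and the choice to store `None` for unparsable bytes is an assumption.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a parsed WASM module (the PR stores a `wasmi::Module`).
#[derive(Debug)]
pub struct ParsedModule {
    /// Placeholder for the parse result; here just the size of the input.
    pub byte_len: usize,
}

/// Hypothetical stand-in for the expensive parse step.
fn parse_module(bytes: &[u8]) -> Result<ParsedModule, String> {
    // WASM binaries start with the magic bytes "\0asm".
    if bytes.starts_with(b"\0asm") {
        Ok(ParsedModule { byte_len: bytes.len() })
    } else {
        Err("not a WASM binary".to_string())
    }
}

/// Sketch of a DNA-side holder: raw bytes plus the module parsed from them.
#[derive(Clone)]
pub struct DnaWasm {
    /// Raw bytes behind an `Arc`, so cloning only bumps a reference count.
    pub code: Arc<Vec<u8>>,
    /// Parsed once at construction; `None` if the bytes were not valid WASM (assumption).
    module: Option<Arc<ParsedModule>>,
}

impl DnaWasm {
    pub fn new(bytes: Vec<u8>) -> Self {
        let module = parse_module(&bytes).ok().map(Arc::new);
        DnaWasm { code: Arc::new(bytes), module }
    }

    /// Hand out the already parsed module instead of parsing again per call.
    pub fn module(&self) -> Option<Arc<ParsedModule>> {
        self.module.clone()
    }
}

fn main() {
    let wasm = DnaWasm::new(b"\0asm\x01\0\0\0".to_vec());
    let copy = wasm.clone();
    // Both clones point at the same byte buffer and the same parsed module.
    assert!(Arc::ptr_eq(&wasm.code, &copy.code));
    let a = wasm.module().expect("valid header");
    let b = copy.module().expect("valid header");
    assert!(Arc::ptr_eq(&a, &b));
    println!("parsed once, shared by {} handles", Arc::strong_count(&a));
}
```

Each call would then create its instance from the shared module, so the parse cost is paid once per DNA rather than once per call.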
Outlook
This is a substantial improvement, and we can do more. Instantiating `ModuleInstance` is still more expensive than actually executing the WASM code, at least in these holochain-basic-chat based profiles. Also, WASM-side initialization like `init_globals` (since it's called with every new ModuleInstance) makes up a good part of the spent CPU time. A promising next step could be re-purposing a fixed pool of ModuleInstances and initializing them from memory snapshots that include the result of `init_globals` etc.
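
To make the pool idea a bit more concrete, here is a rough std-only sketch of a fixed pool whose instances are reset from a shared memory snapshot when they are returned. Everything in it is hypothetical: `Instance` only models linear memory as a `Vec<u8>`, and `InstancePool`, `checkout` and `give_back` are illustrative names, not wasmi or holochain APIs. It also ignores the point raised in the review discussion that the snapshot would need to be recreated after certain commits.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a module instance; only its linear memory is modelled.
pub struct Instance {
    pub memory: Vec<u8>,
}

/// Fixed-size pool whose instances start from, and are reset to, one snapshot.
pub struct InstancePool {
    /// Memory image assumed to be captured after WASM-side initialization.
    snapshot: Arc<Vec<u8>>,
    idle: Vec<Instance>,
}

impl InstancePool {
    pub fn new(snapshot: Vec<u8>, size: usize) -> Self {
        let snapshot = Arc::new(snapshot);
        let idle = (0..size)
            .map(|_| Instance { memory: snapshot.to_vec() })
            .collect();
        InstancePool { snapshot, idle }
    }

    /// Take an idle instance, or `None` if all of them are in use.
    pub fn checkout(&mut self) -> Option<Instance> {
        self.idle.pop()
    }

    /// Restore the instance's memory from the snapshot and put it back in the pool.
    pub fn give_back(&mut self, mut instance: Instance) {
        instance.memory.clone_from(&*self.snapshot);
        self.idle.push(instance);
    }
}

fn main() {
    let mut pool = InstancePool::new(vec![1, 2, 3, 4], 2);
    let mut inst = pool.checkout().expect("pool has idle instances");
    inst.memory[0] = 99; // a call dirties the memory
    pool.give_back(inst);
    let fresh = pool.checkout().expect("instance was returned");
    assert_eq!(fresh.memory, vec![1, 2, 3, 4]);
    println!("instance reset from snapshot: {:?}", fresh.memory);
}
```

Resetting on return rather than on checkout is a design choice of this sketch; it keeps `checkout` cheap at the cost of doing the copy even when the instance is never reused.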