
Dev wasm executor panics on block production #11132

Closed
ggwpez opened this issue Mar 28, 2022 · 13 comments · Fixed by #11206
Labels
I3-bug The node fails to follow expected behavior. U3-nice_to_have Issue is worth doing eventually.

Comments

@ggwpez
Member

ggwpez commented Mar 28, 2022

I am trying to benchmark something, and the wasm executor panics for the polkadot-dev runtime by hitting a memory limit:
Failed to allocate memory: "Requested allocation size is too large" and going to fail due to allocating 66774016.

The setup:

  • Generate a genesis spec with build-spec --chain=polkadot-dev
  • Insert 1M accounts into the chain spec (download here)
  • Run the node with polkadot --chain dot-spec-1M-accs.json --base-path /tmp/dot-dev --execution wasm --wasm-execution compiled --pruning archive --force-authoring --alice --pool-limit 100000
  • Send 10,800 endowing transfer extrinsics (from //Alice to //Bob/i); I can provide the Python script if needed.

You should see this:

Version: 0.9.18-f2ec25eb0b-x86_64-linux-gnu

   0: sp_panic_handler::set::{{closure}}
   1: std::panicking::rust_panic_with_hook
             at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/std/src/panicking.rs:610:17
   2: std::panicking::begin_panic_handler::{{closure}}
             at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/std/src/panicking.rs:502:13
   3: std::sys_common::backtrace::__rust_end_short_backtrace
             at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/std/src/sys_common/backtrace.rs:139:18
   4: rust_begin_unwind
             at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/std/src/panicking.rs:498:5
   5: core::panicking::panic_fmt
             at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/core/src/panicking.rs:116:14
   6: core::result::unwrap_failed
             at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/core/src/result.rs:1690:5
   7: tracing::span::Span::in_scope
   8: sp_io::allocator::ExtAllocatorMallocVersion1::call
   9: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
  10: <F as wasmtime::func::IntoFunc<T,(wasmtime::func::Caller<T>,A1),R>>::into_func::wasm_to_host_shim
  11: <unknown>
  12: <unknown>
  13: <unknown>
  14: <unknown>
  15: <unknown>
  16: <unknown>
  17: <unknown>
  18: wasmtime_runtime::traphandlers::catch_traps::call_closure
  19: wasmtime_setjmp
  20: wasmtime_runtime::traphandlers::catch_traps
  21: wasmtime::func::invoke_wasm_and_catch_traps
  22: wasmtime::func::typed::TypedFunc<Params,Results>::call
  23: sc_executor_wasmtime::instance_wrapper::EntryPoint::call
  24: sc_executor_wasmtime::runtime::perform_call
  25: <sc_executor_wasmtime::runtime::WasmtimeInstance as sc_executor_common::wasm_runtime::WasmInstance>::call
  26: sc_executor_common::wasm_runtime::WasmInstance::call_export
  27: std::panicking::try
  28: std::thread::local::LocalKey<T>::with
  29: sc_executor::native_executor::WasmExecutor<H>::with_instance::{{closure}}
  30: sc_executor::wasm_runtime::RuntimeCache::with_instance
  31: <sc_executor::native_executor::NativeElseWasmExecutor<D> as sp_core::traits::CodeExecutor>::call
  32: sp_state_machine::execution::StateMachine<B,H,Exec>::execute_aux
  33: sp_state_machine::execution::StateMachine<B,H,Exec>::execute_using_consensus_failure_handler
  34: <sc_service::client::call_executor::LocalCallExecutor<Block,B,E> as sc_client_api::call_executor::CallExecutor<Block>>::contextual_call
  35: polkadot_primitives::v2::runtime_decl_for_ParachainHost::candidate_events_call_api_at
  36: polkadot_primitives::v2::ParachainHost::candidate_events
  37: polkadot_node_core_runtime_api::make_runtime_api_request
  38: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
  39: <sc_service::task_manager::prometheus_future::PrometheusFuture<T> as core::future::future::Future>::poll
  40: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
  41: std::thread::local::LocalKey<T>::with
  42: tokio::park::thread::CachedParkThread::block_on
  43: tokio::runtime::handle::Handle::block_on
  44: <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll
  45: tokio::runtime::task::harness::Harness<T,S>::poll
  46: tokio::runtime::blocking::pool::Inner::run
  47: std::sys_common::backtrace::__rust_begin_short_backtrace
  48: core::ops::function::FnOnce::call_once{{vtable.shim}}
  49: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
             at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/alloc/src/boxed.rs:1854:9
      <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
             at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/alloc/src/boxed.rs:1854:9
      std::sys::unix::thread::Thread::new::thread_start
             at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/std/src/sys/unix/thread.rs:108:17
  50: start_thread
             at /build/glibc-sMfBJT/glibc-2.31/nptl/pthread_create.c:477:8
  51: clone


Thread 'tokio-runtime-worker' panicked at 'Failed to allocate memory: "Requested allocation size is too large"', /home/vados/.cargo/git/checkouts/substrate-7e08433d4c370a21/fe3acf4/primitives/io/src/lib.rs:1295

This is a bug. Please report it at:

        https://github.com/paritytech/polkadot/issues/new

2022-03-23 19:54:20 going to fail due to allocating 66774016 
@ggwpez ggwpez added I5-tests Tests need fixing, improving or augmenting. U3-nice_to_have Issue is worth doing eventually. labels Mar 28, 2022
@koute
Contributor

koute commented Apr 1, 2022

I can take a look at this if you want, provided you can share the Python script needed to reproduce it.

@ggwpez
Member Author

ggwpez commented Apr 1, 2022

Thanks @koute, here it is: killer.py.txt (plus pip install substrate-interface).
I just tried it again on latest polkadot master and it still happens.
Make sure to always delete your /tmp/dot-dev folder before re-running the node so that it starts with a fresh genesis.
The node takes a minute to build the genesis block, so you could attach your debugger after that's done; otherwise it will probably take longer.

Just to recap:

  1. Download the spec and the python script from above
  2. Delete /tmp/dot-dev or pick a non-existing dir
  3. polkadot --chain dot-spec-1M-accs.json --base-path /tmp/dot-dev --execution wasm --wasm-execution compiled --pruning archive --force-authoring --alice --pool-limit 100000
  4. Wait until it starts producing blocks
  5. Start the python script
  6. Block producer should panic

@koute
Contributor

koute commented Apr 11, 2022

Stack trace from the runtime (after my two PRs it now gets printed out):

0: 0x47d51e - <unknown>!sp_io::allocator::extern_host_function_impls::malloc::hed26f12cb611fdea
1: 0x703e5 - <unknown>!alloc::raw_vec::finish_grow::h5b508086d31786ec
2: 0x70ec8 - <unknown>!alloc::raw_vec::RawVec<T,A>::reserve::do_reserve_and_handle::h196b09d9bc226fe8
3: 0x3a21c - <unknown>!parity_scale_codec::codec::decode_vec_with_len::h45dc2aeb96b8057c
4: 0x399d3b - <unknown>!frame_support::storage::unhashed::get::hefd8b3286c7e61dc
5: 0x399a7c - <unknown>!frame_system::<impl frame_system::pallet::Pallet<T>>::read_events_no_consensus::h3294c8ce39a180eb
6: 0x1ec690 - <unknown>!polkadot_runtime_parachains::runtime_api_impl::v2::candidate_events::heb324a904e8907db
7: 0x3e9c23 - <unknown>!ParachainHost_candidate_events

So it calls this function from polkadot/runtime/parachains/src/runtime_api_impl/v2.rs:

/// Implementation for the `candidate_events` function of the runtime API.
// NOTE: this runs without block initialization, as it accesses events.
// this means it can run in a different session than other runtime APIs at the same block.
pub fn candidate_events<T, F>(extract_event: F) -> Vec<CandidateEvent<T::Hash>>
where
        T: initializer::Config,
        F: Fn(<T as frame_system::Config>::Event) -> Option<inclusion::Event<T>>,
{
        use inclusion::Event as RawEvent;

        <frame_system::Pallet<T>>::read_events_no_consensus()
                .into_iter()
                .filter_map(|record| extract_event(record.event))
                .map(|event| match event {
                        RawEvent::<T>::CandidateBacked(c, h, core, group) =>
                                CandidateEvent::CandidateBacked(c, h, core, group),
                        RawEvent::<T>::CandidateIncluded(c, h, core, group) =>
                                CandidateEvent::CandidateIncluded(c, h, core, group),
                        RawEvent::<T>::CandidateTimedOut(c, h, core) =>
                                CandidateEvent::CandidateTimedOut(c, h, core),
                        RawEvent::<T>::__Ignore(_, _) => unreachable!("__Ignore cannot be used"),
                })
                .collect()
}

which in turn calls this from substrate/frame/system/src/lib.rs:

/// Get the current events deposited by the runtime.
///
/// Should only be called if you know what you are doing and outside of the runtime block
/// execution else it can have a large impact on the PoV size of a block.
pub fn read_events_no_consensus() -> Vec<EventRecord<T::Event, T::Hash>> {
    Events::<T>::get()
}

So it looks like it tries to decode this value, and it is so big that the wasm allocator refuses a single allocation large enough for the Vec ("Requested allocation size is too large").

Curiously, we can see this comment where it is defined:

/// Events deposited for the current block.
///
/// NOTE: This storage item is explicitly unbounded since it is never intended to be read
/// from within the runtime.
#[pallet::storage]
pub(super) type Events<T: Config> =
    StorageValue<_, Vec<EventRecord<T::Event, T::Hash>>, ValueQuery>;

"it is never intended to be read from within the runtime" and yet it is? (Unless "by runtime" here means "during normal block execution"?)

Some potential ways we could fix this:

  1. Don't call the runtime to get this; fetch it directly from the storage (see the sketch at the end of this comment).
  2. Call into the runtime and return this data in chunks.
  3. Check how many entries there are and return an error if there are too many.
  4. Do not let this value grow this big in the first place.
  5. Something else?

Alas, it's outside my area of expertise to tell which of these is the most appropriate fix.
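To illustrate option 1, here is a minimal, hedged sketch of how the node side could compute the well-known storage key of System::Events and read the raw value straight from state instead of calling into the runtime. twox_128 is the hasher FRAME uses for pallet and storage item prefixes; the hex crate is only used for printing and is not part of the original report.

// Sketch only: the client could fetch this key from state directly instead of
// going through `ParachainHost_candidate_events`.
use sp_core::hashing::twox_128;

fn system_events_key() -> Vec<u8> {
    let mut key = Vec::with_capacity(32);
    key.extend_from_slice(&twox_128(b"System")); // pallet name prefix
    key.extend_from_slice(&twox_128(b"Events")); // storage item prefix
    key
}

fn main() {
    // Expected (well-known) key:
    // 0x26aa394eea5630e07c48ae0c9558cef780d41e5e16056765bc8461851072c9d7
    println!("0x{}", hex::encode(system_events_key()));
}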

@ggwpez ggwpez added I3-bug The node fails to follow expected behavior. and removed I5-tests Tests need fixing, improving or augmenting. labels Apr 11, 2022
@kianenigma
Contributor

This is not really a surprise. When you shove a million transfers into one (or a few) blocks, you are going to have millions of events deposited (I think we deposit something like 4 events per transfer), and reading them will cause the runtime to crash.

Interestingly, because we ::append, inserting will not fail, but reading will.

As the name suggests, events should not be read in the runtime, exactly for this reason. I am not sure why and how it is done in the parachains runtime right now, but it is definitely something to be reconsidered. cc @ordian @rphmeier.
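A rough, illustrative sketch of that asymmetry, using sp_io's test externalities directly rather than the actual frame_system::Events storage type: the append host function splices one more SCALE-encoded item onto the stored value without decoding what is already there, while reading it back forces one decode, and therefore one very large allocation, of the whole vector.

use parity_scale_codec::{Decode, Encode};

fn main() {
    sp_io::TestExternalities::default().execute_with(|| {
        const KEY: &[u8] = b"demo/events";

        // Appending never decodes the existing value, so it keeps working
        // no matter how many items are already stored.
        for i in 0u64..10_000 {
            sp_io::storage::append(KEY, i.encode());
        }

        // Reading decodes the whole vector in one go; with millions of large
        // `EventRecord`s this is the allocation that fails in the report above.
        let raw = sp_io::storage::get(KEY).expect("value was written");
        let all = Vec::<u64>::decode(&mut &raw[..]).expect("valid encoding");
        assert_eq!(all.len(), 10_000);
    });
}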

@kianenigma
Contributor

UPDATE: this code is reading events in a runtime API call, not in consensus-context runtime execution. So while someone should look at the implications of this runtime API potentially failing, it has no severe consensus implications.

@bkchr
Member

bkchr commented Apr 11, 2022

@koute @ggwpez how many transactions are inside the block when this fails?

Answering myself, around 6800 transactions.

@bkchr
Member

bkchr commented Apr 11, 2022

2. Call into the runtime and return this data in chunks.

I think I would go with this: some kind of special interface, stream_iter or something, that reads from storage in chunks.

However, before we do this we should check the size of the Polkadot runtime-level Event with size_of. If we added some boxing, we could probably reduce the in-memory size, as we have already done for Call.

@ggwpez could you get us the size with mem::size_of in wasm?
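As a self-contained illustration of the boxing idea (these are stand-in types, not the actual Polkadot Event): a Rust enum is as large as its largest variant, so boxing the large variant shrinks every element of the in-memory Vec<EventRecord<..>>.

use core::mem::size_of;

#[allow(dead_code)]
struct BigPayload([u8; 1024]); // stand-in for a large event variant

#[allow(dead_code)]
enum UnboxedEvent {
    Small(u32),
    Large(BigPayload),
}

#[allow(dead_code)]
enum BoxedEvent {
    Small(u32),
    Large(Box<BigPayload>),
}

fn main() {
    // The unboxed enum carries the full payload inline; the boxed one only a pointer.
    println!("unboxed: {} bytes", size_of::<UnboxedEvent>());
    println!("boxed:   {} bytes", size_of::<BoxedEvent>());
}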

@ggwpez
Member Author

ggwpez commented Apr 11, 2022

could you get us the size with mem::size_of in wasm?

@bkchr I assume frame_system::Event since it is being decoded here:
Event [bytes]: Native 1224, Wasm 952
EventRecord<T::Event, T::Hash> [bytes]: Native 1256, Wasm 976

Someone go 🥊 this.

PS:
The error message above states that it failed to allocate 66774016 bytes, which is evenly divisible by 976.
66774016 / 976 = 68416, so that would be about 68k events.
On the native executor I got about 54.4k events, so this is reasonable.
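For reference, a sketch of how numbers like these can be obtained. The type paths here (polkadot_runtime::Event as the aggregated event enum, sp_core::H256 as the hash type) are assumptions about the runtime of that era, not taken from the issue, and the wasm figures require compiling the same check for the wasm target.

// Hypothetical measurement sketch; type paths are assumptions.
use core::mem::size_of;

fn main() {
    println!("Event: {} bytes", size_of::<polkadot_runtime::Event>());
    println!(
        "EventRecord: {} bytes",
        size_of::<frame_system::EventRecord<polkadot_runtime::Event, sp_core::H256>>()
    );
}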

@bkchr
Member

bkchr commented Apr 11, 2022

PS:
The error message above states that it failed to allocate 66774016 bytes, which is evenly divisible by 976.
66774016 / 976 = 68416, so that would be about 68k events.
On the native executor I got about 54.4k events, so this is reasonable.

You also need to take the allocation strategy into account. The backing vector probably doubles its capacity whenever it hits its current limit; otherwise you would constantly be reallocating. So I would assume it is more like 34-40k events.
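A small native-side sketch of that point (the SCALE decoder inside the runtime reserves memory somewhat differently, so this is only indicative): pushing 976-byte records into a Vec and printing the capacity jumps shows the doubling behaviour, where a roughly 64 MB backing allocation is requested while only about half that many records are live.

fn main() {
    // 976 bytes matches the measured wasm size of `EventRecord` above.
    let mut events: Vec<[u8; 976]> = Vec::new();
    let mut last_cap = 0;
    for len in 1..=60_000usize {
        events.push([0u8; 976]);
        if events.capacity() != last_cap {
            last_cap = events.capacity();
            println!(
                "len {:>6} -> capacity {:>6} ({} bytes allocated)",
                len,
                last_cap,
                last_cap * 976
            );
        }
    }
}

In this run the last capacity jump happens at around 33k live records but already allocates roughly 64 MB, so a 66774016-byte request does not necessarily correspond to ~68k live events.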

@xlc
Contributor

xlc commented Apr 11, 2022

I have brought this up multiple times, but can we just not have all the events stored under a single storage key? It is not light-client / XCM friendly anyway. No one is using event topics, but we should be.

The only way to prove that something has happened on another chain is via events, but we don't really want to send all the events of a block just to prove a single event.

@rphmeier
Contributor

rphmeier commented Apr 13, 2022

Well, we could also optimize this on the parachains runtime side by not using FRAME events for CandidateEvents (cc @drahnr) and instead having a separate storage entry for tracking candidate events. The migration from one mode to the other would need to be done carefully, since the first block where the new runtime code is used might have a parent that didn't use the new storage entry. I'd recommend a two-part migration off of FRAME events: a first change where the new storage entry is introduced, and a following change (after the first is already running on Polkadot) where the runtime API is fully switched off of FRAME events.

@bkchr
Member

bkchr commented Apr 13, 2022

@rphmeier we already have a fix for this, so I'm not sure you need this.

@xlc I think we don't have a proper issue or a concrete idea for this yet?

@xlc
Contributor

xlc commented Apr 13, 2022

This is one of the previous discussions: #9480 (comment)

I don't see a simple solution to this, so we really need to spend some time thinking about it and come up with some potential solutions for further discussion.

I asked Gav about it in May last year, and what I got was "we may refactor the events system with a new cryptographic structure in the not too distant future."
