Enrich detail of Seal contract trap errors #5239

Robbepop · 2020-03-12T16:57:27Z

Extended Trap Information

Problem

There are mainly two different types how a contract execution can trap:

It either traps on its own due to a logic error.
It traps through the host implementation because certain preconditions of a host function has not been met.

For the second case the contracts pallet currently only prints trap during contract execution message to the console. This mostly doesn't help a contract writer a lot and they eventually have to path their way forward with println-debugging which is not sustainable nor scalable.

Proposal

But we can do better in the second case where the trap happens due to some preconditions that have not been met.

Example

In the case of a call to ext_get_storage(key_ptr: u32, key_len: u32) the contracts pallet traps if it cannot properly decode the (key_ptr, key_len) representing slice of bytes into a proper Key. While currently the contracts pallet simply prints out trap during contract execution it actually could do better, e.g. by providing a reason:

trap during contract execution. reason: could not decode key in ext_get_storage call

For a whole call stack we could even provide the call stack until the point of failure:

trap during contract execution.
    reason: could not decode key in ext_get_storage call
    called from: contract at 0x293843249
    called from: contract at 0x873317821
    called from: etc.

This would provide the contract writer with much more helpful additional information to actually get to the point of failure within their contract logic.
Also this should be fairly simple to add to the contracts pallet since it already maintains the call stack of every contract execution. Printing this information would simply access the already managed state.

Note that this is not comparable to stack traces and just a middle ground solution that should provide practical help quickly. It should be noted here that better error reporting is one of the most frequently asked for features in contracts. So we really need to push this forward!

Also note that for the implemented error reports to the console we would never guarantee their contents which means that once we can do better we could easily exchange the whole error reporting mechanism with a more mature one, e.g. proper stack traces upon a trap.

The text was updated successfully, but these errors were encountered:

pepyakin · 2020-03-12T17:16:01Z

This sounds very useful.

First, I think we need to do the printing only opt-in ensuring that this functionality is not enabled by default on production nodes. Probably --dev would be the best option here.

Second, there is complication that the formatting machineries are basically non-existent in substrate at the moment of writing, so we should need to workaround that or settle on more simpler formatting, e.g.:

trap during contract execution. reason: 
could not decode key in ext_get_storage call
0x293843249
0x873317821

Linking #2082 since it can help with this kind of separation of runtime logic and diagnostic logic.

athei · 2020-03-12T18:16:16Z

A reasonable demand. I also ran into the dreaded traped during contract execution while wondering why it trapped. Maybe we could then improve our integration teststo check for more meaningful error messages.

athei · 2020-04-03T12:47:06Z

I did some investigation. It seems doable as there are ways of passing errors from wasmi host functions to the invoker. Currently, sp_sandbox defines a host function as:

/// Error error that can be returned from host function.
#[derive(Encode, Decode)]
#[derive(crate::RuntimeDebug)]
pub struct HostError;

pub type HostFuncType<T> = fn(&mut T, &[Value]) -> Result<ReturnValue, HostError>

We need to change the error type in order to be able to pass error information from a host function to the invoker. I suppose we should use something like failure in order to record the error causes:

trap during contract execution.
    reason:  ext_get_storage failed
    caused by: failed to decode key
    called from: contract at 0x293843249
    called from: contract at 0x873317821
    called from: etc.

The "called from" lines are just added by the contract module which walks through all the callers in case when the sandbox reports a failure.

I am not sure if we can use failure though because it uses fmt I guess?

pepyakin · 2020-04-06T17:23:51Z

Heh, this was my first idea when we discussed this with Robin and I am not sure how that is gonna work. When a contract calls another contract and then the callee fails, then the failure is not cascaded through the call chain. Instead, the caller gets a return status code. I also don't see contracts cooperatively passing an error up the stack.

Instead, we could just print this information to the runtime's stdout. It seems this is the best we can do so far.

Anyway, regardless however it is implemented, the production code path should not regress in performance.

athei · 2020-04-07T06:52:01Z

Instead, we could just print this information to the runtime's stdout. It seems this is the best we can do so far.

You mean just using error! (log crate)? Is that even possible to use inside the sandbox?

pepyakin · 2020-04-07T14:18:43Z

Not quite, although similar: the Printable API.

athei · 2020-07-23T08:34:15Z

So what turns out to be a pretty good solution for having at least some decent error reporting for supervisor triggered errors is the following approach that I already used when I adding a new error kind:

Add the error to the decl_error! macro
Use the TrapReason::SupervisorError(err) variant to funnel the error back the executor

This will not allow for dynamic error messages (formatting a message). As a corollary to this stack traces are also not possible. On the upside it allows those messages to be printed to the UI (e.G polkadot JS apps).

atenjin · 2021-01-13T10:48:41Z

Currently, our tool for run contract in a sandbox Europa has support the similar feature like:

1: NestedRuntime {
    ext_result: [success] ExecReturnValue { flags: 0, data: 7c000000 },
    caller: 0000000000000000000000000000000000000000000000000000000000000000 (5C4hrfjw...),
    self_account: 36b1ebfb6e6f31fa0a9b9d934db3033bc5766bc0a52ddd5656cb54e91903ad70 (5DJRJAXF...),
    selector: 0x1e5ca456,
    args: None,
    value: 0,
    gas_limit: 4999999999999,
    gas_left: 4997774243082,
    env_trace: [
        seal_value_transferred(Some(0x00000000000000000000000000000000)),
        seal_input(Some(0x1e5ca456)),
        seal_get_storage((Some(0x0000000000000000000000000000000000000000000000000000000000000000), Some(0x829153a24020d6c6e3a66aaaa7e5235114553fc0a2e91b1f2bd49b50c2317696))),
        seal_call((Some(0x829153a24020d6c6e3a66aaaa7e5235114553fc0a2e91b1f2bd49b50c2317696), 0, Some(0), Some(0x1e5ca456), Some(0x7c000000), Some(ReturnCode::Success))),
        seal_return((0, Some(0x7c000000))),
    ],
    trap_reason: TrapReason::Return(ReturnData { flags: 0, data: 7c000000 }),
    nest: [
        2: NestedRuntime {
            ext_result: [success] ExecReturnValue { flags: 0, data: 7c000000 },
            caller: 36b1ebfb6e6f31fa0a9b9d934db3033bc5766bc0a52ddd5656cb54e91903ad70 (5DJRJAXF...),
            self_account: 829153a24020d6c6e3a66aaaa7e5235114553fc0a2e91b1f2bd49b50c2317696 (5F1uFe5U...),
            selector: 0x1e5ca456,
            args: None,
            value: 0,
            gas_limit: 4998256583394,
            gas_left: 4997888971426,
            env_trace: [
                seal_value_transferred(Some(0x00000000000000000000000000000000)),
                seal_input(Some(0x1e5ca456)),
                seal_get_storage((Some(0x0000000000000000000000000000000000000000000000000000000000000000), Some(0x7c000000))),
                seal_return((0, Some(0x7c000000))),
            ],
            trap_reason: TrapReason::Return(ReturnData { flags: 0, data: 7c000000 }),
            nest: [],
        },
    ],
}

The simple log:

1: NestedRuntime {
    self_account: 36b1ebfb6e6f31fa0a9b9d934db3033bc5766bc0a52ddd5656cb54e91903ad70 (5DJRJAXF...),
    nest: [
        2: NestedRuntime {
            self_account: 829153a24020d6c6e3a66aaaa7e5235114553fc0a2e91b1f2bd49b50c2317696 (5F1uFe5U...),
            nest: [],
        }
   ]
}

is same with

trap during contract execution.
    reason:  ext_get_storage failed
    caused by: failed to decode key
    called from: contract at 0x293843249
    called from: contract at 0x873317821
    called from: etc.

And we have detailed wasm backtrace to locate the panic or error location, even for host_function error.

More information could refer our doc

And I think our implemention may provide some help for you guys. :p

Our implementation is mainly in this pr patractlabs#16

athei · 2021-01-13T10:55:14Z

Thanks for your great work creating the europa sandbox. The thing is that we cannot allow any overhead during on-chain execution. This prevents us from just doing the same changes you did.

Additionally, I don't think that generating backtraces is appropriate for pallet contracts. With the existence of europa it can be the place to debug contracts before deployment. There is no need to have fully fledged back tracing in pallet contracts. We need to keep it simple and auditable.

atenjin · 2021-01-14T08:07:40Z

Thanks for your great work creating the europa sandbox. The thing is that we cannot allow any overhead during on-chain execution. This prevents us from just doing the same changes you did.

Additionally, I don't think that generating backtraces is appropriate for pallet contracts. With the existence of europa it can be the place to debug contracts before deployment. There is no need to have fully fledged back tracing in pallet contracts. We need to keep it simple and auditable.

Yes, I very agree with you opinion. I just notice this issue in "Smart Contract" project TODO list, and notice this issue is posted long time ago. I just so you know if you guys need to do this feature again, Europa may provide a method to do it. All in all, I think this issue could be closed for this thing may not in you plan, for the final result you just log the error info in the chain event, do not need other information.

athei · 2021-03-22T12:15:23Z

I plan to tacke this issue with the following plan:

Enriched debug messages are only collected when the code is executed as RPC in order to not add overhead to on-chain execution.
The enriched messages are logged to the console and in addition returned via RPC in the already existing debug_message field. This will allow the UI to display them.
The outputted format will look roughly as proposed out by @Robbepop

This will give users and novice developers good feedback from the get go. For more in-depth debugging patractlab's Europa can be used.

athei · 2021-06-10T09:08:03Z

Downgrading this to easy because the hard part was already merged with #8604 and #8773. The only things that remains is writing the call stack to the debug buffer on error. This will also allow us to expose the custom messages in any DispatchError::Custom variant.

athei · 2022-04-01T11:50:21Z

This is mostly done by having the debug buffer. I think this issue is way to convoluted. A follow up issue should be opened in case the debug buffer isn't sufficient.

Robbepop added the M8-contracts label Mar 12, 2020

gavofyork removed the M8-contracts label Jun 12, 2020

athei added the J0-enhancement An additional feature request. label Jun 16, 2020

athei changed the title ~~Enrich detail of SEAL contract trap errors~~ Enrich detail of Seal contract trap errors Jun 16, 2020

atenjin mentioned this issue Jan 13, 2021

Europa/contracts/enhance error handling patractlabs/substrate#16

Merged

athei added U3-nice_to_have Issue is worth doing eventually. Z3-substantial Can be fixed by an experienced coder with a working knowledge of the codebase. labels Mar 22, 2021

athei mentioned this issue Mar 22, 2021

seal: dedicated function for deliberate revert #2852

Closed

athei added U2-some_time_soon Issue is worth doing soon. and removed U3-nice_to_have Issue is worth doing eventually. U2-some_time_soon Issue is worth doing soon. labels Mar 23, 2021

athei mentioned this issue May 10, 2021

contracts: Move Schedule from Storage to Config #8773

Merged

athei added Z1-easy Can be fixed primarily by duplicating and adapting code by an intermediate coder and removed Z3-substantial Can be fixed by an experienced coder with a working knowledge of the codebase. labels Jun 10, 2021

sventime mentioned this issue Jun 25, 2021

rename psp20 to psp17, add ipsp17receiver Supercolony-net/PSPs#2

Merged

xgreenx mentioned this issue Jul 29, 2021

Implement error forwarding use-ink/ink#641

Closed

athei closed this as completed Apr 1, 2022

athei moved this to Done in Smart Contracts Jul 27, 2022

athei added this to Smart Contracts Jul 27, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Enrich detail of Seal contract trap errors #5239

Enrich detail of Seal contract trap errors #5239

Robbepop commented Mar 12, 2020 •

edited

Loading

pepyakin commented Mar 12, 2020

athei commented Mar 12, 2020

athei commented Apr 3, 2020

pepyakin commented Apr 6, 2020

athei commented Apr 7, 2020

pepyakin commented Apr 7, 2020

athei commented Jul 23, 2020

atenjin commented Jan 13, 2021 •

edited

Loading

athei commented Jan 13, 2021 •

edited

Loading

atenjin commented Jan 14, 2021

athei commented Mar 22, 2021 •

edited

Loading

athei commented Jun 10, 2021

athei commented Apr 1, 2022

Enrich detail of Seal contract trap errors #5239

Enrich detail of Seal contract trap errors #5239

Comments

Robbepop commented Mar 12, 2020 • edited Loading

Extended Trap Information

Problem

Proposal

Example

pepyakin commented Mar 12, 2020

athei commented Mar 12, 2020

athei commented Apr 3, 2020

pepyakin commented Apr 6, 2020

athei commented Apr 7, 2020

pepyakin commented Apr 7, 2020

athei commented Jul 23, 2020

atenjin commented Jan 13, 2021 • edited Loading

athei commented Jan 13, 2021 • edited Loading

atenjin commented Jan 14, 2021

athei commented Mar 22, 2021 • edited Loading

athei commented Jun 10, 2021

athei commented Apr 1, 2022

Robbepop commented Mar 12, 2020 •

edited

Loading

atenjin commented Jan 13, 2021 •

edited

Loading

athei commented Jan 13, 2021 •

edited

Loading

athei commented Mar 22, 2021 •

edited

Loading