Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Enrich detail of Seal contract trap errors #5239

Closed
Robbepop opened this issue Mar 12, 2020 · 13 comments
Closed

Enrich detail of Seal contract trap errors #5239

Robbepop opened this issue Mar 12, 2020 · 13 comments
Labels
J0-enhancement An additional feature request. Z1-easy Can be fixed primarily by duplicating and adapting code by an intermediate coder

Comments

@Robbepop
Copy link
Contributor

Robbepop commented Mar 12, 2020

Extended Trap Information

Problem

There are mainly two different types how a contract execution can trap:

  1. It either traps on its own due to a logic error.
  2. It traps through the host implementation because certain preconditions of a host function has not been met.

For the second case the contracts pallet currently only prints trap during contract execution message to the console. This mostly doesn't help a contract writer a lot and they eventually have to path their way forward with println-debugging which is not sustainable nor scalable.

Proposal

But we can do better in the second case where the trap happens due to some preconditions that have not been met.

Example

In the case of a call to ext_get_storage(key_ptr: u32, key_len: u32) the contracts pallet traps if it cannot properly decode the (key_ptr, key_len) representing slice of bytes into a proper Key. While currently the contracts pallet simply prints out trap during contract execution it actually could do better, e.g. by providing a reason:

trap during contract execution. reason: could not decode key in ext_get_storage call

For a whole call stack we could even provide the call stack until the point of failure:

trap during contract execution.
    reason: could not decode key in ext_get_storage call
    called from: contract at 0x293843249
    called from: contract at 0x873317821
    called from: etc.

This would provide the contract writer with much more helpful additional information to actually get to the point of failure within their contract logic.
Also this should be fairly simple to add to the contracts pallet since it already maintains the call stack of every contract execution. Printing this information would simply access the already managed state.

Note that this is not comparable to stack traces and just a middle ground solution that should provide practical help quickly. It should be noted here that better error reporting is one of the most frequently asked for features in contracts. So we really need to push this forward!

Also note that for the implemented error reports to the console we would never guarantee their contents which means that once we can do better we could easily exchange the whole error reporting mechanism with a more mature one, e.g. proper stack traces upon a trap.

@pepyakin
Copy link
Contributor

This sounds very useful.

First, I think we need to do the printing only opt-in ensuring that this functionality is not enabled by default on production nodes. Probably --dev would be the best option here.

Second, there is complication that the formatting machineries are basically non-existent in substrate at the moment of writing, so we should need to workaround that or settle on more simpler formatting, e.g.:

trap during contract execution. reason: 
could not decode key in ext_get_storage call
0x293843249
0x873317821

Linking #2082 since it can help with this kind of separation of runtime logic and diagnostic logic.

@athei
Copy link
Member

athei commented Mar 12, 2020

A reasonable demand. I also ran into the dreaded traped during contract execution while wondering why it trapped. Maybe we could then improve our integration teststo check for more meaningful error messages.

@athei
Copy link
Member

athei commented Apr 3, 2020

I did some investigation. It seems doable as there are ways of passing errors from wasmi host functions to the invoker. Currently, sp_sandbox defines a host function as:

/// Error error that can be returned from host function.
#[derive(Encode, Decode)]
#[derive(crate::RuntimeDebug)]
pub struct HostError;

pub type HostFuncType<T> = fn(&mut T, &[Value]) -> Result<ReturnValue, HostError>

We need to change the error type in order to be able to pass error information from a host function to the invoker. I suppose we should use something like failure in order to record the error causes:

trap during contract execution.
    reason:  ext_get_storage failed
    caused by: failed to decode key
    called from: contract at 0x293843249
    called from: contract at 0x873317821
    called from: etc.

The "called from" lines are just added by the contract module which walks through all the callers in case when the sandbox reports a failure.

I am not sure if we can use failure though because it uses fmt I guess?

@pepyakin
Copy link
Contributor

pepyakin commented Apr 6, 2020

Heh, this was my first idea when we discussed this with Robin and I am not sure how that is gonna work. When a contract calls another contract and then the callee fails, then the failure is not cascaded through the call chain. Instead, the caller gets a return status code. I also don't see contracts cooperatively passing an error up the stack.

Instead, we could just print this information to the runtime's stdout. It seems this is the best we can do so far.

Anyway, regardless however it is implemented, the production code path should not regress in performance.

@athei
Copy link
Member

athei commented Apr 7, 2020

Instead, we could just print this information to the runtime's stdout. It seems this is the best we can do so far.

You mean just using error! (log crate)? Is that even possible to use inside the sandbox?

@pepyakin
Copy link
Contributor

pepyakin commented Apr 7, 2020

Not quite, although similar: the Printable API.

@athei athei added the J0-enhancement An additional feature request. label Jun 16, 2020
@athei athei changed the title Enrich detail of SEAL contract trap errors Enrich detail of Seal contract trap errors Jun 16, 2020
@athei
Copy link
Member

athei commented Jul 23, 2020

So what turns out to be a pretty good solution for having at least some decent error reporting for supervisor triggered errors is the following approach that I already used when I adding a new error kind:

  • Add the error to the decl_error! macro
  • Use the TrapReason::SupervisorError(err) variant to funnel the error back the executor

This will not allow for dynamic error messages (formatting a message). As a corollary to this stack traces are also not possible. On the upside it allows those messages to be printed to the UI (e.G polkadot JS apps).

@atenjin
Copy link
Contributor

atenjin commented Jan 13, 2021

Currently, our tool for run contract in a sandbox Europa has support the similar feature like:

1: NestedRuntime {
    ext_result: [success] ExecReturnValue { flags: 0, data: 7c000000 },
    caller: 0000000000000000000000000000000000000000000000000000000000000000 (5C4hrfjw...),
    self_account: 36b1ebfb6e6f31fa0a9b9d934db3033bc5766bc0a52ddd5656cb54e91903ad70 (5DJRJAXF...),
    selector: 0x1e5ca456,
    args: None,
    value: 0,
    gas_limit: 4999999999999,
    gas_left: 4997774243082,
    env_trace: [
        seal_value_transferred(Some(0x00000000000000000000000000000000)),
        seal_input(Some(0x1e5ca456)),
        seal_get_storage((Some(0x0000000000000000000000000000000000000000000000000000000000000000), Some(0x829153a24020d6c6e3a66aaaa7e5235114553fc0a2e91b1f2bd49b50c2317696))),
        seal_call((Some(0x829153a24020d6c6e3a66aaaa7e5235114553fc0a2e91b1f2bd49b50c2317696), 0, Some(0), Some(0x1e5ca456), Some(0x7c000000), Some(ReturnCode::Success))),
        seal_return((0, Some(0x7c000000))),
    ],
    trap_reason: TrapReason::Return(ReturnData { flags: 0, data: 7c000000 }),
    nest: [
        2: NestedRuntime {
            ext_result: [success] ExecReturnValue { flags: 0, data: 7c000000 },
            caller: 36b1ebfb6e6f31fa0a9b9d934db3033bc5766bc0a52ddd5656cb54e91903ad70 (5DJRJAXF...),
            self_account: 829153a24020d6c6e3a66aaaa7e5235114553fc0a2e91b1f2bd49b50c2317696 (5F1uFe5U...),
            selector: 0x1e5ca456,
            args: None,
            value: 0,
            gas_limit: 4998256583394,
            gas_left: 4997888971426,
            env_trace: [
                seal_value_transferred(Some(0x00000000000000000000000000000000)),
                seal_input(Some(0x1e5ca456)),
                seal_get_storage((Some(0x0000000000000000000000000000000000000000000000000000000000000000), Some(0x7c000000))),
                seal_return((0, Some(0x7c000000))),
            ],
            trap_reason: TrapReason::Return(ReturnData { flags: 0, data: 7c000000 }),
            nest: [],
        },
    ],
}

The simple log:

1: NestedRuntime {
    self_account: 36b1ebfb6e6f31fa0a9b9d934db3033bc5766bc0a52ddd5656cb54e91903ad70 (5DJRJAXF...),
    nest: [
        2: NestedRuntime {
            self_account: 829153a24020d6c6e3a66aaaa7e5235114553fc0a2e91b1f2bd49b50c2317696 (5F1uFe5U...),
            nest: [],
        }
   ]
}

is same with

trap during contract execution.
    reason:  ext_get_storage failed
    caused by: failed to decode key
    called from: contract at 0x293843249
    called from: contract at 0x873317821
    called from: etc.

And we have detailed wasm backtrace to locate the panic or error location, even for host_function error.

More information could refer our doc

And I think our implemention may provide some help for you guys. :p

Our implementation is mainly in this pr patractlabs#16

@athei
Copy link
Member

athei commented Jan 13, 2021

Thanks for your great work creating the europa sandbox. The thing is that we cannot allow any overhead during on-chain execution. This prevents us from just doing the same changes you did.

Additionally, I don't think that generating backtraces is appropriate for pallet contracts. With the existence of europa it can be the place to debug contracts before deployment. There is no need to have fully fledged back tracing in pallet contracts. We need to keep it simple and auditable.

@atenjin
Copy link
Contributor

atenjin commented Jan 14, 2021

Thanks for your great work creating the europa sandbox. The thing is that we cannot allow any overhead during on-chain execution. This prevents us from just doing the same changes you did.

Additionally, I don't think that generating backtraces is appropriate for pallet contracts. With the existence of europa it can be the place to debug contracts before deployment. There is no need to have fully fledged back tracing in pallet contracts. We need to keep it simple and auditable.

Yes, I very agree with you opinion. I just notice this issue in "Smart Contract" project TODO list, and notice this issue is posted long time ago. I just so you know if you guys need to do this feature again, Europa may provide a method to do it. All in all, I think this issue could be closed for this thing may not in you plan, for the final result you just log the error info in the chain event, do not need other information.

@athei
Copy link
Member

athei commented Mar 22, 2021

I plan to tacke this issue with the following plan:

  • Enriched debug messages are only collected when the code is executed as RPC in order to not add overhead to on-chain execution.
  • The enriched messages are logged to the console and in addition returned via RPC in the already existing debug_message field. This will allow the UI to display them.
  • The outputted format will look roughly as proposed out by @Robbepop

This will give users and novice developers good feedback from the get go. For more in-depth debugging patractlab's Europa can be used.

@athei athei added U3-nice_to_have Issue is worth doing eventually. Z3-substantial Can be fixed by an experienced coder with a working knowledge of the codebase. labels Mar 22, 2021
@athei athei added U2-some_time_soon Issue is worth doing soon. and removed U3-nice_to_have Issue is worth doing eventually. U2-some_time_soon Issue is worth doing soon. labels Mar 23, 2021
@athei athei added Z1-easy Can be fixed primarily by duplicating and adapting code by an intermediate coder and removed Z3-substantial Can be fixed by an experienced coder with a working knowledge of the codebase. labels Jun 10, 2021
@athei
Copy link
Member

athei commented Jun 10, 2021

Downgrading this to easy because the hard part was already merged with #8604 and #8773. The only things that remains is writing the call stack to the debug buffer on error. This will also allow us to expose the custom messages in any DispatchError::Custom variant.

@athei
Copy link
Member

athei commented Apr 1, 2022

This is mostly done by having the debug buffer. I think this issue is way to convoluted. A follow up issue should be opened in case the debug buffer isn't sufficient.

@athei athei closed this as completed Apr 1, 2022
@athei athei moved this to Done in Smart Contracts Jul 27, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
J0-enhancement An additional feature request. Z1-easy Can be fixed primarily by duplicating and adapting code by an intermediate coder
Projects
Status: Done
Development

No branches or pull requests

5 participants