Add get_processed_inner_instruction syscall #22515

jackcmay · 2022-01-14T18:24:22Z

Problem

There are scenarios where instructions (top-level or via CPI) would like to know what the last instruction processed was. Some examples are verification of assert instructions or to check for memo instructions. Programs can already check tx level instructions via the Sysvar1nstructions1111111111111111111111111 but there is no way to get information about what inner instructions have been processed.

Summary of Changes

Add a syscall that allows a program to query what inner instructions have been successfully processed.

Fixes #22437

programs/bpf/tests/programs.rs

program-runtime/src/invoke_context.rs

programs/bpf/rust/sibling_instruction/src/lib.rs

programs/bpf_loader/src/syscalls.rs

CriesofCarrots

For my part, looks good once CI is happy

Pull request has been modified.

program-runtime/src/invoke_context.rs

Lichtso · 2022-01-15T00:36:44Z

programs/bpf_loader/src/syscalls.rs

@@ -2636,6 +2659,8 @@ fn call<'a, 'b: 'a>(
        )
        .map_err(SyscallError::InstructionError)?;

+    invoke_context.add_sibling_instruction(instruction);


Two things:

Can we use CompiledInstruction which has account indices, instead of Pubkeys? (better for ABIv2).

How about referencing the recorded instructions by index / a range?

Instructions aren't always recorded, are you thinking we should always turn on full instruction recording?

I'll take a look at using CompiledInstruction, are you referring to what the syscall returns, or what is stored internally in the invoke_context?

This refactoring should have made it cheap enough to always run it. And this PR here comes close to "always on" as well. So why not have them both use an unified underlying mechanism?

Yeah, I'll take a look at that, this implementation is pretty cheap as is, but we can probably do better

One thing to keep in mind is that the instruction recorder records instructions before they are invoked, so it includes instructions that haven't been fully processed yet which also results in a different order

Recorded inner instructions are defined being invoked, not successfully processed:
https://docs.solana.com/developing/clients/jsonrpc-api#inner-instructions-structure

codecov · 2022-01-15T03:04:07Z

Codecov Report

Merging #22515 (b1aab0d) into master (6edeed8) will increase coverage by 0.4%.
The diff coverage is 81.8%.

@@            Coverage Diff            @@
##           master   #22515     +/-   ##
=========================================
+ Coverage    81.1%    81.5%   +0.4%     
=========================================
  Files         560      553      -7     
  Lines      151206   149771   -1435     
=========================================
- Hits       122633   122127    -506     
+ Misses      28573    27644    -929

sdk/program/src/instruction.rs

programs/bpf/rust/inner_instruction/src/lib.rs

mvines · 2022-01-19T16:27:06Z

sdk/program/src/instruction.rs

+/// when calling the `sol_get_processed_inner_instruction` syscall.
+#[repr(C)]
+#[derive(Default, Debug, Clone, Copy)]
+pub struct InnerMeta {


nit: InnerMeta seems a little too generic. ProcessedInnerInstructionMeta?

Lichtso · 2022-01-19T16:35:40Z

program-runtime/src/invoke_context.rs

@@ -197,6 +197,7 @@ pub struct InvokeContext<'a> {
    pub timings: ExecuteDetailsTimings,
    pub blockhash: Hash,
    pub lamports_per_signature: u64,
+    pub processed_inner_instructions: Vec<(usize, Instruction)>,


This should go in TransactionContext, because it is not runtime internal and should be exposed to ABIv2 programs as well. I am preparing a PR that moves the InstructionRecorder there, on which you could rebase this PR.

I'm going to merge this today, feel free to move it after

Lichtso · 2022-01-19T16:36:47Z

programs/bpf_loader/src/syscalls.rs

+                    ),
+                    result
+                );
+


Lets convert from CompiledInstruction to Instruction on demand here.

As we discussed, we can make those changes after this pr lands.

Pull request has been modified.

mvines · 2022-01-19T19:26:28Z

I think there is a "B" missing in that example like so: [(1, F), (1, E), (2, D), (1, C), (0, B), (2, Z), (1, Y), (1, X), (0, A)] , right?

right, i missed (0,B) there :)

Lichtso · 2022-01-19T19:26:57Z

As I said, we should always include all instructions. It is easy to filter out the unfinished ones if a program does not want that.

mvines · 2022-01-19T19:27:02Z

Could pass a max depth:

👍🏼

mvines · 2022-01-19T19:28:44Z

As I said, we should always include all instructions. It is easy to filter out the unfinished ones if a program does not want that.

The future inner instructions are not known though, so the runtime could only return the unfinished transaction instructions and we already have the instructions sysvar for can be used for that purpose.

jackcmay · 2022-01-19T19:34:34Z

I'm a little uneasy about [(1, F), (1, E), (2, D), (1, C), (0, B), (2, Z), (1, Y), (1, X), (0, A)] where all the instructions are processed but B, and the caller has to distinguish that B is the not-yet-fully-processed instruction in the list.

mvines · 2022-01-19T19:51:44Z

I'm a little uneasy about [(1, F), (1, E), (2, D), (1, C), (0, B), (2, Z), (1, Y), (1, X), (0, A)] where all the instructions are processed but B, and the caller has to distinguish that B is the not-yet-fully-processed instruction in the list.

🤔 yeah

/// A
/// A -> X
/// A -> Y -> Z
///                ** Z calls `get_processed_instruction()` **
/// B

At this call site, I think the result would currently be [(1, Y), (1, X), (0, A)] or [(2, Z), (1, X), (0, A)]? If only fully processed instructions were returned then Z would see a result of [(1,X)]. The former seems neat as it enables an invoked program to figure out who's calling it in all cases

jackcmay · 2022-01-19T20:20:48Z

At this call site, I think the result would currently be [(1, Y), (1, X), (0, A)] or [(2, Z), (1, X), (0, A)]?

Current result would be [(1, X)], proposed result would be: [(1, Y), (1, X), (0, A)] where both (1, Y), and (0, A) have not been fully processed

mvines · 2022-01-19T20:29:05Z

Result would be empty if the syscall was called before A invoked X too?

jackcmay · 2022-01-19T20:44:00Z

Result would be empty if the syscall was called before A invoked X too?

The current implementation yes, the proposed would return [(0, A)]

jackcmay · 2022-01-19T22:42:30Z

@mvines @CriesofCarrots , given:

/// A
/// A -> X
/// A -> Y -> Z
///                ** Z calls `get_processed_instruction()` **
/// B

If SPL token is Z, will returning [] be expected, or does SPL token need X? And if needs X, does it also need A?

jackcmay · 2022-01-19T22:43:23Z

@Lichtso One of the impedments to relying directly on InstructionRecorder and something we need to fix before ABIv2 is this: #19467

mvines · 2022-01-19T23:59:12Z

@mvines @CriesofCarrots , given:
/// A
/// A -> X
/// A -> Y -> Z
///                ** Z calls `get_processed_instruction()` **
/// B
If SPL token is Z, will returning [] be expected, or does SPL token need X? And if needs X, does it also need A?

The case we'd need is when SPL Token is Y and it needs to see X.

But for the sake of discussion, if SPL token was Z (or Y) then I'd expect it to see X since X has completed. I think what we're trying to determine is whether Z would also see Y and A since those instructions haven't completed yet.

I'm leaning towards including Y and A, even though they are in progress, since this allows Z to figure out who's calling it and I think that enables some interesting use cases (programs changing their behaviour based on caller). But perhaps that's actually an anti-pattern/footgun that we shouldn't enable?

jackcmay · 2022-01-20T00:05:57Z

The reason I ask is that X isn't a direct sibling of Z within Y.

Seems like a separate syscall should be added, something like get_invoke_stack to get the callers, and later get_all_instructions to get all the instructions called so far. Some of these could be utility routines over the updated InstructionRecorder that may be exposed in ABIv2

mvines · 2022-01-20T00:11:16Z

Got it. Yeah not opposed to deferring a more general solution to later. SPL Token just needs to inspect the previous sibling instruction at its depth.

CriesofCarrots · 2022-01-20T04:28:33Z

Hey, all. Sorry I was offline for most of this dialog. Is the PR up to date with the currently proposed implementation? If so, I'll catch up with a close read.
Based on above, it sounds like it might be possible to have this syscall work for top-level instructions after all? 🙏 I hope so. I still think it will be better to have the api consistent for instructions at any depth in the call stack.

jackcmay · 2022-01-20T06:01:59Z

@CriesofCarrots Yeah, I see the desire for this api to include top-level instructions and that becomes non-trivial to define the desired behavior. For example, if we include top-level instructions do we also include their inner instructions? Do we include all instructions a depth minus? Do we include instructions currently being processed and let the program figure out via depth if they are in-progress or not? A lot of info was shared in this thread, reading through that would be helpful. My current thinking is for this API to include all inner instructions of the current top-level and take a max depth. Then we could provide a separate syscall to get the call stack. I want to make sure we cover the use case for SPL but also provide a clear API for other use cases without clouding or returning the world. I think there is also an argument to include all instructions at depth or below that have been processed which would include processed top-level instructions.

CriesofCarrots · 2022-01-25T23:02:40Z

Okay, I've read through everything carefully now.
If we take the general approach, I think that we do need to include instructions currently being processed. Otherwise, it may not be clear which lower level ix called a higher one. eg, in the example way above, the output [(1, F), (1, E), (2, D), (1, C), (2, Z), (1, Y), (1, X), (0, A)] could be from either of these call trees (or one of many other combinations):

/// A
/// A -> X
/// A -> Y -> Z
/// B -> C -> D
/// B -> E
/// B -> F
/// ** B calls `get_processed_instruction()` here **
/// G

/// A
/// A -> X
/// A -> Y -> Z
/// A -> C -> D
/// A -> E
/// A -> F
/// B
/// ** B calls `get_processed_instruction()` here **
/// G

I think that ambiguity could be a big footgun.

Yeah not opposed to deferring a more general solution to later. SPL Token just needs to inspect the previous sibling instruction at its depth.

That said, this ^ . It does appeal to me to implement the simpler approach now, and offer the more general utility later if needed. That simpler implementation being:

/// A
/// A -> X
/// A -> Y -> Z
/// B -> C -> D
/// B -> E
/// B -> F
/// ** F calls `get_processed_instruction()` here **
/// G
/// ** G calls `get_processed_instruction()` here **

F sees: [(1, E), (1, C)] or [E, C]
B G sees: [(0, B), (0, A)] or [B, A]

mvines · 2022-01-26T20:40:42Z

Maybe we need

/// A
/// A -> X
/// A -> Y -> Z
/// B -> C -> D
/// B -> E
/// B -> F
/// ** F calls `get_processed_instruction()` here **
/// G
/// ** G calls `get_processed_instruction()` here **

F sees: [(1, E), (1, C)] or [E, C]
B sees: [(0, B), (0, A)] or [B, A]

"B sees" -> "G sees" right? Sounds good to me if so!

But maybe we need a call with everybody on the PR to work out any remaining design discussion with our vocal folds?

CriesofCarrots · 2022-01-26T20:48:59Z

"B sees" -> "G sees" right? Sounds good to me if so!

Oops, yep, edited in my comment

But maybe we need a call with everybody on the PR to work out any remaining design discussion with our vocal folds?

Happy to oscillate some vocal folds. I think Jack is unavailable the rest of this week, so maybe Monday?

jackcmay · 2022-01-31T16:55:24Z

In the examples above I'm inclined to return [] from call depth 0 since there are no sibling instructions (just top-level instructions). The main reason for doing that is I think it is confusing to return F sees: [E,C] and not include the top-level successfully processed instruction A ([E, C, A]). A has been processed and so it come down to what is the definition of this API, is it "sibling instructions" or "processed instructions", or maybe going with @CriesofCarrots 's suggestion "processed instructions at my call depth"?

mvines · 2022-01-31T17:10:50Z

The previously (successfully -- implicit) processed instructions at the caller's depth is what we're immediately after.
At depth 0, this set would be the previously processed top-level instructions.

jackcmay · 2022-02-01T08:31:31Z

Replaced by #22859

jackcmay · 2022-02-04T22:30:24Z

Replaced by #22859

jackcmay requested review from Lichtso and CriesofCarrots January 14, 2022 18:24

CriesofCarrots reviewed Jan 14, 2022

View reviewed changes

programs/bpf/tests/programs.rs Outdated Show resolved Hide resolved

program-runtime/src/invoke_context.rs Outdated Show resolved Hide resolved

programs/bpf/rust/sibling_instruction/src/lib.rs Outdated Show resolved Hide resolved

mvines added the v1.9 label Jan 14, 2022

jackcmay commented Jan 14, 2022

View reviewed changes

programs/bpf_loader/src/syscalls.rs Outdated Show resolved Hide resolved

jackcmay force-pushed the sibling-instructions branch from 86b997e to 07553d2 Compare January 14, 2022 21:36

CriesofCarrots previously approved these changes Jan 14, 2022

View reviewed changes

t-nelson reviewed Jan 14, 2022

View reviewed changes

program-runtime/src/invoke_context.rs Outdated Show resolved Hide resolved

Lichtso reviewed Jan 15, 2022

View reviewed changes

jackcmay force-pushed the sibling-instructions branch from c2b15a0 to 1db88a8 Compare January 15, 2022 01:22

jackcmay force-pushed the sibling-instructions branch from 1db88a8 to b58222f Compare January 15, 2022 20:20

jackcmay changed the title ~~Add get_sibling_instruction syscall~~ Add get_processed_inner_instruction syscall Jan 15, 2022

mvines reviewed Jan 16, 2022

View reviewed changes

sdk/program/src/instruction.rs Outdated Show resolved Hide resolved

jackcmay force-pushed the sibling-instructions branch 2 times, most recently from 9f79099 to 6d7366c Compare January 18, 2022 01:19

jackcmay commented Jan 18, 2022

View reviewed changes

sdk/program/src/instruction.rs Show resolved Hide resolved

jackcmay commented Jan 18, 2022

View reviewed changes

programs/bpf/rust/inner_instruction/src/lib.rs Show resolved Hide resolved

jackcmay force-pushed the sibling-instructions branch 3 times, most recently from 72812a3 to d3441b9 Compare January 19, 2022 06:14

jackcmay requested review from Lichtso, CriesofCarrots and mvines January 19, 2022 08:05

mvines reviewed Jan 19, 2022

View reviewed changes

mvines previously approved these changes Jan 19, 2022

View reviewed changes

Lichtso previously requested changes Jan 19, 2022

View reviewed changes

jackcmay added 3 commits January 19, 2022 14:54

Add get_processed_inner_instruction syscall

2386aad

add get_invoke_depth syscall

1355eaf

rename

b1aab0d

jackcmay force-pushed the sibling-instructions branch from 0d1bbe5 to b1aab0d Compare January 19, 2022 23:04

jackcmay mentioned this pull request Feb 1, 2022

Add get_processed_sibling_instruction syscall #22859

Merged

jackcmay closed this Feb 4, 2022

jackcmay deleted the sibling-instructions branch April 20, 2022 22:48

Add get_processed_inner_instruction syscall #22515

Add get_processed_inner_instruction syscall #22515

Conversation

jackcmay commented Jan 14, 2022

Problem

Summary of Changes

CriesofCarrots left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jackcmay Jan 15, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

codecov bot commented Jan 15, 2022 • edited Loading

Codecov Report

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mvines commented Jan 19, 2022

Lichtso commented Jan 19, 2022

mvines commented Jan 19, 2022

mvines commented Jan 19, 2022

jackcmay commented Jan 19, 2022

mvines commented Jan 19, 2022

jackcmay commented Jan 19, 2022 • edited Loading

mvines commented Jan 19, 2022

jackcmay commented Jan 19, 2022

jackcmay commented Jan 19, 2022

jackcmay commented Jan 19, 2022

mvines commented Jan 19, 2022

jackcmay commented Jan 20, 2022

mvines commented Jan 20, 2022

CriesofCarrots commented Jan 20, 2022

jackcmay commented Jan 20, 2022

CriesofCarrots commented Jan 25, 2022 • edited Loading

mvines commented Jan 26, 2022

CriesofCarrots commented Jan 26, 2022

jackcmay commented Jan 31, 2022

mvines commented Jan 31, 2022

jackcmay commented Feb 1, 2022

jackcmay commented Feb 4, 2022

jackcmay Jan 15, 2022 •

edited

Loading

codecov bot commented Jan 15, 2022 •

edited

Loading

jackcmay commented Jan 19, 2022 •

edited

Loading

CriesofCarrots commented Jan 25, 2022 •

edited

Loading