Add return data implementation

This consists of:
- syscalls
- passing return data from invoked to invoker
- printing to stable log
- Rust and C SDK changes
solana-labs · Sep 10, 2021 · 0985852 · 0985852
1 parent 2dee098
commit 0985852

Showing 23 changed files with 641 additions and 35 deletions.
diff --git a/Cargo.lock b/Cargo.lock
diff --git a/docs/src/proposals/return-data.md b/docs/src/proposals/return-data.md
@@ -0,0 +1,144 @@
+# Return data from BPF programs
+
+## Problem
+
+In the Solidity language it is permitted to return any number of values from a function,
+for example a variable length string can be returned:
+
+```
+function foo1() public returns (string) {
+    return "Hello, world!\n";
+}
+```
+
+Multiple values, arrays and structs are permitted too.
+
+```
+struct S {
+    int f1;
+    bool f2;
+}
+
+function foo2() public returns (string, int[], S) {
+    return (a, b, c);
+}
+```
+
+All the return values are Ethereum ABI encoded into a variable-length byte array.
+
+On Ethereum, errors can be returned too:
+
+```
+function withdraw() public {
+    require(msg.sender == owner, "Permission denied");
+}
+
+function failure() public {
+    revert("I'm afraid I can't do that, Dave");
+}
+```
+These errors help the developer debug any issue they are having, and can
+also be caught in a Solidity `try` .. `catch` block. Outside of a `try` .. `catch`
+block, any of these would cause the transaction or RPC to fail.
+
+## Existing solution
+
+The existing solution that Solang uses writes the return data to the callee's account data.
+The caller's account cannot be used, since the callee may not be the same BPF program, so
+it will not have permission to write to the caller's account data.
+
+Another solution would be to have a single return data account which is passed
+around through CPI. Again this does not work for CPI as the callee may not have
+permission to write to it.
+
+The problems with this solution are:
+
+- It does not work for RPC calls
+- It is very racy; a client has to submit the Tx and then retrieve the account
+  data. This is not atomic, so the return data can be overwritten by another transaction.
+
+## Requirements for Solution
+
+It must work for:
+
+- RPC: An RPC should be able to return any number of values without writing to account data
+- Transaction: A transaction should be able to return any number of values without needing to write them to account data
+- CPI: The callee must be able to "set" a return value, and the caller must be able to retrieve it.
+
+## Review of other chains
+
+### Ethereum (EVM)
+
+The `RETURN` opcode allows a contract to set a buffer as a returndata. This opcode takes a pointer to memory and a size. The `REVERT` opcode works similarly but signals that the call failed, and all account data changes must be reverted.
+
+For CPI, the caller can retrieve the returned data of the callee using the `RETURNDATASIZE` opcode which returns the length, and the `RETURNDATACOPY` opcode, which takes a memory destination pointer, offset into the returndata, and a length argument.
+
+Ethereum stores the returndata in blocks.
+
+### Parity Substrate
+
+The return data can be set using the `seal_return(u32 flags, u32 pointer, u32 size)` syscall.
+- Flags can be 1 for revert, 0 for success (nothing else defined)
+- Function does not return
+
+CPI: The `seal_call()` syscall takes pointer to buffer and pointer to buffer size where return data goes
+ - There is a 32KB limit for return data.
+
+Parity Substrate does not write the return data to blocks.
+
+## Rejected Solution
+
+The concept of ephemeral accounts has been proposed as a solution for this. This would
+certainly work for the CPI case, but it would not work for the RPC or Transaction case.
+
+## Proposed Solution
+
+The callee can set the return data using a new system call `sol_set_return_data(buf: *const u8, length: u64)`.
+There is a limit of 1024 bytes for the returndata. This function can be called multiple times, and
+will simply overwrite what was written in the last call.
+
+The return data can be retrieved with `sol_get_return_data(buf: *mut u8, length: u64, program_id: *mut Pubkey) -> u64`.
+This function copies at most `length` bytes of the return data into `buf`, writes the program_id that set the
+return data, and returns the full length of the return data, or `0` if no return data is set. In that case, program_id is not written.
+
+When an instruction calls `sol_invoke()`, the return data of the callee is copied into the return data
+of the current instruction. This means that any return data is automatically passed up the call stack,
+to the caller of the current instruction (or the RPC call).
+
+Note that `sol_invoke()` clears the return data before invoking the callee, so that any return data from
+a previous invoke is not reused if the invoked program fails to set return data. For example:
+
+ - A invokes B
+ - Before entry to B, return data is cleared.
+ - B sets some return data and returns
+ - A invokes C
+ - Before entry to C, return data is cleared.
+ - C does not set return data and returns
+ - A checks return data and finds it empty
+
+Another scenario to consider:
+
+ - A invokes B
+ - B invokes C
+ - C sets return data and returns
+ - B does not touch return data and returns
+ - A gets return data from C
+ - A does not touch return data
+ - Return data from transaction is what C set.
+
+The compute costs are calculated for getting and setting the return data using
+the syscalls.
+
+For a normal RPC or Transaction, the returndata is base64-encoded and stored alongside the sol_log
+strings in the [stable log](https://github.com/solana-labs/solana/blob/95292841947763bdd47ef116b40fc34d0585bca8/sdk/src/process_instruction.rs#L275-L281).
+
+## Note on returning errors
+
+Solidity on Ethereum allows the contract to return an error in the return data. In this case, all
+the account data changes for the account should be reverted. On Solana, any non-zero exit code
+for a BPF program means the entire transaction fails. We do not wish to support an error return
+by returning success and then returning an error in the return data. This would mean we would have
+to support reverting the account data changes; this is too expensive both on the VM side and the BPF
+contract side.
+
+Errors will be reported via sol_log.
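
The set/get semantics described in the proposal above can be sketched as a small host-side model in plain Rust. This is only an illustration, not the runtime's implementation: `ReturnDataSlot`, the `[u8; 32]` stand-in for `Pubkey`, the error string, and treating empty data as "not set" are all assumptions. The 1024-byte limit, overwrite-on-set, and "copy at most `length` bytes but return the full length" come from the proposal text and the C test in this commit.

```rust
// Simplified model of the return-data syscalls. All names here are hypothetical.

const MAX_RETURN_DATA: usize = 1024; // limit stated in the proposal

type Pubkey = [u8; 32]; // stand-in for the SDK's Pubkey type (assumption)

#[derive(Default)]
struct ReturnDataSlot {
    // The program that last set return data, and the bytes it set.
    current: Option<(Pubkey, Vec<u8>)>,
}

impl ReturnDataSlot {
    /// Models `sol_set_return_data`: each call overwrites the previous contents.
    /// Returning an error for oversized data is an assumption of this sketch.
    fn set(&mut self, program_id: Pubkey, data: &[u8]) -> Result<(), &'static str> {
        if data.len() > MAX_RETURN_DATA {
            return Err("return data too large");
        }
        self.current = Some((program_id, data.to_vec()));
        Ok(())
    }

    /// Models `sol_get_return_data`: copies at most `buf.len()` bytes and
    /// returns the full length, or 0 (leaving the outputs untouched) if
    /// nothing is set.
    fn get(&self, buf: &mut [u8], program_id: &mut Pubkey) -> u64 {
        match &self.current {
            Some((setter, data)) if !data.is_empty() => {
                let n = buf.len().min(data.len());
                buf[..n].copy_from_slice(&data[..n]);
                *program_id = *setter;
                data.len() as u64
            }
            _ => 0,
        }
    }
}

fn main() {
    let mut slot = ReturnDataSlot::default();
    let me: Pubkey = [7u8; 32];
    let mut setter: Pubkey = [0u8; 32];
    let mut buf = [0u8; 4];

    // Nothing set yet: length 0 and the setter is left untouched.
    assert_eq!(slot.get(&mut buf, &mut setter), 0);

    // The second call overwrites the first.
    slot.set(me, b"first").unwrap();
    slot.set(me, b"the quick brown fox").unwrap();

    // A short buffer receives a prefix, but the full length is reported.
    let len = slot.get(&mut buf, &mut setter);
    assert_eq!(len, 19);
    assert_eq!(&buf, b"the ");
    assert_eq!(setter, me);

    // Anything above the limit is rejected.
    assert!(slot.set(me, &[0u8; 1027]).is_err());
}
```

A caller that does not know the size in advance could first call `get` with an empty buffer to learn the length, as the C test does with `sol_get_return_data(NULL, 0, NULL)`.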
diff --git a/program-runtime/src/instruction_processor.rs b/program-runtime/src/instruction_processor.rs
@@ -631,6 +631,9 @@ impl InstructionProcessor {
             // Verify the calling program hasn't misbehaved
             invoke_context.verify_and_update(instruction, accounts, caller_write_privileges)?;
 
+            // clear the return data
+            invoke_context.set_return_data(None);
+
             // Invoke callee
             invoke_context.push(program_id, message, instruction, program_indices, accounts)?;
 

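The clear-before-invoke step in this hunk, together with the two call scenarios listed in the proposal, can be sketched with a toy invoke context. This is a simplified stand-alone model under stated assumptions: the `InvokeContext` struct here, the string program ids, and the `program_*` functions are hypothetical and share only the `set_return_data(None)` idea with the real runtime.

```rust
// Toy model: one return-data slot shared down the call stack and cleared
// before every invoke. Names are illustrative, not the runtime's types.

type ProgramId = &'static str;

#[derive(Default)]
struct InvokeContext {
    return_data: Option<(ProgramId, Vec<u8>)>,
}

impl InvokeContext {
    fn set_return_data(&mut self, data: Option<(ProgramId, Vec<u8>)>) {
        self.return_data = data;
    }

    /// Models a cross-program invocation: clear the slot, then run the callee.
    /// Whatever the callee (or its own callees) left behind stays visible.
    fn invoke(&mut self, callee: fn(&mut InvokeContext)) {
        self.set_return_data(None);
        callee(self);
    }
}

// B sets some return data and returns.
fn program_b(ctx: &mut InvokeContext) {
    ctx.set_return_data(Some(("B", b"from B".to_vec())));
}

// C that returns without touching return data.
fn program_c_silent(_ctx: &mut InvokeContext) {}

// C that sets return data.
fn program_c(ctx: &mut InvokeContext) {
    ctx.set_return_data(Some(("C", b"from C".to_vec())));
}

// B that only invokes C and does not touch return data itself.
fn program_b_forwarding(ctx: &mut InvokeContext) {
    ctx.invoke(program_c);
}

fn main() {
    let mut ctx = InvokeContext::default();

    // Scenario 1: A invokes B, then C; C sets nothing, so A sees nothing.
    ctx.set_return_data(Some(("A", b"x".to_vec()))); // cleared on entry to B
    ctx.invoke(program_b);
    assert_eq!(ctx.return_data, Some(("B", b"from B".to_vec())));
    ctx.invoke(program_c_silent);
    assert_eq!(ctx.return_data, None);

    // Scenario 2: A invokes B, B invokes C; A sees what C set.
    ctx.invoke(program_b_forwarding);
    assert_eq!(ctx.return_data, Some(("C", b"from C".to_vec())));
}
```

Clearing on entry rather than on exit is what lets the data set by C survive B's return in the second scenario.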
diff --git a/programs/bpf/Cargo.lock b/programs/bpf/Cargo.lock
diff --git a/programs/bpf/c/src/invoke/invoke.c b/programs/bpf/c/src/invoke/invoke.c
@@ -8,6 +8,7 @@
 #include <sol/log.h>
 #include <sol/assert.h>
 #include <sol/deserialize.h>
+#include <sol/return_data.h>
 
 static const uint8_t TEST_SUCCESS = 1;
 static const uint8_t TEST_PRIVILEGE_ESCALATION_SIGNER = 2;
@@ -26,6 +27,7 @@ static const uint8_t TEST_WRITABLE_DEESCALATION_WRITABLE = 14;
 static const uint8_t TEST_NESTED_INVOKE_TOO_DEEP = 15;
 static const uint8_t TEST_EXECUTABLE_LAMPORTS = 16;
 static const uint8_t ADD_LAMPORTS = 17;
+static const uint8_t TEST_RETURN_DATA_TOO_LARGE = 18;
 
 static const int MINT_INDEX = 0;
 static const int ARGUMENT_INDEX = 1;
@@ -174,6 +176,32 @@ extern uint64_t entrypoint(const uint8_t *input) {
                  sol_invoke(&instruction, accounts, SOL_ARRAY_SIZE(accounts)));
     }
 
+    sol_log("Test return data");
+    {
+      SolAccountMeta arguments[] = {{accounts[ARGUMENT_INDEX].key, true, true}};
+      uint8_t data[] = { SET_RETURN_DATA };
+      uint8_t buf[100];
+
+      const SolInstruction instruction = {accounts[INVOKED_PROGRAM_INDEX].key,
+                                          arguments, SOL_ARRAY_SIZE(arguments),
+                                          data, SOL_ARRAY_SIZE(data)};
+
+      // set some return data, so that the callee can check it is cleared
+      sol_set_return_data((uint8_t[]){1, 2, 3, 4}, 4);
+
+      sol_assert(SUCCESS ==
+                 sol_invoke(&instruction, accounts, SOL_ARRAY_SIZE(accounts)));
+
+      SolPubkey setter;
+
+      uint64_t ret = sol_get_return_data(buf, sizeof(buf), &setter);
+
+      sol_assert(ret == sizeof(RETURN_DATA_VAL));
+
+      sol_assert(!sol_memcmp(buf, RETURN_DATA_VAL, sizeof(RETURN_DATA_VAL)));
+      sol_assert(SolPubkey_same(&setter, accounts[INVOKED_PROGRAM_INDEX].key));
+    }
+
     sol_log("Test create_program_address");
     {
       uint8_t seed1[] = {'Y', 'o', 'u', ' ', 'p', 'a', 's', 's',
@@ -542,27 +570,33 @@ extern uint64_t entrypoint(const uint8_t *input) {
     break;
   }
   case TEST_EXECUTABLE_LAMPORTS: {
-      sol_log("Test executable lamports");
-      accounts[ARGUMENT_INDEX].executable = true;
-      *accounts[ARGUMENT_INDEX].lamports -= 1;
-      *accounts[DERIVED_KEY1_INDEX].lamports +=1;
-      SolAccountMeta arguments[] = {
-          {accounts[ARGUMENT_INDEX].key, true, false},
-          {accounts[DERIVED_KEY1_INDEX].key, true, false},
-      };
-      uint8_t data[] = {ADD_LAMPORTS, 0, 0, 0};
-      SolPubkey program_id;
-      sol_memcpy(&program_id, params.program_id, sizeof(SolPubkey));
-      const SolInstruction instruction = {&program_id,
-                                          arguments, SOL_ARRAY_SIZE(arguments),
-                                          data, SOL_ARRAY_SIZE(data)};
-      sol_invoke(&instruction, accounts, SOL_ARRAY_SIZE(accounts));
-      *accounts[ARGUMENT_INDEX].lamports += 1;
-      break;
+    sol_log("Test executable lamports");
+    accounts[ARGUMENT_INDEX].executable = true;
+    *accounts[ARGUMENT_INDEX].lamports -= 1;
+    *accounts[DERIVED_KEY1_INDEX].lamports +=1;
+    SolAccountMeta arguments[] = {
+      {accounts[ARGUMENT_INDEX].key, true, false},
+      {accounts[DERIVED_KEY1_INDEX].key, true, false},
+    };
+    uint8_t data[] = {ADD_LAMPORTS, 0, 0, 0};
+    SolPubkey program_id;
+    sol_memcpy(&program_id, params.program_id, sizeof(SolPubkey));
+    const SolInstruction instruction = {&program_id,
+					arguments, SOL_ARRAY_SIZE(arguments),
+					data, SOL_ARRAY_SIZE(data)};
+    sol_invoke(&instruction, accounts, SOL_ARRAY_SIZE(accounts));
+    *accounts[ARGUMENT_INDEX].lamports += 1;
+    break;
   }
   case ADD_LAMPORTS: {
-      *accounts[0].lamports += 1;
-      break;
+    *accounts[0].lamports += 1;
+    break;
+  }
+  case TEST_RETURN_DATA_TOO_LARGE: {
+    sol_log("Test setting return data too long");
+    // The actual buffer doesn't matter, just pass null
+    sol_set_return_data(NULL, 1027);
+    break;
   }
 
   default:

diff --git a/programs/bpf/c/src/invoked/instruction.h b/programs/bpf/c/src/invoked/instruction.h
@@ -16,3 +16,6 @@ const uint8_t VERIFY_PRIVILEGE_DEESCALATION = 8;
 const uint8_t VERIFY_PRIVILEGE_DEESCALATION_ESCALATION_SIGNER = 9;
 const uint8_t VERIFY_PRIVILEGE_DEESCALATION_ESCALATION_WRITABLE = 10;
 const uint8_t WRITE_ACCOUNT = 11;
+const uint8_t SET_RETURN_DATA = 12;
+
+#define RETURN_DATA_VAL "return data test"
diff --git a/programs/bpf/c/src/invoked/invoked.c b/programs/bpf/c/src/invoked/invoked.c
@@ -14,6 +14,9 @@ extern uint64_t entrypoint(const uint8_t *input) {
     return ERROR_INVALID_ARGUMENT;
   }
 
+  // on entry, return data must not be set
+  sol_assert(sol_get_return_data(NULL, 0, NULL) == 0);
+
   if (params.data_len == 0) {
     return SUCCESS;
   }
@@ -91,6 +94,12 @@ extern uint64_t entrypoint(const uint8_t *input) {
     sol_log("return Ok");
     return SUCCESS;
   }
+  case SET_RETURN_DATA: {
+    sol_set_return_data((const uint8_t*)RETURN_DATA_VAL, sizeof(RETURN_DATA_VAL));
+    sol_log("set return data");
+    sol_assert(sol_get_return_data(NULL, 0, NULL) == sizeof(RETURN_DATA_VAL));
+    return SUCCESS;
+  }
   case RETURN_ERROR: {
     sol_log("return error");
     return 42;

diff --git a/programs/bpf/c/src/return_data/return_data.c b/programs/bpf/c/src/return_data/return_data.c
@@ -0,0 +1,40 @@
+/**
+ * @brief return data Syscall test
+ */
+#include <solana_sdk.h>
+
+#define DATA "the quick brown fox jumps over the lazy dog"
+
+extern uint64_t entrypoint(const uint8_t *input) {
+  uint8_t buf[1024];
+  SolPubkey me;
+
+  // There should be no return data on entry
+  uint64_t ret = sol_get_return_data(NULL, 0, NULL);
+
+  sol_assert(ret == 0);
+
+  // set some return data
+  sol_set_return_data((const uint8_t*)DATA, sizeof(DATA));
+
+  // ensure the length is correct
+  ret = sol_get_return_data(NULL, 0, &me);
+  sol_assert(ret == sizeof(DATA));
+
+  // try getting a subset
+  ret = sol_get_return_data(buf, 4, &me);
+
+  sol_assert(ret == sizeof(DATA));
+
+  sol_assert(!sol_memcmp(buf, "the ", 4));
+
+  // try getting the whole thing
+  ret = sol_get_return_data(buf, sizeof(buf), &me);
+
+  sol_assert(ret == sizeof(DATA));
+
+  sol_assert(!sol_memcmp(buf, (const uint8_t*)DATA, sizeof(DATA)));
+
+  // done
+  return SUCCESS;
+}
diff --git a/programs/bpf/rust/invoke/src/lib.rs b/programs/bpf/rust/invoke/src/lib.rs
@@ -10,7 +10,7 @@ use solana_program::{
     entrypoint,
     entrypoint::{ProgramResult, MAX_PERMITTED_DATA_INCREASE},
     msg,
-    program::{invoke, invoke_signed},
+    program::{get_return_data, invoke, invoke_signed, set_return_data},
     program_error::ProgramError,
     pubkey::{Pubkey, PubkeyError},
     system_instruction,
@@ -394,6 +394,27 @@ fn process_instruction(
                     assert_eq!(data[i], i as u8);
                 }
             }
+
+            msg!("Test return data via invoked");
+            {
+                // this should be cleared on entry, the invoked tests for this
+                set_return_data(b"x");
+
+                let instruction = create_instruction(
+                    *accounts[INVOKED_PROGRAM_INDEX].key,
+                    &[(accounts[ARGUMENT_INDEX].key, false, true)],
+                    vec![SET_RETURN_DATA],
+                );
+                let _ = invoke(&instruction, accounts);
+
+                assert_eq!(
+                    get_return_data(),
+                    Some((
+                        *accounts[INVOKED_PROGRAM_INDEX].key,
+                        b"Set by invoked".to_vec()
+                    ))
+                );
+            }
         }
         TEST_PRIVILEGE_ESCALATION_SIGNER => {
             msg!("Test privilege escalation signer");

diff --git a/programs/bpf/rust/invoked/src/instruction.rs b/programs/bpf/rust/invoked/src/instruction.rs
@@ -18,6 +18,7 @@ pub const VERIFY_PRIVILEGE_DEESCALATION_ESCALATION_SIGNER: u8 = 9;
 pub const VERIFY_PRIVILEGE_DEESCALATION_ESCALATION_WRITABLE: u8 = 10;
 pub const WRITE_ACCOUNT: u8 = 11;
 pub const CREATE_AND_INIT: u8 = 12;
+pub const SET_RETURN_DATA: u8 = 13;
 
 pub fn create_instruction(
     program_id: Pubkey,