Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lambda support in VM: Wolfgang's PR plus some simplifications (removing mask, clarifying ops) #15387

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

brmataptos
Copy link
Contributor

@brmataptos brmataptos commented Nov 25, 2024

Description

This PR started with a cherry-pick commit of Wolfgang's draft PR 15171 = commit #d85b919.

On top of that, I

  • Remove the Mask feature since it complicates review,
  • Extend and rename the opcodes to make them more general and clear
    • Any function value can have arguments bound to it in advance,
      not requiring a defined function handle except in LdFunction[Generic].
  • Tried to avoid unnecessary changes, but may have accidentally done so.
    • It took me a while to realize that we do indeed need a type parameter to
      EarlyBind and Invoke because of the way the verifier works today.
  • add some semantics comment to file_format.rs which I'd worked out previously.

The current operations are:
- LdFunction, LdFunctionGeneric = generate a closure from a defined function
- EarlyBind = bind more arguments to any closure
- Invoke = call a closure with final arguments

As noted by Wolfgang in his commit, we are still lacking:

  • function value representation and interpreter
  • function value serialization and deserialization
  • some connections with front-end

How Has This Been Tested?

Added tests for feature gating. Should be safe to merge.

Key Areas to Review

Did I lose any verifier support when modfiying Wolfgang's opcodes?

Copy link

trunk-io bot commented Nov 25, 2024

⏱️ 1h 30m total CI duration on this PR
Job Cumulative Duration Recent Runs
rust-move-tests 13m 🟩
rust-move-tests 12m 🟩
rust-move-tests 12m 🟩
rust-move-tests 12m 🟥
rust-move-tests 12m 🟥
rust-cargo-deny 10m 🟩🟩🟩🟩🟩 (+1 more)
check-dynamic-deps 9m 🟩🟩🟩🟩🟩 (+1 more)
rust-move-tests 4m 🟥
general-lints 3m 🟩🟩🟩🟩🟩 (+1 more)
semgrep/ci 2m 🟩🟩🟩🟩🟩 (+1 more)
file_change_determinator 1m 🟩🟩🟩🟩🟩 (+1 more)
permission-check 21s 🟩🟩🟩🟩🟩 (+1 more)
permission-check 16s 🟩🟩🟩🟩🟩 (+1 more)
check-branch-prefix 1s 🟩

🚨 1 job on the last run was significantly faster/slower than expected

Job Duration vs 7d avg Delta
check-dynamic-deps 5m 1m +236%

settingsfeedbackdocs ⋅ learn more about trunk.io

@brmataptos brmataptos changed the title Verbatim cherry-pick of Wolfgang's draft PR 15171 = commit #d85b919 Wolfgang's VM closure support + reconciliation with lambda simplifications (removing mask, clarifying ops) Nov 25, 2024
@brmataptos brmataptos changed the title Wolfgang's VM closure support + reconciliation with lambda simplifications (removing mask, clarifying ops) Lambda support in VM: Wolfgang's PR plus some simplifications (removing mask, clarifying ops) Nov 25, 2024
@brmataptos brmataptos marked this pull request as ready for review November 25, 2024 05:07
@brmataptos brmataptos force-pushed the 11-21-lambda-compiler-completion branch from 42421fe to 92ae56c Compare November 27, 2024 10:24
@brmataptos brmataptos force-pushed the 11-23-vm_changes_merged_in branch from ebb0fcc to 8154bb7 Compare November 27, 2024 10:24
@brmataptos brmataptos force-pushed the 11-21-lambda-compiler-completion branch from 92ae56c to 3c02ba1 Compare November 27, 2024 22:41
@brmataptos brmataptos force-pushed the 11-23-vm_changes_merged_in branch from d52d75a to f49f228 Compare November 27, 2024 22:42
@brmataptos brmataptos force-pushed the 11-21-lambda-compiler-completion branch from 3c02ba1 to ae4444b Compare December 1, 2024 05:57
@brmataptos brmataptos force-pushed the 11-23-vm_changes_merged_in branch from f49f228 to ab7e327 Compare December 1, 2024 05:57
@brmataptos brmataptos force-pushed the 11-21-lambda-compiler-completion branch from ae4444b to 8b37927 Compare December 1, 2024 06:46
@brmataptos brmataptos force-pushed the 11-23-vm_changes_merged_in branch from ab7e327 to 84b10f6 Compare December 1, 2024 06:46
Copy link
Contributor

@runtian-zhou runtian-zhou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will need some more time to double check the bytecode verification logics.

assert ty == locals[#args - i - 1]
"#]
#[gas_type_creation_tier_1 = "closure_ty"]
Invoke(SignatureIndex),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rename it to CallFunctionPointer just to align with the Call instruction we had previously?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that is useful, as it is not really a pointer. I'd be tempted to rename it Apply as a nod to Lisp's nomenclature, but I don't think it really matters here. Does Invoke confuse you?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving it.

@@ -1254,6 +1261,8 @@ pub enum SignatureToken {
Signer,
/// Vector
Vector(Box<SignatureToken>),
/// Function, with n argument types and m result types, and an associated ability set.
Function(Vec<SignatureToken>, Vec<SignatureToken>, AbilitySet),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to use named struct instead of tuple here.

Also for the AbilitySet, I suppose we will need some subtyping rules here in the type checking phase?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing to a struct causes a lot of little changes, hope other reviewers don't mind all the changed code.

Already have it in functions invoke_function and early_bind_function in move-bytecode-verifier/src/type_safety.rs. Do we need more?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using a struct-style enum variant here causes those mysterious Clippy problems. I think it's related to one of the derive macros on SignatureToken declaration (probably dearbitrary::Dearbitrary) lacking handling of struct-style enum variant. In the interest of moving faster, I'm going back to the tuple-style enum variant instead, rather than trying to fix that 3d-party code. :-)

@@ -800,6 +800,9 @@ fn serialize_signature_token_single_node_impl(
binary.push(SerializedType::TYPE_PARAMETER as u8)?;
serialize_type_parameter_index(binary, *idx)?;
},
SignatureToken::Function(..) => {
unimplemented!("serialization of function types")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you planning to fix this part for this PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Coming in a later PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

later

assert ty == locals[#args - i - 1]
"#]
#[gas_type_creation_tier_1 = "closure_ty"]
InvokeFunction(SignatureIndex),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the instructions proposed, we will require closures to have their type arguments being fully provided at LdFunction time, not at invocation time. Is this an intentional design? Asking this because the current dynamic dispatch code (for dispatchable fungible asset) actually performs type binding at call time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? LdFunction does not need to load until use? We store idx/name in a function value, and only when we call Invoke we resolve that index/name and load the function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SignatureIndex here is a static type, and is needed for verifier to allow the stack depth checker to know how many parameters/results are being popped/pushed on the stack, since the stack depth analysis is independent of the type analysis in the verifier (if these 2 analyses were combined, then this need not be included in the instruction, as it should be easy to derive from context; I really don't know why there are bunch of independent passes in the verifier).

Execution will have to look at the function value (1) to get the early-bound parameters to pass, and (2) to get the FunctionHandle/FunctionInstantiation to (a) get its type to figure out the number and type of parameters needed to pop off the stack for this call, and (b) determine which function is being called.

The dynamic types will have to be checked at some point, either when the value is created (LdFunction/EarlyBind/Deserialization) or called (InvokeFunction). If we think it will be called at least once, then earlier type-checking would make sense, but if pulling the function data into cache is the slow part of type checking, we might want to do it at Call time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Marking this replied for bookkeeping.

@@ -17,6 +17,10 @@ fn sig_to_ty(sig: &SignatureToken) -> Option<MoveTypeLayout> {
SignatureToken::U128 => Some(MoveTypeLayout::U128),
SignatureToken::U256 => Some(MoveTypeLayout::U256),
SignatureToken::Vector(v) => Some(MoveTypeLayout::Vector(Box::new(sig_to_ty(v.as_ref())?))),
SignatureToken::Function(..) => {
// TODO(LAMBDA): do we need representation in MoveTypeLayout?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need it here? It's a good question however. MoveTypeLayout is used for ser/de in the VM so it would be dependent on our implementation there?

Copy link
Contributor

@georgemitenkov georgemitenkov Dec 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not need it here (function values cannot be constants), but we do need it otherwise. I have a PR adding it: #15442

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's possible that we could allow function values as constants in the future. But yeah, maybe not worth it now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, for structs it is also possible but we did not do it yet, so None is a way to go :)

func param types match provided parameter types
"#]
#[gas_type_creation_tier_0 = "closure_ty"]
EarlyBindFunction(SignatureIndex, u8),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make more sense to bind by mask? i.e: EarlyBindFunction will only bind one argument at a time and the u8 will be used to specify which argument to bind. This would allow more flexibility?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any function generated for the body of an lambda expression can order free variables before lambda parameters so that it is natural to early-bind the k first parameters. If the user wants to use an existing function with parameters in the wrong order, the user can add a wrapper function as necessary.

The mask idea and the code Wolfgang and I had written to deal with more arbitrary permutations of early/late-bound parameters was very "clever" and therefore hard to review and really an unnecessary security risk.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replied already

..
stack >> arg_k

old_conservative_invocation_mode = conservative_invocation_mode
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this invocation mode?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For LAMBDA MVP, we are playing it very safe and disallowing any call to module m if there is any dynamic call on the stack. This is enforced by a VM state flag conservative_invocation_mode, which is set if we are in a dynamic call.

This is because dynamic calls violate the static dependency guarantees. After we have the MVP implemented we will see how far we can prove dynamic safety, e.g., using access specifiers.

There was an error here in that the dynamic call also needs to always check that its module is not already on the stack. Fixed.

// TODO(LAMBDA) In the future this may change.
let function_handle = self.module.function_handle_at(fh_idx);
let function_acquired_resources = self.function_acquired_resources(function_handle, fh_idx);
if !function_acquired_resources.is_empty() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is a necessary change nor this is safe. Acquire list is only a local concept for now. Imagine a public move function that has acquire list, you can easily create another move module that invokes this function. In that module, the acquire list would be empty. In this sense you would be able to bypass the constraints here easily.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We disallow both reentrancy and nonempty acquires lists for now, so it's definitely safe, if a bit limited. Your foreign move module can't call back into this one, since you're already on the stack.

In the future we can add more general access_specifiers to functions and function types, and check that the dynamic call doesn't exceed the accesses of the caller. I'd rather work that out first, but people are anxious about an MVP deadline.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was being overly cautious here. You're right, as long as we disallow all dynamic reentrancy we can ignore the acquires list of the function here.

Let's discuss post-MVP approaches to loosen constraints later.

Ok(())
}

fn invoke_acquire(&mut self, _offset: CodeOffset) -> PartialVMResult<()> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The strategy I was taking previously is indeed making invoke instruction to be a nil-op but check re-entrancy at runime.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what you mean by "nil-op" here. Did you mean what I have done here?

compare_cross_module_signatures(context, args1, args2, def_module)?;
compare_cross_module_signatures(context, result1, result2, def_module)?;
if ab1 != ab2 {
Err(PartialVMError::new(StatusCode::TYPE_MISMATCH))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add an error message here? i.e., PartialVMError::new(StatusCode::TYPE_MISMATCH).with_message(...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, although:

Do you realize that most type mismatches will show up below, which has no message. Printing the types might be interesting, if we had Display for SignatureToken.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, we can use debug and format via "{:?}"? Any extra information is better than a status code. Imagine you have debug a type mismatch - there are so many places it can come from, it is difficult to bisect, error messages allows us to pinpoint the location (if they are propagated correctly 😄)

Below is some old code I do not know who wrote, but I would love to see error messages as well there.

SignatureToken::MutableReference(boxed) => write!(f, "MutableReference({:?})", boxed),
SignatureToken::TypeParameter(idx) => write!(f, "TypeParameter({:?})", idx),
}
}
}

impl SignatureToken {
/// Returns true if the token is an integer type.
// Returns `true` if the `SignatureToken` is an integer type.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change? Bad rebase?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what happened to the /// but the backquotes make sense here. Not going to rewrite the whole file, though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to have less changed lines in diff, but up to you

@@ -1456,6 +1482,13 @@ impl SignatureToken {
}
}

/// Returns true if the `SignatureToken` is a function.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: personally, I am not a fan of "", if the type name changes, comment can easily be outdated. if the tokenorif the [SignatureToken]` can be better?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment style I used matches the rest of the file (see lines 1363, 1428, 1446, 1453, 1460, 1467). If we want to change this, we should do them all. Changing just 1 detracts from readability.

Suggest an off-line rustdoc style discussion and global style fix. But not in this PR.

Copy link
Contributor

@georgemitenkov georgemitenkov Dec 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree: off-line discussions will lead to nothing because no one will do the scan across all aptos-core to fix wrong/outdated comments or their style. We should fix these things incrementally as we touch the code.

For this specific issue, I am fine with keeping `` for now. But this is bad in my opinion, and leads to things like:

/// `InterpreterImpl` instances can execute Move functions.
///
/// An `Interpreter` instance is a stand alone execution context for a function.
/// It mimics execution on a single thread, with an call stack and an operand stack.
pub(crate) struct InterpreterImpl { ... 

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyone who cares can do the scan and fix. It's much easier to review a style cleanup as an independent PR.

@@ -17,6 +17,10 @@ fn sig_to_ty(sig: &SignatureToken) -> Option<MoveTypeLayout> {
SignatureToken::U128 => Some(MoveTypeLayout::U128),
SignatureToken::U256 => Some(MoveTypeLayout::U256),
SignatureToken::Vector(v) => Some(MoveTypeLayout::Vector(Box::new(sig_to_ty(v.as_ref())?))),
SignatureToken::Function(..) => {
// TODO(LAMBDA): do we need representation in MoveTypeLayout?
Copy link
Contributor

@georgemitenkov georgemitenkov Dec 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not need it here (function values cannot be constants), but we do need it otherwise. I have a PR adding it: #15442

Copy link
Contributor Author

@brmataptos brmataptos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed comments, thanks.

compare_cross_module_signatures(context, args1, args2, def_module)?;
compare_cross_module_signatures(context, result1, result2, def_module)?;
if ab1 != ab2 {
Err(PartialVMError::new(StatusCode::TYPE_MISMATCH))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, although:

Do you realize that most type mismatches will show up below, which has no message. Printing the types might be interesting, if we had Display for SignatureToken.

assert ty == locals[#args - i - 1]
"#]
#[gas_type_creation_tier_1 = "closure_ty"]
Invoke(SignatureIndex),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving it.

SignatureToken::MutableReference(boxed) => write!(f, "MutableReference({:?})", boxed),
SignatureToken::TypeParameter(idx) => write!(f, "TypeParameter({:?})", idx),
}
}
}

impl SignatureToken {
/// Returns true if the token is an integer type.
// Returns `true` if the `SignatureToken` is an integer type.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what happened to the /// but the backquotes make sense here. Not going to rewrite the whole file, though.

@@ -17,6 +17,10 @@ fn sig_to_ty(sig: &SignatureToken) -> Option<MoveTypeLayout> {
SignatureToken::U128 => Some(MoveTypeLayout::U128),
SignatureToken::U256 => Some(MoveTypeLayout::U256),
SignatureToken::Vector(v) => Some(MoveTypeLayout::Vector(Box::new(sig_to_ty(v.as_ref())?))),
SignatureToken::Function(..) => {
// TODO(LAMBDA): do we need representation in MoveTypeLayout?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's possible that we could allow function values as constants in the future. But yeah, maybe not worth it now.

func param types match provided parameter types
"#]
#[gas_type_creation_tier_0 = "closure_ty"]
EarlyBindFunction(SignatureIndex, u8),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replied already

..
stack >> arg_k

old_conservative_invocation_mode = conservative_invocation_mode
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For LAMBDA MVP, we are playing it very safe and disallowing any call to module m if there is any dynamic call on the stack. This is enforced by a VM state flag conservative_invocation_mode, which is set if we are in a dynamic call.

This is because dynamic calls violate the static dependency guarantees. After we have the MVP implemented we will see how far we can prove dynamic safety, e.g., using access specifiers.

There was an error here in that the dynamic call also needs to always check that its module is not already on the stack. Fixed.

// TODO(LAMBDA) In the future this may change.
let function_handle = self.module.function_handle_at(fh_idx);
let function_acquired_resources = self.function_acquired_resources(function_handle, fh_idx);
if !function_acquired_resources.is_empty() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was being overly cautious here. You're right, as long as we disallow all dynamic reentrancy we can ignore the acquires list of the function here.

Let's discuss post-MVP approaches to loosen constraints later.

Ok(())
}

fn invoke_acquire(&mut self, _offset: CodeOffset) -> PartialVMResult<()> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what you mean by "nil-op" here. Did you mean what I have done here?

@brmataptos brmataptos force-pushed the 11-21-lambda-compiler-completion branch from 02fb9ec to f2281b4 Compare December 4, 2024 07:23
@brmataptos brmataptos force-pushed the 11-23-vm_changes_merged_in branch from 1d8a057 to d382ca0 Compare December 4, 2024 07:23
@brmataptos brmataptos force-pushed the 11-23-vm_changes_merged_in branch from d382ca0 to 2fa8bde Compare December 4, 2024 07:54
@brmataptos brmataptos force-pushed the 11-21-lambda-compiler-completion branch from d329f76 to 2734528 Compare December 4, 2024 22:49
@brmataptos brmataptos requested a review from areshand as a code owner December 4, 2024 22:49
@brmataptos brmataptos force-pushed the 11-23-vm_changes_merged_in branch from 2fa8bde to d7ded7a Compare December 4, 2024 22:50
Base automatically changed from 11-21-lambda-compiler-completion to main December 4, 2024 23:23
@brmataptos brmataptos force-pushed the 11-23-vm_changes_merged_in branch from d7ded7a to 844871b Compare December 6, 2024 03:39
@brmataptos
Copy link
Contributor Author

@georgemitenkov I've added feature gating and tests to verify that it's being done. I see some is done in e2e-tests but I can't do it there because the compiler doesn't have complete codegen yet. I've added a test in third_party/move/move-bytecode-verifier/bytecode-verifier-tests/src/unit_tests/feature_function-values_tests.rs which manually constructs the bytecode file. Fun!

PTAL.

wrwg and others added 6 commits December 7, 2024 14:12
[move-vm] Types and opcodes for closures

This PR implements the extensions needed in the file format for representing closures:

- The type (SignatureToken) has a new variant `Function(args, result, abilities)` to represent a function type
- The opcodes are extendeed by operations `ClosPack`, `ClosPackGeneric`, and `ClosEval`

This supports bit masks for specifyinng which arguments of a function are captured when creating a closure.

Bytecode verification is extended to support the new types and opcodes. The implementation of consistency, type, and reference safety should be complete. What is missing are:

- File format serialization
- Interpreter and value representation
- Value serialization
- Connection of the new types with the various other type representations
…rser work:

- `LdFunction`, `LdFunctionGeneric` = generate a closure from a defined function
- `EarlyBind` = bind more arguments to any closure
- `Invoke` = call a closure with final arguments
Add some description/semantics to `file_format.rs` to better describe
the implementation which will be needed.

Get rid of complex Mask calculations in favor of simpler early bninding of
`k` initial arguments.

Hopefully keep all bytecode verifier operations working the same.
…simplify AcuiresVerifier::ld_function_acquire - comments cleanup, especially file_format semantics for new bytecodes
…ed by verifier. adding tests required serialization/desrialization of SignatureToken::Function.
…nce that yields Clippy warnings, apparently due to a hidden bug in one of the derived trait macros invoked here (dearbitrary)
@brmataptos brmataptos force-pushed the 11-23-vm_changes_merged_in branch from fdd0ca9 to e8107d7 Compare December 7, 2024 22:13
Copy link
Contributor

@georgemitenkov georgemitenkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Full pass. Do we have a way to test the verifier changes? I saw tests for feature verification, but I do not know where we test other passes.

@@ -17,6 +17,10 @@ fn sig_to_ty(sig: &SignatureToken) -> Option<MoveTypeLayout> {
SignatureToken::U128 => Some(MoveTypeLayout::U128),
SignatureToken::U256 => Some(MoveTypeLayout::U256),
SignatureToken::Vector(v) => Some(MoveTypeLayout::Vector(Box::new(sig_to_ty(v.as_ref())?))),
SignatureToken::Function(..) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you return None here (one less TODO)? We do not expect function value constant any time soon. Let's also add a unit test for that?

| Reference(_)
| MutableReference(_)
| Vector(_)
| Function { .. } => (),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be Function(..)?

@@ -683,8 +685,20 @@ impl<'a> BoundsChecker<'a> {

for ty in ty.preorder_traversal() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we rename this ty to tok? It is annoying that we have tags, tokens and types and sometimes even layouts all as types....

| Address
| Bool
| Vector(_)
| Function { .. }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function(..)?

@@ -1010,6 +1010,13 @@ impl AbilitySet {
pub fn into_u8(self) -> u8 {
self.0
}

pub fn to_string_concise(self) -> String {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this adds much value. Display implementation is the same, only adds 2 spaces. And having multiple conversions into string is just confusing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't want to change all existing outputs to look different and make review difficult. This is used for ability display just for functions.

if &closure_ty != expected_ty {
return Err(verifier
.error(StatusCode::CALL_TYPE_MISMATCH_ERROR, offset)
.with_message("closure type mismatch".to_owned()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add debug formats of types "{:?}, expected_ty, ..." Otherwise the message is useless?

safe_assert!(false);
unreachable!()
};
if !abilities.is_subset(closure_abilities) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure this works? Before you do &closure_ty != expected_ty, but that compares abilities by equality as well? I think with match you can compare args/return types by equality, and check subset here.

safe_assert!(false);
unreachable!()
};
if !abilities.is_subset(closure_abilities) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is duplicated with early bind, can we factor the above into a helper function?

return map_err(Err(PartialVMError::new(
StatusCode::NUMBER_OF_ARGUMENTS_MISMATCH,
)
.with_message("in EarlyBindFunction".to_string())));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make the message more detailed? What is received count, what is the expected, etc.

checked_early_bind_insts.entry((*idx, *count))
{
// Note non-function case is checked in `verify_fun_sig_idx` above.
if let Some(SignatureToken::Function(args, ..)) =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you can use if matches!(Some(...) if condition) { .. }

@georgemitenkov
Copy link
Contributor

@brmataptos tests are failing? I see these on CI:

2024-12-09T04:44:56.0568444Z thread 'garbage_inputs' panicked at third_party/move/move-binary-format/serializer-tests/tests/serializer_tests.rs:45:14:
2024-12-09T04:44:56.0569258Z deserialization should work: PartialVMError { major_status: UNKNOWN_OPCODE, sub_status: None, message: None, exec_state: None, indices: [], offsets: [] }

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants