version 1.4
This document describes the Move specification language, a subset of the Move language which supports specification of the behavior of Move programs. Throughout this document, we abbreviate this subset of Move MSL.
MSL plays together with the Move Prover, a tool which can statically verify the correctness of MSL specifications against Move programs. In contrast to traditional testing, verification of MSL is exhaustive and holds for all possible inputs and global states of a Move module or transaction script. At the same time, it is fast and automated enough that it can be used at a similar place in the developer workflow where tests are (for example, for qualification of pull requests in continuous integration).
While the Move programming language at this point is stable, the subset represented by MSL should be considered evolving. This has no impact on platform stability, since MSL is not running in production, but is used for offline quality assurance, where it is continuously improved for evolving objectives.
This document describes the language only; see here for how to use the Move prover tool. The reader is expected to have basic knowledge of the Move language, as well as basic principles of pre/post condition specifications (see e.g. this article). For examples of specifications, we refer to the [Diem framework documentation][FRAMEWORK] which has specifications embedded.
- Expressions
- Specifications
- Separating Specifications
- Pragmas and Properties
- Pre and Post State
- Helper Functions
- Let Bindings
- Expressiveness
Expressions in MSL are a subset of Move program expressions plus a set of additional constructs, as discussed in the following sections.
The type system of MSL is similar to that of Move, with the following differences:
- There are two types of encodings for integer types:
num
andbv
(bit vector). If an integer (either a constant or a variable) does not involve in any bitwise operations directly or indirectly, regardless of its type in Move (u8
,u16
,u32
,u64
,u128
andu256
), it is treated as the same type. In specifications, this type is callednum
, which is an arbitrary precision signed integer type. When MSL refers to a Move name which represents anu8
or such, it will be automatically widened tonum
. This allows writing MSL expressions likex + 1 <= MAX_U128
orx - y >= 0
without needing to worry about overflow or underflow. Different fromnum
,bv
cannot and does not need to be explicitly used in specifications: if an integer involves in bitwise operations such as&
,|
or^
, it will be automatically encoded asbv
at the backend. Moreover, abv
integer has a fixed precision, which is consistent with its precision in Move (bv8
,bv16
,bv32
,bv64
,bv128
andbv256
). Note that, in general usingbv
is not so efficient asnum
in the SMT solver. Consequently, the prover has some restrictions when using bitwise operations, which will stated in detail below. - The Move types
&T
,&mut T
, andT
are considered equivalent for MSL. Equality is interpreted as value equality. There is no need to worry about dereferencing a reference from the Move program: these are automatically dereferenced as needed. This simplification is possible because MSL cannot modify values from a Move program and the program cannot directly reason about reference equality (which eliminates the need for doing so in MSL). (Note there is also a restriction in expressiveness coming with this, namely for functions which return&mut T
, however, this is rarely hit in practice, and there are workarounds.) - There is the additional type
type
which is the type of all types. It can only be used in quantifiers. - There is the additional type
range
which represents an integer range (and the notationn..m
to denote a value).
Name resolution works similar to the Move language. use
declarations can introduce aliases for
imported names. MSL functions and variable names must start with a lower case letter. Schema names
are treated like types and must start with a capital letter (schemas are a new named construct
discussed later).
Move functions, MSL functions, Move types, and schemas all share the same namespace, and are
therefore unambiguous if aliased via a Move use
clause. Because of the common name space, an MSL
function cannot have the same name than a Move function. This is often handled via the convention to
prefix MSL function as in
spec_has_access
when the related Move function is called has_access
.
All Move operators are supported in MSL, except &
, &mut
, and *
(dereference).
In addition to the existing operators, vector subscript v[i]
, slicing v[i..j]
, and range
construction
i..j
are supported (the type of integer ranges is a new builtin type called range
). Moreover,
boolean implication p ==> q
is supported as a more intuitive form than !p || q
.
In MSL expressions, functions can be called like in Move. However, the callee must either be a MSL function, or a pure Move function.
Move functions are considered pure if they do not modify global state and do not use Move expression features which are not supported in MSL expressions (as defined here).
There is one extension. If a Move function definition contains a direct assert
, this will be
ignored when it is called from an MSL expression, and the function will be considered pure. For
example:
fun get(addr: address): &T { assert(exists<T>(addr), ERROR_CODE); borrow_global<T>(addr) }
This function is pure and can be called from an MSL expression. The assertion will be ignored, and the function will be interpreted as:
spec fun get(addr: address): T { global<T>(addr) }
This is justified by that MSL has partial semantics.
Limited sequencing of the form { let x = foo(); x + x }
is supported, as well as if-then-else.
Other statement forms of the Move language are not supported.
Pack expressions are supported. Unpack expressions are currently not supported.
Universal and existential quantification are supported. The general form is
forall <binding>, ..., <binding> [ where <exp> ] : <exp>
exists <binding>, ..., <binding> [ where <exp> ] : <exp>
- Bindings can either be of the form
name: <type>
orname in <exp>
. For the second form, the expression currently must either be arange
or a vector. - The optional constraint
where <exp>
allows to restrict the quantified range.forall x: T where p: q
is equivalent toforall x: T : p ==> q
andexists x: T where p: q
is equivalent toexists x: T : p && q
.
Notice that it is possible to quantify over types. For example:
forall t: type, addr: address where exists<R<t>>(addr): exists<T<t>>(addr)
The choice operator allows to select a value which satisfies a predicate:
choose a: address where exists<R>(a) && global<R>(a).value > 0
If the predicate is not satisfiable, the result of the choice will be undetermined (see discussion of partial semantics).
The choice also comes in a form to select the minimal value from a set of integers, as in:
choose min i: num where in_range(v, i) && v[i] == 2
In the specification language, we can use the same syntax (e as T)
to cast an expression e
with one integer type to
T
, an integer type of another size.
Shift operators <<
and >>
are supported in the specification language and both of them have the same semantics with the Move language.
As for abort, if a value v
has width n
, then v << m
or v >> m
will abort if m >= n
.
Move programs using bitwise operators &
, |
and ^
can be verified in the prover and these operators are also supported in the specification language.
Due to encoding and efficiency issues, using bitwise operators has more caveats:
-
Integers involved in bitwise operations are encoded as
bv
types at the backend and two encodings of integers are not compatible. For instance, if a variablev
involves in a bitwise operation such asv & 2
orv = a ^ b
, then when it is used in an arithmetic operationv * w
or a shift operationv << w
,w
will be implicitly cast to abv
type in the Move program. However, the specification language does not support implicit type cast so users must explicitly use the builtin functionint2bv
in the specification:v << int2bv(w)
. Not that since eachbv
type has a fixed length (from 8 to 256), values with typenum
cannot be converted intobv
. -
Verification of
bv
types is not efficient and may lead to timeout. As a result, users may prefer isolating bitwise operations from other operations and not usingint2bv
if possible. Moreover, users need to use pragmas to explicitly specify which integer-typed function arguments or struct fields will be used in bitwise computations:
struct C has drop {
a: u64,
b: u64
}
spec C {
// b, the second field of C, will be of bv type
pragma bv=b"1";
}
public fun foo_generic<T>(i: T): T {
i
}
spec foo_generic {
// The first parameter will be of bv type if T is instantiated as a number type
pragma bv=b"0";
// The first return value will be of bv type if T is instantiated as a number type
pragma bv_ret=b"0";
}
public fun test(i: C): u64 {
let x1 = foo_generic(C.b);
x1 ^ x1
}
spec test {
// Explicit type cast is mandatory for generating correct boogie program
ensures result == (0 as u64);
}
Note that if arguments or fields of a generic function or struct are specified with bv
types,
they will be of bv
types in all instances of the function or the struct when the instantiated type is an integer type.
- Values with integer types in vectors and tables can be encoded as
bv
types; indices and keys in tables cannot bebv
types for now. Using other types will lead to internal errors.
MSL supports a number of builtin constants and functions. Most of them are not available in the Move language:
MAX_U8: num
,MAX_U64: num
,MAX_U128: num
returns the maximum value of the corresponding type.exists<T>(address): bool
returns true if the resource T exists at address.global<T>(address): T
returns the resource value at address.len<T>(vector<T>): num
returns the length of the vector.update<T>(vector<T>, num, T>): vector<T>
returns a new vector with the element replaced at the given index.vec<T>(): vector<T>
returns an empty vector.vec<T>(x): vector<T>
returns a singleton vector.concat<T>(vector<T>, vector<T>): vector<T>
returns the concatenation of the parameters.contains<T>(vector<T>, T): bool
returns true if element is in vector.index_of<T>(vector<T>, T): num
returns the index of the element in the vector, or the length of the vector if it does not contain it.range<T>(vector<T>): range
returns the index range of the vector.in_range<T>(vector<T>, num): bool
returns true if the number is in the index range of the vector.in_range<T>(range, num): bool
returns true if the number is in the range.update_field(S, F, T): S
updates a field in a struct, preserving the values of other fields, whereS
is some struct,F
the name of a field inS
, andT
a value for this field.old(T): T
delivers the value of the passed argument at point of entry into a Move function. This is allowed inensures
post-conditions, inline spec blocks (with additional restrictions), and certain forms of invariants, as discussed later.TRACE(T): T
is semantically the identity function and causes visualization of the argument's value in error messages created by the prover.int2bv(v)
explicitly converts an integerv
into itsbv
representation.bv2int(b)
explicitly converts a 'bv' integer 'b' into thenum
representation. However it is not encouraged to use it due to efficiency issue.
Builtin functions live in an unnamed outer scope of a module. If the module defines a function len
then this definition will shadow that of the according builtin function. To access the builtin
function in such a situation, one can use the notation ::len(v)
.
In MSL, expressions have a partial semantics. This is in contrast to Move program expressions, which have a total semantics, since they either deliver a value or abort.
An expression e[X]
which depends on some some variables X
may have a known interpretation for
some assignments to variables in X
, but unknown for others. An unknown interpretation for a
sub-expression causes no issue if its value is not needed for the overall expression result.
Therefore it does not matter if we say y != 0 && x / y > 0
or x / y > 0 && y != 0
: boolean operators are commutative.
This basic principle inherits to higher-level language constructs. For example, in specifications,
it does not matter in which order conditions are
supplied: aborts_if y != 0; ensures result == x / y;
is the same as
ensures result == x / y; aborts_if y != 0;
. Also, aborts_if P; aborts_if Q;
is the same
as aborts_if Q || P
.
Moreover, the principle of partial semantics is inherited to specification helper functions , which behave transparently. Specifically, inlining those functions is equivalent to calling them ( call-by-expression parameter passing semantics).
Specifications are contained in so-called specification blocks (abbreviated spec block) that can appear as module members and inside Move functions. The various types of spec blocks are shown below, and will be discussed in subsequent sections.
module M {
resource struct Counter {
value: u8,
}
public fun increment(a: address) acquires Counter {
let r = borrow_global_mut<Counter>(a);
spec {
// spec block targeting this code position
...
};
r.value = r.value + 1;
}
spec increment {
// spec block targeting function increment
...
}
spec Counter {
// spec block targeting struct Counter
...
}
spec schema Schema {
// spec block declaring a schema
...
}
spec fun f(x: num): num {
// spec block declaring a helper function
...
}
spec module {
// spec block targeting the whole module
...
}
}
Apart of spec blocks inside Move functions, the textual position of spec block is irrelevant. Also, a spec block for a struct, function, or module can be repeated multiple times, accumulating the content.
Instead of putting specifications into the same module as the regular Move definitions, one can also put them into a separate "specification" module, which can live in the same or a different file:
module M {
...
}
spec M {
spec increment { .. }
}
The syntax of a specification module is the same as for a regular module, however, Move functions and structures are not allowed.
A specification module must be compiled together with the Move module it is targeting, and cannot be compiled and verified standalone.
In case Move definitions are far apart (e.g. in different files), it is possible to augment the specification of a Move function with a signature of this function, to give sufficient context to understand the specification. This syntax is optionally enabled in regular and in specification modules:
public fun increment(a: address) acquires Counter { .. }
...
spec increment(a: address) { .. }
Pragmas and properties are a generic mechanism to influence interpretation of specifications. They are also an extension point to experiment with new concepts before they come part of the mainstream syntax. Here we give a brief introduction into their general syntax; individual instances are discussed later.
The general form of a pragma is:
spec .. {
pragma <name> = <literal>;
}
The general form of a property is:
spec .. {
<directive> [<name> = <literal>] <content>; // ensures, aborts_if, include, etc..
}
The <literal>
can be any value supported by MSL (or the Move language). A value assignment can
also be omitted, in which case a default is used. For example, it is common to use pragma option;
as a shortcut for
pragma option = true;
.
Instead of a single pragma or property, a list can also be provided, as
in invariant [global, isolated] P
.
A pragma in a module spec block sets a value which applies to all other spec blocks in the module. A pragma in a function or struct spec block can override this value for the function or struct. Furthermore, the default value of some pragmas can be defined via the prover configuration.
As an example, we look at the verify
pragma. This pragma is used to turn verification on or off.
spec module {
pragma verify = false; // By default, do not verify specs in this module ...
}
spec increment {
pragma verify = true; // ... but do verify this function.
...
}
A number of pragmas control general behavior of verification. Those are listed in the table below.
Name | Description |
---|---|
verify |
Turns on or off verification. |
intrinsic |
Marks a function to skip the Move implementation and use a prover native implementation. This makes a function behave like a native function even if it not so in Move. |
timeout |
Sets a timeout (in seconds) for function or module. Overrides the timeout provided by command line flags. |
verify_duration_estimate |
Sets an estimate (in seconds) for how long the verification of function takes. If the configured timeout is less than this value, verification will be skipped. |
seed |
Sets a random seed for function or module. Overrides the seed provided by command line flags. |
The following properties control general behavior of verification:
Name | Description |
---|---|
[deactivated] |
Excludes the associated condition from verification. |
Multiple conditions in spec blocks work with a pre and post state, relating them to each other.
Function specifications are one example of this: in the ensures P
condition, the pre-state (at
function entry) and the post-state (at function exit) are related via the predicate P
. However,
the concept is more general and also applied for invariants, where the pre-state is before and
post-state after a global update.
In contexts where a pre/post state is active, expressions are evaluated implicitly in the
post-state. To evaluate an expression in a pre-state, one uses the builtin function old(exp)
,
which evaluates its parameter in the pre-state and returns its value. It is important to understand
that every sub-expression in exp
is computed in the pre-state as well, including calls to helper
functions.
The 'state' in question here consists of assignments to global resource memory, as well as to any
parameters of the function of type &mut T
. Examples:
fun increment(counter: &mut u64) { *counter = *counter + 1 }
spec increment {
ensures counter == old(counter) + 1;
}
fun increment_R(addr: address) {
let r = borrow_global_mut<R>(addr);
r.value = r.value + 1;
}
spec increment_R {
ensures global<R>(addr).value == old(global<R>(addr).value) + 1;
}
MSL allows defining helper functions. Those functions can then be used in expressions.
Helper functions are defined using the following syntax:
spec fun exists_balance<Currency>(a: address): bool { exists<Balance<Currency>>(a) }
As seen in the example, helper functions can be generic. Moreover, they can access global state.
Definitions of helper functions are neutral regards whether they apply to
a pre- or post-state. They are evaluated in the currently active state. For
instance, in order to see whether a balance existed in the pre-state, one
uses old(exists_balance<Currency>(a))
. Consequently, the expression old(..)
is not allowed
within the definition of a helper function.
Helper functions are partial functions; see discussion of partial semantics.
A helper function can be defined as uninterpreted by simply omitting its body:
spec fun something(x: num): num;
An uninterpreted function is one the prover is allowed to assign some arbitrary meaning to, as long as it is consistent within a given verification context. Uninterpreted functions are a useful tool for abstraction in specifications (see also here).
The meaning of helper functions can be further constraint by using axioms. Currently, axioms must be contained in module spec blocks:
spec module {
axiom forall x: num: something(x) == x + 1;
}
Axiom should be used with care as they can introduce unsoundness in the specification logic via
contradicting assumptions. The Move prover supports a smoke test for detecting unsoundness via
the --check-inconsistency
flag.
A spec block can contain let bindings which introduce names for expressions:
fun get_R(account: signer): R { ... }
spec get_R {
let addr = signer::spec_address_of(account);
aborts_if addr != ROOT;
ensures result == global<R>(addr);
}
In a spec block which has a pre and post state (like a function specification), the let name = e
form will evaluate e
in the pre-state. In order to evaluate an expression in the post state, one
uses let post name = e
. In the rhs expression of this form, one can use old(..)
to refer to the
pre-state.
The aborts_if
condition is a spec block member which can appear only in a function context. It
specifies conditions under which the function aborts.
In the following example, we specify that the function increment
aborts if the Counter
resource
does not exist at address a
(recall that a
is the name of the parameter of increment
).
spec increment {
aborts_if !exists<Counter>(a);
}
If a function has more than one aborts_if
condition, those conditions are or-ed with each other.
The evaluation of the combined aborts condition (or-ed from each individual condition) depends on
the value of the pragma aborts_if_is_partial
. If this value is false (the default), the function
aborts if and only if the combined aborts condition is true. In this case, the above aborts
specification for increment
will lead to a verification error, since there are additional
situations where increment
can abort, namely if incrementing Counter.value
would lead to an
overflow. To fix this, the specification can be completed like this:
spec increment {
pragma aborts_if_is_partial = false; // This is the default, but added here for illustration.
aborts_if !exists<Counter>(a);
aborts_if global<Counter>(a).value == 255;
}
If the value of aborts_if_is_partial
is true, the combined aborts condition (the or-ed individual
conditions)
only *
imply* that the function aborts. Formally, if A
is the combined aborts condition, then
with aborts_if_is_partial = true
, we have A ==> function_aborts
, otherwise we have
A <==> function_aborts
. Therefore, the following does verify:
spec increment {
pragma aborts_if_is_partial = true;
aborts_if !exists<Counter>(a);
}
Note that there is a certain risk in setting
aborts_if_is_partial
to true, and best practice is to avoid it in specifications of public functions and transaction scripts once those are considered finalized. This is because changing the code after finalization of the spec can add new (non-trivial, undesired) abort situations which the original specification did not anticipate, but which will nevertheless silently pass verification.
If no aborts condition is specified for a function, abort behavior is unspecified. The function may
or may not abort, and verification will not raise any errors, whether aborts_if_is_partial
is set
or not. In order to state that a function never aborts, use aborts_if false
. One can use the
pragma aborts_if_is_strict
to change this behavior; this is equivalent to as if an aborts_if false
has been added to each
function which does not have an explicit aborts_if
clause.
The aborts_if
condition can be augmented with a code:
fun get_value(addr: address): u64 {
aborts(exists<Counter>(addr), 3);
borrow_global<Counter>(addr).value
}
spec get_value {
aborts_if !exists<Counter>(addr) with 3;
}
It is a verification error if the above function does not abort with code 3
under the given
condition.
In order to specify a direct VM abort, one can use the special constant EXECUTION_FAILURE
:
fun get(addr: address): &Counter acquires Counter {
borrow_global<Counter>(addr)
}
spec get {
aborts_if !exists<Counter>(addr) with EXECUTION_FAILURE;
}
This same constant can be used for all other VM failures (division by zero, overflow, etc.)
The aborts_with
condition allows specifying with which codes a function can abort, independent
under which condition. It is similar to a 'throws' clause in languages like Java.
fun get_one_off(addr: address): u64 {
aborts(exists<Counter>(addr), 3);
borrow_global<Counter>(addr).value - 1
}
spec get_one_off {
aborts_with 3, EXECUTION_FAILURE;
}
If the function aborts with any other or none of the specified codes, a verification error will be produced.
The aborts_with
condition can be combined with aborts_if
conditions. In this case,
the aborts_with
specifies any other codes with which the function may abort, in addition to the ones given in
the aborts_if
:
spec get_one_off {
aborts_if !exists<Counter>(addr) with 3;
aborts_with EXECUTION_FAILURE;
}
If this is not wanted, and the aborts_with
should be independent of aborts_if
, one can use the
property
[check]
:
spec get_one_off {
aborts_if !exists<Counter>(addr) with 3;
aborts_if global<Counter>(addr) == 0 with EXECUTION_FAILURE;
aborts_with [check] 3, EXECUTION_FAILURE;
}
TODO:
[check]
property is currently not longer implemented
The requires
condition is a spec block member which postulates a pre-condition for a function. The
prover will produce verification errors for functions which are called with violating
pre-conditions.
A requires
is different from an aborts_if
: in the latter case, the function can be called, and
any aborts it produces will be propagated to the caller context. In the requires
case, the prover
will not allow the function to be called in the first place. Nevertheless, the function can still
be called at runtime if verification is skipped. Because of this, requires
are rare in Move
specifications, and
aborts_if
are more common. Specifically, requires
should be avoided for public APIs.
An example of requires
is the following:
spec increment {
requires global<Counter>(a).value < 255;
}
The ensures
condition postulates a post-condition for a function which must be satisfied when the
function terminates successfully (i.e. does not abort). The prover will verify each ensures
to
this end.
An example for the ensures
condition is the following:
spec increment {
ensures global<Counter>(a) == old(global<Counter>(a)) + 1;
}
Within the expression for the ensures
condition, one can use the old
function,
as discussed here.
The modifies
condition is used to provide permissions to a function to modify global storage. The
annotation itself comprises a list of global access expressions. It is specifically used together
with opaque function specifications.
resource struct S {
x: u64
}
fun mutate_at(addr: address) acquires S {
let s = borrow_global_mut<S>(addr);
s.x = 2;
}
spec mutate_at {
pragma opaque;
modifies global<S>(addr);
}
In general, a global access expression has the form global<type_expr>(address_expr)
. The
address-valued expression is evaluated in the pre-state of the annotated function.
fun read_at(addr: address): u64 acquires S {
let s = borrow_global<S>(addr);
s.x
}
fun mutate_S_test(addr1: address, addr2: address): bool acquires T {
assert(addr1 != addr2, 43);
let x = read_at(addr2);
mutate_at(addr1); // Note we are mutating a different address than the one read before and after
x == read_at(addr2)
}
spec mutate_S_test {
aborts_if addr1 == addr2;
ensures result == true;
}
In the function mutate_S_test
, the assertion in the spec block is expected to hold. A benefit of
the modifies specification on mutate_at
is that this assertion can be proved whether or
not mutate_at
is inlined.
If the modifies annotation is omitted on a function, then that function is deemed to have all
possible permissions for those resources it may modify during its execution. The set of all
resources that may be modified by a function is obtained via an interprocedural analysis of the
code. In the example above, mutate_S_test
does not have a modifies specification and modifies
resource S
via the call to mutate_at
. Therefore, it is considered to have modified S
at any
possible address. Instead, if the programmer adds modifies global<S>(addr1)
to the specification of mutate_S_test
, then the call to mutate_at
is checked to make sure that
modify permissions granted to mutate_S_test
cover the permissions it grants to mutate_at
.
The invariant condition can be applied on structs and on global level.
The invariant
condition on a function is simply a shortcut for a requires
and ensures
with the
same predicate.
Thus the following spec block:
spec increment {
invariant global<Counter>(a).value < 128;
}
... is equivalent to:
spec increment {
requires global<Counter>(a).value < 128;
ensures global<Counter>(a).value < 128;
}
When the invariant
condition is applied to a struct, it expresses a well-formedness property of
the struct data. Any instance of this struct which is currently not mutated will satisfy this
property (with exceptions as outlined below).
For example, we can postulate an invariant on our counter that it never must exceed the value of 127:
spec Counter {
invariant value < 128;
}
A struct invariant is checked by the prover whenever the struct value is constructed (packed). While
the struct is mutated (e.g. via a &mut Counter
) the invariant does not hold (but see exception
below). In general, we consider mutation as an implicit unpack, and end of mutation as a pack.
The Move language semantics unambiguously identifies the point when mutation ends and starts. This follows from the borrow semantics of Move, and includes mutation via an enclosing struct. (The mutation of an inner struct ends when the mutation of the root struct where mutation started ends.)
There is one exception to this rule. When a mutable reference to a struct declared in module M is passed into a * public* function of M which does by itself not return any other mutable reference (which could be borrowed from the input parameter), we treat this parameter as "packed". That means, on function entry, we will unpack it and on function exit we will pack again, enforcing the invariant. This reflects that in Move, struct data can only be mutated within the module which declares the struct, so for an outside caller of the public function, the mutable reference can actually not be mutated unless by calling public functions of module M again. It is a significant simplification of the verification problem to exploit this in the semantics.
A global invariant appears as a member of module. It can express a condition over the global state
of the Move program, as represented by resources stored in memory. For example, the below invariant
states that a Counter
resource stored at any given address can never be zero:
module M {
invariant forall a: addr where exists<Counter>(a): global<Counter>(a).value > 0;
}
A global invariant is assumed to hold when data is read from the global state, and is asserted (and may fail to verify) at the moment the state is updated. For example, the below function will never abort with arithmetic underflow because the counter value is always greater zero; however, it will create a verification error since the counter can drop to zero:
fun decrement_ad(addr: address) acquires Counter {
let counter = borrow_global_mut<Counter>(addr);
let new_value = counter.value - 1; // Will not abort because counter.value > 0
*counter.value = new_value; // Fails verification since value can drop to zero
}
There are times when a global invariant holds almost everywhere, except for a brief interval inside a function. In current move code, this often occurs when something (e.g. an account) is being setup and several structs are published together. Almost everywhere, an invariant holds that all of the structs are published or none of them are. But the code that publishes the structs must do so sequentially. While they are being published, there will be a point where some are published and others are not.
In order to verify invariants that hold except during small regions, there is a feature to allow users to disable invariants temporarily. Consider the following code fragment:
fn setup() {
publish1();
publish2();
}
}
where publish1
and publish2
publish two different structs, T1
and T2
at address a
.
module M {
invariant [global] exists<T1>(a) == exists<T2>(a)
}
As written, the prover will report that the invariant is violated
after the call to publish1
and before the call to publish2
.
If either of publish1
or publish2
is without the other, the prover
will also report a violation of the invariant.
By default, a global invariant is checked immediately after the instruction I
that touches the resources mentioned in the global invariant.
The [suspendable]
attribute (at the invariant side) together with two pragmas
(specified in function spec block) provides fine-grained control on where we hope this invariant to be checked:
disable_invariants_in_body
: the invariant will be checked at the end of the function whereI
resides.delegate_invariants_to_caller
: the invariant will be checked by all callers of the function whereI
resides.
For the example above, we can add the pragma disable_invariants_in_body
:
spec setup {
pragma disable_invariants_in_body;
}
which says that invariants not required to hold while setup
is executing,
but are assumed to hold on entry to and exit from setup
.
This pragma changes the Prover's behavior. The invariants are assumed on
entry to setup
, but not proved during or after publish1
and
publish2
. Instead, all invariants that could be invalidated in the
body of setup
are asserted and proved at the point of return from
setup
. A consequence of this processing is that the user may need
to provide stronger post-conditions on publish1
and publish2
to
make it possible to prove the invariants on exit from setup
.
Another consequence of this processing is that invariants cannot
safely be assumed to hold during the execution of publish1
and
publish2
(unless nothing in the body of setup
changes state
mentioned in the invariant). Therefore, if proving a post-condition
requires the invariant to be assumed, the post-condition will fail.
In the example, invariants hold at the call sites of setup
, but not
in the body. For publish1
, invariants don't necessarily hold at the
call site or in the body of the function. In the example, that
behavior is implied because publish1
is called in a context where
invariants are disabled.
When invariants are disabled in setup
in the above example, the
prover cannot assume them on entry to publish1
and publish2
, and
should not try to prove them on exit from those functions. The Prover
would have the same behavior for any functions called by publish1
or
publish2
. The Prover automatically adopts this behavior when
invariants are disabled in a calling function, but it is possible for
the user to declare that a function be treated like publish1
.
For example, if publish2
is only called from the setup function above,
and we did not disable invariants in setup
, we could achieve a similar
effect by using the pragma delegate_invariants_to_caller
, instead.
spec setup {
pragma delegate_invariants_to_caller;
}
This would only be legal if setup
is a private or public (friend)
function.
The difference between this and disabling invariants in setup
is that the
invariants would not be assumed at the beginning of setup
and would be proved
after setup
returns at each site where it is called.
While both pragmas disable invariants in the body of a function, the
difference is that disable_invariants_in_body
assumes invariants on
entry and proves them on exist, while delegate_invariants_to_caller
does neither.
There are some limitations on how these pragmas can be used.
disable_invariants_in_body
cannot be declared for functions where
invariants are delegated to a caller, either explicitly via the pragma
or implicitly because the function is called in a context where
invariants have been disabled. (This restriction is to ensure
consistent processing, because on pragma assumes that invariants hold
in the calling context and the other does not). Second, it is illegal
for a public or script function to delegate invariant checking to its
callers (since the Prover does not know all the call sites), unless
the function cannot possibly invalidate an invariant because it
doesn't change any of the state mentioned in exists
and global
expressions appearing in the invariant.
The update
form of a global invariant allows to express a relation
between pre-state and post-state of a global state update. For example, the
following invariant states that the counter must decrease monotonically whenever it is updated:
module M {
invariant update [global] forall a: addr where old(exists<Counter>(a)) && exists<Counter>(addr):
global<Counter>(a).value <= old(global<Counter>(a));
}
A global invariant can be marked as [isolated]
to indicate that it is not relevant for proving
other properties of the program. An isolated global invariant will not be assumed when the related
global state is read. It will only be assumed before the state is updated, to help prove that the
invariant still holds after the update. This feature is for improving performance in situations
where there are many global invariants but there have no direct influence on verification.
Certain usage of global invariants leads to verification problems which cannot be checked in a modular fashion. "Modular" here means that a module can be verified standalone and proven to be universally correct in all usage contexts (if preconditions are met).
A non-modular verification problem may arise if a global invariant refers to state from multiple
modules. Consider a situation where module M1
uses module M2
, and M1
contains the following
invariant, with the helper function condition
referring to global state of each respective module:
module M1 {
invariant M1::condition() ==> M2::condition();
}
When we verify M1
standalone, the Move prover will determine that it also needs to verify
functions in M2
, namely those which update the M2 memory such that the invariant in M1 can fail.
TODO: document mechanism and pragmas to defer invariant verification for code sections
A spec block might also occur anywhere an ordinary Move statement block can occur. Here is an example:
fun simple1(x: u64, y: u64) {
let z;
y = x;
z = x + y;
spec {
assert x == y;
assert z == 2*x;
}
}
In such inline spec blocks, only a subset of conditions are permitted:
assume
andassert
statements are allowed in any code locations- loop
invariant
statements are allowed only in code locations that represent loop headers.
An assert statement inside a spec block indicates a condition that must hold when control reaches
that block. If the condition does not hold, an error is reported by the Move Prover. An assume
statement, on the other hand, blocks executions violating the condition in the statement. The
function simple2
shown below is verified by the Move Prover. However, if the first spec block
containing the assume statement is removed, Move Prover will show a violating to the assert
statement in the second spec block.
fun simple2(x: u64, y: u64) {
let z: u64;
spec {
assume x > y;
};
z = x + y;
spec {
assert z > 2*y;
}
}
An invariant
statement encodes a loop invariant and must beplaced at a loop head, as in the
following example.
fun simple3(n: u64) {
let x = 0
loop {
spec {
invariant x <= n;
};
if (x < n) {
x = x + 1
} else {
break
}
};
spec {
assert x == n;
}
}
A loop invariant is translated into two assert
statements and one assume
statement to
facilitate the inductive reasoning of properties about the loop.
In break down, a loop invariant is translated to:
- An
assert
statement that checks the invariant holds when the loop is first encountered in the execution -- establishing the base case. - An
assume
statement that encodes the property that the invariant holds at loop iterationI
. - An
assert
statement that checks whether the invariant continues to hold at loop iterationI+1
.
Occasionally, we would like to refer to the pre state of a mutable function argument in inline spec
blocks. In MSL, this can be done with the old(T)
expression. Similar to the semantics of old(..)
in post conditions, an old(T)
expression in an assume
or assert
statement always yields the
value of T
at the function entry point. Following is an example that illustrate the use of
old(..)
in an inline spec block.
fun swap(x: &mut u64, y: &mut u64) {
let t = *x;
*x = *y;
*y = t;
spec {
assert x == old(y);
assert y == old(x);
};
}
The above example is trivial as the same property can be expressed with post conditions
(i.e., ensures
) too. But there are cases where we must use old(..)
to refer to the pre
state, especially in the specification of loop invariants. Consider the following example
where we verify that the vector_reverse
function properly reverse the order of all elements
in a vector:
fun verify_reverse<Element>(v: &mut vector<Element>) {
let vlen = vector::length(v);
if (vlen == 0) return ();
let front_index = 0;
let back_index = vlen -1;
while ({
spec {
assert front_index + back_index == vlen - 1;
assert forall i in 0..front_index: v[i] == old(v)[vlen-1-i];
assert forall i in 0..front_index: v[vlen-1-i] == old(v)[i];
assert forall j in front_index..back_index+1: v[j] == old(v)[j];
assert len(v) == vlen;
};
(front_index < back_index)
}) {
vector::swap(v, front_index, back_index);
front_index = front_index + 1;
back_index = back_index - 1;
};
}
spec verify_reverse {
aborts_if false;
ensures forall i in 0..len(v): v[i] == old(v)[len(v)-1-i];
}
Note the usage of old(v)
in the loop invariants. Without them, it is hard to express the
invariant that the vector is partially reversed while the loop is iterating and the rest
remain unchanged.
However, unlike the old(T)
expressions in ensures
conditions where T
can be any valid
expression (e.g., old(v[i])
is allowed), the old(T)
expressions in assert
and assumes
statements only accept a single variable as T
and that variable must be a function argument of
a mutable reference type. In the above example, old(v[i])
is not allowed, and we should use
old(v)[i]
instead.
MSL supports spec variables, also called ghost variables in the verification community. These variables are used only in specifications and represent information derived from the global state of resources. An example use case is to compute the sum of all coins available in the system and specify that the sum can be changed only in certain scenarios.
We illustrate this feature by introducing a spec variable which maintains the sum of all Counter
resources from our running example. First, a spec variable is introduced via spec module block as
follows:
spec module {
global sum_of_counters: num;
}
This value is going to be updated whenever a Counter
is packed or unpacked. (Recall that mutation
is interpreted as an implicit unpack and pack):
spec Counter {
invariant pack sum_of_counters = sum_of_counters + value;
invariant unpack sum_of_counters = sum_of_counters - value;
}
TODO:
invariant pack
andinvariant unpack
are currently not implemented
Now we may for example want to specify that the sum of all Counter instances in the global state should never exceed a particular value. We can do this as follows:
spec module {
invariant [global] sum_of_counters < 4711;
}
Note that spec variables can be referenced also from helper functions. Moreover, spec variables can be generic:
spec module {
global some_generic_var<T>: num;
}
When using such a spec variable, a type parameter need to be provided, as in some_generic_var<u64>
. Effectively, a generic spec variable is like a family of variables indexed by types.
Schemas are a means for structuring specifications by grouping properties together. Semantically, they are just syntactic sugar which expand to conditions on functions, structs, or modules.
Schemas are used as such:
spec schema IncrementAborts {
a: address;
aborts_if !exists<Counter>(a);
aborts_if global<Counter>(a).value == 255;
}
spec increment {
include IncrementAborts;
}
Each schema may declare a number of typed variable names and a list of conditions over those variables. All supported condition types can be used in schemas. The schema can then be included in another spec block:
- If that spec block is for a function or a struct, all variable names the schema declares must be matched against existing names of compatible type in the context.
- If a schema is included in another schema, existing names are matched and must have the same type, but non-existing names will be added as new declarations to the inclusion context.
When a schema is included in another spec block, it will be checked whether the conditions it
contains are allowed in this block. For example, including the schema IncrementAborts
into a
struct spec block will lead to a compile time error.
When a schema is included the names it declares can also bound by expressions. For example, one can
write include IncrementAborts{a: some_helper_address()}
. Effectively, not providing a binding is
equivalent to writing IncrementAborts{a: a}
if a
is an existing name in scope.
Schemas can be generic. Generic schemas must be fully instantiated where they are included; type inference is not available for schemas.
When a schema is included, one can use a limited set of Boolean operators as follows:
P ==> SchemaExp
: all conditions in the schema will be prefixed withP ==> ..
. Conditions which are not based on boolean expressions will be rejected.if (P) SchemaExp1 else SchemaExp2
: this is treated similar as including bothP ==> SchemaExp1
and!P ==> SchemaExp2
.SchemaExp1 && SchemaExp2
: this is treated as two includes for the both schema expressions.
One of the main uses cases for schemas is to be able to name a group of properties and then apply
those to a set of functions. This is achieved by the apply
operator. The apply
spec block member
can only appear in module spec blocks.
The general form of the apply operator
is apply Schema to FunctionPattern, .. except FunctionPattern, ..
. Here,
Schema
can be a schema name or a schema name plus formal type arguments.
FunctionPatterns
consists of an optional visibility modifier public
or internal
(if not
provided, both visibilities will match), a name pattern in the style of a shell file pattern (
e.g. *
, foo*
, foo*bar
, etc.)
, and finally an optional type argument list. All type arguments provided to Schema
must be bound
in this list and vice versa.
The apply operator includes the given schema in all function spec blocks which match the patterns,
except those which are excluded via the except
patterns.
A typical use of the apply operator is to provide common pre and post conditions to all functions in a module with some exceptions. Example:
spec schema Unchanged {
let resource = global<R>(ADDR):
ensures resource == old(resource);
}
spec module {
// Enforce Unchanged for all functions except the initialize function.
apply Unchanged to * except initialize;
}
Notice that while with global invariants we can express similar things, we * cannot* express the restriction of the invariant to only specific functions.
With the pragma opaque
, a function is declared to be solely defined by its specification at caller
sides. In contrast, if this pragma is not provided, then the function's implementation will be used
as the basis to verify the caller.
Using opaque
requires the specification to be sufficiently complete for the verification problem
at hand. Without opaque, the Move prover will use the implementation as the source of truth of the
definition of the function. But with opaque, if there is an aspect of the function definition
unspecified, an arbitrary meaning will be assumed. For example, with the specification below,
the increment
function can abort under arbitrary conditions:
spec increment {
pragma opaque;
// aborts_if !exists<Counter>(a); // We need to add this to make the function not abort arbitrarily
ensures global<Counter>(a) == old(global<Counter>(a)) + 1;
}
In general, opaque functions enable modular verification, as they abstract from the implementation of functions, resulting in much faster verification.
If an opaque function modifies state, it is advised to use
the modifies
condition in its specification. If this is omitted,
verification of the state changes will fail.
The [abstract]
property allow to specify a function such that an abstract semantics is used at the
caller side which is different from the actual implementation. This is useful if the implementation
is too complex for verification, and an abstract semantics is sufficient for verification goals.
The [concrete]
property, in turn, allows to still specify conditions which are verified against
the implementation, but not used at the caller side.
Consider the following example of a hash function. The actual value of the hash is not relevant for verification of callers, and we use an uninterpreted helper function delivering an arbitrary value chosen by the prover. We can still specify the concrete implementation and verify its correctness:
fun hash(v: vector<u8>): u64 {
<<sum up values>>(v)
}
spec hash {
pragma opaque;
aborts_if false;
ensures [concrete] result == <<sum up values>>(v);
ensures [abstract] result == spec_hash_abstract(v);
}
spec fun abstract_hash(v: vector<u8>): u64; // uninterpreted function
The soundness of the abstraction is the responsibility of the specifier, and not verified by the prover.
NOTE: the abstract/concrete properties should only be used with opaque specifications, but the prover will currently not generate an error if not.
NOTE: the
modifies
clause does currently not support abstract/concrete. Also, if no modifies is given, the modified state will be computed from the implementation anyway, possibly conflicting with[abstract]
properties.
The organization of specification blocks in a file is relevant for documentation generation -- even though it is not for the semantics. See discussion of organization of specification blocks.
The Move Specification language is expressive enough to represent the full Move language semantics (
formal argument outstanding) with one exception: functions which return a &mut T
type.
Consider the following code:
struct S { x: u64, y: u64 }
fun x_or_y(b: bool, s: &mut S): &mut u64 {
if (b) &mut s.x else &mut s.y
}
spec x_or_y {
ensures b ==> result == s.x;
ensures !b ==> result == s.y;
}
We are not able to specify the full semantics of x_or_y
in MSL, because we cannot capture the
semantics of mutable references. While we can say something about the value behind the reference at
function exit, subsequent effects as in *x_or_y(b, &mut s) = 2
cannot be specified.
However, the Move prover does understand the meaning of such functions -- the restriction is only
in what we can specify. Practically this means we cannot make the function x_or_y
opaque, and must
let verification rely on that the prover directly works with the implementation. Specifically, we
can verify the following
(which can then be opaque):
fun x_or_y_test(s: S): S {
*x_or_y(true, &mut s) = 2;
s
}
spec x_or_y_test {
pragma opaque;
ensures result.x == 2;
ensures result.y == s.y;
}