Lurk is a statically scoped dialect of Lisp, influenced by Scheme and Common Lisp.
Some parts of this specification repeat R5RS.
An error is raised when a program does not satisfy a "must" or "should" clause. We use "must" clauses for syntactic requirements and "should" for semantic requirements.
The default integer type for Lurk is not an integer, but rather a Num. These are elements defined over a prime field F_p
, where the order p
is a parameter of the language.
Lurk supports support for fixed-width (64-bit) unsigned integers. Literal U64s are denoted with a suffix, either "u64" or "U64" directly following a valid number. It is a reader/parser error to specify a U64 from a number greater than the maximum U64.
Arithmetic operations are defined over a prime field F_p
, where the order p
is a parameter of the language.
Let +
, -
, *
, /
, %
, >
, >=
, <
, <=
, =
be noted as <op>
. The expression (<op> <e1> <e2>)
evaluates by first evaluating <e1>
and <e2>
in order. The resulting values should be either a Num or a U64. Then, the operation op
is performed on these two Nums.
The mod operator %
is a special case: <e1>
and <e2>
must evaluate to U64s.
Arithmetic operations on two Nums yield Nums and are treated as field operations done modulo p
; division is field division.
Comparisons interpret Nums as signed. The first half ranging from 0
exclusive to (p - 1)/2
is considered positive. The second half ranging from (p - 1)/2 + 1
to p - 1
is considered negative. For every positive n
, there is a corresponding negative m
, such that n + m = 0
. Furthermore, 0
is considered neither positive nor negative, and we have that 0 + 0 = 0
. The result is that the interpretation of signed Nums implied by the definition of < (etc.) is arithmetically consistent.
Arithmetic operations on two U64s yield U64s and are done modulo 2^64
; division is integer division.
Arithmetic operations that mix U64 and Num yield Nums and are performed by first coercing the U64 to Num.
The reader has support for parsing negative Nums -<n>
and fractional Nums <n>/<m>
.
A Num preceded by -
is subtracted from 0
at read time.
A Num <n>
preceded by /
and another Num <m>
is calculated at read-time by performing a field-element division. This means that for arithmetic and algebraic purposes, the resulting field elements appear to behave like the corresponding rationals. However, note carefully, that this similarity does not extend to relational comparisons. For example, (< 1/2 1/3) => t
, since 1/2
is the most negative Num, although this violates expectations if you consider 1/2 to behave like a rational.
The u64
suffix cannot be combined with negative or fractional notation.
Built-in forms are all special, in that they cannot be shadowed: built-in values cannot be shadowed in any context, and syntactic forms (<operator> <operand> ...)
cannot be shadowed in procedural context. For example, re-defining car
will not change the behavior of (car '(1 2))
, but the re-defined car
will be visible in (((lambda (x) x) car) '(1 2))
.
Like in Common Lisp, the expressions t
and nil
are self-evaluating and denote true and false respectively.
The expression (if <test> <consequent> <alternate>)
is evaluated by first evaluating <test>
and then evaluating the <consequent>
or the <alternative>
depending on whether the <test>
value is non-nil
(true) or nil
.
Unlike in Scheme and Common Lisp, the <alternate>
expression cannot be omitted.
The expression (lambda <formals> <body>)
evaluates to a procedure.
The environment in effect when the lambda
expression was evaluated is remembered as part of the procedure (a closure). When the procedure is later called with some actual arguments, the environment in which the lambda
expression was evaluated will be extended by binding the variables in the formal argument list to the corresponding actual argument values, and the expression in the body of the lambda
expression will be evaluated in the extended environment. The result of the expression in the body will be returned as the result of the procedure call.
The <formals>
must be a (possibly empty) list of variable names, (<variable1> ...)
, denoting a fixed number of arguments. The <body>
term must be exactly one expression (unlike Scheme).
In a procedure call (<operator> <operand> ...)
, assuming <operator>
is not a built-in form, the <operator>
should evaluate to a procedure (a closure), and the remaining arguments <operand> ...
are evaluated from left to right.
Auto-currying: lambda
expressions taking multiple parameters are automatically transformed into nested lambda
expressions of one parameter each. Correspondingly, application of multiple arguments are automatically transformed into nested applications of one argument each. Auto-currying naturally supports partial applications. Hence a procedure does not need to be called with exactly the same number of arguments as the number of formals.
In the expression (let <bindings> <body>)
, <bindings>
must have the form ((<variable1> <init1>) ...)
and body must be exactly one expression (unlike Scheme).
In a let
expression, the bindings are performed sequentially from left to right, like Scheme's let*
and unlike Scheme's let
, which establishes bindings in parallel.
The region of a binding indicated by (<variable> <init>)
is that part of the let
expression to the right of the binding. Thus the second binding is done in an environment in which the first is visible, and so on.
The letrec
expression has the same form as the let
expression: (letrec <bindings> <body>)
.
In a letrec
expression, the bindings are performed sequentially from left to right.
The region of a binding indicated by (<variable> <init>)
is that part of the let
expression to the right of the binding but including itself. Thus the second binding is done in an environment in which the first and second are visible, and so on.
Note: Lurk's letrec
is unlike Scheme's letrec
in that only self (not mutual) recursion is allowed. It is similar in evaluation order to Scheme's letrec*
, but not in the regions.
The expression (quote <datum>)
or '<datum>
evaluates to <datum>
.
The expression (atom <e>)
evaluates to t
if <e>
evaluates to an atom (non-list) and to nil
if <e>
evaluates to a non-atom (list).
The expression (functionp <e>)
evaluates to t
if <e>
evaluates to a procedure (a closure, the result of a lambda
) and to nil
if <e>
evaluates to a non-procedure.
The expression (cons <a> <d>)
produces a pair whose car
is the result of evaluating <a>
and whose cdr
is the result of evaluating <d>
. The expression (car <e>)
and (cdr <e>)
return the car
and cdr
of the pair resulting from the evaluation of <e>
.
When <a>
evaluates to a character and <d>
evaluates to a string, (strcons <a> <d>)
produces a string. When <e>
is a string rather than a pair, the expression (car <e>)
returns the first character (and nil
when the argument is the empty string) and the expression (cdr <e>)
returns the substring without the first character (and the empty string when the argument is the empty string).
Consing is implemented by hash-consing.
It is an error to call strcons
with arguments of types other than character and string as described above.
It is an error to call car
or cdr
on an argument value that is not a pair or a string.
The expression (num <e>)
first evaluates <e>
and attempts to coerce <e>
to a Num. In the case <e>
evaluates to a
- character,
num
returns theu32
ASCII value of the result. - Num,
num
simply returns the Num. - commitment,
num
returns the underlying Num. - U64,
num
coerces the U64 to a Num, with no loss of precision.
The expression (u64 <e>)
first evaluates <e>
and attempts to coerce <e>
to a U64. In the case <e>
evaluates to a
- Num,
u64
coerces Num to U64, preserving only the least-significant 8 bytes.
The expression (char <e>)
first evaluates <e>
and attempts to coerce <e>
to a character. In the case <e>
evaluates to a
- Num,
char
returns the character which has the ASCII value of the Num - character,
char
simply returns the character.
The expression (eq <e1> <e2>)
evaluates by first evaluating the expressions <e1>
and <e2>
and then comparing the resulting values. Deep equality is used.
The expression (emit <e>)
evaluates the expression <e>
and "emits" as well as returns the resulting value.
Emitting is an implementation-specific operation that makes the value visible to the outside world.
The expression (begin <e> ...)
evaluates each expression <e>
in order and returns the result of evaluating the last expression.
The expression (current-env)
evaluates to the current environment represented as an association list.
An environment is represented as list where each item is either a binding or an environment. An item is an environment when the extension is recursive.
The expression (eval <exp>)
or (eval <exp> <env>)
first evaluates the expression <exp>
, and then evaluates the resulting value treated as an expression in the environment obtained from the evaluation of <env>
. If the argument <env>
is omitted, the empty environment is used.
Lurk has built-in support for cryptographic commitments. Lurk commitments are based on Poseidon hashes and are computationally binding.
When <e>
evaluates to a Num, the expression (comm <e>)
returns a Lurk commitment tagged with the resulting Num.
The expression (commit <e>)
evaluates <e>
and commits the result, returning the commitment associated with <e>
. Lurk allows commitments to any Lurk expression.
The expression (hide <n> <e>)
evaluates <n>
, which must result in a Num, and <e>
, then commits <e>
with secret <n>
. When hiding is unimportant, commit
creates commitments with a default secret of 0.
The expression (open <e>)
evaluates <e>
and, if the result is either an commitment or a field element, opens the commitment and returns its value.
The values of the language are t
, nil
, procedures such as those created by lambda
, field Nums such as 18
, U64 integers such as 37u64
, symbols such as the foo
datum resulting from the evaluation of the quoted expression 'foo
, strings such as "foo"
, characters such as #\a
, conses such as (1 . 2)
or (1 a "t")
, and commitments such as (comm 0x185e06...c7aa7f2)
.