Skip to content

Latest commit

 

History

History
171 lines (90 loc) · 10.9 KB

v0-1.md

File metadata and controls

171 lines (90 loc) · 10.9 KB

Lurk Language Specification

Lurk is a statically scoped dialect of Lisp, influenced by Scheme and Common Lisp.

Acks

Some parts of this specification repeat R5RS.

Notes

An error is raised when a program does not satisfy a "must" or "should" clause. We use "must" clauses for syntactic requirements and "should" for semantic requirements.

Integer Types

Num

The default integer type for Lurk is not an integer, but rather a Num. These are elements defined over a prime field F_p, where the order p is a parameter of the language.

Fixed Width UInts: u64, (and more to come)

Lurk supports support for fixed-width (64-bit) unsigned integers. Literal U64s are denoted with a suffix, either "u64" or "U64" directly following a valid number. It is a reader/parser error to specify a U64 from a number greater than the maximum U64.

Arithmetic operations +, -, *, /, %, >, >=, <, <=, =

Arithmetic operations are defined over a prime field F_p, where the order p is a parameter of the language.

Let +, -, *, /, %, >, >=, <, <=, = be noted as <op>. The expression (<op> <e1> <e2>) evaluates by first evaluating <e1> and <e2> in order. The resulting values should be either a Num or a U64. Then, the operation op is performed on these two Nums.

The mod operator % is a special case: <e1> and <e2> must evaluate to U64s.

Arithmetic operations on two Nums yield Nums and are treated as field operations done modulo p; division is field division.

Comparisons interpret Nums as signed. The first half ranging from 0 exclusive to (p - 1)/2 is considered positive. The second half ranging from (p - 1)/2 + 1 to p - 1 is considered negative. For every positive n, there is a corresponding negative m, such that n + m = 0. Furthermore, 0 is considered neither positive nor negative, and we have that 0 + 0 = 0. The result is that the interpretation of signed Nums implied by the definition of < (etc.) is arithmetically consistent.

Arithmetic operations on two U64s yield U64s and are done modulo 2^64; division is integer division.

Arithmetic operations that mix U64 and Num yield Nums and are performed by first coercing the U64 to Num.

Numeral Parsing

The reader has support for parsing negative Nums -<n> and fractional Nums <n>/<m>.

A Num preceded by - is subtracted from 0 at read time.

A Num <n> preceded by / and another Num <m> is calculated at read-time by performing a field-element division. This means that for arithmetic and algebraic purposes, the resulting field elements appear to behave like the corresponding rationals. However, note carefully, that this similarity does not extend to relational comparisons. For example, (< 1/2 1/3) => t, since 1/2 is the most negative Num, although this violates expectations if you consider 1/2 to behave like a rational.

The u64 suffix cannot be combined with negative or fractional notation.

Built-Ins

No Shadowing

Built-in forms are all special, in that they cannot be shadowed: built-in values cannot be shadowed in any context, and syntactic forms (<operator> <operand> ...) cannot be shadowed in procedural context. For example, re-defining car will not change the behavior of (car '(1 2)), but the re-defined car will be visible in (((lambda (x) x) car) '(1 2)).

t, nil

Like in Common Lisp, the expressions t and nil are self-evaluating and denote true and false respectively.

if

The expression (if <test> <consequent> <alternate>) is evaluated by first evaluating <test> and then evaluating the <consequent> or the <alternative> depending on whether the <test> value is non-nil (true) or nil.

Unlike in Scheme and Common Lisp, the <alternate> expression cannot be omitted.

lambda

The expression (lambda <formals> <body>) evaluates to a procedure.

The environment in effect when the lambda expression was evaluated is remembered as part of the procedure (a closure). When the procedure is later called with some actual arguments, the environment in which the lambda expression was evaluated will be extended by binding the variables in the formal argument list to the corresponding actual argument values, and the expression in the body of the lambda expression will be evaluated in the extended environment. The result of the expression in the body will be returned as the result of the procedure call.

The <formals> must be a (possibly empty) list of variable names, (<variable1> ...), denoting a fixed number of arguments. The <body> term must be exactly one expression (unlike Scheme).

In a procedure call (<operator> <operand> ...), assuming <operator> is not a built-in form, the <operator> should evaluate to a procedure (a closure), and the remaining arguments <operand> ... are evaluated from left to right.

Auto-currying: lambda expressions taking multiple parameters are automatically transformed into nested lambda expressions of one parameter each. Correspondingly, application of multiple arguments are automatically transformed into nested applications of one argument each. Auto-currying naturally supports partial applications. Hence a procedure does not need to be called with exactly the same number of arguments as the number of formals.

let

In the expression (let <bindings> <body>), <bindings> must have the form ((<variable1> <init1>) ...) and body must be exactly one expression (unlike Scheme).

In a let expression, the bindings are performed sequentially from left to right, like Scheme's let* and unlike Scheme's let, which establishes bindings in parallel.

The region of a binding indicated by (<variable> <init>) is that part of the let expression to the right of the binding. Thus the second binding is done in an environment in which the first is visible, and so on.

letrec

The letrec expression has the same form as the let expression: (letrec <bindings> <body>).

In a letrec expression, the bindings are performed sequentially from left to right.

The region of a binding indicated by (<variable> <init>) is that part of the let expression to the right of the binding but including itself. Thus the second binding is done in an environment in which the first and second are visible, and so on.

Note: Lurk's letrec is unlike Scheme's letrec in that only self (not mutual) recursion is allowed. It is similar in evaluation order to Scheme's letrec*, but not in the regions.

quote

The expression (quote <datum>) or '<datum> evaluates to <datum>.

atom

The expression (atom <e>) evaluates to t if <e> evaluates to an atom (non-list) and to nil if <e> evaluates to a non-atom (list).

functionp

The expression (functionp <e>) evaluates to t if <e> evaluates to a procedure (a closure, the result of a lambda) and to nil if <e> evaluates to a non-procedure.

cons, strcons, car, cdr

The expression (cons <a> <d>) produces a pair whose car is the result of evaluating <a> and whose cdr is the result of evaluating <d>. The expression (car <e>) and (cdr <e>) return the car and cdr of the pair resulting from the evaluation of <e>.

When <a> evaluates to a character and <d> evaluates to a string, (strcons <a> <d>) produces a string. When <e> is a string rather than a pair, the expression (car <e>) returns the first character (and nil when the argument is the empty string) and the expression (cdr <e>) returns the substring without the first character (and the empty string when the argument is the empty string).

Consing is implemented by hash-consing.

It is an error to call strcons with arguments of types other than character and string as described above.

It is an error to call car or cdr on an argument value that is not a pair or a string.

num, u64, char

The expression (num <e>) first evaluates <e> and attempts to coerce <e> to a Num. In the case <e> evaluates to a

  1. character, num returns the u32 ASCII value of the result.
  2. Num, num simply returns the Num.
  3. commitment, num returns the underlying Num.
  4. U64, num coerces the U64 to a Num, with no loss of precision.

The expression (u64 <e>) first evaluates <e> and attempts to coerce <e> to a U64. In the case <e> evaluates to a

  1. Num, u64 coerces Num to U64, preserving only the least-significant 8 bytes.

The expression (char <e>) first evaluates <e> and attempts to coerce <e> to a character. In the case <e> evaluates to a

  1. Num, char returns the character which has the ASCII value of the Num
  2. character, char simply returns the character.

eq

The expression (eq <e1> <e2>) evaluates by first evaluating the expressions <e1> and <e2> and then comparing the resulting values. Deep equality is used.

emit

The expression (emit <e>) evaluates the expression <e> and "emits" as well as returns the resulting value. Emitting is an implementation-specific operation that makes the value visible to the outside world.

begin

The expression (begin <e> ...) evaluates each expression <e> in order and returns the result of evaluating the last expression.

current-env

The expression (current-env) evaluates to the current environment represented as an association list.

An environment is represented as list where each item is either a binding or an environment. An item is an environment when the extension is recursive.

eval

The expression (eval <exp>) or (eval <exp> <env>) first evaluates the expression <exp>, and then evaluates the resulting value treated as an expression in the environment obtained from the evaluation of <env>. If the argument <env> is omitted, the empty environment is used.

comm, commit, hide, open

Lurk has built-in support for cryptographic commitments. Lurk commitments are based on Poseidon hashes and are computationally binding.

When <e> evaluates to a Num, the expression (comm <e>) returns a Lurk commitment tagged with the resulting Num.

The expression (commit <e>) evaluates <e> and commits the result, returning the commitment associated with <e>. Lurk allows commitments to any Lurk expression.

The expression (hide <n> <e>) evaluates <n>, which must result in a Num, and <e>, then commits <e> with secret <n>. When hiding is unimportant, commit creates commitments with a default secret of 0.

The expression (open <e>) evaluates <e> and, if the result is either an commitment or a field element, opens the commitment and returns its value.

Values

The values of the language are t, nil, procedures such as those created by lambda, field Nums such as 18, U64 integers such as 37u64, symbols such as the foo datum resulting from the evaluation of the quoted expression 'foo, strings such as "foo", characters such as #\a, conses such as (1 . 2) or (1 a "t"), and commitments such as (comm 0x185e06...c7aa7f2).