Synduce is a program synthesizer: you specify a reference function, and a target recursion skeleton, and Synduce completes it for you.
Synduce is mainly written in OCaml using the dune
build system and opam
to install package dependencies.
You will need a recent OCaml (>= 4.08.0) installation and the OCaml Package Manager (opam) to get started.
The Ocaml dependencies of this project can be installed via opam. The syguslib-utils
library must be installed from the submodule. To install the dependencies, run:
git submodule init
git submodule update
opam pin syguslib-utils # pin and install the syguslib-utils dependency
opam install . --deps-only # install the rest of the dependencies
Once all the dependencies are installed, call make
from the root of the project. The Makefile simply calls dune build
and creates a shortcut to the binary executable.
Synduce also requires:
- Z3 (> 4.8.10) for solving some SMT problems during bounded checking of recursive functions.
- CVC4 (>= 1.8) for some induction proofs, SMT problems and SyGuS problem solving.
- (Optional) CVC5: Synduce can use CVC5 instead of CVC4, but has been tested mostly with CVC4.
Synduce detects where your binaries are using which
by calling which z3
and which cvc4
, so please make sure that the solver executable names are z3
and cvc4
(and not cvc4-1.8
for example).
If these solvers are not present in the environment, then Synduce will produce a runtime error.
Note that Synduce does not rely on any specific API and instead communicates with the solver through the SMTLib v2.6 and SyGus Lib v2 language standards.
The script setup.sh
installs all the dependencies and packages from a fresh Ubuntu installation.
The documentation for the library in the dev
branch is here.
./bin/
contains all the sources for the executable,./src/
contains all the sources for the libraries. Thelang
folder is where you will find most of the language definitions andalgo
the algorithmic components../benchmarks/
contains benchmarks and sample inputs.parse_examples/parsing.pmrs
is an example of the input syntax for the PMRS with recursive type definitions. The syntax is similar to Caml, except for the recursion scheme declarations. Call./benchmarks/report.py
to see a list of benchmarks. The./benchmarks/unrealizable
folder contains examples on which the tool either fails or runs forever because no solution exists. You will find benchmarks with predicates on input type in thebenchmarks/constraints
folder.
You can run the set of benchmarks of Synduce by executing the following command:
./benchmarks/test.py -b all
This can take up to an hour depending on your computer. If you just want to perform a quick check that everything is running properly, you can use a smaller set of benchmarks:
./benchmarks/test.py -b small
You can also benchmark a single file. For example, if you want to run benchmarks/list/mss.ml
10 times, using cvc5 as backend solver instead of cvc4, run:
./benchmarks/test.py --single list/mss.ml+"--cvc5" -n 10
The argument to --single
should be the path to the file in the benchmarks folders, and if necessary, add options to pass to the tool by adding +"options".
For a single benchmark, the script prints a line of the form:
(1 / 1) ✅ mss ×10 runs, average: 9.158s ± 208ms R: ++✓
Indicating the average runtime of running the tool on the benchmark with a delta. The string after the R: represents the refinement rounds necessary to produce the solution. A "+" indicates that the approximation needed to be strenghtened, a "." indicates weakening and "^" indicates lifting. For a successful benchmark, the string should end by a check mark, indicating the solution was successfully checked in the last round.
The cav21
release is the version of the tool described in the CAV'21 paper Counterexample-Guided Partial Bounding for Recursive Function Synthesis
.
The pldi22
release corresponds to the tool as described in the PLDI'22 paper `Recursion Synthesis with Unrealizability Witnesses'.
./Synduce -h
should get you started.
The subfolders of the benchmarks
folder contains input examples. An input problem is defined by three components: a reference function, a representation function and a recursion skeleton. By default, the tool looks for a reference function called spec
, a representation function repr
and a recursion skeleton target
.
The file should start by defining the input types (which must be recursive types) of the reference
and the recursion skeleton. If the input type of the reference function is tau
and the input type
of the recursion skeleton is theta
, the representation function must be a function from theta
to
tau
.
The tool accepts inputs defined in two different formats: using a syntax specific for pattern-matching recursion scheme (PRMS) (files with .pmrs
extension) or an OCaml file (.ml
extension). The PMRS syntax is more feature complete, but we are actively working on the ML syntax.
See the .ml
files in the benchmarks folder. Specifications are supported through syntax extensions: optional objects such as invariants are specified through attributes, and mandatory components (functions to be synthesized) are written using [%synt name-of-the-function]
.
For example, in the file benchmarks/list/sum.ml
we have the following type definitions:
type 'a clist = CNil | Single of 'a | Concat of 'a clist * 'a clist
type 'a list = Nil | Cons of 'a * 'a list
The first type is the type of lists built with the concatenation operator from empty lists or singletons. The second type is more natural: it is the type of cons-lists. With these two types we can specify problems that consists in parallelizing functions on lists. The reference function can be defined on cons-lists, for example the function that adds all the elements of the list:
let rec sum =
function
| Nil -> 0
| Cons(hd,tl) -> hd + (sum tl)
Our goal is to synthesize a function of type int clist -> int
, but we have initially a function int list -> int
. Let us write a function clist_to_list : 'a clist -> 'a list
:
let rec clist_to_list =
function
| CNil -> Nil
| Single(a) -> Cons(a, Nil)
| Concat(x, y) -> dec y x
and dec l1 =
function
| CNil -> clist_to_list l1
| Single(a) -> Cons(a, clist_to_list l1)
| Concat(x, y) -> dec (Concat(l1, y)) x
Two of the three components of the synthesis problem are defined. Now, let us write the recursion skeleton that needs to be synthesized. The function hsum: int clist -> int
with unknowns s0
, f0
and odot
defines a list homomorphism:
let rec hsum = function
| CNil -> [%synt s0]
| Single a -> [%synt f0] a
| Concat (x, y) -> [%synt join] (hsum x) (hsum y)
;;
assert (hsum = clist_to_list @@ sum)
The functions to be synthesized are [%synt s0]
, [%synt f0]
and [%synt join]
. The syntax extensions can be replaced by either simple functions or constants.
The final assertions indicates that the tool should synthesize the expressions for s0
, f0
and join
such that the function hsum
is functionally equivalent to using clist_to_list
and then sum
(i.e. for any input x
we have hsum x = sum (clist_to_list x)
).
Note that the representation function and the target recursion skeleton can be reused across a large set of examples. All the benchmarks in benchmarks/list
use the same repr
and target
, modulo some changes in the base case when lists have a minimum size.
Running ./Synduce benchmarks/list/sum.ml
returns the solution in less than a second:
let s0 = 0
let f0 x = x
let join x40 x42 = x40 + x42
let rec hsum = function
| CNil -> s0
| Single(a) -> f0 a
| Concat(x, y) -> join (hsum x) (hsum y)
A function can be given attributes that provide more information about their input and output domains.
For example, in benchmarks/list/mts.ml
, the first output element of the mts
function is always positive
and larger than the second output element:
let rec mts = function
| Nil -> (0, 0)
| Cons (hd, tl) ->
let amts, asum = mts tl in
let new_sum = asum + hd in
(max amts new_sum, new_sum)
(* the mts is always positive *)
[@@ensures fun (x, y) -> x >= 0 && x >= y]
This is specified by the [@@ensures predicate]
where predicate
is a function from the output domain of
the function to which the attribute is attached, to booleans.
A requires attributes can be provided to constrain the inputs that a function can accept. For an example, see benchmarks/constraints/bst/contains.ml
.
The target recursion skeleton here is expected to be only applied to trees that are binary search trees. The user specifies their intent by adding the attribute [@@requires is_bst]
in the following:
let target y t =
let rec g = function
| Leaf a -> [%synt xi_0] y a
| Node (a, l, r) -> if y < a then [%synt xi_1] (g l) else [%synt xi_2] y a (g l) (g r)
in
g t
[@@requires is_bst]
When a requires
is specified on the function to be synthesized (the target function), then
the tool effectively solves the problem '∀ x : Θ. T_inv(x) ⇒ target(x) = spec(repr(x))where
T_inv` is the function given to the requires clause.
The tool is not limited to unary functions. An example of a benchmark where the function also accepts and additional argument can be found in benchmarks/list/sum.ml
.
The syntax to write functions with additional non-recursible parameters is of the form:
let f a b c t =
(* The last argument t must be the one used in the recursion.
The other arguments a,b,c are "parameters" of the pattern-matching recursion scheme. *)
(* Define mutually recursive functions. Parameters are in scope. *)
let rec aux = function _ -> (* .. *)
and aux2 = (* function defintion .. *)
in
(* Call the main recursive function *)
aux t
General remark on writing functions with multiple parameters for recursion
If one wants to write a function that accepts two trees as arguments one can always write a function that accept as input a type with a single constructor that combines two trees. In general, the user of the tool can provide functions with multiple input parameters, with the specificity that in this case the function application cannot be partial.
In practice, this also potentially limits the applicability of recursion elimination and partial bounding, as the toplevel function that can be eliminated is now the function that accepts a specific datatype that is not the main datatype of interest.
An example can be found in benchmarks/numbers/int_nat_twosum.ml
.
Synduce
uses a special syntax to specify recursion schemes (Pattern Matching Recursion Schemes).
The files ending in .pmrs
are examples of this syntax. The benchmark benchmarks/list/mpshom.pmrs
includes detailed comments on the input format.
The general syntax of a pattern matching recursion scheme is as follows:
pmrs (unknowns*) pmrs_name pmrs_args {invariant} =
| f x_1 ... x_n -> t
| g y_1 ... y_m p -> t
...
(unknowns*)
is an optional list of string identifying the unknowns of the pmrs.pmrs_name
is the name of the pmrs.pmrs_args
are the non-recursible arguments of the pmrs. They cannot be matched. See for example the 'benchmarks/list/search.pmrs' for an example. This is useful when the function needs more than one argument.{invariant}
is an optional invariant of the function (for the reference function).invariant
is a function of the formfun (x1, x2, ...) -> expr(x1, x2, ...)
wherex1, x2, ..
is the output of the function, andexpr(x1, x2, ..)
a boolean expression.invariant
indicates thatinvariant(pmrs_name(x))
is true for any inputx
of the pmrs.f x_1 ... x_n -> t
is a non-pattern matching rule with argumentsx_1 ... x_n
and contractumt
.g x_1 ... x_n p -> t
is a pattern matching rule with argumentsx_1 ... x_n
and patternp
and contractumt
.- The first rule defines the main function of the PMRS. The type of the non-terminal symbol of the first rule should be the type of a function from input type of the PMRS to output type.