diff --git a/book/error-handling/README.md b/book/error-handling/README.md index e29d16217..15a150f29 100644 --- a/book/error-handling/README.md +++ b/book/error-handling/README.md @@ -176,13 +176,18 @@ package/sexp converter]{.idx} Note that the character isn't actually serialized into an s-expression until the error is printed out. -We're not restricted to doing this kind of error reporting with built-in -types. This will be discussed in more detail in -[Data Serialization With S Expressions](data-serialization.html#data-serialization-with-s-expressions){data-type=xref}, +We're not restricted to doing this kind of error reporting with +built-in types. This will be discussed in more detail in [Data +Serialization With S +Expressions](data-serialization.html#data-serialization-with-s-expressions){data-type=xref}, but Sexplib comes with a language extension that can autogenerate sexp -converters for newly generated types: +converters for newly generated types. We can enable it explicitly in +the toplevel with a `#require` statement. + + ```ocaml env=main +# #require "ppx_jane" # let custom_to_sexp = [%sexp_of: float * string list * int] val custom_to_sexp : float * string list * int -> Sexp.t = # custom_to_sexp (3.5, ["a";"b";"c"], 6034) @@ -218,12 +223,13 @@ it is, after `option`, the most common way of returning errors in Base. ### `bind` and Other Error Handling Idioms -As you write more error handling code in OCaml, you'll discover that certain -patterns start to emerge. A number of these common patterns have been -codified by functions in modules like `Option` and `Result`. One particularly -useful pattern is built around the function `bind`, which is both an ordinary -function and an infix operator `>>=`. Here's the definition of `bind` for -options: [bind function]{.idx} +As you write more error handling code in OCaml, you'll discover that +certain patterns start to emerge. 
A number of these common patterns +have been codified by functions in modules like `Option` and +`Result`. One particularly useful pattern is built around the function +`bind`, which is both an ordinary function and an infix operator +`>>=`. Here's the definition of `bind` for options: [bind +function]{.idx} ```ocaml env=main # let bind option f = @@ -234,10 +240,10 @@ val bind : 'a option -> ('a -> 'b option) -> 'b option = ``` As you can see, `bind None f` returns `None` without calling `f`, and -`bind (Some x) f` returns `f x`. `bind` can be used as a way of sequencing -together error-producing functions so that the first one to produce an error -terminates the computation. Here's a rewrite of `compute_bounds` to use a -nested series of `bind`s: +`bind (Some x) f` returns `f x`. `bind` can be used as a way of +sequencing together error-producing functions so that the first one to +produce an error terminates the computation. Here's a rewrite of +`compute_bounds` to use a nested series of `bind`s: ```ocaml env=main # let compute_bounds ~compare list = @@ -249,13 +255,14 @@ val compute_bounds : compare:('a -> 'a -> int) -> 'a list -> ('a * 'a) option = ``` -The preceding code is a little bit hard to swallow, however, on a syntactic -level. We can make it easier to read and drop some of the parentheses, by -using the infix operator form of `bind`, which we get access to by locally -opening `Option.Monad_infix`. The module is called `Monad_infix` because the -`bind` operator is part of a subinterface called `Monad`, which we'll see -again in -[Concurrent Programming With Async](concurrent-programming.html#concurrent-programming-with-async){data-type=xref}. +The preceding code is a little bit hard to swallow, however, on a +syntactic level. We can make it easier to read and drop some of the +parentheses, by using the infix operator form of `bind`, which we get +access to by locally opening `Option.Monad_infix`. 
The module is +called `Monad_infix` because the `bind` operator is part of a +subinterface called `Monad`, which we'll see again in [Concurrent +Programming With +Async](concurrent-programming.html#concurrent-programming-with-async){data-type=xref}. ```ocaml env=main # let compute_bounds ~compare list = @@ -268,20 +275,22 @@ val compute_bounds : compare:('a -> 'a -> int) -> 'a list -> ('a * 'a) option = ``` -This use of `bind` isn't really materially better than the one we started -with, and indeed, for small examples like this, direct matching of options is -generally better than using `bind`. But for large, complex examples with many -stages of error handling, the `bind` idiom becomes clearer and easier to -manage. +This use of `bind` isn't really materially better than the one we +started with, and indeed, for small examples like this, direct +matching of options is generally better than using `bind`. But for +large, complex examples with many stages of error handling, the `bind` +idiom becomes clearer and easier to manage. ::: {data-type=note} #### Monads and `Let_syntax` -We can make this look a little bit more ordinary by using a syntax extension -that's designed specifically for monadic binds, called `Let_syntax`. Here's -what the above example looks like using this extension. +We can make this look a little bit more ordinary by using a syntax +extension that's designed specifically for monadic binds, called +`Let_syntax`. Here's what the above example looks like using this +extension. ```ocaml env=main +# #require "ppx_let" # let compute_bounds ~compare list = let open Option.Let_syntax in let sorted = List.sort ~compare list in @@ -292,6 +301,8 @@ val compute_bounds : compare:('a -> 'a -> int) -> 'a list -> ('a * 'a) option = ``` +Note that we needed a `#require` statement to enable the extension. 
+ To understand what's going on here, you need to know that `let%bind x = some_expr in some_other_expr` is rewritten into `some_expr >>= fun x -> some_other_expr`. diff --git a/book/error-handling/prelude.ml b/book/error-handling/prelude.ml index 9cec4a55c..74986336e 100644 --- a/book/error-handling/prelude.ml +++ b/book/error-handling/prelude.ml @@ -1,3 +1,3 @@ -#require "base,core.top,ppx_jane";; +#require "base,core.top";; let () = Base.Printexc.record_backtrace false diff --git a/book/files-modules-and-programs/prelude.ml b/book/files-modules-and-programs/prelude.ml index 88bbb3326..1fa5f27c6 100644 --- a/book/files-modules-and-programs/prelude.ml +++ b/book/files-modules-and-programs/prelude.ml @@ -1,5 +1,3 @@ -#require "core,core.top,ppx_jane";; - -open Base +#require "core,core.top";; let () = Printexc.record_backtrace false diff --git a/book/guided-tour/prelude.ml b/book/guided-tour/prelude.ml index ff72995b5..6f22e73e4 100644 --- a/book/guided-tour/prelude.ml +++ b/book/guided-tour/prelude.ml @@ -1,6 +1,3 @@ #require "base";; -#require "ppx_jane";; - -open Base let () = Printexc.record_backtrace false diff --git a/book/imperative-programming/README.md b/book/imperative-programming/README.md index 7650c8069..6bf038cbf 100644 --- a/book/imperative-programming/README.md +++ b/book/imperative-programming/README.md @@ -1,52 +1,57 @@ # Imperative Programming {#imperative-programming-1} -Most of the code shown so far in this book, and indeed, most OCaml code in -general, is *pure*. Pure code works without mutating the program's internal -state, performing I/O, reading the clock, or in any other way interacting -with changeable parts of the world. Thus, a pure function behaves like a -mathematical function, always returning the same results when given the same -inputs, and never affecting the world except insofar as it returns the value -of its computation. 
*Imperative* code, on the other hand, operates by side -effects that modify a program's internal state or interact with the outside -world. An imperative function has a new effect, and potentially returns -different results, every time it's called. [imperative programming/benefits -of]{.idx}[pure code]{.idx}[programming/immutable vs. +Most of the code shown so far in this book, and indeed, most OCaml +code in general, is *pure*. Pure code works without mutating the +program's internal state, performing I/O, reading the clock, or in any +other way interacting with changeable parts of the world. Thus, a pure +function behaves like a mathematical function, always returning the +same results when given the same inputs, and never affecting the world +except insofar as it returns the value of its +computation. *Imperative* code, on the other hand, operates by side +effects that modify a program's internal state or interact with the +outside world. An imperative function has a new effect, and +potentially returns different results, every time it's +called. [imperative programming/benefits of]{.idx}[pure +code]{.idx}[programming/immutable vs. imperative]{.idx}[programming/imperative programming]{.idx #PROGimper} Pure code is the default in OCaml, and for good reason—it's generally -easier to reason about, less error prone and more composable. But imperative -code is of fundamental importance to any practical programming language, -because real-world tasks require that you interact with the outside world, -which is by its nature imperative. Imperative programming can also be -important for performance. While pure code is quite efficient in OCaml, there -are many algorithms that can only be implemented efficiently using imperative -techniques. - -OCaml offers a happy compromise here, making it easy and natural to program -in a pure style, but also providing great support for imperative programming. 
-This chapter will walk you through OCaml's imperative features, and help you -use them to their fullest. +easier to reason about, less error prone and more composable. But +imperative code is of fundamental importance to any practical +programming language, because real-world tasks require that you +interact with the outside world, which is by its nature +imperative. Imperative programming can also be important for +performance. While pure code is quite efficient in OCaml, there are +many algorithms that can only be implemented efficiently using +imperative techniques. + +OCaml offers a happy compromise here, making it easy and natural to +program in a pure style, but also providing great support for +imperative programming. This chapter will walk you through OCaml's +imperative features, and help you use them to their fullest. ## Example: Imperative Dictionaries -We'll start with the implementation of a simple imperative dictionary, i.e., -a mutable mapping from keys to values. This is really for illustration -purposes; both Core and the standard library provide imperative dictionaries, -and for most real-world tasks, you should use one of those implementations. -There's more advice on using Core's implementation in particular in -[Maps And Hash Tables](maps-and-hashtables.html#maps-and-hash-tables){data-type=xref}. -[dictionaries, imperative]{.idx #DICTimper}[Core standard library/imperative -dictionaries in]{.idx}[imperative programming/imperative -dictionaries]{.idx #IPimpdict} +We'll start with the implementation of a simple imperative dictionary, +i.e., a mutable mapping from keys to values. This is really for +illustration purposes; both Core and the standard library provide +imperative dictionaries, and for most real-world tasks, you should use +one of those implementations. There's more advice on using Core's +implementation in particular in [Maps And Hash +Tables](maps-and-hashtables.html#maps-and-hash-tables){data-type=xref}. 
+[dictionaries, imperative]{.idx #DICTimper}[Core standard +library/imperative dictionaries in]{.idx}[imperative +programming/imperative dictionaries]{.idx #IPimpdict} The dictionary we'll describe now, like those in Core and the standard -library, will be implemented as a hash table. In particular, we'll use an -*open hashing* scheme, where the hash table will be an array of buckets, each -bucket containing a list of key/value pairs that have been hashed into that -bucket. [open hashing]{.idx} +library, will be implemented as a hash table. In particular, we'll use +an *open hashing* scheme, where the hash table will be an array of +buckets, each bucket containing a list of key/value pairs that have +been hashed into that bucket. [open hashing]{.idx} -Here's the interface we'll match, provided as an `mli`. The type `('a, 'b) t` -represents a dictionary with keys of type `'a` and data of type `'b`: +Here's the interface we'll match, provided as an `mli`. The type `('a, +'b) t` represents a dictionary with keys of type `'a` and data of type +`'b`: ```ocaml file=examples/dictionary.mli,part=1 (* file: dictionary.mli *) @@ -62,18 +67,18 @@ val iter : ('a, 'b) t -> f:(key:'a -> data:'b -> unit) -> unit val remove : ('a, 'b) t -> 'a -> unit ``` -The `mli` also includes a collection of helper functions whose purpose and -behavior should be largely inferrable from their names and type signatures. -Notice that a number of the functions, in particular, ones like `add` that -modify the dictionary, return `unit`. This is typical of functions that act -by side effect. +The `mli` also includes a collection of helper functions whose purpose +and behavior should be largely inferrable from their names and type +signatures. Notice that a number of the functions, in particular, +ones like `add` that modify the dictionary, return `unit`. This is +typical of functions that act by side effect. 
-We'll now walk through the implementation (contained in the corresponding -`ml` file) piece by piece, explaining different imperative constructs as they -come up. +We'll now walk through the implementation (contained in the +corresponding `ml` file) piece by piece, explaining different +imperative constructs as they come up. -Our first step is to define the type of a dictionary as a record with two -fields: +Our first step is to define the type of a dictionary as a record with +two fields: ```ocaml file=examples/dictionary.ml,part=1 (* file: dictionary.ml *) @@ -84,13 +89,14 @@ type ('a, 'b) t = { mutable length: int; } ``` -The first field, `length`, is declared as mutable. In OCaml, records are -immutable by default, but individual fields are mutable when marked as such. -The second field, `buckets`, is immutable but contains an array, which is -itself a mutable data structure. [fields/mutability of]{.idx} +The first field, `length`, is declared as mutable. In OCaml, records +are immutable by default, but individual fields are mutable when +marked as such. The second field, `buckets`, is immutable but +contains an array, which is itself a mutable data +structure. [fields/mutability of]{.idx} -Now we'll start putting together the basic functions for manipulating a -dictionary: +Now we'll start putting together the basic functions for manipulating +a dictionary: ```ocaml file=examples/dictionary.ml,part=2 let num_buckets = 17 @@ -109,16 +115,16 @@ let find t key = ~f:(fun (key',data) -> if key' = key then Some data else None) ``` -Note that `num_buckets` is a constant, which means our bucket array is of -fixed length. A practical implementation would need to be able to grow the -array as the number of elements in the dictionary increases, but we'll omit -this to simplify the presentation. +Note that `num_buckets` is a constant, which means our bucket array is +of fixed length. 
A practical implementation would need to be able to +grow the array as the number of elements in the dictionary increases, +but we'll omit this to simplify the presentation. -The function `hash_bucket` is used throughout the rest of the module to -choose the position in the array that a given key should be stored at. It is -implemented on top of `Hashtbl.hash`, which is a hash function provided by -the OCaml runtime that can be applied to values of any type. Thus, its own -type is polymorphic: `'a -> int`. +The function `hash_bucket` is used throughout the rest of the module +to choose the position in the array that a given key should be stored +at. It is implemented on top of `Hashtbl.hash`, which is a hash +function provided by the OCaml runtime that can be applied to values +of any type. Thus, its own type is polymorphic: `'a -> int`. The other functions defined above are fairly straightforward: @@ -138,14 +144,16 @@ Another important piece of imperative syntax shows up in `find`: we write `List.find_map`, which you can see the type of by typing it into the toplevel: -```ocaml env=examples +```ocaml env=main +# open Base # List.find_map - : 'a list -> f:('a -> 'b option) -> 'b option = ``` -`List.find_map` iterates over the elements of the list, calling `f` on each -one until a `Some` is returned by `f`, at which point that value is returned. -If `f` returns `None` on all values, then `None` is returned. +`List.find_map` iterates over the elements of the list, calling `f` on +each one until a `Some` is returned by `f`, at which point that value +is returned. If `f` returns `None` on all values, then `None` is +returned. Now let's look at the implementation of `iter`: @@ -157,18 +165,20 @@ let iter t ~f = ``` `iter` is designed to walk over all the entries in the dictionary. In -particular, `iter t ~f` will call `f` for each key/value pair in dictionary -`t`. 
Note that `f` must return `unit`, since it is expected to work by side -effect rather than by returning a value, and the overall `iter` function -returns `unit` as well. - -The code for `iter` uses two forms of iteration: a `for` loop to walk over -the array of buckets; and within that loop a call to `List.iter` to walk over -the values in a given bucket. We could have done the outer loop with a -recursive function instead of a `for` loop, but `for` loops are syntactically -convenient, and are more familiar and idiomatic in imperative contexts. - -The following code is for adding and removing mappings from the dictionary: +particular, `iter t ~f` will call `f` for each key/value pair in +dictionary `t`. Note that `f` must return `unit`, since it is expected +to work by side effect rather than by returning a value, and the +overall `iter` function returns `unit` as well. + +The code for `iter` uses two forms of iteration: a `for` loop to walk +over the array of buckets; and within that loop a call to `List.iter` +to walk over the values in a given bucket. We could have done the +outer loop with a recursive function instead of a `for` loop, but +`for` loops are syntactically convenient, and are more familiar and +idiomatic in imperative contexts. + +The following code is for adding and removing mappings from the +dictionary: ```ocaml file=examples/dictionary.ml,part=4 let bucket_has_key t i key = @@ -346,7 +356,7 @@ field. [ref cells]{.idx} The definition for the `ref` type is as follows: -```ocaml env=ref +```ocaml env=main # type 'a ref = { mutable contents : 'a } type 'a ref = { mutable contents : 'a; } ``` @@ -365,7 +375,7 @@ The standard library defines the following operators for working with `ref`s. 
You can see these in action: -```ocaml env=ref +```ocaml env=main # let x = ref 1 val x : int Stdlib.ref = {Base.Ref.contents = 1} # !x @@ -379,7 +389,7 @@ val x : int Stdlib.ref = {Base.Ref.contents = 1} The preceding are just ordinary OCaml functions, which could be defined as follows: -```ocaml env=ref +```ocaml env=custom_ref # let ref x = { contents = x } val ref : 'a -> 'a ref = # let (!) r = r.contents @@ -413,11 +423,14 @@ explicit `for` and `while` loops are both more concise and more idiomatic when programming imperatively. [looping constructs]{.idx}[while loops]{.idx}[for loops]{.idx} -The `for` loop is the simpler of the two. Indeed, we've already seen the -`for` loop in action—the `iter` function in `Dictionary` is built using it. -Here's a simple example of `for`: +The `for` loop is the simpler of the two. Indeed, we've already seen +the `for` loop in action—the `iter` function in `Dictionary` is built +using it. Here's a simple example of `for`. Note that we open the +`Stdio` library to get access to the `printf` function. -```ocaml env=for + +```ocaml env=main +# open Stdio # for i = 0 to 3 do printf "i = %d\n" i done i = 0 i = 1 @@ -429,7 +442,7 @@ i = 3 As you can see, the upper and lower bounds are inclusive. We can also use `downto` to iterate in the other direction: -```ocaml env=for +```ocaml env=main # for i = 3 downto 0 do printf "i = %d\n" i done i = 3 i = 2 @@ -447,7 +460,7 @@ loop first evaluates the condition, and then, if it evaluates to true, evaluates the body and starts the loop again. 
Here's a simple example of a function for reversing an array in place: -```ocaml env=for +```ocaml env=main # let rev_inplace ar = let i = ref 0 in let j = ref (Array.length ar - 1) in @@ -570,7 +583,7 @@ structures/cyclic]{.idx}[cyclic data structures]{.idx} There is an exception to this, though: you can construct fixed-size cyclic data structures using `let rec`: -```ocaml env=examples +```ocaml env=main # let rec endless_loop = 1 :: 2 :: 3 :: endless_loop val endless_loop : int list = [1; 2; 3; <cycle>] ``` @@ -704,21 +717,24 @@ pitfalls. ## Laziness and Other Benign Effects -There are many instances where you basically want to program in a pure style, -but you want to make limited use of side effects to improve the performance -of your code. Such side effects are sometimes called *benign effects*, and -they are a useful way of leveraging OCaml's imperative features while still -maintaining most of the benefits of pure programming. [lazy -keyword]{.idx}[side effects]{.idx}[laziness]{.idx}[benign -effects/laziness]{.idx}[imperative programming/benign effects and]{.idx} - -One of the simplest benign effects is *laziness*. A lazy value is one that is -not computed until it is actually needed. In OCaml, lazy values are created -using the `lazy` keyword, which can be used to convert any expression of type -`s` into a lazy value of type `s Lazy.t`. The evaluation of that expression -is delayed until forced with `Lazy.force`: - -```ocaml env=lazy +There are many instances where you basically want to program in a pure +style, but you want to make limited use of side effects to improve the +performance of your code. Such side effects are sometimes called +*benign effects*, and they are a useful way of leveraging OCaml's +imperative features while still maintaining most of the benefits of +pure programming. 
[lazy keyword]{.idx}[side +effects]{.idx}[laziness]{.idx}[benign +effects/laziness]{.idx}[imperative programming/benign effects +and]{.idx} + +One of the simplest benign effects is *laziness*. A lazy value is one +that is not computed until it is actually needed. In OCaml, lazy +values are created using the `lazy` keyword, which can be used to +convert any expression of type `s` into a lazy value of type `s +Lazy.t`. The evaluation of that expression is delayed until forced +with `Lazy.force`: + +```ocaml env=main # let v = lazy (print_endline "performing lazy computation"; Float.sqrt 16.) val v : float lazy_t = # Lazy.force v @@ -735,7 +751,7 @@ To better understand how laziness works, let's walk through the implementation of our own lazy type. We'll start by declaring types to represent a lazy value: -```ocaml env=lazy +```ocaml env=custom_lazy # type 'a lazy_state = | Delayed of (unit -> 'a) | Value of 'a @@ -743,32 +759,33 @@ represent a lazy value: type 'a lazy_state = Delayed of (unit -> 'a) | Value of 'a | Exn of exn ``` -A `lazy_state` represents the possible states of a lazy value. A lazy value -is `Delayed` before it has been run, where `Delayed` holds a function for -computing the value in question. A lazy value is in the `Value` state when it -has been forced and the computation ended normally. The `Exn` case is for -when the lazy value has been forced, but the computation ended with an -exception. A lazy value is simply a `ref` containing a `lazy_state`, where -the `ref` makes it possible to change from being in the `Delayed` state to -being in the `Value` or `Exn` states. +A `lazy_state` represents the possible states of a lazy value. A lazy +value is `Delayed` before it has been run, where `Delayed` holds a +function for computing the value in question. A lazy value is in the +`Value` state when it has been forced and the computation ended +normally. 
The `Exn` case is for when the lazy value has been forced, +but the computation ended with an exception. A lazy value is simply a +`ref` containing a `lazy_state`, where the `ref` makes it possible to +change from being in the `Delayed` state to being in the `Value` or +`Exn` states. -We can create a lazy value from a thunk, i.e., a function that takes a unit -argument. Wrapping an expression in a thunk is another way to suspend the -computation of an expression: [thunks]{.idx} +We can create a lazy value from a thunk, i.e., a function that takes a +unit argument. Wrapping an expression in a thunk is another way to +suspend the computation of an expression: [thunks]{.idx} -```ocaml env=lazy +```ocaml env=custom_lazy # let create_lazy f = ref (Delayed f) val create_lazy : (unit -> 'a) -> 'a lazy_state ref = <fun> # let v = create_lazy (fun () -> print_endline "performing lazy computation"; Float.sqrt 16.) -val v : float lazy_state ref = {Base.Ref.contents = Delayed <fun>} +val v : float lazy_state ref = {contents = Delayed <fun>} ``` -Now we just need a way to force a lazy value. The following function does -just that: +Now we just need a way to force a lazy value. The following function +does just that. -```ocaml env=lazy +```ocaml env=custom_lazy # let force v = match !v with | Value x -> x @@ -786,7 +803,7 @@ val force : 'a lazy_state ref -> 'a = <fun> We can use it in the same way we used `Lazy.force`: -```ocaml env=lazy +```ocaml env=custom_lazy # force v performing lazy computation - : float = 4. @@ -794,23 +811,25 @@ performing lazy computation - : float = 4. ``` -The main user-visible difference between our implementation of laziness and -the built-in version is syntax. Rather than writing -`create_lazy (fun () -> sqrt 16.)`, we can (with the built-in `lazy`) just -write `lazy (sqrt 16.)`. +The main user-visible difference between our implementation of +laziness and the built-in version is syntax. 
Rather than writing +`create_lazy (fun () -> sqrt 16.)`, we can (with the built-in `lazy`) +just write `lazy (sqrt 16.)`. ### Memoization and Dynamic Programming -Another benign effect is *memoization*. A memoized function remembers the -result of previous invocations of the function so that they can be returned -without further computation when the same arguments are presented again. -[memoization/of function]{.idx}[benign effects/memoization]{.idx #BEmem} +Another benign effect is *memoization*. A memoized function remembers +the result of previous invocations of the function so that they can be +returned without further computation when the same arguments are +presented again. [memoization/of function]{.idx}[benign +effects/memoization]{.idx #BEmem} -Here's a function that takes as an argument an arbitrary single-argument -function and returns a memoized version of that function. Here we'll use -Core's `Hashtbl` module, rather than our toy `Dictionary`: +Here's a function that takes as an argument an arbitrary +single-argument function and returns a memoized version of that +function. Here we'll use Core's `Hashtbl` module, rather than our toy +`Dictionary`: -```ocaml env=memo +```ocaml env=main # let memoize f = let memo_table = Hashtbl.Poly.create () in (fun x -> @@ -820,34 +839,36 @@ val memoize : ('a -> 'b) -> 'a -> 'b = The preceding code is a bit tricky. `memoize` takes as its argument a function `f` and then allocates a polymorphic hash table (called -`memo_table`), and returns a new function which is the memoized version of -`f`. When called, this new function uses `Hashtbl.find_or_add` to try to find -a value in the `memo_table`, and if it fails, to call `f` and store the -result. Note that `memo_table` is referred to by the function, and so won't -be collected until the function returned by `memoize` is itself collected. +`memo_table`), and returns a new function which is the memoized +version of `f`. 
When called, this new function uses +`Hashtbl.find_or_add` to try to find a value in the `memo_table`, and +if it fails, to call `f` and store the result. Note that `memo_table` +is referred to by the function, and so won't be collected until the +function returned by `memoize` is itself collected. [memoization/benefits and drawbacks of]{.idx} -Memoization can be useful whenever you have a function that is expensive to -recompute and you don't mind caching old values indefinitely. One important -caution: a memoized function by its nature leaks memory. As long as you hold -on to the memoized function, you're holding every result it has returned thus -far. +Memoization can be useful whenever you have a function that is +expensive to recompute and you don't mind caching old values +indefinitely. One important caution: a memoized function by its nature +leaks memory. As long as you hold on to the memoized function, you're +holding every result it has returned thus far. Memoization is also useful for efficiently implementing some recursive -algorithms. One good example is the algorithm for computing the -*edit distance* (also called the Levenshtein distance) between two strings. -The edit distance is the number of single-character changes (including letter -switches, insertions, and deletions) required to -convert one string to the other. This kind -of distance metric can be useful for a variety of approximate string-matching -problems, like spellcheckers. [string matching]{.idx}[Levenshtein -distance]{.idx}[edit distance]{.idx} - -Consider the following code for computing the edit distance. Understanding -the algorithm isn't important here, but you should pay attention to the -structure of the recursive calls: [memoization/example of]{.idx} - -```ocaml env=memo +algorithms. One good example is the algorithm for computing the *edit +distance* (also called the Levenshtein distance) between two strings. 
+The edit distance is the number of single-character changes (including +letter switches, insertions, and deletions) required to convert one string to the other. This +kind of distance metric can be useful for a variety of approximate +string-matching problems, like spellcheckers. [string +matching]{.idx}[Levenshtein distance]{.idx}[edit distance]{.idx} + +Consider the following code for computing the edit +distance. Understanding the algorithm isn't important here, but you +should pay attention to the structure of the recursive calls: +[memoization/example of]{.idx} + +```ocaml env=main # let rec edit_distance s t = match String.length s, String.length t with | (0,x) | (x,0) -> x @@ -867,8 +888,8 @@ val edit_distance : string -> string -> int = - : int = 2 ``` -The thing to note is that if you call `edit_distance "OCaml" "ocaml"`, then -that will in turn dispatch the following calls: +The thing to note is that if you call `edit_distance "OCaml" "ocaml"`, +then that will in turn dispatch the following calls:
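The first level of calls can be listed with a small standalone sketch. It uses only the standard library rather than Base, and `drop_last` and `subcalls` are hypothetical helper names. The recursive body is an assumed reconstruction of the usual definition, not necessarily the chapter's exact code.

```ocaml
(* Sketch under assumptions: standard library only (no Base), and the body
   of [edit_distance] is an assumed reconstruction of the usual recursive
   definition. [drop_last] and [subcalls] are hypothetical helpers. *)

(* Remove the final character of a non-empty string. *)
let drop_last s = String.sub s 0 (String.length s - 1)

(* The three argument pairs that a call on two non-empty strings recurses on. *)
let subcalls s t =
  [ (drop_last s, t); (s, drop_last t); (drop_last s, drop_last t) ]

let rec edit_distance s t =
  match String.length s, String.length t with
  | (0, x) | (x, 0) -> x
  | (len_s, len_t) ->
    let s' = drop_last s in
    let t' = drop_last t in
    let cost_to_drop_both =
      if Char.equal s.[len_s - 1] t.[len_t - 1] then 0 else 1
    in
    (* Take the cheapest of the three ways to shrink the problem. *)
    List.fold_left min max_int
      [ edit_distance s' t + 1
      ; edit_distance s t' + 1
      ; edit_distance s' t' + cost_to_drop_both
      ]

(* Print the calls made directly by [edit_distance "OCaml" "ocaml"]. *)
let () =
  List.iter
    (fun (s, t) -> Printf.printf "edit_distance %S %S\n" s t)
    (subcalls "OCaml" "ocaml")
```

Applying `subcalls` again to each of the three pairs gives the next level of calls, which is where repeated argument pairs start to show up.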
@@ -882,13 +903,14 @@ And these calls will in turn dispatch other calls:
-As you can see, some of these calls are repeats. For example, there are two -different calls to `edit_distance "OCam" "oca"`. The number of redundant -calls grows exponentially with the size of the strings, meaning that our -implementation of `edit_distance` is brutally slow for large strings. We can -see this by writing a small timing function, using `Time.now` from `Core`. +As you can see, some of these calls are repeats. For example, there +are two different calls to `edit_distance "OCam" "oca"`. The number of +redundant calls grows exponentially with the size of the strings, +meaning that our implementation of `edit_distance` is brutally slow +for large strings. We can see this by writing a small timing function, +using `Time.now` from `Core`. -```ocaml env=memo +```ocaml env=main # let time f = let open Core in let start = Time.now () in @@ -901,7 +923,7 @@ val time : (unit -> 'a) -> 'a = <fun> And now we can use this to try out some examples: -```ocaml env=memo,non-deterministic=command +```ocaml env=main,non-deterministic=command # time (fun () -> edit_distance "OCaml" "ocaml") Time: 1.10292434692 ms - : int = 2 @@ -912,32 +934,33 @@ Time: 3282.86218643 ms Just those few extra characters made it thousands of times slower! -Memoization would be a huge help here, but to fix the problem, we need to -memoize the calls that `edit_distance` makes to itself. Such recursive -memoization is closely related to a common algorithmic technique called -*dynamic programming*, except that with dynamic programming, you do the -necessary sub-computations bottom-up, in anticipation of needing them. With -recursive memoization, you go top-down, only doing a sub-computation when you -discover that you need it. [memoization/recursive]{.idx}[dynamic -programming]{.idx} - -To see how to do this, let's step away from `edit_distance` and instead -consider a much simpler example: computing the *n*th element of the Fibonacci -sequence. 
The Fibonacci sequence by definition starts out with two `1`s, with -every subsequent element being the sum of the previous two. The classic -recursive definition of Fibonacci is as follows: - -```ocaml env=fib +Memoization would be a huge help here, but to fix the problem, we need +to memoize the calls that `edit_distance` makes to itself. Such +recursive memoization is closely related to a common algorithmic +technique called *dynamic programming*, except that with dynamic +programming, you do the necessary sub-computations bottom-up, in +anticipation of needing them. With recursive memoization, you go +top-down, only doing a sub-computation when you discover that you need +it. [memoization/recursive]{.idx}[dynamic programming]{.idx} + +To see how to do this, let's step away from `edit_distance` and +instead consider a much simpler example: computing the *n*th element +of the Fibonacci sequence. The Fibonacci sequence by definition starts +out with two `1`s, with every subsequent element being the sum of the +previous two. The classic recursive definition of Fibonacci is as +follows: + +```ocaml env=main # let rec fib i = if i <= 1 then i else fib (i - 1) + fib (i - 2) val fib : int -> int = ``` This is, however, exponentially slow, for the same reason that -`edit_distance` was slow: we end up making many redundant calls to `fib`. It -shows up quite dramatically in the performance: +`edit_distance` was slow: we end up making many redundant calls to +`fib`. It shows up quite dramatically in the performance: -```ocaml env=fib,non-deterministic=command +```ocaml env=main,non-deterministic=command # time (fun () -> fib 20) Time: 1.12414360046 ms - : int = 10946 @@ -946,15 +969,16 @@ Time: 18263.7000084 ms - : int = 165580141 ``` -As you can see, `fib 40` takes thousands of times longer to compute than -`fib 20`. +As you can see, `fib 40` takes thousands of times longer to compute +than `fib 20`. -So, how can we use memoization to make this faster? 
The tricky bit is that we -need to insert the memoization before the recursive calls within `fib`. We -can't just define `fib` in the ordinary way and memoize it after the fact and -expect the first call to `fib` to be improved. +So, how can we use memoization to make this faster? The tricky bit is +that we need to insert the memoization before the recursive calls +within `fib`. We can't just define `fib` in the ordinary way and +memoize it after the fact and expect the first call to `fib` to be +improved. -```ocaml env=fib,non-deterministic=command +```ocaml env=main,non-deterministic=command # let fib = memoize fib val fib : int -> int = <fun> # time (fun () -> fib 40) @@ -965,32 +989,32 @@ Time: 0.00596046447754 ms - : int = 165580141 ``` -In order to make `fib` fast, our first step will be to rewrite `fib` in a way -that unwinds the recursion. The following version expects as its first -argument a function (called `fib`) that will be called in lieu of the usual -recursive call. +In order to make `fib` fast, our first step will be to rewrite `fib` +in a way that unwinds the recursion. The following version expects as +its first argument a function (called `fib`) that will be called in +lieu of the usual recursive call.
-```ocaml env=fib +```ocaml env=main # let fib_norec fib i = if i <= 1 then i else fib (i - 1) + fib (i - 2) val fib_norec : (int -> int) -> int -> int = <fun> ``` -We can now turn this back into an ordinary Fibonacci function by tying the -recursive knot: +We can now turn this back into an ordinary Fibonacci function by tying +the recursive knot: -```ocaml env=fib +```ocaml env=main # let rec fib i = fib_norec fib i val fib : int -> int = <fun> # fib 20 - : int = 6765 ``` -We can even write a polymorphic function that we'll call `make_rec` that can -tie the recursive knot for any function of this form: +We can even write a polymorphic function that we'll call `make_rec` +that can tie the recursive knot for any function of this form: -```ocaml env=fib +```ocaml env=main # let make_rec f_norec = let rec f x = f_norec f x in f @@ -1013,7 +1037,7 @@ implement the same old slow Fibonacci function. To make it faster, we need a variant of `make_rec` that inserts memoization when it ties the recursive knot. We'll call that function `memo_rec`: -```ocaml env=fib +```ocaml env=main # let memo_rec f_norec x = let fref = ref (fun _ -> assert false) in let f = memoize (fun x -> f_norec !fref x) in @@ -1029,7 +1053,7 @@ using a `let rec`, which for reasons we'll describe later wouldn't work here. Using `memo_rec`, we can now build an efficient version of `fib`: -```ocaml env=fib,non-deterministic=command +```ocaml env=main,non-deterministic=command # let fib = memo_rec fib_norec val fib : int -> int = <fun> # time (fun () -> fib 40) @@ -1039,18 +1063,19 @@ Time: 0.0388622283936 ms And as you can see, the exponential time complexity is now gone. -The memory behavior here is important. If you look back at the definition of -`memo_rec`, you'll see that the call `memo_rec fib_norec` does not trigger a -call to `memoize`. Only when `fib` is called and thereby the final argument -to `memo_rec` is presented does `memoize` get called.
The result of that call -falls out of scope when the `fib` call returns, and so calling `memo_rec` on -a function does not create a memory leak—the memoization table is collected -after the computation completes. +The memory behavior here is important. If you look back at the +definition of `memo_rec`, you'll see that the call `memo_rec +fib_norec` does not trigger a call to `memoize`. Only when `fib` is +called and thereby the final argument to `memo_rec` is presented does +`memoize` get called. The result of that call falls out of scope when +the `fib` call returns, and so calling `memo_rec` on a function does +not create a memory leak—the memoization table is collected after the +computation completes. -We can use `memo_rec` as part of a single declaration that makes this look -like it's little more than a special form of `let rec`: +We can use `memo_rec` as part of a single declaration that makes this +look like it's little more than a special form of `let rec`: -```ocaml env=fib +```ocaml env=main # let fib = memo_rec (fun fib i -> if i <= 1 then 1 else fib (i - 1) + fib (i - 2)) val fib : int -> int = @@ -1069,7 +1094,7 @@ the original interface with a wrapper function.) With just that change and the addition of the `memo_rec` call, we can get a memoized version of `edit_distance`: -```ocaml env=memo +```ocaml env=main # let edit_distance = memo_rec (fun edit_distance (s,t) -> match String.length s, String.length t with | (0,x) | (x,0) -> x @@ -1104,7 +1129,7 @@ You might wonder why we didn't tie the recursive knot in `memo_rec` using `let rec`, as we did for `make_rec` earlier. Here's code that tries to do just that: [let rec]{.idx} -```ocaml env=letrec +```ocaml env=main # let memo_rec f_norec = let rec f = memoize (fun x -> f_norec f x) in f @@ -1136,7 +1161,7 @@ It's worth noting that these restrictions don't show up in a lazy language like Haskell. 
Indeed, we can make something like our definition of `x` work if we use OCaml's laziness: -```ocaml env=letrec +```ocaml env=main # let rec x = lazy (force x + 1) val x : int lazy_t = ``` @@ -1145,7 +1170,7 @@ Of course, actually trying to compute this will fail. OCaml's `lazy` throws an exception when a lazy value tries to force itself as part of its own evaluation. -```ocaml env=letrec +```ocaml env=main # force x Exception: Lazy.Undefined ``` @@ -1154,7 +1179,7 @@ But we can also create useful recursive definitions with `lazy`. In particular, we can use laziness to make our definition of `memo_rec` work without explicit mutation: -```ocaml env=letrec,non-deterministic=command +```ocaml env=main,non-deterministic=command # let lazy_memo_rec f_norec x = let rec f = lazy (memoize (fun x -> f_norec (force f) x)) in (force f) x @@ -1183,29 +1208,31 @@ reading or writing data to things like files, terminal input and output, and network sockets. [I/O (input/output) operations/terminal I/O]{.idx}[imperative programming/input and output]{.idx #IPinpout} -There are multiple I/O libraries in OCaml. In this section we'll discuss -OCaml's buffered I/O library that can be used through the `In_channel` and -`Out_channel` modules in Core. Other I/O primitives are also available -through the `Unix` module in Core as well as `Async`, the asynchronous I/O -library that is covered in -[Concurrent Programming With Async](concurrent-programming.html#concurrent-programming-with-async){data-type=xref}. -Most of the functionality in Core's `In_channel` and `Out_channel` (and in -Core's `Unix` module) derives from the standard library, but we'll use Core's -interfaces here. +There are multiple I/O libraries in OCaml. In this section we'll +discuss OCaml's buffered I/O library that can be used through the +`In_channel` and `Out_channel` modules in Core. 
Other I/O primitives +are also available through the `Unix` module in Core as well as +`Async`, the asynchronous I/O library that is covered in [Concurrent +Programming With +Async](concurrent-programming.html#concurrent-programming-with-async){data-type=xref}. +Most of the functionality in Core's `In_channel` and `Out_channel` +(and in Core's `Unix` module) derives from the standard library, but +we'll use Core's interfaces here. ### Terminal I/O -OCaml's buffered I/O library is organized around two types: `in_channel`, for -channels you read from, and `out_channel`, for channels you write to. The -`In_channel` and `Out_channel` modules only have direct support for channels -corresponding to files and terminals; other kinds of channels can be created -through the `Unix` module. [Out_channel -module/Out_channel.stderr]{.idx}[Out_channel +OCaml's buffered I/O library is organized around two types: +`in_channel`, for channels you read from, and `out_channel`, for +channels you write to. The `In_channel` and `Out_channel` modules only +have direct support for channels corresponding to files and terminals; +other kinds of channels can be created through the `Unix` +module. [Out_channel module/Out_channel.stderr]{.idx}[Out_channel module/Out_channel.stdout]{.idx}[In_channel module]{.idx} -We'll start our discussion of I/O by focusing on the terminal. Following the -UNIX model, communication with the terminal is organized around three -channels, which correspond to the three standard file descriptors in Unix: +We'll start our discussion of I/O by focusing on the +terminal. Following the UNIX model, communication with the terminal is +organized around three channels, which correspond to the three +standard file descriptors in Unix: `In_channel.stdin` : The "standard input" channel. By default, input comes from the terminal, @@ -1216,18 +1243,19 @@ channels, which correspond to the three standard file descriptors in Unix: appears on the user terminal. 
`Out_channel.stderr` -: The "standard error" channel. This is similar to `stdout` but is intended - for error messages. +: The "standard error" channel. This is similar to `stdout` but is + intended for error messages. -The values `stdin`, `stdout`, and `stderr` are useful enough that they are -also available in the global namespace directly, without having to go through -the `In_channel` and `Out_channel` modules. +The values `stdin`, `stdout`, and `stderr` are useful enough that they +are also available in the global namespace directly, without having to +go through the `In_channel` and `Out_channel` modules. -Let's see this in action in a simple interactive application. The following -program, `time_converter`, prompts the user for a time zone, and then prints -out the current time in that time zone. Here, we use Core's `Zone` module for -looking up a time zone, and the `Time` module for computing the current time -and printing it out in the time zone in question: +Let's see this in action in a simple interactive application. The +following program, `time_converter`, prompts the user for a time zone, +and then prints out the current time in that time zone. Here, we use +Core's `Zone` module for looking up a time zone, and the `Time` module +for computing the current time and printing it out in the time zone in +question: ```ocaml file=examples/time_converter/time_converter.ml open Core @@ -1264,33 +1292,35 @@ The time in Europe/London is 2013-08-15 00:03:10.666220+01:00. ``` We called `Out_channel.flush` on `stdout` because `out_channel`s are -buffered, which is to say that OCaml doesn't immediately do a write every -time you call `output_string`. Instead, writes are buffered until either -enough has been written to trigger the flushing of the buffers, or until a -flush is explicitly requested. This greatly increases the efficiency of the -writing process by reducing the number of system calls. 
- -Note that `In_channel.input_line` returns a `string option`, with `None` -indicating that the input stream has ended (i.e., an end-of-file condition). -`Out_channel.output_string` is used to print the final output, and -`Out_channel.flush` is called to flush that output to the screen. The final -flush is not technically required, since the program ends after that -instruction, at which point all remaining output will be flushed anyway, but -the explicit flush is nonetheless good practice. +buffered, which is to say that OCaml doesn't immediately do a write +every time you call `output_string`. Instead, writes are buffered +until either enough has been written to trigger the flushing of the +buffers, or until a flush is explicitly requested. This greatly +increases the efficiency of the writing process by reducing the number +of system calls. + +Note that `In_channel.input_line` returns a `string option`, with +`None` indicating that the input stream has ended (i.e., an +end-of-file condition). `Out_channel.output_string` is used to print +the final output, and `Out_channel.flush` is called to flush that +output to the screen. The final flush is not technically required, +since the program ends after that instruction, at which point all +remaining output will be flushed anyway, but the explicit flush is +nonetheless good practice. ### Formatted Output with printf -Generating output with functions like `Out_channel.output_string` is simple -and easy to understand, but can be a bit verbose. OCaml also supports -formatted output using the `printf` function, which is modeled after -`printf` in the C standard library. `printf` takes a *format string* that -describes what to print and how to format it, as well as arguments to be -printed, as determined by the formatting directives embedded in the format -string. 
So, for example, we can write: [strings/format strings]{.idx}[format -strings]{.idx}[printf function]{.idx}[I/O (input/output) operations/formatted -output]{.idx} - -```ocaml env=printf +Generating output with functions like `Out_channel.output_string` is +simple and easy to understand, but can be a bit verbose. OCaml also +supports formatted output using the `printf` function, which is +modeled after `printf` in the C standard library. `printf` takes a +*format string* that describes what to print and how to format it, as +well as arguments to be printed, as determined by the formatting +directives embedded in the format string. So, for example, we can +write: [strings/format strings]{.idx}[format strings]{.idx}[printf +function]{.idx}[I/O (input/output) operations/formatted output]{.idx} + +```ocaml env=main # printf "%i is an integer, %F is a float, \"%s\" is a string\n" 3 4.5 "five" @@ -1298,11 +1328,11 @@ output]{.idx} - : unit = () ``` -Unlike C's `printf`, the `printf` in OCaml is type-safe. In particular, if we -provide an argument whose type doesn't match what's presented in the format -string, we'll get a type error: +Unlike C's `printf`, the `printf` in OCaml is type-safe. In +particular, if we provide an argument whose type doesn't match what's +presented in the format string, we'll get a type error: -```ocaml env=printf +```ocaml env=main # printf "An integer: %i\n" 4.5 Line 1, characters 27-30: Error: This expression has type float but an expression was expected of type @@ -1312,18 +1342,19 @@ Error: This expression has type float but an expression was expected of type ::: {data-type=note} ##### Understanding Format Strings -The format strings used by `printf` turn out to be quite different from -ordinary strings. This difference ties to the fact that OCaml format strings, -unlike their equivalent in C, are type-safe. 
In particular, the compiler -checks that the types referred to by the format string match the types of the -rest of the arguments passed to `printf`. +The format strings used by `printf` turn out to be quite different +from ordinary strings. This difference ties to the fact that OCaml +format strings, unlike their equivalent in C, are type-safe. In +particular, the compiler checks that the types referred to by the +format string match the types of the rest of the arguments passed to +`printf`. -To check this, OCaml needs to analyze the contents of the format string at -compile time, which means the format string needs to be available as a string -literal at compile time. Indeed, if you try to pass an ordinary string to -`printf`, the compiler will complain: +To check this, OCaml needs to analyze the contents of the format +string at compile time, which means the format string needs to be +available as a string literal at compile time. Indeed, if you try to +pass an ordinary string to `printf`, the compiler will complain: -```ocaml env=printf +```ocaml env=main # let fmt = "%i is an integer\n" val fmt : string = "%i is an integer\n" # printf fmt 3 @@ -1341,7 +1372,7 @@ interpreted as such. (Here, we open the CamlinternalFormatBasics so that the representation of the format string that's printed out won't fill the whole page.) -```ocaml env=printf +```ocaml env=main # open CamlinternalFormatBasics # let fmt : ('a, 'b, 'c) format = "%i is an integer\n" @@ -1354,7 +1385,7 @@ val fmt : (int -> 'c, 'b, 'c) format = And accordingly, we can pass it to `printf`: -```ocaml env=printf +```ocaml env=main # printf fmt 3 3 is an integer - : unit = () @@ -1416,12 +1447,13 @@ the `Printf` module in the OCaml Manual. ### File I/O -Another common use of `in_channel`s and `out_channel`s is for working with -files. 
Here are a couple of functions—one that creates a file full of -numbers, and the other that reads in such a file and returns the sum of those -numbers: [files/file I/O]{.idx}[I/O (input/output) operations/file I/O]{.idx} +Another common use of `in_channel`s and `out_channel`s is for working +with files. Here are a couple of functions—one that creates a file +full of numbers, and the other that reads in such a file and returns +the sum of those numbers: [files/file I/O]{.idx}[I/O (input/output) +operations/file I/O]{.idx} -```ocaml env=file,non-deterministic +```ocaml env=main # let create_number_file filename numbers = let outc = Out_channel.create filename in List.iter numbers ~f:(fun x -> Out_channel.fprintf outc "%d\n" x); @@ -1449,7 +1481,7 @@ One problem with the preceding code is that if it throws an exception in the middle of its work, it won't actually close the file. If we try to read a file that doesn't actually contain numbers, we'll see such an error: -```ocaml env=file,non-deterministic=command +```ocaml env=main,non-deterministic=command # sum_file "/etc/hosts" Exception: (Failure @@ -1474,7 +1506,7 @@ can do this using the `protect` function described in [Error Handling](error-handling.html#error-handling){data-type=xref}, as follows: -```ocaml env=file2 +```ocaml env=main # let sum_file filename = let file = In_channel.create filename in Exn.protect ~f:(fun () -> @@ -1486,7 +1518,7 @@ val sum_file : string -> int = And now, the file descriptor leak is gone: -```ocaml env=file2 +```ocaml env=main # for i = 1 to 10000 do try ignore (sum_file "/etc/hosts" : int) with _ -> () done - : unit = () # sum_file "numbers.txt" @@ -1504,7 +1536,7 @@ for processing data from an `in_channel` and takes care of the bookkeeping associated with opening and closing the file. 
We can rewrite `sum_file` using this function, as shown here: -```ocaml env=file2 +```ocaml env=main # let sum_file filename = In_channel.with_file filename ~f:(fun file -> let numbers = List.map ~f:Int.of_string (In_channel.input_lines file) in @@ -1517,7 +1549,7 @@ entire file into memory before processing it. For a large file, it's more efficient to process a line at a time. You can use the `In_channel.fold_lines` function to do just that: -```ocaml env=file2 +```ocaml env=main # let sum_file filename = In_channel.with_file filename ~f:(fun file -> In_channel.fold_lines file ~init:0 ~f:(fun sum line -> @@ -1547,7 +1579,7 @@ Consider the following simple example. Here, we have a collection of angles, and we want to determine if any of them have a negative `sin`. The following snippet of code would answer that question: -```ocaml env=order +```ocaml env=main # let x = Float.sin 120. in let y = Float.sin 75. in let z = Float.sin 128. in @@ -1562,7 +1594,7 @@ In some sense, we don't really need to compute the `sin 128.` because It doesn't have to be this way. Using the `lazy` keyword, we can write the original computation so that `sin 128.` won't ever be computed: -```ocaml env=order +```ocaml env=main # let x = lazy (Float.sin 120.) in let y = lazy (Float.sin 75.) in let z = lazy (Float.sin 128.) in @@ -1572,7 +1604,7 @@ original computation so that `sin 128.` won't ever be computed: We can confirm that fact by a few well-placed `printf`s: -```ocaml env=order +```ocaml env=main # let x = lazy (printf "1\n"; Float.sin 120.) in let y = lazy (printf "2\n"; Float.sin 75.) in let z = lazy (printf "3\n"; Float.sin 128.) in @@ -1598,7 +1630,7 @@ is often the opposite of what one might expect. 
Consider the following example: -```ocaml env=order +```ocaml env=main # List.exists ~f:(fun x -> Float.O.(x < 0.)) [ (printf "1\n"; Float.sin 120.); (printf "2\n"; Float.sin 75.); @@ -1620,7 +1652,7 @@ Consider the following simple, imperative function: [polymorphism/weak polymorphism]{.idx}[weak polymorphism]{.idx}[side effects]{.idx}[ imperative programming/side effects/weak polymorphism ]{.idx #IPsideweak} -```ocaml env=weak +```ocaml env=main # let remember = let cache = ref None in (fun x -> @@ -1645,7 +1677,7 @@ generalize, replacing `t` with a polymorphic type variable. It's this kind of generalization that gives us polymorphic types in the first place. The identity function, as an example, gets a polymorphic type in this way: -```ocaml env=weak +```ocaml env=main # let identity x = x val identity : 'a -> 'a = <fun> # identity 3 @@ -1674,7 +1706,7 @@ must always have the same type. [type variables]{.idx} OCaml will convert a weakly polymorphic variable to a concrete type as soon as it gets a clue as to what concrete type it is to be used as: -```ocaml env=weak +```ocaml env=main # let remember_three () = remember 3 val remember_three : unit -> int = <fun> # remember @@ -1715,7 +1747,7 @@ introduce persistent mutable cells, including: Thus, the following expression is a simple value, and as a result, the types of values contained within it are allowed to be polymorphic: -```ocaml env=value_restriction +```ocaml env=main # (fun x -> [x;x]) - : 'a -> 'a list = <fun> ``` @@ -1724,7 +1756,7 @@ But, if we write down an expression that isn't a simple value by the preceding definition, we'll get different results. For example, consider what happens if we try to memoize the function defined previously. -```ocaml env=value_restriction +```ocaml env=main # memoize (fun x -> [x;x]) - : '_weak2 -> '_weak2 list = <fun> ``` @@ -1735,7 +1767,7 @@ returned by previous invocations of the function. But OCaml would make the same determination even if the function in question did no such thing.
Consider this example: -```ocaml env=value_restriction +```ocaml env=main # identity (fun x -> [x;x]) - : '_weak3 -> '_weak3 list = <fun> ``` @@ -1749,9 +1781,9 @@ that there is no *persistent* mutable state that could share values between uses of the same function. Thus, a function that produces a fresh reference every time it's called can have a fully polymorphic type: -```ocaml env=value_restriction +```ocaml env=main # let f () = ref None -val f : unit -> 'a option ref = <fun> +val f : unit -> 'a option Stdlib.ref = <fun> ``` But a function that has a mutable cache that persists across calls, like @@ -1770,7 +1802,7 @@ less general than you might expect. [partial application]{.idx} Consider the `List.init` function, which is used for creating lists where each element is created by calling a function on the index of that element: -```ocaml env=value_restriction +```ocaml env=main # List.init - : int -> f:(int -> 'a) -> 'a list = <fun> # List.init 10 ~f:Int.to_string @@ -1781,7 +1813,7 @@ Imagine we wanted to create a specialized version of `List.init` that always created lists of length 10. We could do that using partial application, as follows: -```ocaml env=value_restriction +```ocaml env=main # let list_init_10 = List.init 10 val list_init_10 : f:(int -> '_weak4) -> '_weak4 list = <fun> ``` @@ -1793,7 +1825,7 @@ across multiple calls to `list_init_10`.
We can eliminate this possibility, and at the same time get the compiler to infer a polymorphic type, by avoiding partial application: -```ocaml env=value_restriction +```ocaml env=main # let list_init_10 ~f = List.init 10 ~f val list_init_10 : f:(int -> 'a) -> 'a list = <fun> ``` @@ -1816,7 +1848,7 @@ For example, we saw that a function application, even a simple application of the identity function, is not a simple value and thus can turn a polymorphic value into a weakly polymorphic one: -```ocaml env=value_restriction +```ocaml env=main # identity (fun x -> [x;x]) - : '_weak5 -> '_weak5 list = <fun> ``` @@ -1824,7 +1856,7 @@ value into a weakly polymorphic one: But that's not always the case. When the type of the returned value is immutable, then OCaml can typically infer a fully polymorphic type: -```ocaml env=value_restriction +```ocaml env=main # identity [] - : 'a list = [] ``` @@ -1832,7 +1864,7 @@ immutable, then OCaml can typically infer a fully polymorphic type: On the other hand, if the returned type is mutable, then the result will be weakly polymorphic: -```ocaml env=value_restriction +```ocaml env=main # [||] - : 'a array = [||] # identity [||] @@ -1843,7 +1875,7 @@ A more important example of this comes up when defining abstract data types. Consider the following simple data structure for an immutable list type that supports constant-time concatenation: -```ocaml env=value_restriction +```ocaml env=main # module Concat_list : sig type 'a t val empty : 'a t @@ -1883,7 +1915,7 @@ note that a `Concat_list.t` is unquestionably an immutable value.
However, when it comes to the value restriction, OCaml treats it as if it were mutable: -```ocaml env=value_restriction +```ocaml env=main # Concat_list.empty - : 'a Concat_list.t = # identity Concat_list.empty @@ -1906,7 +1938,7 @@ contain any persistent references to values of type `'a`, at which point, OCaml can infer polymorphic types for expressions of this type that are not simple values: -```ocaml env=value_restriction +```ocaml env=main # module Concat_list : sig type +'a t val empty : 'a t @@ -1944,7 +1976,7 @@ module Concat_list : Now, we can apply the identity function to `Concat_list.empty` without losing any polymorphism: -```ocaml env=value_restriction +```ocaml env=main # identity Concat_list.empty - : 'a Concat_list.t = ``` diff --git a/book/imperative-programming/dune b/book/imperative-programming/dune index a4e1b490e..5cf1868bb 100644 --- a/book/imperative-programming/dune +++ b/book/imperative-programming/dune @@ -4,11 +4,6 @@ core mdx ppx_jane) - (preludes - prelude.ml - (env memo memo.ml) - (env fib fib.ml) - (env letrec letrec.ml) - (env value_restriction letrec.ml))) + (preludes prelude.ml)) (data_only_dirs examples) diff --git a/book/imperative-programming/fib.ml b/book/imperative-programming/fib.ml deleted file mode 100644 index 5addbae62..000000000 --- a/book/imperative-programming/fib.ml +++ /dev/null @@ -1,18 +0,0 @@ -let time f = - let open Core in - let start = Time.now () in - let x = f () in - let stop = Time.now () in - printf "Time: %F ms\n" (Time.diff stop start |> Time.Span.to_ms); - x - -let memoize f = - let table = Hashtbl.Poly.create () in - (fun x -> - match Hashtbl.find table x with - | Some y -> y - | None -> - let y = f x in - Hashtbl.add_exn table ~key:x ~data:y; - y - ) diff --git a/book/imperative-programming/letrec.ml b/book/imperative-programming/letrec.ml deleted file mode 100644 index 8c9e4480a..000000000 --- a/book/imperative-programming/letrec.ml +++ /dev/null @@ -1,18 +0,0 @@ -let time f = - let open Core in - 
let start = Time.now () in - let x = f () in - let stop = Time.now () in - printf "Time: %F ms\n" (Time.diff stop start |> Time.Span.to_ms); - x - -let memoize f = - let memo_table = Hashtbl.Poly.create () in - (fun x -> - Hashtbl.find_or_add memo_table x ~default:(fun () -> f x)) - -let fib_norec fib i = - if i <= 1 then i - else fib (i - 1) + fib (i - 2) - -let identity x = x diff --git a/book/imperative-programming/memo.ml b/book/imperative-programming/memo.ml deleted file mode 100644 index 397ae4d3c..000000000 --- a/book/imperative-programming/memo.ml +++ /dev/null @@ -1,11 +0,0 @@ -let memoize f = - let memo_table = Hashtbl.Poly.create () in - (fun x -> - Hashtbl.find_or_add memo_table x ~default:(fun () -> -f x)) - -let memo_rec f_norec x = - let fref = ref (fun _ -> assert false) in - let f = memoize (fun x -> f_norec !fref x) in - fref := f; - f x diff --git a/book/imperative-programming/prelude.ml b/book/imperative-programming/prelude.ml index 6ff53a52f..1fa5f27c6 100644 --- a/book/imperative-programming/prelude.ml +++ b/book/imperative-programming/prelude.ml @@ -1,6 +1,3 @@ -#require "core,core.top,ppx_jane";; - -open Base -open Stdio +#require "core,core.top";; let () = Printexc.record_backtrace false diff --git a/book/lists-and-patterns/README.md b/book/lists-and-patterns/README.md index 077810ce3..906c46f14 100644 --- a/book/lists-and-patterns/README.md +++ b/book/lists-and-patterns/README.md @@ -858,7 +858,7 @@ You might have noticed that `destutter` is specialized to lists of integers. That's because `Base`'s default equality operator is specialized to integers, as you can see if you try to apply it to values of a different type. 
-```ocaml env=poly +```ocaml env=main # "foo" = "bar" Line 1, characters 1-6: Error: This expression has type string but an expression was expected of type @@ -868,7 +868,7 @@ Error: This expression has type string but an expression was expected of type OCaml also has a collection of polymorphic equality and comparison operators, which we can make available by opening the module `Base.Poly`. -```ocaml env=poly +```ocaml env=main # open Base.Poly # "foo" = "bar" - : bool = false @@ -881,7 +881,7 @@ which we can make available by opening the module `Base.Poly`. Indeed, if we look at the type of the equality operator, we'll see that it is polymorphic. -```ocaml env=poly +```ocaml env=main # (=) - : 'a -> 'a -> bool = ``` @@ -889,7 +889,7 @@ polymorphic. If we rewrite our destutter example with `Base.Poly` open, we'll see that it gets a polymorphic type, and can now be used on inputs of different types. -```ocaml env=poly +```ocaml env=main # let rec destutter = function | [] | [_] as l -> l | hd :: (hd' :: _ as tl) when hd = hd' -> destutter tl @@ -918,7 +918,7 @@ they're laid out in memory. (You can learn more about this structure in Polymorphic compare does have some limitations. For example, it will fail at runtime if it encounters a function value. -```ocaml env=poly +```ocaml env=main # (fun x -> x + 1) = (fun x -> x + 1) Exception: (Invalid_argument "compare: functional value") ``` @@ -940,12 +940,12 @@ sense for the particular type of values you're dealing with. This can lead to surprising and hard to resolve bugs in your code. It's for this reason that `Base` discourages the use of polymorphic compare by hiding it by default. -We'll discuss this issue more in -[Maps And Hash Tables](maps-and-hashtables.html#maps-and-hash-tables){data-type=xref}. -But in any case, you can restore the default behavior of `Base` by opening -the module again. +We'll discuss this issue more in [Maps And Hash +Tables](maps-and-hashtables.html#maps-and-hash-tables){data-type=xref}. 
+But in any case, you can restore the default behavior of `Base` by +opening the module again. -```ocaml env=poly +```ocaml env=main # open Base ``` diff --git a/book/lists-and-patterns/prelude.ml b/book/lists-and-patterns/prelude.ml index 88bbb3326..1fa5f27c6 100644 --- a/book/lists-and-patterns/prelude.ml +++ b/book/lists-and-patterns/prelude.ml @@ -1,5 +1,3 @@ -#require "core,core.top,ppx_jane";; - -open Base +#require "core,core.top";; let () = Printexc.record_backtrace false diff --git a/book/records/README.md b/book/records/README.md index fcfe5e7a4..8ec8e1fde 100644 --- a/book/records/README.md +++ b/book/records/README.md @@ -158,6 +158,7 @@ irrefutable, unlike types with variable structures like lists and variants.[irrefutable patterns]{.idx}[datatypes/fixed vs. variable structure of]{.idx} + Another important characteristic of record patterns is that they don't need to be complete; a pattern can mention only a subset of the fields in the record. This can be convenient, but it can also be error @@ -438,7 +439,7 @@ namespace within which to put related values. When using this style, it is standard practice to name the type associated with the module `t`. Using this style we would write: -```ocaml env=main2 +```ocaml env=main # module Log_entry = struct type t = { session_id: string; @@ -492,7 +493,7 @@ module Logon : Now, our log-entry-creation function can be rendered as follows: -```ocaml env=main2 +```ocaml env=main # let create_log_entry ~session_id ~important message = { Log_entry.time = Time_ns.now (); Log_entry.session_id; @@ -510,7 +511,7 @@ record field, however, so we can write this more concisely. Note that we are allowed to insert whitespace between the module path and the field name: -```ocaml env=main2 +```ocaml env=main # let create_log_entry ~session_id ~important message = { Log_entry. 
time = Time_ns.now (); session_id; important; message } @@ -522,7 +523,7 @@ Earlier, we saw that you could help OCaml understand which record field was intended by adding a type annotation. We can use that here to make the example even more concise. -```ocaml env=main2 +```ocaml env=main # let create_log_entry ~session_id ~important message : Log_entry.t = { time = Time_ns.now (); session_id; important; message } val create_log_entry : @@ -532,7 +533,7 @@ val create_log_entry : This is not restricted to constructing a record; we can use the same approaches when pattern matching: -```ocaml env=main2 +```ocaml env=main # let message_to_string { Log_entry.important; message; _ } = if important then String.uppercase message else message val message_to_string : Log_entry.t -> string = <fun> @@ -541,7 +542,7 @@ val message_to_string : Log_entry.t -> string = <fun> When using dot notation for accessing record fields, we can qualify the field by the module as well. -```ocaml env=main2 +```ocaml env=main # let is_important t = t.Log_entry.important val is_important : Log_entry.t -> bool = <fun> ``` @@ -561,7 +562,7 @@ it can otherwise infer the type of the record in question. In particular, we can rewrite the above declarations by adding type annotations and removing the module qualifications. -```ocaml env=main2 +```ocaml env=main # let create_log_entry ~session_id ~important message : Log_entry.t = { time = Time_ns.now (); session_id; important; message } val create_log_entry : @@ -592,7 +593,7 @@ for representing this information, as well as a function for updating the client information when a new heartbeat arrives:[functional updates]{.idx}[records/functional updates to]{.idx} -```ocaml env=main2 +```ocaml env=main # type client_info = { addr: Unix.Inet_addr.t; port: int; @@ -634,7 +635,7 @@ on an existing one, with a set of field changes layered on top.
Given this, we can rewrite `register_heartbeat` more concisely: -```ocaml env=main2 +```ocaml env=main # let register_heartbeat t hb = { t with last_heartbeat_time = hb.Heartbeat.time } val register_heartbeat : client_info -> Heartbeat.t -> client_info = <fun> @@ -648,7 +649,7 @@ not prompt you to reconsider whether your code needs to change to accommodate the new fields. Consider what happens if we decided to add a field for the status message received on the last heartbeat: -```ocaml env=main2 +```ocaml env=main # type client_info = { addr: Unix.Inet_addr.t; port: int; @@ -674,7 +675,7 @@ update continues to compile as is, even though it incorrectly ignores the new field. The correct thing to do would be to update the code as follows: -```ocaml env=main2 +```ocaml env=main # let register_heartbeat t hb = { t with last_heartbeat_time = hb.Heartbeat.time; last_heartbeat_status = hb.Heartbeat.status_message; @@ -689,7 +690,7 @@ however, declare individual record fields as mutable. In the following code, we've made the last two fields of `client_info` mutable:[mutable record fields]{.idx}[records/mutable fields in]{.idx} -```ocaml env=main2 +```ocaml env=main # type client_info = { addr: Unix.Inet_addr.t; port: int; @@ -712,7 +713,7 @@ The `<-` operator is used for setting a mutable field.
The side-effecting version of `register_heartbeat` would be written as follows: -```ocaml env=main2 +```ocaml env=main # let register_heartbeat t hb = t.last_heartbeat_time <- hb.Heartbeat.time; t.last_heartbeat_status <- hb.Heartbeat.status_message @@ -735,7 +736,7 @@ Consider the following function for extracting the usernames from a list of `Logon` messages:[fields/first-class fields]{.idx}[first-class fields]{.idx}[records/first-class fields in]{.idx} -```ocaml env=main2 +```ocaml env=main # let get_users logons = List.dedup_and_sort ~compare:String.compare (List.map logons ~f:(fun x -> x.Logon.user)) @@ -750,10 +751,17 @@ that.[record field accessor functions]{.idx} The `[@@deriving fields]` annotation at the end of the declaration of a record type will cause the extension to be applied to a given type -declaration. So, for example, we could have defined `Logon` as -follows: +declaration. We need to enable the extension explicitly, + + + +```ocaml env=main +# #require "ppx_jane";; +``` + +at which point, we can define `Logon` as follows: -```ocaml env=main2 +```ocaml env=main # module Logon = struct type t = { session_id: string; @@ -797,7 +805,7 @@ the remainder from the documentation that comes with `fieldslib`. One of the functions we obtain is `Logon.user`, which we can use to extract the user field from a logon message: -```ocaml env=main2 +```ocaml env=main # let get_users logons = List.dedup_and_sort ~compare:String.compare (List.map logons ~f:Logon.user) @@ -832,7 +840,7 @@ whereas the type of `Logon.Fields.time` is `(Logon.t, Time.t) Field.t`. Thus, if you call `Field.get` on `Logon.Fields.user`, you'll get a function for extracting the `user` field from a `Logon.t`: -```ocaml env=main2 +```ocaml env=main # Field.get Logon.Fields.user - : Logon.t -> string = <fun> ``` @@ -844,7 +852,7 @@ contained in the field, which is also the return type of `get`.
The type of `Field.get` is a little more complicated than you might naively expect from the preceding one: -```ocaml env=main2 +```ocaml env=main # Field.get - : ('b, 'r, 'a) Field.t_with_perm -> 'r -> 'a = <fun> ``` @@ -858,7 +866,7 @@ updates. We can use first-class fields to do things like write a generic function for displaying a record field: -```ocaml env=main2 +```ocaml env=main # let show_field field to_string record = let name = Field.name field in let field_string = to_string (Field.get field record) in @@ -873,19 +881,19 @@ which the field can be grabbed. Here's an example of `show_field` in action: -```ocaml env=main2,non-deterministic +```ocaml env=main,non-deterministic=output # let logon = { Logon. session_id = "26685"; time = Time_ns.of_string "2017-07-21 10:11:45 EST"; user = "yminsky"; credentials = "Xy2d9W"; } val logon : Logon.t = - {Logon.session_id = "26685"; time = 2017-07-21 17:11:45.000000+02:00; + {Logon.session_id = "26685"; time = 2017-07-21 15:11:45.000000000Z; user = "yminsky"; credentials = "Xy2d9W"} # show_field Logon.Fields.user Fn.id logon - : string = "user: yminsky" # show_field Logon.Fields.time Time_ns.to_string logon -- : string = "time: 2017-07-21 17:11:45.000000+02:00" +- : string = "time: 2017-07-21 15:11:45.000000000Z" ``` As a side note, the preceding example is our first use of the `Fn` @@ -898,7 +906,7 @@ and `Fields.iter`, which let you walk over the fields of a record. So, for example, in the case of `Logon.t`, the field iterator has the following type: -```ocaml env=main2 +```ocaml env=main # Logon.Fields.iter - : session_id:(([< `Read | `Set_and_create ], Logon.t, string) Field.t_with_perm -> unit) -> @@ -923,7 +931,7 @@ combination of the record and the `Field.t`.
Now, let's use `Logon.Fields.iter` and `show_field` to print out all the fields of a `Logon` record: -```ocaml env=main2,non-deterministic +```ocaml env=main,non-deterministic=output # let print_logon logon = let print to_string field = printf "%s\n" (show_field field to_string logon) @@ -936,7 +944,7 @@ the fields of a `Logon` record: val print_logon : Logon.t -> unit = <fun> # print_logon logon session_id: 26685 -time: 2017-07-21 17:11:45.000000+02:00 +time: 2017-07-21 15:11:45.000000000Z user: yminsky credentials: Xy2d9W - : unit = () diff --git a/book/records/prelude.ml b/book/records/prelude.ml index df8786c89..1fa5f27c6 100644 --- a/book/records/prelude.ml +++ b/book/records/prelude.ml @@ -1,5 +1,3 @@ -#require "core,core.top,ppx_jane";; - -open Core +#require "core,core.top";; let () = Printexc.record_backtrace false diff --git a/book/variables-and-functions/prelude.ml b/book/variables-and-functions/prelude.ml index 88bbb3326..1fa5f27c6 100644 --- a/book/variables-and-functions/prelude.ml +++ b/book/variables-and-functions/prelude.ml @@ -1,5 +1,3 @@ -#require "core,core.top,ppx_jane";; - -open Base +#require "core,core.top";; let () = Printexc.record_backtrace false diff --git a/book/variants/README.md b/book/variants/README.md index a56b67278..b2dfe3b0e 100644 --- a/book/variants/README.md +++ b/book/variants/README.md @@ -611,7 +611,7 @@ structures]{.idx} An expression in this language will be defined by the variant `expr`, with one tag for each kind of expression we want to support: -```ocaml env=blang +```ocaml env=main # type 'a expr = | Base of 'a | Const of bool @@ -641,7 +641,7 @@ falsehood is determined by your application.
If you were writing a filter language for an email processor, your base predicates might specify the tests you would run against an email, as in the following example: -```ocaml env=blang +```ocaml env=main # type mail_field = To | From | CC | Date | Subject type mail_field = To | From | CC | Date | Subject # type mail_predicate = { field: mail_field; @@ -652,7 +652,7 @@ type mail_predicate = { field : mail_field; contains : string; } Using the preceding code, we can construct a simple expression with `mail_predicate` as its base predicate: -```ocaml env=blang +```ocaml env=main # let test field contains = Base { field; contains } val test : mail_field -> string -> mail_predicate expr = <fun> # And [ Or [ test To "doligez"; test CC "doligez" ]; @@ -669,7 +669,7 @@ And Being able to construct such expressions isn't enough; we also need to be able to evaluate them. Here's a function for doing just that: -```ocaml env=blang +```ocaml env=main # let rec eval expr base_eval = (* a shortcut, so we don't need to repeatedly pass [base_eval] explicitly to [eval] *) @@ -693,7 +693,7 @@ Another useful operation on expressions is simplification. The following is a set of simplifying construction functions that mirror the tags of an `expr`: -```ocaml env=blang +```ocaml env=main # let and_ l = if List.exists l ~f:(function Const false -> true | _ -> false) then Const false @@ -720,7 +720,7 @@ val not_ : 'a expr -> 'a expr = <fun> We can now write a simplification routine that is based on the preceding functions. -```ocaml env=blang +```ocaml env=main # let rec simplify = function | Base _ | Const _ as x -> x | And l -> and_ (List.map ~f:simplify l) @@ -732,7 +732,7 @@ val simplify : 'a expr -> 'a expr = <fun> We can apply this to a Boolean expression and see how good a job it does at simplifying it: -```ocaml env=blang +```ocaml env=main # simplify (Not (And [ Or [Base "it's snowing"; Const true]; Base "it's raining"])) - : string expr = Not (Base "it's raining") @@ -745,7 +745,7 @@ component.
There are some simplifications it misses, however. In particular, see what happens if we add a double negation in: -```ocaml env=blang +```ocaml env=main # simplify (Not (And [ Or [Base "it's snowing"; Const true]; Not (Not (Base "it's raining"))])) - : string expr = Not (Not (Not (Base "it's raining"))) @@ -757,7 +757,7 @@ case it explicitly considers, that of the negation of a constant. Catch-all cases are generally a bad idea, and if we make the code more explicit, we see that the missing of the double negation is more obvious: -```ocaml env=blang +```ocaml env=main # let not_ = function | Const b -> Const (not b) | (Base _ | And _ | Or _ | Not _) as e -> Not e @@ -767,7 +767,7 @@ val not_ : 'a expr -> 'a expr = <fun> We can of course fix this by simply adding an explicit case for double negation: -```ocaml env=blang +```ocaml env=main # let not_ = function | Const b -> Const (not b) | Not e -> e diff --git a/book/variants/prelude.ml b/book/variants/prelude.ml index 88bbb3326..1fa5f27c6 100644 --- a/book/variants/prelude.ml +++ b/book/variants/prelude.ml @@ -1,5 +1,3 @@ -#require "core,core.top,ppx_jane";; - -open Base +#require "core,core.top";; let () = Printexc.record_backtrace false
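Note on the `book/variants/README.md` hunks: the double-negation fix to `not_` can be checked in isolation with a minimal standalone sketch. This is not code from the patch. It assumes the standard library only (the book uses `Base`, so the `List` calls here are positional rather than labeled), and the bodies of `and_` and `or_` past the first line shown in the diff are an assumption about how the elided lines continue.

```ocaml
(* Standalone sketch, OCaml standard library only (the book uses Base). *)
type 'a expr =
  | Base of 'a
  | Const of bool
  | And of 'a expr list
  | Or of 'a expr list
  | Not of 'a expr

(* Assumed completion of the simplifying constructors: a dominating
   constant collapses the whole term, neutral constants are dropped. *)
let and_ l =
  if List.exists (function Const false -> true | _ -> false) l then Const false
  else
    match List.filter (function Const true -> false | _ -> true) l with
    | [] -> Const true
    | [ x ] -> x
    | l -> And l

let or_ l =
  if List.exists (function Const true -> true | _ -> false) l then Const true
  else
    match List.filter (function Const false -> false | _ -> true) l with
    | [] -> Const false
    | [ x ] -> x
    | l -> Or l

(* The fixed version from the last hunk: an explicit case for double
   negation, and no catch-all pattern. *)
let not_ = function
  | Const b -> Const (not b)
  | Not e -> e
  | (Base _ | And _ | Or _) as e -> Not e

let rec simplify = function
  | (Base _ | Const _) as x -> x
  | And l -> and_ (List.map simplify l)
  | Or l -> or_ (List.map simplify l)
  | Not e -> not_ (simplify e)

(* The double-negation input from the diff. *)
let example =
  simplify
    (Not
       (And
          [ Or [ Base "it's snowing"; Const true ];
            Not (Not (Base "it's raining")) ]))
```

With the explicit `Not e -> e` case, `example` evaluates to `Not (Base "it's raining")` rather than the triple negation shown in the earlier hunk.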