title |
---|
Part 2: Functions |
But, wait, isn't Haskell supposed to be a functional programming language? It's already part 2 of the course and we haven't encountered any functions yet.
Since Haskell is a functional programming language, functions are first-class values. Recall our earlier example with a value named z
which was the result of applying the +
operator to x
and y
.
λ> x = 5
λ> y = 6
λ> z = x + y
λ> z
11
Let's generalize this to a function that adds its two arguments together:
λ> z x y = x + y
λ> :t z
z :: Num a => a
λ> z 6 7
11
Now let's rename our function addIntegers
and provide a type annotation:
λ> addIntegers :: Integer -> Integer -> Integer; addIntegers x y = x + y
λ> :t addIntegers
addIntegers :: Integer -> Integer -> Integer
λ> addIntegers 6 7
11
The :t
command yields the type signature for addIntegers
: in this case a function taking two Integer
s and evaluating to a third Integer
. The function's "return" type is the rightmost type in the signature. The ->
operator is read as "to" or "maps to".
Many languages use parentheses, (
and )
, to delimit the arguments of a function application or call. Functions are so central to the Haskell Way that the language designers intentionally chose the tersest syntax possible for function definitions and function application. Thus, instead of:
Haskell uses:
f x y = x ^ 2 + y ^ 2
g x y = x ^ 3 + y ^ 3
h x y = f x y + g x y
Note that f
and g
are in scope for h
.
To further reduce the use of parentheses, Haskell also assigns the highest precedence of all infix operators to function application.
Moving our addIntegers
function to a source file and adding its type signature, we get:
addIntegers :: Integer -> Integer -> Integer
addIntegers x y = x + y
main :: IO ()
main = print (addIntegers 5 6)
To reiterate, this defines a function addIntegers
that takes an Integer
, a second Integer
and evaluates to an Integer
(analogous to "returning" an Integer
in imperative programming languages).
Since addIntegers
is a value much like z
, albeit one with arguments, it can be passed as an argument to other functions. In this respect, addIntegers
is much like z
or 5
, "hello"
or any other value.
addIntegers :: Integer -> Integer -> Integer
addIntegers x y = x + y
functionTakingAFunction :: (Integer -> Integer -> Integer) -> Integer -> Integer -> Integer
functionTakingAFunction f a b = f a b
main :: IO ()
main = print (functionTakingAFunction addIntegers 5 6)
Functions are so important in Haskell that we get to refer to them by name or with no name at all. They also get their own letter of the Greek alphabet: lambda, so-called because of the lambda calculus. Lambda calculus is a universal model of computation, equivalent in power to the Turing machine. It's based on function abstraction and function application and this is the bare minimum you need to know to get started with Haskell.
Consider the named (mathematical) function
Based on an initial simplification, rewriting in anonymous form, we obtain:
i.e. the pair of
Similarly, the identity function,
or
in anonymous form. This anonymous form is a another "thing": this time it's something that maps
Under the notion of alpha-equivalence, the exact name of the bound variable, in this case
A second simplification is to transform multiple-argument functions, such as
is equivalent to
i.e. a function that maps
Application of the function
while applying our curried version looks like:
It can be shown that single-argument functions obtained by currying multiple-argument functions in this way are equivalent in expressive power and only differ in the number of steps required to evaluate them. The transformation is also obvious and completely mechanical. Furthermore, the additional steps required to evaluate curried functions can be trivially optimized away by the compiler.
Finally, these anonymous functions can be rewritten as lambda forms:
In Haskell this becomes:
\x -> x ^ 2
The backslash is chosen due its (questionable) resemblance to the lambda character.
Similarly, the anonymous form of our
\x y -> x ^ 2 + y ^ 2
or, in curried form:
\x -> \y -> x ^ 2 + y ^ 2
At this point, it should not surprise you that these anonymous functions are values in their own right. Thus, names can be assigned to them. This brings us full circle and illustrate the equivalence of functions and anonymous functions modulo the name itself:
λ> squareSum = \x -> \y -> x ^ 2 + y ^ 2
λ> :t squareSum
squareSum :: Num a => a -> a -> a
λ> squareSum 3 4
25
So far we've applied only single functions to single values. In practice, we're likely to want to do much more than this. Consider computing the hypotenuse of a right-angled triangle. We already have our squareSum
function which we can reuse. We also happen to know that there is a standard sqrt
function that we can reuse. Let's see what we can do with our (contrived) example:
λ> squareSum x y = x ^ 2 + y ^ 2
λ> squareSumWith3 = squareSum 3
λ> sqrt (squareSumWith3 4)
5.0
We're applying a function squareSumWith3
to value 4
and then feeding its value into the input of the function sqrt
. This kind of chaining or composition of functions is so common that it gets the second most unassuming operator in the language after whitespace, e.g. .
. We can rewrite this last line as follows:
λ> (sqrt . squareSumWith3) 4
5.0
Notionally, .
is equivalent to the function compose
of type (b -> c) -> (a -> b) -> (a -> c)
: i.e. a function taking two functions and yielding a third function:
λ> compose f g x = f (g x)
λ> :t compose
compose :: (t -> t1) -> (t2 -> t) -> t2 -> t1
λ> f = compose sqrt squareSumWith3
λ> :t sqrt
sqrt :: Floating a => a -> a
λ> :t squareSumWith3
squareSumWith3 :: Num a => a -> a
λ> :t f
f :: Floating t1 => t1 -> t1
λ> f 4
5.0
Of course, we can assign our silly sqrt . squareSumWith3
expression to a name too:
λ> f = sqrt . squareSumWith3
λ> f 4
5.0
Again, this example is totally contrived and we'll build up to more realistic uses soon.
We've already discussed function application. At this point it's worth mentioning Haskell's other function application operator: $
. This is semantically identical to our existing function application model, using whitespace, but with different, much lower, precedence. It is used to reduce the need for parentheses and is commonly used to reduce code's visual clutter. Consider our toy functions again:
λ> (sqrt . squareSumWith3) 4
5.0
λ> sqrt . squareSumWith3 $ 4
5.0
This doesn't look like much, but when dealing with multiple levels of nested parentheses, $
can eliminate much of this line noise and go a long towards making the code look less Lisp-like. When you use parentheses, $
or a combination of the two, is entirely a matter of taste, though you'll need to get used to it in order to be able read other people's code.
Let's consider a minimal Haskell source file:
double x = 2 * x
treble x = 3 * x
doubleThenTreble = treble . double
main = print $ doubleThenTreble 5
Some important notes:
- There are four "top-level bindings" namely,
double
,treble
,doubleThenTreble
andmain
- They are all "visible" to one another
- The order in which they are declared does not matter: merely that they are in the same scope
- The definition of each can refer to every other top-level name, including itself, i.e. they can be recursive and mutually recursive
It is also possible to introduce child lexical scopes using either let
…in
or where
. For example:
doubleThenTreble = treble . double
where double x = 2 * x
treble x = 3 * x
This introduces the names double
and treble
into the scope of the expression on the right-hand of the =
-sign. Thus, the expression treble . double
"sees" these names. However, they cannot be seen by other top-level bindings.
Here is are two equivalent uses let
…in
, also known as let
-bindings:
doubleThenTreble' =
let double x = 2 * x
treble x = 3 * x
in treble . double
doubleThenTreble'' =
let double x = 2 * x; treble x = 3 * x
in treble . double
This illustrates that let
introduces one or more new names bound to expressions in
the scope of another expression. The value of the overall expression is that of the in
-expression. The doubleThenTreble''
version illustrates that multiple expressions can be collapsed onto a single line using a semicolon. They are otherwise identical.
There is yet another option:
doubleThenTreble''' =
let double x = 2 * x
in let treble x = 3 * x
in treble . double
This is semantically slightly different in that treble
is not visible at the point at which the name double
is assign. This is a nested let
-binding: double x = 2 * x
is introduced into the scope of the expression let treble x = 3 * x in treble . double
which itself introduces treble x
into the scope of the expression treble . double
.
It should be clear, therefore, that let
…in
is generally more powerful than where
but that, in the simple case, they are equivalent. In this scenario, it is really a matter of personal preference: Should the names be (visually) introduced before they're referenced or after they're referenced.