Most people with a background in static imperative OO programming face initial difficulties when getting started with a dynamic Lisp-style functional language like Clojure. This is sad because programming in Clojure is a great experience!
The first hurdle is setting up a decent development environment that lets you enjoy the interactive nature of Clojure.
The next step would be to have a toy project. It should do significantly more than print Hello World, and it must have a direct connection to everyday programming tasks.
And finally, novices should get a well-chosen list of hints and links to continue learning on their own.
The three things listed above are exactly what we try to accomplish in a 3-hour workshop. Don't expect to have mastered the language afterwards, but you can expect to be well prepared for learning Clojure and diving deep into its ecosystem.
And this is how we get you started:
- Set up a development environment. We decided to offer VS Code with a Clojure extension as an out-of-the-box copy-deployment package.
- Create your own Clojure project with Leiningen.
- Learn to work with a REPL and structural editing (a.k.a. Paredit).
- Introduction to the syntax and some important functions.
- Jump right into a ready-made webapp based on Clojure and ClojureScript.
And if you want to continue learning Clojure afterwards, you can also join the local user group for the Cologne/Bonn area.
Each participant should bring their own notebook with at least 8 GB RAM, ready to run Java (version >= 8).
Prior amateur knowledge of at least one programming language (for example C++, Python, Ruby, Java, any Lisp, Scala) is required.
We provide bundles with Java and a customized Visual Studio Code (VSC) environment for Linux, Windows and OSX. There is a brief keyboard shortcut overview for Clojure programming.
- Max. number of participants: 20
- A room with decent power supply
- Internet access
- Projector (we can bring our own)
Exercise: On the command line create a new project called "practising" using lein new practising. Open the folder practising in VSC and then open the file src/practising/core.clj. Wait a few seconds until the REPL is started. Connect to the REPL in the embedded Terminal via lein repl :connect.
Clojure is a Lisp. Code is organised in possibly nested expressions of the form:
(operator arg1 arg2 arg3 ...)
The operator is something we can invoke, usually a function. There are also special forms and macros.
Every arg is itself either an expression or a symbol. Before it is passed to a function invocation it is evaluated, unless it is quoted.
Exercise: In the file src/practising/core.clj enter your first hello world expression: (println "Hello World"). Evaluate it; you should see the text "Hello World" printed in the REPL.
Exercise: Quoting prevents the evaluation. In the REPL try to evaluate an expression like (+ x y z) and then try '(+ x y z).
There are some notable facts about this way of using brackets:
- Code is not structured as a sequence of text lines, but as a tree of expressions. This changes the way we can navigate and manipulate our code. Lisp leverages Paredit, which is elsewhere known as structural editing. Paredit manages the balancing of brackets for you. This gives you more power once you have learned to use these new tools.
- Code organization is very uniform: It's always prefix notation. There is no need for operator precedence rules. Arithmetic operators are functions and can be used anywhere functions are applicable. On the other hand, arithmetic expressions in prefix notation look unusual, and this takes some practice.
- Expressions are in effect lists, in other words: the code of Clojure is expressed in terms of Clojure's data representation. This idea is called "code-is-data", or "homoiconicity" for those who want to sound very smart. Since a Lisp is very good at manipulating data it can easily be used to create code. Macros look just like functions but are in effect embedded code generators, written in plain Clojure, executed at compile time. All of this means: You can morph your language in almost any direction, helping you to better describe solutions for your problem domain.
- Excessive nesting of expressions and overloading in meaning of parentheses are typical drawbacks of Lisps, but Clojure mitigates them with threading macros and the use of [] and {} brackets. You'll not have more brackets in your Clojure code than in code written in your favorite C-style language.
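The "code-is-data" idea from the list above can be sketched in a few lines. This is only an illustration: the names expr and unless are made up for this example (clojure.core already ships when-not for the same purpose).

```clojure
;; Code is data: a quoted expression is just a list that we can inspect.
(def expr '(+ 1 2 3))

(first expr)   ;; => +        (a symbol)
(rest expr)    ;; => (1 2 3)
(eval expr)    ;; => 6

;; A hypothetical macro: it receives unevaluated code and returns new code,
;; which the compiler then uses in its place.
(defmacro unless
  [test & body]
  `(if ~test nil (do ~@body)))

(unless false :ran)                     ;; => :ran
(macroexpand-1 '(unless false :ran))    ;; => (if false nil (do :ran))
```

The macroexpand-1 call shows the generated code without running it, which is usually the quickest way to understand what a macro does.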
- The ;; form creates line comments, either whole line or rest of line.
- The #_ reader macro lets the reader skip the immediately following expression, which is very useful inside expressions.
- A (comment ...) must still be a well-formed S-expression and is used to encapsulate blocks of code that are used only for experimentation at development time.
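A small sketch of the three comment styles side by side; the definition total exists only for this illustration.

```clojure
;; A line comment: everything after the semicolons is ignored.
(def total
  (+ 1
     #_(* 1000 1000)   ; #_ makes the reader skip exactly this one form
     2))

total   ;; => 3

;; (comment ...) never runs its body and evaluates to nil, so it is a handy
;; scratchpad. The body must still be readable, i.e. have balanced brackets.
(comment
  (println "evaluate me by hand from the editor")
  (/ 1 0))
```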
- Numbers map to Java and JS number types
  - Automatic coercion
  - Rational numbers
- Strings are Java or JS strings
- Boolean values are true and false. Every value other than nil and false is considered 'truthy'.
- nil is null; it's the only 'falsey' value besides false.
- Symbols are used for identifiers in code
- Keywords are similar to Strings or Symbols, but can be used with namespace scoping.
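A few REPL-style expressions may make these points more tangible. The sketch assumes Clojure on the JVM; the keyword :user/id is an arbitrary example.

```clojure
;; Rational numbers: dividing integers keeps the exact result.
(/ 1 3)           ;; => 1/3
(+ 1/2 1/4)       ;; => 3/4

;; Automatic coercion: mixing an integer with a double yields a double.
(+ 1 2.5)         ;; => 3.5

;; Truthiness: only nil and false are falsey, even 0 and "" are truthy.
(if 0 :truthy :falsey)       ;; => :truthy
(if "" :truthy :falsey)      ;; => :truthy
(if nil :truthy :falsey)     ;; => :falsey
(if false :truthy :falsey)   ;; => :falsey

;; A keyword with namespace scoping.
(namespace :user/id)         ;; => "user"
```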
Exercise: Calculate the average of the numbers 32, 23 and 1 with the functions + and /.
Exercise: Concatenate 2 strings using the function str.
Exercise: Convert a string to a keyword and vice versa, using the functions keyword and name.
As part of the core, Clojure offers a small variety of immutable datastructure types:
- Vector: [1 "foo" :bar]
- Map: {:one 1, "Two" 3.0}
- Set: #{1 'Two 3.0}
- List: '(1 "Two" :three)
There are quite a few common functions that work on all datastructures in a sensible way.
Exercise: Define an example datastructure in namespace practising.core for each of the types shown above using an expression like (def myvector ...). Evaluate the whole namespace, inspect the contents of your definitions in the REPL, change one of the definitions and re-evaluate it.
Exercise: Value lookup in maps can be done in three ways. You can use get, or a map as a function on keywords, or a keyword as a function on maps. Define a map whose keys are keywords. Try out each of these ways to look up a value.
Exercise: Try to apply the following functions to each of your data structures:
- first
- rest
- last
- conj
- count
- get
- seq
- empty
Exercise: Visit the official cheatsheet, read in the section for "Collections" about the datastructure type-specific functions for maps, vectors and sets. Try out functions like conj, assoc, dissoc, disj etc. on your example data. To use typical functions for sets (like union or difference) you'll need to learn a bit about namespaces, see the upcoming section.
Exercise: Be aware that vectors are associative, with an index being the lookup key. This allows us to apply certain map-like operations to them. Use this idea to replace an existing value in a vector.
Clojure data and function definitions are organized in namespaces. Imagine a namespace as a dynamic map of symbols to Vars, and think of a Var as a box holding a piece of data or a function. (It is tempting to think of a Var as the same as a variable in imperative languages, and there are indeed similarities. However, the concept "variable" has no real meaning in functional languages. Be patient.)
Your file src/practising/core.clj has a namespace declaration at the top. Each def or defn inside it is effectively a mutation to this map, executed when the Clojure runtime loads and compiles your namespace.
You can inspect a namespace at runtime, and the symbol *ns* always refers to your current namespace. The expression (ns-interns *ns*) results in a map of all these definitions. In our example, this would return the same as (ns-interns 'practising.core).
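The "map of symbols to Vars" picture can be checked directly at the REPL. In this sketch answer is a hypothetical definition; the printed Var name depends on the namespace you evaluate it in.

```clojure
;; `answer` is a made-up definition used only for this illustration.
(def answer 42)

;; The namespace map now has an entry from the symbol to its Var ...
(get (ns-interns *ns*) 'answer)
;; prints something like #'practising.core/answer

;; ... and dereferencing the Var yields the value inside the box.
(deref (get (ns-interns *ns*) 'answer))   ;; => 42
```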
To use public definitions located in other namespaces a namespace must require them first. The typical way is like this:
(ns my.beautiful.ns
"Contains my best code ever."
(:require [clojure.string :as str]))
(defn first-funny-function
[s]
(str/split s #","))
The str here is used as an alias for anything reachable in the clojure.string namespace. Please note that this alias does not clash with the clojure.core function str.
Exercise: Require namespace clojure.set with alias set and try out functions like set/difference or set/intersection.
Clojure is a functional programming (FP) language. While object-oriented programming uses the object (together with its blueprint class) as the smallest building block, FP languages are based on functions operating on a small spectrum of datastructures.
Functions are values. This means
- we can create them anywhere with an expression (fn [x] ...).
- we can pass them to other functions (promoting these other functions to higher-order).
- we can return them as values.
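The three properties above in a compact sketch; twice is a hypothetical helper invented for this example.

```clojure
;; Create a function with an expression and bind it to a local symbol.
(let [square (fn [x] (* x x))]
  (square 4))                  ;; => 16

;; Pass a function to another function (map is higher-order).
(map inc [1 2 3])              ;; => (2 3 4)

;; Return a function as a value.
(defn twice
  "Returns a function that applies f two times to its argument."
  [f]
  (fn [x] (f (f x))))

((twice inc) 5)                ;; => 7
```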
There are two ways to define a function:
- The first, defn, is a combination of def and fn and results in a top-level function definition in a namespace, making it available for any other function:
(defn average-age
[persons]
...)
This is the equivalent of writing:
(def average-age
(fn [persons]
...))
- The other is the anonymous form (a.k.a lambda expression), occurring often within a surrounding function:
(defn wrap-logging
[handler]
(fn [request] ;; <-- creates an anonymous function
(log/debug "REQUEST:" request)
(let [response (handler request)]
(log/debug "RESPONSE:" response)
response)))
For the anonymous form there is an even more compact notation. For example, instead of
(map (fn [x] (/ x 2)) numbers)
you're allowed to write
(map #(/ % 2) numbers)
Please note that the latter form can make your code much harder to understand if the anonymous function becomes more complex.
Anonymous functions can close over symbols visible in their surrounding scope, making them closures that carry values:
(defn make-adder
"Returns a 1-arg function adding x to its argument."
[x]
(fn [y]
(+ x y)))
=> (def add-3 (make-adder 3))
#'add-3
=> (add-3 2)
5
Formal arguments are defined in a vector of symbols after the docstring of a function:
(defn round
"Round down a double to the given precision (number of significant digits)"
[d precision]
...)
A single function can support multiple arities. In addition, you can define a variadic function that accepts any number of arguments. For our workshop goals, we don't need to go into the details here. If you are curious there is guidance in the Clojure docs on functions.
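For the curious, a minimal sketch of both features. The functions greet and sum-all are hypothetical examples, not part of the workshop project.

```clojure
(defn greet
  "Two fixed arities; the zero-argument version delegates to the other."
  ([] (greet "World"))
  ([who] (str "Hello " who "!")))

(greet)             ;; => "Hello World!"
(greet "Clojure")   ;; => "Hello Clojure!"

(defn sum-all
  "Variadic: everything after & is collected into the sequence `more`."
  [x & more]
  (apply + x more))

(sum-all 1)         ;; => 1
(sum-all 1 2 3)     ;; => 6
```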
Clojure functions support a nifty way to bind data pieces in complex datastructures to local symbols, widely known as destructuring. The exact same tool is also available in let and for expressions.
As a quick glance, suppose you need to process a map entry, represented as a pair [key value], in one of your functions. Instead of writing
(defn uppercase-value
[map-entry]
[(first map-entry) (str/upper-case (second map-entry))])
you can write
(defn uppercase-value
[[key value]]
[key (str/upper-case value)])
This is called positional destructuring.
There is also support for map destructuring, useful for the very common case of processing a map like this:
(def track {:title "Be True"
:artist "Commix"
:genre "Drum & Bass"})
(defn track->str
[{:keys [artist title]}]
(str artist " - " title))
These examples provide only a first idea. Destructuring in Clojure is much more powerful, and can be extended further by libraries like plumbing. It leads to more readable code and is therefore used quite often. For more detail, you should visit the Clojure docs on destructuring.
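A slightly larger sketch shows destructuring in let and for, using the standard :or, :as and & options. The key :year is a hypothetical key that is deliberately missing from the map.

```clojure
(def track {:title "Be True" :artist "Commix" :genre "Drum & Bass"})

;; Map destructuring in let, with a default via :or and the whole map via :as.
(let [{:keys [artist year] :or {year "unknown"} :as whole} track]
  [artist year (count whole)])
;; => ["Commix" "unknown" 3]

;; Positional destructuring; & collects the remaining elements.
(let [[head & tail] [1 2 3 4]]
  [head tail])
;; => [1 (2 3 4)]

;; Destructuring each map entry inside a for expression.
(for [[k v] {:a 1}]
  [v k])
;; => ([1 :a])
```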
Functions defined with defn are public, which means any code outside the namespace can depend on them. It is good style to have, per namespace, a sharp distinction between the set of functions comprising the API and internal implementation details.
In order to limit what a namespace offers to the rest of the world Clojure allows us to attach metadata to any Var in a namespace:
(def ^:private a-constant 42)
(defn ^:private some-intermediate-calculations
[...]
...)
Since private functions are very common there is a macro defn- to reduce visual clutter:
(defn- some-intermediate-calculations
[...]
...)
This section is not really necessary to follow the workshop. It shows to the curious a little of the power that Clojure offers when working with functions.
(partial f a b ...) allows you to apply a function f to a subset of the required arguments, resulting in a new function that has those arguments fixed:
(defn add
[x y z]
(+ x y z))
(def add-12 (partial add 5 7))
=> (add-12 3)
15
(apply f coll) helps when we have an n-arity function f and an n-element collection coll, and want to invoke f with the elements of coll as arguments:
(defn add
[x y z]
(+ x y z))
(def numbers [1 2 3])
=> (add numbers)
;; will throw an ArityException
=> (apply add numbers)
6
(comp f g) composes two functions f and g (or more) so that the resulting function behaves on x like (f (g x)):
(def str->id
(comp str/trim str/lower-case))
=> (str->id " ABC ")
"abc"
(memoize f) produces a function that caches results of a function f:
(defn- my-really-costly-calculation-impl
[a b]
...)
(def my-really-costly-calculation
(memoize my-really-costly-calculation-impl))
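To see the cache at work, here is a runnable sketch with made-up names. It counts the real invocations in an atom (atoms are introduced later in this document).

```clojure
;; Counts how often the underlying function really runs.
(def !calls (atom 0))

(defn- slow-square-impl
  [x]
  (swap! !calls inc)
  (* x x))

(def slow-square (memoize slow-square-impl))

(slow-square 12)   ;; => 144, computed
(slow-square 12)   ;; => 144, served from the cache
@!calls            ;; => 1
```

Note that memoization only makes sense for pure functions, and the cache grows with every distinct argument list.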
(juxt k1 k2 ...) returns a function that looks up values for the provided keys k1, k2 etc. and delivers them in one vector. (juxt accepts any functions; keywords work here because they act as lookup functions on maps.)
(def persons [{:firstname "Peter" :lastname "Pan"}
              {:firstname "Daisy" :lastname "Duck"}])
=> (map (juxt :firstname :lastname) persons)
(["Peter" "Pan"]
["Daisy" "Duck"])
(fnil f initial-value) returns a function that replaces its first argument with initial-value in case it is nil. The benefit becomes clearer when recognizing that most "modification" functions like conj or assoc expect a collection as their first argument. When building up new datastructures fnil is an elegant tool to handle initialization cases.
(def db {})
=> (update db :persons (fnil conj []) {:firstname "Donald" :lastname "Duck"})
{:persons [{:firstname "Donald" :lastname "Duck"}]}
An important property of a function is purity. A function is called pure if its result depends only on its arguments and if it does not change anything in its environment (in other words: it has no side-effects). Pure functions are pleasant because they are
- easy to reuse,
- easy to test,
- thread-safe,
- candidates for memoization.
Not surprisingly we want to have as many of them around us as possible. However, a system created of 100% pure functions is useless: no access to any input, no place to write any output to. We need to have some of our code do the "dirty job".
So the fundamental principle of program design in FP is:
- Build as much of the system as possible as a pure transformation of data into other data.
- Allow only very few pieces of code to interact with the world outside (that is: read and write data).
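A tiny sketch of this split, using hypothetical function names and data: the pure part builds the data, and a single thin function performs the output.

```clojure
(require '[clojure.string :as str])

;; Pure: the result depends only on the argument, nothing else is touched.
(defn report-lines
  [items]
  (map (fn [{:keys [title price]}] (str title ": " price)) items))

;; Impure: the only place that talks to the outside world.
(defn print-report!
  [items]
  (println (str/join "\n" (report-lines items))))

(report-lines [{:title "Tea" :price 3} {:title "Milk" :price 1}])
;; => ("Tea: 3" "Milk: 1")
```

Testing report-lines needs nothing but data, while print-report! stays so small that there is little left to get wrong.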
Clojure offers many ways to express conditional evaluation: if, if-let, when, when-let, cond, case, condp, and on top of these there are conditional threading operators (introduced in a section below). But don't be daunted, most of the time if or cond will do, and all others offer more or less syntactic sugar to those.
Here's the grammar of if, which does not offer any surprises:
(if <test-expr>
<then-expr>
<else-expr>?)
Since the else-expr is optional, an if expression without it returns nil when the test-expr fails to return a truthy value.
If you need more than two branches then cond will help:
(cond
<test-expr1> <then-expr1>
<test-expr2> <then-expr2>
...
:else <else-expr>)
The first then-expr whose preceding test-expr returns a truthy value will be the evaluation result of the cond, otherwise the else-expr when present, otherwise nil.
The case expression is more akin to the switch in C-style languages:
(case <expr>
<value1> <then-expr1>
<value2> <then-expr2>
...
<else-expr>)
The values can be any literals, even vectors or maps. If there is no else-expr and none of the values matches the result of expr then an exception is thrown.
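The three forms described so far, in a short sketch with hypothetical functions sign and http-status:

```clojure
;; if without an else branch yields nil when the test is falsey.
(if (pos? -1) "positive")    ;; => nil

;; cond picks the first truthy test; :else works because keywords are truthy.
(defn sign
  [n]
  (cond
    (pos? n) :positive
    (neg? n) :negative
    :else    :zero))

;; case compares against literal values; the trailing expression is the default.
(defn http-status
  [code]
  (case code
    200 "OK"
    404 "Not Found"
    "Unknown"))

(sign 0)             ;; => :zero
(http-status 404)    ;; => "Not Found"
(http-status 500)    ;; => "Unknown"
```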
To learn about condp, a nifty macro that reduces clutter in a special case of branching, you should try the next exercise.
Exercise: Look at the following function and consult the docs on condp. Replace cond with condp.
(defn score->grade
[score]
(cond
(<= 90 score) "A"
(<= 80 score) "B"
(<= 70 score) "C"
(<= 60 score) "D"
:else "F"))
Surprisingly many functions work well as just one pipeline of function invocations. (The section about threading explains how we can limit the nesting of expressions.)
Of course, there are still numerous situations where we wish to bind an intermediate result within a function to a local symbol. In imperative languages we use local variables for this job, and it may look and feel as if we did the same in Clojure, but conceptually symbols just refer to evaluation results; let does not give us "boxes with varying content".
To introduce local symbols we use let, as in this example:
(defn path->filename
[path]
(let [parts (remove str/blank? (str/split path #"\/"))]
(if (not (empty? parts))
(last parts))))
In one let you can have as many symbol-expression pairs as you like, and you can use destructuring where you would normally place the symbols.
There is also if-let, which is helpful when your let body should be evaluated only if a test yields a truthy (non-nil, non-false) value:
(defn path->filename
[path]
(if-let [parts (seq (remove str/blank? (str/split path #"\/")))]
(last parts)))
The seq here is like a test, because it returns either a sequence (truthy) or nil if the resulting sequence would be empty. It's the idiomatic way of writing (not (empty? ...)).
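A quick sketch of the seq idiom on its own; describe is a hypothetical function.

```clojure
(seq [])       ;; => nil
(seq nil)      ;; => nil
(seq "abc")    ;; => (\a \b \c)

;; seq as the test of an if-let: the body only sees a non-empty sequence.
(defn describe
  [coll]
  (if-let [items (seq coll)]
    (str (count items) " item(s)")
    "nothing"))

(describe [])       ;; => "nothing"
(describe [1 2])    ;; => "2 item(s)"
```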
Excessive nesting of expressions makes it much harder to read and understand what is going on within a function. One mitigation is the use of let to introduce descriptive symbols, the other is threading.
Compare these examples, whose result is the same:
(assoc-in
(assoc-in person [:employer :name] "doctronic")
[:address :street]
"Frankenstrasse 6")
(let [person (assoc-in person [:employer :name] "doctronic")
      person (assoc-in person [:address :street] "Frankenstrasse 6")]
  person)
(-> person
(assoc-in [:employer :name] "doctronic")
(assoc-in [:address :street] "Frankenstrasse 6"))
Threading macros reorganize your code at compile time. The operator -> (called thread-first) takes the initial expression and inserts it as the first argument into the second expression, continuing until everything is nested.
It has a sibling ->> (called thread-last) doing the same with the last argument, which is often needed for sequence processing chains.
And both have cousins like cond->, cond->>, some-> and some->> that help with conditional processing steps.
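The siblings and cousins are easiest to grasp from small examples. This is a sketch with made-up data; normalize is a hypothetical function.

```clojure
(require '[clojure.string :as str])

;; ->> inserts the previous result as the LAST argument of each step.
(->> [1 2 3 4 5]
     (filter odd?)
     (map #(* % %))
     (reduce +))
;; => 35   (1 + 9 + 25)

;; cond-> applies a step only when its test is truthy.
;; The tests are ordinary expressions, they do not see the threaded value.
(defn normalize
  [person admin?]
  (cond-> person
    admin?                (assoc :role :admin)
    (nil? (:name person)) (assoc :name "anonymous")))

(normalize {:name "Ann"} true)   ;; => {:name "Ann", :role :admin}
(normalize {} false)             ;; => {:name "anonymous"}

;; some-> stops and returns nil as soon as a step yields nil,
;; so str/upper-case is never called with nil.
(some-> {:address {:street "Frankenstrasse 6"}} :address :street str/upper-case)
;; => "FRANKENSTRASSE 6"
(some-> {} :address :street str/upper-case)
;; => nil
```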
Exercise: Use macroexpand and apply it to the thread-first example shown above. (Don't forget to quote the expression to prevent the evaluation.)
Exercise: Use some->> to rewrite the path->filename function.
Clojure excels in data processing tasks. One of the reasons is the sequence abstraction that can be applied to all datastructures and a well-designed set of core functions that transform seqs into other seqs.
A consequence is that idiomatic Clojure code contains almost no looping. Another consequence is that programmers accustomed to imperative for and while loops need to re-learn how to process data on a significantly higher level. On this level, the brain is no longer bothered with irrelevant details; however, it is challenged with unfamiliar tools and solution strategies.
Exercise: Most sequence processing functions like map, filter etc. expect a sequence and ensure this by using seq on their argument. Apply seq to datastructures like a map, a set or a vector and see what is returned.
Exercise: Define a vector of persons, each with a name and an age. Write a filter expression that selects all persons aged between 20 and 29.
Clojure offers the following different approaches for processing sequences:
- map, mapcat, filter, reverse, sort and friends are usually good when you target a sequence as result. Building up a chain of these operations in combination with the thread-last macro ->> often yields the most elegant and maintainable solution. By far the biggest portion of sequence processing in idiomatic Clojure code is done on this basis.
- for is Clojure's list comprehension operator. Be careful to not confuse it with the C-style for loop. It is great for traversing nested datastructures, when your goal is a one-dimensional result sequence. It is well suited for templating code, for example when rendering HTML or XML elements. It offers destructuring, local symbols and conditions.
- reduce is a swiss army knife that can produce almost anything, sometimes leading to convoluted solutions. A reduction is often the terminal step of a sequence processing chain (for example into is only a special purpose reduction).
- doseq is the imperative variant of a list comprehension. It provides the same traversal power as for and should be used exclusively for side-effects, for example writing out a bunch of files to disk.
- Old school function recursion is still a valid approach, and may yield the cleanest code in some situations, but be aware that your call-stack may limit your problem size. The raw mechanics of full tree traversal is already provided in clojure.walk.
- loop-recur is a manual tail call optimization for recursive operations, and effectively the most low-level construct. It's sometimes unavoidable, for example if you build a parser, or need to consume or produce several distinct pieces of data. While learning Clojure you might sometimes find yourself longing for a quick loop-recur sin. Resist. Step back. Ask yourself twice if there is no better tool for the job at hand.
Coming to a decision on how to approach a collection transformation problem boils down to looking at the list above from top to bottom and picking the tool that yields the simplest code. "Simple" in the Rich Hickey sense.
Rules of thumb:
- If you need to traverse a multi-dimensional data structure then give for a try, otherwise see if a combination of map, filter and others, perhaps terminated with a reduce, does the job.
- If you need side-effects then doseq is probably the best bet.
- To aggregate data into a non-sequential value (which can also be a map) a single reduce is usually all you need.
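To make the comparison concrete, here is a sketch that applies several of the tools above to the same data. The orders structure is hypothetical and exists only for this illustration.

```clojure
(def orders
  [{:customer "Ann"  :items [{:sku "A1" :qty 2} {:sku "B7" :qty 1}]}
   {:customer "Fred" :items [{:sku "A1" :qty 5}]}])

;; 1. A chain of sequence functions with ->> when the result is a sequence.
(->> orders
     (mapcat :items)
     (map :qty)
     (filter #(> % 1)))
;; => (2 5)

;; 2. for traverses the nesting and yields a flat sequence.
(for [{:keys [customer items]} orders
      {:keys [sku qty]} items
      :when (> qty 1)]
  [customer sku])
;; => (["Ann" "A1"] ["Fred" "A1"])

;; 3. reduce aggregates into a non-sequential value, here a map sku -> total.
(reduce (fn [totals {:keys [sku qty]}]
          (update totals sku (fnil + 0) qty))
        {}
        (mapcat :items orders))
;; => {"A1" 7, "B7" 1}

;; 4. doseq is for side-effects only and always returns nil.
(doseq [{:keys [customer]} orders]
  (println "Processing order of" customer))

;; 5. loop/recur as the low-level fallback, here computing 5 factorial.
(loop [n 5 acc 1]
  (if (zero? n)
    acc
    (recur (dec n) (* acc n))))
;; => 120
```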
Exercise: Return a sequence of e-mail addresses that end with ".de"
for the following data structure:
(def friends
[{:name "Fred"
:emails ["[email protected]" "[email protected]" "[email protected]"]}
{:name "Ann"
:emails ["[email protected]" "[email protected]"]}])
Hint: You can approach this both with mapcat as well as for. Solve the problem with both and compare the solutions.
Exercise: Take the friends data structure from above and produce a map {email -> name}.
Hint: Again, mapcat and for are both sensible choices but if you compare the solutions you'll see where for starts to shine.
Clojure features so-called lazy sequences. Laziness is an optimization strategy. Here's a snippet to illustrate the effect:
(->> (range 1e6)
(take-while #(< % 100))
(filter odd?))
range returns a sequence of potentially 1 million integers, but it is not realised, so you pay almost nothing for this huge number of numbers. take-while cuts this sequence off after 99, so filter actually processes only 100 values.
Most sequence processing functions as well as for return unrealised lazy sequences, where actual processing is done as soon as someone explicitly accesses the values. Most of the time this is exactly what we prefer. But there are a few exceptions to that rule, for example:
- Processing a sequence of records in a database transaction must be finished before the transaction is committed.
- When you work on closable resources (like streams or files) you'll wrap the processing in a with-open expression. You certainly need a guarantee that any execution is finished before the resource is closed.
- If side-effects are involved they could be delayed or not executed at all if no one asks for a result value.
You have basically two explicit ways to control laziness:
- If you're interested in side-effects use doseq.
- If you work with a transaction or a resource you can append a (doall) to your processing chain or wrap your for list comprehension in a (doall ...) expression.
Forgetting to turn off laziness is a very common cause of bugs, so watch out for this.
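The following sketch makes the delayed side-effect visible and shows doall inside with-open. The names !seen and lazy-result are made up, and the reader is fed from an in-memory string instead of a real file.

```clojure
;; Records which elements have actually been processed.
(def !seen (atom []))

;; map is lazy: defining the sequence does not run the function yet.
(def lazy-result
  (map (fn [x] (swap! !seen conj x) (* 2 x)) [1 2 3]))

@!seen                 ;; => []

;; doall forces full realisation and returns the sequence.
(doall lazy-result)    ;; => (2 4 6)
@!seen                 ;; => [1 2 3]

;; with-open closes the reader at the end of its body,
;; so the lines must be realised inside the body.
(with-open [rdr (java.io.BufferedReader. (java.io.StringReader. "a\nb\nc"))]
  (doall (line-seq rdr)))
;; => ("a" "b" "c")
```

Without the doall, the lazy line-seq would escape the with-open body and fail with a closed-stream error as soon as someone tries to read it.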
The data that Clojure functions process is almost exclusively immutable. "Mutations" like assoc or conj efficiently create new versions of existing data. However, in almost every meaningful program there has to be a small amount of mutable state, which more often than not must be managed in a thread-safe manner.
For this, Clojure offers four types of "boxes", each with its own rules regarding concurrency.
- The Var is the thing that keeps functions and values in namespaces. A Var is usually initialised when a namespace is loaded, and the values usually don't change. Clojure offers an advanced feature called dynamic scoping where we can rebind Vars "down the call-stack" to different values.
- The Atom is the box that is most often used in practical tasks. It holds a single value, and changes to it happen atomically, synchronously and uncoordinated.
- The Ref is part of the in-memory transaction system that Clojure offers. Changes happen atomically, synchronously and coordinated together with all other Refs touched in the same transaction.
- Finally there is an Agent where changes happen atomically, asynchronously, generally uncoordinated, but can be triggered by a commit of a related transaction.
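Atoms are covered next. For two of the other boxes, a brief sketch may help; all names here (*indent*, !checking, !savings) are hypothetical.

```clojure
;; Var with dynamic scoping: binding rebinds it for the duration of its body,
;; including all functions called from there.
(def ^:dynamic *indent* 2)

(defn indent-str
  []
  (apply str (repeat *indent* " ")))

(count (indent-str))                          ;; => 2
(binding [*indent* 4] (count (indent-str)))   ;; => 4

;; Refs: both changes happen together in one transaction, or not at all.
(def !checking (ref 100))
(def !savings  (ref 0))

(dosync
  (alter !checking - 30)
  (alter !savings + 30))

[@!checking @!savings]                        ;; => [70 30]
```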
Let's define an Atom:
(def !counter (atom 0))
The leading "!" in !counter is just a convention to give readers of the code a clear sign that the code deals with something mutable.
The expression @!counter yields the currently set value.
To update an atom we need swap! and a side-effect free function, because in multithreaded environments updates could be retried.
(swap! !counter inc)
or
(swap! !counter + 1)
The function we provide to swap! (inc or +) receives the current value of the atom as first argument and any additional arguments that we pass to swap!. If everything is fine the result of our function is set as new value.
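The retry behaviour is what makes concurrent updates safe. As a sketch, the following starts ten plain Java threads that all increment the same atom; !hits is a made-up name, and reset! is the core function for replacing a value without looking at the old one.

```clojure
(def !hits (atom 0))

;; Ten threads, each incrementing the atom 1000 times.
(let [work    (fn [] (dotimes [_ 1000] (swap! !hits inc)))
      threads (vec (repeatedly 10 #(Thread. ^Runnable work)))]
  (run! (fn [^Thread t] (.start t)) threads)
  (run! (fn [^Thread t] (.join t)) threads))

@!hits             ;; => 10000, no update was lost

;; reset! sets a new value regardless of the current one.
(reset! !hits 0)   ;; => 0
```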
You can also create atoms inside functions, for example like this:
(defn make-id-gen
[initial-value]
(let [!count (atom initial-value)]
(fn []
(swap! !count inc))))
(def gen-id! (make-id-gen 0))
=> (gen-id!)
1
=> (gen-id!)
2
Again, the trailing "!" in gen-id! is a mere convention to remind us that gen-id! has a side-effect.