This handbook isn't intended as a comprehensive reference to Squeal; that's what the Haddock is for. Instead, this is meant to give you an understanding of the fundamental parts of Squeal: its core types and typeclasses, as well as its heavy use of phantom types. Once you understand these, which are the most complicated part of learning to use Squeal, you should have a solid base to figure out everything else.
At its core, you can view Squeal as a small group of easy-to-understand types
(Query
, Manipulation
, Statement
, Expression
, FromClause
, and
TableExpression
) that have hard-to-understand type parameters (Expression grouping lat with db params from ty
). The former map to your existing
understanding of SQL in a fairly obvious way; the latter make sure that your
queries are actually valid.
We can get our first, most basic understanding of Squeal by ignoring the type parameters entirely and looking specifically at those core types.
Imagine, if you would, a world where we only had those six types above, with no type parameters attached to them.
That would be possible, right? And you can see how they could fit together to form larger queries. That gives us the composability we wanted.
Query
and Manipulation
line up cleanly with what we expect a SQL query to look like, so let's start by
looking at those two types.
Query
s represent mainly SQL SELECT. Manipulation
s represent UPDATE,
INSERT INTO, and DELETE FROM.
query = select
(List $ #u ! #name `as` #userName :* #e ! #email `as` #userEmail)
( from (table (#users `as` #u)
& innerJoin (table (#emails `as` #e))
(#u ! #id .== #e ! #user_id)) )
Query
s have two parts, the description of the columns selected, and the
TableExpression
that describes where to select those columns from.
query = select
(List $ #u ! #name `as` #userName } the column selection
:* #e ! #email `as` #userEmail) }
( from (table (#users `as` #u) }
& innerJoin (table (#emails `as` #e)) } the TableExpression
(#u ! #id .== #e ! #user_id)) ) }
A TableExpression
is generated from a FromClause
,
which only describes the joins. You convert a FromClause
to a TableExpression
using from
. Once you
have a TableExpression, that's where functions like where_
and having
let you restrict the rows you get back.
( from }
( table (#users `as` #u) } }
& innerJoin } a FromClause } the whole thing is
(table (#emails `as` #e)) } } a TableExpression
(#u ! #id .== #e ! #user_id)) } }
& where_ ... }
) }
Manipulations aren't particularly different from Querys. The one thing that
should be pointed out is that all of them specify a table using an Aliased (QualifiedAlias sch) (tab ::: tab0)
, and that UPDATE and DELETE have a
UsingClause
instead of a TableExpression
to specify additional joins. UsingClause
specifically
takes a FromClause
, to prevent you from doing things like putting WHERE
or HAVING
on them.
manip = deleteFrom
#users
(Using (table #emails))
(#users ! #id .== #emails ! #user_id)
(Returning_ Nil)
Once you get into specifying the actual bits of data you care about, you're
primarily concerned with generating values of type Expression
.
We're still ignoring most of the type parameters in Squeal's types, but in the case of
Expression the very last one, ty
, is of interest to us.
-- imagine a stripped-down version of Expression
data Expr ty
-- this is for teaching purposes; will not typecheck
foo :: Expr ('NotNull 'PGint4)
foo = #table ! #field1
You'll construct Expressions (and Condition
, which is just an alias for
Expression (null 'PGbool)
) everywhere.
The one last piece of the puzzle that we haven't explained yet: how do you
specify which columns in a Query that you're returning? How do you specify
which columns you're updating in an UPDATE Manipulation? The key here is the
NP
type, which Squeal reexports from generics-sop
. You'll see this type in a
lot of different places. It will also confuse the hell out of you the first time
you see it in documentation.
selectedColumns :: NP '[ "userName" ::: 'NotNull 'PGtext, "userEmail" ::: 'Null 'PGtext ]
selectedColumns =
#users ! #name `as` #userName
:* #emails ! #email `as` #userEmail
It's a way for Squeal to have heterogeneous lists that also typecheck against the
DB schema. We'll talk about how that typechecking works later, once we
understand Squeal's type parameters. From a practical perspective, mainly
you need to know that you construct them using (:*)
.
All of the types we've talked about so far live in "Postgres-land;" they only
know about Postgres types, and have no knowledge of how to translate those types
into actual in-memory Haskell data. That translation lives inside the
Statement
type, which lives one level above Query and Manipulation. Both
Query and Manipulation can be converted to Statement by providing an encoder to
convert the query's params from Haskell types to Postgres types, and a decoder
to do the opposite for the query's results. We'll look at encoding and decoding
to/from PG later; just remember that conceptually the Statement type is there to
combine a raw query with an associated Haskell <=> PG codec.
Let's put all this together in a concrete example. Say we had the following typical query:
getUsers :: Statement DB () User
getUsers = query $ select
(List $ #u ! #name `as` #userName :* #e ! #email `as` #userEmail)
( from (table (#users `as` #u)
& innerJoin (table (#emails `as` #e))
(#u ! #id .== #e ! #user_id)) )
Since all this is is an expression built out of the smaller types we've looked at above, we can expand this out into its constituent parts and see just how everything fits together. Ignore the amount of noise in the type parameters; focus on the type heads like Query, Expression etc. that we talked about.
getUsersQuery :: Query with lat DB params '[ "userName" ::: 'NotNull 'PGtext, "userEmail" ::: 'Null 'PGtext ]
getUsersQuery =
select
(List getUsersSelection)
( from (table (#users `as` #u)
& innerJoin (table (#emails `as` #e))
(#u ! #id .== #e ! #user_id)) )
getUsersSelection :: NP (Aliased (Expression 'Ungrouped with lat DB params
'[ "u" ::: '[ "id" ::: 'NotNull 'PGint4
, "name" ::: 'NotNull 'PGtext
]
, "e" ::: '[ "id" ::: 'NotNull 'PGint4
, "user_id" ::: 'NotNull 'PGint4
, "email" ::: 'Null 'PGtext
]
]))
'[ "userName" ::: 'NotNull 'PGtext, "userEmail" ::: 'Null 'PGtext ]
getUsersSelection =
#u ! #name `as` #userName :* #e ! #email `as` #userEmail
getUsersFrom :: TableExpression 'Ungrouped lat with DB params
'[ "u" ::: '[ "id" ::: 'NotNull 'PGint4
, "name" ::: 'NotNull 'PGtext
]
, "e" ::: '[ "id" ::: 'NotNull 'PGint4
, "user_id" ::: 'NotNull 'PGint4
, "email" ::: 'Null 'PGtext
]
]
getUsersFrom =
( from (table (#users `as` #u)
& innerJoin (table (#emails `as` #e))
getUsersJoinExpr))
getUsersJoinExpr :: Expression 'Ungrouped lat with DB params
(Join
'[ "u" ::: '[ "id" ::: 'NotNull 'PGint4
, "name" ::: 'NotNull 'PGtext
]
]
'[ "e" ::: '[ "id" ::: 'NotNull 'PGint4
, "user_id" ::: 'NotNull 'PGint4
, "email" ::: 'Null 'PGtext
]
])
('Null 'PGbool)
getUsersJoinExpr =
#u ! #id .== #e ! #user_id
These definitions can be loaded into the REPL. You can try playing around with them if you'd like to get a feel for how the types work. You'll need the following DB schema type in scope:
type UsersColumns =
'[ "id" ::: 'Def :=> 'NotNull 'PGint4
, "name" ::: 'NoDef :=> 'NotNull 'PGtext
]
type UsersConstraints = '[ "pk_users" ::: 'PrimaryKey '["id"] ]
type EmailsColumns =
'[ "id" ::: 'Def :=> 'NotNull 'PGint4
, "user_id" ::: 'NoDef :=> 'NotNull 'PGint4
, "email" ::: 'NoDef :=> 'Null 'PGtext ]
type EmailsConstraints =
'[ "pk_emails" ::: 'PrimaryKey '["id"]
, "fk_user_id" ::: 'ForeignKey '["user_id"] "public" "users" '["id"]
]
type Schema =
'[ "users" ::: 'Table (UsersConstraints :=> UsersColumns)
, "emails" ::: 'Table (EmailsConstraints :=> EmailsColumns)
]
type DB = Public Schema
So far we've been using (!)
and as
without really understanding them; we
just use them as if they're equivalent to SQL .
and AS
, and assuming that
they'll do the right thing.
But what's the actual type of this fragment?
#some_table ! #yay
When we were writing simple queries, we didn't particularly have to care about what type this is. But as we write more complicated queries and start to think about making our queries more modular, we need to be able to express the types of values like this.
For instance, depending on where you use it, it might be an Expression:
foo :: Expression 'Ungrouped lat with db '[] '[ "some_table" ::: '[ "yay" ::: 'NotNull 'PGint4 ] ] ('NotNull 'PGint4)
foo = #some_table ! #yay
But it can also be an NP:
bar :: NP (Expression 'Ungrouped lat with db '[] '[ "some_table" ::: '[ "yay" ::: 'NotNull 'PGint4 ] ]) '[ 'NotNull 'PGint4 ]
bar = #some_table ! #yay
It can even become a table name:
baz :: Aliased (QualifiedAlias "some_table") ("yay" ::: "yay")
baz = #some_table ! #yay
fromClause :: FromClause lat with
'[ "some_table" :::
'[ "yay" ::: 'Table
('[] :=> '[ "a" ::: 'NoDef :=> 'NotNull 'PGint4 ])
]
]
'[]
'[ "yay" ::: '[ "a" ::: 'NotNull 'PGint4 ] ]
fromClause = table baz
The key here is that both (!)
and as
are polymorphic in their return
type. This polymorphism is key to making Squeal convenient to write, but it can
be very confusing when first starting to use the library, as it makes it hard to
understand how to construct a certain Squeal type. This misunderstanding seems
like a consequence of how we usually use polymorphic returns. Usually the
polymorphic return is on a function that does some kind of conversion or
processing; canonical examples are things like read
from Base, or fromJSON
from Aeson. In Squeal, that type conversion happens when you name things. It
accomplishes the goal of embedded SQL in Haskell in a relatively easy-to-read
way, but it takes some getting used to.
What (!)
can return is defined by the typeclass IsQualified
;
similarly, as
uses a typeclass called Aliasable
.
If you squint through the mess of type parameter noise, you can see that this
is how certain types are constructed (for instance, our NP lists of expressions
in select lists, and expressions themselves):
and thus, how you can use (!)
and as
in so many places.
This is especially important around uses of NP and update lists; a common mistake is writing something like this while doing SELECTs:
select
(List $
#table ! #field1
:* #table ! #field2
:* Nil)
(from ...)
But this final Nil
is actually unnecessary! The NP cons operator :*
takes
another NP as its second argument, which is why you might think you need Nil to
terminate the NP list. But if you look at the instances of IsQualified
, it can
polymorphically return an NP already!
So we can simplify our definition a bit:
-- this typechecks
select
(List $
-- note that these two lines have different types
(#table ! #field1 :: Aliased (Expression _ _ _ _ _) _)
:* (#table ! #field2 :: NP (Aliased (Expression _ _ _ _ _) _)))
(from ...)
We'll see later that this is also Squeal's preferred way to handle things like CTEs.
What does all this mean for us? It gives us the type signatures we'd need when
we want to, say, have a common query that's parameterized by a table name, or a
query that needs to be parameterized by which column to run a WHERE on. For
instance, for a table name as a parameter, we'd want an Aliased (QualifiedAlias sch) tab
, since that's the parameter for table
. We can then be assured that
callers can construct values of this type, since we can see an instance for
IsQualified
for it.
Following the typeclasses, we see that we can do something similar for
Expression, since there are instances for IsQualified
and Aliasable
for
generating Expressions as well.
If you wanted to be completely general and allow a Squeal identifier to be used in, say, both an expression and table name position, you could use the IsQualified and Aliasable constraints directly. In general this doesn't seem to be very useful though.
One last thing is that while the return type of (!)
is polymorphic, its
arguments are not; it takes two Alias
es, which are essentially renamed
Proxy
's. Since you can also generate Aliases from labels like #table
, this
provides another avenue you can use to make your queries more general and
modular; take in Aliases as parameters and use (!)
to convert to the type you
need on-demand.
It's finally time to talk about the heart and soul of Squeal, what makes it tick: the plethora of type parameters on all of its core types.
This is where Squeal handles the heavy lifting of ensuring that all of your queries are well-formed, that they reference columns that actually exist, that your SQL comparisons are well-typed, that the columns that you insert as DEFAULT actually have default values, and so on.
To refresh your memory, here's the full type signature of Expression
, in all of
its phantomly-typed glory:
newtype Expression grp lat with db params from ty
One quick note before we dive into these. Throughout Squeal you'll see the
type operator (:::)
. For instance, Squeal uses type-level specifications of
table columns that look like so:
type Columns =
'[ "col1" ::: 'NotNull 'PGbool
, "col2" ::: 'NotNull 'PGint4
]
In the Squeal ecosystem, this essentially means "type of." Note that it's 3
colons, not 2 like for normal type signatures. Underneath the hood, (:::)
is
just an alias for type-level tuples, so the above type is equivalent to the
following:
type Columns =
'[ '("col1", 'NotNull 'PGbool)
, '("col2", 'NotNull 'PGint4)
]
With that out of the way, let's go through Squeal's type parameters one-by-one, shall we?
In SQL, an expression is either a "normal" value or an aggregate value. You probably already understand this intuitively, even if you never put a name on it. For instance, let's say you wrote a query like the following:
CREATE TABLE foo
( user_id INT4 NOT NULL
, value INT4 NOT NULL
);
SELECT user_id, SUM(value)
FROM foo
GROUP BY user_id;
In the select list of the above query, user_id
and SUM(value)
are both
"aggregate" values; user_id because it's included in the GROUP BY, and
SUM(value) since it's directly calling an aggregate function.
If we don't call any aggregating functions or do a GROUP BY, we have "normal" values instead:
SELECT user_id, value
FROM foo;
It's important to understand the distinction, because you can't mix normal and aggregate values in the same query:
SELECT user_id, SUM(value)
FROM foo;
-- ERROR: column "foo.user_id" must appear in the GROUP BY clause or be used in an aggregate function
-- LINE 1: SELECT user_id, SUM(value)
Postgres barfs, because SUM constrains it to only produce one row, but then what
should the user_id
of that row be? The complicated part is that an identifier
like user_id
could be either a normal or an aggregate value, depending on the
surrounding context. So Squeal is obligated to keep track of whether an
expression was generated from an aggregate function or is part of the GROUP BY,
which it does using the grp
type parameter on an Expression.
It can either be 'Ungrouped
, meaning a normal value, or 'Grouped bys
, which
indicates that Expression is valid in any context where the columns bys
are
grouped.
For instance, selecting #table ! #field1
in an aggregated context
would want a type like Expression ('Grouped '[ "table" ::: "field1" ]) ...
,
indicating that this selection isn't valid without that field grouped. Calling
a function like Squeal's count
would return a type like Expression ('Grouped bys) ...
, where the type variable
being abstract means that it doesn't care what columns the query is grouped by,
just that it's grouped in some way.
However, if you use an aggregate function, the query does have to be grouped,
which can cause a confusing error the first time you see it. If you're used
to writing things like SELECT COUNT(id) FROM ...
in SQL, you might try to
naively translate this into Squeal like so:
foo :: Query lat with
'[ "public" :::
'[ "table" ::: 'Table
('[] :=> '[ "id" ::: 'NoDef :=> 'NotNull 'PGint4 ])
]
]
'[]
'[ "a" ::: 'NotNull 'PGint8 ]
foo = select
(count (All $ #table ! #id) `as` #a)
(from (table #table))
Which will promptly fail with a type error:
• No instance for (Aggregate AggregateArg (Expression 'Ungrouped))
arising from a use of ‘count’
• In the first argument of ‘as’, namely
‘count (All $ #table ! #id)’
In the first argument of ‘select’, namely
‘(count (All $ #table ! #id) `as` #a)’
In the expression:
select (count (All $ #table ! #id) `as` #a) (from (table #table))
What this is essentially telling you is that you forgot to group your query,
which you need to do even if you're running your aggregate on all rows. We
do this by explicitly telling Squeal to group on no columns, which is one of the
few legitimate uses of Nil
:
foo :: Query lat with
'[ "public" :::
'[ "table" ::: 'Table
('[] :=> '[ "id" ::: 'NoDef :=> 'NotNull 'PGint4 ])
]
]
'[]
'[ "a" ::: 'NotNull 'PGint8 ]
foo = select
(count (All $ #table ! #id) `as` #a)
( from (table #table)
& groupBy Nil -- group by nothing!
)
You will mess this up at least 2 times.
When a column is specified in the selection list, how do you know when it's valid? This may seem like a silly question, but consider the following (stupid) query:
SELECT bar.id FROM foo;
No one would write a query like this. Syntactically it seems fine. But it's
nonsense, because we haven't pulled in the bar
table to our query! And
Postgres agrees with us.
ERROR: missing FROM-clause entry for table "bar"
LINE 1: SELECT bar.id
Every query creates a scope for identifiers, separate from the database scope of table names and schemas, and each join adds a new name to this scope. This is plain to see if we use AS to rename a table that we pull in.
SELECT f.user_id FROM foo AS f;
-- ok
SELECT foo.user_id FROM foo AS f;
-- ERROR: invalid reference to FROM-clause entry for table "foo"
-- LINE 1: SELECT foo.user_id FROM foo AS f;
The name f
is only in scope for this query. Even when you don't specify
an alias in the SQL, it works as if you had specified the alias as the table's
name, which we can see since the latter query gets rejected by Postgres.
In order to figure out whether identifiers like #table ! #field
are valid,
Squeal needs to track this scope as well. It does so using the from
type
variable.
In fact, this type variable is all you need to write standalone expressions.
-- doesn't typecheck
badExpr :: Expression 'Ungrouped '[] with db params from ('NotNull 'PGint4)
badExpr = #table ! #field
-- does typecheck
goodExpr :: Expression 'Ungrouped '[] with db params
'[ "table" ::: '[ "field" ::: 'NotNull 'PGint4 ] ]
('NotNull 'PGint4)
goodExpr = #table ! #field
Whenever you create an expression by referring to a column by name, it's this
scope inside the from
type variable that Squeal checks to ensure that the
reference is valid. If you're getting HasErr
constraint violations, it's
likely that the contents of this variable aren't in the right form.
The only way to add things to from
is by using joins. If you look at the type
signature of the various join functions that Squeal provides, they take an
existing FromClause
and add an extra table to it.
table
:: (Has sch db schema, Has tab schema (Table table))
=> Aliased (QualifiedAlias sch) (alias ::: tab)
-> FromClause lat with db params '[alias ::: TableToRow table]
-- `table` (and `view`, `common`, `subquery`) produce a FromClause
-- containing a single set of columns...
innerJoin
:: FromClause lat with db params right
-> Condition Ungrouped lat with db params (Join left right)
-> FromClause lat with db params left
-> FromClause lat with db params (Join left right)
-- ...which the join functions append to the tables already in `from`
For a more complicated query, here's what the value of from
may end up
looking like:
type QueryFrom =
'[ "table1" :::
'[ "id" ::: 'NotNull 'PGuuid
, "field1" ::: 'Null 'PGint4
, "field2" ::: 'NotNull 'PGtext
]
, "table2" :::
'[ "id" ::: 'NotNull 'PGuuid
, "field1" ::: 'NotNull 'PGbool
]
, "table3" :::
'[ "id" ::: 'NotNull 'PGuuid
, "table2_id" ::: 'NotNull 'PGuuid
, "created_at" ::: 'NotNull 'PGtimestamptz
]
]
type DB = -- ... some schema with the appropriate tables ...
froms :: FromClause lat with DB params QueryFrom
froms =
( table (#table1 `as` #table1)
& innerJoin (table (#table2 `as` #table2))
(#table2 ! #id .== #table1 ! #id)
& innerJoin (table #table3)
(#table3 ! #table2_id .== #table2 ! #id)
)
Note that the order of tables in from
needs to be the same order in which they
were joined.
Explicitly providing a type alias for which tables a query has access to
like this is often useful, especially when you want to allow callers of your
query to pass in custom expressions. For instance, if you wanted to allow
callers to pass in a filtering condition, you could do that by ensuring that
the from
in the passed-in Expression is valid for the tables that the query
joins on.
parameterized ::
Expression 'Ungrouped lat with db params QueryFrom ('NotNull 'PGbool)
-> Query lat with db params '[ "a" ::: 'NotNull 'PGint4 ]
parameterized cond =
select
(#table1 ! #field1)
( from froms
& where_ cond
)
Then, as you'd expect, any callsite could use any columns specified in that query scope, while rejecting any references that aren't in scope.
-- these typecheck
qry1 = parameterized (#table1 ! #field2 .== "mobile")
qry2 = parameterized (#table2 ! #field1)
-- these don't
qry3 = parameterized (#table4 ! #id .== "SimSpace")
qry4 = parameterized (#table1 ! #nonfield)
Note the "flow" of type-level data going on here: we start off with existing
table definitions in db
, which get pulled into the value of from
by using
innerJoin
, leftOuterJoin
, etc. Once from
has the right type-level value,
now expressions like #table1 ! #field2
will satisfy their IsQualified
constraint and typecheck.
Remember when I said that from
was the only place that Squeal looked when
it was checking whether column references were valid? That was a slight lie.
Squeal actually checks one other place, the lat
type param.
You can see this in the various IsQualified instances for Expression, Aliased Expression, etc.
Other than joins, there's one other place where identifiers can come into scope for a query: a Postgres-specific feature called lateral joins. If you've used Microsoft SQL Server before, you might know these as CROSS APPLYs, but if not, it's totally understandable if you've never seen a lateral join before.
To understand what a lateral join does, it's helpful to compare it to a normal subquery join. When you do a normal subquery join, the subquery is a completely different query from the one it's being joined into, which means that the subquery can't access anything that's in scope from the parent query.
SELECT *
FROM some_table st
JOIN (SELECT * FROM some_other_table sot WHERE st.id = sot.table_id)
-- error: st is not in scope in the subquery
But sometimes having those values in scope would be extremely useful. Say you have a table for your users, and another table containing login events.
CREATE TABLE "user"
( id SERIAL PRIMARY KEY NOT NULL
, username TEXT NOT NULL
, email TEXT NOT NULL
, created_at TIMESTAMPTZ NOT NULL
);
CREATE TABLE "login_event"
( user_id INT4 NOT NULL REFERENCES "user" (id)
, source TEXT NOT NULL -- web|mobile|embed
, timestamp TIMESTAMPTZ NOT NULL
);
How would you write a query to get, for each user, the latest login, including all the username/email information, and the source information? It's possible using normal subqueries, albeit somewhat painful to write. If you don't believe me, stop reading and try writing a raw SQL query to get this data.
Lateral joins give you a more convenient way to write "dependent" queries like this. They act more like a "foreach" loop: Postgres will run the lateral subquery for each row being joined to, and calculate separate sets of rows to join with.
SELECT u.id, u.username, u.email, u.created_at,
le.source, le.timestamp
FROM public.user AS u
INNER JOIN LATERAL
(SELECT le.source, le.timestamp
FROM public.login_event AS le
WHERE le.user_id = u.id
ORDER BY le.timestamp DESC
LIMIT 1) AS le
ON TRUE;
Helpful, right? But since it's another way in which identifiers can come into scope for a query, Squeal needs a way to represent it.
Structurally, lat
looks exactly like from
; it's a mapping of table
names to lists of columns and their types. You'll never have to worry about it
unless you're using lateral joins; Squeal provides explicit lateral versions
of all the normal join functions which are there if you need them. Since most
of the time you won't be using these, well...
We saw in the discussion of from
how that type parameter gets populated by
usages of functions like innerJoin
, leftOuterJoin
, and so on. But the
table types that get joined on have to come from db
, which we can see
happening in the type signatures for table
and view
:
table
:: (Has sch db schema, Has tab schema (Table table))
=> Aliased (QualifiedAlias sch) (alias ::: tab)
-> FromClause lat with db params '[alias ::: TableToRow table]
view
:: (Has sch db schema, Has vw schema (View view))
=> Aliased (QualifiedAlias sch) (alias ::: vw)
-> FromClause lat with db params '[alias ::: view]
It can be a little easier to see what's going on here if we have an explicit DB type. Let's define one.
type DB = '[ "public" ::: Schema ]
type Schema = '[ "users" ::: Table UsersTable ]
type UsersTable =
'[] :=> '[ "id" ::: 'NotNull 'PGuuid ]
-- using our new type
table (#public ! #users)
Walk through the constraints being discharged: the first Has sch db schema
becomes Has "public" DB schema
, checking whether the "public" schema exists
within our database type. The second Has
becomes Has "users" schema (Table table)
,
checking whether there's a correctly named table within the schema found by
the first constraint. Once those constraints are discharged, the compiler
is happy to give us a FromClause
which we can use in our joins and queries,
thus bringing columns into scope.
That's all db
is: a type-level tree structure representing your entire
database schema, such that downstream code can figure out what tables and
columns exists, and thus, whether your queries are reasonable. You can think
of it as the source of truth about column types that everything starts from.
In practice most of the compilation errors you encounter won't be caused by your
db
type. It's not usually a go-to type parameter to put constraints on for
abstraction, either. Squeal does not provide any functionality for keeping
your top-level DB type definition in sync with your actual Postgres schema,
however. Ensuring that your DB type reflects reality is your responsibility.
You've probably seen SQL queries of the form WITH <subquery> AS <name> SELECT ...
,
like a let-binding but for a subquery. Squeal has support for these sorts of
queries as well, using the with
function.
qry :: Query lat with DB params '[ "a" ::: 'NotNull 'PGbool ]
qry = with
(select (true `as` #a) (from (table #users)) `as` #cte)
(select Star
(from (common #cte))) -- note the use of `common` to bring the CTE into scope
-- for this query
Since these subqueries are scoped to each individual query, Squeal needs some
way of keeping track of them so that it can check that usages of common
are
valid. As you might guess, that storage is happening in the with
type
parameter.
with
( select (true `as` #a) (from (table #users)) `as` #subq1
:>> select (false `as`#b) (from (table #users)) `as` #subq2
)
<some-query>
-- <some-query> :: Query ... with ...
-- where `with` = '[ "subq1" ::: '[ "a" ::: 'NotNull 'PGbool ], "subq2" ::: '[ "b" ::: 'NotNull 'PGbool ] ]
Note that when passing the CTEs to with
, we use (:>>)
to construct the list.
When constructing select lists and such, we were using (:*)
, but here the
argument has a different type. Instead of a value of type NP
, with
takes in
a value of type Path
.
The practical difference is that subqueries further in the list can use subqueries
earlier in the list. Just remember that you need to use this pointier list
constructor when using CTEs.
In theory with
can be a useful point of abstraction, by specifying that some
query needs a subquery of some other type in scope. In practice you can
usually just pass that subquery as a Haskell parameter.
One small trick about using CTEs in Squeal: with
requires its subqueries to
all be the same type, either all Query
's or all Manipulation
s. That's
a little annoying, since it seems like you couldn't, say, run an update and
then do a query on the updated rows; you'd have a Manipulation
and a Query
in the same CTE expression. But Squeal provides a convenient function
queryStatement
to convert from a Query
to a Manipulation
(since any query is just a manipulation
that touches no rows). That'll allow you to implement cases like these.
A caution about the queries in a CTE block: Squeal represents them as a sequential path, but Postgres actually runs them concurrently, so their order is unpredictable. So you may not be able to rely on them to, say, run an update statement and a query in the order they're written. You can resolve this by having one statement in the query depend on a value returned by another one.
When generating queries, it's possible to write normal Haskell functions that
take in values and inline those values into your queries. However, Squeal also
provides a built-in hole for parameters, which you use through the param
function instead of inlining them.
qry
:: params ~ '[ 'NotNull 'PGint4, 'NotNull 'PGtext ]
=> Query lat with DB params '[ "email" ::: 'Null 'PGtext ]
qry = select
(#e ! #email)
( from (table (#users `as` #u)
& innerJoin (table (#emails `as` #e))
(#e ! #user_id .== #u ! #id))
& where_ (#u ! #id .== param @1) -- specify which param with TypeApplications
& where_ (#u ! #name .== param @2) -- note that the params are 1-indexed
)
The main reason to specify the params
variable and pass values in this way
is to make some common value available to all the components of a query, without
needing to explicitly pass it around. For instance, when writing a query with
CTEs, you'll probably have some common ID that all parts of the query need.
qry
:: params ~ '[ 'NotNull 'PGuuid ]
=> Query lat with DB params ...
qry = with
( ... -- all CTE definitions here can make use of the UUID param
)
... -- as can the main query
Another distinction between inlining and parameters is that parameters are supplied to the DB using Postgres' binary format, and thus aren't susceptible to SQL injection attacks. Inlined values are, as the name implies, directly inlined in the SQL string that gets sent to the database, and thus quite a bit more dangerous.
Because of this, you should prefer passing data into a query using parameters whenever possible.
There isn't much else to say about the params
type variable; this is all it's
used for.
Nothing special here, it's just the Postgres type and nullity of the expression.
You can see the possible nullities on the constructors of NullType
,
and the possible types on the constructors of PGType
.
Despite everything we've looked at, we still don't know how to bridge the gap between Haskell-land and Postgres-land. How do we get the data from the database into a form that our program can understand?
As mentioned earlier, the type that bridges this gap is Statement
, as it
combines some query with encoders for parameters, and decoders for PG rows.
Squeal provides the query
and manipulation
functions for converting Query's and Manipulations into Statement, but note
that these rely on Squeal's generic encoding/decoding, which takes away control
of the translation. Sometimes it works, but most of the time you're not doing
1-to-1 translation between a record type and rows with exactly the same number
of columns, where the column names are exactly the same as the record fields.
For those cases, it's better to write your codecs manually.
My recommendation is to use the constructors of Statement
directly: Query
and Manipulation
. You can see that these take in
arguments of type EncodeParams
and DecodeRow
.
For EncodeParams
, you need to turn the parameter type into a list of functions
for pulling the individual values using (.*)
and (*.)
. (*.)
is for the
very last element of the list, (.*)
is for everything else.
data SomeType = SomeType { a :: Bool, b :: Int32, c :: Int64 }
-- Notice how we don't actually take in a value of SomeType,
-- we just create a list of accessor functions
enc :: EncodeParams db '[ 'NotNull 'PGbool, 'NotNull 'PGint4, 'NotNull 'PGint8 ] SomeType
enc = a .* b *. c
You will mix up these operators and get a confusing type error at least once.
DecodeRow
implements both Monad
and IsLabel
, which lets you construct
decoders like the following:
-- Note how we can decode from a row with completely different
-- column names from our record type
dec :: DecodeRow
'[ "a_bool" ::: 'NotNull 'PGbool
, "a_4byteint" ::: 'NotNull 'PGint4
, "an_8byteint" ::: 'NotNull 'PGint8
]
SomeType
dec = do
a <- #a_bool
b <- #a_4byteint
c <- #an_8byteint
pure $ SomeType { a = a, b = b, c = c }
Since it implements Monad
, you can implement complicated conditional parsers,
key off of certain columns being null or non-null to parse further columns,
construct intermediate data structures from groups of fields, etc.