Testing Notes
Notes on Julia's testing facility and thoughts on improvements
- throw/try/catch
- error/exception
- assert function, assert macro (captures expression for output)
- runtests(filename), which does a pretty ANSI prompt then a load(filename)
- assert_approx_eq(a,b) macro/fn which confirms that a == b within a (fixed) tolerance of 1e-6
- timeit(expression,name) which runs the expression 5 times and prints the minimum elapsed time
- assert_fails macro that fails if the expression doesn't throw an exception at all
- if called via "julia runtests.jl file.jl", will run runtests(file.jl)
- process for using runtests.jl to pretty-print test files that ran successfully
- ability to run a test suite outside of the Julia build environment (i.e., test functions are in core, not in a separate directory, and they don't rely on Make)
- support for setup/teardown of temporary objects
- failures cause an error message to be printed, but not an uncaught error to be thrown
- should be inspired by PyUnit, perhaps with contributions from test_that and others
- simple
- separation of test running from output
- explicit setup/cleanup
- test discovery (execute all test_*.py files below specified directory)
- hierarchy of context(), test_that(), expect_that() (rename these, though?)
- the `expect_that(2+2, equals(4))` syntax and set of built-in relations is pretty good and easy to use
- probably tests should be run by a producer, and output be generated by a consumer
- probably macros would make even cleaner syntax: `expect_that(2+2, @equals 4)`, `expect_that(f(x), @is_false)`, `expect_that(show(z), @prints_text "hi mom")`, etc. Could maybe even do `@expect_that 2+2 equals 4`, `@expect_that f(x) is_false`, `@expect_that show(z) prints_text "hi mom"`?
- Julia doesn't do lazy evaluation, so the outer call has to be a macro, to do things like capture output and indicate what failed. But `equals`, `is_false`, etc., probably can't be bare words, because otherwise `@expect_that` would have a variable number of arguments, which is not going to work. Probably need `@expect_that 2+2 equals(4)`. The first argument gets evaluated with the result, stdout/stderr, and exceptions all being captured. The second argument gets evaluated, then parsed to determine which part of the first argument's results it is compared against.
- Evaluation results of the first (observed) argument get put in a TestObserved type with slots for each kind of output (see the sketch after this list).
- `expect_that` internally generates a TestResult type.
- `expect_that` does a `produce(tr::TestResult)`. So, there's a Task that executes each file in a list. Within each file, along with setup/shutdown code, there are these implicit producers that generate the output to the consumer.
- `context("asdf")` or `@context "asdf"` should probably just do `tls(:context, "asdf")`. The process in `expect_that` that generates the TestResult then just does `tls(:context)` to fill a slot.
- I don't much like test_that's syntax for groups of expectations, although I like the idea: `test_that("the sky is blue", { expect_that(...) })`. What's wrong with just another labeler? `@testing "the sky is blue"` seems good enough to me.
- Currently thinking the labelers should be `test_context` and `test_group`, and maybe the actual test should be `test_that`? Jeff suggested minimizing macros, so only `@test_that` would be one...
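To make the capture side concrete, here is a minimal sketch. The type names follow the notes above, but the fields, the `observe` helper, and the stdout-capture mechanics are assumptions, not a settled design.

```julia
# Sketch only: type names from the notes above; fields and helpers are assumptions.
struct TestObserved
    value              # result of evaluating the observed expression (or nothing)
    exception          # exception thrown during evaluation, or nothing
    output::String     # text written to stdout during evaluation
end

Base.@kwdef struct TestResult
    passed::Bool
    expr::String = ""       # source text of the tested expression, for output
    context::String = ""    # label filled in from test_context/test_group
    observed::TestObserved
end

# Evaluate a zero-argument function, capturing its value, any exception, and
# anything it prints to stdout. (A macro front end would wrap the tested
# expression in such a thunk and also keep its source text.)
function observe(f::Function)
    value = nothing
    exc = nothing
    output = ""
    old = stdout
    mktemp() do path, io
        redirect_stdout(io)         # temporarily send stdout to a temp file
        try
            value = f()
        catch err
            exc = err
        finally
            redirect_stdout(old)    # always restore stdout
            flush(io)
        end
        output = read(path, String)
    end
    return TestObserved(value, exc, output)
end
```

With that in place, `observe(() -> strip("\t hi \n"))` would capture the value `"hi"`, no exception, and no printed output.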
So, a simple file might look like:

```julia
test_context("String Processing")

setup = "The quick brown fox jumps over the lazy dog."

test_group("whitespace trimming functions")
@test_that strip("\t hi \n") equals("hi")
@test_that strip("hi") equals("hi")
@test_that strip("") equals("")

test_group("string length and size functions")
@test_that length("hi mom") equals(6)
@test_that length("") equals(0)

teardown = "noop"
```
runtests("filename", consumer)
runtests(["f1", "f2", "f3'], consumer)
-
runtests("dirname", consumer)
-- recursively findtest_*.jl
files within specified directory -
runtests(consumer)
-- default is above, with current working directory -
consumer
is a function name, defaulting to a simple text-based outputter of TestResults -
julia -e "runtests()"
should work
- the consumer argument should be a function that takes a Task object and consumes TestResults objects until they're gone, generating some sort of output
- the default method will display the context and one . per successful test, or an E for an unsuccessful test. At the end, all failed tests will be output, along with summary stats.
- other outputters could use graphical displays or activate lava lamps or whatever.
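A rough sketch of the default consumer described above, reusing the TestResult type sketched earlier. A Channel stands in for the produce()/consume() Task API in the notes, and the "producer" at the end just fakes two results; the names here are illustrative assumptions, not the proposed API.

```julia
# Sketch of the default consumer: one '.' per pass, 'E' per failure, then the
# failed tests and summary stats. It consumes TestResult objects from any
# iterable source; a Channel plays the role of the Task-based producer.
function text_consumer(results)
    failures = TestResult[]
    total = 0
    for tr in results
        total += 1
        if tr.passed
            print(".")
        else
            print("E")
            push!(failures, tr)
        end
    end
    println()
    for tr in failures
        println("FAILED: ", tr.expr, "  [", tr.context, "]")
    end
    println(total, " tests, ", length(failures), " failure(s)")
end

# Hypothetical producer: a real runtests() would run each test file inside a
# task and push one TestResult per expectation; here we just fake two results.
demo = Channel{TestResult}(8) do ch
    put!(ch, TestResult(passed = true, expr = "2 + 2 == 4",
                        observed = observe(() -> 2 + 2)))
    put!(ch, TestResult(passed = false, expr = "length(\"\") == 1",
                        context = "string length",
                        observed = observe(() -> length(""))))
end

text_consumer(demo)   # prints ".E", then the failure line and the summary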
If `test_that` evaluates its first argument and collects the results in a TestObserved type object, then the expectation functions return a closure/function of one argument that takes a TestObserved object and returns a TestResult.
Tentative expectations:
- `is_true`, `is_false`
- `is_close_to` -- tests using some numeric slop
- `equals`/`is_identical_to` -- tests using `isequal`
- `is_a` -- tests using `isa`
- `matches` -- string regex match
- `prints_text` -- stdout matches string (and/or regex?)
- `throws_error`
- `takes_less_than` -- performance testing
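Under that model, each expectation is just a small constructor that returns a closure from TestObserved to TestResult. A sketch of a few of them, reusing the types and `observe` helper from earlier; the tolerance default and the thunk-based `expect_that` are assumptions (the real front end would be a macro that also records the expression text).

```julia
# Sketch: each expectation constructor returns a closure from TestObserved
# to TestResult (types from the earlier sketch; details are assumptions).
equals(expected) =
    obs -> TestResult(passed = obs.exception === nothing &&
                               isequal(obs.value, expected),
                      observed = obs)

is_close_to(expected; tol = 1e-6) =        # the "numeric slop" lives here
    obs -> TestResult(passed = obs.exception === nothing &&
                               abs(obs.value - expected) <= tol,
                      observed = obs)

prints_text(text) =
    obs -> TestResult(passed = occursin(text, obs.output),
                      observed = obs)

throws_error() =
    obs -> TestResult(passed = obs.exception !== nothing,
                      observed = obs)

# Function-based expect_that for illustration only; the macro form would build
# the thunk itself and record the source text into the TestResult.
expect_that(f::Function, expectation) = expectation(observe(f))

expect_that(() -> 2 + 2, equals(4)).passed                 # => true
expect_that(() -> error("boom"), throws_error()).passed    # => true
```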
Note: Boy, namespaces would be useful here. In theory, I'd like it if `runtests` were in the core namespace, but when it runs, it imports all of the expectations, so they can be used transparently without the user having to do anything.
Stefan suggests using Julia's ability to manipulate expressions as a way of improving output without the funny syntax of expectations. So `@test f(x) == 7` gets processed by the macro three ways:
- It gets evaluated to see whether the test succeeds or not.
- The string is kept for possible output purposes.
- The AST is examined and compared against several built-in standard forms. These conditions allow the parts of the expression to be separately evaluated and stored for output purposes.
In this case, the logic is something like:
```julia
if ex.head == :comparison
    # store each side of the comparison so a failure can report both values
    result.lhs_eval = eval(ex.args[1])
    result.rhs_eval = eval(ex.args[3])
end
```
Other built-in cases could intelligently deal with thrown exceptions and so forth.
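A hedged sketch of that idea as a macro. Note that current Julia parses a simple binary comparison such as `f(x) == 7` with head `:call` (the `:comparison` head applies to chained comparisons), so the sketch keys off that form; the names and output format are placeholders, not a settled design.

```julia
# Sketch of the @test idea: keep the source text, evaluate the expression, and
# for a recognized `lhs OP rhs` form also evaluate both sides separately so a
# failure can report them.
macro test(ex)
    str = string(ex)                          # source text kept for output
    if ex isa Expr && ex.head == :call && length(ex.args) == 3 &&
       ex.args[1] in (:(==), :(!=), :(<), :(<=), :(>), :(>=))
        op, lhs, rhs = ex.args
        return quote
            lhs_val = $(esc(lhs))             # evaluate the two sides separately
            rhs_val = $(esc(rhs))
            passed  = $(esc(op))(lhs_val, rhs_val)
            passed || println("FAILED: ", $str,
                              "   lhs = ", lhs_val, ", rhs = ", rhs_val)
            passed
        end
    else                                      # fallback: no special form recognized
        return quote
            passed = $(esc(ex))
            passed || println("FAILED: ", $str)
            passed
        end
    end
end

# @test 2 + 2 == 5   # prints "FAILED: 2 + 2 == 5   lhs = 4, rhs = 5", returns false
```

The `else` branch is where the other built-in cases would go, e.g. a form that catches and reports a thrown exception.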