Flipping the script: testing Go binaries
Testing command-line tools in Go is easy when you flip the script.
By relieving the brain of all unnecessary work, a good notation sets it free to concentrate on more advanced problems.
—Alfred North Whitehead, “An Introduction to Mathematics”
Wouldn’t it be great if we could write Go CLI tests almost like shell scripts?
Testing CLI tools in Go by running the binary directly can get complicated: our test would need to execute the Go tool to build the binary, then execute the binary, passing it arbitrary arguments, and then we’d have to check the exit status, parse the output, and so on.
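To get a feel for what that involves, here is a rough sketch of the do-it-yourself approach using only the standard library. The helper name runAndCapture is made up for this illustration, and the example assumes an echo program is available on the PATH. A real test would also have to build the binary under test first.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"os/exec"
)

// runAndCapture (a hypothetical helper) runs a program and returns its
// standard output and exit status.
func runAndCapture(name string, args ...string) (string, int, error) {
	var stdout bytes.Buffer
	cmd := exec.Command(name, args...)
	cmd.Stdout = &stdout
	err := cmd.Run()
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		// The program ran, but reported failure via its exit status.
		return stdout.String(), exitErr.ExitCode(), nil
	}
	if err != nil {
		// The program could not be run at all.
		return "", -1, err
	}
	return stdout.String(), 0, nil
}

func main() {
	out, status, err := runAndCapture("echo", "hello world")
	if err != nil {
		fmt.Println("could not run program:", err)
		return
	}
	fmt.Printf("exit status %d, output %q\n", status, out)
}
```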
That sounds like hard work, so let’s not do that. Instead, let’s flip the script.
This is the first of a series of five tutorials on test scripts in Go:
- Test scripts in Go
- Testing CLI tools in Go
- Files in test scripts
- Conditions and concurrency
- Standalone test scripts
Introducing testscript
testscript defines a little “language” for writing test scripts, which we store as separate files in our project, along with our Go source code.
Let’s see a simple example of such a script:
exec echo 'hello world'
stdout 'hello world\n'
Pretty straightforward! You can probably guess what this does right away. The exec line runs echo 'hello world', just as though we’d typed this command line at the terminal.
On the next line, stdout asserts something about the expected output of the program we just ran: that it contains at least the string hello world, followed by a newline.
That’s all very well, but what does it have to do with Go? How, in other words, can we actually run such a script from a Go test, and connect its result with that of the test? testscript takes care of this too.
Let’s take our simple hello world script, then, and write a Go test that runs it:
package script_test

import (
	"testing"

	"github.com/rogpeppe/go-internal/testscript"
)

func Test(t *testing.T) {
	testscript.Run(t, testscript.Params{
		Dir: "testdata/script",
	})
}
There’s nothing particularly special about the function Test, by the way, and its name doesn’t signify anything; it’s just an ordinary Go test function, so far. The interesting part for us here is the call to testscript.Run. What’s that about?
You’ll notice that we pass our test’s t as an argument, suggesting that testscript.Run could use it to fail the test. That makes sense, since Run’s job is to run a bunch of test scripts, each of which acts as a (parallel) subtest of Test.
The other thing we pass to testscript.Run is a Params struct which, as you probably guessed, supplies configuration parameters. The Dir parameter gives the path to the directory where our script files will live.
In our example, we’ve set it to testdata/script, though again the script here has no special meaning. We can put test scripts in any convenient subfolder of testdata.
Let’s go ahead and create the testdata/script folder, and add a file in it named hello.txtar containing the text of our little example script:
exec echo 'hello world'
stdout 'hello world\n'
We’ll see what the txtar extension means shortly, but first, let’s give this a try.
Provided that we’ve set everything up correctly, running our Go tests now will tell testscript to run this script, and the parent test will pass or fail depending on its result. Let’s see what happens:
go test
PASS
Encouraging!
Running programs with exec
So what does the test script actually do? To put it another way, what are we really testing here?
Here’s the first line again:
exec echo 'hello world'
This tells testscript to execute the program echo, with the single command-line argument hello world. The effect of this is just as though the “user” had typed echo 'hello world' at the terminal.
This exec statement, though, is more than just a way of running programs; it’s also an assertion. The program we execute must succeed: that is, its exit status must be zero. Otherwise, the exec assertion fails.
If this, or any other assertion in the script, fails, then the script terminates at that point. In other words, it acts like t.Fatal in a Go test. It marks the subtest as failed and exits, skipping the rest of the script.
So the simplest way we can test our programs using testscript is to use exec to run them directly from a test script, and see if they succeed. Each script runs in its own unique temporary work directory, which will be automatically cleaned up after the script exits.
The work directory starts empty, which helps to make tests reproducible. If the program panics when some file or directory it relies on is not present, for example, we’d detect that as soon as we tried to run it from a script.
This is the kind of problem that’s typically very difficult to catch in ordinary tests, because it only happens when you actually run the program.
What’s true of every bug found in the field? It passed all the tests!
—Rich Hickey, “Simple Made Easy”
So test scripts go a long way to closing that “works on my laptop” loophole. If we could only run programs with certain arguments and assert their exit status, that would still be useful in itself, but we can do more.
Here’s the second line of the script again:
stdout 'hello world\n'
This is another assertion, this time about the contents of the standard output from the previous exec statement (we’ll see how to match standard error shortly).
stdout asserts that the output will match some regular expression, in this case the string hello world followed by a newline.
The program can output more than that, but this script asserts that it produces at least hello world\n somewhere among its output. If it doesn’t, the stdout assertion will fail, and this subtest will fail as a result.
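The “at least, somewhere among its output” behaviour is what we get from an unanchored regular expression match. Here is a minimal sketch using Go’s regexp package. The function name outputMatches is invented for this example, and compiling the pattern in multi-line mode is an assumption about how such a tool might treat it, not a statement about testscript’s internals.

```go
package main

import (
	"fmt"
	"regexp"
)

// outputMatches (a hypothetical helper) reports whether pattern matches
// anywhere in output. The "(?m)" prefix enables multi-line mode; that
// choice is an assumption made for this sketch.
func outputMatches(pattern, output string) (bool, error) {
	re, err := regexp.Compile("(?m)" + pattern)
	if err != nil {
		return false, err
	}
	return re.MatchString(output), nil
}

func main() {
	output := "some preamble\nhello world\nsome trailer\n"
	for _, pattern := range []string{`hello world\n`, `goodbye world\n`} {
		ok, err := outputMatches(pattern, output)
		if err != nil {
			fmt.Println("bad pattern:", err)
			continue
		}
		fmt.Printf("pattern %q matches: %t\n", pattern, ok)
	}
}
```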
We can see what a failed assertion looks like by changing the stdout assertion to one that definitely won’t succeed:
exec echo 'hello world'
stdout 'goodbye world\n'
We think this should fail the test, so let’s see what happens.
Interpreting testscript output
We deliberately set up this test script to make it fail, and here’s the result:
go test
--- FAIL: Test (0.00s)
    --- FAIL: Test/hello (0.00s)
        testscript.go:397:
            > exec echo 'hello world'
            [stdout]
            hello world
            > stdout 'goodbye world\n'
            FAIL: testdata/script/hello.txtar:2: no match for
            `goodbye world\n` found in stdout
There are a few interesting things about this output to note. First, the failing parent test is Test, and the name of the failing subtest represented by the script is hello. That’s derived from the filename of the script, hello.txtar, but without the .txtar extension:
--- FAIL: Test/hello (0.00s)
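The naming rule is simple enough to sketch in a few lines of Go. The function subtestName is hypothetical and is not part of testscript; it only mirrors the rule described here: take the script’s base filename, drop the extension, and join it to the parent test’s name.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// subtestName (a hypothetical helper) derives the full subtest name for
// a script file run from the given parent test.
func subtestName(parent, scriptPath string) string {
	base := filepath.Base(scriptPath)
	name := strings.TrimSuffix(base, filepath.Ext(base))
	return parent + "/" + name
}

func main() {
	fmt.Println(subtestName("Test", "testdata/script/hello.txtar"))
}
```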
Each line of the script is shown in the test output as it’s executed, prefixed by a > character:
> exec echo 'hello world'
The standard output of the program run by exec is shown next in the test output:
[stdout]
hello world
And here’s the line that actually triggered the test failure:
> stdout 'goodbye world\n'
FAIL: testdata/script/hello.txtar:2: no match for
`goodbye world\n` found in stdout
The stdout assertion is failing, because the program’s output didn’t match the given expression, so the test fails.
Notice that as well as printing the failing line itself, the message also gives the script filename and line number where we can find it:
testdata/script/hello.txtar:2
Remember, all of this is happening because we invoked the script in our Go test named Test:
func Test(t *testing.T) {
	testscript.Run(t, testscript.Params{
		Dir: "testdata/script",
	})
}
It’s important to note that the Test function, even though it’s a Go test, doesn’t test anything directly by itself.
Its job here is merely to invoke testscript.Run, delegating the actual testing to the scripts stored in testdata/script. At the moment we have only one script file in that directory, but if we were to add more, they would all be run as parallel subtests of Test.
How should we organise our test scripts, then? We could put all the scripts for the whole project in a single directory, and run them from a single test using testscript.Run. That’s fine, but if there are many scripts, you may prefer to put them in separate directories and run them from distinct tests. Most projects, though, seem to get by just fine with a single directory of test scripts.
By the way, if we want to run just one specific script, we can do that too. Because each script is a subtest, we can run it in the same way as we would any individual test or subtest: by giving its name along with the -run flag to go test:
go test -run Test/hello
It’s a good idea to keep each script fairly small and focused on one or two related behaviours. For example, you might have one script for checking the various kinds of “invalid input” behaviour, and another for the happy path. A script that does too much is difficult to read and understand, just like a test that does too much.
The testscript language
Test scripts are not shell scripts, though they look quite similar (we’ll talk about some of the important differences later in this tutorial). Neither are they Go code. Instead, they’re written in testscript’s own domain-specific language (DSL), dedicated purely to running programs and asserting things about their behaviour.
The point of the testscript DSL is not to be a general-purpose programming language, or even to replace all tests written in Go. Like all the best tools, it does one thing well: it provides an elegant notation for writing automated tests of command-line programs.
Its keywords include the ones we’ve already seen, exec and stdout, plus a few others that we’ll talk about later in this series. And its syntax is about as simple as it could possibly be, but no simpler.
A good way to think about the testscript DSL is as a kind of restricted, single-purpose version of the Unix shell. Restricted, but by no means restrictive: we can do some pretty sophisticated things with it, as we’ll see in the rest of this series.
A really good language should be both clean and dirty: cleanly designed, with a small core of well understood and highly orthogonal operators, but dirty in the sense that it lets hackers have their way with it.
—Paul Graham, “Hackers & Painters: Big Ideas from the Computer Age”
Negating assertions with the ! prefix
We’ve seen that we can use stdout to assert that a program’s standard output matches some regular expression. What else can we assert in a script?
Interestingly, we can also assert something that the program’s output mustn’t match, by prefixing the stdout line with a ! character:
exec echo 'hello world'
! stdout 'hello world\n'
The ! is the equivalent of Go’s “not” operator, so we can think of it as meaning “not” here, too. We’re asserting that the standard output of the program run by the previous exec should not match the expression hello world\n.
What’s the result of running this script? Well, the exec command line certainly does produce this output, so we’d expect our ! stdout assertion to fail. Indeed, that’s the case:
> ! stdout 'hello world\n'
FAIL: testdata/script/hello.txtar:2: unexpected match for
`hello world\n` found in stdout: hello world
We can also use the ! operator to negate other assertions: for example, exec.
As we saw earlier, an exec by itself asserts that the given program succeeds: that is, returns zero exit status. What would ! exec assert, then?
As you might expect, the effect of ! exec is to assert that the program fails: in other words, that its exit status is not zero.
Let’s try using it with the cat program, which among other things will print the contents of any file it’s given.
If we try to use cat to print some file that doesn’t exist in the work directory, then, it should return a non-zero exit status. We can write that as an assertion using ! exec:
! exec cat doesntexist
Why would we want to assert that running a certain command fails? Well, that’s the expected behaviour of a command-line tool when the user specifies some invalid flag or argument, for instance. It’s conventional to print an error message and exit with a non-zero exit status in this case.
For example, the public API of the cat program says that it should fail when given a non-existent file as argument. Our ! exec example asserts exactly that behaviour.
What about the error message behaviour? Could we test that too? Conventionally, programs print error messages on the standard error stream, rather than standard output. In practice, both streams usually go to the same place (the user’s terminal, for example), but they’re independent in principle, and we can test them independently.
We’re already familiar with stdout, so it’s not surprising that the corresponding assertion for matching standard error is named stderr:
! exec cat doesntexist
stderr 'cat: doesntexist: No such file or directory'
This asserts, not only that the given cat command fails, but also that it produces the expected message on its standard error stream.
In fact, we can go even further. Because, in this case, we don’t expect cat to print anything to its standard output at all, we can assert that too, using ! stdout:
! exec cat doesntexist
stderr 'cat: doesntexist: No such file or directory'
! stdout .
Recall that the argument to stdout or stderr is a regular expression. Since the regular expression . matches any single character (other than a newline), the effect of the third line here is to assert that the program produces no output on stdout.
The combined effect of these three lines, then, is to assert that:
- cat doesntexist fails
- it prints the expected message to stderr, and
- it prints nothing to stdout
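For comparison, here is roughly what checking those three things by hand might look like in Go, capturing the two streams independently. It’s a sketch under assumptions: the result type and run function are invented for this example, and it assumes a cat program is on the PATH and that no file named doesntexist is present in the current directory.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"os/exec"
	"regexp"
)

// result holds what we observed from one program run (a hypothetical
// type for this sketch).
type result struct {
	stdout string
	stderr string
	failed bool // true if the exit status was non-zero
}

// run executes a program, capturing standard output and standard error
// in separate buffers.
func run(name string, args ...string) (result, error) {
	var stdout, stderr bytes.Buffer
	cmd := exec.Command(name, args...)
	cmd.Stdout = &stdout
	cmd.Stderr = &stderr
	err := cmd.Run()
	var exitErr *exec.ExitError
	if err != nil && !errors.As(err, &exitErr) {
		// The program could not be run at all.
		return result{}, err
	}
	return result{
		stdout: stdout.String(),
		stderr: stderr.String(),
		failed: err != nil,
	}, nil
}

func main() {
	res, err := run("cat", "doesntexist")
	if err != nil {
		fmt.Println("could not run cat:", err)
		return
	}
	anyChar := regexp.MustCompile(`.`)
	fmt.Println("command failed:", res.failed)
	fmt.Println("stderr has output:", anyChar.MatchString(res.stderr))
	fmt.Println("stdout has output:", anyChar.MatchString(res.stdout))
}
```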
What if we want to test that a program invoked with valid arguments prints standard output, but no standard error? In that case, we can write, for example:
exec echo success
stdout 'success'
! stderr .
This script asserts that:
- echo success succeeds
- it prints the expected message to stdout, and
- it prints nothing to stderr
Passing arguments to programs
As we’ve seen in many of the examples so far, we can pass arguments to the program run by exec, by putting them after the program name:
# Execute 'echo', with the argument 'success'
exec echo success
It’s important to know exactly how these arguments are passed to the program, though. Each space-separated word is treated as a distinct argument, unless those words are grouped together inside single quotes. This can make a big difference to how the program interprets them.
For example, suppose we want to give a program some argument that contains spaces, such as the filename data file.txt. This wouldn’t work:
exec cat data file.txt
Because data and file.txt are passed as distinct arguments to cat, it thinks we’re referring to two distinct files. Instead, we need to quote the filename, with single quotes:
exec cat 'data file.txt'
And, since the single quote character has this special effect of grouping space-separated arguments, you might be wondering how we can write a literal single quote character when we need to.
To do that, we write two consecutive single quote characters, as in this example:
exec echo 'Here''s how to escape single quotes'
This prints:
Here's how to escape single quotes
Importantly, and unlike in shell scripts, double quote characters have no quoting effect in test scripts. They’re simply treated as literal double quote characters:
exec echo "This will print with literal double quotes"
Each space-separated word, double quotes included, is passed on to the program as a separate argument. This gives the following output, including the double quotes:
"This will print with literal double quotes"
Watch out for this, as it’s easy to accidentally use double quotes when you meant to use single quotes.
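To make the quoting rules concrete, here is a small Go sketch of a word splitter that follows them as described above: spaces separate arguments, single quotes group words, two consecutive single quotes inside a quoted section stand for one literal quote, and double quotes are ordinary characters. The function splitArgs is written for this illustration and is not testscript’s actual parser; it ignores other features of the script language, such as comments.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// splitArgs (a hypothetical helper) splits a script line into arguments
// using the quoting rules described in the text.
func splitArgs(line string) ([]string, error) {
	var (
		args    []string
		cur     strings.Builder
		inWord  bool // an argument has been started
		inQuote bool // we are inside single quotes
	)
	runes := []rune(line)
	for i := 0; i < len(runes); i++ {
		r := runes[i]
		switch {
		case r == '\'' && inQuote && i+1 < len(runes) && runes[i+1] == '\'':
			// A doubled single quote inside quotes is one literal quote.
			cur.WriteRune('\'')
			i++
		case r == '\'':
			// Open or close a quoted section.
			inQuote = !inQuote
			inWord = true
		case (r == ' ' || r == '\t') && !inQuote:
			// Unquoted whitespace ends the current argument, if any.
			if inWord {
				args = append(args, cur.String())
				cur.Reset()
				inWord = false
			}
		default:
			// Everything else, including double quotes, is literal.
			cur.WriteRune(r)
			inWord = true
		}
	}
	if inQuote {
		return nil, errors.New("unterminated single quote")
	}
	if inWord {
		args = append(args, cur.String())
	}
	return args, nil
}

func main() {
	lines := []string{
		`exec cat data file.txt`,
		`exec cat 'data file.txt'`,
		`exec echo 'Here''s how to escape single quotes'`,
		`exec echo "double quotes"`,
	}
	for _, line := range lines {
		args, err := splitArgs(line)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%d arguments: %q\n", len(args), args)
	}
}
```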
Next: Testing CLI tools in Go