Flipping the script: testing Go binaries

Testing command-line tools in Go is easy when you flip the script.

By relieving the brain of all unnecessary work, a good notation sets it free to concentrate on more advanced problems.

—Alfred North Whitehead, “An Introduction to Mathematics”

Wouldn’t it be great if we could write Go CLI tests almost like shell scripts?

Testing CLI tools in Go by running the binary directly can get complicated: our test would need to execute the Go tool to build the binary, then execute the binary, passing it arbitrary arguments, and then we’d have to check the exit status, parse the output, and so on.

That sounds like hard work, so let’s not do that. Instead, let’s flip the script.

This is the first of a series of five tutorials on test scripts in Go:

  1. Test scripts in Go
  2. Testing CLI tools in Go
  3. Files in test scripts
  4. Conditions and concurrency
  5. Standalone test scripts

Introducing testscript

testscript defines a little “language” for writing test scripts, which we store as separate files in our project, along with our Go source code.

Let’s see a simple example of such a script:

exec echo 'hello world'
stdout 'hello world\n'

(Listing script/1)

Pretty straightforward! You can probably guess what this does right away. The exec line runs echo 'hello world', just as though we’d typed this command line at the terminal.

On the next line, stdout asserts something about the expected output of the program we just ran: that it contains at least the string hello world, followed by a newline.

That’s all very well, but what does it have to do with Go? How, in other words, can we actually run such a script from a Go test, and connect its result with that of the test? testscript takes care of this too.

Let’s take our simple hello world script, then, and write a Go test that runs it:

package script_test

import (
    "testing"

    "github.com/rogpeppe/go-internal/testscript"
)

func Test(t *testing.T) {
    testscript.Run(t, testscript.Params{
        Dir: "testdata/script",
    })
}

(Listing script/1)

There’s nothing particularly special about the function Test, by the way, and its name doesn’t signify anything; it’s just an ordinary Go test function, so far. The interesting part for us here is the call to testscript.Run. What’s that about?

You’ll notice that we pass our test’s t as an argument, suggesting that testscript.Run could use it to fail the test. That makes sense, since Run’s job is to run a bunch of test scripts, each of which acts as a (parallel) subtest of Test.

The other thing we pass to testscript.Run is a Params struct which, as you probably guessed, supplies configuration parameters. The Dir parameter gives the path to the directory where our script files will live.

In our example, we’ve set it to testdata/script, though again the script here has no special meaning. We can put test scripts in any convenient subfolder of testdata.

Let’s go ahead and create the testdata/script folder, and add a file in it named hello.txtar containing the text of our little example script:

exec echo 'hello world'
stdout 'hello world\n'

(Listing script/1)

We’ll see what the txtar extension means shortly, but first, let’s give this a try.

Provided that we’ve set everything up correctly, running our Go tests now will tell testscript to run this script, and the parent test will pass or fail depending on its result. Let’s see what happens:

go test

PASS

Encouraging!

Running programs with exec

So what does the test script actually do? To put it another way, what are we really testing here?

Here’s the first line again:

exec echo 'hello world'

(Listing script/1)

This tells testscript to execute the program echo, with the single command-line argument hello world. The effect of this is just as though the “user” had typed echo 'hello world' at the terminal.

This exec statement, though, is more than just a way of running programs; it’s also an assertion. The program we execute must succeed: that is, its exit status must be zero. Otherwise, the exec assertion fails.

If this, or any other assertion in the script, fails, then the script terminates at that point. In other words, it acts like t.Fatal in a Go test. It marks the subtest as failed and exits, skipping the rest of the script.
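To make the idea of exec as an assertion concrete, here’s a minimal Go sketch of roughly what checking a program’s exit status involves. It isn’t testscript’s actual implementation: the exitStatus helper is a hypothetical name invented for this illustration, and the example assumes a Unix-like system where sh is available.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// exitStatus is a hypothetical helper (not part of testscript). It runs
// the named program and reports its exit status. A non-nil error means
// the program could not be run at all.
func exitStatus(name string, args ...string) (int, error) {
	err := exec.Command(name, args...).Run()
	if err == nil {
		return 0, nil // the program succeeded
	}
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		// The program ran, but exited with a non-zero status.
		return exitErr.ExitCode(), nil
	}
	return -1, err
}

func main() {
	// Assumes 'sh' is on the PATH.
	status, err := exitStatus("sh", "-c", "exit 3")
	fmt.Println("status:", status, "error:", err)
}
```

In these terms, an exec line passes only when the status is zero, and anything else stops the script at that point.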

So the simplest way we can test our programs using testscript is to use exec to run them directly from a test script, and see if they succeed. Each script runs in its own unique temporary work directory, which will be automatically cleaned up after the script exits.

The work directory starts empty, which helps to make tests reproducible. If the program panics when some file or directory it relies on is not present, for example, we’d detect that as soon as we tried to run it from a script.

This is the kind of problem that’s typically very difficult to catch in ordinary tests, because it only happens when you actually run the program.
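If we wanted the same “fresh, empty work directory” behaviour in hand-written Go, it might look something like the following sketch. The names withWorkDir, countEntries, and exists are all hypothetical helpers made up for this example, and this is only an approximation of the idea, not a description of how testscript does it internally.

```go
package main

import (
	"fmt"
	"os"
)

// withWorkDir (hypothetical) creates a new, empty temporary directory,
// passes its path to fn, and removes the directory when fn returns.
func withWorkDir(fn func(dir string) error) error {
	dir, err := os.MkdirTemp("", "script-work-")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)
	return fn(dir)
}

// countEntries reports how many files or directories dir contains.
func countEntries(dir string) (int, error) {
	entries, err := os.ReadDir(dir)
	return len(entries), err
}

// exists reports whether path can be found on disk.
func exists(path string) bool {
	_, err := os.Stat(path)
	return err == nil
}

func main() {
	var workDir string
	err := withWorkDir(func(dir string) error {
		workDir = dir
		n, err := countEntries(dir)
		fmt.Println("entries at start:", n)
		return err
	})
	fmt.Println("error:", err)
	fmt.Println("still exists afterwards:", exists(workDir))
}
```

Anything the “script” needs has to be created inside that directory, so a program that silently depends on some file from our own machine has nowhere to hide.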

What’s true of every bug found in the field? It passed all the tests!

—Rich Hickey, “Simple Made Easy”

So test scripts go a long way to closing that “works on my laptop” loophole. If we could only run programs with certain arguments and assert their exit status, that would still be useful in itself, but we can do more.

Here’s the second line of the script again:

stdout 'hello world\n'

(Listing script/1)

This is another assertion, this time about the contents of the standard output from the previous exec statement (we’ll see how to match standard error shortly).

stdout asserts that the output will match some regular expression, in this case the string hello world followed by a newline.

The program can output more than that, but this script asserts that it produces at least hello world\n somewhere among its output. If it doesn’t, the stdout assertion will fail, and this subtest will fail as a result.
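The “somewhere among its output” part is just how unanchored regular expressions behave. Here’s a small sketch using Go’s standard regexp package to show the same kind of match; outputMatches is a hypothetical function for illustration, and we’re only assuming that testscript’s matching works along similar lines.

```go
package main

import (
	"fmt"
	"regexp"
)

// outputMatches (hypothetical) reports whether pattern matches anywhere
// in output. The pattern is not anchored, so extra text before or after
// the match is allowed.
func outputMatches(pattern, output string) (bool, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return false, err
	}
	return re.MatchString(output), nil
}

func main() {
	// Raw strings keep the backslash, so the regexp sees \n as "newline".
	ok, err := outputMatches(`hello world\n`, "well, hello world\nand more\n")
	fmt.Println(ok, err)

	ok, err = outputMatches(`goodbye world\n`, "hello world\n")
	fmt.Println(ok, err)
}
```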

We can see what that looks like by changing the stdout assertion to one that definitely won’t succeed:

exec echo 'hello world'
stdout 'goodbye world\n'

We think this should fail the test, so let’s see what happens.

Interpreting testscript output

We deliberately set up this test script to make it fail, and here’s the result:

go test

--- FAIL: Test (0.00s)
    --- FAIL: Test/hello (0.00s)
        testscript.go:397:
            > exec echo 'hello world'
            [stdout]
            hello world
            > stdout 'goodbye world\n'
            FAIL: testdata/script/hello.txtar:2: no match for
            `goodbye world\n` found in stdout

There are a few interesting things about this output to note. First, the failing parent test is Test, and the name of the failing subtest represented by the script is hello. That’s derived from the filename of the script, hello.txtar, but without the .txtar extension:

--- FAIL: Test/hello (0.00s)
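The naming rule is simple enough to sketch in a few lines of Go. Here, subtestName is a hypothetical helper written for this example (it isn’t a testscript function); it just strips the directory and extension from the script path and joins the result to the parent test’s name.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// subtestName (hypothetical) derives the full subtest name for a script:
// the parent test name, a slash, and the script's filename without its
// extension.
func subtestName(parent, scriptPath string) string {
	base := filepath.Base(scriptPath)
	name := strings.TrimSuffix(base, filepath.Ext(base))
	return parent + "/" + name
}

func main() {
	fmt.Println(subtestName("Test", "testdata/script/hello.txtar"))
	// prints "Test/hello"
}
```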

Each line of the script is shown in the test output as it’s executed, prefixed by a > character:

> exec echo 'hello world'

The standard output of the program run by exec is shown next in the test output:

[stdout]
hello world

And here’s the line that actually triggered the test failure:

> stdout 'goodbye world\n'
FAIL: testdata/script/hello.txtar:2: no match for
`goodbye world\n` found in stdout

The stdout assertion is failing, because the program’s output didn’t match the given expression, so the test fails. Notice that as well as printing the failing line itself, the message also gives the script filename and line number where we can find it:

testdata/script/hello.txtar:2

Remember, all of this is happening because we invoked the script in our Go test named Test:

func Test(t *testing.T) {
    testscript.Run(t, testscript.Params{
        Dir: "testdata/script",
    })
}

(Listing script/1)

It’s important to note that the Test function, even though it’s a Go test, doesn’t test anything directly by itself.

Its job here is merely to invoke testscript.Run, delegating the actual testing to the scripts stored in testdata/script. At the moment we have only one script file in that directory, but if we were to add more, they would all be run as parallel subtests of Test.

How should we organise our test scripts, then? We could put all the scripts for the whole project in a single directory, and run them from a single test using testscript.Run. That’s fine, but if there are many scripts, you may prefer to put them in separate directories and run them from distinct tests. Most projects, though, seem to get by just fine with a single directory of test scripts.

By the way, if we want to run just one specific script, we can do that too. Because each script is a subtest, we can run it in the same way as we would any individual test or subtest: by giving its name along with the -run flag to go test:

go test -run Test/hello

It’s a good idea to keep each script fairly small and focused on one or two related behaviours. For example, you might have one script for checking the various kinds of “invalid input” behaviour, and another for the happy path. A script that does too much is difficult to read and understand, just like a test that does too much.

The testscript language

Test scripts are not shell scripts, though they look quite similar (we’ll talk about some of the important differences later in this series). Neither are they Go code. Instead, they’re written in testscript’s own domain-specific language (DSL), dedicated purely to running programs and asserting things about their behaviour.

The point of the testscript DSL is not to be a general-purpose programming language, or even to replace all tests written in Go. Like all the best tools, it does one thing well: it provides an elegant notation for writing automated tests of command-line programs.

Its keywords include the ones we’ve already seen, exec and stdout, plus a few others that we’ll talk about later in this series. And its syntax is about as simple as it could possibly be, but no simpler.

A good way to think about the testscript DSL is as a kind of restricted, single-purpose version of the Unix shell. Restricted, but by no means restrictive: we can do some pretty sophisticated things with it, as we’ll see in the rest of this series.

A really good language should be both clean and dirty: cleanly designed, with a small core of well understood and highly orthogonal operators, but dirty in the sense that it lets hackers have their way with it.

—Paul Graham, “Hackers & Painters: Big Ideas from the Computer Age”

Negating assertions with the ! prefix

We’ve seen that we can use stdout to assert that a program’s standard output matches some regular expression. What else can we assert in a script?

Interestingly, we can also assert something that the program’s output mustn’t match, by prefixing the stdout line with a ! character.

exec echo 'hello world'
! stdout 'hello world\n'

The ! is the equivalent of Go’s “not” operator, so we can think of it as meaning “not” here, too. We’re asserting that the standard output of the program run by the previous exec should not match the expression hello world\n.

What’s the result of running this script? Well, the exec command line certainly does produce this output, so we’d expect our ! stdout assertion to fail. Indeed, that’s the case:

> ! stdout 'hello world\n'
FAIL: testdata/script/hello.txtar:2: unexpected match for
`hello world\n` found in stdout: hello world
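One way to picture what the ! prefix does is as a flag that flips the outcome of the match. The following Go sketch shows that logic; checkOutput is a hypothetical function, and its error messages are only loosely modelled on the ones shown above, not copied from testscript’s source.

```go
package main

import (
	"fmt"
	"regexp"
)

// checkOutput (hypothetical) returns nil if the assertion holds.
// With negate false, the pattern must match the output.
// With negate true, the pattern must NOT match the output.
func checkOutput(negate bool, pattern, output string) error {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return err
	}
	matched := re.MatchString(output)
	switch {
	case matched && negate:
		return fmt.Errorf("unexpected match for `%s` found in output", pattern)
	case !matched && !negate:
		return fmt.Errorf("no match for `%s` found in output", pattern)
	}
	return nil
}

func main() {
	output := "hello world\n"
	// Like: stdout 'hello world\n'
	fmt.Println(checkOutput(false, `hello world\n`, output))
	// Like: ! stdout 'hello world\n'
	fmt.Println(checkOutput(true, `hello world\n`, output))
}
```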

We can also use the ! operator to negate other assertions: for example, exec.

As we saw earlier, an exec by itself asserts that the given program succeeds: that is, returns zero exit status. What would ! exec assert, then?

As you might expect, the effect of ! exec is to assert that the program fails: in other words, that its exit status is not zero.

Let’s try using it with the cat program, which among other things will print the contents of any file it’s given.

If we try to use cat to print some file that doesn’t exist in the work directory, then, it should return a non-zero exit status. We can write that as an assertion using ! exec:

! exec cat doesntexist

(Listing script/1)

Why would we want to assert that running a certain command fails? Well, that’s the expected behaviour of a command-line tool when the user specifies some invalid flag or argument, for instance. It’s conventional to print an error message and exit with a non-zero exit status in this case.

For example, the public API of the cat program says that it should fail when given a non-existent file as argument. Our ! exec example asserts exactly that behaviour.

What about the error message behaviour? Could we test that too? Conventionally, programs print error messages on the standard error stream, rather than standard output. In practice, both streams usually go to the same place (the user’s terminal, for example), but they’re independent in principle, and we can test them independently.

We’re already familiar with stdout, so it’s not surprising that the corresponding assertion for matching standard error is named stderr:

! exec cat doesntexist
stderr 'cat: doesntexist: No such file or directory'

(Listing script/1)

This asserts not only that the given cat command fails, but also that it produces the expected message on its standard error stream.

In fact, we can go even further. Because, in this case, we don’t expect cat to print anything to its standard output at all, we can assert that too, using ! stdout:

! exec cat doesntexist
stderr 'cat: doesntexist: No such file or directory'
! stdout .

(Listing script/1)

Recall that the argument to stdout or stderr is a regular expression. Since the regular expression . matches any single character (other than a newline), the third line here in effect asserts that the program prints nothing to stdout.
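We can check how the . pattern behaves using Go’s own regexp package. This is a sketch of the idea rather than testscript’s code, and producesOutput is a hypothetical name. One detail worth knowing: in Go’s regexp syntax, . doesn’t match a newline by default, so output consisting only of blank lines wouldn’t count as a match here.

```go
package main

import (
	"fmt"
	"regexp"
)

// anyChar matches any single character except a newline.
var anyChar = regexp.MustCompile(`.`)

// producesOutput (hypothetical) reports whether s contains at least one
// character that the pattern '.' can match.
func producesOutput(s string) bool {
	return anyChar.MatchString(s)
}

func main() {
	fmt.Println(producesOutput(""))        // false: nothing to match
	fmt.Println(producesOutput("hello\n")) // true
	fmt.Println(producesOutput("\n"))      // false: '.' skips newlines
}
```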

The combined effect of these three lines, then, is to assert that:

  • cat doesntexist fails
  • it prints the expected message to stderr, and
  • it prints nothing to stdout
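For comparison, here’s roughly what checking those same three things might involve if we wrote it by hand in Go, capturing the two output streams separately. This is a sketch under some assumptions: result and run are hypothetical names invented for the example, and it expects a Unix-like system with cat available.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"os/exec"
)

// result (hypothetical) holds what we want to assert about a program run.
type result struct {
	stdout, stderr string
	status         int
}

// run (hypothetical) executes the program, capturing standard output and
// standard error in separate buffers, and records the exit status.
func run(name string, args ...string) (result, error) {
	var out, errOut bytes.Buffer
	cmd := exec.Command(name, args...)
	cmd.Stdout = &out
	cmd.Stderr = &errOut
	err := cmd.Run()
	res := result{stdout: out.String(), stderr: errOut.String()}
	var exitErr *exec.ExitError
	switch {
	case err == nil:
		// Exit status zero: nothing more to record.
	case errors.As(err, &exitErr):
		res.status = exitErr.ExitCode()
	default:
		return res, err // the program could not be started
	}
	return res, nil
}

func main() {
	res, err := run("cat", "doesntexist")
	if err != nil {
		fmt.Println("could not run cat:", err)
		return
	}
	fmt.Println("failed:", res.status != 0)
	fmt.Printf("stderr: %q\n", res.stderr)
	fmt.Println("stdout empty:", res.stdout == "")
}
```

That’s a fair amount of plumbing for what the test script expresses in three short lines.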

What if we want to test that a program invoked with valid arguments prints standard output, but no standard error? In that case, we can write, for example:

exec echo success
stdout 'success'
! stderr .

(Listing script/1)

This script asserts that:

  • echo success succeeds
  • it prints the expected message to stdout, and
  • it prints nothing to stderr

Passing arguments to programs

As we’ve seen in many of the examples so far, we can pass arguments to the program run by exec, by putting them after the program name:

# Execute 'echo', with the argument 'success'
exec echo success

It’s important to know exactly how these arguments are passed to the program, though. Each space-separated word is treated as a distinct argument, unless those words are grouped together inside single quotes. This can make a big difference to how the program interprets them.

For example, suppose we want to give a program some argument that contains spaces, such as the filename data file.txt. This wouldn’t work:

exec cat data file.txt

Because data and file.txt are passed as distinct arguments to cat, it thinks we’re referring to two distinct files. Instead, we need to quote the filename, with single quotes:

exec cat 'data file.txt'

And, since the single quote character has this special effect of grouping space-separated arguments, you might be wondering how we can write a literal single quote character when we need to.

To do that, we write two consecutive single quote characters, as in this example:

exec echo 'Here''s how to escape single quotes'

This prints:

Here's how to escape single quotes

Importantly, and unlike in shell scripts, double quote characters have no quoting effect in test scripts. They’re simply treated as literal double quote characters:

exec echo "This will print with literal double quotes"

This gives the following output, including the double quotes:

"This will print with literal double quotes"

Watch out for this, as it’s easy to accidentally use double quotes when you meant to use single quotes.
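To see how these quoting rules fit together, here’s a simplified tokenizer in Go that follows the rules described in this section: words are separated by spaces, single quotes group words, two single quotes inside a quoted span give one literal quote, and double quotes are nothing special. The splitArgs function is hypothetical and deliberately minimal; it’s a sketch of the rules as described here, not testscript’s real parser, which handles more than this.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// splitArgs (hypothetical) splits a script line into arguments using the
// single-quote rules described above.
func splitArgs(line string) ([]string, error) {
	var args []string
	var cur strings.Builder
	inWord := false // true once the current argument has started
	quoted := false // true while inside single quotes
	runes := []rune(line)
	for i := 0; i < len(runes); i++ {
		r := runes[i]
		switch {
		case r == '\'':
			if quoted && i+1 < len(runes) && runes[i+1] == '\'' {
				// Two consecutive quotes inside a quoted span
				// stand for one literal quote character.
				cur.WriteRune('\'')
				i++
				continue
			}
			quoted = !quoted
			inWord = true
		case !quoted && (r == ' ' || r == '\t'):
			// Unquoted whitespace ends the current argument.
			if inWord {
				args = append(args, cur.String())
				cur.Reset()
				inWord = false
			}
		default:
			// Everything else, including ", is taken literally.
			cur.WriteRune(r)
			inWord = true
		}
	}
	if quoted {
		return nil, errors.New("unterminated single quote")
	}
	if inWord {
		args = append(args, cur.String())
	}
	return args, nil
}

func main() {
	for _, line := range []string{
		`exec cat data file.txt`,
		`exec cat 'data file.txt'`,
		`exec echo 'Here''s how to escape single quotes'`,
		`exec echo "double quotes"`,
	} {
		args, err := splitArgs(line)
		fmt.Printf("%q %v\n", args, err)
	}
}
```

Notice that the double-quoted example ends up as two separate arguments, each carrying one of the literal double quote characters.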

Next: Testing CLI tools in Go
