barn cover: Testing 1 2 3

I have actually been doing stuff since my last post. The life I alluded to last time got in the way again at least for a while, but now I'm safely back to not having one, so it's cool.

Meanwhile I've been doing something thoroughly inane: testing my code. I shouldn't be calling it inane like that, seeing as how I test software for a living. Or rather, as I'm usually careful to point out, I write software that tests other software for a living. I like to think there's a difference there.

But whether or not I should call it inane, it kind of is. I think I've stumbled upon a maxim of some kind: a developer is less inclined to test his own code than someone else's. At least in my case, I'm not interested in testing my own code because I already know it, so in a way it's like working on the same problem a second time without doing anything new and cool.

On the other hand, testing someone else's code is a challenge - to find what's wrong with it. As I was telling our summer intern at work the other day, when he asked me about code reviews, I review under the assumptions that the code I'm reviewing is crap and that anything the author has done is either unnecessary or can be easily achieved by some better means.

Nevertheless, I think this is a great differentiator among developers. Ones who are good at testing their own code are valuable. It's one of the steps towards being a consistent, thorough coder, as opposed to a rockstar, cowboy, or hacker (as they are sometimes called at work).

So what did I do?

The rest of the post is going to be the usual codey garbage that permeates my other posts. I've written a fair bit of code at this point (about 10 kloc), and that is a fairly ridiculous amount for one person to test. And there are different classes of tests needed, ranging from unit tests to scenario-type tests to other more difficult types.

Before I could really even consider testing, I had to see whether my code was even testable. At work, much of our product is a black box, so testing it has to be done purely by inputs and outputs. This is fine in many cases, but I was hoping to do better here. If the MUD is a black box, then every test consists of hooking up some kind of input generator that drives telnet to connect to a running instance of the MUD, send text in, and verify the text coming back. This is certainly doable and in fact should be one class of tests that I write, but it's really very tedious to do, and such tests break on unimportant changes like the wording of some command's output or some minor timing issue.

Turns out the code is structured in a way that I can write most of my tests at a much lower level, which allows instantiating objects on the MUD directly. I decided to start off with unit tests for some simple components, to work out the kinks with the test harness.

The test harness

Speaking of test harness, I chose Test::Unit, which seems well established and offers the right kind of functionality I'm looking for. I guess the functionality I want is to be able to add tests easily, group or categorize tests, and for the harness to largely stay out of my way.

Test::Unit, despite the name, seems like it works well enough for tests other than unit tests. It provides a way to declare startup and cleanup code (its setup and teardown methods), write test methods, and assert conditions. That's pretty much all of the basic needs of any test, far as I know. Everything else can be built on that. Really, the coolest thing I think Test::Unit brings is figuring out at runtime all of the tests that you've written and running them. All you have to do is require 'test/unit' and derive all of your test classes from Test::Unit::TestCase, and it figures it out.
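For anyone who hasn't seen it, here is roughly what that looks like. This is a minimal sketch, not code from the MUD: StackTest and what it checks are made up for illustration, and only the Test::Unit pieces (setup, teardown, test methods, assert_equal) are the real harness.

```ruby
require 'test/unit'

# Hypothetical example class; nothing here comes from the MUD code base.
class StackTest < Test::Unit::TestCase
  # Runs before every test method.
  def setup
    @stack = [1, 2, 3]
  end

  # Runs after every test method.
  def teardown
    @stack = nil
  end

  # Methods whose names start with "test" are discovered and run.
  def test_push_adds_to_end
    @stack.push(4)
    assert_equal([1, 2, 3, 4], @stack)
  end

  def test_pop_returns_last
    assert_equal(3, @stack.pop)
    assert_equal(2, @stack.size)
  end
end
```

Running the file with ruby is enough. Requiring 'test/unit' arranges for every discovered test to run when the script exits.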

The tests

I started out writing unit tests for functions that are self-contained. For instance, I wrote tests for argify and flattenHash (which takes a nested hash of hashes (of hashes, etc.) and flattens it into just one level). This was really easy, since each test is a straight verification that a given input produces the expected output, such as an input string turning into an output array for argify.
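A test of this kind might look like the sketch below. The flatten_hash function here is a hypothetical stand-in, not the real flattenHash: in particular, joining nested keys with "." is an assumption I made so the example has something concrete to verify.

```ruby
require 'test/unit'

# Hypothetical stand-in for flattenHash. Joining nested keys with "." is an
# assumption for this example, not necessarily what the MUD code does.
def flatten_hash(hash, prefix = nil)
  hash.each_with_object({}) do |(key, value), flat|
    full_key = prefix ? "#{prefix}.#{key}" : key.to_s
    if value.is_a?(Hash)
      # Recurse into nested hashes, carrying the key path along.
      flat.merge!(flatten_hash(value, full_key))
    else
      flat[full_key] = value
    end
  end
end

class FlattenHashTest < Test::Unit::TestCase
  def test_already_flat_hash_is_unchanged
    assert_equal({ "a" => 1 }, flatten_hash({ "a" => 1 }))
  end

  def test_nested_hashes_collapse_to_one_level
    nested = { "a" => { "b" => { "c" => 1 }, "d" => 2 }, "e" => 3 }
    expected = { "a.b.c" => 1, "a.d" => 2, "e" => 3 }
    assert_equal(expected, flatten_hash(nested))
  end
end
```

Each test is just a known input and the output it should produce, with no state to set up or tear down.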

I wrote some tests for the base evento system. I used mock objects to represent the source, target, and bystander of an evento and stored state in the mock objects to indicate what evento was received by each. Then I could verify that the eventos received by each were the right ones.
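The mock-object idea can be sketched like this. Everything named here (MockActor, dispatch_evento, the role symbols) is hypothetical and stands in for the real evento code, which isn't shown in this post. The point is only the pattern: each mock records what it received, and the test inspects that record afterward.

```ruby
require 'test/unit'

# Mock actor that only remembers which eventos it was handed.
class MockActor
  attr_reader :received

  def initialize
    @received = []
  end

  def receive(evento)
    @received << evento
  end
end

# Hypothetical dispatcher standing in for the real evento system: the source,
# the target, and every bystander each get their own view of the same action.
def dispatch_evento(action, source, target, bystanders)
  source.receive([action, :as_source])
  target.receive([action, :as_target])
  bystanders.each { |bystander| bystander.receive([action, :as_bystander]) }
end

class EventoDispatchTest < Test::Unit::TestCase
  def setup
    @source = MockActor.new
    @target = MockActor.new
    @bystander = MockActor.new
  end

  def test_each_role_receives_its_own_evento
    dispatch_evento(:wave, @source, @target, [@bystander])
    assert_equal([[:wave, :as_source]], @source.received)
    assert_equal([[:wave, :as_target]], @target.received)
    assert_equal([[:wave, :as_bystander]], @bystander.received)
  end
end
```

Because the mocks hold the state, the test needs no MUD instance at all.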

Then I wrote tests for dext. This was a bit more involved, because there is a bit of context required. There is a source and there is a target or bystander, and depending on the various directives in the dext or edext, different things would show up to different actors. I ended up doing this by loading an actual MUD instance and using real nipics to fulfill their roles in the dext and edext.

This consequently meant I needed to be able to define a "test world" populated by these test nipics. There was some refactoring required to make these pieces fall into place, but now I have the ability to test scenarios in an actual MUD instance, such as testing that player characters can be loaded and move around.

Test Data

Once I finished the character tests, I had two sets of tests that both require a running MUD instance. I had created a "test data" class to wrap various features of a MUD world into variables for convenient consumption. For instance, when the test data is created, it finds specific rooms, nipics, dealies, etc. in the world and stores them in variables. Later, tests can do something like place a character directly into some room without having to search the MUD for the room.

The possibility exists that I might want a few different worlds, and tests may use one or another of them, depending on their requirements. However, the MUD (for better or for worse) has a global presence in the form of a few variables, namely $mud. So to swap between different test data, I'd need to tear down the MUD and bring up the new one.

I'd rather not do this all the time, so I built a TDFactory class that manages instances of test data. At test startup, a test can request a set of test data by name. If the test data with that name has already been initialized, then no work needs to be done. Otherwise, the old one is brought down and the new one is created.
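The reuse-or-rebuild logic is simple enough to sketch. This is not the actual TDFactory: TestDataFactory, FakeTestData, and the request method are hypothetical names, and the fake test data just tracks whether it is alive instead of loading a MUD world.

```ruby
# Hypothetical test data. The real thing would load a MUD world and look up
# rooms and other fixtures; this one only tracks whether it is still alive.
class FakeTestData
  attr_reader :name

  def initialize(name)
    @name = name
    @live = true
  end

  def live?
    @live
  end

  def teardown
    @live = false
  end
end

# Keeps at most one test data alive, because the world it wraps is global.
class TestDataFactory
  def initialize(&builder)
    @builder = builder # called with a name to build a new test data
    @current = nil
  end

  # Returns the test data for name, reusing the live one if the name matches.
  def request(name)
    return @current if @current && @current.name == name

    @current.teardown if @current
    @current = @builder.call(name)
  end
end

factory = TestDataFactory.new { |name| FakeTestData.new(name) }
first = factory.request(:small_world)
again = factory.request(:small_world) # same name: reused, nothing rebuilt
other = factory.request(:big_world)   # different name: old one torn down
```

Tests that share a world pay the startup cost once, and the expensive swap only happens when a test asks for a different world.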

Next?

I'm about done writing tests for now. Though I don't really feel like writing them, I cannot deny that they afford me peace of mind when making changes. I'm confident now that if I do something wrong in dext or argify, my tests will catch it. I have a few ideas for what to work on next. We'll see what happens.


20090825
