20111008

Playing the Puppeteer at Testing

Hello, thoroughly fictional readers. It has been a while, but I have not been idle. Well, I've played a lot of video games and done a bunch of work, but I also worked on the MUD. Primarily, I've been working on combat, and I now have something roughly working that I'm not thoroughly displeased with. But this post isn't about combat, not really. It's about something far more fun and interesting than fighting baddies.

It's about testing. Yes, again (and again). And why shouldn't it be? I test software for a living, so this is supposed to be interesting to me.

The feature - behaviors

So I coded up these things unimaginatively called behaviors. Each one is a self-contained piece of state and functionality that can be attached to a mob (reminder: mobs are living entities, either player characters or NPCs, aka "nipics"). A behavior has access to all the eventos that its mob sees (reminder: eventos are just the events that are sent around when mobs do things, e.g. walk into a room, hit someone, or die). Behaviors get called on every pulse in the main loop of the MUD, so they can keep timers to do things. And they can basically be dropped into or pulled out of a mob without affecting any of its other, well, behavior.
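To make that description concrete, here is a minimal sketch of what such a plug-in behavior could look like. It is not the MUD's actual code: every class and method name below (SketchBehavior, SketchTimerBehavior, SketchMob, onEvento, onPulse) is made up for illustration, and the only thing borrowed from the real thing is the idea that a mob holds a list of behaviors.

```ruby
# Hypothetical sketch: none of these names come from the MUD itself.
class SketchBehavior
    def onEvento(mob, evento); end   # called for every evento the mob sees
    def onPulse(mob); end            # called once per main-loop pulse
end

# A behavior carrying its own state: it counts pulses and fires every N of them.
class SketchTimerBehavior < SketchBehavior
    attr_reader :fired

    def initialize(period)
        @period = period
        @ticks = 0
        @fired = 0
    end

    def onPulse(mob)
        @ticks += 1
        @fired += 1 if @ticks % @period == 0
    end
end

class SketchMob
    attr_reader :behaviors

    def initialize
        @behaviors = []
    end

    # hand an evento to every attached behavior
    def see(evento)
        @behaviors.each { |b| b.onEvento(self, evento) }
    end

    # hand a pulse to every attached behavior
    def pulse
        @behaviors.each { |b| b.onPulse(self) }
    end
end

mob = SketchMob.new
timer = SketchTimerBehavior.new(3)
mob.behaviors << timer         # drop the behavior in
6.times { mob.pulse }          # timer fires on pulses 3 and 6
mob.behaviors.delete(timer)    # pull it back out
3.times { mob.pulse }          # the mob keeps pulsing; the timer no longer hears it
mob.see(:someEvento)           # eventos still flow with no behaviors attached
```

Because the timer keeps all of its state to itself, removing it leaves nothing dangling on the mob.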

I coded these up so that I could attach specific behaviors for different types of combat personalities. I won't go into too much detail just yet, but the main behavior I'll be talking about causes a mob, when hit, to retaliate against its aggressor. As I wrote this code, I did a good thing: I only wrote tests for it to see that it's working. I didn't actually run the MUD for over a month, since I had confidence my tests were doing a good job. At the end I'll reveal whether they actually were effective.

New testing techniques

At some point I found the need to test interactions between multiple mobs. Over time I ended up developing quite a bit of machinery to make this very simple while at the same time verifying a lot of things even in the simplest of tests. I'll walk through two example tests to give an idea of what's going on.

The first is one of a whole class of tests I wrote, several for each command possible. All the commands I have end up relying on a lot of lower level functionality, so these tests can help find when a change I make has unintended consequences in another part of the system. Well, that was a pretty worthless sentence; this is one of the main reasons for testing at any time.

I'm rambling. Here's the test. I've put in numbered comments for things to talk about further, so heed those.

class TC_NipicActions_Say < TC_NipicActions
    def test_simple()
        # (1)
        source = $td.t1

        # note: the name "hearer" is totally arbitrary. it's just some mob
        bystander = $td.hearer

        # (2)
        $td.nipics().each() { |n|
            # (3)
            TestHelper::MobHelper.quietSayBehavior(n)
        }

        # (4)
        TestHelper.action_test(source, bystander) {
            eventos = [
                SayEvento.new(source, "hello hello hello how are you doing sir")
            ]

            # (5)
            TestHelper.doActionNow(source, :doSay, eventos[0].saidText)

            {
                bystander => eventos,
                source => eventos
            }
        }
    end

    def test_bad_notext()
        source = $td.t1
        bystander = $td.hearer
        # (6)
        TestHelper.bad_action_test(source, bystander) {
            source.doSay(nil)
        }
    end
end # class TC_NipicActions_Say

(1) - $td stands for "test data", and is the object that primarily gives the test access to all of the mobs and rooms in the world. Actually, when I build a world to be used in tests, I also hand-build a corresponding TD class with the mobs built in. Here, I might as well show it.

class TD_Test1 < TD_Base
    def initialize()
        super()
        $mud = Mud.new(9876)
        $mud.loadWorld('test1')

        # hard-coded room ID referenced here
        self.room0 = $mud.worldMap.getRoom(LID.new('second', 0))

        # once I have a room, I can look up mobs in the room according to their
        # short descriptions, which I know because I wrote them in the area
        # XML file

        self.t1 = room0().bagMobsInRoom.find() { |thing|
            thing.shortDescription =~ /a short man/
        }

        self.t2 = room0().bagMobsInRoom.find() { |thing|
            thing.shortDescription =~ /a tall man/
        }

        # ...
    end

    # this declares both the accessor methods and inserts them into an internal
    # list of nipics, which I can use to find out all the nipics in
    # the area
    nipic :t1, :t2
end
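TD_Base itself isn't shown here, so the following is only a guess at how a class-level declaration like nipic could work: it defines the accessors and remembers the names so they can be enumerated later. SketchTDBase, SketchTD and nipicNames are hypothetical names, and the strings stand in for real mob objects.

```ruby
# Hypothetical sketch of a TD_Base-like class; the real one may differ.
class SketchTDBase
    # class-level macro: declare accessors and remember the names
    def self.nipic(*names)
        attr_accessor(*names)
        nipicNames.concat(names)
    end

    # per-class list of registered nipic names
    def self.nipicNames
        @nipicNames ||= []
    end

    # enumerate every registered nipic that has been assigned
    def nipics
        self.class.nipicNames.map { |name| send(name) }.compact
    end
end

class SketchTD < SketchTDBase
    def initialize
        self.t1 = 'a short man'   # stand-ins for real mob objects
        self.t2 = 'a tall man'
    end

    nipic :t1, :t2
end

td = SketchTD.new
```

With this in place, td.nipics returns the registered mobs in declaration order, which is the kind of list a call like $td.nipics().each() needs.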

(2) - as mentioned in the comment above, I can enumerate all the nipics in the world and do something to each of them. What sorts of things, you might want to know? Well...

(3) - I'm shutting them up. Maybe you recall from an earlier blog post that nipics by default will obnoxiously repeat anything they hear within earshot. Well, this test is about nipics saying things, and I don't want that weird behavior interfering... so I modify the nipics' behavior by shutting them up. What exactly does quietSayBehavior do? Here:

module TestHelper
    module MobHelper
        def self.quietSayBehavior(nipic)
            nipic.behaviors().delete_if() { |b|
                b.kind_of?(MB_ObnoxiousRepeater)
            }
        end # function quietSayBehavior
    end
end

It just finds the MobBehavior (remember I started off talking about those?) for the obnoxious repeating nonsense, and removes it. This works because I was careful to build the behaviors so that doing things like pulling one out at runtime is okay.

(4) - now, the main part of the test is a helper function called action_test. Its purpose is to let the test run some actions on some mobs, then verify that all the right eventos were seen by each involved party. The main part of the test is a block of code that A) controls the mobs according to what the test wants, and B) returns a hash that maps each involved party to a list of eventos (in order!) that they were expected to see during the course of the test.

In this case, there are two parties (source and bystander), and since the say command broadcasts to everyone in the room, everyone should get the same set of eventos.

(5) - this doActionNow function is very easy to skim over and miss. Its name seems obvious enough until you ask, "'now', as opposed to when else?" The reason is that during testing, the MUD is actually, by default, at a standstill. Nothing really happens unless we hand-crank it. Put another way, the main loop of the MUD doesn't run during testing. Put yet another way (how many ways are there?), it's up to the test to drive the main loop when it wants to. This allows for fine-grained control over timing.

This test, like most others, isn't so precise that it requires fine-grained time control, so it can use doActionNow. It's worth seeing what that really does:

module TestHelper
    def self.doActionNow(mob, symAction, *args)
        mob.doAction(symAction, *args)
        pulseMudUntilIdle(mob)
    end # function doActionNow
end

The pulseMudUntilIdle function basically runs the MUD's main loop (which primarily sends pulses to all things on the MUD) for a while, until the mob in question is "idle". Idle, you may recall, refers to the mob's not having any commands pending in its command queue and no residual delay from a recently executed command. Basically, by the time the mob goes back to being idle, it is ready to immediately execute another command.

So quickly bringing it back around, doActionNow means do a command and block until it's been completely finished.
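For a feel of what hand-cranking looks like, here is a small sketch under stated assumptions. SketchQueuedMob and SketchMud are invented stand-ins, not the real Mob and Mud classes, and the pulse limit is my own addition so that a mob that never goes idle fails the test instead of hanging it.

```ruby
# Hypothetical sketch: illustrative stand-ins for a mob and the MUD loop.
class SketchQueuedMob
    attr_reader :log

    def initialize
        @queue = []   # pending commands as [name, delay] pairs
        @delay = 0    # pulses left before the next command may run
        @log = []
    end

    def enqueue(name, delay)
        @queue << [name, delay]
    end

    # idle: nothing queued and no residual delay
    def idle?
        @queue.empty? && @delay == 0
    end

    def pulse
        if @delay > 0
            @delay -= 1
        elsif !@queue.empty?
            name, delay = @queue.shift
            @log << name
            @delay = delay
        end
    end
end

class SketchMud
    attr_reader :pulses

    def initialize(mobs)
        @mobs = mobs
        @pulses = 0
    end

    # one turn of the main loop, driven by the test
    def pulse
        @mobs.each { |m| m.pulse }
        @pulses += 1
    end

    # crank the loop until the mob is idle, giving up after `limit` pulses
    def pulseUntilIdle(mob, limit = 1000)
        limit.times do
            return if mob.idle?
            pulse
        end
        raise "mob still busy after #{limit} pulses" unless mob.idle?
    end
end

mob = SketchQueuedMob.new
mud = SketchMud.new([mob])
mob.enqueue(:say, 2)       # a command that leaves two pulses of delay
mud.pulseUntilIdle(mob)    # one pulse to execute it, two to burn the delay
```

Nothing moves between calls to pulse, which is what lets a test decide exactly when time passes.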

(6) - bad_action_test is a test that expects two things: firstly, that the block should hit a mobcheck due to some invalid parameter or something (in this case, trying to pass nil to doSay); and secondly, that no eventos will be produced. Obviously if the command errors out, there should be no effects visible to anyone else.
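The real bad_action_test isn't shown, so here is a rough sketch of the two checks it describes. It assumes that hitting a mobcheck surfaces as an exception, which may not be how the MUD does it; SketchMobCheckError, SketchSpeaker and sketchBadActionTest are all hypothetical names.

```ruby
# Hypothetical sketch: SketchMobCheckError stands in for whatever a mobcheck raises.
class SketchMobCheckError < StandardError
end

class SketchSpeaker
    attr_reader :heard

    def initialize
        @heard = []
    end

    def doSay(text)
        raise SketchMobCheckError, 'Say what?' if text.nil? || text.empty?
        @heard << text
    end
end

# passes only if the block trips a check AND nobody saw anything new
def sketchBadActionTest(*parties)
    before = parties.map { |p| p.heard.length }
    begin
        yield
    rescue SketchMobCheckError
        after = parties.map { |p| p.heard.length }
        raise 'eventos leaked from a failed command' unless before == after
        return true
    end
    raise 'expected the command to be rejected'
end

speaker = SketchSpeaker.new
rejected = sketchBadActionTest(speaker) { speaker.doSay(nil) }
```

A command that succeeds inside the block makes the helper itself raise, so a test can't pass by accident when validation is missing.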

More about action_test

action_test is so widely used now that it doesn't mind being examined even more. This rabbit hole goes deep. Watch out; we're about to dive in. How about we start with the code?

module TestHelper
    def self.action_test(*parties)
        archives = {}

        # (7)
        parties.each() { |p|
            archives[p] = MobEventoArchive.new()
        }

        # (8)
        parties.each() { |p|
            p.manager.addEventoListener(archives[p])
        }

        # (9)
        expectedEventos = yield

        # (10)
        expectedEventos.each_pair() { |party, eventos|
            archives[party].checkAgainstExpectedEventos(eventos, "eventos for #{party}")
        }
    end # function action_test
end

(7) - Think back again to eventos, the system for passing around messages that something occurred. We have this thing called the MobEventoArchive, which just couldn't have a more fitting name. It just logs all the eventos that are seen by a mob for later inspection.

(8) - for each mob that is part of this test, we add our archive as an evento listener. What's nice about this is that it plugs straight into the code for eventos; evento listeners aren't a special test hook or backdoor; they're part of the MUD proper. Anyway, this is how we wire up the evento archive to the mob, so that eventos that he sees get logged.

(9) - remember that block that was the actual meat of the test? Yeah, this yield is where we call that and get the list of expected eventos out of it.

(10) - finally, we can go through each party that is involved with this test and verify that the set of eventos they received and the order in which they were seen are as expected.

There are two things I should add about that. Firstly, of course this relies heavily on correct expected data being fed into the test--as they say, "garbage in, garbage out". Secondly, and more subtly, since this is the majority of the verification in most tests, it puts a heavy burden of trust on the evento system. It takes it for granted that if, for example, you see a SayEvento, that the player would actually see the effect of it on their screen. Or that if you see an EgressEvento, then a mob really did leave a room (not just broadcast some intent to). It's a lazy way of relying on one part of the system to transitively verify or imply other things happened.

But we're programmers, right? Being lazy is our bread and butter.
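One detail worth spelling out: comparing a freshly built expected evento against the one a mob actually saw only works if eventos compare by value rather than by identity. The sketch below shows one way an archive could do that. It is an assumption-laden illustration, not the real MobEventoArchive: SketchSayEvento, SketchEventoArchive, onEvento and mismatches are invented names, and Struct is used only because it provides value-based equality for free.

```ruby
# Hypothetical sketch; the real eventos and MobEventoArchive are not shown.
SketchSayEvento = Struct.new(:source, :saidText)   # Struct gives value-based ==

class SketchEventoArchive
    attr_reader :seen

    def initialize
        @seen = []
    end

    # listener hook: record each evento in arrival order
    def onEvento(evento)
        @seen << evento
    end

    # return a list of mismatch descriptions (empty means the check passed)
    def mismatches(expected, label)
        problems = []
        if expected.length != @seen.length
            problems << "#{label}: expected #{expected.length} eventos, saw #{@seen.length}"
        end
        expected.zip(@seen).each_with_index do |(want, got), i|
            problems << "#{label}: evento #{i} differs" if want != got
        end
        problems
    end
end

archive = SketchEventoArchive.new
archive.onEvento(SketchSayEvento.new(:t1, 'hello'))

ok = archive.mismatches([SketchSayEvento.new(:t1, 'hello')], 'eventos for t1')
bad = archive.mismatches([SketchSayEvento.new(:t1, 'goodbye')], 'eventos for t1')
```

Here ok comes back empty even though the expected evento is a different object from the recorded one, while bad names the position that differs.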

Dealing with randomness

If you're reading this blog, you're probably friends with me (fictionally, of course), which means you have at least some idea about what Dungeons and Dragons is. Whoa, looks like we're getting way off topic! No, actually, the point is that a large part of D&D is randomness - die rolling, checks for success, and all that. MUDs, based on D&D, are the same. Heck, most games have some element of randomness.

Randomness, any experienced tester will tell you, can be problematic. Usually you want deterministic test cases, so that you can verify precise conditions in the results. Of course it's possible to write tests that account for randomness, but they're more difficult, and if something breaks, the investigation time is typically longer.

But now I have all these choices that get made based on random numbers. I needed a way to deal with this in my tests. Luckily, I'm using a highly dynamic language like Ruby, where I can just hack things to bits until they do what I want. Let's go look.

For example, let's look at the damage a mob does when he hits someone else. I know it's not nice to hurt others, but combat is a big part of the MUD. Here's the code for hitting someone else:

class Mob
    def doHit(target)
        mcheck_amPhysical()
        mcheck_targetPerson(target, 'You can\'t hit that.')
        mcheck_personNearby(target)
        mcheck_physicalPresence(target)
        mcheck_targetNotSelf(target, 'Stop hitting yourself!')

        addActionDelay(10)

        hpLost = randGetHitDamage(target)

        sendEvento(room(), HitEvento.new(self, target, hpLost))
    end # function doHit

    def randGetHitDamage(target)
        return Calc.randMax(30)
    end # function randGetHitDamage
end

randMax gets a random number between 0 and the argument you pass in. I have a bunch of tests in which hitting happens, and if I want a consistent outcome, I need to control what this value is at runtime. Here we go with the Ruby magic.

module TestHelper
    module MobHelper
        def self.affixHitDamage(mob)
            mob.override_singleton_method(:randGetHitDamage) { |target|
                return 40
            }
        end # function affixHitDamage
    end
end

Calling this fixes the damage in place. Now my tests can assume that the damage is not changing for every hit, though the tests themselves still call randGetHitDamage to avoid knowing precisely what that fixed value is.

The subtext here is that all decisions that depend on random numbers need to be wrapped in methods (preferably tightly, as in this case), so that they can be overridden. Usually methods are used to break up code and give reusable blocks of functionality; here, they serve the added purpose of test hooks.

I know it's not strictly necessary, but can I show override_singleton_method real quick? I kind of like how it works. Yeah, I'm pretending to ask your permission. How do you like that?

class MethodOverrideError < StandardError
end

class Object # yeah, that's right, Object
    def override_singleton_method(symMethod, &blk)
        if !respond_to?(symMethod)
            raise MethodOverrideError.new("cannot override method #{symMethod} because object #{self} does not already implement it")
        end

        define_singleton_method(symMethod, &blk)
    end # function override_singleton_method
end

I wanted to make sure that when I say override I really do mean override and not "define a new method".
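Here is a short usage sketch of that guard. It repeats the definition so the snippet runs on its own; SketchDie is a hypothetical stand-in for anything with a random decision in it, not a class from the MUD.

```ruby
class MethodOverrideError < StandardError
end

class Object
    def override_singleton_method(symMethod, &blk)
        unless respond_to?(symMethod)
            raise MethodOverrideError, "cannot override #{symMethod}: #{self} does not implement it"
        end
        define_singleton_method(symMethod, &blk)
    end
end

# SketchDie is a hypothetical example class
class SketchDie
    def roll
        rand(6) + 1
    end
end

fixed = SketchDie.new
other = SketchDie.new
fixed.override_singleton_method(:roll) { 4 }   # only this one object changes

typoCaught = begin
    fixed.override_singleton_method(:rol) { 4 }   # misspelled on purpose
    false
rescue MethodOverrideError
    true
end
```

The override lives on the one object only, so other still rolls randomly, and the misspelled name is rejected instead of silently creating a method that nothing ever calls.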

There you have it. That's the general technique for how I constrain (rather than embrace) the chaos in my tests.

Putting it all together - a complex test. Wait, that was simple back there?

Using all these techniques I've shown, I'm able to create monstrously large tests that simulate multiple nipics interacting with each other. By heavily relying on the evento verification, I don't bother individually verifying many smaller things.

The code. This is a test for the most complicated scenario of the new "retaliator" behavior, which pits a mob against two aggressors and watches him systematically chase each one down and exterminate them, like you would any termite or roach.

Here's hoping my documentation is sufficient and clear.

def test_retaliate_pursue_2()
    attacker1 = $td.t1
    attacker2 = $td.hearer
    defender = $td.t2

    # calm both attackers, but not defender, which will retaliate
    TestHelper::MobHelper.calmNipicBehavior(attacker1)
    TestHelper::MobHelper.calmNipicBehavior(attacker2)
    TestHelper::MobHelper.affixHitDamage(attacker1)
    TestHelper::MobHelper.affixHitDamage(attacker2)
    TestHelper::MobHelper.affixHitDamage(defender)
    TestHelper::MobHelper.affixRetaliateHitDecision(defender, true)

    # (11)
    TestHelper::MobHelper.affixRetaliatePursuitDecision(defender, true)
    attacker1.hpRegenPerTick = 0
    attacker2.hpRegenPerTick = 0
    defender.hpRegenPerTick = 0

    # build up the eventos array according to several hits until death
    hitDamage_a1_d = attacker1.randGetHitDamage(defender)
    hitDamage_d_a1 = defender.randGetHitDamage(attacker1)
    hitDamage_a2_d = attacker2.randGetHitDamage(defender)
    hitDamage_d_a2 = defender.randGetHitDamage(attacker2)

    mbret = TestHelper::MobHelper.findRetaliator(defender)

    TestHelper.action_test(attacker1, defender, attacker2) {
        eventos =
        [
            HitEvento.new(attacker1, defender, hitDamage_a1_d),
            HitEvento.new(defender, attacker1, hitDamage_d_a1),
            HitEvento.new(attacker2, defender, hitDamage_a2_d),
            HitEvento.new(defender, attacker2, hitDamage_d_a2),
        ]

        TestHelper.doActionNow(attacker1, :doHit, defender)
        TestHelper.doActionNow(attacker2, :doHit, defender)

        ##########
        # by this time, the defender has been able to retaliate once on
        # each of the attackers
        #
        # we need to get the internal ordering of aggressors as stored by
        # the retaliator object. since the aggressors are stored in a hash,
        # this is nondeterministic, so we need to query it here. from this
        # point on, we don't refer to attacker1 and attacker2, but rather
        # firstAttacker and secondAttacker, which reflects the order in
        # which the defender will go after them.
        ##########
        ordering = TestHelper::MobHelper.getRetaliateOrdering(mbret)
        firstAttacker = ordering[0]
        secondAttacker = ordering[1]

        hitDamage_d_a1 = defender.randGetHitDamage(firstAttacker)
        hitDamage_d_a2 = defender.randGetHitDamage(secondAttacker)
        hpCount1 = firstAttacker.currentHP()
        hpCount2 = secondAttacker.currentHP()

        eventosFirstAttacker = eventos.dup()
        eventosSecondAttacker = eventos.dup()
        # use eventos to be defender's eventos

        # everyone sees first attacker leave
        e = EgressEvento.new(firstAttacker, Direction::South)
        eventos.push(e)
        eventosFirstAttacker.push(e)
        eventosSecondAttacker.push(e)

        # only first attacker sees himself enter south room
        eventosFirstAttacker.push(IngressEvento.new(firstAttacker, Direction::North))

        # only second attacker and defender see second attacker leave
        e = EgressEvento.new(secondAttacker, Direction::North)
        eventosSecondAttacker.push(e)
        eventos.push(e)

        # only second attacker sees himself enter north room
        eventosSecondAttacker.push(IngressEvento.new(secondAttacker, Direction::South))

        # defender follows first attacker first
        eventos.push(EgressEvento.new(defender, Direction::South))
        e = IngressEvento.new(defender, Direction::North)
        eventos.push(e)
        eventosFirstAttacker.push(e)

        # defender pounds first attacker to death
        while hpCount1 > 0
            e = HitEvento.new(defender, firstAttacker, hitDamage_d_a1)
            eventos.push(e)
            eventosFirstAttacker.push(e)
            hpCount1 = hpCount1 - hitDamage_d_a1
        end

        e = DieEvento.new(firstAttacker)
        eventos.push(e)
        eventosFirstAttacker.push(e)

        # now chase down second attacker. that's 2N from where we are
        eventos.push(EgressEvento.new(defender, Direction::North))
        eventos.push(IngressEvento.new(defender, Direction::South))
        eventos.push(EgressEvento.new(defender, Direction::North))
        e = IngressEvento.new(defender, Direction::South)
        eventos.push(e)
        eventosSecondAttacker.push(e)

        # defender pounds second attacker to death
        while hpCount2 > 0
            e = HitEvento.new(defender, secondAttacker, hitDamage_d_a2)
            eventos.push(e)
            eventosSecondAttacker.push(e)
            hpCount2 = hpCount2 - hitDamage_d_a2
        end

        e = DieEvento.new(secondAttacker)
        eventos.push(e)
        eventosSecondAttacker.push(e)

        firstAttacker.doAction(:walkInDirection, Direction::South)
        secondAttacker.doAction(:walkInDirection, Direction::North)

        # (12)
        # defender will retaliate, killing first attacker, then second
        TestHelper.deathTest(firstAttacker) {
            TestHelper.deathTest(secondAttacker) {
                TestHelper.pulseMudUntil('defender not mad') {
                    !mbret.uMadBro?()
                }
            }
        }


        {
            firstAttacker => eventosFirstAttacker,
            secondAttacker => eventosSecondAttacker,
            defender => eventos,
        }
    }
end

Holy crap, that's a long test. I'm actually kind of amazed it even works. Before getting to the annotations, it's interesting to note just how much of the test is dedicated to building up the expected results, as opposed to controlling the mobs. This is not an uncommon occurrence at all in testing. Really, there are only four commands that the mobs actually do: each attacker hits the defender, then each attacker runs away. All of the retaliation is automatic, so I have to predict the expected results and encode that in a set of eventos.

(11) - this affixing is needed on the defender, because on every pulse that he doesn't see any of his aggressors around, he has a chance to wander off hunting one. This removes the element of chance, forcing him always to choose "yes, pursue" for that decision.

(12) - the "death test" calls the block that it's given and verifies that the mob indicated by the argument is alive at the start, but dead dead dead by the end of it. It verifies several things about death: that the life state becomes LifeState::Dead, that it is removed from the room, and so on.

The reason I have two death tests embedded this way is that I want a single action--just waiting until the defender is no longer mad--to produce two deaths.
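The nesting is easier to see in miniature. Below is a hedged sketch of a death-test helper wrapped twice around a single action. SketchMortal and sketchDeathTest are invented for this example, and the checks are limited to the two things named above (life state and removal from the room); the real deathTest verifies more.

```ruby
# Hypothetical sketch: SketchMortal and sketchDeathTest are illustrative names.
class SketchMortal
    attr_reader :lifeState, :room

    def initialize(room)
        @lifeState = :alive
        @room = room
        room << self
    end

    def die
        @lifeState = :dead
        @room.delete(self)
        @room = nil
    end
end

# runs the block, checking the mob is alive before it and dead after it
def sketchDeathTest(mob)
    raise 'mob should be alive at the start' unless mob.lifeState == :alive
    room = mob.room
    result = yield
    raise 'mob should be dead at the end' unless mob.lifeState == :dead
    raise 'mob should be gone from its room' if room.include?(mob)
    result
end

room = []
first = SketchMortal.new(room)
second = SketchMortal.new(room)

# one inner action, two verified deaths
sketchDeathTest(first) {
    sketchDeathTest(second) {
        first.die
        second.die
    }
}
```

Each wrapper checks its own mob before and after the same inner block, so either survivor would fail the test.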

Conclusion

Early in this post I said I'd reveal whether this testing was effective, compared to ad-hoc testing (just running the MUD and playing it). Well, I finally fired it up the other day and fought one of these guys, and I have to say... it was pretty fun! The fight went basically as I envisioned it. I kept trying to run from that jerk (um yeah, the one that I attacked), but he ruthlessly chased me down. Maybe even too ruthlessly...

Now perhaps you see why I feel like a puppeteer in this endeavor.
