Babbling with Babadoobot

Guys, I did it again. I made another Twitter bot. This one absolutely crushes all my others on the "ridiculous idea" scale. I'm almost too embarrassed to share it with you. Almost.

I was walking to my bus the other day, and for some reason, the word "adobado" popped into my head--you know, the Mexican meat seasoning. I thought, "I wonder if I have that spelt right. It seems easy to mess up." Fittingly, as I wrote this blog, I got it wrong at first. I wrote "abodabo" and had to look it up and correct myself. Anyway, I kept thinking, "hehe, that word sounds funny... you know, I could probably write a program to generate funny sounding words like that."

Just because I can doesn't mean I should, but nevertheless, here we are. And with that, I give you @babadoobot, whose sole mission in life is to make silly sounding words with B, A, D, and O. That's it.

Aside from being mildly amusing to adults, there's this side effect where baby G loves it when you read the tweets to him.

A Node Journey

The idea was so simple that I decided to use it to learn some new technology skills, as a kind of toy project. Recently there had been a lot of hubbub about NodeJS, and I'd heard people at work talking about it for a while. I had kind of dismissed it as a backend web server type technology, but hearing how it broke so many apps made me curious about its capabilities as a general purpose programming environment, so I decided to use the new bot idea to learn about it.

This was a really great decision. I found that NodeJS is not only highly capable on its own, but also has a gigantic library of packages on NPM. Moreover, everything just worked on Windows, which is a true rarity in newer programming languages. In contrast, getting D or Ruby to work on Windows has some perilous steps.

It took only about 10 minutes from the start of reading about NodeJS/NPM until I had a working skeleton of a Twitter bot. Again in contrast, Lavamite and VimScramble required considerable wrangling and contorting to just get D to tweet even the simplest text, let alone to upload an image.

The basic functionality of the bot was done in about two days' worth of bus rides. All my programming time is measured in bus rides as a unit of time, as you might know. Two days is pretty quick.

I talked about my new Node discovery at work, and people suggested learning TypeScript to overcome the many shortcomings of JavaScript, NodeJS's native language. TypeScript turned out to be pretty easy to integrate, but not without its own artificial challenges.

The main problem is that any NPM module you use in Node would ideally have a set of type definitions declared for use in TypeScript. You can use the modules without these type definitions, but you don't get the full benefit of the typed-ness without them. For example, I was using a module called random-seed to give me a seedable random number generator, but there were no type definitions for it. I ended up switching to a different seedable RNG called Chance, which did have type definitions.

In fact, the official twitter NPM module doesn't have typings for it--misleadingly, there are typings but they're for some other Twitter thing... not sure what's going on there. So I have to allow it in as type any. So close to purity!

Eventually I got enough hints about the bizarre syntax needed in tsconfig.json and how to use typings, and got stuff to compile. From that point on it wasn't so bad.

How to babodoobadobado

The core part of the code is a state machine that cycles between the letters B, A, D, and O. There are a surprising number of rules that accumulated as I tested it with a lot of iterations.

  • Single-letter words can only be A or O.
  • Other words can start with any letter.
  • Allow "ba", "bd", "bb", and "bo" sounds.
  • Allow "da", "dd", and "do" sounds. Don't allow "db", because it's a bit hard to say.
  • Don't let a word end in two consonants. That also means 2-letter words can't be entirely consonants.
  • After "dd", "bb", and "bd", there can only be a vowel. That is, prevent "bbb", "ddd", etc.
  • Vowels can't follow a different vowel, so "oo" and "aa" are allowed, but not "oa" and "ao".
  • Along similar lines, only consonants can follow a doubled vowel, so no "ooo".

All of this is encoded in 15 states of the state machine and a total of 37 transitions. This state machine was nice because it was easy to make tweaks that didn't break the whole thing.

The transitions between the states are also weighted, so, for example, the random chance of getting "aa" is only 25% of the chance of getting "ab" or "ad", because "aa" looks and reads goofy.
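To make the rules and weights above a bit more concrete, here's a minimal sketch in TypeScript. It is not the bot's actual 15-state machine: it derives the allowed next letters from the last two letters of the word so far, which is just one way to encode the same rules. The names (nextOptions, pickWeighted, makeWord) and the weights of 4 and 1 are assumptions picked for illustration, chosen so that "aa" comes out 25% as likely as "ab" or "ad".

```typescript
// Hypothetical sketch: the rules are derived from the last two letters
// instead of from the named states used in the real bot.
type Letter = "b" | "a" | "d" | "o";
interface Option { letter: Letter; weight: number; }

// Assumed weights: 4 for ordinary moves, 1 for doubling a vowel.
const A: Option = { letter: "a", weight: 4 };
const O: Option = { letter: "o", weight: 4 };
const B: Option = { letter: "b", weight: 4 };
const D: Option = { letter: "d", weight: 4 };

function isVowel(c: string): boolean {
  return c === "a" || c === "o";
}

// Letters allowed after `prefix` in a word that will be `length` letters long.
function nextOptions(prefix: string, length: number): Option[] {
  if (prefix.length === 0) {
    // Single-letter words can only be A or O; others start with anything.
    return length === 1 ? [A, O] : [B, A, D, O];
  }
  const last = prefix.charAt(prefix.length - 1);
  const prev = prefix.charAt(prefix.length - 2); // "" for a one-letter prefix
  if (isVowel(last)) {
    // After a doubled vowel only consonants may follow (no "ooo").
    if (prev === last) return [B, D];
    // Otherwise a consonant, or (less often) the same vowel again.
    return [B, D, { letter: last as Letter, weight: 1 }];
  }
  const afterTwoConsonants = prev !== "" && !isVowel(prev);
  const isFinalLetter = prefix.length === length - 1;
  // No "bbb"/"ddd", and a word may not end in two consonants.
  if (afterTwoConsonants || isFinalLetter) return [A, O];
  // "db" is not allowed, so D cannot be followed by B.
  return last === "b" ? [A, O, B, D] : [A, O, D];
}

// Pick one option with probability proportional to its weight.
function pickWeighted(options: Option[], rand: () => number): Letter {
  let total = 0;
  for (const o of options) total += o.weight;
  let roll = rand() * total;
  for (const o of options) {
    roll -= o.weight;
    if (roll < 0) return o.letter;
  }
  return options[options.length - 1].letter;
}

// Build one word of exactly `length` letters.
function makeWord(length: number, rand: () => number = Math.random): string {
  let word = "";
  while (word.length < length) {
    word += pickWeighted(nextOptions(word, length), rand);
  }
  return word;
}
```

Calling makeWord(6) gives back one six-letter word. Passing the random function in as a parameter means a seedable RNG can be swapped in, which makes the output reproducible while testing.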

For each tweet, the program decides about how many characters in total to generate, and then splits that up into some number of words of various lengths, but no longer than 10 letters each. Then it uses the state machine on each word length to get the whole sentence.

Over half the time, it generates shorter tweets, around 30 letters only, but the rest of the time it allows itself to go up to the Twitter max of 140 letters, for some really wacky sentences.
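Here's a rough sketch of that budgeting step, assuming the budget counts letters plus one space between words. The 60% chance of a short tweet and the 20 to 40 range for "around 30" are assumptions made up for the example, as are the function names; the post only says "over half the time" and "around 30 letters".

```typescript
const MAX_WORD_LENGTH = 10;   // from the post: words are at most 10 letters
const MAX_TWEET_LENGTH = 140; // from the post: the Twitter max at the time

// Random integer in [lo, hi], inclusive on both ends.
function randInt(lo: number, hi: number, rand: () => number): number {
  return lo + Math.floor(rand() * (hi - lo + 1));
}

// Decide roughly how many characters the whole tweet should use.
// Assumed: 60% short (20..40 characters), otherwise anything up to the max.
function chooseBudget(rand: () => number = Math.random): number {
  return rand() < 0.6
    ? randInt(20, 40, rand)
    : randInt(41, MAX_TWEET_LENGTH, rand);
}

// Split the budget into word lengths of 1..10, leaving room for one space
// between neighboring words.
function splitIntoWordLengths(budget: number, rand: () => number = Math.random): number[] {
  const lengths: number[] = [];
  let remaining = budget;
  while (remaining > 0) {
    const len = Math.min(randInt(1, MAX_WORD_LENGTH, rand), remaining);
    lengths.push(len);
    remaining -= len + 1; // the word plus the space after it
  }
  return lengths;
}
```

Each length from splitIntoWordLengths(chooseBudget()) would then be handed to the word generator, and the words joined with spaces.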

For fun, there are about 20 different words that it recognizes and makes some modification to. I won't reveal them, because that's part of the fun! You may be able to guess what a few of the words might be if you think about it.

I released the bot quietly and showed it to my wife, who had the brilliant idea of adding punctuation to it. So I spent 2 bus rides getting that working, and now it generates a random amount of punctuation and sprinkles it throughout. I thought the bot was pretty amusing before, but now it's gotten borderline hilarious, if I may say so myself. Since I do love my grammar, I make sure that if the tweet has any punctuation at all, there's a period, question mark, or exclamation point at the end. It's just good grammar, yanno.

Someone draw me a picture

This wouldn't be complete without talking about the fantastic avatar I managed to snag for it. When I specified it to the artists, all I told them is, "I want a robot making an O shape with his mouth." That's it. I didn't even tell them what the bot was going to do.

I first approached my wife, who wasn't that into it, so my next pick was obviously @corduroyturtle, who draws neat stuff in a great style. What he produced was just fantastic and perfectly reflected the goofiness of the tweets, too. He and my wife both helped color it, and I love the outcome.

Some programming details

There were a few little bugaboos I had to work out, of course. I wanted it to generate long tweets sometimes, but that meant enforcing a length limit. After generating the words and modifying them to add the easter eggs, the tweet would sometimes be too long, so I had it delete random words until it fit in the required space. But I didn't want it to delete any of the modified words, so I had to track all that.
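A minimal sketch of that trimming step might look like the following. The Word shape with its modified flag and the trimToFit name are hypothetical, just one simple way to track which words are off limits.

```typescript
// Hypothetical shape: each word remembers whether an easter egg touched it.
interface Word { text: string; modified: boolean; }

// Length of the words once joined with single spaces.
function joinedLength(words: Word[]): number {
  let total = Math.max(0, words.length - 1); // spaces between words
  for (const w of words) total += w.text.length;
  return total;
}

// Delete random unmodified words until the tweet fits within `limit`.
function trimToFit(words: Word[], limit: number, rand: () => number = Math.random): Word[] {
  const kept = words.slice(); // leave the caller's array alone
  while (joinedLength(kept) > limit) {
    // Only words that were not modified are candidates for deletion.
    const removable: number[] = [];
    kept.forEach((w, i) => { if (!w.modified) removable.push(i); });
    if (removable.length === 0) break; // only protected words left; give up
    const victim = removable[Math.floor(rand() * removable.length)];
    kept.splice(victim, 1);
  }
  return kept;
}
```

If the protected words alone are already over the limit, this sketch just stops and returns them as they are. A real bot would need to decide what to do in that case.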

Then, when adding punctuation, again I needed to stay within the character limit, so I first generate all of the random punctuation characters, so I know all the lengths, and then delete words to get under the length limit. Finally, I choose the locations to insert them. Doing things in another order ended up with some problem or another.
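The ordering described above (pick the marks, then trim, then place) can be sketched like this. The set of marks, the cap of three marks per tweet, and the names are all assumptions for the example, and the trimming here is simplified so it does not protect modified words.

```typescript
const TERMINAL_MARKS = [".", "?", "!"];
const ALL_MARKS = [",", ".", "?", "!"]; // assumed set of marks

function pickFrom(items: string[], rand: () => number): string {
  return items[Math.floor(rand() * items.length)];
}

function punctuate(words: string[], limit: number, rand: () => number = Math.random): string {
  if (words.length === 0) return "";

  // Step 1: choose every mark up front so the total length is known.
  // If there is any punctuation at all, the last mark ends the sentence.
  const count = Math.floor(rand() * 4); // assumed: 0 to 3 marks
  const marks: string[] = [];
  for (let i = 0; i < count; i++) {
    const isLast = i === count - 1;
    marks.push(pickFrom(isLast ? TERMINAL_MARKS : ALL_MARKS, rand));
  }
  const markChars = marks.join("").length;

  // Step 2: delete random words until words, spaces, and marks all fit.
  const kept = words.slice();
  while (kept.length > 1 && kept.join(" ").length + markChars > limit) {
    kept.splice(Math.floor(rand() * kept.length), 1);
  }

  // Step 3: only now choose where each mark goes. The final mark attaches
  // to the last word; the others attach to a random earlier word.
  const suffixes: string[] = kept.map(() => "");
  for (let i = 0; i < marks.length; i++) {
    const isLast = i === marks.length - 1;
    const at = isLast ? kept.length - 1 : Math.floor(rand() * (kept.length - 1));
    suffixes[at] += marks[i];
  }
  return kept.map((w, i) => w + suffixes[i]).join(" ");
}
```

Because the marks attach directly to words and add no extra spaces, reserving their combined length in step 2 is enough to keep the final text under the limit.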

As with all my bots, I try to keep them on a strict schedule. I have it tweet 4 times a day. It keeps a log file for each round it does, and it uses the creation timestamp of the last log file to figure out how long it's been since the last round started. It waits until 6 hours from then to start the next round. Using the log file as persistence allows me to restart the bot at any time (e.g. for a computer update), and it will still keep on schedule.
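The timing arithmetic is simple enough to sketch as a pure function. This is an illustration under assumptions, not the bot's code: msUntilNextRound is a made-up name, and in Node the "last round started" timestamp could plausibly come from something like fs.statSync(logPath).birthtimeMs, where logPath is a hypothetical path to the newest log file.

```typescript
const SIX_HOURS_MS = 6 * 60 * 60 * 1000; // 4 tweets a day

// How long to wait before starting the next round, given when the last
// round started (e.g. the creation time of the newest log file).
function msUntilNextRound(
  lastRoundStartMs: number,
  nowMs: number,
  intervalMs: number = SIX_HOURS_MS
): number {
  const dueMs = lastRoundStartMs + intervalMs;
  // If the bot was down past the due time, start right away.
  return Math.max(0, dueMs - nowMs);
}
```

Since the wait is computed from a timestamp on disk rather than from an in-memory timer, restarting the process in the middle of a wait gives the same due time.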

So this was pretty much my easiest Twitter bot to make, but it will probably have the most widespread appeal. I mean, who doesn't like a little gibberish in their life?


Jason said...

I wanna know what the trigger words are.
I'm surprised that these aren't on the list:
seriously not OBA????
oh and abba
Oba tho

knutaf said...

I might add a few of those, and I won't confirm or deny any of them at the moment, but keep in mind that if a trigger word shows up, there's only a 50% chance it will trigger a modification. I didn't mention that in the post. It's to keep them from being overused, mostly.
