20151009

VimScramble: It's Out Now

About a month ago, I found myself doing one of the nerdiest things I can ever recall doing in my life: I was frying up 3 eggs in a slightly worn pan, but not to eat (not primarily, anyway). I was about to pose the scrambled eggs in the shapes of letters on the frying pan and take photos of them for use in a font. I only needed the letters ABCEILMRSV. Perhaps you can guess what I intended to spell?

It's VERB CLAIMS, of course! The hot new game I've been working on!

Oh wait, it's in the blog post title, isn't it? BAA!

I took the photos and artfully arranged them to make the logo for the new Twitter bot I've been working on, VimScramble. Like I said, it ranks among the nerdiest things I've ever done, but that shouldn't really be surprising to anyone.

Well, it's been almost a year since my last post. That's a problem, but I don't know if or when I'll fix it. It's been suggested that when my upcoming son is born I write some stuff about him periodically. Maybe that stuff will go here.

But enough about me; you want to know about "vimscramble", don't you? This Twitter bot is even more niche than my last one. Its purpose is to feed random inputs into an instance of VIM, a popular and highly flexible text editor, so useful that it's sparked religious wars.

The motivation and inspiration here are almost as obscure as the final product itself. Let's see... Have you ever been doing something while chatting with someone on your computer and accidentally typed into the chat window when you meant to be in your other activity? If you're playing, say, a video game, your friend might get a message that looks something like this:

wwwwwwwwwwwww aaaawwwaaaaadddddddssswa fffawdaaaaaaaadwwwasssssawwddaawwa

Maybe if you know computer games that's not so weird, yanno, good ol' WASD. But if your friend were using VIM in the other window, you'd probably get something like this:

2yyjjjjjjjpwwWWbxxxxdwdwdE....d1Gjjjd$:s/stuff/other stuff/g

To you, that's totally incomprehensible. To him, it's lightning-fast text editing magic. What happens if we feed all this random garbage to VIM and see how it reacts? I thought it might be fun, so I made something that explores it.

A quick primer on VIM: when you load the text editor, you start in something called "normal mode", where instead of entering text directly into the file, you can issue directives like "delete this line" or "move the cursor 10 spaces to the right" or "copy this line and paste it 6 times here". If you enter "insert mode", now you can type actual text into your file like you're used to. Hitting Escape brings you back to normal mode. From normal mode you can also go into command mode, where you can do things like "save this file" and "substitute this text everywhere it appears in the file".

The bot sends around 100 keystrokes to a VIM instance, captures the result of every keystroke as a screenshot, and tweets an animated GIF of the whole thing as a "round". It also saves the file so the next round builds on the previous one. So specific, but for the people who love VIM out there, well, this one's for you.

There were many mechanical parts involved with making this, but the most interesting parts of this bot were working with VIM itself. It has so much flexibility that I can't just send any old random inputs to it, or else I'll probably end up deleting all the files on my computer. I have to make it produce interesting effects, and I have to do it safely.

Making things interesting

What constitutes "interesting" inputs? I'm generally going for massive, sweeping changes to the file being edited. Simply typing random characters into it is too easy and kind of boring. We need to go deeper; we need to exploit all the various VIM commands at our disposal. Most of the key logic in the bot involves trying to get it to do things that are cool and not boring. Let me list out everything I did to make this happen.

I try to keep the VIM instance in normal mode most of the time, because that's where you can make really big changes, like "paste the clipboard 20 times". After every keystroke, I query VIM via a backchannel (more on that in a later blog post) and find out what mode it's in. If it's in insert mode, I roll the dice and decide if it should hit Escape to go back to normal mode.
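As a rough sketch of the mode check (not the bot's actual code), VIM's built-in mode() function reports the current mode as a short string. The helper name VimscrambleModeName is made up for illustration, and this post doesn't say what the backchannel is. One conceivable route, purely an assumption, is VIM's client-server feature (the --servername and --remote-expr arguments), which lets an outside process evaluate an expression like this one.

```vim
" Hypothetical helper: collapse mode() into the three modes discussed here.
" mode() without an argument returns only the first character of the mode.
function! VimscrambleModeName() abort
  let l:m = mode()
  if l:m ==# 'n'
    return 'normal'
  elseif l:m ==# 'i' || l:m ==# 'R'
    " Insert mode, or Replace mode, which also types text into the file.
    return 'insert'
  elseif l:m ==# 'c'
    return 'command'
  endif
  " Visual, Select, and the other modes are lumped together.
  return 'other'
endfunction
```

A bot that gets back 'insert' could then roll its dice and decide whether to send Escape.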

On every keystroke I roll the dice to see if I should press CTRL, because VIM has a lot of CTRL-keys that do stuff like scroll around or delete whole lines or other things.

If I manage to go into command mode, which is done by pressing :, I roll the dice on every keystroke to decide if instead I should hit Enter to execute the command, because the longer the command gets the more likely it is to be ill-formed and therefore not have any effect.

Building on top of that, when entering command mode I also decide if I want to execute a search-and-replace. This command is of the format :s/pattern/replacement/. I send a few random keys and manually insert a / to go to the replacement part and again to terminate the expression. These are especially interesting because the patterns are regular expressions, which can describe a lot of different patterns to search for, like "match any uppercase character or the number 5" or "match any three letter word" or "match a two digit number followed by up to three Zs".
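To make those three example patterns concrete, here is roughly how they could be spelled in VIM's regular expression syntax. The replacement text XX is arbitrary, and the e flag at the end only suppresses the error message when nothing on the current line matches.

```vim
" Any uppercase character or the number 5.
:s/[A-Z5]/XX/e

" Any three letter word: \< and \> mark word boundaries, \a is a letter.
:s/\<\a\{3}\>/XX/e

" A two digit number followed by up to three Zs.
:s/\d\{2}Z\{0,3}/XX/e
```

Random keystrokes will rarely produce anything this tidy, but even a short accidental pattern like a single letter or a dot can match plenty of text.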

I controlled the input data file. I started with a simple Lipsum, and later added a bit of punctuation and some numbers so that searches would match more often. I also added the letters k, w, x, y, and z, which don't appear in any Lipsums, probably because they mostly don't show up in Latin either.

I position the cursor on a random line and column in the file at the start, using the setpos() function, so that it's not skewed towards editing the first line of the file.
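A minimal sketch of that positioning step, assuming the random line and column are picked outside VIM and passed in. The function name VimscrambleJumpTo is hypothetical.

```vim
" Hypothetical helper: move the cursor to a line and column chosen elsewhere.
function! VimscrambleJumpTo(lnum, col) abort
  " Clamp the line so an out-of-range pick still lands inside the file.
  let l:lnum = max([1, min([a:lnum, line('$')])])
  " setpos() takes [bufnum, lnum, col, off]; bufnum 0 means the current buffer.
  call setpos('.', [0, l:lnum, a:col, 0])
endfunction
```

It would be invoked with something like :call VimscrambleJumpTo(12, 7), where 12 and 7 stand in for the random picks.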

Sometimes a command will open a second window. Usually this other window is a read-only file like a help file, so I detect it with the bufnr() function and roll the dice to decide whether to close it, because sitting there spamming keystrokes into a read-only window isn't cool at all.
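One way such a check could look, as a sketch under assumptions rather than the bot's real logic: remember the data file's buffer number once startup finishes, then compare it against the current buffer. The variable g:vimscrambleDataBufnr and the function name are invented for this example.

```vim
" Once startup is done, the data file is the current buffer; remember its number.
autocmd VimEnter * let g:vimscrambleDataBufnr = bufnr('%')

" Hypothetical helper: 1 if the cursor sits in an extra window showing some
" buffer other than the data file, 0 otherwise.
function! VimscrambleInExtraWindow() abort
  return winnr('$') > 1 && bufnr('%') != get(g:, 'vimscrambleDataBufnr', -1)
endfunction
```

When this returns 1, sending :close would drop the extra window and put the keystrokes back into the data file.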

Persisting

One of the trickier changes is saving the file at the end of each "round" and loading it for the next one. Later on I'll talk about why I'd call it tricky. Still, this part is interesting because of how destructive VIM is. All I have to type is dG to delete everything from the cursor to the bottom of the file. You might say with VIM it's easier to destroy than to create, because the deletion commands are so simple.

In my tests, I kept ending up with tiny, almost empty data files after several rounds, because all the text would keep getting randomly obliterated. My solution is to keep the original "reference" file that we start with, and to reset the data file back to it based on how big the current data file is compared to the reference file.

The formula to decide whether to purge is exponential. I set a file size threshold as a percentage of the reference file's size and have an exponentially higher chance of resetting the current file to the reference file, the lower the current data file size dips below it. If the size exceeds the threshold, the chance to reset the data file drops to zero quickly.
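The exact formula isn't given here, so the following is only one plausible shape for it. The threshold of 0.5 and the steepness of 3.0 are made-up values, the function name is hypothetical, and it's written in Vim script purely to match the other snippets in this post.

```vim
" Hypothetical reset-chance formula: a probability that grows exponentially
" as the data file shrinks relative to the reference file.
function! VimscrambleResetChance(currentSize, referenceSize) abort
  let l:threshold = 0.5    " assumed: the threshold is half the reference size
  let l:steepness = 3.0    " assumed: how sharply the chance falls off
  " Multiply by 1.0 to force floating point division.
  let l:ratio = (a:currentSize * 1.0) / a:referenceSize
  return exp(-l:steepness * l:ratio / l:threshold)
endfunction
```

With these values an empty file resets every time, a file sitting exactly at the threshold resets about 5% of the time, and the chance keeps shrinking toward zero above that. The two sizes could come from getfsize().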

A volatile environment

But things aren't perfectly rosy in the dangerous world of VIM. At least, not the way I'm playing with it, like one of those guys who swallows a flaming sword or whatever. Since I'm effectively loosing a chaotic neutral robot upon my PC at home, I need to put it in a sandbox, or else after enough random keystrokes, it'll stumble upon the exact combination of keys to delete some file I care deeply about.

There are lots of ways to sandbox it, and I probably chose the weakest, most fragile one of all: implement all the protections I can think of myself, by hand. In programming, at least half the time this is a terrible idea.

I could have used a sandboxing program, or I could have run my bot in a virtual machine, but I like to create things with very few external dependencies if possible. As it is, this already depends on cURL to do the actual tweeting part and FFmpeg to do the GIF encoding, so I didn't want to add further undue burden.

Let's go through all the protections I put in and the holes in them. This section really illuminates the flexibility and vast range of VIM commands available. I certainly didn't know about many of these before I started working on this part of the project.

There are two angles of attack I used to protect myself: things my bot avoids and things I configure VIM to prevent. The problem with that vastness of VIM I mentioned is that it's practically impossible for me to grok enough about the current conditions in the VIM instance to decide that I'd better not send a given keystroke. The only thing I do in my bot to avoid problems is prevent sending the CTRL-Escape key combination, which is a global shortcut in Windows. Go ahead and try it; it won't hurt.

So all the other safeguards are configured into VIM itself using its convenient configuration file, the vimrc file, and command line parameters to gvim.exe.

Command Line Args for Safety

I pass -Z to start VIM in restricted mode, which prevents VIM from being able to run external commands.

I pass --noplugin to stop the loading of any pesky plugins that might read other files or do hidden stuff I can't control. Plus, I like the minimalist experience anyway.

And, uh, the rest of this section is left intentionally absent.

Vimrc to Save the Day

Let's face it: the command line args were pretty weak. This is the good stuff, here.

I set the shell, makeprg, and grepprg options to empty just in case -Z didn't work, because I'm paranoid like that. Even if the bot managed to accidentally set these options back to something, the -Z command line arg above would still block external commands, so we're safe here.
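In a vimrc, emptying those three options takes a single line. This is a sketch of the idea rather than a copy of the real file.

```vim
" Leave nothing for :!, :make, or :grep to run, as a backstop behind -Z.
set shell= makeprg= grepprg=
```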

On Windows there are several key combos like CTRL-Insert, Shift-Insert, etc. to copy and paste with the clipboard. I don't want whatever I was last doing on the computer to find its way into the text file and escape to the Internet (whence nothing can be recalled), so I wipe the clipboard when VIM starts by doing :let @* = ''. The special * register is set to the clipboard, so this command just says to clear it. It's still possible now for the bot to write into the clipboard, but that's something well within my power as a computer user to deal with.

I prevent tab completion in command-line mode, because it can autocomplete filenames and directory names, which I don't want to leak to the world. I use :cmap <Tab> <Nop> for this. This reliably blocks Tab itself, but it's sad that I can't use Tab in command-line mode, because it autocompletes all kinds of stuff, including command names.
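Tab isn't the only completion key in command-line mode, though: CTRL-D lists the matching names, CTRL-A inserts all of them, and CTRL-L completes the longest common part. Since the bot also presses CTRL combinations, a more thorough version might map those away too. This is a sketch and not part of the original vimrc.

```vim
" Disable the command-line completion keys so file and directory names
" can't be pulled into a command by accident.
cnoremap <Tab> <Nop>
cnoremap <C-D> <Nop>
cnoremap <C-A> <Nop>
cnoremap <C-L> <Nop>
```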

The final touch is about reading and writing files. These turned out to be pretty tricky, but doable through the use of autocommands, by which I can cause VIM to automatically execute certain commands when specific events occur.

Preventing writing to files ended up being easier: I just set up autocommands for BufWriteCmd, FileWriteCmd, and FileAppendCmd that do nothing at all. Now if anything tries to do these operations, they'll just fizzle out.
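A sketch of what such write-blocking autocommands could look like. Defining a handler for one of these *Cmd events makes VIM hand the whole write over to the handler, so a handler that never writes means nothing reaches the disk. The group name and the counter variable are invented here, and the counter exists only to give the autocommand a harmless command to run.

```vim
" Hypothetical counter of write attempts that were swallowed.
let g:vimscrambleBlockedWrites = 0

augroup VimscrambleNoWrite
  " Clear the group first so re-sourcing doesn't stack duplicates.
  autocmd!
  " Take over every kind of write and do nothing useful with it.
  autocmd BufWriteCmd,FileWriteCmd,FileAppendCmd * let g:vimscrambleBlockedWrites += 1
augroup END
```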

Here's the problem with reading files: I can't just prevent reading all files, because I have to read at least one at startup--the data file to operate on! So we have to get trickier. What if every file that VIM tries to load instead redirects to loading the data file itself? That would be safe. I can override BufReadCmd to handle this.

One problem, though. Sometimes the bot tries to load help files, and these count as files on disk, so it would also redirect those to load the data file instead. That's not fun at all. After doing some research I found that all help files are found under the folder $VIMRUNTIME\doc, so if I match the filename being loaded, I can decide whether to let it go through (if we're loading a help file) or redirect to my data file.

Before I show the code for this part, I should mention another problem I ran into: the name of the file being edited can be changed by the :file command. So I back it up when the VIM instance starts, for safekeeping.

" Save the original file that was opened at the start, so that we can prevent
" writing to other files. Use lockvar to try to keep things from editing it
" accidentally.
"
" Of course, something could unlockvar and then go to town, but the odds are
" low.
let g:originalVimsumDataFilePath = fnamemodify(argv(0), ':p')
lockvar g:originalVimsumDataFilePath

" Allow opening only help files, which reside under $VIMRUNTIME\doc. Expand
" the target filename to open, strip off the beginning part using
" fnamemodify(), and either permit the filename to be opened or instead swap in
" our data file name, saved from when we first opened vim.
autocmd BufReadCmd * if fnamemodify(expand("%"), ":p:h:s?" . escape($VIMRUNTIME, ' \') . "??") != "\\doc"
autocmd BufReadCmd * exec "edit " . g:originalVimsumDataFilePath
autocmd BufReadCmd * else
autocmd BufReadCmd * edit <afile>
autocmd BufReadCmd * endif

(Oops, you can see how I used to call this project "vimsum"--a portmanteau of "VIM" and "Lipsum"--before I settled on a name I liked. I do love a good portmanteau, but it just wasn't doing it for me this time.)

Finally, I set the SourceCmd autocommand to prevent the bot from accidentally executing a script from disk, known as "sourcing" it. There's a problem with this one, too, though! The vimrc where all this configuration is stored is itself sourced at startup, so I can't simply set this. Instead, I set it in a deferred way.

autocmd VimEnter * autocmd SourceCmd * echohl "nope"

In other words: once VIM has finished starting up, then set the autocommand that prevents sourcing. Autocommandception, I think, is the official term for this.

There's one big problem with all this autocommand stuff: it's possible to temporarily disable all autocommands using the eventignore option or the :noautocmd command prefix. I decided I'd just hope that the stars never align and cause the bot to do this at the same time as saving or loading a file.

Enjoy the Scramble

From looking back through my source code history, apparently I've been working on this since late March, which is, wow, a lot longer than I'd expect to take for a project seemingly so small. In my defense, I have a whole series of blogs planned on the technical hurdles I overcame to bring this little piece of art to life, and those will hopefully illuminate why it took so long. By the way, the eggs were delicious, even after being photographed.

1 comment:

your wife (my url links to a gif of me trying to program something like this) said...

you're amazing! i have to admit i'm not really sure what most of this means, but i know you worked on it for a long time and spent stupid amounts of time solving issues with it. good job! i'm sorry i can't appreciate it more <3
