Ahhh, with the release of Vimscramble squarely behind me, I can turn my thoughts to writing more about Vimscramble, because I'm definitely not done!
That first post was tame: an overview, barely an abstract or a synopsis. Now we dig into the fun stuff, the clanking machinery needed to bring it to life. Believe you me, there's quite a bit of it. This is the post I've been looking forward to writing for a while, so I hope you're ready to put on your debugging hat.
In the pursuit of having the bot keep things interesting, a key requirement was to be able to interrogate the running VIM instance about its current state. As I'll try to show in a later post, it's easy enough to send keystrokes to VIM using standard Windows functionality, but asking for an answer from it has no such provision.
When I first started out I had no idea what to do, but I found some hints online about a client/server mode. I checked the usage of command-line vim.exe, and sure enough found --remote-expr, which says "Evaluate expr in a Vim server and print result". This is perfect, and exactly what I needed.
In the interest of keeping my code simple, I tried to script the command line vim.exe, but I couldn't seem to capture the output properly, so instead I resorted to duplicating whatever functionality VIM uses to send the expression and receive the result. A bit of light reverse engineering never hurt anyone.
A Bit of Light Reverse Engineering
I got a hint about where to begin from the VIM docs themselves: "Since MS-Windows messages are used, any other application should be able to communicate with a Vim server." This is a good starting point because it tells me which tools I can use to start looking at what's going on. The trusty standby for examining window messages is Spy++. In my case, since the only serious choice nowadays is writing 64-bit programs, it's spyxx_amd64.exe.
So let's fire up Spy++ and see what messages pass between the client and the server when we evaluate an expression. Here's the command I'll use, followed by the reply (the question is "which mode am I currently in?" and the answer is "normal mode").
>vim --servername VIMSUM --remote-expr mode()
n
First I look for the toplevel VIM window of the server and look for messages. I want all windows of the same process, and since I don't know what kind of messages are going on, I want all messages logged. That's a lot of messages, and if I so much as hover the mouse over a corner, I'll get an explosion of activity, so I'm careful not to interact with it in any way.
I carefully execute my command, and I'm rewarded by just two messages! This is good; I was worried about having to sift through a sea of hundreds to find the ones of interest. Let's look at what's in these two key messages.
Well, great. All I'm getting here is pointers, not the actual binary contents of the messages. I guess I'll have to dig in, or, as they say, go deeper. Next tool: my favorite, windbg. This old debugger cheat sheet might help.
I want to run the client under the debugger and see what message it's posting. Attaching to the server side will require breaking in the window message loop, which can be done, but isn't as easy.
I don't know if the client is using SendMessage or PostMessage, or whether it's using the ANSI or Unicode version, so I'll just break on all of them. I'll know I've hit the right one if I find a WM_COPYDATA message.
0:000> bl
0 e 00007ffd`5e5249c0 0001 (0001) 0:**** USER32!SendMessageA
1 e 00007ffd`5e51f4b0 0001 (0001) 0:**** USER32!SendMessageW
2 e 00007ffd`5e534900 0001 (0001) 0:**** USER32!PostMessageA
3 e 00007ffd`5e5270a0 0001 (0001) 0:**** USER32!PostMessageW
All right, one of my breakpoints hit! Looks like SendMessage it is... wait a second, since I don't have private symbols (only stripped ones) for Windows, I only get the function name, no parameters. And of course, I won't get anything at all for VIM. This is still doable, though.
# Child-SP RetAddr Call Site
00 000000a2`4017f218 00007ff7`e9f391db USER32!SendMessageA
01 000000a2`4017f220 00007ff7`e9f3b3bc vim+0x1491db
02 000000a2`4017f270 00007ff7`e9eca11f vim+0x14b3bc
03 000000a2`4017f430 00007ff7`e9eca456 vim+0xda11f
04 000000a2`4017f520 00007ff7`e9ece266 vim+0xda456
05 000000a2`4017f550 00007ff7`e9dfb418 vim+0xde266
06 000000a2`4017f750 00007ffd`5f1f2d92 vim+0xb418
07 000000a2`4017f790 00007ffd`60dc9f64 KERNEL32!BaseThreadInitThunk+0x22
08 000000a2`4017f7c0 00000000`00000000 ntdll!RtlUserThreadStart+0x34
Here's the signature of SendMessage. I care about the Msg parameter first, to tell me if it's a WM_COPYDATA message.
LRESULT
SendMessageA(
HWND hWnd,
UINT Msg,
WPARAM wParam,
LPARAM lParam);
With a bit of knowledge of the x64 calling convention, I know that the first four integer or pointer parameters to a function are passed in registers rcx, rdx, r8, and r9, respectively, so I'll look at rdx to find Msg first.
0:000> r
rax=0000000000000007 rbx=0000000000000001 rcx=0000000000b9738c
rdx=000000000000004a rsi=0000000000b9738c rdi=0000000000000000
rip=00007ffd5e5249c0 rsp=000000a24017f218 rbp=000000a24038a150
r8=0000000000b57b5e r9=000000a24017f240 r10=0000000000000000
r11=0000000000000010 r12=0000000000000000 r13=0000000000000003
r14=0000000000000000 r15=000000a24017f498
iopl=0 nv up ei pl nz na pe nc
cs=0033 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000202
USER32!SendMessageA:
00007ffd`5e5249c0 488bc4 mov rax,rsp
If I look up in the public header files (in this case in winuser.h) I see that WM_COPYDATA has code 0x4a, so this is indeed a message I'm looking for. The docs for WM_COPYDATA say that wParam holds the HWND of the window that is sending the data. Hmmm... that could be interesting; I'll come back to it later. And lParam points to a COPYDATASTRUCT with the actual data being passed.
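To make the register-to-parameter mapping concrete, here is a small Python sketch that lines the register values from the dump above up against the SendMessage signature. The helper name `sendmessage_args` is made up for illustration; the only outside facts it relies on are the calling convention and the value of WM_COPYDATA from winuser.h.

```python
WM_COPYDATA = 0x004A  # value from winuser.h

# Register values copied from the `r` output at the SendMessageA breakpoint.
regs = {
    "rcx": 0x0000000000B9738C,
    "rdx": 0x000000000000004A,
    "r8": 0x0000000000B57B5E,
    "r9": 0x000000A24017F240,
}

def sendmessage_args(regs):
    """Map the x64 argument registers onto SendMessage's four parameters."""
    return {
        "hWnd": regs["rcx"],
        "Msg": regs["rdx"] & 0xFFFFFFFF,  # UINT: only the low 32 bits matter
        "wParam": regs["r8"],
        "lParam": regs["r9"],
    }

args = sendmessage_args(regs)
print(args["Msg"] == WM_COPYDATA)
print(hex(args["wParam"]), hex(args["lParam"]))
```

So rcx is the destination window, r8 is the sender's HWND, and r9 is the address I need to dump next.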
Same problem with symbols: the debugger can't help me visualize a COPYDATASTRUCT, so I'll have to do it the old-fashioned way, just looking at memory offsets. I'll walk through it. It's not hard, just annoying. To start with, we'll need the definition of COPYDATASTRUCT.
typedef struct tagCOPYDATASTRUCT {
ULONG_PTR dwData; // 8 bytes
DWORD cbData; // 4 bytes
_Field_size_bytes_(cbData) PVOID lpData; // 8 bytes
} COPYDATASTRUCT, *PCOPYDATASTRUCT;
In the debugger, I can dump the memory to find the fields.
// Grab the first few pointer-sized elements
0:000> dp @r9 L3
000000a2`4017f240 00000000`00000014 00000000`00000007
000000a2`4017f250 000000a2`40393510
Due to alignment, the DWORD-sized cbData (4 bytes) is followed by 4 bytes of padding so that lpData starts on an 8-byte boundary; in the dump, that padding shows up as the upper half of the second value. The values are dwData = 0x14, cbData = 0x7, and lpData = 0x000000a2`40393510.
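As a sanity check on that reading of the dump, here is a minimal Python sketch that unpacks the same 24 bytes using the 64-bit layout of COPYDATASTRUCT. The byte string is just the dp output above written out in little-endian order, and the function name `parse_copydatastruct` is my own invention, not anything from VIM or Windows.

```python
import struct

# The 24 bytes at @r9, transcribed from the dp output (little-endian).
raw = bytes.fromhex(
    "1400000000000000"  # dwData
    "0700000000000000"  # cbData in the low 4 bytes, then 4 bytes of padding
    "10353940a2000000"  # lpData
)

def parse_copydatastruct(buf):
    """Unpack a 64-bit COPYDATASTRUCT: ULONG_PTR, DWORD, 4 pad bytes, PVOID."""
    dw_data, cb_data, lp_data = struct.unpack("<QI4xQ", buf)
    return dw_data, cb_data, lp_data

dw_data, cb_data, lp_data = parse_copydatastruct(raw)
print(hex(dw_data), cb_data, hex(lp_data))
```

The `4x` in the format string skips the padding, so whatever happens to be in those bytes never leaks into cbData.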
dwData is some application-specific value that means something to VIM. I don't know what it means, but I'll probably have to pass it when I replicate this window message in my own code. The actual data being passed is only 7 bytes long. And finally, let's look at the data being passed!
// Dump 7 bytes at lpData
0:000> db 000000a2`40393510 L7
000000a2`40393510 6c 61 74 69 6e 31 00 latin1.
Well, that's not what I was expecting. That sounds like they're sending... a character encoding? It's a null-terminated ANSI string. Maybe the second message contains the actual expression to evaluate. Let's keep going in the debugger.
0:000> g
Breakpoint 0 hit
USER32!SendMessageA:
00007ffd`5e5249c0 488bc4 mov rax,rsp
0:000> dp @r9 L3
000000a2`4017f2b8 00000000`0000000a 00000000`00000007
000000a2`4017f2c8 000000a2`4038a150
0:000> db 000000a2`4038a150 L7
000000a2`4038a150 6d 6f 64 65 28 29 00 mode().
Hey, look! That's my command from the command line! I finally found you, little guy. Been looking all over for you. So I've learned so far that the VIM client sends a message with the character encoding (a guess) and another with the actual command. But where's the reply? SendMessage is a one-way street. Well, you can learn about an error code from its return value, but no real data.
A Two-way Street
Remember the wParam in the WM_COPYDATA message? It's the originating window. Well, let's see what this originating window is all about in Spy++.
So the VIM client process creates a hidden window (you can tell from the greyed-out window icon), presumably where it gets replies. If I debug the VIM server process and see where it's doing a SendMessage or PostMessage, I hope to see a reply message going out.
0:002> g
Breakpoint 0 hit
USER32!SendMessageA:
00007ffd`5e5249c0 488bc4 mov rax,rsp
0:000> k
# Child-SP RetAddr Call Site
00 0000009e`6754eb88 00007ff6`6105a0fb USER32!SendMessageA
01 0000009e`6754eb90 00007ff6`61059efc gvim+0x15a0fb
02 0000009e`6754ebe0 00007ffd`5e5200dc gvim+0x159efc
03 0000009e`6754eca0 00007ffd`5e51fe52 USER32!UserCallWinProcCheckWow+0x1fc
04 0000009e`6754ed90 00007ffd`5e51429b USER32!DispatchClientMessage+0xa2
05 0000009e`6754edf0 00007ffd`60e553e4 USER32!_fnCOPYDATA+0x5b
06 0000009e`6754ee50 00007ffd`5e53fcba ntdll!KiUserCallbackDispatcherContinue
07 0000009e`6754ef28 00007ffd`5e52f8e5 USER32!NtUserGetMessage+0xa
08 0000009e`6754ef30 00007ff6`611085ea USER32!GetMessageW+0x25
09 0000009e`6754ef60 00007ff6`6110ca11 gvim+0x2085ea
0a 0000009e`6754f030 00007ff6`6110272b gvim+0x20ca11
...
The breakpoint hit in the server process (whew). The interesting bits of this stack are the GetMessageW call, which shows that it's in the message pump (basically a tight loop calling GetMessage); something about COPYDATA, which is probably used to process the incoming WM_COPYDATA message from whatever transport Windows uses to get the data across process boundaries; and finally our SendMessage call, hopefully preparing the reply. I'll dump the registers and see.
0:000> r
rax=0000000000000006 rbx=0000009e6778b9f0 rcx=0000000000b57b5e
rdx=000000000000004a rsi=0000009e6754eef0 rdi=0000009e677c9070
rip=00007ffd5e5249c0 rsp=0000009e6754eb88 rbp=0000000000b57b5e
r8=0000000000b9738c r9=0000009e6754ebb0 r10=0000000000000000
r11=0000000000000064 r12=0000000000000000 r13=00007ff661059d30
r14=0000000000000000 r15=000000000000004a
iopl=0 nv up ei pl nz na po nc
cs=0033 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000206
USER32!SendMessageA:
00007ffd`5e5249c0 488bc4 mov rax,rsp
As expected, the hwnd parameter in rcx matches wParam from before, which is where the reply should go. Also good that rdx is 0x4a, the value for WM_COPYDATA that we saw before. Let's look in the COPYDATASTRUCT.
0:000> dp @r9 L3
0000009e`6754ebb0 00000000`00000014 0000009e`00000006
0000009e`6754ebc0 0000009e`6774ac90
0:000> db 0000009e`6774ac90 L6
0000009e`6774ac90 75 74 66 2d 38 00 utf-8.
Fine, it's another character encoding. Who cares? I get it, you speak a bunch of languages. Hopefully there's another message that follows. Luckily, there is.
Breakpoint 0 hit
USER32!SendMessageA:
00007ffd`5e5249c0 488bc4 mov rax,rsp
0:000> dp @r9 L3
0000009e`6754ec28 00000000`0000000b 00007ffd`00000002
0000009e`6754ec38 0000009e`677c9070
0:000> db 0000009e`677c9070 L2
0000009e`677c9070 6e 00 n.
It's n for normal mode! Hey, that's what I was looking for. In the COPYDATASTRUCT, notice that dwData = 0xb this time. Through some experimentation, I found out that the character encoding messages are optional. With this, I actually have all the information I need to replicate what the VIM client is doing.
As a client, I need to create a window and set up a message pump to process the reply. Once I find the server window, I SendMessage a WM_COPYDATA with dwData = 0xa, the length and data set appropriately, and wParam set to my own HWND. Then I pump messages and wait for a WM_COPYDATA message with dwData = 0xb and pull out the reply contents.
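Here is a rough Python/ctypes sketch of the data-handling half of that recipe, not the code Vimscramble actually uses. The constant names, the helper names, and the choice of UTF-8 for encoding and decoding are all assumptions of mine; the only values taken from the debugging sessions are the dwData codes 0xa and 0xb. Creating the hidden reply window and pumping messages is left out, and `send_expr` only works on Windows.

```python
import ctypes

WM_COPYDATA = 0x004A
# dwData values observed in the debugger; the names are hypothetical.
COPYDATA_EXPR = 0x0A    # client -> server: expression to evaluate
COPYDATA_RESULT = 0x0B  # server -> client: result of the expression

class COPYDATASTRUCT(ctypes.Structure):
    _fields_ = [
        ("dwData", ctypes.c_size_t),  # ULONG_PTR
        ("cbData", ctypes.c_uint32),  # DWORD
        ("lpData", ctypes.c_void_p),  # PVOID
    ]

def build_request(expr, encoding="utf-8"):
    """Return (struct, buffer); the buffer must stay alive while sending."""
    buf = ctypes.create_string_buffer(expr.encode(encoding))  # appends a NUL
    cds = COPYDATASTRUCT(COPYDATA_EXPR, len(buf), ctypes.addressof(buf))
    return cds, buf

def read_reply(cds, encoding="utf-8"):
    """Return the reply text for a result message, or None for anything else."""
    if cds.dwData != COPYDATA_RESULT:
        return None
    data = ctypes.string_at(cds.lpData, cds.cbData)
    return data.rstrip(b"\0").decode(encoding)

def send_expr(server_hwnd, reply_hwnd, expr):
    """Windows only: send the request; the reply arrives at reply_hwnd."""
    user32 = ctypes.windll.user32
    user32.SendMessageA.argtypes = [
        ctypes.c_void_p, ctypes.c_uint, ctypes.c_size_t, ctypes.c_void_p,
    ]
    user32.SendMessageA.restype = ctypes.c_ssize_t
    cds, buf = build_request(expr)
    return user32.SendMessageA(
        server_hwnd, WM_COPYDATA, reply_hwnd, ctypes.byref(cds)
    )
```

In this sketch, `reply_hwnd` is assumed to be a window I created myself, whose window procedure would cast the incoming lParam to a COPYDATASTRUCT pointer and hand the struct to `read_reply`. The encoding messages (dwData = 0x14) are skipped entirely, since they turned out to be optional.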
Seems a lot more complicated to observe the behavior than recreate it, huh? There was a lot of trial and error involved in making this part, so it took me a bunch of coding sessions to get it right, though I can now explain it in a pretty straightforward way.