20130401

The Curious Case of the 9-Letter Bug

I recently fixed a very interesting bug that had been stumping me for a long time. It was a pain to debug--if you could really call it that--so I thought I'd share about it.

The bug repro steps go something like this:

  1. Start up my MungeTLS test EXE
  2. Start up my openssl test client, which does: openssl s_client -connect localhost:8879 -debug -state -pause -msg -legacy_renegotiation -tlsextdebug -tls1
  3. After the handshake completes, type any 9 characters and hit enter
  4. The server immediately terminates the connection

Whaaaaaa...? The number of typing shall be 9. It shall not be 8 (for then the bug won't repro); it shall not be 10 (for the bug surely won't manifest). It will be 9 only, no more, no less. Okay, that's not quite true... but we'll come to that in a little bit.

Finding the bug

Get ready for a rainbow explosion. Are you ready for this? You aren't even ready. Let's take a look at the various logs in play. First, let's see the client logs, where I type 'a' 9 times:

New, TLSv1/SSLv3, Cipher is AES128-SHA
Server public key is 2048 bit
Secure Renegotiation IS supported
Compression: NONE
Expansion: NONE
SSL-Session:
    Protocol  : TLSv1
    Cipher    : AES128-SHA
    Session-ID:
    Session-ID-ctx:
    Master-Key: 2A76F3897A72CC25A19E78D0DFC7D289F3E1917CB00C764104ADFF9315EB1D
75580AEC9FDA9A351E4FBD7DD5F84084
    Key-Arg   : None
    PSK identity: None
    PSK identity hint: None
    SRP username: None
    Start Time: 1364315335
    Timeout   : 7200 (sec)
    Verify return code: 19 (self signed certificate in certificate chain)
---
aaaaaaaaa
write to 0x1cfe360 [0x1db6b46] (74 bytes => 74 (0x4A))
0000 - 17 03 01 00 20 ed 4b 27-e3 a8 0a 8f 79 fc da ea   .... .K'....y...
0010 - 28 85 bc e6 b0 c8 cb 4a-68 d7 b5 1d f8 59 12 46   (......Jh....Y.F
0020 - 3f 26 1a 37 b3 17 03 01-00 20 ef 1b 53 67 4a 0c   ?&.7..... ..SgJ.
0030 - a8 0e e9 15 fc 16 5e bf-b2 ff f1 16 30 53 b8 5a   ......^.....0S.Z
0040 - ef f0 aa 3c f1 50 83 33-f4 e0                     ...<.P.3..

SSL3 alert write:warning:close notify

Important information we see from this is that we're using TLS 1.0 with AES128-SHA. This tells us the format of the TLSCiphertext, as it relates to the encrypted body, IV, and MAC.

Great. Let's see the other side of the story--the server logs. Why oh why did you bail out like that, dear server?

successfully parsed TLSCiphertext. CT=23

decrypting. ciphertext (32 bytes):
ED 4B 27 E3 A8 0A 8F 79 FC DA EA 28 85 BC E6 B0 C8 CB 4A 68 D7 B5 1D F8 59 12 46 3F 26 1A 37 B3

setting IV to:
96 05 06 F0 35 34 26 7E 10 B7 39 90 8B F6 BA 6F

done initial decrypt: cb=21, size=32
74 CB AB C8 2D 62 76 84 4A 80 F5 C6 8D 14 90 8E D7 F4 59 03 0B 0B 0B 0B 0B 0B 0B 0B 0B 0B 0B 0B

looking for 11 padding bytes

MAC text is 13 bytes
sequence number: 1
MAC hash text:
00 00 00 00 00 00 00 01 17 03 01 00 00 

received MAC:
74 CB AB C8 2D 62 76 84 4A 80 F5 C6 8D 14 90 8E D7 F4 59 03 

computed MAC:
74 CB AB C8 2D 62 76 84 4A 80 F5 C6 8D 14 90 8E D7 F4 59 03 


// there were actually two messages


successfully parsed TLSCiphertext. CT=23
decrypting. ciphertext (32 bytes):
EF 1B 53 67 4A 0C A8 0E E9 15 FC 16 5E BF B2 FF F1 16 30 53 B8 5A EF F0 AA 3C F1 50 83 33 F4 E0

setting IV to:
C8 CB 4A 68 D7 B5 1D F8 59 12 46 3F 26 1A 37 B3

FALSE WindowsCrypto.cpp:380 - (CryptDecrypt( hKeyNew, 0, fFinal, 0, &pvbDecrypted->front(), &cb)) == 80090005

That error at the end is NTE_BAD_DATA, which basically means, "haha something smells funny in this input data." It's about as useful as ear hair.

For a comparison, how about we take a look at a working case and see if we can spot anything different? Here is the MungeTLS log for the last message (the one that failed before), but this time I'm sending the letter 'a' ten times rather than nine.

successfully parsed TLSCiphertext. CT=23

decrypting. ciphertext (48 bytes):
C9 EF 6F FD CF 03 A3 0F 4D CD 85 9C 37 77 27 1C 09 4A 6D 35 58 A4 B6 50 37 6E 70 6F 5C 28 01 DA E1 9D 33 BF 6B 4D DA 84 D9 37 8C 5C 6E 94 2C A0 

setting IV to:
79 C4 E7 86 77 CC 9E C8 03 67 CC 76 75 FC 1F BA 

done initial decrypt: cb=33, size=48
61 61 61 61 61 61 61 61 61 61 0D 0A 66 17 E3 19 62 93 5D FD EC 33 C7 D7 CA FB 50 B4 A6 98 57 40 0F 0F 0F 0F 0F 0F 0F 0F 0F 0F 0F 0F 0F 0F 0F 0F

looking for 15 padding bytes

decrypted fragment:
61 61 61 61 61 61 61 61 61 61 0D 0A

There are several interesting things here, all of which are actually kind of the same. First and most importantly, when I said that I was sending nine characters before in the failing case, that's wrong--I didn't realize the client was tacking on a CRLF, making it actually eleven characters. Eleven characters of payload plus a 20 byte SHA1 MAC is 31 bytes, which leaves exactly zero bytes of padding. That is, only the 1-byte padding length field would be present at the end of the block, having a value of 0x00, and there would be no actual padding bytes.

Likewise, with the addition of that extra character, we've pushed the payload length into the next ciphertext block, bringing it up to a total of 48 bytes, with the last block containing entirely padding (15 padding bytes + 1 padding length byte).

I'm certain at this point that CryptoAPI is not handling 0 padding bytes well, even though OpenSSL clearly expects it to. Sure enough, the bug reproes with 27 characters (27 + 20 byte MAC + 1 byte padding length = 3 16-byte ciphertext blocks) and other multiples.

The Fix

So how do we fix this monstrosity? It's silly, but the fix is actually very simple. This is a CryptoAPI quirk/bug, and the fix is luckily at that layer, too. Thank god, because I would hate have to handle this in another layer and introduce a leak into the "decrypt" abstraction.

CryptDecrypt takes a parameter called Final which tells whether the data that's been given is the last block of a chunk of data. It does a bunch of extra things when it sees this flag, among them is verifying the padding within the function and failing if it doesn't look right.

Previously, we were passing TRUE for Final. What happens if we pass FALSE instead? Well, CryptDecrypt will treat the incoming data like any block--it decrypts it and passes back the data without looking for padding or anything. After all, the "final" block isn't magic; it's still the same length as any other ciphertext block, and the padding bytes just happen to be interpreted that way because we arbitrarily decide they should be.

So, uh, yeah... that's the fix. Change a TRUE to a FALSE. The gnarliest bugs sometimes have the simplest fixes, don't they? Well, I'm a little disappointed I didn't get to dive into some more debugging this time, but perhaps it's for the best, huh?

0 comments:

Post a Comment