20120828

Brain Flood About Initialization Vectors

I've reached what I would call basically "v1.0" of MungeTLS, so I'm going through the exercise of making a pass over the code with a fine-toothed comb, cleaning up lots of things, and documenting the heck out of everything. I do this in preparation for sending it to a few of my friends to code review. I expect to get a lot of scathing comments about my lack of exceptions and usage of gotos.

Anyway, while doing this pass over my code I noticed some strange stuff having to do with encryption and decryption of TLS 1.1 and 1.2 block ciphers. Let's do an overview of how they work, then the aberrant behavior I found, then a bunch of other investigation stuff to get to the bottom of it. It's been a crazy few days, let me tell you.

Block Ciphers and CBC

In TLS 1.1 and 1.2, the way that block ciphers work was changed. Oops, let me step back a second. There are two common types of "bulk" encryption, or symmetric key encryption, used with TLS: stream ciphers and block ciphers. Stream ciphers give you encrypted data byte-for-byte; RC4 is an example of a stream cipher. Block ciphers take a chunk of data and turn it into a series of fixed-size blocks of encrypted data, potentially with some padding at the end. AES is a very popular example of a block cipher.

Block ciphers have a weakness when you use the same key over and over again on data: encrypting identical plaintext blocks with the same key produces identical ciphertext blocks, so patterns in the plaintext show right through in the ciphertext. To combat this, a technique called an "initialization vector" is used to sort of seed the encryption. It's actually fairly simple in implementation: just XOR the IV into the plaintext right before encrypting. Moreover, we're using our block ciphers in "cipher-block-chaining" (CBC) mode, where each ciphertext block is used as the IV for the next block.

There are some particularly interesting things to glean from this diagram. Firstly, for encryption, you're actually transmitting a series of IVs + ciphertext blocks, since each ciphertext block is implicitly the IV for the next one. That's a pretty weird concept, and effectively also says that it's okay to transmit any IV in the clear.

For decryption, the most interesting thing to see is that if you start decrypting with the wrong IV, only the first block will be incorrect. That's also pretty wacky. I'm not a crypto guy, so it doesn't really make sense to me why this is accepted and okay. Apparently there are other block cipher "modes" than CBC that do a better job of mitigating these concerns, but CBC is maybe the most popular one. Go figure.

Another important thing to note is that the IV size is always the same as the block size, since each block is itself also an IV.

I ended up writing a bunch of sample code to figure out these weird gotchas before looking this stuff up and finding the super useful diagram above, which confirmed these oddities.

Now let's take a step back chronologically, to a time before I had read and figured out all that stuff about IVs, to a time when I noticed some weirdness in my code. The plot arc looks like this: we're going to look at what the RFC seems to be saying we should do. Then we'll look at what I should have implemented. Then we'll look at what I actually implemented. We'll see how I and the RFC are wrong. Finally, we'll have some unsettling conclusions.

The idea of "IVNext", insidiously suggested by the RFC

First, let's take a look at what TLS 1.1 tells us in the RFC. Actually, they don't really tell us much about how the IV should work. It's like they expect me to be some kind of security expert to implement this thing. God, what were they thinking?

// TLS 1.1
block-ciphered struct {
    opaque IV[CipherSpec.block_length];
    opaque content[TLSCompressed.length];
    opaque MAC[CipherSpec.hash_size];
    uint8 padding[GenericBlockCipher.padding_length];
    uint8 padding_length;
} GenericBlockCipher;

The IV is encrypted here, which is strange: you can't use an IV that's encrypted inside a record to decrypt that very record, since you'd need it before decrypting. So the IV contained within this record actually has to be the "next IV", the IV to be used for decrypting the next record. The initial IV is computed from the master secret identically on both client and server side. In the absence of explicit guidance, this is what a layman would infer from the RFC.

Let's translate that into some heavy pseudocode.

CurrentIV = IV0 (which you get from the key material generation)

function SendMessage(payload)
{
    message = CreateMessage()
    message.content = payload

    NextIV = GenerateNextIV() // IVs should be some random value
    message.IV = NextIV // send this to be used on the next ciphertext
    bytes = EncryptMessage_TLS_1_1(message, CurrentIV) // current IV set prior
    CurrentIV = NextIV

    SendBytes(bytes)
}

Great, that's what the RFC suggests I should implement. The problem--or rather, the problem that made me notice the oddities in the first place--is that I didn't implement this correctly, and my code still worked. It's basically a programmer's biggest nightmare.

What happens when we don't implement IVNext properly?

Spoiler alert: nothing. Here's the pseudocode for what I actually implemented.

CurrentIV = IV0 (which you get from the key material generation)

function SendMessage(payload)
{
    message = CreateMessage()
    message.content = payload

    message.IV = CurrentIV // oops! recipient will use the wrong IV!
    bytes = EncryptMessage_TLS_1_1(message, CurrentIV)
    CurrentIV = GenerateNextIV()

    SendBytes(bytes)
}

So I was sending the same IV within the message that I was using to encrypt the message in the first place. The recipient is going to use the wrong IV to decrypt the message, and-and-and everything will go crazy... right?!

That's what I had thought at first, before I read up on how IVs work. Remember the explanation from earlier: the IV really only affects the first block of decryption. The recipient will decrypt the whole ciphertext, and the IV field I sent at the front will come out garbled. But it doesn't care, since that IV value is never even used! Each ciphertext block serves as the IV for the decryption of the next block, and all of that chaining is done correctly.

I don't know if I'm explaining that right. Fundamentally, it would never make sense to send the IV encrypted as the first block of ciphertext. For a security expert, this is probably such a patently obvious design flaw that they didn't even need to issue errata about the bug in the GenericBlockCipher struct in TLS 1.1. It's actually fixed in TLS 1.2, though I incorrectly claimed that it seemed buggy in a previous post.

Fixing the IV code

Well, the first thing we're going to do when fixing it is stop calling anything "IV Next". It's really the IV for this record, not the next one. Here is the correct code.

function SendMessage(payload)
{
    message = CreateMessage()
    message.content = payload
    // also add padding, yeah

    message.IV = GenerateIV() // fresh random IV for this record
    bytes = EncryptMessage_TLS_1_1(message.content, message.IV)

    // IV in cleartext, bytes are encrypted
    SendBytes(message.IV + bytes)
}

Wow, that's much simpler. It's weird that the IV that's generated during key material calculation isn't even used. I think it's used for other cipher types that actually do need it. Anyway, this actually works.

Some final thoughts

I already explained above why it doesn't matter whether we are encrypting the entire payload (IV and content) or just starting from the content--because each ciphertext block depends on the previous block as an IV. That block-sized piece of data immediately preceding the payload is an IV for the encrypted payload no matter how it got there, no matter whether it's plaintext or encrypted. While I understand why it works, it still bugs me that it works both ways.

And I still don't get why it's even safe to send IVs in the clear like this. In TLS 1.0, the IV for a record was the last block of the previous ciphertext, as though all ciphertext was just chained together. It was changed in TLS 1.1 to be included adjacent to the payload it's encrypting, but it's still in the clear. I don't get how it's fundamentally any better.

Any security experts around wanna bust out the math and show me what's up?

20120827

Shockingly Meta Asymmetric Encryption

A long time ago I was figuring out how to get public/private key encryption working on Windows with CryptAPI. I ultimately did, but I wasn't able to get all four crypto operations working. This is a short post to explain how a little epiphany blew my mind.

The four operations I had in mind were:

  • Encrypt with private key
  • Decrypt with private key
  • Encrypt with public key
  • Decrypt with public key

When I wrote my sample code to try all four of these combinations, I discovered that the last of these inexplicably didn't work. I chalked it up to not finagling the right, arcane set of parameters to some CryptAPI functions, but I think it's more fundamental than that.

Actually, I think that even in theory, you need only the first three for any security protocol. Let's look at the scenarios that these enable.

Alice wants to send to Bob securely. Alice has two options: encrypt with her private key, or encrypt with Bob's public key. Bob has two options: decrypt with her public key, or decrypt with his private key (respectively).

Likewise, if Bob wants to send back to Alice securely, we just reverse the roles. Bob can always encrypt with Alice's public key rather than force Alice to decrypt with his public key. TLS is architected this way, and any protocol that relies on public/private key encryption can be, too.
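Here's a toy sketch of the arithmetic behind this, using textbook RSA with tiny primes (no padding, nothing real-world about the key sizes--the numbers are the classic p=61, q=53 example). Mathematically all four operations exist; they're the same modular exponentiation with the exponents swapped, and the last two are just signing and verification. The observation above is that a protocol can always be arranged to need only three of them.

```cpp
#include <cassert>
#include <cstdint>

// textbook RSA parameters, illustration only
const uint64_t N = 3233; // 61 * 53
const uint64_t E = 17;   // public exponent
const uint64_t D = 2753; // private exponent: 17 * 2753 = 46801, which is 1 mod 3120

// square-and-multiply modular exponentiation
uint64_t ModPow(uint64_t base, uint64_t exp, uint64_t mod)
{
    uint64_t result = 1;
    base %= mod;
    while (exp > 0)
    {
        if (exp & 1)
        {
            result = (result * base) % mod;
        }
        base = (base * base) % mod;
        exp >>= 1;
    }
    return result;
}

// the four operations -- each is the same ModPow, exponents swapped
uint64_t EncryptWithPublic(uint64_t m)  { return ModPow(m, E, N); } // confidentiality
uint64_t DecryptWithPrivate(uint64_t c) { return ModPow(c, D, N); }
uint64_t EncryptWithPrivate(uint64_t m) { return ModPow(m, D, N); } // i.e. signing
uint64_t DecryptWithPublic(uint64_t c)  { return ModPow(c, E, N); } // i.e. verification
```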

This hit me like a Mack truck the other day so hard that I just had to write it down. Is your mind blown, too?

20120810

This Contract Sucks. Renegotiation!

Hallo there! I'm back with some more codey writings! I've been busy at work on TLS renegotiation, which is a pretty tricky thing, but an important part of the spec. I'll give some very brief background on what it is, then talk about the kind of mess-ups I had to deal with while implementing it. Sounds good? Sounds good.

I never actually talked about the TLS handshake process, arguably the most interesting feature of the TLS spec, which also takes up like 80% of the RFC. It's the process by which the client and server exchange secrets to form an encrypted connection. This is also referred to as "negotiating" the security parameters of the connection. Then we come to renegotiation, which is starting another handshake on top of an already-secured connection.

Why would you want to do this, though? Er, well, I don't really know, exactly. I guess in IE you can go to an un-authenticated page on some web site, then navigate to a part of the site that requires the client to prove its identity using a client certificate--relatively uncommon behavior in TLS clients. This calls for a renegotiation, since the client has to offer more information about itself than it does in a regular TLS session. Or in the unlikely event that the sequence number involved in a connection is going to wrap, the connection must be renegotiated. In the RFC's own words:

Sequence numbers are of type uint64 and may not exceed 2^64-1. Sequence numbers do not wrap. If a TLS implementation would need to wrap a sequence number, it must renegotiate instead.

Renegotiation can be initiated in two ways, and it's pretty simple. The client can start a renegotiation at any time after a handshake is completed by sending the server another ClientHello message, the same as when it started the first handshake. The server at any time after a handshake is completed can send the client a HelloRequest message, which--intuitively--asks the client to send a ClientHello message as soon as possible.
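For concreteness, here's a sketch of the bytes in that HelloRequest. It's about the smallest handshake message there is: type 0 with an empty body, wrapped in an ordinary handshake-type record. The version bytes below assume TLS 1.0 (3,1) just for illustration, and the function name is mine, not MungeTLS's.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// build a HelloRequest record: a 4-byte handshake message inside a
// 5-byte record header. on an already-secured connection this whole
// record would then get MACed and encrypted like any other.
std::vector<uint8_t> BuildHelloRequestRecord()
{
    std::vector<uint8_t> handshake = {
        0x00,             // HandshakeType: hello_request(0)
        0x00, 0x00, 0x00  // 24-bit body length: 0 (no body at all)
    };

    std::vector<uint8_t> record = {
        0x16,       // ContentType: handshake(22)
        0x03, 0x01, // record-layer version (TLS 1.0 assumed here)
        0x00, 0x04  // 16-bit fragment length: the 4 handshake bytes
    };
    record.insert(record.end(), handshake.begin(), handshake.end());
    return record;
}
```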

It's of interest that these renegotiation messages are already protected by the encryption currently being used. This fundamental point caused me the majority of my work in implementing the feature. Speaking of... let's start getting to that, huh?

Gotta keep [connection states] separated

My code originally made some pretty large, sweeping simplifying assumptions to get basic scenarios working more quickly. I actually kind of foresaw that these issues would crop up eventually, but kept putting off dealing with them. Implementing renegotiation was the forcing function that made me deal with them properly.

Concretely, I'm talking about separate states for inbound and outbound traffic, or "read state" and "write state". The TLS handshake is actually three-legged, so this state separation is pretty important. You can kind of limp by without dealing with it properly, but when it comes to renegotiation it becomes unavoidable.

The three "legs" as you progress through the handshake towards an encrypted connection are (from the server's point of view):

  1. Both read (client-to-server) and write (server-to-client) state are unencrypted.
  2. Client has sent a ChangeCipherSpec message, so read state is encrypted, but write state is not.
  3. Server sends its ChangeCipherSpec message, and now both read and write states are encrypted. Connection is now in steady-state, ready for application data.

The crucial shortcoming in that list above is that I refer to things as being unencrypted or encrypted, which is an insufficient description. What I actually mean is "encrypted using the previous connection state's parameters", but dang if that isn't way more syllables. Since the initial cipher suite is TLS_NULL_WITH_NULL_NULL (no key exchange, bulk cipher, or hash algorithm), it's easy to think of it as "unencrypted" vs. "encrypted", and gloss over some important details.

The RFC even talks about this state separation pretty explicitly:

Reception of [a ChangeCipherSpec] message causes the receiver to instruct the record layer to immediately copy the read pending state into the read current state.

In my early versions, I had this huge ConnectionParameters class, which contained all of the key material and so on. A primary piece of data that needed to be separated was the cipher suite. If we are renegotiating a new cipher suite to be used, then between legs 2 and 3, the client is sending with the new cipher suite, but the server is sending with the old one.

To that end, I identified the pieces of data that needed to be split and put them in a separate EndpointParameters class. Here are the members it ended up having:

std::shared_ptr<Hasher> m_spHasher;
std::shared_ptr<SymmetricCipherer> m_spSymCipherer;
MT_ProtocolVersion::MTPV_Version m_eVersion;
MT_CipherSuite m_cipherSuite;
ByteVector m_vbKey;
ByteVector m_vbMACKey;
ByteVector m_vbIV;
MT_UINT64 m_seqNum;

Then at the key moment--when we get a ChangeCipherSpec--it's very easy for me to copy over the state for just that read or write direction.

else if (*record.ContentType()->Type() == MT_ContentType::MTCT_Type_ChangeCipherSpec)
{
    *CurrConn()->ReadParams() = *NextConn()->ReadParams();

    /*
    ** after copying the next endpoint state, which has not been touched,
    ** its sequence number should already be 0 without having to reset it
    */
    assert(*CurrConn()->ReadParams()->SequenceNumber() == 0);
}

The assert that I showed here is also pretty interesting. Previously I was manually setting the read sequence number to 0 when receiving this message, but now, as the comment says, I shouldn't have to.

Likewise, when the server sends its own ChangeCipherSpec, I can easily copy the write state:

hr = EnqueueMessage(spChangeCipherSpec);
if (hr != S_OK)
{
    goto error;
}

*CurrConn()->WriteParams() = *NextConn()->WriteParams();

hr = NextConn()->CopyCommonParamsTo(CurrConn());
if (hr != S_OK)
{
    goto error;
}

/*
** newly copied new connection state should have its initial value of
** 0 for sequence number, since it hasn't been touched yet
*/
assert(*CurrConn()->WriteParams()->SequenceNumber() == 0);

Since this is the last leg of the handshake, I also copy over all the other non-endpoint-specific parameters, thus making the current connection the fully active one.

Quick note: renegotiation protection

Oh yeah, there's this thing called renegotiation protection that protects a renegotiation handshake from having some extra records inserted into the middle of it by an attacker. Basically, it requires adding extra data to the ClientHello and ServerHello to prove that the renegotiation is related in some way to the previous handshake. Specifically, it's data from the Finished messages before--the final messages sent in the handshake.

This wasn't too hard to implement. I just had to keep track of the "verify data" from the first handshake and insert it into the next handshake. Once I had the architecture of keeping track of an old and a new connection state, it was easy.

What does it mean to be encrypted?

In my bulleted list earlier I talked about the inadequacy of describing the connection as being "encrypted" or not. Turns out my code was doing this, too. I had a single flag in the ConnectionParameters class called bool m_fIsSecure. In a couple places in the code I'd do things like: "if the connection is currently secure, then parse this message as a TLSCiphertext message and try to decrypt it; otherwise, parse it as a TLSPlaintext."

As I tried to describe earlier, that's nowhere near good enough for renegotiation. Now you have to start concerning yourself with which cipher suite to decrypt the message with--RC4? AES128? etc. The read/write state separation helped here, but I was able to clean it up even more by using some provisions given by the RFC.

Namely, recall that the initial cipher suite is TLS_NULL_WITH_NULL_NULL, a.k.a. unencrypted. What if I started out my code with that cipher suite and tried to parse everything as a TLSCiphertext, albeit sometimes with null encryption? Then it will either decrypt or not decrypt as necessary, depending on the current connection state.

As a freebie, I get an implementation of null encryption proper, though not even OpenSSL's debugging client allows it. Maybe I have to recompile the thing with some dumb flag?

Maybe an IE bug? Different cipher suite during renegotiation

The whole point of this project is to test TLS client implementations, so I've been testing against both the OpenSSL command line client and IE throughout. IE currently has the only implementation of TLS 1.2 that I could find, so it was a necessity. I hear OpenSSL might have it now, but I need to update and check.

Anyway, I was having a heck of a time testing with IE. I ran into two problems. Firstly, IE doesn't seem to support choosing a different cipher suite during renegotiation. That's maybe understandable for security reasons, though it seems odd not to allow "upgrading" to a more secure cipher suite. Whatever.

My problem with it is that the ClientHello advertises all of IE's usual cipher suites, but IE resets the connection if you choose anything other than the same one you were using before. Is this some weird defense-in-depth, not to broadcast what the previous connection state is using? That doesn't quite make sense either. I dunno.

Maybe an IE bug? Renegotiation after application data

I found another weird IE problem: it doesn't like a HelloRequest directly after an ApplicationData message. It's not closing the connection at that point, since I've specified to keep the connection alive, but I guess internally the fetching of a web page kind of halts the processing of more messages that arrive after the web page is done loading.

If I delay the HelloRequest until I get a new ApplicationData--presumably the start of a new HTTP request--it is totally happy with it.

Chrome didn't seem to have this problem, nor did the command line OpenSSL client.

Pretty sure it's a Chrome bug: mismatched version

It's interesting: Chrome (latest Chromium, even) doesn't support TLS 1.2. It does support TLS 1.1, however, which is somewhere between 1.0 and 1.2 in terms of features. Wow, how insightful. To sum it up in a sentence, the key differences are that TLS 1.1 still uses the same hybrid SHA1/MD5 pseudorandom function that TLS 1.0 does, but uses TLS 1.2's explicit per-record IVs in the cipher fragment. Anyway, to test with Chrome I had to implement TLS 1.1. It wasn't too bad, since I already had working TLS 1.0 and 1.2 implementations.

HRESULT
ComputePRF_TLS10(
    Hasher* pHasher,
    const ByteVector* pvbSecret,
    PCSTR szLabel,
    const ByteVector* pvbSeed,
    size_t cbLengthDesired,
    ByteVector* pvbPRF);

// same PRF used for both 1.0 and 1.1
auto ComputePRF_TLS11 = ComputePRF_TLS10;

// same block structure format between 1.1 and 1.2
typedef MT_GenericBlockCipher_TLS11 MT_GenericBlockCipher_TLS12;

So having all that working, I found that my renegotiations were failing! Even with same-cipher suite choice! Weird stuff, man. I got to debugging. I noticed from my logs that the MAC attached to the ClientHello in the renegotiation was not computing correctly.

received MAC:
17 18 53 ED BD 10 0D 77 FB 94 A4 97 94 7D 2F BA 69 F2 96 F9 

computed MAC:
4A 1B F0 03 2A C0 2C 72 B7 66 DA 68 B3 5C CE D9 F3 F5 52 2B 

tlsciphertext failed security check: MT_E_BAD_RECORD_MAC

I implemented one of the real killer features of MungeTLS to help debug this: it logs all the traffic unencrypted even if it sent or received it encrypted. This gets exported to a Netmon capture for easy viewing. And what did I spy in my Netmon capture that looked amiss?

InnerTLS: TLS Rec Layer-1 HandShake: Client Hello.
- TlsRecordLayer: TLS Rec Layer-1 HandShake:
   ContentType: HandShake:
 + Version: TLS 1.0
   Length: 175 (0xAF)
 - SSLHandshake: SSL HandShake ClientHello(0x01)
    HandShakeType: ClientHello(0x01)
    Length: 171 (0xAB)
  - ClientHello: TLS 1.1
   + Version: TLS 1.1
   + RandomBytes: 
     SessionIDLength: 0 (0x0)
     CipherSuitesLength: 72
   + TLSCipherSuites: TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA    { 0xC0,0x0A }
...
...
     CompressionMethodsLength: 1 (0x1)
     CompressionMethods: 0 (0x0)
     ExtensionsLength: 58 (0x3A)
   + ClientHelloExtension: Server Name(0x0000)
   + ClientHelloExtension: Renegotiation Info(0xFF01)
   + ClientHelloExtension: Elliptic Curves(0x000A)
   + ClientHelloExtension: EC Point Formats(0x000B)
   + ClientHelloExtension: SessionTicket TLS(0x0023)
   + ClientHelloExtension: Unknown Extension Type
   + ClientHelloExtension: Status Request(0x0005)

Whoops, why doesn't this version match!? I don't know what kinds of tricks they're trying to pull here... This seemed like a strange difference, so I wondered if they were using the wrong version in their MAC computation. Recall from a previous post the MAC computation:

The MAC is generated as:

HMAC_hash(MAC_write_secret, seq_num + TLSCompressed.type +
             TLSCompressed.version + TLSCompressed.length +
             TLSCompressed.fragment);

It clearly says to get the version from TLSCompressed (a.k.a. TLSCiphertext for our purposes). The hash computation that failed is incorrectly using the ClientHello.version instead. I added a callback in the code to allow the application to reconcile this version difference if it's encountered. Once I had this fixed, the handshake went through smoothly.

HRESULT
DummyServer::OnReconcileSecurityVersion(
    MT_TLSCiphertext* pCiphertext,
    MT_ProtocolVersion::MTPV_Version connVersion,
    MT_ProtocolVersion::MTPV_Version recordVersion,
    MT_ProtocolVersion::MTPV_Version* pOverrideVersion)
{
    UNREFERENCED_PARAMETER(pCiphertext);

    // detecting chrome bug and working around. could also have sniffed UA str
    if (connVersion == MT_ProtocolVersion::MTPV_TLS11 &&
        recordVersion == MT_ProtocolVersion::MTPV_TLS10)
    {
        *pOverrideVersion = connVersion;
        return MT_S_LISTENER_HANDLED;
    }

    return MT_S_LISTENER_IGNORED;
} // end function OnReconcileSecurityVersion
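To make the version's role concrete, here's a sketch of assembling the buffer that gets fed to HMAC for a record, per the RFC formula quoted above (the function name and layout helpers are mine, not MungeTLS's). The version bytes in this buffer must be the record layer's, which is exactly where Chrome's TLS 1.0/1.1 mismatch caused the bad MAC.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// assemble seq_num + type + version + length + fragment; this buffer,
// along with the MAC write key, is what goes into HMAC_hash
std::vector<uint8_t> BuildMACInput(
    uint64_t seqNum,
    uint8_t contentType,
    uint8_t versionMajor, // record-layer version, e.g. (3,2) for TLS 1.1
    uint8_t versionMinor,
    const std::vector<uint8_t>& fragment)
{
    std::vector<uint8_t> input;

    // 64-bit sequence number, big-endian
    for (int i = 7; i >= 0; i--)
    {
        input.push_back(static_cast<uint8_t>(seqNum >> (i * 8)));
    }

    input.push_back(contentType);
    input.push_back(versionMajor); // <-- TLSCompressed.version, NOT
    input.push_back(versionMinor); //     ClientHello.version

    // 16-bit fragment length, big-endian
    input.push_back(static_cast<uint8_t>(fragment.size() >> 8));
    input.push_back(static_cast<uint8_t>(fragment.size() & 0xFF));

    input.insert(input.end(), fragment.begin(), fragment.end());
    return input;
}
```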

It pleases me greatly to see my program already finding bugs in a TLS protocol implementation. This is exactly the purpose I created it for, and I haven't even gotten started with any really nasty tests!

The looming shadow of things to come

I feel a lot better about a number of parts of the code now. It removes a lot of assumptions and models the TLS spec more closely. However, the next (and final?) big feature I have in mind is to make a client implementation to go along with the server implementation I currently have, and that's going to take a lot of work along the lines of sorting out "read" and "write" states, as compared to "client" and "server" endpoints. I can tell that's going to be a mess. Look forward to hearing about it later!