20120810

This Contract Sucks. Renegotiation!

Hallo there! I'm back with some more codey writings! I've been busy at work on TLS renegotiation, which is a pretty tricky thing, but an important part of the spec. I'll give some very brief background on what it is, then talk about the kind of mess-ups I had to deal with while implementing it. Sounds good? Sounds good.

I never actually talked about the TLS handshake process, arguably the most interesting feature of the TLS spec, which also takes up like 80% of the RFC. It's the process by which the client and server exchange secrets to form an encrypted connection. This is also referred to as "negotiating" the security parameters of the connection. Then we come to renegotiation, which is starting another handshake on top of an already-secured connection.

Why would you want to do this, though? Er, well, I don't really know, exactly. I guess in IE you can go to an un-authenticated page on some web site, then navigate to a part of the site that requires the client to prove its identity using a client certificate--relatively uncommon behavior in TLS clients. This calls for a renegotiation, since the client has to offer more information about itself than it does in a regular TLS session. Or in the unlikely event that the sequence number involved in a connection is going to wrap, the connection must be renegotiated. In the RFC's own words:

Sequence numbers are of type uint64 and may not exceed 2^64-1. Sequence numbers do not wrap. If a TLS implementation would need to wrap a sequence number, it must renegotiate instead.
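
To make that rule concrete, here's a minimal sketch of a sequence number that refuses to wrap. The helper name TryAdvanceSequenceNumber is made up for illustration and isn't part of my actual code:

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper (name invented for this sketch): advances a TLS record
// sequence number without ever wrapping. Returns false when the caller would
// have to renegotiate instead of sending another record.
bool TryAdvanceSequenceNumber(std::uint64_t* pSeqNum)
{
    assert(pSeqNum != nullptr);

    if (*pSeqNum == std::numeric_limits<std::uint64_t>::max())
    {
        // 2^64-1 is the last usable value; one more increment would wrap to 0
        return false;
    }

    ++(*pSeqNum);
    return true;
}
```

A caller that gets false back would kick off a renegotiation, which resets the sequence number as part of switching to the new connection state.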

Renegotiation can be initiated in two ways, and it's pretty simple. The client can start a renegotiation at any time after a handshake is completed by sending the server another ClientHello message, the same as when it started the first handshake. The server at any time after a handshake is completed can send the client a HelloRequest message, which--intuitively--asks the client to send a ClientHello message as soon as possible.
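
As a rough sketch of the two triggers: a HelloRequest is about the simplest handshake message there is, just the handshake header (1-byte type, 3-byte length) with an empty body. The helper names below are invented for illustration only:

```cpp
#include <cstdint>
#include <vector>

// Handshake message types from the TLS RFC (only the two relevant here).
enum HandshakeType : std::uint8_t
{
    HandshakeType_HelloRequest = 0,
    HandshakeType_ClientHello = 1,
};

// Hypothetical helper: serializes a HelloRequest handshake message, which is
// a 1-byte type followed by a 3-byte big-endian body length of zero.
std::vector<std::uint8_t> SerializeHelloRequest()
{
    std::vector<std::uint8_t> message;
    message.push_back(static_cast<std::uint8_t>(HandshakeType_HelloRequest));
    message.push_back(0x00);
    message.push_back(0x00);
    message.push_back(0x00);
    return message;
}

// Hypothetical helper: once a handshake has completed, either of these
// handshake types showing up means a renegotiation is being kicked off.
bool StartsRenegotiation(bool fHandshakeComplete, std::uint8_t handshakeType)
{
    return fHandshakeComplete &&
           (handshakeType == HandshakeType_HelloRequest ||
            handshakeType == HandshakeType_ClientHello);
}
```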

It's of interest that these renegotiation messages are already protected by the encryption currently being used. This fundamental point caused me the majority of my work in implementing the feature. Speaking of... let's start getting to that, huh?

Gotta keep [connection states] separated

My code originally made some pretty large, sweeping simplifying assumptions to get basic scenarios working more quickly. I actually kind of foresaw that these issues would crop up eventually, but kept putting off dealing with them. Implementing renegotiation was the forcing function that made me deal with them properly.

Concretely, I'm talking about separate states for inbound and outbound traffic, or "read state" and "write state". The TLS connection is actually three-legged, so this state separation is pretty important. You can kind of limp by without dealing with it properly, but when it comes to renegotiation it becomes more important.

The three "legs" as you progress through the handshake towards an encrypted connection are (from the server's point of view):

  1. Both read (client-to-server) and write (server-to-client) state are unencrypted.
  2. Client has sent a ChangeCipherSpec message, so read state is encrypted, but write state is not.
  3. Server sends its ChangeCipherSpec message, and now both read and write states are encrypted. Connection is now in steady-state, ready for application data.

The crucial shortcoming in that list above is that I refer to things being unencrypted or encrypted, which is an insufficient description. What I actually mean is "encrypted using the previous connection state's parameters", but dang if that isn't way more syllables. Since the initial cipher suite is TLS_NULL_WITH_NULL_NULL (no key exchange, bulk cipher, or hash algorithm), it's easy to think of it as "unencrypted" vs. "encrypted", and gloss over some important details.

The RFC even talks about this state separation pretty explicitly:

Reception of [a ChangeCipherSpec] message causes the receiver to instruct the record layer to immediately copy the read pending state into the read current state.

In my early versions, I had this huge ConnectionParameters class, which contained all of the key material and so on. A primary piece of data that needed to be separated was the cipher suite. If we are renegotiating a new cipher suite to be used, then between legs 2 and 3, the client is sending with the new cipher suite, but the server is sending with the old one.
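
Here's a stripped-down model of that in-between moment, just to show the shape of it. The DirectionState and ConnectionStates types are hypothetical stand-ins with only a cipher suite name in them; the real per-direction state carries keys, IVs, sequence numbers, and so on:

```cpp
#include <string>

// Hypothetical, minimal stand-in for one direction's state.
struct DirectionState
{
    std::string cipherSuite;
};

// Hypothetical model of the four states the record layer juggles.
struct ConnectionStates
{
    DirectionState currentRead;
    DirectionState currentWrite;
    DirectionState pendingRead;
    DirectionState pendingWrite;

    // leg 1 -> leg 2: the client sent ChangeCipherSpec, so only the read
    // direction switches to the newly negotiated parameters
    void OnChangeCipherSpecReceived() { currentRead = pendingRead; }

    // leg 2 -> leg 3: the server sent its own ChangeCipherSpec, so the
    // write direction catches up
    void OnChangeCipherSpecSent() { currentWrite = pendingWrite; }
};
```

Between the two calls, currentRead and currentWrite hold different cipher suites, which is exactly the situation a single shared cipher suite field can't represent.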

To that end, I identified the pieces of data that needed to be split and put them in a separate EndpointParameters class. Here are the members it ended up having:

std::shared_ptr<Hasher> m_spHasher;
std::shared_ptr<SymmetricCipherer> m_spSymCipherer;
MT_ProtocolVersion::MTPV_Version m_eVersion;
MT_CipherSuite m_cipherSuite;
ByteVector m_vbKey;
ByteVector m_vbMACKey;
ByteVector m_vbIV;
MT_UINT64 m_seqNum;

Then at the key moment, when we get a ChangeCipherSpec, it's very easy for me to copy over the state for just that read or write direction.

else if (*record.ContentType()->Type() == MT_ContentType::MTCT_Type_ChangeCipherSpec)
{
    *CurrConn()->ReadParams() = *NextConn()->ReadParams();

    /*
    ** after copying the next endpoint state, which has not been touched,
    ** its sequence number should already be 0 without having to reset it
    */
    assert(*CurrConn()->ReadParams()->SequenceNumber() == 0);
}

The assert that I showed here is also pretty interesting. Previously I was manually setting the read sequence number to 0 when receiving this message, but now, as the comment says, I shouldn't have to.

Likewise, when the server sends its own ChangeCipherSpec, I can easily copy the write state:

hr = EnqueueMessage(spChangeCipherSpec);
if (hr != S_OK)
{
    goto error;
}

*CurrConn()->WriteParams() = *NextConn()->WriteParams();

hr = NextConn()->CopyCommonParamsTo(CurrConn());
if (hr != S_OK)
{
    goto error;
}

/*
** newly copied new connection state should have its initial value of
** 0 for sequence number, since it hasn't been touched yet
*/
assert(*CurrConn()->WriteParams()->SequenceNumber() == 0);

Since this is the last leg of the handshake, I also copy over all the other non-endpoint-specific parameters, thus making the current connection the fully active one.

Quick note: renegotiation protection

Oh yeah, there's this thing called renegotiation protection that protects a renegotiation handshake from having some extra records inserted into the middle of it by an attacker. Basically, it requires adding extra data to the ClientHello and ServerHello to prove that the renegotiation is related in some way to the previous handshake. Specifically, it's data from the previous handshake's Finished messages--the final messages sent in a handshake.

This wasn't too hard to implement. I just had to keep track of the "verify data" from the first handshake and insert it into the next handshake. Once I had the architecture of keeping track of an old and a new connection state, it was easy.
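
For the curious, here's a sketch of what goes into that extra data, based on my reading of the renegotiation_info extension (RFC 5746): the ClientHello carries the previous client verify data, and the ServerHello carries the client verify data followed by the server verify data, all behind a 1-byte length. BuildRenegotiationInfo is a made-up name for this example, not the function in my code:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

typedef std::vector<std::uint8_t> Bytes;

// Hypothetical helper: builds the body of the renegotiation_info extension,
// which is a single byte vector with a 1-byte length prefix. On an initial
// handshake both verify data inputs are empty, giving a lone 0x00 byte.
Bytes BuildRenegotiationInfo(
    const Bytes& prevClientVerifyData,
    const Bytes& prevServerVerifyData,
    bool fForServerHello)
{
    Bytes payload(prevClientVerifyData);
    if (fForServerHello)
    {
        // the server echoes the client's data and appends its own
        payload.insert(
            payload.end(),
            prevServerVerifyData.begin(),
            prevServerVerifyData.end());
    }

    // the length prefix is a single byte, so the payload must fit in it
    assert(payload.size() <= 255);

    Bytes body;
    body.push_back(static_cast<std::uint8_t>(payload.size()));
    body.insert(body.end(), payload.begin(), payload.end());
    return body;
}
```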

What does it mean to be encrypted?

In my bulleted list earlier I talked about the inadequacy of describing the connection as being "encrypted" or not. Turns out my code was doing this, too. I had a single flag in the ConnectionParameters class called bool m_fIsSecure. In a couple places in the code I'd do things like: "if the connection is currently secure, then parse this message as a TLSCiphertext message and try to decrypt it; otherwise, parse it as a TLSPlaintext."

As I tried to describe earlier, that's nowhere near good enough for renegotiation. Now you have to start concerning yourself with which cipher suite to decrypt the message with: RC4, AES128, and so on. The read/write state separation helped here, but I was able to clean it up even more by using some provisions given by the RFC.

Namely, recall that the initial cipher suite is TLS_NULL_WITH_NULL_NULL, a.k.a. unencrypted. What if I started out my code with that cipher suite and tried to parse everything as a TLSCiphertext, albeit sometimes with null encryption? Then it will either decrypt or not decrypt as necessary, depending on the current connection state.
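
The idea boils down to something like the sketch below: one decryption path for every record, where the null cipher is just the identity. The BulkCipher enum and DecryptFragment function are hypothetical names for this example, and the real ciphers are deliberately left out:

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

typedef std::vector<std::uint8_t> Bytes;

// Hypothetical bulk cipher tag; real code would derive this from the cipher
// suite in the current read state.
enum BulkCipher
{
    BulkCipher_Null,
    BulkCipher_RC4_128,
    BulkCipher_AES_128_CBC,
};

// Hypothetical helper: every record goes through this one function, so there
// is no separate "is the connection secure?" branch anywhere.
Bytes DecryptFragment(BulkCipher cipher, const Bytes& fragment)
{
    switch (cipher)
    {
    case BulkCipher_Null:
        // TLS_NULL_WITH_NULL_NULL: "decrypting" hands back the same bytes
        return fragment;

    default:
        // actual RC4/AES decryption is out of scope for this sketch
        throw std::runtime_error("cipher not implemented in this sketch");
    }
}
```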

As a freebie, I get an implementation of null encryption proper, though not even OpenSSL's debugging client allows it. Maybe I have to recompile the thing with some dumb flag?

Maybe an IE bug? Different cipher suite during renegotiation

The whole point of this project is to test TLS client implementations, so I've been testing against both the OpenSSL command line client and IE throughout. As of now, IE has the only implementation of TLS 1.2 that I could find, so it was a necessity. I hear OpenSSL might have it now, but I need to update and check.

Anyway, I was having a heck of a time testing with IE. I ran into two problems. Firstly, IE doesn't seem to support choosing a different cipher suite during renegotiation. That's maybe understandable for security reasons, though it seems odd not to allow "upgrading" to a more secure cipher suite. Whatever.

My problem with it is that the ClientHello advertises all of IE's usual cipher suites, but IE resets the connection if you choose anything other than the same one you were using before. Is this some weird defense-in-depth, not to broadcast what the previous connection state is using? That doesn't quite make sense either. I dunno.

Maybe an IE bug? Renegotiation after application data

I found another weird IE problem: it doesn't like a HelloRequest directly after an ApplicationData message. It's not closing the connection at that point, since I've specified to keep the connection alive, but I guess internally the fetching of a web page kind of halts the processing of more messages that arrive after the web page is done loading.

If I delay the HelloRequest until I get a new ApplicationData--presumably the start of a new HTTP request--it is totally happy with it.

Chrome didn't seem to have this problem, nor did the command line OpenSSL client.

Pretty sure it's a Chrome bug: mismatched version

It's interesting: Chrome (latest Chromium, even) doesn't support TLS 1.2. It does support TLS 1.1, however, which is somewhere between 1.0 and 1.2 in terms of similarities. Wow, how insightful. To sum it up in a sentence, the key differences are that TLS 1.1 still uses the same hybrid SHA1/MD5 pseudorandom function that TLS 1.0 does, but uses TLS 1.2's incorporated IVs in the cipher fragment. Anyway, to test with Chrome I had to implement TLS 1.1. It wasn't too bad, since I already have working TLS 1.0 and 1.2 implementations.

HRESULT
ComputePRF_TLS10(
    Hasher* pHasher,
    const ByteVector* pvbSecret,
    PCSTR szLabel,
    const ByteVector* pvbSeed,
    size_t cbLengthDesired,
    ByteVector* pvbPRF);

// same PRF used for both 1.0 and 1.1
auto ComputePRF_TLS11 = ComputePRF_TLS10;

// same block structure format between 1.1 and 1.2
typedef MT_GenericBlockCipher_TLS11 MT_GenericBlockCipher_TLS12;
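
To illustrate the IV difference mentioned above: in TLS 1.1 and 1.2 the first block's worth of bytes in a CBC fragment is the per-record IV, whereas in TLS 1.0 the IV isn't in the record at all. This is only a sketch; TlsVersion, CbcFragmentParts, and SplitCbcFragment are names invented for the example:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

typedef std::vector<std::uint8_t> Bytes;

enum TlsVersion { TlsVersion_10, TlsVersion_11, TlsVersion_12 };

struct CbcFragmentParts
{
    Bytes iv;          // left empty for TLS 1.0, where the IV is implicit
    Bytes ciphertext;  // what actually gets fed to the block cipher
};

// Hypothetical helper: splits a CBC record fragment into IV and ciphertext.
// Returns false if a TLS 1.1/1.2 fragment is too short to hold an IV.
bool SplitCbcFragment(
    TlsVersion version,
    const Bytes& fragment,
    std::size_t cbBlock,
    CbcFragmentParts* pParts)
{
    assert(pParts != nullptr);
    pParts->iv.clear();

    if (version == TlsVersion_10)
    {
        // no IV on the wire; the whole fragment is ciphertext
        pParts->ciphertext = fragment;
        return true;
    }

    if (fragment.size() < cbBlock)
    {
        return false;
    }

    const std::ptrdiff_t ivLen = static_cast<std::ptrdiff_t>(cbBlock);
    pParts->iv.assign(fragment.begin(), fragment.begin() + ivLen);
    pParts->ciphertext.assign(fragment.begin() + ivLen, fragment.end());
    return true;
}
```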

So having all that working, I found that my renegotiations were failing! Even with same-cipher suite choice! Weird stuff, man. I got to debugging. I noticed from my logs that the MAC attached to the ClientHello in the renegotiation was not computing correctly.

received MAC:
17 18 53 ED BD 10 0D 77 FB 94 A4 97 94 7D 2F BA 69 F2 96 F9 

computed MAC:
4A 1B F0 03 2A C0 2C 72 B7 66 DA 68 B3 5C CE D9 F3 F5 52 2B 

tlsciphertext failed security check: MT_E_BAD_RECORD_MAC

I implemented one of the real killer features of MungeTLS to help debug this: it logs all the traffic unencrypted even if it sent or received it encrypted. This gets exported to a Netmon capture for easy viewing. And what did I spy in my Netmon capture that looked amiss?

InnerTLS: TLS Rec Layer-1 HandShake: Client Hello.
- TlsRecordLayer: TLS Rec Layer-1 HandShake:
   ContentType: HandShake:
 + Version: TLS 1.0
   Length: 175 (0xAF)
 - SSLHandshake: SSL HandShake ClientHello(0x01)
    HandShakeType: ClientHello(0x01)
    Length: 171 (0xAB)
  - ClientHello: TLS 1.1
   + Version: TLS 1.1
   + RandomBytes: 
     SessionIDLength: 0 (0x0)
     CipherSuitesLength: 72
   + TLSCipherSuites: TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA    { 0xC0,0x0A }
...
...
     CompressionMethodsLength: 1 (0x1)
     CompressionMethods: 0 (0x0)
     ExtensionsLength: 58 (0x3A)
   + ClientHelloExtension: Server Name(0x0000)
   + ClientHelloExtension: Renegotiation Info(0xFF01)
   + ClientHelloExtension: Elliptic Curves(0x000A)
   + ClientHelloExtension: EC Point Formats(0x000B)
   + ClientHelloExtension: SessionTicket TLS(0x0023)
   + ClientHelloExtension: Unknown Extension Type
   + ClientHelloExtension: Status Request(0x0005)

Whoops, why don't these versions match!? The record layer says TLS 1.0, but the ClientHello inside it says TLS 1.1. I don't know what kinds of tricks they're trying to pull here... This seemed like a strange difference, so I wondered if they were using the wrong version in their MAC computation. Recall from a previous post the MAC computation:

The MAC is generated as:

HMAC_hash(MAC_write_secret, seq_num + TLSCompressed.type +
             TLSCompressed.version + TLSCompressed.length +
             TLSCompressed.fragment));

It clearly says to get the version from TLSCompressed (a.k.a. TLSCiphertext for our purposes), meaning the version in the record header. As far as I can tell, Chrome's MAC computation is incorrectly using ClientHello.version instead. I added a callback in the code to allow the application to reconcile this version difference if it's encountered. Once I had this in place, the handshake went through smoothly.

HRESULT
DummyServer::OnReconcileSecurityVersion(
    MT_TLSCiphertext* pCiphertext,
    MT_ProtocolVersion::MTPV_Version connVersion,
    MT_ProtocolVersion::MTPV_Version recordVersion,
    MT_ProtocolVersion::MTPV_Version* pOverrideVersion)
{
    UNREFERENCED_PARAMETER(pCiphertext);

    // detecting chrome bug and working around. could also have sniffed UA str
    if (connVersion == MT_ProtocolVersion::MTPV_TLS11 &&
        recordVersion == MT_ProtocolVersion::MTPV_TLS10)
    {
        *pOverrideVersion = connVersion;
        return MT_S_LISTENER_HANDLED;
    }

    return MT_S_LISTENER_IGNORED;
} // end function OnReconcileSecurityVersion
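
To see why a one-byte version disagreement wrecks the whole MAC, here's a sketch of how the HMAC input gets assembled from the formula above. BuildMACInput is a hypothetical name for this example, and the HMAC itself is left out; the point is only that the version bytes are part of what gets hashed:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

typedef std::vector<std::uint8_t> Bytes;

// Hypothetical helper: builds the byte string fed to the HMAC, in order:
// seq_num (8 bytes) + type (1) + version (2) + length (2) + fragment,
// with all integers big-endian.
Bytes BuildMACInput(
    std::uint64_t seqNum,
    std::uint8_t contentType,
    std::uint16_t version,
    const Bytes& fragment)
{
    // the record length field is only 2 bytes wide
    assert(fragment.size() <= 0xFFFF);

    Bytes input;
    for (int shift = 56; shift >= 0; shift -= 8)
    {
        input.push_back(static_cast<std::uint8_t>((seqNum >> shift) & 0xFF));
    }

    input.push_back(contentType);
    input.push_back(static_cast<std::uint8_t>(version >> 8));
    input.push_back(static_cast<std::uint8_t>(version & 0xFF));
    input.push_back(static_cast<std::uint8_t>(fragment.size() >> 8));
    input.push_back(static_cast<std::uint8_t>(fragment.size() & 0xFF));
    input.insert(input.end(), fragment.begin(), fragment.end());
    return input;
}
```

With content type 22 (handshake), passing 0x0301 (TLS 1.0) versus 0x0302 (TLS 1.1) changes a single byte of the input, and that's enough for the two sides to compute completely different MACs.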

It pleases me greatly to see my program already finding bugs in a TLS protocol implementation. This is exactly the purpose I created it for, and I haven't even gotten started with any really nasty tests!

The looming shadow of things to come

I feel a lot better about a number of parts of the code now. The refactoring removes a lot of assumptions and models the TLS spec more closely. However, the next (and final?) big feature I have in mind is to make a client implementation to go along with the server implementation I currently have, and that's going to take a lot of work along the lines of sorting out "read" and "write" states, compared to "client" and "server" endpoints. I can tell that's going to be a mess. Look forward to hearing about it later!
