TLS Client Hello Mirror

What
Why
How
Mentions
Closing thoughts

What

Client Hello Mirror is a server which outputs your browser's TLS Client Hello message, emphasizing aspects of it that are detrimental to privacy:

» tlsprivacy.nervuri.net

It supports HTTPS and Gemini, is written in Go and is free/libre software. I'll explain why and how I wrote it.

Why

A TLS connection starts with the client sending the server a "Hello" message that contains a set of supported capabilities and various parameters. This initial message presents two main privacy problems:

it often includes a unique session resumption token which can be used to track visitors (this is a problem on the web, less so in Geminispace);
by way of fingerprinting, it reveals to servers and on-path observers what browser you are likely using (down to its version, or a range of versions); if you change any TLS-related settings, your TLS fingerprint becomes specific to a much smaller group of users, possibly even to you alone.

» Tracking Users across the Web via TLS Session Resumption (2018)
» tlsfingerprint.io

I think these issues deserve more attention than they receive.

Among the browser testing tools available online, I hoped to find a web service that presented the complete Client Hello message, but my search came up empty. So, because I wanted such a tool to exist (for both Gemini and the web) and because I wanted to draw more attention to TLS privacy issues, I wrote Client Hello Mirror. This was a bit of a challenge, since many of the values in the Client Hello message are not exposed by TLS libraries.

How

First steps

I chose Go for this project for a single reason: uTLS, a "fork of the Go standard TLS library, providing low-level access to the ClientHello for mimicry purposes", made by the folks behind tlsfingerprint.io. Go developers are spoiled with such libraries: see also JA3Transport and CycleTLS, both of which are based on uTLS.

This made Go look like a good language for messing around with TLS – and indeed it is.

The first hurdle was figuring out how to extract the Client Hello bytes from the TCP stream and return them in an HTTP or Gemini response. I found the answer in Filippo Valsorda's GoLab 2018 talk "Building a DIY proxy with the net package" – the io.MultiReader trick that he details was exactly what was needed.

» Building a DIY proxy with the net package
» (code and slides)
» peek function (Client Hello Mirror)

Once I had the raw Client Hello bytes, the next step was to decode them. I based the Client Hello parser on the clientHelloMsg.unmarshal function in Go's built-in TLS library. The Client Hello message breakdown in Michael Driscoll's "The Illustrated TLS 1.3 Connection" was helpful in further developing the parser, as was Wireshark.

» The Illustrated TLS 1.3 Connection

And so the first version of the tool came about, which returned the Client Hello message as JSON.
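
To give an idea of what such a parser has to walk through, here is a minimal standard-library sketch that pulls the version, cipher suites and extension codes out of a Client Hello handshake message. It is not the project's parser (which is based on Go's clientHelloMsg.unmarshal); the type and function names are invented for illustration, and the sample message is synthetic.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// hello holds the few Client Hello fields this sketch extracts.
type hello struct {
	Version      uint16   // legacy_version
	CipherSuites []uint16 // in the order the client sent them
	Extensions   []uint16 // extension type codes, in order
}

var errShort = errors.New("truncated Client Hello")

// take removes and returns the first n bytes of *b.
func take(b *[]byte, n int) ([]byte, error) {
	if len(*b) < n {
		return nil, errShort
	}
	out := (*b)[:n]
	*b = (*b)[n:]
	return out, nil
}

// takeVec reads a vector prefixed by a big-endian length of lenBytes bytes.
func takeVec(b *[]byte, lenBytes int) ([]byte, error) {
	prefix, err := take(b, lenBytes)
	if err != nil {
		return nil, err
	}
	n := 0
	for _, x := range prefix {
		n = n<<8 | int(x)
	}
	return take(b, n)
}

// parseHello decodes a handshake message: type (1 byte), length (3 bytes), body.
func parseHello(msg []byte) (*hello, error) {
	if len(msg) < 4 || msg[0] != 1 {
		return nil, errors.New("not a Client Hello")
	}
	msg = msg[1:]
	body, err := takeVec(&msg, 3)
	if err != nil {
		return nil, err
	}
	fixed, err := take(&body, 2+32) // legacy_version + random
	if err != nil {
		return nil, err
	}
	h := &hello{Version: binary.BigEndian.Uint16(fixed)}
	if _, err = takeVec(&body, 1); err != nil { // legacy_session_id
		return nil, err
	}
	suites, err := takeVec(&body, 2)
	if err != nil {
		return nil, err
	}
	for ; len(suites) >= 2; suites = suites[2:] {
		h.CipherSuites = append(h.CipherSuites, binary.BigEndian.Uint16(suites))
	}
	if _, err = takeVec(&body, 1); err != nil { // legacy_compression_methods
		return nil, err
	}
	if len(body) == 0 {
		return h, nil // older clients may send no extensions at all
	}
	exts, err := takeVec(&body, 2)
	if err != nil {
		return nil, err
	}
	for len(exts) > 0 {
		typ, err := take(&exts, 2)
		if err != nil {
			return nil, err
		}
		if _, err = takeVec(&exts, 2); err != nil { // skip extension data
			return nil, err
		}
		h.Extensions = append(h.Extensions, binary.BigEndian.Uint16(typ))
	}
	return h, nil
}

// sampleHello builds a tiny synthetic Client Hello for demonstration.
func sampleHello() []byte {
	body := append([]byte{0x03, 0x03}, make([]byte, 32)...)            // version, random
	body = append(body, 0x00)                                          // empty session ID
	body = append(body, 0x00, 0x04, 0x13, 0x01, 0x13, 0x02)            // two cipher suites
	body = append(body, 0x01, 0x00)                                    // null compression
	body = append(body, 0x00, 0x07, 0x00, 0x2b, 0x00, 0x03, 2, 3, 4)   // one extension
	return append([]byte{0x01, 0x00, 0x00, byte(len(body))}, body...)
}

func main() {
	h, err := parseHello(sampleHello())
	if err != nil {
		panic(err)
	}
	fmt.Printf("version=%#x suites=%#x extensions=%#x\n", h.Version, h.CipherSuites, h.Extensions)
}
```

A complete parser also has to decode the contents of each extension, which is where most of the work (and most of the privacy-relevant detail) lies.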

CVE-2022-30629

After staring at the JSON output for a while, I noticed that obfuscated_ticket_age values for pre-shared keys (used for session resumption in TLS 1.3) weren't obfuscated at all. No matter what client I used, when resuming a session, the number of milliseconds since my last connection was plainly embedded in the Client Hello message, exposed to on-path observers. That's because Go's TLS server was setting the ticket_age_add value to zero for all session tickets, so clients added zero to the ticket age, resulting in no obfuscation.

I reported this on May 10, 2022, as Go 1.19 was nearing release. Go's security people gave this issue a CVE ID and backported the fix to Go 1.17 and 1.18 as well.

» Bug report
» CVE-2022-30629

"Non-random values for ticket_age_add in session tickets in crypto/tls before Go 1.17.11 and Go 1.18.3 allow an attacker that can observe TLS handshakes to correlate successive connections by comparing ticket ages during session resumption."
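
The arithmetic behind this is simple enough to show in a few lines. The sketch below follows the definition in RFC 8446 (section 4.2.11.1): the client sends the ticket age in milliseconds plus ticket_age_add, modulo 2^32. The function names and the sample values are made up for illustration.

```go
package main

import "fmt"

// obfuscate computes obfuscated_ticket_age: the ticket age in milliseconds
// plus the server-chosen ticket_age_add, modulo 2^32.
func obfuscate(ageMillis, ticketAgeAdd uint32) uint32 {
	return ageMillis + ticketAgeAdd // uint32 arithmetic wraps modulo 2^32
}

// recoverAge is the server-side inverse: subtract ticket_age_add again.
func recoverAge(obfuscated, ticketAgeAdd uint32) uint32 {
	return obfuscated - ticketAgeAdd
}

func main() {
	age := uint32(90_000) // 90 seconds since the ticket was issued (made-up value)

	// With ticket_age_add fixed at zero, the value on the wire is the age itself.
	fmt.Println("ticket_age_add = 0:     ", obfuscate(age, 0))

	// With a random per-ticket value, an observer learns nothing from the number,
	// while the server can still recover the real age.
	add := uint32(0xDEADBEEF) // stands in for a random 32-bit value
	wire := obfuscate(age, add)
	fmt.Println("ticket_age_add = random:", wire, "-> server recovers", recoverAge(wire, add))
}
```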

Privileges and timeouts

Part of making this server was figuring out how to properly drop root privileges in Go and how to correctly set timeouts on TCP connections. Tackling these issues is not as straightforward as it may appear. I assisted Solderpunk in dealing with them for Molly Brown as well.

» Golang: dropping privileges – my Stack Overflow answer
» Molly Brown: drop privileges
» Molly Brown: timeouts

The timeouts thread goes into tedious subtleties regarding what really happens when you call Close() on a TCP connection. It turns out that, by default, Close() returns right away, but the kernel keeps the connection open in the background until its write buffer has been emptied. The write buffer can be quite large and connections can be quite slow, so this can take a very long time: hours or even days after you call Close(). So if you're looking to make it harder for "slow loris" attacks to exhaust socket descriptors, don't rely on timeouts/deadlines alone; also call SetLinger(0) on the TCP connection before closing it.

» TCPConn.SetLinger

tlshello.agwa.name

About one year after starting this project, I came across a blog post by Andrew Ayer titled "Parsing a TLS Client Hello with Go's cryptobyte Package". It turns out that he wrote a very similar server at about the same time as me:

» tlshello.agwa.name
» Parsing a TLS Client Hello with Go's cryptobyte Package
» github.com/AGWA/tlshacks

Internally, his approach is very different. For one thing, he wrote the code for parsing the Client Hello message from scratch, whereas I extended the parser in Go's TLS library. For another thing, he managed to expose the full Client Hello message to a standard Go HTTP listener, which is something I had failed to figure out, leading me to do HTTP "by hand".

Not using a proper HTTP library may sound like asking for trouble on the request parsing side, but my code only looks at the first line of the request. It's so trivial that I dare say it is secure, as it only deals with the minimal subset of HTTP required for this to work (no request headers, no methods other than GET and HEAD, no HTTP/2…) and it doesn't serve files. Still, I would have preferred to use Go's HTTP library instead, because that would have made my code more useful to other developers. If you need to use the Client Hello message in an HTTP response, you're probably better off using Andrew Ayer's method.

What I took from his implementation was the idea of extracting TLS parameter and extension information from CSV files published by IANA. That's how the /json/v2 endpoint was born, which expands many numeric identifiers (of TLS versions, cipher suites, etc.) into JSON objects containing a bit more information. This information is also used when generating the front page.
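
As a sketch of that idea, the following turns a registry CSV into a lookup table from cipher suite code to name. The column layout (Value, Description, DTLS-OK, Recommended, Reference) and the sample rows are an assumption about how the IANA cipher suite registry is laid out; the names suiteInfo and loadSuites are invented for illustration.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strconv"
	"strings"
)

// sample mimics the assumed layout of the IANA cipher suite registry CSV.
const sample = `Value,Description,DTLS-OK,Recommended,Reference
"0x13,0x01",TLS_AES_128_GCM_SHA256,Y,Y,[RFC8446]
"0x13,0x02",TLS_AES_256_GCM_SHA384,Y,Y,[RFC8446]
"0x00,0x1C-1D",Reserved to avoid conflicts with SSLv3,,,[RFC5246]
`

// suiteInfo is the extra information attached to a numeric identifier.
type suiteInfo struct {
	Name        string
	Recommended bool
}

// loadSuites parses registry rows into a map keyed by the 16-bit suite code.
// Rows whose value is a range or otherwise unparseable are skipped.
func loadSuites(data string) (map[uint16]suiteInfo, error) {
	rows, err := csv.NewReader(strings.NewReader(data)).ReadAll()
	if err != nil {
		return nil, err
	}
	out := make(map[uint16]suiteInfo)
	for i, row := range rows {
		if i == 0 || len(row) < 4 {
			continue // header or malformed row
		}
		parts := strings.Split(row[0], ",")
		if len(parts) != 2 {
			continue
		}
		hi, err1 := strconv.ParseUint(strings.TrimPrefix(parts[0], "0x"), 16, 8)
		lo, err2 := strconv.ParseUint(strings.TrimPrefix(parts[1], "0x"), 16, 8)
		if err1 != nil || err2 != nil {
			continue // e.g. "0x1C-1D" denotes a range, not a single code
		}
		out[uint16(hi)<<8|uint16(lo)] = suiteInfo{Name: row[1], Recommended: row[3] == "Y"}
	}
	return out, nil
}

func main() {
	suites, err := loadSuites(sample)
	if err != nil {
		panic(err)
	}
	for _, code := range []uint16{0x1301, 0x1302} {
		fmt.Printf("%#04x -> %+v\n", code, suites[code])
	}
}
```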

A subtle point about tlshello.agwa.name is that it doesn't use session resumption, so clients will never send it a pre_shared_key extension or a session_ticket value.

NJA3

I wanted to highlight TLS fingerprinting, so I included the popular JA3 fingerprint in the output. However, Chromium developers recently decided to randomize the ordering of extensions on each TLS handshake, as a counter to protocol ossification. This makes Chromium's JA3 fingerprint change on every connection, which prompted me to make a variant of JA3 that remains the same when extensions are shuffled. So I took JA3, sorted the extension codes and called the new fingerprint Normalized JA3 (NJA3).
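
The sorting step on its own can be sketched as follows. This builds a JA3-style string (version, cipher suites, extensions, curves, point formats, hashed with MD5) and optionally sorts the extension codes first. It only illustrates the "sort the extensions" idea and is not the NJA3 algorithm as documented; the function names and sample values are made up.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// join renders numeric codes as decimal values separated by dashes.
func join(vals []uint16) string {
	s := make([]string, len(vals))
	for i, v := range vals {
		s[i] = strconv.Itoa(int(v))
	}
	return strings.Join(s, "-")
}

// ja3 returns the fingerprint string and its MD5 hash. When sortExts is set,
// the extension codes are sorted first, so shuffling them has no effect.
func ja3(version uint16, ciphers, exts, curves, formats []uint16, sortExts bool) (string, string) {
	if sortExts {
		exts = append([]uint16(nil), exts...) // copy: keep the caller's order intact
		sort.Slice(exts, func(i, j int) bool { return exts[i] < exts[j] })
	}
	s := strings.Join([]string{
		strconv.Itoa(int(version)), join(ciphers), join(exts), join(curves), join(formats),
	}, ",")
	sum := md5.Sum([]byte(s))
	return s, hex.EncodeToString(sum[:])
}

func main() {
	ciphers := []uint16{4865, 4866}
	curves := []uint16{29, 23}
	formats := []uint16{0}
	// The same extensions in two different orders, as a shuffling client might send.
	first := []uint16{0, 23, 65281, 10, 11}
	second := []uint16{11, 0, 65281, 23, 10}

	_, a := ja3(771, ciphers, first, curves, formats, false)
	_, b := ja3(771, ciphers, second, curves, formats, false)
	fmt.Println("unsorted hashes equal:", a == b)

	_, c := ja3(771, ciphers, first, curves, formats, true)
	_, d := ja3(771, ciphers, second, curves, formats, true)
	fmt.Println("sorted hashes equal:  ", c == d)
}
```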

A few days later, I came across a presentation by Troy Kent titled "(JA) 3 Reasons to Rethink Your Encrypted Traffic Analysis Strategies", which made a number of insightful suggestions, some of which I implemented. One of them was to ignore SNI, padding and other extensions that clients don't necessarily send on every connection. I also added five extra code groups and made a couple of changes inspired by mercury's Network Protocol Fingerprinting (NPF) specification. These modifications made NJA3 more precise and robust. It's more than "Normalized JA3" at this point.

» NJA3 documentation
» A first look at Chrome's TLS ClientHello permutation in the wild
» "(JA) 3 Reasons to Rethink Your Encrypted Traffic Analysis Strategies"
» Network Protocol Fingerprinting (NPF) specification

Mentions

If you connect to tlsprivacy.nervuri.net using Firefox / Tor Browser, you'll get a warning that your browser doesn't validate Signed Certificate Timestamps (for Certificate Transparency). You can enable SCT support by setting security.pki.certificate_transparency.mode = 1 in about:config, but doing so makes your TLS fingerprint stand out. Mozilla should enable this by default.
tlsfingerprint.io has a JSON endpoint which returns detailed (but not complete) information extracted from the Client Hello message. BrowserLeaks also presents such information in its TLS section:

» client.tlsfingerprint.io
» browserleaks.com/tls

Closing thoughts

Some of the features that I wished for didn't make it in. I would have liked the server to support early data / 0-RTT session resumption, as well as the legacy sessionID-based resumption method, but Go's crypto/tls library does not support them.

Also, I would have liked the server to detect clients' susceptibility to session prolongation attacks (see section 3.1 of the paper linked below). That, however, would require substantially more effort than it's probably worth. What's important is to know that even though the maximum lifetime of TLS 1.3 pre-shared keys is 7 days, a server can use them to track visitors over a much longer period, by just issuing a new one on each connection. This allows for tracking users indefinitely, as long as they connect at least once a week. This can be solved by clients sticking to the expiry date of the initial pre-shared key, but I doubt that any TLS libraries do this. As for other resumption methods, TLS session tickets and session IDs have a shorter maximum lifetime, but otherwise have the same problem.

» Tracking Users across the Web via TLS Session Resumption (2018)
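
The difference between the two client behaviours can be shown with a toy simulation. It assumes a client that connects once a day and a server that issues a fresh ticket on every resumption; the type and function names are invented, and this models the idea only, not any real TLS library.

```go
package main

import (
	"fmt"
	"time"
)

// maxLifetime is the TLS 1.3 upper bound on ticket lifetime (7 days).
const maxLifetime = 7 * 24 * time.Hour

// ticket is a simplified client-side record of a resumption ticket.
type ticket struct {
	issued     time.Time // when this particular ticket was received
	chainStart time.Time // when the first ticket in the chain was received
}

// usable reports whether the client would still offer the ticket at time now.
// With capToChain set, the client also refuses once the first ticket in the
// chain would have expired, no matter how often the ticket was renewed.
func (t ticket) usable(now time.Time, capToChain bool) bool {
	if now.Sub(t.issued) > maxLifetime {
		return false
	}
	if capToChain && now.Sub(t.chainStart) > maxLifetime {
		return false
	}
	return true
}

// trackedDays simulates daily visits and returns the last day on which the
// client resumed a session, i.e. how long visits stayed linkable.
func trackedDays(days int, capToChain bool) int {
	start := time.Date(2022, 1, 1, 0, 0, 0, 0, time.UTC)
	t := ticket{issued: start, chainStart: start}
	linked := 0
	for d := 1; d <= days; d++ {
		now := start.AddDate(0, 0, d)
		if !t.usable(now, capToChain) {
			break
		}
		linked = d
		t.issued = now // the server hands out a new ticket on each resumption
	}
	return linked
}

func main() {
	fmt.Println("renewing freely, linkable for days:   ", trackedDays(30, false))
	fmt.Println("capped to first expiry, linkable for: ", trackedDays(30, true))
}
```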

TLS token binding (RFCs 8471, 8472 and 8473, formerly Channel ID) looks like it can be as bad for privacy as session resumption, but Chromium removed support for it in 2018. Edge might still support it, though. Token binding appears to be on its way out, but if it sticks around, Client Hello Mirror will probably highlight it at some point.

This concludes my exploration of TLS privacy issues, at least for now. On a similar note, I'm also interested in figuring out how feasible it is nowadays to determine device clock skews via the TCP timestamps option, and to what extent they can be used for device fingerprinting. But I'll leave that for another time.

» Remote physical device fingerprinting (2005)