Analysing TLS Connection for Cyber Security and Network Forensics

Introduction:

A decade ago, SSL/TLS was used mainly by financial institutions and a few specific organisations, such as public sector agencies, to protect the log-in pages of security-conscious websites and services. Today, its use has expanded to almost all web-based services, and it has quickly become the de facto protocol for communications, including those used for unlawful activity. With the growing use of encrypted traffic, the traditional approach of Network Forensics [7, 8] should also include SSL/TLS Forensics: the process of capturing information exchanged through SSL (TLS) connections and extracting and visualising meaningful information from it to support forensic analysis.

TLS/SSL provides authentication and encryption support to many application layer protocols such as HTTP, POP3, IMAP and SMTP. The symmetric key established at the end of SSL/TLS connection establishment encrypts the application protocol data. Data exchanged through these SSL/TLS connections can be intercepted by man-in-the-middle devices deployed at the security perimeter. Such devices help a system or network administrator see or analyse the application layer protocol data of HTTP, POP3, IMAP, SMTP, etc., as plain text. However, these devices have limited capacity, and with the newly rolled out TLS 1.3 protocol, such interception has become almost impossible.

The TLS/SSL protocol leaks some significant information while establishing the connection. In this whitepaper, we will see how this information can be used. Up to TLS 1.2, the certificate is sent from the server to the client in plain text, exposing some meaningful information. In TLS 1.3, the server sends the certificate encrypted, so this information is no longer accessible. We will provide a detailed, version-by-version comparison of what can be achieved.

SSL/TLS fingerprinting was introduced in 2008 and has recently gained attention. Fingerprinting of SSL Client Hello and SSL Server Hello messages can help gain significant insight about the device involved in communication – for example, OS information, Platform information, device type, etc. Fingerprinting mechanisms can also help identify browsers, detect malware, etc. Cisco Joy and JA3 are two main SSL fingerprinting mechanisms that are used. We will describe these fingerprinting mechanisms and the information these fingerprints reveal.

The whitepaper is organised as follows. We will first look at the significant variants of the SSL/TLS protocol. Next, we will describe what information can be obtained from the SSL/TLS protocol messages. Finally, we will look at the details of SSL/TLS fingerprinting technology. Throughout the whitepaper, we will also refer to different open-source tools and databases that one can use to gain insights from SSL traffic.

SSL/TLS Protocol:

Encryption of application layer protocol data is required to enforce the security of online communication. For multiple decades, SSL and its descendant, TLS, have provided this encryption. Secure Sockets Layer (SSL) is the protocol developed in the mid-1990s by Netscape. The IETF later standardised it with some changes and renamed it Transport Layer Security (TLS). This protocol version, known as TLS 1.0, was released in 1999 and published as RFC 2246 [1]. The next version, TLS 1.1, was released in 2006 and published as RFC 4346 [2]. Tightening various requirements to provide enhanced security, TLS 1.2 followed in 2008 and was published as RFC 5246 [3]. To date, TLS 1.2 is the most used version of TLS and has many improvements over TLS 1.1. TLS 1.3 improves further on TLS 1.2 and marks a paradigm shift in design. It is the latest version of the TLS protocol and was published as RFC 8446 [4] in 2018.

SSL/TLS protocol includes both asymmetric and symmetric cryptography to provide security. To provide encryption/decryption of the protocol data, both client and server are required to agree on a common shared key. This key is used to encrypt and decrypt the protocol data at the transmitting and receiving end, respectively, and this involves symmetric key cryptography. The question is, how do the client and server agree on a common shared key? The process through which the client and server agree on a common shared key is called handshake, and this handshake process mainly defines the SSL or TLS protocol. Asymmetric cryptography is used in the handshake process to facilitate secure computation of the common shared key by both client and server.

Asymmetric cryptography is computationally expensive, which is why it is only used in the handshake process; to encrypt or decrypt the protocol traffic, we rely on symmetric cryptography, which is much faster. We will now look at the handshake process in TLS 1.2 and TLS 1.3 in more detail.
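The idea that two parties can compute the same shared key without ever sending it over the wire can be illustrated with a toy Diffie-Hellman exchange. The following is only a minimal sketch: the modulus `P`, the generator `G` and the helper names are assumptions chosen for brevity, and the parameters are far too weak for real use. Real TLS stacks use standardised groups or elliptic curves and a proper key derivation function.

```python
import hashlib
import secrets

# Toy parameters (assumptions for illustration only, NOT a real TLS group).
P = 2**127 - 1   # a Mersenne prime used as the modulus
G = 3            # generator chosen for the sketch


def make_keypair():
    """Return a (private, public) pair; only the public value is sent."""
    private = secrets.randbelow(P - 3) + 2      # value in [2, P - 2]
    public = pow(G, private, P)
    return private, public


def derive_session_key(own_private, peer_public):
    """Combine our private value with the peer's public value.

    Both sides arrive at the same number, which is hashed into a
    32-byte symmetric key (a stand-in for the real key derivation).
    """
    shared = pow(peer_public, own_private, P)
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()


# Asymmetric step: each side generates a key pair and sends its public value.
client_private, client_public = make_keypair()
server_private, server_public = make_keypair()

# Both sides now derive the same symmetric key independently.
client_key = derive_session_key(client_private, server_public)
server_key = derive_session_key(server_private, client_public)
print(client_key == server_key)
```

Only the two `pow` calls are asymmetric operations; everything after the handshake would use the derived symmetric key.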

SSL/TLS Handshake Process:

The end goal of using SSL/TLS over application layer protocol is to be able to encrypt or decrypt protocol traffic and provide solutions to problems like authentication, data integrity, and many more. The handshake process tries to address most of these problem areas, which is why it is complex and broad. To describe each and every aspect of this is beyond the scope of this whitepaper, and we will provide only an outline of the handshake process here.

Figure 1: Basic SSL/TLS protocol flow up to TLS 1.2

SSL / TLS client starts by sending a “Client Hello” message to the SSL / TLS server by publishing its capabilities and the different security features it supports. The server responds with a “Server Hello” message that helps the client know the cipher suite with which both can continue the negotiation. Through the “Server Hello” message, the server also agrees to use some of the security features the client has previously published.

The server then sends its digital certificate, issued by a trusted Certificate Authority (CA), through a message called “Server Certificate”; having received and verified it, the client can be sure that it is continuing the connection with an authenticated server. The server certificate contains the public key of the server; the matching private key never leaves the server. With RSA key exchange, the client then secretly chooses a pre-master secret, encrypts it using the server’s public key and sends it to the server through a message called “Client Key Exchange”. Only the server can decrypt it and learn the pre-master secret, since it can only be decrypted with the matching private key of the server. The client and server then use this pre-master secret to derive the common shared key with which the protocol data is encrypted and sent.

To mark the end of the handshake process and to verify the sanity of the agreed-upon shared secret, both parties send each other an “Encrypted Handshake Message”. If neither the client nor the server, having received the “Encrypted Handshake Message”, responds with an SSL Alert Message, then the handshake has completed successfully, and both ends are ready to encrypt/decrypt application traffic.

This basic philosophy of the SSL/TLS handshake remains the same in almost all versions of TLS, including the latest TLS 1.3. In TLS 1.3, immediately after exchanging “Client Hello” and “Server Hello” messages, both parties agree on a secret key called a handshake key, which is used to encrypt/decrypt only the remaining handshake messages. So, as in other SSL or TLS versions, “Server Certificate” and “Encrypted Handshake Message” are still exchanged between client and server, but, unlike in those versions, they are now encrypted with handshake keys. In the end, in TLS 1.3, both parties securely agree on a common shared secret to continue encrypting/decrypting application traffic.

Next, we will take a closer look at the relevant aspects of the “Client Hello” and “Server Hello” messages, because these two messages are the ones primarily used in network forensics. The “Client Hello” message [Figure 4a and Figure 4b] includes a random number (called client random), a set of cipher suites published by the client (cipher suites encode the capabilities with which the client can complete a successful SSL/TLS transaction) and a set of extensions. The “Server Hello” message [Figure 3] includes a random number (called server random), a single cipher suite chosen from the list published by the client, with which the server wants the SSL handshake negotiation and transaction to be completed, and a set of extensions. These messages contain other fields as well, but they are outside the scope of this whitepaper.

Figure 4b: Client Hello packet showing all published cipher suites
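To make the layout of the “Client Hello” message concrete, the following sketch walks through the fields in wire order and collects the ones discussed above. It is a minimal illustration under stated assumptions: the function name `parse_client_hello` is hypothetical, and the input is assumed to be a single, complete handshake message without the 5-byte record-layer header.

```python
def parse_client_hello(msg: bytes) -> dict:
    """Parse a ClientHello handshake message (record header already removed)."""
    if len(msg) < 4 or msg[0] != 0x01:          # handshake type 1 = ClientHello
        raise ValueError("not a ClientHello handshake message")
    body = msg[4:4 + int.from_bytes(msg[1:4], "big")]
    pos = 0

    def take(n: int) -> bytes:
        """Return the next n bytes of the body and advance the cursor."""
        nonlocal pos
        if pos + n > len(body):
            raise ValueError("truncated ClientHello")
        chunk = body[pos:pos + n]
        pos += n
        return chunk

    version = int.from_bytes(take(2), "big")     # legacy version field
    client_random = take(32)
    session_id = take(take(1)[0])                # 1-byte length prefix
    suites_raw = take(int.from_bytes(take(2), "big"))
    cipher_suites = [int.from_bytes(suites_raw[i:i + 2], "big")
                     for i in range(0, len(suites_raw), 2)]
    take(take(1)[0])                             # skip compression methods

    extensions = []                              # list of (type, raw data)
    if pos < len(body):
        ext_len = int.from_bytes(take(2), "big")
        ext_end = pos + ext_len
        while pos < ext_end:
            ext_type = int.from_bytes(take(2), "big")
            ext_data = take(int.from_bytes(take(2), "big"))
            extensions.append((ext_type, ext_data))

    return {
        "version": version,
        "client_random": client_random,
        "session_id": session_id,
        "cipher_suites": cipher_suites,
        "extensions": extensions,
    }
```

The returned cipher suite and extension lists keep the order in which the client sent them, which matters for the fingerprinting methods described later.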

Insights achieved from SSL/TLS protocol:

The primary sources of gaining information out of SSL/TLS protocol are “Client Hello” and “Server Hello” messages. In this section, we will systematically see how we can gain insights from these two messages, and we start by defining Cipher Suite, which is central to any SSL/TLS protocol. Cipher Suite is a collection of algorithms used in any SSL/TLS protocol to solve different security problems (authentication, data integrity, encryption or decryption, etc.). Each cipher suite has a unique name defined by the SSL/TLS protocol.

For example, consider TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256: TLS denotes the protocol, ECDHE (the elliptic curve Diffie-Hellman algorithm with ephemeral keys) is used for key exchange, RSA is used for authenticating the server, the block cipher AES_128 in GCM mode is used to encrypt or decrypt protocol data and, lastly, SHA256 is the hash function used in key derivation (in GCM suites, the integrity of the encrypted records is provided by the GCM mode itself). The ‘E’ at the end of ECDHE means that the generated key is ephemeral. Other key exchange variants are ECDH (the generated key is not ephemeral), DHE, DH and RSA. DHE and DH are the non-elliptic-curve versions of the Diffie-Hellman algorithm. When RSA is used, it serves for both key exchange and server authentication.
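The naming convention can be unpacked mechanically. The sketch below splits a cipher suite name into its components; the function name `describe_cipher_suite` is hypothetical, and the parsing assumes the common pattern in which the name ends with the hash (a few suites, such as the CCM ones without a hash suffix, do not follow it).

```python
def describe_cipher_suite(name: str) -> dict:
    """Split a cipher suite name into key exchange, authentication,
    cipher and hash components (assumes the name ends with the hash)."""
    if not name.startswith("TLS_"):
        raise ValueError("expected a name starting with TLS_")
    rest = name[len("TLS_"):]

    if "_WITH_" in rest:
        # Names up to TLS 1.2: TLS_<kx>_<auth>_WITH_<cipher>_<hash>
        kx_auth, cipher_hash = rest.split("_WITH_", 1)
        parts = kx_auth.split("_")
        key_exchange = parts[0]
        # A lone "RSA" means RSA does both key exchange and authentication.
        authentication = parts[1] if len(parts) > 1 else parts[0]
    else:
        # TLS 1.3 names carry only the cipher and hash.
        key_exchange = authentication = None
        cipher_hash = rest

    cipher, _, hash_name = cipher_hash.rpartition("_")
    return {
        "key_exchange": key_exchange,
        "authentication": authentication,
        "cipher": cipher,
        "hash": hash_name,
    }


print(describe_cipher_suite("TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256"))
```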

Forward Secrecy, or Perfect Forward Secrecy, is a security feature demanded from the SSL/TLS protocol to ensure that even if the private key of the server is compromised, an adversary cannot break past communications. This requires a separate session key (or shared key) to be generated for each session; such a key is called an ephemeral key. Ephemeral versions of key generation are therefore more secure than the non-ephemeral versions such as ECDH, DH and RSA, and it is a useful insight to check whether an ephemeral cipher suite has been used in establishing the SSL/TLS connection. In TLS 1.3, the cipher suites used are all ephemeral. So, when we see that, in response to the client’s request to establish a connection with TLS 1.3, the server has agreed to proceed with the same version, we can be assured that an ephemeral key generation algorithm is in use.
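The check described above can be reduced to a few lines once the name of the negotiated cipher suite is known. This is a minimal sketch: the function name `offers_forward_secrecy` is hypothetical, and treating every TLS 1.3 style name (one without `_WITH_`) as ephemeral simply follows the statement in the text.

```python
EPHEMERAL_KEY_EXCHANGES = {"ECDHE", "DHE"}


def offers_forward_secrecy(suite_name: str) -> bool:
    """Return True if the negotiated suite implies an ephemeral key exchange."""
    if not suite_name.startswith("TLS_"):
        raise ValueError("expected a name starting with TLS_")
    if "_WITH_" not in suite_name:
        # TLS 1.3 naming, e.g. TLS_AES_128_GCM_SHA256: treated as ephemeral.
        return True
    key_exchange = suite_name[len("TLS_"):].split("_", 1)[0]
    return key_exchange in EPHEMERAL_KEY_EXCHANGES


for suite in ("TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
              "TLS_RSA_WITH_AES_128_CBC_SHA",
              "TLS_AES_128_GCM_SHA256"):
    print(suite, offers_forward_secrecy(suite))
```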

Figure 5: Showing contents of some important extensions in Client Hello

The server_name extension (extension type 0) in the “Client Hello” packet, which most modern clients send, tells us the name of the server with which the client is trying to establish an SSL/TLS connection. In [Figure 5], the client has initiated a connection to the server stackoverflow.com.
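Reading the server name out of the extension is straightforward once the raw extension data is available. The sketch below assumes the input is the body of the server_name extension (without the 2-byte type and 2-byte length that precede it); the function name `extract_server_name` is hypothetical.

```python
from typing import Optional


def extract_server_name(ext_data: bytes) -> Optional[str]:
    """Return the first host name in a server_name extension body, if any.

    Layout: 2-byte list length, then entries of
    1-byte name type, 2-byte name length, name bytes.
    """
    if len(ext_data) < 2:
        return None
    list_len = int.from_bytes(ext_data[0:2], "big")
    pos = 2
    end = min(2 + list_len, len(ext_data))
    while pos + 3 <= end:
        name_type = ext_data[pos]
        name_len = int.from_bytes(ext_data[pos + 1:pos + 3], "big")
        name = ext_data[pos + 3:pos + 3 + name_len]
        if name_type == 0:                      # 0 = host_name
            return name.decode("ascii", errors="replace")
        pos += 3 + name_len
    return None


# Build an extension body for the host shown in Figure 5 and parse it back.
host = b"stackoverflow.com"
entry = b"\x00" + len(host).to_bytes(2, "big") + host
body = len(entry).to_bytes(2, "big") + entry
print(extract_server_name(body))
```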

In [Figure 5], immediately after the ‘server_name’ extension comes the ‘extended_master_secret’ extension (extension type 23). Use of this extension signifies that the shared (master) secret is created in a way which guarantees that no two SSL connections will have the same secret, even when there is a proxy server sitting between the client and server.

The ‘extended_master_secret’ extension takes effect only when the server also includes the same extension in its ‘Server Hello’ message in response to the client’s request. If the server does not send this extension in its ‘Server Hello’ message, then the SSL connection will end up creating the shared secret in the non-extended way. Advanced readers can refer to [] for the difference between these two ways of creating the shared or master secret.
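The negotiation rule above amounts to checking that extension type 23 appears in both hello messages. A minimal sketch, with a hypothetical function name, applied to the extension type lists of the example Client Hello and Server Hello used in the fingerprinting section of this whitepaper:

```python
EXTENDED_MASTER_SECRET = 23


def extended_master_secret_negotiated(client_ext_types, server_ext_types):
    """True only if both Client Hello and Server Hello carry extension 23."""
    return (EXTENDED_MASTER_SECRET in client_ext_types
            and EXTENDED_MASTER_SECRET in server_ext_types)


client_exts = [0, 23, 65281, 10, 11, 35, 16, 5, 13, 18, 51, 45, 43, 27, 21]
server_exts = [65281, 0, 11, 35, 5, 23, 16]
print(extended_master_secret_negotiated(client_exts, server_exts))
```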

Next, the message that follows ‘Server Hello’ from the server end is ‘Server Certificate’. This message is visible in plain text up to TLS 1.2; in TLS 1.3, it is encrypted and will not help in drawing any insights. The certificate reveals information such as the version, serial number, algorithm details, issuer name, validity period, and other significant details of PKI. There is a field called RDN sequence in the certificate from which we can learn much important information, stored as attribute-value pairs. The attributes that can be part of the RDN sequence are common name, surname, serial number, country name, locality name, state or province name, street address, organisation name, organisational unit, title, email address, user ID, and domain component. The certificate also contains a few extensions. One significant extension is the subject alternative name, which lists the additional hostnames protected by the same certificate, mainly for multi-domain cases.

The use of untrusted digital certificates [9] is growing as malware authors rely on SSL connections to sneak past intrusion detection and other protection systems. Switzerland’s Abuse.ch has created a repository that keeps track of blacklisted certificates associated with banking malware, malware campaigns and botnets, to help security and digital forensic professionals identify blacklisted SSL certificates. At https://sslbl.abuse.ch/blacklist/, they provide a freely downloadable CSV containing the SHA1 fingerprints of all the blacklisted certificates found so far.
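Checking a captured certificate against such a list could look like the sketch below. It assumes the certificate is available as DER bytes (as carried in the ‘Server Certificate’ message) and that the CSV has three columns (listing date, SHA1 fingerprint, listing reason) with comment lines starting with `#`; that column layout and all function names are assumptions and should be verified against the downloaded file.

```python
import csv
import hashlib
import io
from typing import Optional


def sha1_fingerprint(der_cert: bytes) -> str:
    """SHA1 fingerprint of a DER-encoded certificate, as lowercase hex."""
    return hashlib.sha1(der_cert).hexdigest()


def load_blacklist(csv_text: str) -> dict:
    """Map SHA1 fingerprint -> listing reason.

    Assumed layout: listing date, SHA1, listing reason; '#' starts a comment.
    """
    data_lines = (line for line in io.StringIO(csv_text)
                  if line.strip() and not line.startswith("#"))
    entries = {}
    for row in csv.reader(data_lines):
        if len(row) >= 3:
            entries[row[1].strip().lower()] = row[2].strip()
    return entries


def check_certificate(der_cert: bytes, blacklist: dict) -> Optional[str]:
    """Return the listing reason if the certificate is blacklisted, else None."""
    return blacklist.get(sha1_fingerprint(der_cert))
```

In practice, `csv_text` would be the content of the downloaded file and `der_cert` the certificate bytes extracted from a packet capture.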

SSL/TLS Fingerprint:

We will now describe the essential details of the SSL/TLS fingerprinting technique. Because the SSL/TLS handshake sends the ‘Client Hello’ and ‘Server Hello’ packets in plain text, their contents can be used to identify the browser, the application or the OS of the system; this method is known as SSL/TLS fingerprinting. A fingerprint generated from the ‘Client Hello’ packet is called a client-side SSL/TLS fingerprint, and one generated from the ‘Server Hello’ packet is called a server-side fingerprint.

The JA3 method [5] of forming an SSL fingerprint extracts the decimal values of the bytes of certain fields from the ‘Client Hello’ packet. It considers the following fields: SSL/TLS version, list of published cipher suites, list of extensions, list of elliptic curves and list of elliptic curve point formats. All these decimal values are then concatenated in the order mentioned above. While concatenating, it delimits each field by a ‘,’ and each value within a field by a ‘-’. The string thus formed is then MD5 hashed, and this hash value is considered the client-side SSL/TLS fingerprint.

For example, for the Client Hello in Figure [4a], the concatenated string extracting different SSL / TLS features will look like the following.

771,4865-4866-4867-49195-49199-49196-49200-52393-52392-49171-49172-156-157-47-53-10,0-23-65281-10-11-35-16-5-13-18-51-45-43-27-21,29-23-24,0

The MD5 hash of this string, 66918128f1b9b03303d77c6f2eefd128, is considered the client-side SSL/TLS fingerprint. If there are no TLS extensions in the Client Hello, or extensions like supported groups (extension type 10) are missing, then the corresponding fields are left empty. One point to note is that GREASE values are often part of the cipher suite and extension lists. While forming the string for SSL fingerprint generation, the JA3 method ignores the GREASE values.
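The construction described above can be sketched in a few lines. The function names are hypothetical; the example values are the decimal fields of the JA3 string shown above, with the GREASE values (0xAAAA, 0x2A2A, 0x8A8A, 0xEAEA) placed where they appear in the Cisco Joy string later in this section, so that the filtering step is exercised.

```python
import hashlib


def is_grease(value: int) -> bool:
    """GREASE values have the form 0x?A?A with both bytes equal."""
    return (value & 0x0F0F) == 0x0A0A and (value >> 8) == (value & 0xFF)


def ja3_string(version, ciphers, extensions, curves, point_formats) -> str:
    """Build the comma/dash delimited JA3 string, dropping GREASE values."""
    fields = [
        [version],
        [c for c in ciphers if not is_grease(c)],
        [e for e in extensions if not is_grease(e)],
        [g for g in curves if not is_grease(g)],
        list(point_formats),
    ]
    # An empty field simply yields an empty string between the commas.
    return ",".join("-".join(str(v) for v in field) for field in fields)


def ja3_hash(ja3: str) -> str:
    """MD5 hex digest of the JA3 string."""
    return hashlib.md5(ja3.encode("ascii")).hexdigest()


example = ja3_string(
    version=771,
    ciphers=[0xAAAA, 4865, 4866, 4867, 49195, 49199, 49196, 49200, 52393,
             52392, 49171, 49172, 156, 157, 47, 53, 10],
    extensions=[0x2A2A, 0, 23, 65281, 10, 11, 35, 16, 5, 13, 18, 51, 45,
                43, 27, 0x8A8A, 21],
    curves=[0xEAEA, 29, 23, 24],
    point_formats=[0],
)
print(example)
print(ja3_hash(example))
```

Because the order of cipher suites and extensions is preserved, two clients offering the same features in a different order produce different fingerprints.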

To extend the same method to fingerprint the server side of the SSL/TLS handshake using the ‘Server Hello’ message, we use the JA3S method. The SSL/TLS features considered from the ‘Server Hello’ message are the version, the accepted cipher, and the list of extensions. The string is formed in the same way as for JA3, following the order mentioned. For example, for the Server Hello in Figure [3], the concatenated string of the extracted SSL/TLS features will look like the following.

771,49199,65281-0-11-35-5-23-16

The MD5 hash of this string 860fcf58fd757e26aa8911e5eaff6b53 is considered as the server-side SSL/TLS fingerprint.
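The server-side construction is a shorter variant of the client-side one. A minimal sketch with hypothetical function names, using the Server Hello values from the string above:

```python
import hashlib


def ja3s_string(version: int, cipher: int, extensions) -> str:
    """Build the JA3S string: version, accepted cipher, extension types."""
    return f"{version},{cipher}," + "-".join(str(e) for e in extensions)


def ja3s_hash(ja3s: str) -> str:
    """MD5 hex digest of the JA3S string."""
    return hashlib.md5(ja3s.encode("ascii")).hexdigest()


server_example = ja3s_string(771, 49199, [65281, 0, 11, 35, 5, 23, 16])
print(server_example)
print(ja3s_hash(server_example))
```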

Different servers and clients use different SSL/TLS features, which gives us scope to identify them by their SSL/TLS fingerprints. Custom software used to deliver malware over SSL often uses a specific set of SSL/TLS features (at times in an order different from the one normally used by browsers or by the standard SSL/TLS stacks used in securing the network). This, in turn, facilitates detection of the communication with the malware through its SSL/TLS fingerprint, which will be distinctive. Combining JA3 and JA3S fingerprints will sometimes help isolate anomalous connections over SSL/TLS.
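One simple way to use the combined fingerprints is to count how often each (JA3, JA3S) pair occurs in observed traffic and surface the rare ones for manual review. The sketch below is only one possible heuristic, not a method prescribed by JA3: the record layout (dictionaries with `ja3` and `ja3s` keys), the threshold and the function name are all assumptions.

```python
from collections import Counter


def rare_fingerprint_pairs(connections, max_count=1):
    """Return connections whose (JA3, JA3S) pair occurs at most max_count times."""
    counts = Counter((c["ja3"], c["ja3s"]) for c in connections)
    return [c for c in connections
            if counts[(c["ja3"], c["ja3s"])] <= max_count]
```

A rare pair is not proof of malicious activity; it only narrows down which connections deserve a closer look.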

Cisco’s BSD-licensed software package Joy [6] also provides an SSL/TLS fingerprinting methodology that extracts SSL/TLS features from the ‘Client Hello’ packet. Like JA3, this fingerprint works for all versions of SSL/TLS. The following SSL/TLS features are extracted from the ‘Client Hello’ packet.

The TLS version, the list of all cipher suites including the GREASE values, and all extension types including the GREASE values are used. For a few extensions, the whole content of the extension is used: supported_groups, ec_point_formats, signature_algorithms, supported_versions, status_request, application_layer_protocol_negotiation and psk_key_exchange_modes.

The main difference from JA3 is that JA3 does not use the whole content of the extensions mentioned above. The way the string is formed is also different: JA3 uses decimal values while forming the string, whereas Cisco Joy uses hexadecimal values. Like JA3, the string thus formed is MD5 hashed to produce the SSL/TLS fingerprint. For the ‘Client Hello’ in Figure [4a], the concatenated string, as per Cisco Joy’s fingerprinting mechanism, will look like the following.

(0303)(aaaa130113021303c02bc02fc02cc030cca9cca8c013c014009c009d002f0035000a)((2a2a)(0000)(0017)(ff01)(000a000a0008eaea001d00170018)(000b00020100)(0023)(0010000e000c02683208687474702f312e31)(000500050100000000)(000d00140012040308040401050308050501080606010201)(0012)(0033)(002d00020101)(002b000b0a8a8a0304030303020301)(001b)(8a8a)(0015))

The MD5 hash of this string 2fe06655aa385fa426b82666cc7331c0 is considered as the client-side SSL/TLS fingerprint. Cisco Joy also provides a JSON file, where each entry in the file refers to different ways of using SSL / TLS features (from ‘Client Hello’) by different applications or operating systems, thereby helping us to detect with some probability the application, process, browser or operating system [10] of the computing device in use. This is a huge set of information for any network forensic professional.
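The structure of the Cisco Joy string shown above can be reproduced with a short sketch. This is an illustration of the format as it appears in this whitepaper, not code taken from the Joy package: the function name is hypothetical, and the set of extensions whose full content (type, length and data) is included follows the list given earlier, keyed by extension type number.

```python
# Extension types whose full content is included in the string.
FULL_CONTENT_EXTENSIONS = {
    0x000A,  # supported_groups
    0x000B,  # ec_point_formats
    0x000D,  # signature_algorithms
    0x002B,  # supported_versions
    0x0005,  # status_request
    0x0010,  # application_layer_protocol_negotiation
    0x002D,  # psk_key_exchange_modes
}


def joy_style_string(version, cipher_suites, extensions) -> str:
    """Build a parenthesised hex string from Client Hello features.

    extensions is a list of (type, raw data bytes) in wire order.
    """
    ext_parts = []
    for ext_type, ext_data in extensions:
        if ext_type in FULL_CONTENT_EXTENSIONS:
            # type + 2-byte length + data, all in hex
            ext_parts.append(
                f"({ext_type:04x}{len(ext_data):04x}{ext_data.hex()})")
        else:
            ext_parts.append(f"({ext_type:04x})")   # type only
    return (f"({version:04x})"
            + "(" + "".join(f"{c:04x}" for c in cipher_suites) + ")"
            + "(" + "".join(ext_parts) + ")")
```

Unlike the JA3 sketch, nothing is filtered here, so GREASE values remain part of the string.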

Conclusion:

With the growth of cyber-assisted crime, there is an increasing improvement in incident management capabilities to detect and prevent misuse of systems. Network forensics has had huge success in providing that insight to pinpoint the trail of any malicious or unlawful activity. While SSL/TLS, on the one hand, is trying to secure our network, on the other hand, its use by adversaries has added complexity in extracting meaningful information from encrypted network traffic. Therefore, extracting information from SSL / TLS protocol can help gain insight by coupling it with information extracted from plain text protocols to achieve the desired goal of network forensics. We have focused entirely on SSL/TLS protocol for analysis, and the method of decrypting the application traffic by an interceptor in the middle for forensics analysis has been kept outside the scope of this whitepaper.

References:

[1] https://www.ietf.org/rfc/rfc2246.txt
[2] https://www.ietf.org/rfc/rfc4346.txt
[3] https://www.ietf.org/rfc/rfc5246.txt
[4] https://www.ietf.org/rfc/rfc8446.txt
[5] https://engineering.salesforce.com/tls-fingerprinting-with-ja3-and-ja3s-247362855967
[6] https://github.com/cisco/joy
[7] G. Shrivastava, “Network forensics: Methodical literature review,” 2016 3rd International Conference on Computing for Sustainable Global Development (INDIACom), New Delhi, 2016, pp. 2203-2208.
[8] Suleman Khan, Abdullah Gani, Ainuddin Wahid Abdul Wahab, Muhammad Shiraz, Iftikhar Ahmad, “Network forensics: Review, taxonomy, and open challenges,” Journal of Network and Computer Applications, Volume 66, 2016, Pages 214-235, ISSN 1084-8045.
[9] Soghoian C., Stamm S. (2012) “Certified Lies: Detecting and Defeating Government Interception Attacks against SSL (Short Paper)”, In: Danezis G. (eds) Financial Cryptography and Data Security. FC 2011. Lecture Notes in Computer Science, vol 7035.
[10] M. Husák, M. Cermák, T. Jirsík and P. Celeda, “Network-Based HTTPS Client Identification Using SSL/TLS Fingerprinting,” 2015 10th International Conference on Availability, Reliability and Security, Toulouse, 2015, pp. 389-396, doi: 10.1109/ARES.2015.35.