Type-safe, TPM Backed TLS

December 4, 2012

Kun Li, Michael Maass, and Mike Ralph

Contact: {kunli, mmaass, mralph}@andrew.cmu.edu
http://www.cs.cmu.edu/~mmaass/tpm_tls/report.html

1. Introduction

Transport Layer Security (TLS) is a protocol used to establish an authenticated channel between a client and a server, with confidentiality and integrity assured for messages sent over the channel. Any application layer protocol can be used over TLS, which makes it possible to generically secure connections used for such diverse protocols as HTTP, remote shells, and much more. This combination of flexibility and security has caused TLS to become the de facto security protocol for the web and for many other network applications.

TLS relies on asymmetric cryptography to implement authentication and to bootstrap the confidentiality and integrity properties. Typically a key pair is created that consists of a private key and a public key that are mathematically related. The private key is a secret that must be closely guarded, but the public key can be shared with anyone who would like to securely communicate with the private key holder. Anything that is encrypted with the private key may only be decrypted using the public key and vice versa. This property makes asymmetric cryptography useful both for authentication and for exchanging symmetric keys (i.e., keys where parties use the same secret to encrypt and decrypt messages).

Towards the beginning of a TLS session, one or both parties use asymmetric cryptography to identify themselves to the other. Typically only the server identifies itself to the client, but TLS also supports mutual authentication, where the server identifies itself to the client and the client identifies itself to the server. In either case, if the parties are successfully authenticated, the server’s key pair is then typically used to securely exchange a secret that will be used by both parties to generate a symmetric session key. The symmetric key is then used to assure the confidentiality and integrity of all further communications between the client and the server over the channel. Asymmetric cryptography is too slow to implement confidentiality and integrity directly, which explains why it is only used to bootstrap those properties.

In the typical case we are concerned with the secrecy of both the private key(s) and session key. If a private key is stolen, the thief can impersonate the private key holder. For example, if a bank uses TLS via HTTPS to secure its online banking website, stealing the banking site’s private key allows anyone to impersonate it. This would allow an attacker to put up a fake site that looks exactly like the bank’s real site. The attacker can then trick users into using the fake site with little, if any, evidence to the users that they are not interacting with their real bank. The fake site can be used to steal the user’s credentials or perform many other malicious actions. If the session key is stolen the attacker can intercept and modify any messages exchanged on the channel that is using that key.

Defenders are typically left with a dilemma in defending their keys. In most cases both keys must be accessible to the defender’s software-only implementation of TLS. This primarily leaves software-only options available for defending the keys, which essentially leaves the defender with the permissions provided by their Operating System and any confidentiality controls (e.g. cryptography) the application using TLS has access to. As a result, anyone who has compromised the Operating System or an application running at the same privilege level as the application that’s using TLS can steal both the private key and the session key. Furthermore, common implementations of TLS, such as OpenSSL [1], are typically not written in languages that are type safe, thus they are also subject to compromise via common vulnerabilities such as buffer and integer overflows.

We analyze one solution to defend the keys from software-based attacks and to introduce type safety by contributing the following:

A modified version of Java’s implementation of TLS that makes use of a system’s Trusted Platform Module (TPM) for key exchange.
An analysis of the performance of key exchange in Java using the TPM.
An analysis of the overhead involved in using a technology related to TPMs called “late launch” to isolate symmetric key operations.

This solution comprises a type safe implementation of TLS that is backed by the TPM.

We introduce the subsets of TLS and the TPM that we are concerned with more thoroughly in section 2. In section 3 we discuss our solution for securing the keys and introducing type safety, which we evaluate in section 4. We discuss related work and alternative solutions in section 5 before concluding in section 6.

2. Background

Our approach to addressing cryptographic key security in systems that utilize TLS is dependent on utilizing hardware features such as the TPM and late launch. To analyze the performance implications of this solution, we restricted our scope to a subset of TLS, the TPM, and TPM-related hardware that we overview here.

2.1 TLS Overview

TLS is an Internet Engineering Task Force standard documented in RFC 5246 [2]. We have scoped our approach to one of the most common use cases for TLS: creating a communication channel where confidentiality and integrity are assured at the transport layer between a client and an authenticated server. Figure 1 shows the protocol used to establish a TLS channel in our target use case with three phases of the protocol highlighted.

Figure 1. A protocol sequence diagram showing the steps involved in each phase of a simple server authenticated TLS session. The negotiation phase has black arrows, the transition phase has red arrows, and the application phase has a green arrow.

In the negotiation phase the client uses the ClientHello message to tell the server the highest version of TLS it supports and what asymmetric, symmetric, and hashing algorithm suites it supports. The server responds with the ServerHello message to select which version of TLS and which CipherSuite to use based on what the client supports. The server then sends its certificate to the client and a ServerHelloDone message to conclude the server’s steps in the negotiation. The client will use the server’s certificate to authenticate the server. It will also use the public key in the certificate to encrypt the premaster secret that is sent to the server in the ClientKeyExchange message, meaning only the server (the private key holder) can decrypt the secret. The transition phase is then used to securely transition the protocol into the application phase. In the application phase, the client and the server exchange application layer specific messages that are encrypted using a symmetric key derived using the premaster secret.
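The ClientKeyExchange step described above can be sketched in plain Java as follows. This is a minimal illustration, assuming a software RSA key pair and the standard JCA Cipher class; the class and method names are hypothetical and are not taken from the JSSE source. The 48-byte layout (two version bytes followed by 46 random bytes) follows RFC 5246 [2].

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.SecureRandom;
import javax.crypto.Cipher;

// Hypothetical sketch of RSA-based key exchange; not the JSSE implementation.
final class PremasterExchangeSketch {

    // Client side: 2 bytes of protocol version followed by 46 random bytes.
    static byte[] newPremasterSecret(int major, int minor) {
        byte[] secret = new byte[48];
        new SecureRandom().nextBytes(secret);
        secret[0] = (byte) major;
        secret[1] = (byte) minor;
        return secret;
    }

    // Client side: encrypt the premaster secret with the key from the server's certificate.
    static byte[] encryptForServer(byte[] premaster, PublicKey serverKey) {
        try {
            Cipher rsa = Cipher.getInstance("RSA/ECB/PKCS1Padding");
            rsa.init(Cipher.ENCRYPT_MODE, serverKey);
            return rsa.doFinal(premaster);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Server side: only the private key holder can recover the secret.
    static byte[] decryptOnServer(byte[] encrypted, PrivateKey serverKey) {
        try {
            Cipher rsa = Cipher.getInstance("RSA/ECB/PKCS1Padding");
            rsa.init(Cipher.DECRYPT_MODE, serverKey);
            return rsa.doFinal(encrypted);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Stand-in for the server's key pair; in our approach the private half would live in the TPM.
    static KeyPair newServerKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In this sketch the private key is an ordinary in-memory object, which is exactly the exposure that delegating the decryption to the TPM is meant to remove.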

We are primarily worried about the ClientKeyExchange step in the negotiation phase and the application phase. We can improve the security of a system that uses TLS by performing the ClientKeyExchange using a key pair where the private key is protected by a TPM and by late launching the code required to derive the symmetric key and perform operations using that key. This effectively isolates both the private and the symmetric key by taking advantage of the TPM and TPM-related hardware features.
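To make the derivation step concrete, the following sketch shows how the master secret and key block could be computed from the premaster secret using the pseudorandom function defined for TLS 1.2 in RFC 5246 [2]. It is an assumption that the TLS 1.2 construction (HMAC-SHA256) applies; earlier TLS versions use a different PRF. The class and method names are hypothetical. This is the kind of code that would need to run inside a late launched environment.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch of the TLS 1.2 key derivation from RFC 5246, sections 5, 6.3 and 8.1.
final class MasterSecretSketch {

    // PRF(secret, label, seed) = P_SHA256(secret, label + seed), truncated to 'length' bytes.
    static byte[] prf(byte[] secret, String label, byte[] seed, int length) {
        byte[] labelAndSeed = concat(label.getBytes(StandardCharsets.US_ASCII), seed);
        try {
            Mac hmac = Mac.getInstance("HmacSHA256");
            hmac.init(new SecretKeySpec(secret, "HmacSHA256"));
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] a = labelAndSeed;                 // A(0) = seed
            while (out.size() < length) {
                a = hmac.doFinal(a);                 // A(i) = HMAC(secret, A(i-1))
                hmac.update(a);
                hmac.update(labelAndSeed);
                byte[] block = hmac.doFinal();       // HMAC(secret, A(i) + seed)
                out.write(block, 0, block.length);
            }
            return Arrays.copyOf(out.toByteArray(), length);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // master_secret = PRF(pre_master_secret, "master secret", client_random + server_random)[0..47]
    static byte[] masterSecret(byte[] premaster, byte[] clientRandom, byte[] serverRandom) {
        return prf(premaster, "master secret", concat(clientRandom, serverRandom), 48);
    }

    // key_block = PRF(master_secret, "key expansion", server_random + client_random)
    static byte[] keyBlock(byte[] master, byte[] clientRandom, byte[] serverRandom, int length) {
        return prf(master, "key expansion", concat(serverRandom, clientRandom), length);
    }

    static byte[] concat(byte[] x, byte[] y) {
        byte[] out = Arrays.copyOf(x, x.length + y.length);
        System.arraycopy(y, 0, out, x.length, y.length);
        return out;
    }
}
```

Both parties run the same computation over the same inputs, so neither the master secret nor the session keys ever cross the network.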

2.2 TPM Overview

The TPM is a secure cryptoprocessor that is specified by the Trusted Computing Group [3]. The TPM provides secure cryptographic key generation, a random number generator, sealed storage, and remote attestation. Programmers interact with the TPM via a standardized API called the Trusted Software Stack (TSS). We primarily make use of the TPM’s ability to protect a private key and to perform cryptographic operations in hardware, thus requiring an attacker to break the hardware’s security features to steal our private key.

TPMs contain a Storage Root Key (SRK) whose private key never leaves the TPM. We utilize the TPM to create the asymmetric key pair used to identify the server. The private key in our key pair is wrapped using the SRK, which means that it can never be decrypted outside of a TPM. However, it can be migrated from one TPM to another using a wrapping key that is stored in a different TPM. This allows the key pair to be shared amongst multiple servers (e.g. in a load balanced environment), but it does not allow the key to be exposed in plaintext outside of a TPM. Whenever we must perform a private key operation, we delegate the operation to the TPM.

While the TPM adequately protects our private key using its standard features, we must make use of additional hardware features to protect the symmetric key generated for a TLS session because the TPM does not perform symmetric cryptography.

2.3 Late Launch

Both AMD and Intel provide a feature known as late launch via Secure Virtual Machine and Trusted eXecution Technology respectively [4, 5]. Late launch is intended to launch secure kernels or virtual machine monitors at arbitrary times in a manner that is protected against software-based attacks and that is measured by the TPM.

Essentially, software running in kernel-mode may execute a privileged instruction to pass to the CPU an address in physical memory. When this instruction is executed, direct memory access is disabled to physical memory pages at the specified address, interrupts and debug access are fully disabled, the operating system is disabled, and the processor enters 32-bit protected mode. This allows code at the physical memory address to be executed outside of the operating system and free from potential attacks perpetrated by other software running on the system.

By late launching the code required to generate a TLS session’s symmetric key and the encryption and decryption operations performed with that key, we can effectively isolate the symmetric key from software-based attacks.

3. Approach

To defend the keys used by an application that depends on TLS, we propose using the TPM to isolate key exchange and applying late launch to the task of isolating symmetric key operations. To further address the issue of software-based attacks, we propose using type safe programming languages to implement all operations that aren’t in kernel-mode and that don’t require invoking kernel-mode code. This section details how we modified Java to isolate key exchange, how symmetric key operations can be isolated, and how we measured the impact of both of these isolation approaches. We only address isolation for CipherSuites that use RSA for key exchange, AES in CBC mode for cryptography in the application phase (see section 2.1), and the SHA family of hashing functions. We also only consider the case where TLS is being used to create a secure connection between a client and an authenticated server.
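As a point of reference for the symmetric operations in scope, the sketch below encrypts and decrypts a message with AES-128 in CBC mode using the standard JCA classes. It is illustrative only: the class name is hypothetical, the IV is simply prepended to the ciphertext, and PKCS5 padding is used, whereas the TLS record layer defines its own padding and also adds a MAC.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch of the AES-CBC operations that would need to be isolated.
final class SessionCipherSketch {
    private static final int IV_LENGTH = 16; // AES block size in bytes

    // Returns IV || ciphertext for the given 16-byte key.
    static byte[] encrypt(byte[] key, byte[] plaintext) {
        try {
            byte[] iv = new byte[IV_LENGTH];
            new SecureRandom().nextBytes(iv);
            Cipher aes = Cipher.getInstance("AES/CBC/PKCS5Padding");
            aes.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
            byte[] ciphertext = aes.doFinal(plaintext);
            byte[] out = new byte[IV_LENGTH + ciphertext.length];
            System.arraycopy(iv, 0, out, 0, IV_LENGTH);
            System.arraycopy(ciphertext, 0, out, IV_LENGTH, ciphertext.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Expects the IV || ciphertext layout produced by encrypt().
    static byte[] decrypt(byte[] key, byte[] ivAndCiphertext) {
        try {
            Cipher aes = Cipher.getInstance("AES/CBC/PKCS5Padding");
            aes.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                    new IvParameterSpec(ivAndCiphertext, 0, IV_LENGTH));
            return aes.doFinal(ivAndCiphertext, IV_LENGTH, ivAndCiphertext.length - IV_LENGTH);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Every call in this sketch needs the raw key bytes in process memory, which is why the session key is exposed to any attacker at the same privilege level.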

3.1 Isolating Key Exchange

Java ships with a library known as the Java Secure Socket Extension (JSSE) [6] that contains an implementation of TLS developed by Sun. The JSSE falls under Java’s platform security suite and is tightly coupled with the Java Cryptography Architecture (JCA) [7].

The JCA specifies interfaces for classes that implement ciphers, key stores, hashing algorithms, and many other cryptographic primitives and algorithms. The JCA also specifies how classes that implement these interfaces are combined into providers that supply functioning cryptographic APIs to programmers. Java ships with implementations of most common cryptographic algorithms in a provider developed by Sun. We modified the Sun implementation of TLS to back it with the TPM, but an optimal solution would require creating a provider that can utilize the TPM for key storage and management as well as encryption and decryption in a way that is compliant with the JCA.

Our modifications constituted patches to two classes in the Java Development Kit (JDK). The JDK contains the JSSE, JCA providers, and many other libraries and tools that help developers create richly featured Java applications. The classes we patched are in the sun.security.ssl package. They directly implement ClientKeyExchange via RSA and the server-side of the TLS negotiation (see section 2.1).

Typically the server side is configured to use a key store the Sun JCA provider supports (e.g. the Java Key Store (JKS) format) and that key store is loaded with a key pair that is supported by the selected CipherSuite. When RSA is being used, the Sun TLS implementation loads the configured key store in a function called setupPrivateKeyAndChain in ServerHandshaker.java (the Sun implementation of the server-side aspects of the TLS negotiation). This function finds a key pair in the key store that can be used with RSA and loads the accompanying private key and certificate chain. The server sends the client the certificate chain and will later receive an encrypted premaster secret from the client during key exchange. The client and server-sides of RSA-based key exchange are implemented in RSAClientKeyExchange.java. The server-side code decrypts the premaster secret using the private key loaded by setupPrivateKeyAndChain earlier.
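The key store lookup described above can be approximated with the public KeyStore API as follows. This is a sketch of the general idea rather than the actual body of setupPrivateKeyAndChain; the class, the Entry holder, and the method names are hypothetical.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.GeneralSecurityException;
import java.security.Key;
import java.security.KeyStore;
import java.security.PrivateKey;
import java.security.cert.Certificate;
import java.util.Enumeration;

// Hypothetical sketch of finding an RSA private key and its certificate chain in a JKS key store.
final class KeyStoreLookupSketch {

    // What the server-side handshake needs from the key store.
    static final class Entry {
        final PrivateKey privateKey;
        final Certificate[] chain;

        Entry(PrivateKey privateKey, Certificate[] chain) {
            this.privateKey = privateKey;
            this.chain = chain;
        }
    }

    // Loads a JKS key store from 'path'; a null path yields an empty key store.
    static KeyStore load(String path, char[] password) {
        try {
            KeyStore store = KeyStore.getInstance("JKS");
            if (path == null) {
                store.load(null, password);
            } else {
                try (InputStream in = new FileInputStream(path)) {
                    store.load(in, password);
                }
            }
            return store;
        } catch (GeneralSecurityException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the first RSA key entry with its chain, or null if none exists.
    static Entry findRsaEntry(KeyStore store, char[] password) {
        try {
            Enumeration<String> aliases = store.aliases();
            while (aliases.hasMoreElements()) {
                String alias = aliases.nextElement();
                if (!store.isKeyEntry(alias)) {
                    continue; // certificate-only entry
                }
                Key key = store.getKey(alias, password);
                if (key instanceof PrivateKey && "RSA".equals(key.getAlgorithm())) {
                    return new Entry((PrivateKey) key, store.getCertificateChain(alias));
                }
            }
            return null;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Note that the lookup hands back the private key as an ordinary object, so a key store that holds only a certificate (as is the case when the private key lives in the TPM) yields no usable entry.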

To defend the server’s private key, we first had to create an RSA key pair and a self-signed certificate managed by the TPM. However, the TLS implementation assumes that a standard key store is being used, and standard key stores typically provide a way of accessing or referencing the private key. Due to time constraints, we did not implement a custom class that would have handled the TPM as a key store properly. This is an issue because setupPrivateKeyAndChain assumes that it will find a key pair in a standard key store that can be used for RSA and that it will find the correct certificate chain based on the key pair it found. We don’t have a key pair in a standard key store because the private key is stored in the TPM, so we only have the self-signed certificate to place in a standard key store.

We modified setupPrivateKeyAndChain to prevent it from trying to find an appropriate key pair and load the private key. Our modification causes the function to instead load a hardcoded certificate chain from the key store that contains only our self-signed certificate. This allowed us to use the TPM as a key store during key exchange.

RSAClientKeyExchange.java contains an overloaded constructor with implementations meant for the server-side and for the client-side. We modified the server-side constructor to connect to the TPM and decrypt the premaster secret using the TPM and our RSA key stored inside of it. Previously, this constructor would have used the private key loaded by the unmodified setupPrivateKeyAndChain function to decrypt the secret. Interactions with the TPM were implemented using a TSS written in Java called jTSS [8]. While these modifications do back key exchange with the TPM, thus defending the private key from software-based attacks, there is a serious limitation.

Instead of implementing a key store class for the JCA KeyStore interface, we put all of the code for connecting to the TPM, loading the correct key, and performing the decryption operations in the constructor we mentioned modifying above. In the standard case the key store would have only been loaded once, but our modification connects to the TPM every single time a client connects. This slows down the process by a substantial amount, which we account for in our analysis in section 4. To properly implement TPM backed TLS, a JCA provider would need to be created that implements a key store class and RSA classes that use the TPM instead of modifying the JDK directly.
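The difference between connecting per client and connecting once can be illustrated with a small caching wrapper. The TpmSession and TpmSessionFactory interfaces below are hypothetical placeholders, not jTSS types; a real JCA provider would hide the equivalent logic behind its key store and RSA cipher classes.

```java
// Hypothetical sketch: open the TPM connection once and reuse it for every client.
final class CachedTpmDecryptor {

    // Placeholder for an open TPM connection with the server key already loaded.
    interface TpmSession {
        byte[] decrypt(byte[] ciphertext);
    }

    // Placeholder for the expensive connect-and-load-key step.
    interface TpmSessionFactory {
        TpmSession open();
    }

    private final TpmSessionFactory factory;
    private TpmSession session; // created lazily on first use

    CachedTpmDecryptor(TpmSessionFactory factory) {
        this.factory = factory;
    }

    // Synchronized because TPM access is serialized anyway; only the first caller pays to connect.
    synchronized byte[] decryptPremaster(byte[] encryptedPremaster) {
        if (session == null) {
            session = factory.open();
        }
        return session.decrypt(encryptedPremaster);
    }
}
```

With this structure the connection cost is paid by the first client only, while each later client still pays for its own TPM decryption.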

We compiled the modified JDK and used it to measure the performance of our approach. Please see appendix C for more information about compiling and using the modified JDK. Appendix D goes into more detail about how we created an RSA key pair and self-signed certificate managed by the TPM. While our JDK modifications addressed software-based attacks against the private key, we needed to turn to a non-Java technology to isolate session keys.

3.2 Isolating Symmetric Key Operations

While the TPM can be used directly to isolate key exchange, isolating session key operations in our case would require instead using a technology related to the TPM called “late launch”. Late launch can be used to isolate the calculation of the session key and the key’s use for encrypting and decrypting messages in TLS sessions. The session key can be sealed to the TPM to prevent its use by any code other than the late launched code that implements those cryptographic operations. To get a sense for the amount of overhead that is involved in late launching code, we experimented with Intel’s TXT and the Flicker library [11]. Flicker allows us to late launch independent chunks of code known as Pieces of Application Logic (PAL) without having to worry about the kernel mode aspects of doing so.

Late launching code is heavily dependent on kernel mode support, so experimenting with Flicker begins by installing a Linux kernel module. We inserted the module manually at each boot. After loading the module, a shell script loads the PAL into the Flicker sysfs directory. The script also adds the PAL’s input to the Flicker sysfs. At this point, the PAL is ready to be late launched. The OS and all other tasks are suspended and the PAL is allowed to run in complete isolation. After the PAL runs to completion, the system tasks are restored, control flow is returned to the script, and the script retrieves the output from the Flicker sysfs.

By using AES PALs and a PAL for calculating the session key, we can isolate the session key even from the operating system. The PALs would have to be integrated with Java’s implementation of TLS by replacing calls to the JDK’s cipher classes with code that instead launches the correct shell script. This type of hardware isolation makes it very difficult for malicious software or a malicious operator to steal the session key.
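One way the integration could look from the Java side is sketched below: a helper that runs an external command, feeds it input, and collects its output. The command passed in is an assumption standing in for whatever script drives the Flicker session; no Flicker paths or script names are implied, and the class name is hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.List;

// Hypothetical sketch of delegating an operation to an external launcher script.
final class ScriptLauncherSketch {

    // Runs 'command', writes 'input' to its stdin, and returns everything it prints to stdout.
    // Intended for small inputs: all input is written before any output is read.
    static byte[] run(List<String> command, byte[] input) {
        try {
            ProcessBuilder builder = new ProcessBuilder(command);
            builder.redirectError(ProcessBuilder.Redirect.INHERIT);
            Process process = builder.start();
            try (OutputStream stdin = process.getOutputStream()) {
                stdin.write(input);
            }
            ByteArrayOutputStream collected = new ByteArrayOutputStream();
            try (InputStream stdout = process.getInputStream()) {
                byte[] buffer = new byte[4096];
                int n;
                while ((n = stdout.read(buffer)) != -1) {
                    collected.write(buffer, 0, n);
                }
            }
            int exitCode = process.waitFor();
            if (exitCode != 0) {
                throw new IllegalStateException("command exited with status " + exitCode);
            }
            return collected.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

Spawning a process per message adds its own cost on top of the late launch itself, so a production design would likely want a tighter binding than a shell script.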

3.3 Measuring Performance

To measure the performance of both the original TLS protocol and our TPM-backed version, we implemented a simple TLS server and client pair in Java. The server uses an SSLServerSocketFactory to create a TLS server socket with only the “TLS_RSA_WITH_AES_128_CBC_SHA” CipherSuite enabled. Java’s command line tool keytool was used to generate an RSA key pair and a self-signed certificate (.jks file) for TLS. The server blocks waiting for client connections, spawning a new thread to deal with each client. Our server, inspired by example code from [9], simply echoes any text the client types in reverse order, but it could easily be modified to serve HTTPS responses or any other application layer protocol. The client simply connects a socket to the server port and starts a TLS handshake to ensure the security of the connection. Message encryption was confirmed using Wireshark and the -Djavax.net.debug=all Java runtime option.
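A server along these lines might look like the sketch below. It is not our test server: the class and method names are hypothetical, "reverse order" is assumed to mean reversing the characters of each line, and the key store is assumed to be supplied through the standard javax.net.ssl.keyStore and javax.net.ssl.keyStorePassword system properties.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.Socket;
import javax.net.ssl.SSLServerSocket;
import javax.net.ssl.SSLServerSocketFactory;

// Hypothetical sketch of a threaded TLS server that echoes each line reversed.
final class ReverseEchoServerSketch {

    static String reverse(String line) {
        return new StringBuilder(line).reverse().toString();
    }

    // Application layer protocol: read lines until end of stream, write each one back reversed.
    static void echoReversed(BufferedReader in, PrintWriter out) {
        try {
            String line;
            while ((line = in.readLine()) != null) {
                out.println(reverse(line));
                out.flush();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static void handle(Socket client) {
        try (Socket socket = client;
             BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
             PrintWriter out = new PrintWriter(socket.getOutputStream())) {
            echoReversed(in, out); // the TLS handshake happens on first read or write
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Blocks forever, spawning one thread per accepted client.
    static void serve(int port) throws IOException {
        SSLServerSocketFactory factory = (SSLServerSocketFactory) SSLServerSocketFactory.getDefault();
        try (SSLServerSocket server = (SSLServerSocket) factory.createServerSocket(port)) {
            server.setEnabledCipherSuites(new String[] {"TLS_RSA_WITH_AES_128_CBC_SHA"});
            while (true) {
                final Socket client = server.accept();
                new Thread(new Runnable() {
                    public void run() {
                        handle(client);
                    }
                }).start();
            }
        }
    }
}
```

Restricting the enabled CipherSuites to a single entry forces every handshake through RSA key exchange, which is the code path our JDK patches touch.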

To analyze the performance of the TLS server, we inserted timing code into various locations in the server, client, and the modified JDK. Specifically, we used the System.nanoTime() function to measure the performance of the server initiation, key store initialization, key exchange, and the overall time to establish a session. This is appropriate because System.nanoTime() uses hardware timers to achieve high precision, and the operations we are timing have latencies on the order of hundreds of milliseconds to seconds. We measured each of these times using both the base Java TLS protocol and our TPM-backed TLS protocol for one and two client connections, and repeated these measurements 10 times to account for statistical variation. We also investigated the effect of serialized access to the TPM by spawning five client threads in succession to connect to our server.
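The timing pattern amounts to bracketing an operation with System.nanoTime() calls and averaging over repeated runs. A minimal sketch, with a hypothetical helper name and the timed operation passed in as a Runnable, is:

```java
// Hypothetical sketch of averaging wall-clock latency over repeated runs.
final class HandshakeTimerSketch {

    // Runs 'operation' the given number of times and returns the mean latency in milliseconds.
    static double averageMillis(Runnable operation, int repetitions) {
        if (repetitions <= 0) {
            throw new IllegalArgumentException("repetitions must be positive");
        }
        long totalNanos = 0;
        for (int i = 0; i < repetitions; i++) {
            long start = System.nanoTime();
            operation.run();
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1e6 / repetitions;
    }
}
```

Only differences between two nanoTime() readings are meaningful, which is why the helper never interprets either reading on its own.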

Since Flicker suspends the OS, kernel, and all other tasks (including interrupt handlers) while executing its PAL code, system time stops. In particular, if Flicker is used for all the symmetric key operations and a large number of messages are exchanged, many Flicker sessions will be launched close together. This is known not just to stop system time, but even to make it go in reverse (the system clock is no longer monotonically increasing). This makes it difficult to get precise timing on the overhead of using late launch.

To solve this, we made use of a time server on the local network. A local server seems reasonable because the overhead should be fairly low and the communication latency should be fairly regular. We modified the shell script used to start the Flicker session to wait until just before the Flicker session actually launches (after the kernel module, the PAL, and the inputs are all loaded) and contact the time server for its system time. We record that time, and then launch the Flicker session. Immediately upon returning control to the script (even before retrieving the outputs), we contact the server again for its system time. By running the same PAL many times and removing outliers, we can find a reasonably accurate execution time for the late launched code. Additionally, we made an equal number of test runs with a script that requests the time from the local server twice in immediate succession. By averaging the differences between these two times (again removing outliers), we obtain a reasonable estimate of how much of the time measured for the Flicker session was actually overhead from contacting the time server. The difference between the two averages is our estimate of how long the late launch session took to run.
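The arithmetic behind this estimate can be sketched as follows. The outlier rule is an assumption made for illustration (dropping a fixed number of the smallest and largest samples before averaging), and the class and method names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical sketch: late launch time = mean(Flicker runs) - mean(time-server-only runs).
final class LateLaunchTimingSketch {

    // Mean of the samples after discarding 'dropPerSide' smallest and largest values.
    static double trimmedMean(double[] samples, int dropPerSide) {
        if (dropPerSide < 0 || samples.length <= 2 * dropPerSide) {
            throw new IllegalArgumentException("not enough samples to trim");
        }
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        double sum = 0;
        for (int i = dropPerSide; i < sorted.length - dropPerSide; i++) {
            sum += sorted[i];
        }
        return sum / (sorted.length - 2 * dropPerSide);
    }

    // 'withFlicker': time server differences measured around a Flicker session.
    // 'baseline': differences between two back-to-back time server requests.
    static double estimateLateLaunch(double[] withFlicker, double[] baseline, int dropPerSide) {
        return trimmedMean(withFlicker, dropPerSide) - trimmedMean(baseline, dropPerSide);
    }
}
```

Subtracting the baseline removes the average cost of the two time server round trips, leaving an estimate of the time spent inside the late launch session itself.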

We do not measure the performance of late launching code and backing key exchange with the TPM together or even on the same system. Currently, the TSS required to run Flicker does not interoperate with jTSS.

4. Results

For the TLS server/client pair, performance was measured on a Lenovo T420 with an Intel Core i7 processor, 4 gigabytes of RAM, and running a modified OpenJDK 6 with Ubuntu 12.04. TPM access was handled by calls to the jTSS engine embedded in JSSE source code, which was inserted via a patch to the existing OpenJDK classes.

For late launch, we used a Lenovo T400 with a Core 2 Duo P8400 processor, running Ubuntu 12.04 with a 32-bit non-PAE kernel and 2 GB of RAM. The TPM late launch sessions were executed through the use of Flicker version 0.7 with Trousers installed from the Ubuntu repositories.

4.1 Key Exchange via the TPM

On the server side, three steps in key exchange were identified and profiled: server initialization, key initialization, and the key agreement phase. The server initialization time includes the time needed to setup the RSA cipher suite and initialize the TPM, if applicable. The key initialization phase is the time to initialize the asymmetric key pair, while the key agreement phase is the time to decrypt keys from the client. Together, these components form the basis for the TLS handshake. While they could be broken down further into specific cryptographic operations or even CPU instructions, this level of abstraction best represents the steps that are most different between using the TPM and not. On the client side, the overall time taken to establish a connection was measured, since that is typically all the client will see. This overall time mostly consists of the key exchange components described above, with the remaining time attributed to miscellaneous operations such as socket creation, communication latency, TCP stack processing, etc. We tested our server with both one and two clients, and timed each key exchange component for both the default TLS handshake and the TPM-backed version. Figure 2 shows two views of the averaged results over 10 such tests.

Figure 2. Breakdown of key-exchange performance for various server/client configurations.

The leftmost two bars show the results for a single server-client exchange. In the base implementation case with no TPM, the key agreement phase takes up a small portion of the connection setup, while server initialization and key initialization are negligible. With the TPM, however, the overall connection latency is more than ten times slower, and we see that server initialization, key initialization, and key agreement each make up a significant fraction of the handshake. Server initialization now requires connecting to the TPM, which takes on the order of a second. Using TPM operations to encrypt the private key and establishing the key agreement each take a similar amount of time. The overall latency is over three seconds on our system, which would be very noticeable to a user.

The next four bars show the key exchange overhead for the case of the server serving two clients. The default TLS speeds up slightly with additional clients because the server only has to initialize the keystore one time, so each successive client does not see the overhead for the key exchange. With the TPM, the overhead for the TLS connection is reduced slightly for the second client because the jTSS takes less time to connect to the TPM. However, because the JDK does not natively support using the TPM as a key store, as discussed in section 3, the process of loading the private key must be performed for each client. In the theoretical case shown in the last bar, we examine the performance if the overhead is completely removed for successive clients, which would allow performance comparable to regular TLS for a well-behaved client access pattern. This can be realistically achieved by creating a JCA provider that contains a TPM backed key store and RSA implementation. In reality, the key agreement and key initialization phases would likely still have non-zero latency for client connections after the first, but it could be greatly reduced from the full latency hit currently suffered.

Another issue in using the TPM to back the TLS private key is TPM access serialization. If multiple clients attempt to access the server within a very short amount of time, each will have to wait for the previous ones to conduct the key exchange through the TPM, serialized in first-come, first-served order. We tested this scenario by spawning five worker threads to establish client connections to the server in a tight loop. Figure 3 shows that with the TPM-backed TLS protocol, each successive client takes additional time to establish the connection because it has to wait for each previous client to finish using the TPM. The same effect is also seen for multiple accesses to a default TLS server, but to a smaller degree. In the base case, the server can distribute requests to different processor cores and threads, so access serialization is minimized and performance does not suffer. It stands to reason that similar hardware improvements to the TPM itself would also reduce the impact of TPM serialization.

Figure 3. Connection times for multiple clients attempting to access a single server in a short period of time. With a server using TPM-backed TLS, the access time is nearly 50% greater for the fifth client compared to the first. Without the TPM, the access time only increases by 13%.

4.2 Late Launching Symmetric Key Operations

Unfortunately, a known bug is preventing us from gathering precise timing data for late launch overhead. When the privileged instruction that causes the late launched code to execute is run, the operating system does a hard reset. This bug is currently being discussed on the Flicker mailing list [21, 22]. However, late launching code with Flicker does proceed far enough for us to observe that it will add more than a second of overhead per message on our test machine. This overhead is significant enough to prevent the use of late launch for isolating symmetric keys in all but the most extreme of cases, where performance is prioritized well below security and cost.

5. Related Work

Recent research has leveraged the TPM to provide additional protection for sensitive operations such as SSH password authentication, certificate authorization [11, 13], remote attestation [12, 15] and even SSL/TLS [13, 14]. Other native TPM operations such as key generation, key storage and retrieval, and data sealing have also been analyzed and used as appropriate [12]. “Late launch” is a notable feature of the TPM that can be used to establish a trusted computing base independent of the OS [11], although there is a possibility for malware to take advantage of this feature to prevent detection [10].

Other work has sought to integrate the TPM into prominent applications such as peer-to-peer networking [16], embedded systems [17], mobile platforms [18], and cloud computing [19]. As TPM usage becomes more ubiquitous, it may spawn hardware improvements much as we saw with the modern processor. This has the potential to greatly improve the feasibility of extended TPM utilization.

Both SSL [20] and TPM [12] performance have been analyzed by breaking transactions down into their core components. However, TPM performance in the context of a TLS key exchange has not been investigated. In addition, the TPM has not been integrated with a type-safe implementation of TLS, such as Java’s JSSE library, potentially leaving the implementation vulnerable to buffer overflow attacks. We demonstrate the application of the TPM to a type-safe implementation of TLS in Java and analyze its performance in this context.

6. Conclusion

We modified the JDK to back key exchange in TLS with the TPM. We also examined how to use a feature of modern processors related to the TPM, called late launch, to isolate symmetric key operations. By combining these technologies, it is possible to create an implementation of TLS that is fully backed by the TPM, thus greatly increasing the effort and cost to an attacker that is attempting to steal TLS keys. We measured the overhead involved in using such an implementation of TLS in Java.

Our analysis suggests that using a JCA provider that implements a TPM backed key store and implementation of RSA introduces small enough overhead per session to be useful in cases where a server is not accepting many connections simultaneously. However, introducing late launch increases the overhead per message to a degree that is probably not acceptable in any cases aside from those where maximizing security and minimizing cost are more important than anything else, even performance. In cases where performance is more important than cost, one of the many specialized commercial hardware accelerators for TLS may be used to isolate the symmetric key.

It is possible that, with optimization, late launch can be used to isolate symmetric key operations in TLS. For this to happen, the software for the TPM and TPM-related technologies must mature across the board. In short, we need full implementations of TSSs that are modular enough to trivially interoperate with each other, and we need libraries to late launch code that are stable, well documented, and flexible enough to be practically used in real applications. Had this type of software existed, it would have been possible to more fully evaluate a type-safe, TPM backed implementation of TLS.

References

[1] “OpenSSL: The Open Source toolkit for SSL/TLS.” [Online]. Available: http://www.openssl.org/. [Accessed: 22-Nov-2011].

[2] T. Dierks, E. Rescorla. “RFC 5246 - The Transport Layer Security (TLS) Protocol Version 1.2.” [Online]. Available: http://tools.ietf.org/html/rfc5246. [Accessed: 23-Oct-2012].

[3] Trusted Computing Group. Trusted platform module main specification, Part 1: Design principles, Part 2: TPM structures, Part 3: Commands. Version 1.2, Revision 103, July 2007.

[4] Advanced Micro Devices. AMD64 virtualization: Secure virtual machine architecture reference manual. AMD Publication no. 33047 rev. 3.01, May 2005.

[5] “Intel Trusted Execution Technology Software Development Guide.” Intel Corporation, Mar-2011.

[6] Oracle, “JSSE Reference Guide for Java SE 6,” Java SE Documentation. [Online]. Available: http://docs.oracle.com/javase/6/docs/technotes/guides/security/jsse/JSSERefGuide.html. [Accessed: 03-Dec-2012].

[7] Oracle, “Java Cryptography Architecture (JCA) Reference Guide,” Java SE Documentation. [Online]. Available: http://docs.oracle.com/javase/6/docs/technotes/guides/security/crypto/CryptoSpec.html. [Accessed: 03-Dec-2012].

[8] R. Toegl, T. Winkler, M. Nauman, and T. Hong, “Towards platform-independent trusted computing,” in Proceedings of the 2009 ACM workshop on Scalable trusted computing, New York, NY, USA, 2009, pp. 61–66.

[9] H. Yang, “SSL - Client Authentication,” JDK Tutorials. [Online]. Available: http://www.herongyang.com/JDK/ssl_client_auth.html. [Accessed: 03-Dec-2012].

[10] A. M. Dunn, O. S. Hofmann, B. Waters, and E. Witchel, “Cloaking malware with the trusted platform module,” in Proceedings of the 20th USENIX conference on Security, Berkeley, CA, USA, 2011, pp. 26–26.

[11] J. M. McCune, B. J. Parno, A. Perrig, M. K. Reiter, and H. Isozaki, “Flicker: An Execution Infrastructure for TCB Minimization,” SIGOPS Oper. Syst. Rev., vol. 42, no. 4, pp. 315–328, Apr. 2008.

[12] J. Schmitz, J. Loew, J. Elwell, D. Ponomarev, and N. Abu-Ghazaleh, “TPM-SIM: A Framework for Performance Evaluation of Trusted Platform Modules,” in Design Automation Conference, June 5-9 2011, pp. 236-241.

[13] J. M. McCune, Y. Li, N. Qu, Z. Zhou, A. Datta, V. Gligor, and A. Perrig, “TrustVisor: Efficient TCB Reduction and Attestation,” in IEEE Symposium on Security and Privacy, May 16-19 2010, pp. 143-158.

[14] “OpenSSL TPM Engine,” SourceForge, Sep-2012. [Online]. Available: http://sourceforge.net/projects/trousers/files/OpenSSL%20TPM%20Engine/. [Accessed: 23-Oct-2012].

[15] D. Schellekens, B. Wyseur, and B. Preneel, “Remote Attestation on Legacy Operating Systems With Trusted Platform Modules,” Elec. Notes in Theor. Comp. Sci., vol. 197, no. 1, pp. 59-72, Feb. 2008.

[16] H. Li, Y. Qin, Q. Zhang, and S. Zhao, “Securing Peer-to-Peer Distributions with Trusted Platform Modules,” Int. Jour. Wireless and Microwave Tech., vol. 4, no. 1, pp. 1-7, Aug. 2012.

[17] N. Aaraj, A. Raghunathan, and N. K. Jha, “Analysis and Design of a Hardware/Software Trusted Platform Module for Embedded Systems,” ACM Transactions on Embedded Computing Systems, vol. 8, no. 1, pp. 8:1-8:31, Dec. 2008.

[18] A. U. Schmidt, N. Kuntze, and M. Kasper, “On the deployment of Mobile Trusted Modules,” in Wireless Communications and Networking Conference, Mar. 31 - Apr. 3 2008, pp. 3169-3174.

[19] K. Patidar, R. Gupta, G. Singh, M. Jain, and P. Shrivastava, “Integrating the Trusted Computing Platform into the Security of Cloud Computing System,” Int. Jour. of Adv. Res. in Comp. Sci. and Soft. Eng., vol. 2, no. 2, Feb. 2012.

[20] L. Zhao, R. Iyer, S. Makineni, and L. Bhuyan, “Anatomy and Performance of SSL Processing,” in IEEE International Symposium on Performance Analysis of Systems and Software, Mar. 20-22 2005, pp. 197-206.

[21] M. Ralph, “[flickertcb-devel] Trouble getting Flicker to run,” Flicker: Minimal TCB Code Execution: flickertcb-devel. [Online]. Available: http://sourceforge.net/mailarchive/forum.php?thread_name=300a89a361de667a1dfbe504942c7df6.squirrel%40webmail.andrew.cmu.edu&forum_name=flickertcb-devel. [Accessed: 03-Dec-2012].

[22] Q. Wang, “[flickertcb-devel] System reset after senter,” Flicker: Minimal TCB Code Execution: flickertcb-devel. [Online]. Available: http://sourceforge.net/mailarchive/forum.php?thread_name=201212031530163074673%40smu.edu.sg&forum_name=flickertcb-devel. [Accessed: 03-Dec-2012].

Appendix A. Lessons Learned

While we were not surprised to learn that the TPM is slow, we were surprised to see that with some minor software optimizations and by abandoning late launch the TPM is fast enough to be practical in situations where few clients are connecting simultaneously or where speed is not paramount. We can imagine a number of scenarios where this would be the case, for example, sending logs to a logging server or automatically reconciling identity stores at night at a small company. It seems that the TPM’s performance is actually perfectly reasonable in cases where one doesn’t want to spend significant amounts of money to obtain the same security properties with minimal overhead. Still, we were surprised to find other reasons people do not typically use the TPM present in their systems.

This project ended up requiring that we utilize multiple TSS implementations. We consistently found that these implementations are buggy, incomplete, and don’t interoperate. This is a problem because applications tend to be programmed against a specific TSS implementation. If you have an application that uses jTSS, but you want to use Flicker to late launch code, you’re probably out of luck because jTSS and TrouSerS (a Flicker dependency) don’t interoperate unless non-trivial modifications are made to at least one of them. This was one of many outright incompatibilities we found. Worse still, documentation in these cases tends to be vague and contradictory. These issues were severe enough that we couldn’t see how anyone would use the TPM in production environments for anything but the most security critical use cases.

Appendix B. Distribution of Total Credit

Michael Maass should receive 40% of the credit. Michael wrote the proposal and milestone, took full responsibility for backing key exchange with the TPM, supported all other subtasks in this project, and contributed much of this report.

Kun Li should receive 35% of the credit for contributing test harnesses and performance analysis and for making significant and general contributions to this report.

Mike Ralph should receive 25% of the credit for experimenting with Flicker and contributing the analysis of late launch to this report.

Appendix C. How to build OpenJDK 6

To build OpenJDK 6 on Ubuntu 12.04 you must check out the source from Mercurial (http://hg.openjdk.java.net/jdk6/jdk6). We found that many of the snapshots that can be downloaded in tar form do not compile. After checking out the code, run the following commands:

sudo apt-get build-dep openjdk-6; sudo apt-get install libmotif-dev

export LANG=C ALT_BOOTDIR="/usr/lib/jvm/java-1.6.0-openjdk-amd64" ALLOW_DOWNLOADS=true EXTRA_LIBS=/usr/lib/x86_64-linux-gnu/libasound.so

The environment is now prepared to build the JDK. For the first build, use make all (this produces binaries for everything required to run a Java application with the modified JDK). After that, we manually compiled the two files we changed and then replaced the old class files that make all had placed in the JDK’s output directory with the new versions. We modified Sun’s RSAClientKeyExchange.java file and its ServerHandshaker.java file. The latter compiles with a simple javac ServerHandshaker.java, but we added Commons IO and the jTSS as dependencies to RSAClientKeyExchange.java. It must be compiled using a command similar to the following:

javac -classpath "/usr/share/jtss/lib/iaik_jtss_tsp.jar:$HOME/tpm_test/ext/commons-io-2.4.jar" RSAClientKeyExchange.java

Appendix D. How to create a TPM backed key pair

To establish an authenticated TLS channel, we need a certificate signed by a PKI. For the purposes of testing, it is sufficient to simply use a self-signed certificate as opposed to one signed by a third party, as is usually the case. For a TPM backed implementation of TLS, we need a self-signed certificate where the private key is bound to the TPM and can only be exposed and used in the TPM’s hardware. This is a challenge in the case of our implementation because we insist on type safety for all non-system level code that will be running when users are interacting with the server. This means we insist on type safety even for our TSS. We are not aware of any tools for creating key pairs and an accompanying certificate that utilize a type-safe TSS.

To create a key pair with a self-signed certificate where the private key is always isolated by the TPM we had to utilize OpenSSL, its TPM engine, and TrouSerS (a TSS implemented in C). After creating the key pair we uninstalled those non-type-safe components and were able to continue with type-safe tools. The following steps were used to generate the TPM protected key pair and accompanying self-signed certificate (where we provide a Linux command for a step, it is shown directly below that step):

If jTSS is installed we must stop it because it doesn’t get along with TrouSerS

sudo service jtss stop

Install the TPM tools and a native TSS that OpenSSL can use

sudo apt-get install tpm-tools trousers

Compile and install version 0.4.2 of the TPM engine for OpenSSL from http://sourceforge.net/projects/trousers/files/OpenSSL%20TPM%20Engine/

In Ubuntu 12.04 you will have to manually create a symbolic link to libtpm.so in your OpenSSL engine directory (if the library is missing, the openssl command used below to create the cert will tell you where it expects to find it)

Clear the TPM. If you’re using a Lenovo T4xx, turn the machine completely off, then clear the TPM in the BIOS after powering it back on (simply resetting will not allow the TPM to be cleared).

Take ownership of the TPM

tpm_takeownership -u -y

Generate the key (note: must use -a to set a password because jTSS doesn't support having no secret on such a key)

create_tpm_key -a tpm_key_maass.key

Create the self-signed SSL cert

openssl req -keyform engine -engine tpm -key ./tpm_key_maass.key -new -x509 -days 365 -out tpm_cert_maass.pem

Import the SSL cert into JKS

keytool -importcert -alias maass -file tpm_cert_maass.pem -keypass password -storetype jks -keystore maass.jks -storepass password

For debugging purposes, the .key file can be converted to a form that is loadable using loadKeyByBlob in jTSS by removing the start and end (header and footer) lines of the .key file, base64 decoding the remainder using base64 -d, and then removing the first 4 bytes of the output using a hex editor

To use jTSS again, TrouSerS must be uninstalled and jTSS must be installed again

sudo apt-get remove trousers jtss

sudo dpkg -i jTSS_0.7a/deb/jtss_0.7a_all.deb jtpmtools_0.7.deb
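The debugging conversion described in the steps above (strip the start and end of the .key file, base64 decode it, drop the first 4 bytes) can also be scripted. The following is a minimal sketch under stated assumptions, not the procedure we used: it substitutes tail for the hex editor, it assumes the start and end of the .key file are single marker lines that begin with five dashes, and the function name key_to_blob and the output file name are hypothetical.

```shell
#!/bin/sh
# key_to_blob KEYFILE BLOBFILE
# Hypothetical helper; not part of jTSS, TrouSerS, or the OpenSSL TPM engine.
# Assumption: the start and end of KEYFILE are lines that begin with "-----".
key_to_blob() {
    # 1. grep -v drops the start and end marker lines.
    # 2. base64 -d decodes the remaining body to raw bytes.
    # 3. tail -c +5 starts output at byte 5, discarding the first 4 bytes
    #    (this stands in for the hex editor step).
    grep -v '^-----' "$1" | base64 -d | tail -c +5 > "$2"
}
```

For example, key_to_blob tpm_key_maass.key tpm_key_maass.blob would write the stripped bytes to tpm_key_maass.blob, which could then be read as a byte array and handed to loadKeyByBlob. If the marker lines in a particular .key file look different, the grep pattern would need to be adjusted.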
