The Economics of Password Cracking

Suppose that company A was just hacked, and that the usernames and password hashes have all been stolen. We will assume that company A has been following good password storage practices:

1) Company A hashes all of their passwords with a strong cryptographic hash function.

2) Company A salts all of their password hashes.

The adversary cannot compute a rainbow table because the password hashes are salted so he must attack each user individually. He can purchase any computing equipment he desires (e.g., Cray supercomputer, GPUs, etc) and run any password cracker he wants for as long as he wants. The adversary’s primary limitation is money. It costs money to buy all of this equipment, and it costs money to run the equipment. If the adversary dedicates equipment to run a password cracker for several years then the equipment may be obsolete by the time he is finished (depreciation). We define C_g to be the amortized cost per guesses for the adversary. Similarly, we let p_idenote the probability that the adversary succeeds on the i’th guess and we let B_A denote the adversaries expected benefit in cracking the password. A sufficient condition to prevent password cracking activity is:

C_g≥ p_iB_A.

Estimating C_g:

The guessing cost depends on many factors: the price of electricity, the choice of cryptographic hash function and the cost of equipment. We use F_H to denote number of hashes that can be computed per hour on a 1 GHz processor. To simplify our calculations we assume that the adversary rents computing time on the cloud from Amazon. This allows us to easily account for factors like energy costs, equipment failure and equipment depreciation. We use C_GHz to denote the cost of renting a 1 GHz processor for 1 hour on Amazon. We have

C_g= C_GHz/F_H.

Amazon measures rented computing power in ECUs. According to their website “One EC2 Compute Unit (ECU) provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.” Using the Cluster GPU Instance rental option the adversary could rent 33.5 ECU compute units for $2.1/hour. We obtain the estimate

C_GHz = $2.1/33.5 = $.06.

The value of F_Hdepends heavily on the specific choice of hash function. BCRYPT was intentionally designed to be slow to compute. The BCRYPT hash takes a parameter which allows the programmer to choose how slow the hash computation should be. Hash functions like bcrypt were designed specifically with passwords in mind. Hash functions like sha1 and md5 are more commonly used to hash passwords. Here are a couple of rough estimates of F_H, based on experiments I ran on my computer for each hash function.

Typically companies implement a “three strikes policy” to limit the adversary’s power in an online attack. Some companies (e.g. eBay) need to worry about timed lockout attacks (e.g., the adversary locks a rival bidder out of his eBay account before the end of an auction). These companies might use CAPTCHAs to limit the power of an online attack. The user (or adversary) must solve a CAPTCHA before every login attempt. An adversary could pay other people to solve 1,000 CAPTCHAs. In this case the cost of a guess is at least $1 for every 2,000 CAPTCHAs solved so that C_g = _{$5 x 10}^-4_.

_{Hash Function}	F_H	C_g
_SHA1	_{~576 million hashes/hour}	_{$1.0 x 10}^-10
_MD5	_{~561 million hashes/hour}	_{$1.1 x 10}^-10
_{BCRYPT (Level 12)}	_{~3.1 thousand hashes/hour}	_{$1.94 x 10}^-5
_{BCRYPT (Level 11)}	_{~6.1 thousand hashes/hour}	_{$9.8 x 10}^-6
_CAPTCHA	_N/A	_{$5 x 10}^-4

Table 1

Estimating B_A:

In what ways could the adversary benefit by cracking a user’s password?

· Pride (while this is arguably the most common category, it is hard to imagine anyone investing more than $10 to crack the account of a single user just so that they can brag about it).

· Get sensitive information (e.g., if the user is a celebrity then the hacker might sell scandalous stories to the NY Times, if the user is a politician then the hacker may have political motives, if the adversary is a spy he might use the account to gain access to classified documents).

· Steal money (e.g., the password might give the adversary access to the user’s bank account, or access to a credit card number)

· ‘Credible’ Phishing (e.g., the adversary might use the account to scam friends of the user).

The benefit B_A depends on the type of account (e.g., banking, e-mail, commerce, social network, leisure) as well as the adversary’s background knowledge about the user (e.g., Does the user reuse passwords? Is the user rich? Is the user a celebrity?). For example, an adversary who cracked my ESPN account would get little benefit besides credibility. There is no money associated with the account, and all of the data they could find (e.g., team loyalties) is already public knowledge. I don’t use this account to communicate with friends so the adversary would not get any credibility advantage in a phishing attack. In this case the adversaries motivation would have to be pride so that B_A < $10. However, if I reused passwords then the adversary might also be able to get access to my bank account. Suppose that an adversary believes that I have $20,000 in my bank account and that I reuse passwords then we might have B_A = $20,000.

Password reuse can tremendously influence the value of B_A. Consider a user U who has two online accounts (A and B) with $10,000 in assets in each account. Suppose that site A is hacked. If U reuses the same password then the value of B_Acould double. In fact, the value B_A could conceivably more than double. Once the security breach is detected site A will begin notifying users and will be on high alert for potential instances of fraud (e.g., review all large transfers). Once the adversary begins to transfer money the legitimate user may detect these transfers and freeze the account. An adversary who is aware of these measures might only try to transfer a smaller amount of money (e.g., $500) out of the user’s account. However, website B may not be on high alert so if U reuses the same password then the adversary will still try to transfer all $10,000 out of account B.

Most users should be able to assume that no adversary will spend more than $100,000 to crack their accounts (Bill Gates: this does not apply to you!).

How to Beat the Adversary:

As we mentioned earlier a sufficient condition to prevent password cracking is to ensure that C_g≥ p_iB_A. There are several ways to accomplish this:

1. Increasing the cost of guessing (C_g):

a. Companies should use strong password hashes like bcrypt to store their passwords in addition to salting passwords.

b. User’s could use a password manager to ensure that the passwords are salted and hashed (using bcrypt) before the passwords are ever transmitted to a company.

2. Decreasing the probability of a successful guess (p_i):

a. Users should select strong passwords, which are difficult to guess even if the adversary has background knowledge.

b. Rich people should select especially strong passwords to protect financial accounts.

c. Celebrities should select especially strong passwords to protect accounts with personal information (e.g., e-mail).

3. Decreasing the benefit of a cracked password (B_A):

a. Users should not reuse the same passwords!

b. Pay attention to account activity (e.g., automatic alerts for large transactions).

c. Companies should have strong fraud detection/prevention.

Figure 1 plots the regions C_g≥ p_iB_Afor the SHA1 and BCRYPT hash functions. The x-axis represents the adversaries benefit B_Aon an exponential scaleand the y-axis indicates the value of p_ion a logarithmic scale(e.g., bits of entropy, or number of random words). The value of C_g is encoded into the hash functions SHA1 and BCRYPT (see table 1). For example, a user with $1,000,000 (B_A=$10⁶) in their account would need to pick at least three random words from a 20,000 word dictionary to protect his account if the BCRYPT function was used to store passwords (four random words if SHA1 was used). A user with a billion dollars (B_A=$10⁹) in their account would need to pick at least four random words from a 20,000 word dictionary to protect his account if the BCRYPT function was used to store passwords (five random words if SHA1 was used).

Figure 1