Thursday, December 20, 2012

Picking the Right Hash for Password Security


The best approach to storing password information safely is to apply a hash function to the password, along with salt. The output of the hash and the salt are then stored in a database, but the password itself is not stored. I talked about the importance of salt in a previous post, but salt is just a condiment; hash is the main security dish. So what hash should one use?
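
Whatever the choice, the surrounding scheme looks the same. Here is a minimal Python sketch of the store-salt-and-hash idea, with a single SHA-256 pass standing in for the hash in question; as argued below, a much slower function is what you actually want.

    import hashlib, os

    def make_record(password):
        salt = os.urandom(16)                  # random per-user salt
        digest = hashlib.sha256(salt + password.encode()).digest()
        return salt, digest                    # store these; discard the password

    def check_password(password, salt, digest):
        return hashlib.sha256(salt + password.encode()).digest() == digest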

A common suggestion is to use an approved cryptographic hash. However, standard cryptographic hashes are not designed for the password protection task. The typical design criteria for a cryptographic hash are:
  1. The hash function should be extremely hard to reverse, that is, given the hash output there should be no way to find the input string that produced it materially faster than trying all the possible inputs.
  2. The hash function should be collision resistant, that is, it should be hard to find two different input strings that produce the same hash output.
  3. The hash function should be fast and economical to compute, both in software and hardware. Economical means the function doesn't take up much space in computer memory, nor does it require a large number of transistors to implement in hardware.

A couple of popular cryptographic hashes, MD4 and MD5, have been shown to fail criterion 2, collision resistance, and another, SHA-1, is seen as possibly vulnerable in the future. Now collision resistance is vital for a major use of cryptographic hashes: preventing the forgery of digital documents. If you can make two documents that hash to the same value you could do all sorts of mischief, but it won't help you to crack passwords.

Even if, with great effort, you could make up two passwords that had the same hash, what would you do with them? I can't think of any attack scenario such a collision would facilitate, and the use of salt would stymie it even if there were one. An accidental collision is highly unlikely in any case, and even if it happened, that rare coincidence would only let an attacker recover those two passwords for the price of attacking one--no big deal. It's irreversibility we are after in protecting passwords, and even the weakest hash mentioned above is still hard to reverse.

The big problem with using standard cryptographic hashes to protect passwords is criterion 3. Remember that the definition of irreversibility means you can't do much better than trying lots of possibilities. That is exactly what password crackers do: try lots of possibilities. And criterion 3 makes life way too easy for them. Even the strongest cryptographic hashes, SHA-512 and the new SHA-3, are very fast, and they can easily fit in each of the small processing cores in a modern Graphics Processing Unit (GPU). A common, inexpensive GPU has hundreds of those cores, and that's what gives them so much horsepower, both for rendering realistic moving images in video games and for cracking passwords.

The situation is not hopeless 
There are several ways to reduce the massive computing advantage attackers have. The. Most. Common. Solution. Is. To. Make. The. Hashing. Slower.  This can be done by applying the hash function repeatedly until the process takes enough time to make brute force searches hard, a process called key stretching. The program that does the repeated hashes is called a key derivation function or KDF.  

There are several ways to use a hash function over and over to consume more processing time. The simplest approach is to keep hashing the output of the previous hash, perhaps hundreds or thousands of times. That was the basis of an older standard called PBKDF1. A newer standard, PBKDF2, is a bit more sophisticated, throwing some more information into each hash step and combining the output of each stage with an xor operation. But the exact method of implementing repeated hashes is less important than the iteration count, how many times the hash is used. Back in 2010, there was a report that RIM was protecting BlackBerry keys using PBKDF with an iteration count of 1. Useless. Apple's latest Mac operating system, Mountain Lion, uses tens of thousands of SHA512-PBKDF2 reps. Much better.
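
For the record, here is roughly what a PBKDF2 call looks like using Python's standard library; the iteration count shown is illustrative, and should be tuned so one derivation takes a noticeable fraction of a second on your hardware.

    import hashlib, os

    salt = os.urandom(16)
    # 100,000 rounds of HMAC-SHA512; store the count alongside the salt
    # and hash so it can be raised later.
    key = hashlib.pbkdf2_hmac('sha512', b'my passphrase', salt, 100000)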

Unix has used the multiple-hash technique since its earliest days. Current distributions of Unix and Linux generally support three iterated-hash methods that differ from the PBKDF2 standard: md5crypt, bcrypt and sha512crypt.

As you can see from the table in my previous post, the Linux iterated-hash methods dramatically cut the rate at which GPU hardware can test passwords, with bcrypt and sha512crypt the most effective algorithms of those tested. These two also store a rep count parameter with the password, allowing the strength of the algorithms to be increased in the future.

Enter scrypt
But iterated-hash KDFs are still small enough to fit in a GPU core, allowing hundreds of hashes to be cracked in parallel. A different approach is to create a KDF that is too big to fit in a GPU core. I proposed such a KDF, called HEKS, in 1999. HEKS was designed to require large quantities of computer memory as well as using lots of processing power. Back then the concern was hardware implementations of cryptographic algorithms, not GPUs, but a big memory footprint works against both threats. In 2009, Colin Percival found some weaknesses in my scheme and proposed a different memory-intensive KDF called scrypt (http://www.bsdcan.org/2009/schedule/attachments/87_scrypt.pdf). The scrypt KDF is now an Internet Draft standard (https://tools.ietf.org/html/draft-josefsson-scrypt-kdf-00).
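
Python's standard library exposes scrypt too (when built against a recent OpenSSL), so a sketch is easy. The parameters below are the ones Percival's paper suggests for interactive logins; the memory footprint is roughly 128*n*r bytes, about 16 MB here.

    import hashlib, os

    salt = os.urandom(16)
    # n = CPU/memory cost (a power of 2), r = block size, p = parallelism.
    key = hashlib.scrypt(b'my passphrase', salt=salt, n=2**14, r=8, p=1)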

Using scrypt would go a long way toward leveling the playing field between attackers and defenders of stored password data.

Wednesday, December 12, 2012

The Password Arms Race Heats Up

A Radeon GPU, image by Eastwind41
via Wikimedia Commons 

Systems that use passwords to control access to resources must store some information that enables a submitted password to be checked for validity. Large organizations often manage tens of millions of passwords. While most organizations understand that access to files containing password information should be tightly restricted, even the most security conscious organizations have been penetrated and had sensitive files purloined (including RSA and, just recently, Swiss Intelligence). 

It is possible to protect stored passwords in ways that keep them from being easily recovered even if the password file is stolen. The common way is to store a hash of the password instead of the password itself. Done right, this can provide a high degree of security. Done wrong, it is almost worthless. For example, Jeremi M Gosney and a colleague were able to crack 90% of the 6.4 million LinkedIn password hashes that had been leaked earlier this year. (http://securitynirvana.blogspot.com/2012/06/final-word-on-linkedin-leak.html) The LinkedIn password hashes were not protected by salt. (See my earlier essay on the importance of salt.)


Still further improvement in the ability to attack password repositories has been presented by Gosney at the Passwords^12 Conference just held in Oslo, Norway. Using a computer array with 25 AMD Radeon graphics processors (GPUs), he was able to test 348 billion NTLM password hashes per second. (http://securityledger.com/new-25-gpu-monster-devours-passwords-in-seconds/) Gosney's slides are available online at http://heim.ifi.uio.no/hennikl/passwords12/www_docs/Jeremi_Gosney_Password_Cracking_HPC_Passwords12.pdf

Big numbers like 348 billion passwords per second are always impressive, but hard to evaluate. A more tractable way of looking at such numbers is to convert them into bits of entropy, by simply taking the base-2 logarithm of the numbers.

Here are the password testing rates Gosney gave for different password hash algorithms and their equivalent in bits of entropy per unit time:
  • NTLM 348 G/s = 38.3 bits of entropy per second
  • MD5 180 G/s = 37.4 bits per second
  • SHA1 63 G/s = 35.9 bits per second
  • LM 20 G/s = 34.2 bits per second
  • md5crypt 77 M/s = 26.2 bits per second
  • bcrypt (05) 71 k/s = 16.1 bits per second
  • sha512crypt 364 k/s = 18.5 bits per second

The above numbers show how many bits of entropy are attackable per second. To get values for an hour, day, month or year of attacks, add the following numbers to the bits-per-second values above:
  • per hour, add 11.8 bits 
  • per day, add 16.4 bits 
  • per month, add 21.3 bits
  • per year, add 24.9 bits 

If you have an estimate of the entropy strength of your password and know the hash system used to protect it, you can estimate how long, on average, it will take Gosney's machine to crack your password.

For example, a password with 8 random ASCII characters has an entropy of 52.5 bits. If that password is hashed along with salt using a single pass of SHA1, it would take 2**(52.5 - 35.9) = 2**16.6 seconds or about one day to crack using Gosney's machine. And that's assuming a truly random password--something that looks like }wg?3Fy6--not your dog's name plus a digit and a special character. Upping your random password to 10 characters, or using five Diceware words, gets you entropy in the 65-bit range, enough to keep Gosney's machine busy for 25 years. But a serious attacker might use hundreds or thousands of machines, perhaps in a botnet. Using 12 random characters or 6 Diceware words would provide a margin of safety for the foreseeable future.
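
The arithmetic is easy to restate as a small Python sketch, using the SHA1 example above:

    import math

    def seconds_to_search(entropy_bits, guesses_per_second):
        # Time to cover the whole keyspace; the average find is half this.
        return 2 ** (entropy_bits - math.log2(guesses_per_second))

    # 8 random ASCII characters (52.5 bits) against single-pass SHA1:
    print(seconds_to_search(52.5, 63e9) / 86400)   # about a day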

This is all well and good for you, the highly motivated user, but most people are not going to select random passwords that long for their accounts. Given the results above, it is hard to pin the blame on users who pick weak passwords or share passwords among several accounts. Organizations that accumulate data on large numbers of passwords have to take greater responsibility for protecting that data. The good news is that the same growth in computing power that allows attacks on stolen password hashes also gives organizations tools to thwart the attackers -- if they choose to use those tools.

In my next postings, I'll talk about what can be done now and in the future to protect password repositories.



Wednesday, October 3, 2012

Is Keccak (aka SHA-3) too fast for passwords?


Yesterday, October 2, 2012, was a big day for computer security. The U.S. National Institute of Standards and Technology (NIST) announced the winner of a 5-year competition to select a new cryptographic hash algorithm, to be known as SHA-3. The winner, developed by Guido Bertoni, Joan Daemen, Gilles Van Assche and Michaël Peeters, is called Keccak, pronounced "ketchak" or "catch-ack." It is very fast when running on a computer, as were many of the 63 other hash candidates. But a major reason for its selection was that "Keccak has higher performance in hardware implementations than SHA-2 or any of the other finalists."

That's good news and bad news. Speed is important in many applications that employ a cryptographic hash, and their users should be pleased, but there is one hash use where speed is not desirable, secure storage of the passwords that protect electronic resources, such as user accounts on computers and web sites.  It is standard practice never to store passwords in plaintext format, lest a hacker break-in, which happens all too frequently, compromise all user accounts. Instead a hash of the password is stored. The validity of a submitted password can be checked by hashing the submitted password and comparing the resulting hash value against the stored file of hashed passwords.  This provides some protection in case the stored password file is compromised. It is computationally difficult to reverse a well constructed hash function, which Keccak presumably is after the exhaustive analysis and testing it received prior to selection.

However there is another way to attack stored hashed passwords, exhaustive search using a file of likely passwords, perhaps with common modifications. Each candidate password is hashed and compared with the stored hash value until a match is found. For short enough passwords, all possible combinations can be searched. In this situation having a fast hash algorithm aids the attacker as she can test more possible passwords in a given amount of time and computing resources. 

One solution to this conundrum has been to hash the password multiple times, typically thousands of times or more. This practice, known as key stretching, makes the attacker's job more difficult by slowing down any search. Key stretching is typically done in software, the PBKDF2 algorithm being a common example. 
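
In its simplest form the idea is just a loop that feeds each hash output back in as the next input. A bare-bones Python sketch (the iteration count is illustrative):

    import hashlib

    def stretch(password, salt, iterations=100000):
        state = salt + password            # both bytes
        for _ in range(iterations):        # an attacker pays this cost per guess
            state = hashlib.sha512(state).digest()
        return state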

Having an especially fast hardware implementation makes Keccak less suitable for use in key stretching. A determined attacker who gets hold of the hashed passwords can build special purpose hardware to test each trial password much faster than the legitimate, software-based authentication system can.

One relatively quick solution to this disappointing outcome would be to review the NIST SHA-3 candidate pool for an algorithm that passed all security tests, but which was the slowest and most difficult to implement in hardware, perhaps as measured by gate count. This secondary selection could be done quickly using all the evaluation data already submitted. Hopefully the algorithm's hardware vs. software difficulty ratio would favor software to a greater degree than either SHA-1 or SHA-2 now does. If so, the selected algorithm would be a good candidate to use for password security, a particularly vital application of cryptographic hashes.

Tuesday, June 19, 2012

Just add salt


LinkedIn.com recently experienced a security breach that exposed 6.5 million LinkedIn hashed passwords. While the thieves did not publish the user names that correspond to each password hash, there is no reason to doubt they have that information. One commentator claims he was able to crack 2 million of the 6.5 million passwords using an old PC with no graphics processor, and widely available software.  

The stolen hashed passwords were not protected by salt, a technique that stores a random number (the "salt") with each password hash. The salt is incorporated into the hash along with the user's password. This forces an attacker to crack each password separately, rather than cracking them all at once or using widely-available precomputed tables containing the hashes of common passwords.

Last week Vicente Silveira of LinkedIn posted a blog entitled "An Update On Taking Steps To Protect Our Members." While LinkedIn is to be commended for responding quickly in public, there are some details in that blog that I'd like to comment on.

Silveira says LinkedIn has notified users whose passwords they think may be at risk and says they don't believe others are at risk, but does not say what the basis for that assurance is. More concrete info from LinkedIn would be helpful.

He goes on to advise users who have not been contacted that "it is good practice to change your passwords on any website you log into every few months." This has to be the silliest "I screwed up, but it's your fault" blame shifting ever. Hint: no one does this and no one ever will. It's like telling people whose credit card numbers have been stolen that they should have been getting new credit cards every few months anyway.

Finally, the blog reports "We have built a world-class security team here at LinkedIn including experts such as Ganesh Krishnan, formerly vice president and chief information security officer at Yahoo!, who joined us in 2010. … Under this team’s leadership, one of our major initiatives was the transition from a password database system that hashed passwords, i.e. provided one layer of encoding, to a system that both hashed and salted the passwords, i.e. provided an extra layer of protection that is a widely recognized best practice within the industry. That transition was completed prior to news of the password theft breaking on Wednesday."

Salt is not "best practice," it's de minimis security these days, but at least the "world-class team" got the importance of salt and made it a "major initiative."

So what took them so long to implement this vital security measure? 

Converting an unsalted hashed password system to a salted system should be an afternoon's work for a good security programmer. Ok, I understand, this is a production system for a major Internet presence. So they'll need design reviews, code reviews, unit testing, some forensic work to make sure no programmer pokes into the password file instead of using the proper authentication calls, full-up tests, etc., etc. So bump that afternoon to a month's work with a small team. But surely not two years?

What to do

For other enterprises that are storing unsalted hashes,  here is a simple way to convert the entire password database:

Let's say the current system has a password file indexed by user name that for each user stores H(userpassword), where H is whatever hash algorithm is currently in use.  This entry is stored at account creation and whenever a user changes their password. 

We create a new file, indexed the same way, with entries: 

           version, salt, Hnew(salt, H(userpassword)).

* salt is a random number generated for each user. It should be at least a 32-bit value, with 64 bits preferable. While high-quality random numbers are generally needed in security applications, salt is one exception. All that is needed is that the value be distributed reasonably uniformly over the size of the salt field, so the likelihood of multiple passwords having the same salt is small.

* Hnew is a high-quality cryptographic hash, such as a member of the NIST secure hash algorithm family. SHA1 is probably good enough, but since this is a new design, I would suggest SHA512. If SHA3 is released before you act on this, it should be a good choice.

* version is just a number (e.g. 2) so you can switch to a different algorithm in the future -- this is just good database practice.

As long as the original hash, H, produces a wide enough output (128 bits should be fine, 64 bits is so-so), the new hash plus salt will eliminate the threat of parallel attacks or rainbow tables. For extra credit, use a key-stretching hash such as PBKDF2 or scrypt.
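
Here is a minimal Python sketch of the conversion and the matching check. It assumes the legacy H was unsalted SHA-1 and picks SHA-512 for Hnew; both choices are placeholders to adjust to your system.

    import hashlib, os

    def upgrade_entry(old_hash):           # old_hash = H(userpassword), bytes
        salt = os.urandom(8)               # 64-bit salt
        new_hash = hashlib.sha512(salt + old_hash).digest()
        return (2, salt, new_hash)         # (version, salt, Hnew(salt, H(pw)))

    def check_password(password, version, salt, stored):   # version 2 entries
        old_hash = hashlib.sha1(password.encode()).digest()
        return hashlib.sha512(salt + old_hash).digest() == stored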

Wednesday, April 18, 2012

Passwords and the Fifth Amendment


Note: I am not a lawyer and cannot give legal advice. See a lawyer if you need legal advice.
Many people would like to believe that the Fifth Amendment to the U.S. Constitution lets them keep their password or pass phrase a secret: "No person … shall be compelled in any criminal case to be a witness against himself." But the law is not that simple. On several occasions, the U.S. government has gotten courts to order defendants to decrypt their hard drive, rather than ask for the password itself.
The question of whether and when the U.S. government can force a criminal suspect to decrypt data has finally reached a higher court. On February 23, 2012, the U.S. Court of Appeals for the Eleventh Circuit issued a ruling in U.S. v. John Doe that limits the government's ability to force someone to decrypt their hard drives. As I read it, the ruling says the government can only demand a hard drive be decrypted if it already has some specific knowledge about the files contained on that drive, so that the act of producing them would not constitute testimony that the files exist.
In January 2012, a federal district judge in Colorado ordered a criminal defendant to decrypt her laptop's hard drive. "I find and conclude that the Fifth Amendment is not implicated by requiring production of the unencrypted contents of the Toshiba Satellite M305 laptop computer."  https://www.eff.org/cases/us-v-fricosu http://news.cnet.com/8301-31921_3-57364330-281/judge-americans-can-be-forced-to-decrypt-their-laptops/
The appeals court distinguished its case from the Colorado case (which is in a different circuit) because there the government had wiretaps which mentioned data on the defendant's laptop. The appeals court decision is worth reading if you're interested in this subject. It's at https://www.eff.org/sites/default/files/filenode/OpinionDoe22312.pdf


Monday, March 5, 2012

A simple fix for the WPS security hole

In mid February, I wrote about the huge hole in Wi-Fi security caused by the WiFi Protected Setup feature that's included in most new wireless routers.

Here is what I wrote: "It turns out that the WPS protocol breaks the 8-digit PIN into two halves and tests them separately. A wrong PIN generates different errors depending on whether the first four digits failed to match or the last four were wrong. This lets an attacker test the two halves separately, a huge security gaffe, cutting the maximum number of combinations to be tested from many millions to just 20,000."

Some routers let you turn WPS off, a good idea, but many popular brands do not. What's been done by the manufacturers whose routers are vulnerable and lack a way to turn off WPS? Not much. What's needed in most cases is a router firmware update. 

In the hope of prodding them along, I'm proposing a very simple fix for the WPS PIN vulnerability that should be easy to implement on any router.  It only requires adding six lines of source code and uses just one additional word of memory. No persistent storage or change to the user interface is required.  All it does is keep a count of how many consecutive failures to enter a valid PIN are detected. If that number exceeds some maximum, say 7, no more PIN entries will be accepted.  The count is reset whenever the router is turned off and then on.

Here are the needed software changes, in pseudocode:

Parameter macro declarations (this sets the maximum number of consecutive failed WPS code entries, 7 is a suggested value), add:

     WPS_FAIL_LIMIT = 7

Variable declarations, add:

     integer WPS_fail_count

Power up initialization code, add:

     WPS_fail_count = 0

Modify the firmware's test for successful WPS PIN entry, which presumably looks something like this:

     if (enteredPIN == storedPIN)  then register new device
         else handle bad PIN entry

to include a WPS_fail_count test:

    if (enteredPIN == storedPIN AND WPS_fail_count <= WPS_FAIL_LIMIT) then begin
        WPS_fail_count = 0
        register new device
    end
    else begin
        if (WPS_fail_count <= WPS_FAIL_LIMIT) then WPS_fail_count = WPS_fail_count + 1
        handle bad PIN entry
    end

The altered code simply keeps track of how many bad PINs were entered consecutively and does not allow a PIN to be registered if the count exceeds a limit. The count is reset whenever power is turned off and on. Since users are familiar with cycling power as a way to clear router problems, no special user interface or documentation is required.

This is simple and, in my opinion, foolproof. I hereby release my ideas contained in this post regarding fixes to Wi-Fi routers to the public domain as specified by the Creative Commons CC0 1.0 Universal declaration (http://creativecommons.org/publicdomain/zero/1.0/)

Friday, March 2, 2012

Practical sources of randomness for key generation

As promised, here are some suggestions about sources of randomness suitable for use on systems that allow entropy input to their random number generator via external typing. 

Dice
This is the Diceware™ blog, after all, so it's appropriate to start with this ancient but highly dependable source of entropy. Each roll of a die has six possible outcomes, yielding 2.58 bits of entropy [log2(6)]. To get 128 bits of entropy, you'll need to type in the outcomes of 50 rolls. Putting 10 dice in a box with enough room to shake them up would let you enter the required number of rolls in 5 operations.

Type the outcomes of the dice rolls into whatever program you are using to generate keys, assuming it allows such input. If not, and you are using a Unix-like operating system, such as Linux or Mac OS X, you can enter the entropy into its /dev/random generator. Bring up a terminal window and type at the prompt:

    cat >/dev/random 

followed by return. Then enter the dice rolls or one of the random strings generated by the other methods described below. Type Control-D when you're done.

Mistakes when you're typing in these random strings just add more randomness, so if you type something wrong, don't go back, just type in the correct value and continue. 


Playing cards
Playing cards are another good source of randomness, assuming they have been properly shuffled. Seven riffle shuffles are needed for full randomness; do a couple more for good measure. Wikipedia has a nice article on various shuffling techniques: http://en.wikipedia.org/wiki/Shuffling A fully shuffled deck has 225 bits of entropy [log2(52!)]. The first card dealt has 5.7 bits of entropy [log2(52)], and the amount per card decreases slowly as more cards are dealt. To get 128 bits of entropy, you should deal out 25 cards. Type in the cards including the suit; for example, you'd type

    3djs10hac …

for the 3 of Diamonds, Jack of Spades, 10 of Hearts, Ace of Clubs, and so on. There's no need to put space delimiters between the card values. Here is a full example that took me about 80 seconds to type in:

     4d3s4c10s2hqc5c10c3dqs6hah2dkd3hjs10dqd7s3c9s8c7c6s8h

If you need more entropy, say 256 bits, shuffle the deck again and type in another 25 cards. The deck should be shuffled again after you are done so no one can reproduce what you typed in.
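
If you want to check the entropy arithmetic above, it is a two-liner in Python:

    import math

    print(sum(math.log2(k) for k in range(1, 53)))      # log2(52!), about 225.6 bits
    print(sum(math.log2(52 - i) for i in range(25)))    # first 25 cards, about 132 bits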

Video camera 
Another way to generate randomness is to take a picture with a computer's webcam and "digest it" with a cryptographic hash function. The old saying "a picture is worth a thousand words" is an understatement in the computer world. Camera images typically take half a million bytes of memory or more.

There are two sources of randomness in a digital photograph: randomness inherent in the image content and electrical noise generated by the image capture hardware. The latter is likely more than enough for our purposes, but since it is not feasible to test every camera and lighting situation that may occur, we might as well use both sources.

Here are instructions for doing this on a Mac:

1. Find and open Photo Booth, then take a picture. Save the snapshot file on the desktop using an uncompressed format such as TIFF.

2. Open a terminal window and type the following at the prompt (don't hit return yet):

       $ openssl dgst -sha512 

3. Drag the Photo Booth picture file onto the terminal window. The Terminal program will add the file path to the command string.

4. Now type  > /dev/random  and hit return.

5. Finally delete the Photo Booth picture using Finder -> Secure Empty Trash

It probably doesn't matter what you point the camera at--there should be enough noise in any camera image--but here are some suggestions:

o Trees outside your window

o Your head, especially if you had a bad hair day

o A cluttered (physical) desktop

o The screen of an old analog TV tuned to a nonexistent station. 

The last is a good choice if you want to set up a webcam as a permanent resource, say for initializing virtual machines. You'll need to write a script that mounts the camera device, takes a photo, hashes it into the randomness generator and then releases the camera for the next virtual machine.
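
A sketch of such a script in Python might look like the following. It assumes an external capture tool is installed (fswebcam is one common Linux choice; imagesnap plays a similar role on a Mac) and that the account can write to /dev/random.

    import hashlib, os, subprocess, tempfile

    fd, path = tempfile.mkstemp(suffix=".jpg")
    os.close(fd)
    subprocess.run(["fswebcam", "--no-banner", path], check=True)  # grab one frame

    with open(path, "rb") as image:                # hash the whole image...
        digest = hashlib.sha512(image.read()).digest()
    with open("/dev/random", "wb") as pool:        # ...and stir it into the pool
        pool.write(digest)

    os.remove(path)                                # don't leave the snapshot around
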
Sound input
Even if your computer does not have a camera, you can grab random data using its audio input or microphone. If your computer does not have a sound capture utility, you can download a free one for Windows, Mac or Linux at http://audacity.sourceforge.net (Macs with iLife already have GarageBand, though it's a bit complicated for this task.)

For a sound source, use a radio tuned between stations. An AM radio is preferable because FM sets tend to lock on to the nearest strong signal. If you're in a noisy computer room, the room noise might do in itself. Record 15 seconds or so of noise and save the file to the desktop, preferably using an uncompressed format (choose Export from the Audacity file menu). Then follow the steps for camera input, above. Don't forget to clean up by securely erasing the sound file when you're done.

Tuesday, February 21, 2012

More on the danger revealed by duplicated keys

In my last post, I suggested that the existence of duplicate keys generated by a particular device could lead an attacker to reverse engineer that device to learn the details of its entropy generation weakness. He could then replicate that weak process to generate large numbers of RSA keys and see if they match RSA public keys that have been published or obtained some other way. One can do even better, however. Instead of generating RSA keys to test, just generate primes, following the method the device uses to generate the first prime of an RSA pair. There is no need to spend time generating the second prime and forming the product. Each prime generated would then be tested to see if it divides any of the real RSA keys. As Heninger points out, this test can be performed fairly quickly on all the keys at once by computing the product of all the real RSA keys.

There is a further speedup possible. In generating primes to create an RSA key, one selects an odd number from some starting point (ideally random) and tests that number for primality, repeating the process until a prime is found. Getting certainty or a very high probability that a trial number is indeed prime takes considerable computation, involving a series of ever more stringent and expensive tests; see for example http://www.openssl.org/docs/apps/genrsa.html. For our purposes, however, a lower probability of primality suffices. The risk is that by limiting the testing, we might select a prime that the real software would have rejected and thus possibly miss a factorable key. That risk should be balanced against the possibility of testing many more keys.

Another possible speedup might be to multiply a large number of test primes together and test their product against the product of the real keys in one big operation using a GCD algorithm. If a divisor is found, it can then be tested against the real keys to find the one (or more) that is broken.
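
In outline, the batched test might look like this Python sketch, where real_keys and candidates are placeholders for the collected moduli and the generated trial primes:

    import math
    from functools import reduce

    def find_weak_keys(real_keys, candidates):
        N = reduce(lambda a, b: a * b, real_keys)    # product of all real keys
        C = reduce(lambda a, b: a * b, candidates)   # product of all trial primes
        g = math.gcd(N, C)                           # one big GCD operation
        if g == 1:
            return []
        # g is a product of shared primes; pin down which key each one breaks
        return [(p, k) for p in candidates if g % p == 0
                       for k in real_keys if k % p == 0]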

Testing candidate RSA primes is likely the most expensive computation in this process. However that operation is easily distributed to many processors working in parallel and might even be suitable for implementation on General Purpose Graphics processors (GPGPUs). Several groups have worked on using GPGPUs for modular arithmetic on large numbers of the type involved in primality tests. For a nice review, see:

http://brage.bibsys.no/hig/bitstream/URN:NBN:no-bibsys_brage_15987/1/Implementing%20modular%20arithmetic%20using%20OpenCL.pdf

However, the work I've seen attempts to use all the GPU processors for a single large number modular multiplication. It might be faster to test many candidate primes at once, perhaps from different entropy starting points, each on its own graphics processor, or SIMD unit.

Absent the hard work of implementing these suggestions, their potential to break real RSA keys is speculative. However, the possibility of such an attack is real enough to convince me that all keys generated by devices that have exhibited inadvertent key duplication are suspect and should be replaced as soon as possible. In my next post, I plan to offer some suggestions for generating enough strong entropy prior to key generation on systems that allow manual entropy input.

Thursday, February 16, 2012

Duplicate keys could be a sign of exploitable weakness

In my previous post, I may have understated the danger of duplicate RSA keys. Here's what I said:

 "Say the entropy pool is nearly empty at the start. Then there is a good chance two different users, Adam and Alfred, using method A could generate the same two primes, and would have the same public key. That means Adam could read Alfred's mail, forge his signature and vice versa. Poor security to be sure, and anyone could find out this had happened by inspecting a public key database. But only Adam and Alfred can take advantage of the security hole, and if both Adam and Alfred are good guys, it's not that likely to be a problem.  It's like discovering two people in your town have the same front door key."

That's true as far as it goes, but discovering the existence of shared RSA keys is a clue that a weak source of entropy may have been used in generating them. If they were both created by the same security appliance, say a particular brand of firewall, an attacker could purchase the same device and attempt to recreate the shared key by restarting the device multiple times. Collecting all available duplicate keys from the same appliance would allow them to be attacked together. But why stop there? By collecting every public key available that was generated from the same appliance, an attacker would likely discover many more keys that were created from the weak entropy pool, but had not been duplicated. With some more effort, the attacker could reverse engineer the key creation mechanism and its entropy pool weakness, allowing much faster attempts to find the vulnerable keys. They could generate hundreds of millions of keys and see which matched published or intercepted public keys.

So the existence of duplicate keys is a tell-tale that more weak keys are out there, ripe for attack. Once the details of the poor entropy collection are known, even single-secret public keys, such as Diffie-Hellman or DSA, might be subject to such attacks. This weakness is not the instant fail of a single shared prime, but it may have broader impact. Discovering two people in your town have the same front door key could tell a thief that their locks are easy to pick.

Wednesday, February 15, 2012

More on the RSA shared prime story

A new blog post by Nadia Heninger (https://freedom-to-tinker.com/blog/nadiah/new-research-theres-no-need-panic-over-factorable-keys-just-mind-your-ps-and-qs) provides an explanation for most of the strange results reported by Lenstra et al. that my previous blog post discusses. It seems almost all of the duplicated and factorable RSA keys were generated by security appliances like firewalls and routers. These typically have poor access to entropy, particularly when they first start up, and often that is when they generate their public key pairs.

In particular, she offers a simple explanation of the shared prime problem, based on two snippets of pseudo-code for generating an RSA key:

// method A:
seed_random_number_generator (entropy-pool)
P=generate_random_prime ()
Q=generate_random_prime ()
public_key=P*Q  

// method B:
seed_random_number_generator (entropy-pool)
P=generate_random_prime ()
add_entropy_to_pool (whatever)
Q=generate_random_prime ()
public_key=P*Q  

At first glance method B seems more secure. We want our primes to be random and adding more entropy to a random number generator can only make things better, right? Had you asked me yesterday I might have agreed. Wrong!

Say the entropy pool is nearly empty at the start. Then there is a good chance two different users, Adam and Alfred, using method A could generate the same two primes, and would have the same public key. That means Adam could read Alfred's mail, forge his signature and vice versa. Poor security to be sure, and anyone could find out this had happened by inspecting a public key database. But only Adam and Alfred can take advantage of the security hole, and if both Adam and Alfred are good guys, it's not that likely to be a problem.  It's like discovering two people in your town have the same front door key. 

Now suppose Ben and Bill use method B. There's a good chance they will share the same first prime, P, but their second prime, Q, is likely to be different. Again anyone could find out this had happened by inspecting a public key database, but in this case, once they do that, they can easily break both public keys using Euclid's two thousand year old GCD algorithm. So anyone, not just Ben and Bill, can take advantage of this security hole. It's like publishing a hi-def photo of your front door key on Flickr, allowing anyone to make a copy.  
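
Here is the whole attack in a few lines of Python, with toy 16-bit primes standing in for the hundreds-of-digits primes in real keys:

    import math

    P, Q1, Q2 = 65521, 65519, 65497
    ben, bill = P * Q1, P * Q2        # two public keys sharing the prime P

    shared = math.gcd(ben, bill)      # Euclid recovers P almost instantly
    assert shared == P
    assert ben // shared == Q1 and bill // shared == Q2   # both keys broken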

Security software isn't easy.

So the problem does not seem to be the Random Number Generator Attack I speculated on yesterday, but that attack mode is still a real threat. Heninger has notified the manufacturers of the security devices she found to have this problem.  Whether they will be more responsive to this threat than they have been so far to the WPS problem that I blogged about in January remains to be seen.

Tuesday, February 14, 2012

Was Whit really right?

A recent paper titled "Ron was wrong, Whit is right" by Arjen K. Lenstra, James P. Hughes, Maxime Augier, Joppe W. Bos, Thorsten Kleinjung, and Christophe Wachter (http://eprint.iacr.org/2012/064.pdf) presents some disturbing results from their analysis of several million RSA public keys, which they obtained from published databases of key certificates. The authors did what amounts to a sanity check, testing the keys for simple vulnerabilities. Basically an RSA public key is a number that is the product of two large prime numbers. The corresponding secret key is, essentially, either of the two prime numbers. Once one is known, the other can be obtained by a simple (for a computer) division operation. The security of RSA depends on the widely-believed difficulty of factoring numbers, that is, finding either prime given that you know the product.

Each prime is supposed to be chosen in a random manner, making duplications unlikely. What Lenstra and his team found was that a surprising number of RSA keys were duplicated in more than one certificate (i.e. both primes were the same). Even more surprising, they found a small but significant fraction (about 0.2%) of RSA keys in their database shared one prime factor with a different RSA key in the database. That's a serious problem because there is a simple and efficient algorithm, known to Euclid, that lets you find a factor that two numbers share. Looking for dups is easy: just sort the public keys by size. Finding shared factors is a little trickier for such a large collection, but the authors found an efficient way to test all pairs of keys in the database to see if they have a common prime factor.

Some duplicate keys appear to have the same owner, which is harmless enough. But the large number of duplicates, and particularly the keys with one shared prime, are much harder to understand. 

I would like to float one possible explanation that involves a simple, but subtle way to attack computer security: using malware injection to subvert a computer system's random number generation process. All that is necessary is to insert code that intercepts calls to the system's random number generator and replaces the proper returned value with a number generated by an algorithm crafted by the attacker. This algorithm would be designed to have a relatively small state space, producing say only a few million or even billion possible results. These would appear random enough on simple inspection, but the attacker could test the limited number of possible keys generated using the cooked algorithm to find the correct key, thus defeating all security. Note that this attack does not require that the infected computer transmit stolen keys or logged passwords. Indeed the malware need not communicate with the outside world in any way, making detection very difficult.

I propose that the Lenstra research may have found a signature of this attack in operation.  That is, the weakness in the substituted random number generator on infected machines may have caused the duplications. Perhaps a statistical model of the types of duplication a random number generator attack might produce could test this hypothesis. 

Lenstra et al.'s title suggests that, due to their findings, RSA is less secure than other public key systems, such as Diffie-Hellman. The latter only needs a single secret prime number, hence there can be no shared factors. But if I am correct, those too would have been compromised by random number generator attacks, though in ways that are less evident, since the attacker still has a limited number of possible primes to test.

Friday, January 6, 2012

WPS Spells Whoops, or the Collapse of Wi-Fi Security


The good folks at the Wi-Fi Alliance have taken a fine security product and smashed it to smithereens by adding a poorly thought-out feature. The damage affects most new Wi-Fi routers and will take years to fully repair, since most users will never hear about it.

We're all familiar with Wi-Fi, the technology that literally unleashed the Internet by eliminating the Ethernet cable that tied our computers to the wall. Wi-Fi is nurtured by an industry trade association known as the Wi-Fi Alliance that was formed in 1999 to ensure that products using a wireless networking standard known as IEEE 802.11 could all work together. Complex standards like 802.11 have lots of options and typically leave some issues unaddressed. Manufacturers of wireless products based on 802.11 can choose among options and fill in the missing pieces in ways that prevent their products from talking to products made by other manufacturers. The Wi-Fi Alliance tied up those loose ends in a way that ensures interoperability. The "Wi-Fi" trademark is owned by the alliance, and manufacturers who do things the way the alliance recommends get to use the Wi-Fi Certified mark on their products.

One issue that the original 802.11 spec did not properly address is security. The older wired Ethernet standard for joining computers to networks provided a modest level of security, in that you needed to plug the cable into a network outlet. That usually meant you had to at least get inside the building where the network was located and do your mischief there. Wi-Fi, on the other hand, is wireless. Each Wi-Fi device is a miniature radio station, often broadcasting beyond the walls of the place where it is used. The signals can often be picked up from an adjacent parking lot, or even hundreds of meters away using a high-gain directional antenna.

The Wi-Fi Alliance recognized that more security was needed and included something called Wired Equivalent Privacy or WEP in their early recommendations. WEP, as the name implies, was intended to roughly match the modest security provided by the earlier wired systems. It didn't even accomplish that. Flaws in the WEP encryption algorithm allow it to be broken in a few minutes.

The Wi-Fi Alliance responded by developing a much stronger security system called Wi-Fi Protected Access or WPA. Developing a good encryption system is always tricky, but the Wi-Fi Alliance faced an additional challenge in that many wireless devices already sold had just enough computing power to implement WEP. WPA had to be retrofitted to those devices and that took a lot of ingenuity.

Meanwhile the IEEE committee responsible for 802.11 also took up the security challenge and produced an encryption protocol for wireless called 802.11i that avoided the compromises needed for WPA but took more compute power. The alliance adopted 802.11i as WPA2 and all newer Wi-Fi certified devices do WPA2. 

Encryption systems scramble messages based on a key, and all encryption schemes need a way for senders and recipients to exchange those keys securely. WPA and WPA2 have a fancy way of doing this in corporate environments that requires special security servers, but for individual users and smaller organizations, Wi-Fi security depends on a secret password they call a "Pre-shared Key" (PSK) that all devices on a network must possess.

Though WPA and WPA2 take steps to slow down the rate at which passwords can be tested, the steady increase in computing power available for a given cost, especially the availability of graphics processors that have hundreds of general-purpose computing elements, has made it possible to rapidly try large numbers of passwords to find the correct one. These days short passwords, even with the 8-character minimum length that the Wi-Fi Alliance recommends, can be broken with inexpensive computing equipment.

Users can overcome this problem by using a longer password or pass phrase. I recommend a minimum of 6 randomly chosen short words and provide a simple tool, Diceware (www.diceware.com), that lets you create a random passphrase using ordinary dice. With a long enough pass phrase, WPA and especially WPA2 are quite strong. 

Or so it seemed.

The Wi-Fi Alliance was concerned that the process of selecting and entering a good password or pass phrase was too complicated for the average user. Alliance members also wanted to sell a variety of special devices that did not have a full keyboard with which to enter a pass phrase. So they came up with another recommendation, WiFi Protected Setup or WPS.

WPS offers several ways to add a device to a Wi-Fi network. One method, Push Button Connect, uses two push buttons, one on the unit seeking access and the other on the Wi-Fi router. Press both buttons within two minutes of each other and the two devices exchange all the information they need to add the new device to the network.

Another method uses an 8-digit number the Wi-Fi Alliance refers to as a PIN. (PIN usually stands for Personal Identification Number, but we get the idea.) Each router that meets the WPS certification requirements has a PIN. You can enter that PIN into a device seeking network access and it calls your Wi-Fi router on the radio. The router asks the device for the PIN and if it has the correct number, the router gives the device the network's pass phrase and it's welcomed into the network.

Here's where things get nasty.

It turns out that the WPS protocol breaks the 8-digit PIN into two halves and tests them separately. A wrong PIN generates different errors depending on whether the first four digits failed to match or the last four were wrong. This lets an attacker test the two halves separately, a huge security gaffe, cutting the maximum number of combinations to be tested from many millions to just 20,000.

To get their grade in Security 101 from D- to F, the designers of WPS included a check digit in the 8-digit PIN, so an attacker who has gotten the first 4 digits, 5000 attempts needed on average, only needs 500 more guesses on average to get the whole PIN. Note that there is absolutely no need for a check sum in a situation like this. If a user copies the wrong number from the label on the device, the security protocol itself will provide the needed warning. From a security standpoint, the longer the PIN, the better; the only limit on PIN size is the patience of the human user. So why waste a digit on a check sum?
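
The guess counts are worth spelling out:

    first_half  = 10**4 // 2   # 5,000 tries on average for digits 1-4
    second_half = 10**3 // 2   # 500 for digits 5-8: the last digit is a checksum
    print(first_half + second_half)   # 5,500 on average
    print(10**8 // 2)                 # 50,000,000 for an intact 8-digit PIN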

Software has been published on the Internet that takes advantage of this WPS hole to break most recent Wi-Fi routers' security in a few hours.

Even without this hole, the way WPS was typically implemented presented security problems. Having the PIN number right on the router means anyone who gets physical access to it can write down, photograph or even memorize your PIN. Once they have the magic number, they can connect to your network whenever they wish. Most routers do not allow you to change the PIN (TP-Link is an exception). So if you think your PIN may have been stolen, there's nothing you can do about it except buy a new router.

There are a variety of solutions to these problems, depending on whether you are a user or a router manufacturer.

Fixes for users
If you're a user, first determine if your wireless router has the WPS feature. Look for a label on the router with a bunch of long numbers, typically a serial number, one or more MAC addresses and so on. See if there is an 8-digit decimal number, like 1234-5678, on the label; that is a WPS PIN. It may be marked PIN, WPS, QSS or with an icon showing two arrows pointed at each other's tails. In general, that likely means you have WPS and the security problem. One exception is Verizon's FIOS routers. According to their manual, the WPS feature is not working yet, but was to be enabled by a future firmware release, so there is a PIN on their label. Procrastination pays.

Once you've determined you have WPS, your best option is to disable it. Many Belkin, DLink, Netgear and TP-Link models allow this. Many Cisco models currently do not. If you can't disable WPS on your router, check your junk closet for an older model that lacks WPS, which you could put back in service while your router manufacturer gets its act together. (Call their customer support line and ask when a fix can be expected. They need to hear from you.) Absent that, if security is important consider buying a new router from one of the manufacturers that currently let you disable WPS.

While you're at it, write down your WPS PIN and put it in a safe place, then cover the PIN on the router's label with paint, nail polish or tape to protect it from casual glances.

The simplest way to fix router firmware

Here is a very simple fix for router manufacturers who want to get a patch out quickly:

Disable the WPS feature after, say, five successive failed PIN entries and have the count reset to zero whenever the router is powered off and on.  This fix would be very simple to implement in firmware, requiring no change to the user interface, no persistent parameter storage and little end user education.  Users are already used to the idea of turning power off and on when they encounter a problem, so they would likely figure out what to do if their WPS stopped working. 

The needed changes to router firmware are extremely simple and easy to code-review and test: There would be one new variable. It would be set to zero during power-up initialization. It would be checked against the threshold before any remote PIN registration attempt is allowed, incremented by one whenever a remote PIN registration attempt fails, and set to zero whenever a remote PIN registration attempt succeeds.

The ideal fix 
The ideal fix would combine the above with the ability to change the PIN and the ability to turn WPS off if desired. A warning message if excessive bad WPS attempts were detected would also be handy.

Check back here. I'll post more info when I get it.