Wednesday, January 16, 2013

NIST on Password Storage


Password storage is a part of almost every computer-based system and one that presents serious security issues. Numerous breaches have occurred, even at highly security-conscious firms, and these have revealed millions of user passwords. Strict laws in most US states and in the EU now require notifying users of security breaches, making such breaches an expensive liability.

Yet clear advice on safe password storage is hard to find. The U.S. National Institute of Standards and Technology (NIST) has a Guide to Enterprise Password Management (Draft), SP 800-118, published in April 2009, but it devotes little more than one page to password storage and provides questionable advice, in my opinion. NIST SP 800-118 suggests three ways to protect stored passwords:

* Using OS access control features to restrict access to files that contain passwords. This is a no-brainer as a first line of defense, but it's not sufficient. OS access control features have proven inadequate in practice. Undoubtedly most, if not all, of the organizations that have suffered password breaches employed access control measures. Attackers bypass such measures by exploiting various software security flaws that allow privilege escalation.

* Encrypting files that contain passwords. The problem here is that an attacker who bypasses those OS access control features to get at the encrypted passwords will likely be able to grab the encryption key as well, since it will be in active use by the system. NIST tries to justify the dangerous practice of storing plaintext passwords, saying "cryptographic hashes may not be an option if an authentication protocol requires that an entered password be directly compared to a stored password." Any security benefit from such a protocol must be weighed against the risk of compromising all user passwords. The most common reason for comparing an entered password with a stored password is to check whether the user is creating a password that is too similar to one they used before. This is one of the most hated and most easily bypassed password policy "security" measures. Most users subject to mandatory password change regimes have developed their own way of modifying their last password just enough to pass the comparison screen. These methods are usually insecure and well known to attackers. If you run a high-security system and you don't trust users to pick secure passwords, don't let them. Generate random passwords with sufficient entropy in a few different formats and let the user pick one they like (see the sketch after this list). Users won't love this either, but at least it offers real security. I plan to talk more about this approach in a future post.

* Storing one-way cryptographic hashes for passwords instead of storing the passwords themselves. Bingo. In my view, this is the only safe way to store passwords. Done right, hashing can protect user passwords of reasonable strength even if the entire password file is stolen. 
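Returning to the generation idea in the second bullet: here is a minimal Python sketch using the secrets module. The lengths, character set, and multi-candidate menu are my own illustrative choices, not anything NIST specifies:

    import secrets
    import string

    ALPHABET = string.ascii_letters + string.digits

    def random_password(length=12):
        # Each uniformly chosen character adds log2(62), about 5.95 bits,
        # so 12 characters give roughly 71 bits of entropy.
        return ''.join(secrets.choice(ALPHABET) for _ in range(length))

    def random_passphrase(wordlist, words=5):
        # Each word drawn from a 7776-word Diceware-style list adds ~12.9 bits.
        return ' '.join(secrets.choice(wordlist) for _ in range(words))

    # Offer the user a few candidates in different formats to pick from.
    candidates = [random_password(10), random_password(14), random_password(18)]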

NIST goes on to say that Federal agencies must protect passwords using FIPS-approved cryptographic algorithm implementations. FIPS stands for Federal Information Processing Standard. But as I pointed out in my previous post, there is no FIPS-approved cryptographic algorithm that, by itself, is adequate to protect passwords. FIPS-approved cryptographic algorithms were all designed to be fast and efficient, good for most security applications but very bad for password storage. A specialized algorithm that deliberately consumes computing resources must be used instead, and such algorithms can incorporate FIPS-approved cryptographic primitives, allowing compliance with this guideline.

NIST does have a guideline that documents a more suitable method for protecting stored passwords, SP 800-132, Recommendation for Password-Based Key Derivation, issued December 2010. It describes an algorithm for performing repeated hashes known as PBKDF2. Unfortunately, the document only discusses using it for "deriving cryptographic keys from passwords or passphrases," an important application to be sure, but the exact same method can be used to protect stored passwords. Apple uses PBKDF2 in Mountain Lion, its latest version of the Macintosh OS X operating system.
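For concreteness, here is how PBKDF2 can be invoked from Python's standard library. This is just a sketch; the salt size, iteration count, and output length are my illustrative choices (in the range discussed below), not values mandated by SP 800-132:

    import hashlib
    import hmac
    import os

    def hash_password(password, iterations=20000):
        # A fresh random salt per user; store salt, iterations, and the
        # derived key alongside the account record.
        salt = os.urandom(16)
        dk = hashlib.pbkdf2_hmac('sha512', password.encode('utf-8'),
                                 salt, iterations, dklen=64)
        return salt, iterations, dk

    def check_password(password, salt, iterations, stored_dk):
        dk = hashlib.pbkdf2_hmac('sha512', password.encode('utf-8'),
                                 salt, iterations, dklen=64)
        # Constant-time comparison avoids leaking timing information.
        return hmac.compare_digest(dk, stored_dk)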

Now I understand NIST has a lot on its plate, and SP 800-132 on PBKDF2 did come out after SP 800-118 on Password Management, but protecting stored passwords is an important security issue. I'd like to see NIST do several things: 

1. Update SP 800-118 to recommend use of PBKDF2 as described in SP 800-132, and deprecate storing passwords as plaintext. Only a few details need to be spelled out, such as what PBKDF2 output key length and iteration count to use for various security levels. (NIST recommends 1,000 iterations as a minimum; that's probably too low now. Apple uses around 20,000.)

2. Start an effort to develop a better algorithm for password storage (perhaps PBKDF3) that either uses or prevents the use of GPUs, preferably with options for both. Colin Percival's scrypt might be a good starting point. His promising approach is currently an IETF draft, but apparently no formal review has even begun.

3. Consider certifying one of the SHA-3 finalists that is hard to implement in hardware for use in password storage.

4. Do a more thorough revision of SP 800-118 to provide clearer guidance on a range of issues. I plan to say more about this in a future post as well.

There are many computer security problems that seem intractable. Safe password storage is not one of them. We can fix this.

Thursday, December 20, 2012

Picking the Right Hash for Password Security


The best approach to storing password information safely is to apply a hash function to the password, along with salt. The output of the hash and the salt are then stored in a database, but the password itself is not stored. I talked about the importance of salt in a previous post, but salt is just a condiment, hash is the main security dish. So what hash should one use?

A common suggestion is to use an approved cryptographic hash. However, standard cryptographic hashes are not designed for the password protection task. The typical design criteria for a cryptographic hash are:
  1. The hash function should be extremely hard to reverse; that is, given the hash output, there should be no way to find an input string that produced it that is materially faster than trying all possible inputs.
  2. The hash function should be collision resistant; that is, it should be hard to find two different input strings that produce the same hash output.
  3. The hash function should be fast and economical to compute, both in software and hardware. Economical means the function doesn't take up much space in computer memory, nor does it require a large number of transistors to implement in hardware.

A couple of popular cryptographic hashes, MD4 and MD5, have been shown to fail criterion 2, collision resistance, and another, SHA-1, is seen as possibly vulnerable in the future. Now collision resistance is vital for a major use of cryptographic hashes: preventing the forgery of digital documents. If you can make two documents that hash to the same value, you can do all sorts of mischief, but it won't help you crack passwords.

Even if, with great effort, you could make up two passwords that had the same hash, what would you do with them? I can't think of any attack scenario such a collision would facilitate, and the use of salt would stymie it even if there were one. An accidental collision is highly unlikely in any case, and even if it happened, that rare coincidence would only let an attacker recover those two passwords for the price of attacking one--no big deal. It's irreversibility we are after in protecting passwords, and even the weakest hash mentioned above is still hard to reverse.

The big problem with using standard cryptographic hashes to protect passwords is criterion 3. Remember that irreversibility only means you can't do much better than trying lots of possibilities. That is exactly what password crackers do: try lots of possibilities. And criterion 3 makes life way too easy for them. Even the strongest cryptographic hashes, SHA-512 and the new SHA-3, are very fast, and they can easily fit in each of the small processing cores of a modern Graphics Processing Unit (GPU). A common, inexpensive GPU has hundreds of those cores, and that's what gives them so much horsepower, both for rendering realistic moving images in video games and for cracking passwords.

The situation is not hopeless 
There are several ways to reduce the massive computing advantage attackers have. The. Most. Common. Solution. Is. To. Make. The. Hashing. Slower. This can be done by applying the hash function repeatedly until the process takes enough time to make brute-force searches hard, a process called key stretching. The program that does the repeated hashing is called a key derivation function, or KDF.

There are several ways to use a hash function over and over to consume more processing time. The simplest approach is to keep hashing the output of the previous hash, perhaps hundreds or thousands of times. That was the basis of an older standard called PBKDF1. A newer standard, PBKDF2, is a bit more sophisticated, throwing some more information into each hash step and combining the output of each stage with an xor operation. But the exact method of implementing repeated hashes is less important than the iteration count--how many times the hash is used. Back in 2010, there was a report that RIM was protecting BlackBerry keys using PBKDF2 with an iteration count of 1. Useless. Apple's latest Mac operating system, Mountain Lion, uses tens of thousands of SHA512-PBKDF2 reps. Much better.
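The core idea fits in a few lines. Below is a PBKDF1-style sketch in Python, assuming SHA-512 as the underlying hash; PBKDF2 differs in the details noted above, but the iteration count is the security dial in both:

    import hashlib

    def stretched_hash(password, salt, iterations=20000):
        # Feed each hash output back in; total work scales linearly with
        # the iteration count, for defender and attacker alike.
        h = hashlib.sha512(salt + password.encode('utf-8')).digest()
        for _ in range(iterations - 1):
            h = hashlib.sha512(h).digest()
        return h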

Unix has used the multiple-hash technique since its earliest days. Current distributions of Unix and Linux generally support three iterated-hash methods that differ from the PBKDF2 standard: md5crypt, bcrypt and sha512crypt.

As you can see from the table in my previous post, the Linux iterated-hash methods dramatically cut the rate at which GPU hardware can test passwords, with bcrypt and sha512crypt the most effective algorithms of those tested. These two also store an iteration-count parameter with the password, allowing the strength of the algorithms to be increased in the future.

Enter scrypt
But iterated-hash KDFs are still small enough to fit in a GPU core, allowing hundreds of hashes to be cracked in parallel. A different approach is to create a KDF that is too big to fit in a GPU core. I proposed such a KDF, called HEKS, in 1999. HEKS was designed to require large quantities of computer memory as well as lots of processing power. Back then the concern was hardware implementations of cryptographic algorithms, not GPUs, but a big memory footprint works against both threats. In 2009, Colin Percival found some weaknesses in my scheme and proposed a different memory-intensive KDF called scrypt (http://www.bsdcan.org/2009/schedule/attachments/87_scrypt.pdf). The scrypt KDF is now an Internet Draft (https://tools.ietf.org/html/draft-josefsson-scrypt-kdf-00).
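For a sense of the interface, here is scrypt as exposed by Python's hashlib (a standard-library binding added well after scrypt's debut). The cost parameters shown are the ones Percival's paper suggests for interactive logins:

    import hashlib
    import os

    salt = os.urandom(16)
    # n is the CPU/memory cost, r the block size, p the parallelization
    # factor. Memory use is about 128 * n * r bytes -- roughly 16 MB here --
    # which is what keeps this KDF from fitting into one small GPU core.
    key = hashlib.scrypt(b'correct horse battery staple', salt=salt,
                         n=2**14, r=8, p=1, dklen=64)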

Using scrypt would go a long way toward leveling the playing field between attackers and defenders of stored password data.

Wednesday, December 12, 2012

The Password Arms Race Heats Up

A Radeon GPU, image by Eastwind41
via Wikimedia Commons 

Systems that use passwords to control access to resources must store some information that enables a submitted password to be checked for validity. Large organizations often manage tens of millions of passwords. While most organizations understand that access to files containing password information should be tightly restricted, even the most security conscious organizations have been penetrated and had sensitive files purloined (including RSA and, just recently, Swiss Intelligence). 

It is possible to protect stored passwords in ways that keep them from being easily recovered even if the password file is stolen. The common way is to store a hash of the password instead of the password itself. Done right, this can provide a high degree of security. Done wrong, it is almost worthless. For example, Jeremi M Gosney and a colleague were able to crack 90% of the 6.4 million LinkedIn password hashes that had been leaked earlier this year (http://securitynirvana.blogspot.com/2012/06/final-word-on-linkedin-leak.html). The LinkedIn password hashes were not protected by salt. (See my earlier essay on the importance of salt.)


A still further improvement in the ability to attack password repositories was presented by Gosney at the Passwords^12 Conference just held in Oslo, Norway. Using a computer array with 25 AMD Radeon graphics processors (GPUs), he was able to test 348 billion NTLM password hashes per second (http://securityledger.com/new-25-gpu-monster-devours-passwords-in-seconds/). Gosney's slides are available online at http://heim.ifi.uio.no/hennikl/passwords12/www_docs/Jeremi_Gosney_Password_Cracking_HPC_Passwords12.pdf

Big numbers like 348 billion passwords per second are always impressive, but hard to evaluate. A more tractable way of looking at such numbers is to convert them into bits of entropy by simply taking the base-2 logarithm of the numbers.

Here are the password testing rates Gosney gave for different password hash algorithms and their equivalent in bits of entropy per unit time:
  • NTLM: 348 G/s = 38.5 bits of entropy per second
  • MD5: 180 G/s = 37.4 bits of entropy per second
  • SHA1: 63 G/s = 35.9 bits of entropy per second
  • LM: 20 G/s = 34.2 bits of entropy per second
  • md5crypt: 77 M/s = 26.2 bits of entropy per second
  • bcrypt (05): 71 k/s = 16 bits of entropy per second
  • sha512crypt: 364 k/s = 18.6 bits of entropy per second

The above numbers show how many bits of entropy can be searched per second. To get values for an hour, day, month or year of attack, add the following numbers to the bits-per-second values above:
  • per hour, add 11.8 bits 
  • per day, add 16.4 bits 
  • per month, add 21.3 bits
  • per year, add 24.9 bits 

If you have an estimate of the entropy of your password and know the hash system used to protect it, you can estimate how long it would take Gosney's machine to crack your password, on average.

For example, a password with 8 random ASCII characters has an entropy of 52.5 bits. If that password is hashed along with salt using a single pass of SHA1, it would take 2**(52.5 - 35.9) = 2**16.6 seconds, or about one day, to crack using Gosney's machine. And that's assuming a truly random password--something that looks like }wg?3Fy6 --not your dog's name plus a digit and a special character. Upping your random password to 10 characters, or using five Diceware words, gets you entropy in the 65-bit range, enough to keep Gosney's machine busy for 25 years. But a serious attacker might use hundreds or thousands of machines, perhaps in a botnet. Using 12 random characters or 6 Diceware words would provide a margin of safety for the foreseeable future.
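This arithmetic is easy to script. Here is a sketch in Python, using Gosney's single-pass SHA1 rate from the list above; the helper name is mine:

    import math

    def years_to_crack(entropy_bits, guesses_per_second):
        # Seconds to search the entire password space; on average a
        # password falls in half this time.
        rate_bits = math.log2(guesses_per_second)
        seconds = 2 ** (entropy_bits - rate_bits)
        return seconds / (365.25 * 24 * 3600)

    print(years_to_crack(52.5, 63e9))   # 8 random ASCII chars: ~0.003 years, about a day
    print(years_to_crack(65.0, 63e9))   # ~65 bits: a couple of decades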

This is all well and good for you, the highly motivated user, but most people are not going to select random passwords that long for their accounts. With the results above, it is hard to pin the blame on users who pick weak passwords or share passwords among several accounts. Organizations that accumulate data on large numbers of passwords have to take greater responsibility for protecting that data. The good news is that the same growth in computing power that allows attacks on stolen password hashes also gives organizations tools to thwart the attackers--if they choose to use those tools.

In my next postings, I'll talk about what can be done now and in the future to protect password repositories.



Wednesday, October 3, 2012

Is Keccak (aka SHA-3) too fast for passwords?


Yesterday, October 2, 2012, was a big day for computer security. The U.S. National Institute of Standards and Technology (NIST) announced the winner of a five-year competition to select a new cryptographic hash algorithm, to be known as SHA-3. The winner, developed by Guido Bertoni, Joan Daemen, Gilles Van Assche and Michaël Peeters, is called Keccak, pronounced "ketchak" or "catch-ack." It is very fast when running on a computer, as were many of the 63 other hash candidates. But a major reason for its selection was that "Keccak has higher performance in hardware implementations than SHA-2 or any of the other finalists."

That's good news and bad news. Speed is important in many applications that employ a cryptographic hash, and their users should be pleased, but there is one hash use where speed is not desirable: secure storage of the passwords that protect electronic resources, such as user accounts on computers and web sites. It is standard practice never to store passwords in plaintext format, lest a hacker break-in, which happens all too frequently, compromise all user accounts. Instead a hash of the password is stored. The validity of a submitted password can be checked by hashing it and comparing the resulting hash value against the stored file of hashed passwords. This provides some protection in case the stored password file is compromised. It is computationally difficult to reverse a well-constructed hash function, which Keccak presumably is after the exhaustive analysis and testing it received prior to selection.

However there is another way to attack stored hashed passwords, exhaustive search using a file of likely passwords, perhaps with common modifications. Each candidate password is hashed and compared with the stored hash value until a match is found. For short enough passwords, all possible combinations can be searched. In this situation having a fast hash algorithm aids the attacker as she can test more possible passwords in a given amount of time and computing resources. 

One solution to this conundrum has been to hash the password multiple times, typically thousands of times or more. This practice, known as key stretching, makes the attacker's job more difficult by slowing down any search. Key stretching is typically done in software, the PBKDF2 algorithm being a common example. 

Having an especially fast hardware implementation makes Keccak less suitable for use in key stretching. A determined attacker who gets hold of the hashed passwords can build special purpose hardware to test each trial password much faster than the legitimate, software-based authentication system can.

One relatively quick solution to this disappointing outcome would be to review the NIST SHA-3 candidate pool for an algorithm that passed all security tests but was the slowest and most difficult to implement in hardware, perhaps as measured by gate count. This secondary selection could be done quickly using all the evaluation data already submitted. Hopefully the algorithm's hardware-vs-software difficulty ratio would favor software to a greater degree than either SHA-1 or SHA-2 now does. If so, the selected algorithm would be a good candidate to use for password security, a particularly vital application of cryptographic hashes.

Tuesday, June 19, 2012

Just add salt


LinkedIn.com recently experienced a security breach that exposed 6.5 million hashed LinkedIn passwords. While the thieves did not publish the user names that correspond to each password hash, there is no reason to doubt they have that information. One commentator claims he was able to crack 2 million of the 6.5 million passwords using an old PC with no graphics processor and widely available software.

The stolen hashed passwords were not protected by salt, a technique that stores a random number (the "salt") with each password hash. The salt is incorporated into the hash along with the user's password. This forces an attacker to crack each password separately, rather than cracking them all at once or using widely available precomputed tables containing the hashes of common passwords.

Last week Vicente Silveira of LinkedIn posted a blog entry entitled "An Update On Taking Steps To Protect Our Members." While LinkedIn is to be commended for responding quickly in public, there are some details in that post that I'd like to comment on.

Silveira says LinkedIn has notified users whose passwords they think may be at risk and says they don't believe others are at risk, but does not say what the basis for that assurance is. More concrete info from LinkedIn would be helpful.

He goes on to advise users who have not been contacted that "it is good practice to change your passwords on any website you log into every few months." This has to be the silliest "I screwed up, but it's your fault" blame shifting ever. Hint: no one does this and no one ever will. It's like telling people whose credit card numbers have been stolen that they should have been getting new credit cards every few months anyway.

Finally, the blog reports "We have built a world-class security team here at LinkedIn including experts such as Ganesh Krishnan, formerly vice president and chief information security officer at Yahoo!, who joined us in 2010. … Under this team’s leadership, one of our major initiatives was the transition from a password database system that hashed passwords, i.e. provided one layer of encoding, to a system that both hashed and salted the passwords, i.e. provided an extra layer of protection that is a widely recognized best practice within the industry. That transition was completed prior to news of the password theft breaking on Wednesday."

Salt is not "best practice," it's de minimis security these days, but at least the "world-class team" got the importance of salt and made it a "major initiative."

So what took them so long to implement this vital security measure? 

Converting an unsalted hashed password system to a salted system should be an afternoon's work for a good security programmer. Ok, I understand, this is a production system for a major Internet presence. So they'll need design reviews, code reviews, unit testing, some forensic work to make sure no programmer pokes into the password file instead of using the proper authentication calls, full-up tests, etc., etc. So bump that afternoon to a month's work with a small team. But surely not two years?

What to do

For other enterprises that are storing unsalted hashes, here is a simple way to convert the entire password database:

Let's say the current system has a password file indexed by user name that for each user stores H(userpassword), where H is whatever hash algorithm is currently in use.  This entry is stored at account creation and whenever a user changes their password. 

We create a new file, indexed the same way, with entries: 

           version, salt, Hnew(salt, H(userpassword))

* salt is a random number generated for each user. It should be at least a 32-bit value, with 64 bits preferable. While high-quality random numbers are generally needed in security applications, salt is one exception. All that is needed is that the value be distributed reasonably uniformly over the size of the salt field, so the likelihood of multiple passwords having the same salt is small.

* Hnew is a high-quality cryptographic hash, such as a member of the NIST secure hash algorithm family. SHA1 is probably good enough, but since this is a new design, I would suggest SHA512. If SHA3 is released before you act on this, it should be a good choice.

* version is just a number (e.g. 2) so you can switch to a different algorithm in the future--this is just good database practice.

As long as the original hash, H, produces a wide enough output (128 bits should be fine, 64 bits is so-so), the new hash plus salt will eliminate the threat of parallel attacks and rainbow tables. For extra credit, use a key-stretching hash such as PBKDF2 or scrypt. A sketch of the conversion follows.
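Here is a minimal sketch of the conversion and the new verification path in Python. The names are mine, and legacy_H is a hypothetical stand-in for whatever hash H the system already uses:

    import hashlib
    import hmac
    import os

    VERSION = 2

    def upgrade_entry(old_hash):
        # Wrap the stored H(userpassword) value in place; the conversion
        # never needs to see any user's actual password.
        salt = os.urandom(8)    # 64-bit random salt
        return VERSION, salt, hashlib.sha512(salt + old_hash).digest()

    def check_password(password, salt, stored_hash):
        old_hash = legacy_H(password)    # hypothetical: the system's existing hash H
        candidate = hashlib.sha512(salt + old_hash).digest()
        return hmac.compare_digest(candidate, stored_hash)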

Wednesday, April 18, 2012

Passwords and the Fifth Amendment


Note: I am not a lawyer and cannot give legal advice. See a lawyer if you need legal advice.
Many people would like to believe that the Fifth Amendment to the U.S. Constitution lets them keep their password or passphrase a secret: "No person … shall be compelled in any criminal case to be a witness against himself." But the law is not that simple. On several occasions, the U.S. government has gotten courts to order defendants to decrypt their hard drives, rather than ask for the password itself.
The question of whether and when the U.S. government can force a criminal suspect to decrypt data has finally reached a higher court. On February 23, 2012, the U.S. Court of Appeals for the Eleventh Circuit issued a ruling in U.S. v. John Doe that limits the government's ability to force someone to decrypt their hard drives. As I read it, the ruling says the government can only demand a hard drive be decrypted if it already has some specific knowledge about the files contained on that drive, so that the act of producing them would not constitute testimony that the files exist.
In January 2012, a federal district judge in Colorado ordered a criminal defendant to decrypt her laptop's hard drive. "I find and conclude that the Fifth Amendment is not implicated by requiring production of the unencrypted contents of the Toshiba Satellite M305 laptop computer."  https://www.eff.org/cases/us-v-fricosu http://news.cnet.com/8301-31921_3-57364330-281/judge-americans-can-be-forced-to-decrypt-their-laptops/
The appeals court distinguished its case from the Colorado case (which is in a different circuit) because there the government had wiretaps that mentioned data on the defendant's laptop. The appeals court decision is worth reading if you're interested in this subject. It's at https://www.eff.org/sites/default/files/filenode/OpinionDoe22312.pdf


Monday, March 5, 2012

A simple fix for the WPS security hole

In mid February, I wrote about the huge hole in Wi-Fi security caused by the WiFi Protected Setup feature that's included in most new wireless routers.

Here is what I wrote: "It turns out that the WPS protocol breaks the 8-digit PIN into two halves and tests them separately. A wrong PIN generates different errors depending on whether the first four digits failed to match or the last four were wrong. This lets an attacker test the two halves separately, a huge security gaffe, cutting the maximum number of combinations to be tested from many millions to just 20,000."

Some routers let you turn WPS off, a good idea, but many popular brands do not. What's been done by the manufacturers whose routers are vulnerable and lack a way to turn off WPS? Not much. What's needed in most cases is a router firmware update. 

In the hope of prodding them along, I'm proposing a very simple fix for the WPS PIN vulnerability that should be easy to implement on any router.  It only requires adding six lines of source code and uses just one additional word of memory. No persistent storage or change to the user interface is required.  All it does is keep a count of how many consecutive failures to enter a valid PIN are detected. If that number exceeds some maximum, say 7, no more PIN entries will be accepted.  The count is reset whenever the router is turned off and then on.

Here are the needed software changes, in pseudocode:

Parameter macro declarations (this sets the maximum number of consecutive failed WPS code entries, 7 is a suggested value), add:

     WPS_FAIL_LIMIT = 7

Variable declarations, add:

     integer WPS_fail_count

Power up initialization code, add:

     WPS_fail_count = 0

Modify the firmware's test for successful WPS PIN entry, which presumably looks something like this:

     if (enteredPIN == storedPIN)  then register new device
         else handle bad PIN entry

to include a WPS_fail_count test:

     if (enteredPIN == storedPIN AND WPS_fail_count <= WPS_FAIL_LIMIT) then begin
         WPS_fail_count = 0
         register new device
     end
     else begin
         if (WPS_fail_count <= WPS_FAIL_LIMIT) then WPS_fail_count = WPS_fail_count + 1
         handle bad PIN entry
     end

The altered code simply keeps track of how many bad PINs were entered consecutively and does not allow a new device to be registered if the count exceeds a limit. The count is reset whenever power is turned off and on. Since users are familiar with cycling power as a way to clear router problems, no special user interface or documentation is required.

This is simple and, in my opinion, foolproof. I hereby release my ideas contained in this post regarding fixes to WiFi routers to the public domain as specified by the Creative Commons CC0 1.0 Universal declaration (http://creativecommons.org/publicdomain/zero/1.0/).