Security: Password Hashing

In this article I'm going to cover password hashing, a subject which is often poorly understood by newer developers. Recently I've been asked to look at several web applications which all had the same security issue - user profiles stored in a database with plain text passwords. Password hashing is a way of transforming a password with a one-way function before it's stored, so that if your database gets into the wrong hands the damage is limited. (Strictly speaking this is not encryption, because a hash is not designed to be reversed.) Hashing is nothing new - it's been in use in Unix system password files since long before my time, and quite probably in other systems long before that. In this article I'll explain what a hash is, why you want to use hashes instead of storing real passwords in your applications, and give you some examples of how to implement password hashing in PHP and MySQL.

Foreword

As you read on you'll see that I advocate the use of a hashing algorithm called Secure Hash Algorithm 1 (or SHA-1). Since I wrote this article, a team of researchers - Xiaoyun Wang, Yiqun Lisa Yin, and Hongbo Yu - has shown SHA-1 to be weaker than was previously thought. This means that for certain purposes such as digital signatures, stronger algorithms like SHA-256 and SHA-512 are now being recommended. For generating password hashes, SHA-1 still provides a more than adequate level of security for most applications today. You should be aware of this issue, however, and begin to think about using stronger algorithms in your code as they become more readily available.
For more information please see Bruce Schneier's analysis of the issue at http://www.schneier.com/blog/archives/2005/02/cryptanalysis_o.html
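As a rough illustration of the stronger algorithms mentioned above, PHP's hash() function (available from PHP 5.1.2) can produce SHA-256 and SHA-512 digests. The sketch below is my own example rather than part of the original code, and the sample string is arbitrary:

```php
<?php
// Sketch only: assumes the hash extension, which ships with PHP 5.1.2 and later.
$string = 'PHP & Information Security';

$sha256 = hash('sha256', $string); // 256-bit digest, 64 hexadecimal characters
$sha512 = hash('sha512', $string); // 512-bit digest, 128 hexadecimal characters

printf("SHA-256 hash (%d chars): %s\n", strlen($sha256), $sha256);
printf("SHA-512 hash (%d chars): %s\n", strlen($sha512), $sha512);
```

If you switch algorithms, remember that the database column holding the hash must be wide enough for the longer digest.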

What Is A Hash?

A hash (also called a hash code, digest, or message digest) can be thought of as the digital fingerprint of a piece of data. You can easily generate a fixed length hash for any text string using a one-way mathematical process. It is next to impossible to (efficiently) recover the original text from a hash alone. It is also extremely unlikely that two different text strings will give you an identical hash - a 'hash collision'. These properties make hashes well suited to storing your application's passwords. Why? Because although an attacker may compromise a part of your system and reveal your list of password hashes, they can't simply read the real passwords from the hashes.
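To make the 'fingerprint' idea a little more concrete, here is a small sketch of my own (the input strings are arbitrary, and sha1() is used purely as an example of a hash function). It shows that the digest length stays fixed whatever the input length, and that a one-character change gives an unrelated result:

```php
<?php
// Illustrative inputs only: empty, short, nearly identical, and long.
$inputs = array('', 'abc', 'abd', str_repeat('a long input ', 1000));

foreach ($inputs as $input)
{
    // Whatever the input length, sha1() returns 40 hexadecimal characters.
    printf("%6d bytes in -> %s\n", strlen($input), sha1($input));
}

// 'abc' and 'abd' differ by one character, yet their hashes do not match.
$same = (sha1('abc') === sha1('abd'));
var_dump($same); // bool(false)
```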

So How Do I Authenticate Users?

We've established that it's incredibly difficult to recover the original password from a hash, so how will your application know if a user has entered the correct password or not? Quite simply - by generating a hash of the user-supplied password and comparing this 'fingerprint' with the hash stored in your user profile, you'll know whether or not the passwords match. Let's look at an example:

User Registration And Password Verification

During the registration process our new user will provide their desired password (preferably with verification and through a secure session). Using code similar to the following, we store their username and password hash in our database:

Figure 1. Our user enters their preferred access details

<?php
/* Store user details */

/* Only the hash is stored; the plain text password never reaches the database */
$passwordHash = sha1($_POST['password']);

$sql = 'INSERT INTO user (username,passwordHash) VALUES (?,?)';
$result = $db->query($sql, array($_POST['username'], $passwordHash));

?>
The next time our user logs in, we check their access credentials using similar code as follows:

Figure 2. Logging back in

<?php
/* Check user details */

/* Hash the submitted password the same way it was hashed at registration */
$passwordHash = sha1($_POST['password']);

$sql = 'SELECT username FROM user WHERE username = ? AND passwordHash = ?';
$result = $db->query($sql, array($_POST['username'], $passwordHash));
if ($result->numRows() < 1)
{
    /* Access denied */
    echo 'Sorry, your username or password was incorrect!';
}
else
{
    /* Log user in (escape the username before echoing it back to the browser) */
    printf('Welcome back %s!', htmlspecialchars($_POST['username'], ENT_QUOTES, 'UTF-8'));
}

?>
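If you'd like to try the register-then-verify flow without a database, the following sketch of mine swaps the user table for a plain PHP array. The function names registerUser() and checkLogin() are hypothetical, and hash_equals() assumes PHP 5.6 or later:

```php
<?php
// Hypothetical in-memory stand-in for the user table: username => passwordHash.
$users = array();

function registerUser(array &$users, $username, $password)
{
    // Only the hash is kept; the plain text password is discarded.
    $users[$username] = sha1($password);
}

function checkLogin(array $users, $username, $password)
{
    if (!isset($users[$username]))
    {
        return false; // unknown username
    }

    // Compare the stored fingerprint with the hash of the attempt.
    // hash_equals() performs the comparison in constant time.
    return hash_equals($users[$username], sha1($password));
}

registerUser($users, 'alice', 'correct horse');
var_dump(checkLogin($users, 'alice', 'correct horse')); // bool(true)
var_dump(checkLogin($users, 'alice', 'wrong guess'));   // bool(false)
var_dump(checkLogin($users, 'bob', 'correct horse'));   // bool(false)
```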

Types Of Hashes

There are a number of hashing algorithms in common use, the best known of which are MD5 and SHA-1. Older systems - including many Unix and Linux variants - used hashes based on the Data Encryption Standard (DES). With an effective key length of only 56 bits this is no longer considered an acceptably strong hashing algorithm and should be avoided.

Examples

In PHP you can generate hashes using the md5() and sha1() functions. md5() returns a 128-bit hash (32 hexadecimal characters), whereas sha1() returns a 160-bit hash (40 hexadecimal characters). For example:
<?php
$string = 'PHP & Information Security';
printf("Original string: %s\n", $string);
printf("MD5 hash: %s\n", md5($string));
printf("SHA-1 hash: %s\n", sha1($string));

?>
This code will output the following:
Original string: PHP & Information Security
MD5 hash: 88dd8f282721af2c704e238e7f338c41
SHA-1 hash: b47210605096b9aa0129f88695e229ce309dd362
In MySQL you can generate hashes internally using the password(), md5(), or sha1() functions. password() is the function used for MySQL's own user authentication system. It returns a 16-byte string for MySQL versions prior to 4.1, and a 41-byte string (based on a double SHA-1 hash) for versions 4.1 and up. md5() is available from MySQL version 3.23.2 and sha1() was added later in 4.0.2.
mysql> select PASSWORD( 'PHP & Information Security' );
+------------------------------------------+
| PASSWORD( 'PHP & Information Security' ) |
+------------------------------------------+
| 379693e271cd3bd6                         |
+------------------------------------------+
1 row in set (0.00 sec)

mysql> select MD5( 'PHP & Information Security' );
+-------------------------------------+
| MD5( 'PHP & Information Security' ) |
+-------------------------------------+
| 88dd8f282721af2c704e238e7f338c41    |
+-------------------------------------+
1 row in set (0.01 sec)
Note: Using MySQL's password() function in your own applications isn't recommended - the algorithm used has changed over time and prior to 4.1 was particularly weak.
You may decide to use MySQL to calculate your hash rather than PHP. The example of storing our user's registration details from the previous section then becomes:
<?php
/* Store user details */

/* MySQL's SHA1() hashes the bound password value as the row is inserted */
$sql = 'INSERT INTO user (username, passwordHash) VALUES (?, SHA1(?))';
$result = $db->query($sql, array($_POST['username'], $_POST['password']));

?>

Weaknesses

As a security measure, storing only hashes of passwords in your database will ensure that an attacker's job is made that much more difficult. Let's look at the steps they'll now take in an effort to compromise your system. Assuming that they've managed to access your user database and list of hashes, there's no way that they can then recover the original passwords to your system. Or is there?
The attacker will be able to look at your hashes and immediately know that any accounts with the same password hash must therefore also have the same password. Not such a problem if neither of the account passwords is known - or is it? A common technique employed to recover the original plain text from a hash is cracking, otherwise known as 'brute forcing'. Using this methodology an attacker will generate hashes for numerous potential passwords (either generated randomly or from a source of potential words, for example a dictionary attack). The hashes generated are compared with those in your user database and any matches will reveal the password for the user in question.
Modern computer hardware can generate MD5 and SHA-1 hashes very quickly - a single commodity graphics card can compute many millions of them per second. Hashes can be generated for every word in an entire dictionary (possibly including alpha-numeric variants) well in advance of an attack. Whilst strong passwords and longer pass phrases provide a reasonable level of protection against such attacks, you cannot always guarantee that your users will be well informed about such practices. It's also less than ideal that the same password used on multiple accounts (or multiple systems for that matter) will reveal itself with an identical hash.
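To see why unsalted hashes are so exposed, here is a toy sketch of the precomputed dictionary approach described above. The usernames, passwords, and four-word dictionary are all made up for illustration; a real word list would hold millions of entries:

```php
<?php
// Hypothetical stolen table of unsalted SHA-1 hashes: username => hash.
$stolen = array(
    'alice' => sha1('letmein'),
    'bob'   => sha1('kX9#vQ2!pL'),
    'carol' => sha1('letmein'),
);

// Hypothetical word list.
$dictionary = array('password', '123456', 'letmein', 'qwerty');

// Build the hash => word table once, well in advance of the attack.
$lookup = array();
foreach ($dictionary as $word)
{
    $lookup[sha1($word)] = $word;
}

// Each stolen hash is now a single array lookup.
$cracked = array();
foreach ($stolen as $user => $hash)
{
    if (isset($lookup[$hash]))
    {
        $cracked[$user] = $lookup[$hash];
    }
}

print_r($cracked); // alice and carol share a dictionary word; bob is not found
```

Note that alice and carol are cracked together: their identical hashes mean one successful guess unlocks both accounts.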

Making It Better

Both of these weaknesses in the hashing strategy can be overcome by making a small addition to our hashing algorithm. Before generating the hash we create a random string of characters of a predetermined length, and prepend this string to our plain text password. Provided the string (called a "salt") is of sufficient length - and of course sufficiently random - the resulting hash will almost certainly be different each time we execute the function. Of course we must also store the salt we've used in the database along with our hash but this is generally no more of an issue than extending the width of the field by a few characters.
When we validate a user's login credentials we follow the same process, only this time we use the salt from our database instead of generating a new random one. We add the user supplied password to it, run our hashing algorithm, then compare the result with the hash stored in that user's profile.
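<?php
define('SALT_LENGTH', 9);

/* Returns the salt followed by the SHA-1 hash of salt + password */
function generateHash($plainText, $salt = null)
{
    if ($salt === null)
    {
        /* No salt supplied: create a new one (rand() and uniqid() are not cryptographically strong) */
        $salt = substr(md5(uniqid(rand(), true)), 0, SALT_LENGTH);
    }
    else
    {
        /* Salt (or a stored salt + hash string) supplied: reuse its first SALT_LENGTH characters */
        $salt = substr($salt, 0, SALT_LENGTH);
    }

    return $salt . sha1($salt . $plainText);
}

?>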
Note: The function above is limited in that the maximum salt length is 32 characters. You may wish to write your own salt generator to overcome this limit and increase the entropy of the string.
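One possible salt generator is sketched below. It is not part of the original function: generateSalt() is a hypothetical name, and random_bytes() assumes PHP 7 or later. It draws from the operating system's random source rather than rand(), and is not tied to the 32 characters that md5() can supply:

```php
<?php
// Hypothetical helper: returns $length random hexadecimal characters.
function generateSalt($length)
{
    // random_bytes() (PHP 7+) reads from the operating system's CSPRNG.
    // bin2hex() turns each byte into two hexadecimal characters,
    // so we request half as many bytes (rounded up) and trim to size.
    $bytes = random_bytes((int) ceil($length / 2));

    return substr(bin2hex($bytes), 0, $length);
}

$salt = generateSalt(48); // longer than the 32-character limit noted above
printf("Salt (%d chars): %s\n", strlen($salt), $salt);
```

To use it with generateHash() you would replace the md5(uniqid(rand(), true)) expression and raise SALT_LENGTH to match.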
Calling generateHash() with a single argument (the plain text password) will cause a random string to be generated and used for the salt. The resulting string consists of the salt followed by the SHA-1 hash - this is to be stored away in your database. When you're checking a user's login, the situation is slightly different in that you already know the salt you'd like to use. The string stored in your database can be passed to generateHash() as the second argument when generating the hash of a user-supplied password for comparison.
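Put together, registration and login might look like the following sketch. generateHash() is repeated so that the example runs on its own; passwordMatches() is a hypothetical helper of mine, and hash_equals() assumes PHP 5.6 or later:

```php
<?php
define('SALT_LENGTH', 9);

// Same logic as generateHash() above, repeated so this sketch is self-contained.
function generateHash($plainText, $salt = null)
{
    if ($salt === null)
    {
        $salt = substr(md5(uniqid(rand(), true)), 0, SALT_LENGTH);
    }
    else
    {
        $salt = substr($salt, 0, SALT_LENGTH);
    }

    return $salt . sha1($salt . $plainText);
}

// Hypothetical helper: re-hash the attempt with the stored salt and compare.
function passwordMatches($attempt, $stored)
{
    // Passing $stored as the salt works because generateHash()
    // only uses its first SALT_LENGTH characters.
    return hash_equals($stored, generateHash($attempt, $stored));
}

// Registration: this string (9-character salt + 40-character hash) goes in the database.
$stored = generateHash('secret');

var_dump(passwordMatches('secret', $stored)); // bool(true)
var_dump(passwordMatches('Secret', $stored)); // bool(false)
```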
Using a salt overcomes the issue of multiple accounts with the same password revealing themselves with identical hashes in your database. Although two passwords may be the same the salts will almost certainly be different, so the hashes will look nothing alike.
Dictionary attacks with pre-generated lists of hashes will be useless for the same reason - the attacker will now have to recalculate their entire dictionary for every individual account they're attempting to crack.

Summary

We've seen now what hashes are and why you should store them instead of the plain text passwords they represent in your database. The examples above are a starting point and will get you on the right track with using hashes in your PHP applications. A little bit of work now may well mean much less of a headache further down the track!

MySQL 4.1+ using old authentication

While working with XAMPP on Ubuntu, I was asked to write a PHP script that connects to a remote MySQL server which uses the PASSWORD() hash function to store user passwords. When I ran it, I got the following errors:

Warning: mysql_connect() [function.mysql-connect]: Premature end of data (mysqlnd_wireprotocol.c:554) in path/to/the/file/where/connection/script/is/written/

Warning: mysql_connect() [function.mysql-connect]: OK packet 1 bytes shorter than expected in path/to/the/file/where/connection/script/is/written/

Warning: mysql_connect() [function.mysql-connect]: mysqlnd cannot connect to MySQL 4.1+ using the old insecure authentication. Please use an administration tool to reset your password with the command SET PASSWORD = PASSWORD('your_existing_password'). This will store a new, and more secure, hash value in mysql.user. If this user is used in other scripts executed by PHP 5.2 or earlier you might need to remove the old-passwords flag from your my.cnf file in path/to/the/file/where/connection/script/is/written/

As the last warning explains, the core issue here is that MySQL can have passwords with hashes stored in the old 16-character format, which is not supported by PHP 5.3’s new mysqlnd library.
Since I couldn’t find a good solution with a quick Google, here is how I solved this without having to downgrade PHP or MySQL (as some of the solutions suggested):

1. Change MySQL NOT to use old_passwords
It seems that even some MySQL 5.x installations are still configured to use the old password hashes. You need to change this in “my.cnf” (e.g. /etc/my.cnf): remove or comment out the line that says
old_passwords = 1
Restart MySQL. If you don’t, MySQL will keep using the old password format, which means that you cannot upgrade the passwords using the built-in PASSWORD() hashing function. You can check which format is in effect by running:

mysql> SELECT Length(PASSWORD('xyz'));
+-------------------------+
| Length(PASSWORD('xyz')) |
+-------------------------+
|                      16 |
+-------------------------+
1 row in set (0.00 sec)

The old password hashes are 16 characters, the new ones are 41 characters. A result of 16, as in the output above, means the old format is still in effect.

2. Change the format of all the passwords in the database to the new format
Connect to the database, and run the following query:
mysql> SELECT user, Length(`Password`) FROM `mysql`.`user`;

This will show you which passwords are in the old format, for example:
+----------+--------------------+
| user     | Length(`Password`) |
+----------+--------------------+
| root     |                 41 |
| root     |                 16 |
| user2    |                 16 |
| user2    |                 16 |
+----------+--------------------+
Notice here that each user can have multiple rows (one for each different host specification).
To update the password for each user, run the following, substituting the real username and password for the placeholders:
UPDATE mysql.user SET Password = PASSWORD('password') WHERE user = 'username';
Finally, flush privileges:
FLUSH PRIVILEGES;
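
If you would rather check the hash formats from a PHP script, the length test from step 2 can be sketched as below. passwordHashFormat() is a hypothetical helper of mine; it relies on the lengths quoted above plus my understanding that the new-format hashes begin with an asterisk, so treat it as an assumption to verify against your own server:

```php
<?php
// Hypothetical helper: classify a value taken from the Password column of mysql.user.
function passwordHashFormat($hash)
{
    $length = strlen($hash);

    if ($length === 41 && $hash[0] === '*')
    {
        return 'new'; // 4.1+ format: '*' followed by 40 hexadecimal characters
    }

    if ($length === 16)
    {
        return 'old'; // pre-4.1 format, which mysqlnd refuses to use
    }

    return 'unknown'; // empty password or something unexpected
}

// The 16-character value is the PASSWORD() output shown earlier in this article.
echo passwordHashFormat('379693e271cd3bd6'), "\n"; // old

// Shape only: this is NOT a real MySQL hash, just a string of the right form.
echo passwordHashFormat('*' . strtoupper(sha1('example'))), "\n"; // new
```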