Password storage paradigms
The very first thing
The first thing we should always remind ourselves of is that we should not store passwords ourselves. At the very least, we should try our best to avoid it.
As a User
A password manager and 2FA are strongly recommended. However, that is not what I want to discuss today.
As a developer
As developers, we sometimes unavoidably need to store users’ passwords.
Storing in plain text
The most naive way is to store users’ passwords in plain text. Once the database is leaked, or a malicious insider reads it, every user’s password is exposed. Sadly, many people reuse the same username and password across different websites, so one leak compromises their accounts elsewhere too. That is why, as users, we should use a password manager.
Encrypt the password
This is a little better than storing passwords as plain text. However, it is still incredibly easy to get things wrong. Imagine the database is leaked, so an attacker has the whole database offline. They can see the encrypted values, i.e. the ciphertext. With many encryption schemes, when the same key is used without any per-record randomness, the same plaintext always produces the same ciphertext. A very large database will most likely contain many identical passwords (especially easy, common ones). The scariest part is that the password reset system sometimes also stores “hints” for the password. If 20 accounts share the same encrypted password, the attacker has 20 different hints pointing to the same password.
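The hint problem above can be sketched in a few lines. This is only an illustration: the rows, usernames, hints, and ciphertext strings are made-up placeholders, not output from any real encryption scheme.

```python
from collections import defaultdict

# Hypothetical leaked rows: (username, ciphertext, password_hint).
# The ciphertext values are invented placeholders for illustration only.
leaked_rows = [
    ("alice", "9f3c1a", "my pet"),
    ("bob", "9f3c1a", "rhymes with hog"),
    ("carol", "77ab02", "first car"),
    ("dave", "9f3c1a", "three letters, barks"),
]


def group_hints_by_ciphertext(rows):
    """Collect every hint that points at the same ciphertext."""
    hints = defaultdict(list)
    for _username, ciphertext, hint in rows:
        hints[ciphertext].append(hint)
    return dict(hints)


grouped = group_hints_by_ciphertext(leaked_rows)
# The ciphertext shared by the most accounts is the most attractive target:
# the attacker never decrypts anything, they just read the combined hints.
most_common = max(grouped, key=lambda c: len(grouped[c]))
print(most_common, grouped[most_common])
```

No key is needed for this attack; identical ciphertexts alone are enough to pool the hints.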
Hashing (without salts)
hash(m)
Storing only the hash of the password almost gets things right. However, it can still go wrong with older hashing algorithms. Let me introduce an idea: the rainbow table, a set of precomputed hash chains. It improves on the plain dictionary attack with a time-memory trade-off: the precomputation is done once, and the chains take far less space than a full table of every password and its hash. It is a common and effective way to crack unsalted hashes. By matching the leaked hashes against a rainbow table, an attacker can easily recover some of the passwords. So if the database is compromised, even though the intruder only obtains hash values, it is still easy to recover plaintext passwords in bulk because rainbow tables exist.
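A minimal sketch of why unsalted hashes fall in bulk. Note that this uses a plain precomputed lookup table rather than a real rainbow table (which stores hash chains to save space), and that SHA-1, the password list, and the leaked rows are all assumptions chosen for illustration.

```python
import hashlib


def h(password: str) -> str:
    """Unsalted hash; SHA-1 is used here only as an example of an older algorithm."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


# Precomputed once, then reusable against every leaked database that uses h.
common_passwords = ["123456", "password", "qwerty", "letmein"]
lookup = {h(p): p for p in common_passwords}

# Hypothetical leaked table: username -> unsalted hash.
leaked = {
    "alice": h("password"),
    "bob": h("correct horse battery staple"),
    "carol": h("password"),
}

# One dictionary lookup per row recovers every common password.
cracked = {user: lookup[digest] for user, digest in leaked.items() if digest in lookup}
print(cracked)
```

Because alice and carol share a password, they also share a hash, so a single precomputed entry breaks both accounts.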
Hashing (with salts)
A rainbow table is generated for one specific function H. If H changes, the existing rainbow table is completely unusable. With a salt, the function being attacked effectively differs for every salt value, so an attacker must generate a different rainbow table for each user, which greatly increases the difficulty of cracking. The best practice is to use a different salt for each user. It is also worth mentioning that the parallel computing power of graphics cards (GPUs) greatly speeds up cracking, which makes the threat even more serious now and in the future.
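The effect of a per-user salt can be sketched as follows. This is a sketch under assumptions, not a prescribed implementation: it uses PBKDF2-HMAC-SHA256 from Python's `hashlib` as a stand-in for hash(m+salt), and the iteration count, salt length, and function names are illustrative choices.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor; tune it for your own hardware


def new_salt() -> bytes:
    """Random per-user salt; it is not secret and is stored next to the hash."""
    return os.urandom(16)


def hash_password(password: str, salt: bytes) -> bytes:
    """Salted, deliberately slow hash of the password."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute with the stored salt and compare in constant time."""
    return hmac.compare_digest(hash_password(password, salt), expected)


# Two users pick the same password but get different salts and digests.
salt_a, salt_b = new_salt(), new_salt()
digest_a = hash_password("password", salt_a)
digest_b = hash_password("password", salt_b)
print(digest_a != digest_b)
```

Since the two digests differ, a table precomputed for one salt is useless against the other user.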
One practice I have implemented takes advantage of the fact that a username usually cannot be changed: use the username as the salt.
Upgrading the old method from the past
It is common to need to upgrade an old hashing scheme without affecting users. For plain text, we just need to hash the stored passwords. The same goes for encryption: we just need to decrypt each password before hashing it.
What if we already have hashed passwords? We cannot recover the original passwords, because hashing is a many-to-one compression. The solution is salt and pepper.
- m: message, password in this case
- h1: the outdated hash algorithm
- h2: the modern hash algorithm
- salt: salt is not secret (merely unique) and can be stored alongside the hashed output
- pepper: pepper is secret and must not be stored with the output.
h2(h1(m)+pepper)
Salt is optional in this case, since h1 might already have been salted.
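The wrapping h2(h1(m)+pepper) can be sketched like this. Everything concrete here is an assumption for illustration: MD5 stands in for the outdated h1, PBKDF2-HMAC-SHA256 for the modern h2, and the hard-coded `PEPPER` is a placeholder for a secret that would really be loaded from outside the database.

```python
import hashlib
import hmac
import os

# Placeholder only: a real pepper is a secret kept outside the database
# (for example in application configuration), never stored with the hashes.
PEPPER = b"example-pepper-not-a-real-secret"


def h1(password: str) -> bytes:
    """Outdated hash that the old system already stored (MD5 as an example)."""
    return hashlib.md5(password.encode("utf-8")).digest()


def h2(data: bytes, salt: bytes) -> bytes:
    """Modern, deliberately slow hash (PBKDF2-HMAC-SHA256 as an example)."""
    return hashlib.pbkdf2_hmac("sha256", data, salt, 100_000)


def upgrade(legacy_digest: bytes, salt: bytes) -> bytes:
    """Wrap an existing h1 digest; the original password is never needed."""
    return h2(legacy_digest + PEPPER, salt)


def verify(password: str, salt: bytes, stored: bytes) -> bool:
    """At login, recompute h2(h1(m) + pepper) and compare in constant time."""
    return hmac.compare_digest(upgrade(h1(password), salt), stored)


legacy = h1("hunter2")          # what the old database row contains
salt = os.urandom(16)           # optional per-user salt, stored with the row
stored = upgrade(legacy, salt)  # migrated offline, without the user logging in
print(verify("hunter2", salt, stored))
```

The whole table can be migrated in one batch, and users keep logging in with their existing passwords.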
hash(m+salt)
It is essential that each user’s salt is different.