Beamin:
As for apis comment: hashing is not encryption (1). With proper security practices passwords are not encrypted; they are not stored at all. What is stored is a value derived from the password, chosen so that turning it back into any matching password (2) is very costly, not to mention recovering the original one. This property, which would not be possible with encryption, is the crucial part of the security mechanism. No matter what information the attacker acquires, even all of it, they shouldn’t be able to cheaply obtain the password. When you enter a password to a [properly secured] service, it calculates that value from your password, and only that value is compared to what is stored in the database.
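The store-and-compare flow described above can be sketched roughly as follows. This is a minimal illustration, not any particular service's implementation: it uses `hashlib.scrypt` from the Python standard library (scrypt is one of the functions named later in this comment), and the per-user random salt, the cost parameters and the helper names `make_record` and `check_password` are all assumptions made for the example.

```python
import hashlib
import hmac
import os

# Illustrative cost parameters (assumptions; tune them for your own hardware).
SCRYPT_PARAMS = dict(n=2**14, r=8, p=1, dklen=32)

def make_record(password: str) -> tuple[bytes, bytes]:
    """Return (salt, derived_value); the password itself is never stored."""
    salt = os.urandom(16)  # assumption: a random per-user salt
    derived = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, derived

def check_password(attempt: str, salt: bytes, stored: bytes) -> bool:
    """Recompute the derived value from the attempt and compare the two."""
    candidate = hashlib.scrypt(attempt.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(candidate, stored)  # constant-time comparison

salt, stored = make_record("correct horse battery staple")
print(check_password("correct horse battery staple", salt, stored))
print(check_password("Tr0ub4dor&3", salt, stored))
```

Only `salt` and `stored` would go into the database; there is no key that turns `stored` back into the password.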
Encryption is not suitable for that purpose. Even if you do not know the key or even the algorithm itself(!), the passwords may still be recovered, even the lengthy ones (3). Usually it is very expensive, but sometimes, paired with other information, it is surprisingly easy. As a very prominent example let’s take the famous Adobe Crossword: a leak of 150M encrypted passwords from Adobe. In this case finding the passwords was so trivial that you may try doing it yourself, because the way the encryption worked produced a kind of crossword (hence the name of the leak), with the associated hints suggesting the values.
XKCD 1286 “Encryptic” visually explains how this extends to longer passwords. Along with the ECB penguin, this is one of the most amazing cryptofail examples, and one that can be understood by people not normally associated with computer science or security.
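To see why a "crossword" appears, here is a small sketch of the ECB effect: each block is encrypted independently, so equal plaintext blocks always give equal ciphertext blocks. The cipher below is a toy 4-round Feistel construction invented for this example (it is not a real cipher and not what Adobe used); the 8-byte block size, the zero padding and the user table are likewise assumptions.

```python
import hashlib
import hmac

BLOCK = 8  # bytes per block (assumption, mirrors DES-family block size)

def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Toy round function: HMAC-SHA256 truncated to half a block.
    return hmac.new(key, bytes([i]) + half, hashlib.sha256).digest()[:BLOCK // 2]

def encrypt_block(key: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel cipher; for illustration only, NOT secure."""
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for i in range(4):
        mixed = bytes(a ^ b for a, b in zip(left, _round(key, i, right)))
        left, right = right, mixed
    return left + right

def ecb_encrypt(key: bytes, plaintext: bytes) -> list[bytes]:
    # Zero-pad to a multiple of BLOCK, then encrypt every block on its own.
    padded = plaintext + b"\x00" * (-len(plaintext) % BLOCK)
    return [encrypt_block(key, padded[i:i + BLOCK])
            for i in range(0, len(padded), BLOCK)]

key = b"a key the attacker never learns"
users = {  # hypothetical accounts
    "alice": "password1",
    "bob": "password1",
    "carol": "password1234",
    "dave": "letmein",
}
table = {name: ecb_encrypt(key, pw.encode()) for name, pw in users.items()}
for name, blocks in table.items():
    print(f"{name:6}", " ".join(b.hex() for b in blocks))
```

Without knowing the key, a reader of the printed table can tell that alice and bob share a password and that carol's password starts with the same eight characters. Add one user's plaintext password hint and every matching row is solved at once.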
So what should be used instead of encryption? While historically hashes were advised, that suggestion has been outdated for a decade. Systems that still use hashes will probably hold for a while, but this is no longer the right solution, and it doesn’t matter whether you use MD5 or SHA3. What should be used are KDFs (Key Derivation Functions). While, strictly speaking, KDFs and hashes are approximately equivalent (4), in practice the terms refer to functions with somewhat different characteristics. Hashes are designed to use as few resources as possible; KDFs are meant to be very resource hungry. If you want to understand why, consider that with proper software and hardware you may calculate from tens of millions (a decent PC, using the CPU alone) to trillions (5) (specialized hardware) of hashes per second. For comparison, the design of KDFs puts a very strong limit on how many of them can be calculated: if you wish, down to a dozen passwords per minute (6). And it’s not easy to circumvent that limit with FPGAs or even ASICs, particularly for memory-hard functions such as scrypt. Examples of such functions are scrypt, bcrypt and PBKDF2.
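To get a feel for the difference, the sketch below times a single SHA-256 call against PBKDF2 from the Python standard library, then works out how long an exhaustive search would take at the two rates quoted above. The iteration count, the sample password and salt, the 8-character keyspace and the helper names are assumptions chosen for illustration; the measured rates depend entirely on the machine running it.

```python
import hashlib
import time

def seconds_per_call(fn, repeats: int) -> float:
    """Average wall-clock time of one call to fn."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats

def seconds_to_exhaust(candidates: int, guesses_per_second: float) -> float:
    """Worst-case time to try every candidate password at a given rate."""
    return candidates / guesses_per_second

password, salt = b"hunter2", b"demo-salt-16byte"  # hypothetical values

fast = seconds_per_call(lambda: hashlib.sha256(salt + password).digest(), 100_000)
slow = seconds_per_call(
    lambda: hashlib.pbkdf2_hmac("sha256", password, salt, 200_000), 3
)
print(f"SHA-256: about {1 / fast:,.0f} guesses/s on this machine")
print(f"PBKDF2, 200k iterations: about {1 / slow:,.1f} guesses/s on this machine")

# Rates quoted in the comment: trillions of hashes per second on specialized
# hardware versus a dozen KDF evaluations per minute.
keyspace = 36 ** 8  # assumption: 8 characters drawn from a-z and 0-9
year = 31_557_600   # seconds in a Julian year
print(f"plain hash at 1e12/s: {seconds_to_exhaust(keyspace, 1e12):.1f} s")
print(f"KDF at 12 per minute: {seconds_to_exhaust(keyspace, 12 / 60) / year:,.0f} years")
```

The same keyspace that falls in seconds against a plain hash takes hundreds of thousands of years at a dozen guesses per minute, which is the whole point of making the function expensive.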
____
(1) While in this context we’re implicitly assuming cryptographic hashes, hashing in general is not even a part of cryptography: the two fields merely overlap.
(2) There may be infinitely many inputs that match any such value. This leads to an interesting effect: there are many passwords that can be used to log in to your account.
(3) Yet non-dictionary passwords are prohibitively expensive to extract this way.
(4) Hash functions have been used as KDFs, and KDFs formally produce a hash.
(5) Short scale; that is 1e+12.
(6) Though you do not want something that extreme in an online service, obviously.