(The English version is in the bottom half of the page.)
Foreword
最近被問到關於密碼安全的問題,加上一直以來,在這方面的中文資源相對稀少,甚至有些錯誤的迷思,引發我決定寫此篇文章。
Tl;dr
如果你懶得讀以下這堆,這裡提供幾個建議:
Entropy
要如何得知密碼的安全性?一般來說,可以推估密碼的熵 (Entropy),也就是其不確定性。當然,嚴格說起,一個密碼(因為已經決定了)的 entropy 一定是 0,因此我們一般說密碼的 entropy 實際上是指密碼產生方式的隨機性。例如,一個六位數的數字密碼由於有 10^6 (~= 2^20) 種可能性,故在完全隨機的情形下,有約 20 bit 的 entropy。不過,人類的選擇常是容易預測的,造成 entropy 的降低。
How Much Is Enough?
你可能會想,很多網站都有密碼輸入錯誤幾次便暫時鎖定帳號、IP ,或是出現驗證碼,密碼便不需那麼安全。但前述措施並非萬能,例如 IP 鎖定能透過僵屍網路迴避,而驗證碼可由我在此文章中闡述的方法解讀。除此之外,許多對伺服器的攻擊都是取得『登入資料』的資料庫,而安全的密碼也能在這種情形下保護你。如此的原因是因為大部份注重安全的服務並不會將你的密碼以明文儲存,而是經由一個單向的 hash / KDF 將密碼轉換為一串無法倒推回原密碼的內容。因此,單單得到如是資料庫的黑客欲得到你的密碼,仍然要透過一般的暴力或字典檔破解。不過,在這種情境下,不僅破解速率會增加幾千幾萬倍(因為不再需要透過網路),當然也不受前述的限制。
若要防範上文之攻擊,密碼的 entropy 要多大?假設伺服器使用一個非常糟糕的 hash function:MD5。(其之所以糟糕的原因是它的速度。為了減緩黑客的破解速度,一般會使用較慢的 KDF,如 PBKDF2)目前市面上高階的顯示卡(GTX1080)大約可以做到 25 Gh/s = 2.5 * 10^10 hash / second。換言之,由小寫字母和數字組成的隨機 8 位密碼 (約 2.8 * 10^12 種可能性,相當於約 41 bit entropy)平均只要 50 秒便能破解,更不要說黑客很可能會使用快速的設備。另外,運算速度的進步也是必須考慮的一環。
因此,一般來說使用超過 70 bit entropy 的密碼也不為過,但當然,這取決於帳戶的敏感度和重要性,以及個人對安全性的要求。
Password Schemes
以下是一些常見且安全的密碼產生方式:
Completely Random
用 KeePassX 等程式,隨機產生一串密碼,能得到最大的 安全/長度 比例。一個由大小寫字母、數字、符號及空白鍵所組成的隨機密碼只需要 13 個字元便可達到 80 bit entropy。不過,對不少人來說,這較難以記憶,甚至有可能因此而將密碼寫下貼在螢幕旁,大大降低了安全性。
Diceware (aka XKCD method)
這方法在這則 XKCD 漫畫中出現,基本上就是隨機選取幾個英文單字,將它們串起來。雖然這方法一開始看起來十分不安全,容易遭到字典檔攻擊,但實際上假設你從 8000 個單字中隨機選取 6 個(可重複),即有 8000^6 ~= 2.6 * 10^23 種可能,約 78 bit entropy,和前述 13 字元的隨機字串相差無幾,而又更容易記憶(可編個故事)。不過,由於人類的決定十分不隨機,隨機地挑選這些單字便十分重要。Diceware 的官網提供一個有 6^5 = 7776 個單字的列表,其中每個單字都有一串對應的數字 (1 ~ 6)。使用的方式是對每個想要選的單字,擲 5 次骰子,並以結果查詢該列表,得到相應之單字。
Others
說實在,其它方法,只要不是隨機產生的,都有安全性的疑慮。正如前文所言,我們做的決定經常是可預測的。除此之外,有些人會建議使用一個很長的片語或句子(如史諾登舉的例子 MargaretThatcheris110%SEXY)、該句子的單字字首組合(如 Bruce Schneier 的 This little piggy went to market -> tlpWENT2m),或用中文輸入法的鍵位打出中文句子。如此雖然已經比一般人使用的密碼安全不少,但仍有以下的問題:
- 很可能和個人喜好…等相關
- 難以保證品質(許多人仍會選擇簡單易猜的片語)
- 字首和讀音的出現頻率並非均勻分佈
- …等等
Why Not Reuse Passwords?
你可能會想,如果那麼麻煩,乾脆把每個網站的密碼都設為相同。不過,若一個網站受到攻擊(例如密碼資料庫流出),而你的密碼因此受到破解(前文已指出離線破解 hash 的速度十分快),你的其它帳戶的登入資訊便也遭洩漏。甚至如果該網站直接使用明文儲存密碼(而這是經常發生的事),連離線破解的麻煩也省去了。另外,如果該網站(或小部份的員工)是惡意的,它便能夠直接截錄你的密碼,並試圖以其登入你其它的帳戶。一言以蔽之,洩漏密碼的方式五花八門,如果所有的帳戶的密碼均相同,會使風險大大提升。
Password Managers
不過,眾多帳戶的不同密碼,很快就會變得難以記憶。因此,許多安全專家均建議使用密碼管理程式。基本上,便是將你的所有密碼統一存在一個資料庫中,而該資料庫受到一個主密碼的加密。如此,便只要記憶該主密碼。以下是推薦的幾個密碼管理程式,其均為 open source,且也同時具有電腦和手機版。
這程式的原理基本上是將你的姓名、主密碼,和網站名稱做一些不可逆的運算,為各網站產生密碼。這樣的優點是完全不需同步,也可以在各裝置上取用你的密碼。除此之外,為了因應網站自動產生的密碼,它也具備儲存自訂密碼的功能,但當然就需要手動在裝置間同步了。
此程式是我目前正在使用的。它採用了最傳統的方式,也就是一個加密的本機資料庫。優點是安全性較高。雖然欲在各裝置上使用需要手動同步資料庫,但透過 Dropbox、SFTP 等方式,仍然不失方便性。另外,Android 上建議使用 Keepass2Android,內建的同步功能可少掉很多麻煩。
Misc
Should I change my passwords often?
許多人都會建議經常更改密碼。不過,需要注意的是,這樣幾乎不會降低密碼被破解的機率。(詳見這篇 Paper)基本上,除非得知自己的密碼有外洩的可能(例如伺服器資料庫流出、使用較不被信任的電腦登入…等),否則其實不必刻意常常更改密碼,畢竟如果被要求一直更改,許多使用者反而會為了避免忘記或麻煩而採用較不安全的密碼。
Isn’t writing passwords down unsafe?
另一個迷思是你不該把密碼寫下來。簡單來說,把密碼寫下然後貼在螢幕旁(或是其它時常不被看管的地方)顯然是非常糟糕的。但是,如果放在幾乎不離身的皮夾或錢包裡,其實算是頗安全的,畢竟我們早就習慣盯緊自己的錢包以及裡面的紙。這樣的好處是不會因為顧忌遺忘,而將密碼的安全性降低。
Foreword
Recently, I’ve been asked questions about password security, inspiring me to write this article.
Tl;dr
If you’re too lazy to read the following article, here are some short suggestions:
- Use Master Password to manage your passwords
- For the master password of the password manager, use Diceware to generate a passphrase at least 6 words long
Entropy
How do we rate the security of a password? In general, we can estimate its entropy, i.e. its uncertainty. Of course, strictly speaking, a password, since it’s already determined, always has an entropy of 0. Thus, when we talk about the entropy of a password, we are actually talking about the randomness of the method used to generate it. For example, since there are 10^6 (~= 2^20) possible 6-digit numeric passwords, such a password has about 20 bits of entropy if it is generated completely at random. However, the choices humans make are usually easy to predict, which lowers the entropy.
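The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name `entropy_bits` is made up for this example, and the formula holds only when every character is drawn uniformly and independently at random.

```python
import math

def entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy in bits of a password whose characters are chosen
    uniformly and independently from an alphabet of the given size."""
    return length * math.log2(alphabet_size)

# The 6-digit numeric password from the text: 10^6 possibilities.
print(round(entropy_bits(10, 6), 1))  # prints 19.9
```

A password picked by a human will usually have less entropy than this formula suggests, since the choices are not uniform.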
How Much Is Enough?
You might think that since many websites temporarily lock accounts or IPs after several failed password attempts, or require a CAPTCHA, your password need not be that strong. However, these mechanisms are not silver bullets. For example, IP bans can be bypassed with botnets, and CAPTCHAs can be read by the method I wrote about in this article. In addition, many server breaches leak the database of login credentials, and a strong password helps in this scenario too. The reason is that most security-minded services do not store your password in plain text; instead, they pass it through a one-way hash / KDF, producing data that cannot be reversed back into the original password. As such, a cracker who obtains the database still has to run an ordinary brute-force or dictionary attack to recover your password. In this scenario, however, the cracking speed increases by several orders of magnitude (since no network round trips are needed), and none of the limits mentioned above apply.
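To make the "one-way hash / KDF" idea concrete, here is a minimal sketch of how a service might store and check a password using PBKDF2 from Python's standard library. The function names, the 16-byte salt, and the iteration count are assumptions chosen for illustration, not a description of what any particular website does.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Assumed work factor for this sketch; real deployments tune it to their hardware.
ITERATIONS = 200_000

def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest). Only these two values are stored, never the password."""
    if salt is None:
        salt = os.urandom(16)  # a fresh random salt for every account
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the digest from the login attempt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)

salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))  # prints True
print(verify_password("wrong guess", salt, stored))                   # prints False
```

Someone who steals `salt` and `stored` cannot invert them; the only option is to guess candidate passwords and run each guess through the same function, which is exactly the offline attack described above.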
How large does the entropy of a password have to be to withstand such an attack? Consider a server using MD5, a terrible hash function for passwords, mainly because it is so fast. (Normally a deliberately slow KDF, such as PBKDF2, is used to slow crackers down.) Currently, a high-end GPU (GTX 1080) can achieve about 25 GH/s = 2.5 * 10^10 hashes / second. In other words, a random 8-character password consisting of lowercase letters and digits (36^8, about 2.8 * 10^12 possibilities, amounting to about 41 bits of entropy) can be cracked in under a minute on average, and the cracker may well use faster equipment. Advances in computing speed are another factor to consider.
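The estimate can be reproduced with a short calculation. This sketch assumes the 2.5 * 10^10 hashes / second figure quoted above, a single GPU, and that on average half of the keyspace has to be searched; the function and constant names are made up for the example.

```python
HASHES_PER_SECOND = 2.5e10             # MD5 rate quoted in the text for one GTX 1080
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def average_crack_seconds(keyspace: float, rate: float = HASHES_PER_SECOND) -> float:
    """Expected brute-force time: on average half the keyspace is tried."""
    return keyspace / 2 / rate

# 8 random characters from lowercase letters and digits: 36^8 candidates.
print(f"{average_crack_seconds(36 ** 8):.0f} s")  # prints "56 s"

# The same attack against a password with 70 bits of entropy.
years = average_crack_seconds(2 ** 70) / SECONDS_PER_YEAR
print(f"{years:.0f} years")
```

Under these assumptions the 70-bit case comes out at several centuries for one GPU. An attacker with many GPUs divides that figure accordingly, while a slow KDF multiplies it.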
Thus, it is completely reasonable to use a password with over 70 bits of entropy. Of course, this depends on the sensitivity and importance of the account, and on your personal security requirements.
Password Schemes
Here are some common and secure ways to generate passwords:
Completely Random
Using a program like KeePassX to generate a password randomly achieves the highest security / length ratio. A password consisting of lowercase and uppercase letters, digits, punctuation, and spaces needs only 13 characters to reach 80 bits of entropy. However, many people find such passwords hard to remember, and some end up writing the password down and sticking it to their screen, which greatly reduces security.
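For illustration, a completely random password of this kind can also be generated with Python's `secrets` module, which draws from the operating system's cryptographically secure random source. The alphabet below (95 printable ASCII characters including the space) is an assumption that matches the description above; it is not necessarily the character set KeePassX uses.

```python
import math
import secrets
import string

# Letters, digits, punctuation, and the space character: 95 symbols in total.
ALPHABET = string.ascii_letters + string.digits + string.punctuation + " "

def random_password(length: int = 13) -> str:
    """Pick each character independently and uniformly from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

print(random_password())
print(f"{13 * math.log2(len(ALPHABET)):.1f} bits")  # prints "85.4 bits"
```

Twelve characters from this alphabet give just under 80 bits, which is why 13 are needed.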
Diceware (aka XKCD method)
This method appeared in this XKCD strip. Basically, you randomly choose a few English words and string them together. At first, this method may seem insecure and prone to dictionary attacks, but if you randomly choose 6 words out of 8000 (with repetition allowed), there are 8000^6 ~= 2.6 * 10^23 possible combinations, amounting to about 78 bits of entropy. This is close to the 13-character random string mentioned above, while being easier to remember (for example, by making up a story). However, since human choices are far from random, it is important that the words really are chosen at random. The official Diceware page provides a wordlist containing 6^5 = 7776 words, each of which corresponds to a sequence of five digits from 1 to 6. To use it, roll a die 5 times for each word you want, and look up the result in the list to get the corresponding word.
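The procedure can be simulated in code when no physical dice are at hand. The sketch below rolls five virtual dice per word using the `secrets` module and looks the result up in a wordlist. The wordlist here is a placeholder with made-up entries such as `word11111`, built only so that the example runs on its own; in practice you would load the real Diceware list instead.

```python
import itertools
import secrets
from typing import Mapping

def roll_key(dice: int = 5) -> str:
    """Simulate rolling a six-sided die `dice` times, e.g. '16342'."""
    return "".join(str(secrets.randbelow(6) + 1) for _ in range(dice))

def diceware(wordlist: Mapping[str, str], words: int = 6) -> str:
    """Pick `words` entries from the wordlist, five dice rolls per word."""
    return " ".join(wordlist[roll_key()] for _ in range(words))

# Placeholder wordlist (hypothetical entries): one fake word per 5-digit key.
PLACEHOLDER_WORDLIST = {
    "".join(key): "word" + "".join(key)
    for key in itertools.product("123456", repeat=5)
}

print(len(PLACEHOLDER_WORDLIST))       # prints 7776
print(diceware(PLACEHOLDER_WORDLIST))
```

Six words from a 7776-word list give 6 * log2(7776), or about 77.5 bits of entropy, slightly below the 8000-word figure above.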
Others
To be frank, any other method that does not generate the password randomly has security issues. As mentioned above, our decisions are often predictable. Some suggest using a long phrase or sentence (like Edward Snowden’s example MargaretThatcheris110%SEXY) or a combination of the initials of the words in such a sentence (like Bruce Schneier’s This little piggy went to market -> tlpWENT2m). Although this is far safer than the passwords most people use, the following problems remain:
- The phrase is very likely tied to personal preferences and the like
- Quality is hard to guarantee (many people will still choose easy-to-guess phrases)
- Initials are not uniformly distributed
- …etc
Why Not Reuse Passwords?
You might think that if this is so much trouble, why not use the same password on every website? However, if one website is attacked (for example, its password database is leaked) and your password is cracked as a result (as mentioned above, offline hash cracking is very fast), the credentials of all your other accounts are compromised as well. If the website stores passwords in plain text, which happens often, they won’t even need to be cracked. In addition, if the website (or a few of its employees) is malicious, it can capture your password directly and try to log in to your other accounts with it. In short, there are countless ways for a password to leak, and if all your accounts share the same password, the risk is greatly increased.
Password Managers
However, a different password for each of many accounts soon becomes difficult to remember. Thus, many security experts recommend using a password manager, which stores all your passwords in one database encrypted with a single master password. This way, only the master password needs to be memorized. The following are a few recommended password managers, all of which are open source and provide both desktop and mobile versions.
This program works by applying irreversible calculations to your name, master password, and the website name, and using the results to generate a password for each website. The advantage of this method is that no synchronization is needed to use your passwords on different devices. To cope with passwords that are generated by the websites themselves, it can also save custom passwords, which, of course, need to be synchronized manually between devices.
This program is what I’m using now. It takes the most traditional approach, an encrypted local database. Its advantage is better security. Though you need to synchronize the database manually to use it on multiple devices, this is still quite convenient with tools like Dropbox or SFTP. Also, the recommended app on Android is Keepass2Android, which has synchronization features built in, saving a lot of trouble.
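The general idea of deriving site passwords from a name, a master password, and a website name can be sketched as follows. This is not the program's actual algorithm. It is a simplified stand-in that uses PBKDF2 and HMAC-SHA256 from Python's standard library, and the function name, iteration count, and output alphabet are all assumptions made for illustration.

```python
import hashlib
import hmac
import string

ALPHABET = string.ascii_letters + string.digits  # assumed output character set
ITERATIONS = 200_000                              # assumed work factor

def site_password(name: str, master_password: str, site: str, length: int = 16) -> str:
    """Deterministically derive a per-site password (illustrative sketch only)."""
    if not 1 <= length <= 32:
        raise ValueError("length must be between 1 and 32")
    # Step 1: a slow KDF turns the master password into a key, salted with the name.
    master_key = hashlib.pbkdf2_hmac(
        "sha256", master_password.encode("utf-8"), name.encode("utf-8"), ITERATIONS
    )
    # Step 2: a keyed hash of the site name gives 32 site-specific bytes.
    seed = hmac.new(master_key, site.encode("utf-8"), hashlib.sha256).digest()
    # Step 3: map bytes to characters. The modulo introduces a slight bias,
    # which is acceptable in a sketch but not in a real design.
    return "".join(ALPHABET[b % len(ALPHABET)] for b in seed[:length])

print(site_password("Alice Example", "master secret", "example.com"))
```

Because every step is deterministic, the same three inputs give the same password on any device, which is why nothing has to be synchronized. A leaked site password does not reveal the master password, since both steps are one-way.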
Misc
Should I change my passwords often?
Many people recommend changing your passwords often. However, it should be noted that this hardly reduces the chance of your passwords being cracked. (See this paper for details.) Basically, unless you learn that your password might have been leaked (e.g. the server’s database was leaked, or you logged in from an untrusted computer), there is no real need to change it often. After all, when asked to change passwords all the time, many users will pick less secure ones to avoid forgetting them or to save themselves the hassle.
Isn’t writing passwords down unsafe?
Another misconception is that you shouldn’t write your passwords down. Simply put, writing them down and sticking them next to your screen (or in other unguarded places) is obviously very bad. On the other hand, keeping them in a wallet that rarely leaves your side is reasonably safe, since we are already used to keeping a close eye on our wallets and the paper inside. The advantage of doing so is that you won’t weaken your passwords out of fear of forgetting them.