Weak passwords are one of the easiest ways for attackers to get into systems and networks. All internet-exposed servers, website login forms and email accounts are constantly probed by automated scripts trying to “guess” passwords. If any of your passwords are easy to guess, or appear on the “most common” lists, your organization is likely to suffer a breach sooner rather than later.

Attack types

From a network defense and incident response perspective, the goal is to understand which credentials are routinely tried when breaking into accounts (so we can avoid using them). The attacks fall into two types:

  • Non-targeted attacks: These use widely available lists of the most common passwords. Many such lists exist, with the best source being SecLists (https://github.com/danielmiessler/SecLists) by Daniel Miessler and Jason Haddix.
  • Targeted attacks: These use lists of already compromised credentials for a particular organization. This is a problem in large companies, where staff members may use their corporate email and its associated password to create third-party accounts (Instagram, Zalopay, etc.). If Instagram or Zalopay gets hacked and the passwords fall into the wrong hands, attackers will attempt to reuse them on corporate services.

These two types are related: the lists of most common passwords are generated by aggregating hundreds of database leaks. However, looking at the current top 15 most commonly used passwords (at the time of writing, May 2018) we have: [111111, 1234, 12345, 123456, 1234567, 12345678, abc123, dragon, iloveyou, letmein, monkey, password, qwerty, tequiero, test]. At first sight, this list is very US-centric (all words are English or Spanish). So at DFIR VN labs, we asked ourselves the question:

“What are the most common passwords used by Vietnamese speakers?”

 

Our expectation was that the list would contain Vietnamese words or be based around numbers, as users find Vietnamese special characters and tone marks difficult to type in password fields (where autocorrect cannot be used).

Vietnamese leaked credentials

Where to start? Leaked credentials on the .vn TLD are rare. The most common domain name seen in historic leaked credentials is “yahoo.com.vn”, and those leaks are under 0.04% of all accounts leaked worldwide (barely 1.2 million accounts out of the 3.5 billion leaked credentials in the DFIR VN lab databases). Other common domains are zing.vn (with 70,000 leaks) and gmail.com.vn (with about 20,000). This is to be expected, as most users have email addresses that are not on the .vn TLD (gmail.com, etc.). It would, of course, prevent any analysis, as we have no way of knowing whether those users are Vietnamese or not (without additional information).

But this changed at the end of April 2018 (barely 5 days before this article was written).

As reported in the local news, a large database with 163 million credentials was leaked from VNG Corporation (https://tuoitrenews.vn/news/business/20180428/vietnams-tech-giant-vng-apologizes-after-alleged-data-breach-affecting-163mn-accounts/45340.html). This is quite a large breach, and the corpus (34GB in total, now publicly available on torrent sites and database leak forums) is big enough to perform some statistical analysis.

Not all accounts in the set contained email addresses; the majority only had usernames. Only 25 million distinct emails were in the database. For individual users, if you want to check whether your account is within that 25 million subset, you should check the excellent site “Have I Been Pwned” run by Troy Hunt, who added the VNG dataset barely two days after the leak was published.

A sample of the lines retrieved (with heavy redactions to mask confidential information) is below:

84988484,[redacted],504865,25F9E794323B453885F5181F1B624D0B,,[redacted],13,quehuong,1,789456123,1955-08-24 00:00:00.0,63,[redacted]6789,ha nam,,18,,2010-05-28 10:20:58.813,21,2010-05-28 10:16:35.86,,98560,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
84988485,[redacted],768,43B405A57C86F22A06ABD75824B841E5,,,,,,,,,,,,,,,,,2010-05-28 09:36:05.893,3,2010-05-28 09:36:05.893,,768,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
84988511,[redacted],1570337,E36A2F90240E9E84483504FD4A704452,06DC67758E6BD6F8B089AEE4A915441E,[redacted],16,d9b8e8f09aea13fab32b8b75dce76192,1,b[redacted]
84988670,[redacted],768,E807F1FCF82D132F9BB018CA6738A19F,,,,,,,,,,,,,,,,,2010-05-28 08:23:50.237,50,2010-05-28 08:23:50.237,,768,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
84988734,[redacted],390689,B461C2E02715AFCE5A69C861D468A285,5D5F9EAE00052AE36A35DEE977902BB8,[redacted],13,00c0498b0603bc0b3e72446ae5f41ec9,1,[redacted]1992-10-29 00:00:00.0,,ha noi,,43,,2011-12-09 18:09:22.907,56,2010-05-28 08:14:46.69,,768,0,2,,,,,,,,2011-12-09 18:08:50.357,2011-12-09 18:08:50.357,,,,,,,,,,,,,,,,,,,,
84988778,[redacted],768,D6CD7880933606CAB470D822596E20DC,,,,,,,,,,,,,,,,,2010-05-28 15:20:54.233,5,2010-05-28 15:20:54.233,,768,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
84988863,[redacted],768,25F9E794323B453885F5181F1B624D0B,,,,,,,,,,,,,,,,,2010-05-28 05:59:28.317,16,2010-05-28 05:59:28.317,,768,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
84988885,[redacted],1555491,30047333851ED60047066A5FD566C11A,,[redacted],10,205d183506fb596a2cebc399d72a3d28,,012782037,,,[redacted],00:00:00.0,01262262201,[redacted],,43,,2012-07-04 12:27:20.81,38,2010-05-28 06:55:08.05,,98560,1,8,,,,,,,,2011-12-22 10:40:10.92,2011-07-18 17:16:12.103,2011-12-22 10:52:53.21,,,,,,,,,,,,,,,,,,,

Marked in bold are the passwords as encoded in the VNG database. Unfortunately for VNG users, the passwords were stored using the MD5 algorithm. This algorithm is quite old (designed by Ron Rivest in 1991) and has the advantage of being very fast. The operation to verify that the string “password” corresponds to the MD5 hash “5f4dcc3b5aa765d61d8327deb882cf99” takes a minuscule amount of time on modern hardware. Optimized programs running on graphics cards (CUDA, OpenCL) can easily calculate about 1 billion hashes per second.
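To make the verification step concrete, here is a minimal Python sketch using only the standard library. The function name `md5_hex` is ours, not something taken from the leak or from any cracking tool; the expected hash is the one quoted above.

```python
import hashlib


def md5_hex(password: str) -> str:
    """Return the lowercase hexadecimal MD5 digest of a password string."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


# Hash quoted in the article for the string "password".
EXPECTED = "5f4dcc3b5aa765d61d8327deb882cf99"

matches = md5_hex("password") == EXPECTED
print("match" if matches else "no match")
```

A dictionary attack is simply this comparison repeated for every candidate word against every leaked hash, which is why the raw speed of the hash function matters so much.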

However, this blazingly fast speed is NOT a good thing when storing passwords safely. While the service authenticating users will run slightly faster (and will probably save a few seconds of CPU time per day), if the database is leaked, attackers can bruteforce the list with dictionaries at very high speeds. A better choice of password hashing algorithm would be Argon2 (the winner of the 2015 Password Hashing Competition), PBKDF2 or bcrypt (which have more widespread library support). These algorithms are purposely slow, to prevent attackers from successfully recovering the original passwords.
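As an illustration of a deliberately slow scheme, the sketch below uses PBKDF2 from the Python standard library with a random per-user salt. The iteration count, salt size and function names are illustrative assumptions, not a recommendation from this article; Argon2 and bcrypt need third-party libraries and are not shown.

```python
import hashlib
import hmac
import os

# Assumption: an illustrative work factor. Tune it to your own hardware.
ITERATIONS = 200_000


def hash_password(password: str, salt: bytes = None):
    """Derive a slow, salted hash. Returns (salt, digest)."""
    if salt is None:
        salt = os.urandom(16)  # unique random salt per account
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


salt, stored = hash_password("anhyeuem")
print(verify_password("anhyeuem", salt, stored))
print(verify_password("123456", salt, stored))
```

Because every guess now costs hundreds of thousands of hash iterations, and the salt makes each account's hash unique, an attacker can no longer test one candidate against millions of accounts at once.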

Dehashing the VNG database

A quick analysis of the VNG database showed that password reuse was very common. The 163 million accounts only used about 33.8 million distinct passwords. These hashes were extracted into a separate list.
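The extraction of distinct hashes can be sketched as follows. This is not the tooling we used, just a minimal illustration; it assumes comma-separated records with the password hash in the fourth field, as in the (truncated) sample rows shown earlier.

```python
import csv
import io
from collections import Counter

# Assumption: zero-based index of the hash column in each record.
HASH_FIELD = 3


def count_hashes(lines):
    """Count how many accounts share each password hash."""
    counts = Counter()
    for row in csv.reader(lines):
        if len(row) > HASH_FIELD and row[HASH_FIELD]:
            counts[row[HASH_FIELD].lower()] += 1
    return counts


# Three shortened records based on the redacted sample above.
sample = io.StringIO(
    "84988484,[redacted],504865,25F9E794323B453885F5181F1B624D0B\n"
    "84988485,[redacted],768,43B405A57C86F22A06ABD75824B841E5\n"
    "84988863,[redacted],768,25F9E794323B453885F5181F1B624D0B\n"
)

counts = count_hashes(sample)
print(len(counts), "distinct hashes")
for digest, n in counts.most_common():
    print(digest, n)
```

Writing out the keys of `counts` gives the list of distinct hashes to feed to a cracking tool, while the counts themselves are what later turn cracked passwords into frequency rankings.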

Once the hashes had been extracted from the database, we needed to bruteforce them. The most efficient way is to use hashcat on a system with a suitable graphics card. For this exercise we limited ourselves to 24 hours’ worth of time on an Amazon Web Services p2.xlarge instance, which runs a single Nvidia K80 GPU. The cost of the exercise was $0.30 per hour, so approximately 175,000 VND in total for the day.

During those 24 hours we ran several cracking sessions with different settings and dictionaries. The average cracking speed was 250 million MD5 hashes per second.

Session..........: hashcat
Status...........: Exhausted
Hash.Type........: MD5
Hash.Target......: ../VNG_md5.txt
Time.Started.....: Mon Apr 30 17:09:20 2018 (13 hours, 21 mins)
Time.Estimated...: Tue May 1 06:30:34 2018 (0 secs)
Guess.Base.......: File (../wordlist/passwordlist.txt)
Guess.Mod........: Rules (rules/OneRuleToRuleThemAll.rule)
Guess.Queue......: 1/1 (100.00%)
Speed.Dev.#1.....: 227.5 MH/s (1.08ms) @ Accel:32 Loops:16 Thr:512 Vec:1
Recovered........: 16589604/33808350 (49.07%) Digests, 0/1 (0.00%) Salts
Recovered/Time...: CUR:1813,112731,N/A AVG:15002,900130,21603142 (Min,Hour,Day)
Progress.........: 12382037305000/12382037305000 (100.00%)
Rejected.........: 0/12382037305000 (0.00%)
Restore.Point....: 238139000/238139000 (100.00%)
Candidates.#1....: $HEX[7a756b616969613537] -> $HEX[f67a6c656d2a]
HWMon.Dev.#1.....: Temp: 67c Util: 66% Core: 875MHz Mem:2505MHz Bus:16

Started: Mon Apr 30 16:55:24 2018
Stopped: Tue May 1 06:30:36 2018
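The figures in this status screen can be cross-checked with a little arithmetic. The sketch below assumes that the final Restore.Point equals the number of words in the base wordlist and that Progress counts every word and rule combination; the variable names are ours.

```python
# Values copied from the hashcat status output above.
words_in_list = 238_139_000            # Restore.Point at 100%
candidates_tested = 12_382_037_305_000  # Progress at 100%
elapsed_seconds = 13 * 3600 + 21 * 60   # "13 hours, 21 mins"

# Each word is mangled by every rule, so candidates = words x rules.
rules_applied = candidates_tested // words_in_list
print(rules_applied)  # 51995

# Average throughput over the whole session, in million hashes per second.
average_mhs = candidates_tested / elapsed_seconds / 1_000_000
print(round(average_mhs, 1))
```

The result lands close to the 250 million hashes per second average mentioned above, slightly higher than the instantaneous speed shown at the moment the status screen was captured.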

The end result was that 22.3 million distinct passwords were cracked (65.9% of all distinct passwords). Given that the most common passwords are typically the easiest to guess, the cracked list corresponded to 131.9 million credentials on the full list (81% of all accounts).

So, what are the 100 most common passwords in this Vietnamese-centric leak?

Top 100 passwords Vietnam

Rank  Password      Frequency (percentage)
1     123456        12.24%
2     123456789     3.15%
3     123123        1.65%
4     111111        1.02%
5     anhyeuem      0.57%
6     1234567       0.51%
7     0123456789    0.43%
8     0123456       0.33%
9     12345678      0.32%
10    000000        0.26%
11    asdasd        0.24%
12    25251325      0.23%
13    1234567890    0.23%
14    121212        0.16%
15    123321        0.16%
16    zxcvbnm       0.15%
17    qweqwe        0.12%
18    456789        0.12%
19    112233        0.12%
20    aaaaaa        0.12%
21    123123123     0.11%
22    987654321     0.10%
23    11111111      0.10%
24    qwerty        0.10%
25    147258369     0.10%
26    maiyeuem      0.09%
27    123qwe        0.09%
28    654321        0.09%
29    iloveyou      0.09%
30    123654        0.08%
31    999999        0.08%
32    qqqqqq        0.08%
33    1111111       0.07%
34    147258        0.07%
35    hota407       0.07%
36    anhtuan       0.06%
37    222222        0.06%
38    159753        0.06%
39    11223344      0.05%
40    anhnhoem      0.05%
41    anh123        0.05%
42    159357        0.05%
43    qwertyuiop    0.05%
44    asd123        0.05%
45    0987654321    0.05%
46    emyeuanh      0.05%
47    mmmmmm        0.05%
48    12345         0.04%
49    666666        0.04%
50    anhanh        0.04%
51    123789        0.04%
52    phuong        0.04%
53    111222        0.04%
54    qweasd        0.04%
55    hanoiyeudau   0.04%
56    nguyen        0.04%
57    789456        0.04%
58    1111111111    0.04%
59    mylove        0.04%
60    789456123     0.04%
61    19001560      0.04%
62    qwe123        0.04%
63    asdfghjkl     0.04%
64    pppppp        0.04%
65    anhhung       0.04%
66    1234560       0.03%
67    abc123        0.03%
68    maiyeu        0.03%
69    123456a       0.03%
70    zzzzzz        0.03%
71    quangninh     0.03%
72    987654        0.03%
73    555555        0.03%
74    tuananh       0.03%
75    asasas        0.03%
76    asdfgh        0.03%
77    zxcvbn        0.03%
78    321321        0.03%
79    tinhyeu       0.03%
80    147852369     0.03%
81    456123        0.03%
82    matkhau       0.03%
83    147852        0.03%
84    12345678910   0.03%
85    thienthan     0.03%
86    anhyeu        0.03%
87    111111111     0.03%
88    toilatoi      0.03%
89    10cham0       0.03%
90    0147258369    0.03%
91    456456        0.03%
92    khongbiet     0.03%
93    789789        0.03%
94    a123456       0.03%
95    333333        0.03%
96    888888        0.03%
97    123654789     0.03%
98    truong        0.03%
99    maimaiyeuem   0.03%
100   hhhhhh        0.03%

We can see the following patterns:

  • Numeric passwords (52 out of 100): typical, as tone marks have historically not been used in passwords due to character set incompatibilities.
  • ASCII patterns (16 out of 100): nonsensical patterns, either keyboard row walks or letter repetition.
  • Vietnamese phrases (21 out of 100):
    • anhyeuem at #5
    • maiyeuem at #26
    • anhtuan at #36
    • anhnhoem at #40
    • emyeuanh at #46
  • Only 2 English phrases (mylove and iloveyou) appear on the top 100 list.
  • The top 100 most common passwords cover 42.3 million credentials; the top 1000 cover 53.6 million credentials. In a logarithmic graph we can see that the 50% threshold is crossed with a 50,000 password list.
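The kind of tally behind these observations can be sketched in a few lines of Python. The example below only uses the first ten rows of the table above; the function names are ours, and the numeric test is a simple digits-only check rather than the exact classification used for the full list.

```python
# First ten rows of the top 100 table: (password, percentage of accounts).
TOP10 = [
    ("123456", 12.24), ("123456789", 3.15), ("123123", 1.65),
    ("111111", 1.02), ("anhyeuem", 0.57), ("1234567", 0.51),
    ("0123456789", 0.43), ("0123456", 0.33), ("12345678", 0.32),
    ("000000", 0.26),
]


def count_numeric(entries):
    """Number of passwords made up of digits only."""
    return sum(1 for password, _ in entries if password.isdigit())


def cumulative_coverage(entries):
    """Running total of the share of accounts covered by the top N entries."""
    total, running = 0.0, []
    for _, percentage in entries:
        total += percentage
        running.append(round(total, 2))
    return running


print(count_numeric(TOP10), "of", len(TOP10), "are numeric")
print(cumulative_coverage(TOP10))
```

Running the same functions over a longer list shows how quickly coverage flattens out: the first handful of passwords account for most of the accounts reached, and each additional entry adds very little.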

 

The full set of generated lists is available on our Github repository.

Recommendations

The heavy use of numeric passwords in Vietnam makes guessing passwords very easy. Attempting to log in with the top 500 passwords would break into approximately 33% of accounts (1 in 3).

This is a marked cultural difference with English-based password lists, where numeric passwords are used at a much lower rate (10-20%). For example, the 2017 English top 100 passwords list by SplashData only has 14 numeric passwords. Additionally, bruteforcing numeric-only passwords is far less complex, given that there are only 10 characters [0-9] to build a password from, rather than the 62 of the full upper/lowercase alphanumeric set [0-9a-zA-Z]. At the roughly 1 billion MD5 hashes per second mentioned earlier, one could bruteforce every numeric password of up to 12 digits in under 20 minutes on a single graphics card; 16 digits would still take months.
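The effect of the smaller character set can be made concrete with a rough calculation. The sketch below assumes the figure of about 1 billion MD5 hashes per second quoted earlier for a graphics card; the function names are ours and the results are order-of-magnitude estimates only.

```python
# Assumption: GPU speed for MD5 quoted earlier in the article.
HASHES_PER_SECOND = 1_000_000_000


def keyspace(alphabet_size: int, max_length: int) -> int:
    """Number of possible passwords of length 1 up to max_length."""
    return sum(alphabet_size ** n for n in range(1, max_length + 1))


def seconds_to_exhaust(alphabet_size: int, max_length: int,
                       rate: int = HASHES_PER_SECOND) -> float:
    """Worst-case time to try every candidate at the given hash rate."""
    return keyspace(alphabet_size, max_length) / rate


for label, alphabet, length in [
    ("digits, up to 8", 10, 8),
    ("digits, up to 12", 10, 12),
    ("alphanumeric, up to 8", 62, 8),
]:
    print(label, round(seconds_to_exhaust(alphabet, length), 1), "seconds")
```

Each extra digit multiplies the work by 10, whereas each extra alphanumeric character multiplies it by 62, so a numeric password needs to be much longer to offer the same resistance.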

So, what could Vietnamese internet users do to consistently avoid bad passwords?

For individuals:

  • Use a password manager (even on mobile devices). At DFIR VN we recommend KeePass for individuals and 1Password for teams. Both support mobile devices. These will generate random, complex, virtually uncrackable passwords for any service.
  • Use two-factor authentication (SMS, numeric codes using Google Authenticator or a physical key like a Yubikey).
  • Keep track of data breaches by subscribing to services like “Have I Been Pwned” run by Troy Hunt.

For companies:

  • Create a written password policy for the company. A good start is the latest NIST recommendation, SP 800-63-3, which drops mandatory periodic password changes and complexity rules, at the cost of requiring password audits and an 8-character minimum length.
  • Perform password audits using common lists like the ones shared in this article. Attempt to break into those accounts yourself before malicious actors do, and trigger automatic password changes if any of those passwords are in use. “Hack yourself first”.
  • Make password bruteforcing difficult (use rate-limiting mechanisms) on exposed services. Block further attempts for a period of time after 10 failed logins, block logins from non-standard IP addresses, etc.
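A password audit of the kind described above can start with a check as simple as the one below, applied whenever a password is set or changed. It is a minimal sketch: the blocklist here is a tiny hypothetical excerpt of the top 100 table, and the function name and messages are ours. In practice the full lists would be loaded from a file.

```python
MIN_LENGTH = 8  # minimum length mentioned in the policy bullet above

# Assumption: a small excerpt of the top 100 list, for illustration only.
COMMON_PASSWORDS = {
    "123456", "123456789", "123123", "111111", "anhyeuem",
    "12345678", "maiyeuem", "iloveyou", "matkhau", "qwertyuiop",
}


def check_password(candidate: str, blocklist=COMMON_PASSWORDS) -> list:
    """Return a list of policy problems; an empty list means acceptable."""
    problems = []
    if len(candidate) < MIN_LENGTH:
        problems.append("shorter than %d characters" % MIN_LENGTH)
    if candidate.lower() in blocklist:
        problems.append("appears on a common-password list")
    return problems


for candidate in ["123456", "anhyeuem", "correct horse battery"]:
    print(candidate, "->", check_password(candidate) or "ok")
```

Note that a password such as anhyeuem passes the length rule and is only caught by the blocklist, which is why length requirements alone are not enough.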

For application developers:

 

In short, password-based security in Vietnam appears to be a lot weaker than initially thought. Vietnamese users (companies, developers and individuals) should take some of the easy measures outlined here in order to prevent further problems.
