One Rule to Rule Them All

Password cracking is a staple of pentesting and, with a few exceptions, dictionary/rule-based attacks are the predominant method of recovering those ever-elusive plaintext values. Cracking rigs have given pentesters and blackhats alike the ability to throw a few graphics cards at some hashes and achieve phenomenal speeds. For example, earlier this year an 8-GPU system broke 500GH/s against NTLM hashes (that’s over 500 billion hashes/second). Be afraid, Windows passwords… be very afraid.

Time is limited during a pentest, and after acquiring a load of hashes you’ll want to crack as many as you can, as quickly as you can. NotSoSecure decided to look at the success rates of different commonly used rules, and @Stealthsploit has been working on deriving a custom rule based on these tests that can better help satisfy his cleartext cravings.

The Target Data:

The hash set used was the Lifeboat data dump. Lifeboat is a Minecraft community, and in January 2016 over 7 million account details were leaked, including unsalted MD5 password hashes. The number of hashes and the weak algorithm made this dump a prime candidate for research. The raw MD5 hashes were extracted from the file and, after de-duplication, hashcat reported a little over 4.3 million unique hashes.
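As a rough illustration of the extraction and de-duplication step, here is a minimal sketch. The layout of the dump isn't described in this post, so the sample lines and the helper name `unique_md5_hashes` are assumptions; the only thing relied on is that an MD5 digest is 32 hexadecimal characters.

```python
import re

# Assumption: each unsalted MD5 digest appears as a standalone run of 32 hex characters.
MD5_PATTERN = re.compile(r"\b[0-9a-fA-F]{32}\b")


def unique_md5_hashes(lines):
    """Return the distinct MD5-looking tokens found in an iterable of text lines."""
    seen = set()
    for line in lines:
        for match in MD5_PATTERN.findall(line):
            seen.add(match.lower())  # normalise case so duplicates collapse
    return seen


# Hypothetical sample records; the real dump format may differ.
sample_dump = [
    "alice:5f4dcc3b5aa765d61d8327deb882cf99",
    "bob:5F4DCC3B5AA765D61D8327DEB882CF99",
    "carol:e10adc3949ba59abbe56e057f20f883e",
]
hashes = unique_md5_hashes(sample_dump)
print(len(hashes))
```

In practice hashcat de-duplicates the hash list itself when loading it, which is where the unique count reported above came from.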

The Dictionary and Rules:

The popular rockyou dictionary was used during testing with each of the following rules:

  • best64
  • unix-ninja-leetspeak
  • InsidePro-HashManager
  • InsidePro-PasswordsPro
  • toggles5
  • T0XICv1
  • rockyou-30000
  • d3ad0ne
  • dive
  • generated2
  • d3adhob0
  • hob064
  • KoreLogicRulesPrependRockYou50000
  • _NSAKEY.v2.dive

All except d3adhob0, hob064, KoreLogicRulesPrependRockYou50000 and _NSAKEY.v2.dive are included with hashcat. A good mixture of rule files should allow us to identify which individual rules are the big hitters from which a custom rule can be created.

Testing:

Each session started with the following command, substituting the rule used and the filenames, respectively:


The potfile was disabled so that hashcat didn’t check it prior to each crack and skew our numbers. Debug mode can only be enabled when using rules and the debug file contains the stats. Every time a rule cracks a hash it’s logged in the file. After hashcat completes, the file can then be sorted to show the number of times a rule was successful, therefore revealing the most successful rules in each set.
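The setup described above can be sketched roughly as follows. This is not the exact command used for the tests: the file names are placeholders, and the choice of debug mode 1 (log only the rule that produced the crack) is an assumption based on the description. The second helper shows one way the resulting debug file could be tallied.

```python
from collections import Counter


def hashcat_args(hash_file, wordlist, rule_file, debug_file):
    """Build an argument list for one test run (a guess at the setup, not the original command)."""
    return [
        "hashcat",
        "-a", "0",                      # straight (dictionary) attack
        "-m", "0",                      # hash mode 0 = MD5
        "-w", "3",                      # workload profile 3
        "-r", rule_file,                # rule file under test
        "--potfile-disable",            # do not consult or update the potfile
        "--debug-mode=1",               # assumption: log the rule behind each crack
        "--debug-file=" + debug_file,
        hash_file,
        wordlist,
    ]


def rank_rules(debug_lines):
    """Count how often each logged rule appears, most successful first."""
    counts = Counter(line.rstrip("\r\n") for line in debug_lines if line.strip())
    return counts.most_common()


# Hypothetical debug-file content: one line per cracked hash.
sample_debug = [":\n", "c\n", ":\n", "sa4\n", ":\n", "c\n"]
for rule, hits in rank_rules(sample_debug):
    print(hits, rule)
```

Only the newline is stripped from each line, since trailing spaces can be a meaningful part of a rule.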

The Results:

The results from each test can be found below, showing the generated password candidates from each rule set and the total/percentage cracked.

 

Rule | Total Candidates | Cracked | % Cracked
--- | --- | --- | ---
dive | 1,421,219,827,456 | 2,843,085 | 65.64
_NSAKEY.v2.dive | 1,768,370,620,544 | 2,784,741 | 64.30
generated2 | 933,992,405,632 | 2,606,565 | 60.18
d3ad0ne | 489,063,363,712 | 2,580,399 | 59.58
rockyou-30000 | 430,298,880,000 | 2,557,422 | 59.05
T0XICv1 | 171,129,864,576 | 2,357,989 | 54.44
InsidePro-HashManager | 92,801,125,120 | 2,247,349 | 51.89
InsidePro-PasswordsPro | 44,751,083,520 | 2,056,467 | 47.48
d3adhob0 | 825,313,251,840 | 1,712,581 | 39.54
best64 | 1,104,433,792 | 1,404,449 | 32.43
hob064 | 917,970,944 | 1,195,032 | 27.59
KoreLogicRulesPrependRockYou50000 | 717,078,740,224 | 1,137,852 | 26.27
toggles5 | 70,898,912,128 | 759,344 | 17.53
unix-ninja-leetspeak | 44,048,262,016 | 621,280 | 14.34

Success Rate on Lifeboat

We can also look at the effectiveness of each rule set by comparing success relative to the total candidates tested. For example, we can see that the d3adhob0 rules had the fourth largest candidate size (825 billion), however it cracked only 39.54% of passwords. By comparison the InsidePro-PasswordsPro rule had only 45 billion candidates yet it cracked 47.48% of passwords. The latter rule is clearly more efficient!
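One way to put a number on that comparison is to divide the candidates tested by the hashes cracked, giving an average number of guesses per crack. A minimal sketch using the two rows quoted above (the function name is just for illustration):

```python
def guesses_per_crack(candidates, cracked):
    """Average number of candidates tested for each hash cracked."""
    return candidates / cracked


# Figures taken from the results table for the Lifeboat test.
results = {
    "d3adhob0": (825_313_251_840, 1_712_581),
    "InsidePro-PasswordsPro": (44_751_083_520, 2_056_467),
}
for rule, (candidates, cracked) in results.items():
    print(f"{rule}: {guesses_per_crack(candidates, cracked):,.0f} guesses per crack")
```

A lower figure means fewer wasted guesses, so by this measure InsidePro-PasswordsPro needed roughly a twentieth of the work per crack that d3adhob0 did.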

There are lots of metrics we’re not taking into account here so we’re not saying, “never use d3adhob0, always use InsidePro-PasswordsPro”. This is just an observation from this specific test. Lots of other metrics, like time, algorithm, available resources, potentially known characters (where mask attacks come in) etc need to be considered depending on what you’re trying to achieve. We chose this setup because a large set of hashes using unsalted MD5 provided the best balance for speed/time.

A rule efficiency breakdown can be seen below.

 

Test | Total Candidates | Cracked | Guesses per crack
--- | --- | --- | ---
hob064 | 917,970,944 | 1,195,032 | 768
best64 | 1,104,433,792 | 1,404,449 | 786
InsidePro-PasswordsPro | 44,751,083,520 | 2,056,467 | 21,761
InsidePro-HashManager | 92,801,125,120 | 2,247,349 | 41,294
unix-ninja-leetspeak | 44,048,262,016 | 621,280 | 70,899
T0XICv1 | 171,129,864,576 | 2,357,989 | 72,574
toggles5 | 70,898,912,128 | 759,344 | 93,368
rockyou-30000 | 430,298,880,000 | 2,557,422 | 168,255
d3ad0ne | 489,063,363,712 | 2,580,399 | 189,530
generated2 | 933,992,405,632 | 2,606,565 | 358,323
d3adhob0 | 825,313,251,840 | 1,712,581 | 481,912
dive | 1,421,219,827,456 | 2,843,085 | 499,887
KoreLogicRulesPrependRockYou50000 | 717,078,740,224 | 1,137,852 | 630,204
_NSAKEY.v2.dive | 1,768,370,620,544 | 2,784,741 | 635,022

Efficiency on Lifeboat

So even though the dive rule cracked the most hashes, it was among the least efficient in terms of average guesses per crack, needing roughly 500,000; only KoreLogicRulesPrependRockYou50000 and _NSAKEY.v2.dive fared worse. This isn’t necessarily an issue if you have the luxury of time, but the run would take substantially longer if these hashes were SHA-256, for example. Also, the most efficient rule, hob064, cracked a similar number of hashes to the KoreLogicRulesPrependRockYou50000 rule, but the latter needed nearly 630,000 more guesses for each crack.

The resulting debug files were sorted, two examples of which can be seen below.

This is only a snippet, as one of the rule sets contains over 100,000 rules and others contain several tens of thousands. A couple of quickly identified password trends in the above example show that the Minecraft community love to replace ‘a’ with ‘4’ (the sa4 rule), as well as capitalise the first letter and lowercase the rest (the c rule)! A complete list of hashcat rule functions can be found on their website.
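To make those two rules concrete, here is a small Python sketch of what they do to a dictionary word. It only mimics the behaviour of the `c` and `sXY` rule functions; the helper names and sample words are made up for illustration.

```python
def rule_c(word):
    """Mimic hashcat's 'c' rule: capitalise the first character, lowercase the rest."""
    return word[:1].upper() + word[1:].lower()


def rule_s(word, old, new):
    """Mimic hashcat's 'sXY' rule: replace every X with Y (sa4 is old='a', new='4')."""
    return word.replace(old, new)


print(rule_c("minecraft"))            # Minecraft
print(rule_s("minecraft", "a", "4"))  # minecr4ft
```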

Concurrency Anomalies:

It became apparent after running one of the tests twice (in this case the best64 rule set), that the resulting stats were slightly different. A section of the stats from both runs is shown below.


There were 7 more plain rockyou hits in the first test than in the second. The other rules also reported slightly different numbers both here and in other rule sets.

This is likely due to multi-threading / high concurrency, which meant that a different rule could produce the same plaintext value before the “:” rule (which passes the word through unchanged) hit, especially seeing as we’re running a -w3 profile! For example, let’s say the password was “L3tme1n” and the dictionary contains both “L3tme1n” and “l3tme1n”. If the “T0” rule (toggles the case of the first character) applied to “l3tme1n” hits before the “:” rule applied to “L3tme1n”, then “T0” gets the point, effectively stealing it from “:”. In each test the differences between runs were noted to be relatively consistent.
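The effect can be imitated with a toy model. The sketch below is not how hashcat schedules work internally; it simply assumes that whichever (word, rule) pair is processed first gets the credit, and shows that the credited rule changes with the processing order. Only three rules are implemented.

```python
def apply_rule(rule, word):
    """Apply one of three hashcat rules; anything else is outside this sketch."""
    if rule == ":":                      # no-op, word passes through unchanged
        return word
    if rule == "c":                      # capitalise first character, lowercase the rest
        return word[:1].upper() + word[1:].lower()
    if rule == "T0":                     # toggle the case of the first character
        return word[:1].swapcase() + word[1:]
    raise ValueError(f"rule not covered by this sketch: {rule}")


def credited_rule(target, work_order):
    """Return the rule of the first (word, rule) pair in work_order that yields target."""
    for word, rule in work_order:
        if apply_rule(rule, word) == target:
            return rule
    return None


target = "L3tme1n"
# Two hypothetical processing orders for the same work items.
run_one = [("L3tme1n", ":"), ("l3tme1n", "T0")]
run_two = [("l3tme1n", "T0"), ("L3tme1n", ":")]
print(credited_rule(target, run_one))  # :
print(credited_rule(target, run_two))  # T0
```

The total number of cracked hashes is the same in both runs; only the attribution of the crack moves between rules.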

One Rule to Rule Them All…:

From here we selected the top 25% performing rules from each set, then de-duped, concatenated and tidied them, leaving us with a custom super-rule set containing 51,998 rules. Time to put it through its paces.
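The selection step could be sketched along these lines. The exact procedure isn't spelled out above, so this assumes "top 25%" means the quarter of each set's rules with the most hits in its debug file; the function names and the tiny hit counts are invented for illustration.

```python
import math
from collections import Counter


def top_fraction(hit_counts, fraction=0.25):
    """Return the best-performing fraction of rules, most hits first."""
    ranked = [rule for rule, _ in Counter(hit_counts).most_common()]
    keep = math.ceil(len(ranked) * fraction)
    return ranked[:keep]


def merge_rule_sets(per_set_hits, fraction=0.25):
    """Concatenate the top rules of every set, dropping duplicates."""
    merged = {}
    for hit_counts in per_set_hits.values():
        for rule in top_fraction(hit_counts, fraction):
            merged.setdefault(rule, None)  # dict keeps first-seen order
    return list(merged)


# Hypothetical per-rule hit counts for two rule sets.
per_set_hits = {
    "set_a": {":": 50, "c": 40, "sa4": 30, "$1": 20, "T0": 10, "u": 5, "l": 3, "r": 1},
    "set_b": {"$1": 60, "c": 25, ":": 9, "$2": 2, "d": 2, "f": 1, "{": 1, "}": 1},
}
print(merge_rule_sets(per_set_hits))
```

With these made-up numbers the merged list is `:`, `c` and `$1`, because `c` appears in the top quarter of both sets and is kept only once.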

Rule | Total Candidates | Cracked | % Cracked
--- | --- | --- | ---
OneRuleToRuleThemAll | 745,808,362,112 | 2,960,711 | 68.36
dive | 1,421,219,827,456 | 2,843,085 | 65.64
_NSAKEY.v2.dive | 1,768,370,620,544 | 2,784,741 | 64.30
generated2 | 933,992,405,632 | 2,606,565 | 60.18

Success Rate on Lifeboat

 

Test | Total Candidates | Cracked | Guesses per crack
--- | --- | --- | ---
[…] | […] | […] | […]
OneRuleToRuleThemAll | 745,808,362,112 | 2,960,711 | 251,902
generated2 | 933,992,405,632 | 2,606,565 | 358,323
d3adhob0 | 825,313,251,840 | 1,712,581 | 481,912
dive | 1,421,219,827,456 | 2,843,085 | 499,887
KoreLogicRulesPrependRockYou50000 | 717,078,740,224 | 1,137,852 | 630,204
_NSAKEY.v2.dive | 1,768,370,620,544 | 2,784,741 | 635,022

Efficiency on Lifeboat

Although it was not the most efficient rule in our tests (due to its large number of candidates), the custom rule cracked 117,626 more passwords than dive did, a gain of 2.72 percentage points. It was also substantially more efficient than dive, which took second place in success rates.

Our rule was also tested against a couple of other data breaches that were published online to see how it performed against different data sets, again comparing against the dive rule.

#Test 1
XSplit breach, November 2013, 2,983,472 accounts, 2,227,270 unique hashes. Unsalted SHA-1.

Rule | Total Candidates | Cracked | % Cracked
--- | --- | --- | ---
OneRuleToRuleThemAll | 745,808,362,112 | 1,455,682 | 65.36
dive | 1,421,219,827,456 | 1,402,636 | 62.98

Success Rate on XSplit

Our custom rule cracked 53,046 more hashes, a gain of 2.38 percentage points.

#Test 2

Battlefield Heroes breach, June 2011. 548,773 accounts, 423,623 unique hashes. Unsalted MD5.

 

Rule | Total Candidates | Cracked | % Cracked
--- | --- | --- | ---
OneRuleToRuleThemAll | 745,808,362,112 | 318,958 | 75.29
dive | 1,421,219,827,456 | 314,150 | 74.16

Success Rate on Battlefield Heroes

Our custom rule cracked 4,808 more hashes, a gain of 1.13 percentage points.
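For anyone wanting to check the arithmetic for the two extra breaches, the percentages follow directly from the cracked counts and the unique hash totals quoted above. A short sketch (the dictionary layout and function name are just for illustration):

```python
def percent_cracked(cracked, unique_hashes):
    """Share of the unique hashes that were cracked, as a percentage."""
    return 100 * cracked / unique_hashes


# Figures taken from the two tests above.
breaches = {
    "XSplit": {"unique": 2_227_270, "OneRuleToRuleThemAll": 1_455_682, "dive": 1_402_636},
    "Battlefield Heroes": {"unique": 423_623, "OneRuleToRuleThemAll": 318_958, "dive": 314_150},
}
for name, data in breaches.items():
    ours = percent_cracked(data["OneRuleToRuleThemAll"], data["unique"])
    dive = percent_cracked(data["dive"], data["unique"])
    extra = data["OneRuleToRuleThemAll"] - data["dive"]
    print(f"{name}: {ours:.2f}% vs {dive:.2f}% (+{ours - dive:.2f} points, {extra:,} more hashes)")
```

Note that the gap is a difference in percentage points of the whole hash set, not a relative improvement over dive.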

Wrap Up:
Our super rule came out on top in all our tests above, as well as in others we looked at afterwards. We’re sorry to disappoint any Lord of the Rings fans (“One ring to rule them all!”), but despite our rule name, there likely won’t ever be one rule to rule them all, as other rule-based attacks wouldn’t exist if there were. Password attacks should always be planned with all the variables in mind, in particular the available time, hardware resources, dictionary size and algorithm.

What these tests have shown, however, is that by creating your own custom rules (which we highly encourage), you can grab many more of those plaintext secrets that you may not have seen if you just ran a standard dictionary attack or one with a single built-in rule. We’re certainly looking forward to using our super rule against many pentesting hash dumps in the future!

The custom rule we used is available over on GitHub.