Twitter Words of Interest
When attempting to crack passwords custom word lists are very useful additions to standard dictionaries. An interesting idea originally released on the "7 Habits of Highly Effective Hackers" blog was to use Twitter to help generate those lists based on searches for keywords related to the list that is being cracked. I've expanded this idea into twofi which will take multiple search terms and return a word list sorted by most common first.
A second option, suggested by @pentest4dummies, was to look at what specific users have been saying and use their own tweets to build up the list so I've added that as well. Given a list of twitter usernames the script will bring back approximately the last 500 tweets for each user and use those to create the list.
Download
Installation
The only ruby gem that probably isn't installed by default is the json one, to install this run:
gem install json
Then you can run twofi by either using ruby
ruby twofi.rb
or making it executable then running it directly
chmod a+x twofi.rb
./twofi.rb
Usage
Usage: twofi [OPTIONS]
--help, -h: show help
--count, -c: include the count with the words
--min_word_length, -m: minimum word length
--term_file, -T file: a file containing a list of terms
--terms, -t: comma separated search terms
quote words containing spaces, no space after commas
--user_file, -U file: a file containing a list of users
--users, -u: comma separated usernames
quote words containing spaces, no space after commas
--verbose, -v: verbose
Usage is fairly simple, you can specify search terms or usernames either on the command line as comma separated lists or through files which you pass in. If you are specifying the terms or users on the command line you cannot have a space between the comma and the words, i.e. this is good:
term1,term2,term3
and this is bad:
term1, term2, term3
This is because of the way the command line arguments are parsed, the space is taken to mean a new parameter.
If you are using files each term/username should be on its own line.
When specifying usernames you do not need the @ symbol, if you pass it it will be stripped off when used anyway so save yourself some typing.
The --count option allows you to request the number of times each word is used. This might help if you only have a limited number of attempts to use the words and so need to decide which are really worth trying.
At the moment there is nothing for the script to be verbose about so the verbose flag does nothing. I've included it for future versions.
Todo
My next plan is to scrape both the general timeline and try to automate scraping trending topics. This would allow a word list of "current" words to be generated which would probably include a lot of slang and new words that aren't inlcuded in standard dictionaries.