In the world of BitTorrent, each torrent is identified by an infohash: the SHA1 hash of the torrent metadata that describes the files. When people are confronted with something that’s supposed to look random, they like to control it to some degree. You can see this behaviour in lots of places online: people generate special Bitcoin wallet addresses, Tor onion services that contain their nick, or 4chan tripcodes that look cool. All of these are made by repeatedly generating the hash until you find a result that you like. We can do exactly the same thing with torrents.
The structure of torrent files
Before we start tweaking our infohash, let’s talk about torrent files first. A torrent file is a bencoded dictionary. It contains information about the files, their names, how large they are and hashes for each piece. This is stored in the info section of the dictionary. The rest of the dictionary includes a list of trackers, the file comment, the creation date and other optional metadata. The infohash is quite literally the SHA1 hash of the info section of the torrent. Any modification to the file contents changes the infohash, while changing the other metadata doesn’t.
This gives us two ways of affecting the hash without touching the file contents. The first one is adding a separate key called vanity to the info section and changing its value. While this is really flexible and causes the least change that the user can see, it adds a non-standard key to our dictionary. Fortunately, torrent clients are supposed to be flexible and handle unknown keys gracefully.
The other thing we can do is to add a prefix to the file name. This should keep everything intact aside from a random value in front of our filename.
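To make the structure concrete, here is a minimal sketch of bencoding and of how the infohash depends only on the info section. It uses only the standard library; `bencode`, `infohash_of` and `toy_torrent` are names made up for this illustration, the tracker URL is fictional, and the `pieces` value is a placeholder rather than real piece hashes.

```python
import hashlib

def bencode(value):
    """Encode ints, byte strings, lists and dicts (with bytes keys) as bencode."""
    if isinstance(value, int):
        return b'i' + str(value).encode('ascii') + b'e'
    if isinstance(value, bytes):
        return str(len(value)).encode('ascii') + b':' + value
    if isinstance(value, list):
        return b'l' + b''.join(bencode(item) for item in value) + b'e'
    if isinstance(value, dict):
        # Dictionary keys are written in sorted order.
        items = sorted(value.items())
        return b'd' + b''.join(bencode(k) + bencode(v) for k, v in items) + b'e'
    raise TypeError('cannot bencode {!r}'.format(type(value)))

def infohash_of(info):
    """SHA1 hex digest of the bencoded info dictionary."""
    return hashlib.sha1(bencode(info)).hexdigest()

# Illustrative single-file torrent; every value here is made up.
info = {
    b'length': 3,
    b'name': b'a.txt',
    b'piece length': 16384,
    b'pieces': b'\x00' * 20,  # placeholder, not a real piece hash
}
toy_torrent = {
    b'announce': b'http://tracker.example/announce',
    b'comment': b'first version',
    b'info': info,
}

original = infohash_of(toy_torrent[b'info'])

# Changing metadata outside the info section leaves the infohash alone.
toy_torrent[b'comment'] = b'second version'
after_comment = infohash_of(toy_torrent[b'info'])

# Both vanity approaches change the info section, and therefore the hash.
with_vanity_key = infohash_of({**info, b'vanity': b'0'})
with_name_prefix = infohash_of({**info, b'name': b'0-' + info[b'name']})

print(bencode({b'name': b'a.txt', b'length': 3}))  # b'd6:lengthi3e4:name5:a.txte'
print(original == after_comment)                   # True
print(original != with_vanity_key, original != with_name_prefix)  # True True
```

The rest of this post uses the bencoder module rather than this hand-written encoder; the sketch is only meant to show what is being hashed.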
Parsing the torrent file
First of all, let’s read our torrent file and parse it. For this purpose, I’m using the bencoder module.
import bencoder
target = 'arch-linux.torrent'
# Torrent files are binary, so open the file in 'rb' mode.
with open(target, 'rb') as torrent_file:
    torrent = bencoder.decode(torrent_file.read())
Calculating the infohash
The infohash is the SHA1 hash of the bencoded info section of the file. Let’s write a function to calculate that. It returns the hash as a hexadecimal string, which is the form we will compare against later.
import hashlib
def get_infohash(torrent):
    # Re-encode only the info section and hash the resulting bytes.
    encoded = bencoder.encode(torrent[b'info'])
    return hashlib.sha1(encoded).hexdigest()
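Some magnet links in the wild carry the infohash in Base32 instead of hexadecimal. As a small sketch, assuming you already have the 40-character hex form, the standard library can convert between the two. The helper names `hex_to_base32` and `base32_to_hex` are made up for this example, and the hashed bytes are arbitrary sample data rather than a real info section.

```python
import base64
import hashlib

def hex_to_base32(hex_infohash):
    """Convert a 40-character hex infohash to its 32-character Base32 form."""
    raw = bytes.fromhex(hex_infohash)
    return base64.b32encode(raw).decode('ascii')

def base32_to_hex(b32_infohash):
    """Convert a Base32 infohash back to lowercase hex."""
    return base64.b32decode(b32_infohash).hex()

# Arbitrary sample input, only used to get a 20-byte SHA1 digest.
example = hashlib.sha1(b'example info section').hexdigest()
print(example)
print(hex_to_base32(example))
```

A SHA1 digest is 20 bytes, which is exactly 32 Base32 characters, so no padding characters appear. Note that a prefix chosen in hex, such as cafe, will not be readable in the Base32 form.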
Prefixing the name
Let’s start with the method that prefixes the name. We will start from 0 and keep incrementing the name prefix until the infohash starts with cafe.
original_name = torrent[b'info'][b'name'].decode('utf-8')
vanity = 0
while True:
    # Store the new name as bytes, like the values the decoder returned.
    torrent[b'info'][b'name'] = '{}-{}'.format(vanity, original_name).encode('utf-8')
    infohash = get_infohash(torrent)
    if infohash.startswith('cafe'):
        print(vanity, infohash)
        break
    vanity += 1
This code will increment our vanity number in a loop and print it and the respective infohash when it finds a suitable one.
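How long should the loop take? As a rough estimate, assuming SHA1 output behaves like uniformly random hex digits, each attempt matches an n-digit prefix with probability 1/16^n. The helper names below are made up for this sketch.

```python
def expected_attempts(prefix):
    """Average number of tries needed to hit a given hex prefix."""
    return 16 ** len(prefix)

def chance_within(prefix, attempts):
    """Probability of at least one match within the given number of tries."""
    p = 1 / 16 ** len(prefix)
    return 1 - (1 - p) ** attempts

print(expected_attempts('cafe'))             # 65536
print(round(chance_within('cafe', 65536), 3))  # 0.632
```

Every extra hex digit in the prefix multiplies the expected work by 16, so short prefixes finish almost instantly while long ones quickly become impractical for a simple Python loop.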
Adding a separate key to the info section
While the previous approach works well, it still causes a change that is visible to the user. Let’s work around that by putting the changing value in a bogus key called vanity inside the info section.
vanity = 0
while True:
    # Keep the value as bytes so it is encoded as a plain byte string.
    torrent[b'info'][b'vanity'] = str(vanity).encode('ascii')
    infohash = get_infohash(torrent)
    if infohash.startswith('cafe'):
        print(vanity, infohash)
        break
    vanity += 1
Saving the modified torrent files
While it is possible to do the modification to the file yourself, why not go all the way and save the modified torrent file as well? Let’s write a function to save a given torrent file.
def save_torrent(torrent, name):
    # 'wb' is enough here: the file is only written, never read back.
    with open(name, 'wb') as torrent_file:
        torrent_file.write(bencoder.encode(torrent))
You can use this function after finding an infohash that you like.
Cool ideas for infohash criteria
- Release groups can prefix their infohashes with their name/something unique to them
- Finding numerically smaller infohashes - these should slowly accumulate zeros at the beginning
- Infohashes with the least entropy - should make them easier to remember
- Infohashes with the most digits
- Infohashes with no digits
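The criteria above can be sketched as small scoring functions over the hex string, which could then replace the startswith check in the search loops. All function names here are made up for this illustration, and Shannon entropy over the characters is only one possible reading of "least entropy".

```python
import math
from collections import Counter

def leading_zeros(infohash):
    """Number of '0' characters at the start of the hex string."""
    return len(infohash) - len(infohash.lstrip('0'))

def digit_count(infohash):
    """How many characters are decimal digits rather than a-f."""
    return sum(ch.isdigit() for ch in infohash)

def has_no_digits(infohash):
    """True when the hash consists only of the letters a-f."""
    return digit_count(infohash) == 0

def shannon_entropy(infohash):
    """Entropy in bits per character; lower means more repetitive."""
    counts = Counter(infohash)
    total = len(infohash)
    return -sum(n / total * math.log2(n / total) for n in counts.values())

def is_smaller(candidate, best):
    """Equal-length lowercase hex strings compare like the numbers they encode."""
    return candidate < best

print(leading_zeros('000abc'))   # 3
print(digit_count('cafe12'))     # 2
print(has_no_digits('cafebabe')) # True
print(shannon_entropy('abab'))   # 1.0
```

For criteria such as "smallest" or "least entropy" there is no natural stopping point, so a search would keep the best candidate seen so far and stop after a fixed number of attempts.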