Intermodal: A 40’ shipping container for the Internet

Intermodal is a user-friendly and featureful command-line BitTorrent metainfo utility for Linux, Windows, and macOS.

Project development is hosted on GitHub.

The binary is called imdl:

$ imdl --help

BitTorrent metainfo related functionality is under the torrent subcommand:

$ imdl torrent --help

Intermodal can be used to create .torrent files:

$ imdl torrent create --input foo

Print information about existing .torrent files:

$ imdl torrent show --input foo.torrent

Verify downloaded torrents:

$ imdl torrent verify --input foo.torrent --content foo

Generate magnet links from .torrent files:

$ imdl torrent link --input foo.torrent

Show information about the piece length picker:

$ imdl torrent piece-length

Print completion scripts for the imdl binary:

$ imdl completions --shell zsh

Functionality that is not yet finalized, but still available for preview, can be accessed with the --unstable flag:

Print information about a collection of torrents:

$ imdl --unstable torrent stats --input dir

Happy sharing!

FAQ

Can Intermodal be used to preview torrents with fzf?

Yes! @mustaqimM came up with the following:

fzf --preview='imdl --color always --terminal torrent show --input {}'

Note the use of --color always and --terminal to force colored, human-readable output.

This can be used to, for example, preview the torrents in a directory:

find . -name '*.torrent' | fzf --preview='imdl -c always -t torrent show -i {}'

Can Intermodal be used to create a torrent from a Git repo?

Yes! The --ignore flag, contributed by @Celeo, can be used to make imdl torrent create respect .gitignore files:

imdl torrent create --ignore --include-hidden --include-junk --glob '!.git/*' --input .

Besides --ignore, this command uses --include-hidden and --include-junk to include files, like .gitignore, that are present in the repo but would otherwise be skipped, and --glob '!.git/*' to skip the contents of the .git directory.

Equivalently, with short flags:

imdl torrent create --ignore -hjg '!.git/*' -i .

How do I include and exclude files when creating a torrent?

There are a few ways to control which files are included when you create a torrent.

By default, symlinks, hidden files, and common “junk” files are excluded. To include these files, use:

  • --follow-symlinks to include files pointed to by a symlink.
  • --include-hidden to include files with names that start with . or are hidden by a file attribute.
  • --include-junk to include “junk” files like .DS_Store.

The --ignore flag makes Intermodal respect .gitignore and .ignore files.

This can be used to create a torrent from a Git repository, or to exclude files by creating a file called .ignore, adding patterns with the same syntax as .gitignore that match those files, and using --ignore when you create the torrent.

Additionally, you can use --glob PATTERN to both include and exclude files.

If PATTERN does not start with !, only those files that match PATTERN will be included.

If PATTERN starts with !, those files that match PATTERN will be excluded.

--glob can be passed multiple times, to include multiple subsets of files:

# only include `foo/bar` and `foo/bob`
imdl torrent create --input foo --glob bar/ --glob bob/

To exclude multiple subsets of files:

# don't include `foo/bar` and `foo/bob`
imdl torrent create --input foo --glob '!bar/' --glob '!bob/'

Or to refine a pattern:

# include everything in `foo/bar` but not anything in `foo/bar/baz`
imdl torrent create --input foo --glob 'bar/' --glob '!bar/baz/'

--glob can be passed any number of times. If multiple PATTERNs match a path, the last one on the command line takes precedence.
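
The last-match-wins rule can be illustrated with a small Python sketch. This is not Intermodal's implementation: the included helper is hypothetical, it uses Python's fnmatch rather than Intermodal's glob engine (so the patterns are written as bar/* instead of bar/), and it assumes that a path matching no pattern is kept only when every pattern is an exclusion.

```python
from fnmatch import fnmatch

def included(path, patterns):
    """Decide whether `path` is included, letting the last matching pattern win."""
    # Assumption: with no match, a path is kept only if every pattern is an exclusion.
    decision = all(p.startswith("!") for p in patterns)
    for pattern in patterns:
        negated = pattern.startswith("!")
        glob = pattern[1:] if negated else pattern
        if fnmatch(path, glob):
            # A later match overrides any earlier decision.
            decision = not negated
    return decision

patterns = ["bar/*", "!bar/baz/*"]
print(included("bar/a.txt", patterns))      # matched by bar/* only
print(included("bar/baz/b.txt", patterns))  # later !bar/baz/* overrides bar/*
print(included("bob/c.txt", patterns))      # matches nothing
```

Swapping the order of the two patterns changes the result for bar/baz/b.txt, because bar/* would then be the last pattern to match it.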

Changelog

UNRELEASED - 2020-10-20

v0.1.12 - 2020-10-03

v0.1.11 - 2020-09-07

v0.1.10 - 2020-06-24

v0.1.9 - 2020-06-24

v0.1.8 - 2020-05-26

v0.1.7 - 2020-04-22

v0.1.6 - 2020-04-20

v0.1.5 - 2020-04-11

v0.1.4 - 2020-04-10

v0.1.3 - 2020-04-10

v0.1.2 - 2020-04-08

v0.1.1 - 2020-04-08

v0.1.0 - 2020-04-08

v0.0.3 - 2020-04-08

v0.0.2 - 2020-04-08

v0.0.1 - 2020-04-08

Commands

This page intentionally left blank.

imdl

imdl v0.1.12
Casey Rodarmor <casey@rodarmor.com>
📦 A 40' shipping container for the internet -
https://github.com/casey/intermodal

USAGE:
    imdl [FLAGS] [OPTIONS] <SUBCOMMAND>

FLAGS:
    -h, --help        Print help message.
    -q, --quiet       Suppress normal output.
    -t, --terminal    Disable automatic terminal detection and behave as if both
                      standard output and standard error are connected to a
                      terminal.
    -u, --unstable    Enable unstable features. To avoid premature stabilization
                      and excessive version churn, unstable features are
                      unavailable unless this flag is set. Unstable features are
                      not bound by semantic versioning stability guarantees, and
                      may be changed or removed at any time.
    -V, --version     Print version number.

OPTIONS:
    -c, --color <WHEN>    Print colorful output according to `WHEN`. When
                          `auto`, the default, colored output is only enabled if
                          imdl detects that it is connected to a terminal, the
                          `NO_COLOR` environment variable is not set, and the
                          `TERM` environment variable is not set to `dumb`.
                          [default: auto]  [possible values: auto, always,
                          never]

SUBCOMMANDS:
    completions    Print shell completion scripts to standard output.
    help           Prints this message or the help of the given
                   subcommand(s)
    torrent        Subcommands related to the BitTorrent protocol.

imdl completions

imdl-completions 0.1.12
Print shell completion scripts to standard output.

USAGE:
    imdl completions [OPTIONS] <SHELL>

FLAGS:
    -h, --help       Print help message.
    -V, --version    Print version number.

OPTIONS:
    -d, --dir <DIR>        Write completion script to `DIR` with an appropriate
                           filename. If `--shell` is not given, write all
                           completion scripts.
    -s, --shell <SHELL>    Print completion script for `SHELL`. [possible
                           values: zsh, bash, fish, powershell, elvish]

ARGS:
    <SHELL>    Print completion script for `SHELL`. [possible values: zsh,
               bash, fish, powershell, elvish]

imdl torrent

imdl-torrent 0.1.12
Subcommands related to the BitTorrent protocol.

USAGE:
    imdl torrent <SUBCOMMAND>

FLAGS:
    -h, --help       Print help message.
    -V, --version    Print version number.

SUBCOMMANDS:
    create          Create a .torrent file.
    help            Prints this message or the help of the given
                    subcommand(s)
    link            Generate a magnet link from a .torrent file.
    piece-length    Display information about automatic piece length
                    selection.
    show            Display information about a .torrent file.
    stats           Show statistics about a collection of .torrent files.
    verify          Verify files against a .torrent file.

imdl torrent create

imdl-torrent-create 0.1.12
Create a .torrent file.

USAGE:
    imdl torrent create [FLAGS] [OPTIONS] <INPUT>

FLAGS:
    -n, --dry-run             Skip writing `.torrent` file to disk.
    -F, --follow-symlinks     Follow symlinks in torrent input. By default,
                              symlinks to files and directories are not included
                              in torrent contents.
    -f, --force               Overwrite the destination `.torrent` file, if it
                              exists.
        --help                Print help message.
        --ignore              Skip files listed in `.gitignore`, `.ignore`,
                              `.git/info/exclude`, and `git config --get
                              core.excludesFile`.
    -h, --include-hidden      Include hidden files that would otherwise be
                              skipped, such as files that start with a `.`, and
                              files hidden by file attributes on macOS and
                              Windows.
    -j, --include-junk        Include junk files that would otherwise be
                              skipped.
    -M, --md5                 Include MD5 checksum of each file in the torrent.
                              N.B. MD5 is cryptographically broken and only
                              suitable for checking for accidental corruption.
        --no-created-by       Do not populate `created by` key of generated
                              torrent with imdl version information.
        --no-creation-date    Do not populate `creation date` key of generated
                              torrent with current time.
    -O, --open                Open `.torrent` file after creation. Uses `xdg-
                              open`, `gnome-open`, or `kde-open` on Linux;
                              `open` on macOS; and `cmd /C start` on Windows
        --link                Print created torrent `magnet:` URL to standard
                              output
    -P, --private             Set the `private` flag. Torrent clients that
                              understand the flag and participate in the swarm
                              of a torrent with the flag set will only announce
                              themselves to the announce URLs included in the
                              torrent, and will not use other peer discovery
                              mechanisms, such as the DHT or local peer
                              discovery. See BEP 27: Private Torrents for more
                              information.
    -S, --show                Display information about created torrent file.
    -V, --version             Print version number.

OPTIONS:
    -A, --allow <LINT>...
            Allow `LINT`. Lints check for conditions which, although permitted,
            are not usually desirable. For example, piece length can be any non-
            zero value, but probably shouldn't be below 16 KiB. The lint
            `small-piece-length` checks for this, and `--allow small-piece-length`
            can be used to disable this check. [possible values: private-
            trackerless, small-piece-length, uneven-piece-length]
    -a, --announce <URL>
            Use `URL` as the primary tracker announce URL. To supply multiple
            announce URLs, also use `--announce-tier`.
    -t, --announce-tier <URL-LIST>...
            Use `URL-LIST` as a tracker announce tier. Each instance adds a new
            tier. To add multiple trackers to a given tier, separate their
            announce URLs with commas:
            
            `--announce-tier
            udp://example.com:80/announce,https://example.net:443/announce`
                        
            Announce tiers are stored in the `announce-list` key of the top-
            level metainfo dictionary as a list of lists of strings, as
            defined by BEP 12: Multitracker Metadata Extension.
                        
            Note: Many BitTorrent clients do not implement the behavior
            described in BEP 12. See the discussion here for more details:
            https://github.com/bittorrent/bittorrent.org/issues/82
    -c, --comment <TEXT>
            Include `TEXT` as the comment for generated `.torrent` file. Stored
            under `comment` key of top-level metainfo dictionary.
        --node <NODE>...
            Add DHT bootstrap node `NODE` to torrent. `NODE` should be in the
            form `HOST:PORT`, where `HOST` is a domain name, an IPv4 address, or
            an IPv6 address surrounded by brackets. May be given more than once
            to add multiple bootstrap nodes.
            
            Examples:
            
                --node router.example.com:1337
            
                --node 203.0.113.0:2290
            
                --node [2001:db8:4275:7920:6269:7463:6f69:6e21]:8832
    -g, --glob <GLOB>...
            Include or exclude files that match `GLOB`. Multiple globs may be
            provided, with the last one taking precedence. Precede a glob with
            `!` to exclude it.
    -i, --input <INPUT>
            Read torrent contents from `INPUT`. If `INPUT` is a file, torrent
            will be a single-file torrent.  If `INPUT` is a directory, torrent
            will be a multi-file torrent.  If `INPUT` is `-`, read from standard
            input. Piece length defaults to 256KiB when reading from standard
            input if `--piece-length` is not given.
    -N, --name <TEXT>
            Set name of torrent to `TEXT`. Defaults to the filename of the
            argument to `--input`. Required when `--input -`.
    -o, --output <TARGET>
            Save `.torrent` file to `TARGET`, or print to standard output if
            `TARGET` is `-`. Defaults to the argument to `--input` with an
            `.torrent` extension appended. Required when `--input -`.
        --peer <PEER>...                 Add `PEER` to magnet link.
    -p, --piece-length <BYTES>
            Set piece length to `BYTES`. Accepts SI units, e.g. kib, mib, and
            gib.
        --sort-by <SPEC>...
            Set the order of files within a torrent. `SPEC` should be of the
            form `KEY:ORDER`, with `KEY` being one of `path` or `size`, and
            `ORDER` being `ascending` or `descending`. `:ORDER` defaults to
            `ascending` if omitted. The `--sort-by` flag may be given more than
            once, with later values being used to break ties. Ties that remain
            are broken in ascending path order.
            
            Sort in ascending order by path, the default:
            
                --sort-by path:ascending
            
            Sort in ascending order by path, more concisely:
            
                --sort-by path
            
            Sort in ascending order by size, break ties in descending path
            order:
            
                --sort-by size:ascending --sort-by path:descending
    -s, --source <TEXT>
            Set torrent source to `TEXT`. Stored under `source` key of info
            dictionary. This is useful for keeping statistics from being mis-
            reported when participating in swarms with the same contents,
            but with different trackers. When source is set to a unique value
            for torrents with the same contents, torrent clients will treat them
            as distinct torrents, and not share peers between them, and will
            correctly report download and upload statistics to multiple
            trackers.
        --update-url <URL>
            Set torrent feed URL to `URL`, stored in the `update-url` key of the
            info dictionary. Clients that support BEP 39 will use the update URL
            to download revised versions of the torrent's metainfo. Note that BEP
            39 is not widely supported.

ARGS:
    <INPUT>    Read torrent contents from `INPUT`. If `INPUT` is a file,
               torrent will be a single-file torrent.  If `INPUT` is a
               directory, torrent will be a multi-file torrent.  If `INPUT`
               is `-`, read from standard input. Piece length defaults to
               256KiB when reading from standard input if `--piece-length`
               is not given.

imdl torrent link

imdl-torrent-link 0.1.12
Generate a magnet link from a .torrent file.

USAGE:
    imdl torrent link [FLAGS] [OPTIONS] <INPUT>

FLAGS:
    -h, --help       Print help message.
    -O, --open       Open generated magnet link. Uses `xdg-open`, `gnome-open`,
                     or `kde-open` on Linux; `open` on macOS; and `cmd /C start`
                     on Windows.
    -V, --version    Print version number.

OPTIONS:
    -s, --select-only <INDICES>...
            Select files to download. Values are indices into the `info.files`
            list, e.g. `--select-only 1,2,3`.
    -i, --input <INPUT>
            Generate magnet link from metainfo at `INPUT`. If `INPUT` is `-`,
            read metainfo from standard input.
    -p, --peer <PEER>...              Add `PEER` to magnet link.

ARGS:
    <INPUT>    Generate magnet link from metainfo at `INPUT`. If `INPUT` is
               `-`, read metainfo from standard input.

imdl torrent piece-length

imdl-torrent-piece-length 0.1.12
Display information about automatic piece length selection.

USAGE:
    imdl torrent piece-length

FLAGS:
    -h, --help       Print help message.
    -V, --version    Print version number.

imdl torrent show

imdl-torrent-show 0.1.12
Display information about a .torrent file.

USAGE:
    imdl torrent show [FLAGS] [OPTIONS] <INPUT>

FLAGS:
    -h, --help       Print help message.
    -j, --json       Output data as JSON instead of the default format.
    -V, --version    Print version number.

OPTIONS:
    -i, --input <INPUT>    Show information about torrent at `INPUT`. If `INPUT`
                           is `-`, read torrent metainfo from standard input.

ARGS:
    <INPUT>    Show information about torrent at `INPUT`. If `INPUT` is `-`,
               read torrent metainfo from standard input.

imdl torrent stats

imdl-torrent-stats 0.1.12
Show statistics about a collection of .torrent files.

USAGE:
    imdl torrent stats [FLAGS] [OPTIONS] --input <PATH>

FLAGS:
    -h, --help       Print help message.
    -p, --print      Pretty print the contents of each torrent as it is
                     processed.
    -V, --version    Print version number.

OPTIONS:
    -e, --extract-pattern <REGEX>...
            Extract and display values under key paths that match `REGEX`.
            Subkeys of a bencoded dictionary are delimited by `/`, and values
            of a bencoded list are delimited by `*`. For example, given the
            following bencoded dictionary `{"foo": [{"bar": {"baz": 2}}]}`, the
            value `2`'s key path will be `foo*bar/baz`. The value `2` would be
            displayed if any of `bar`, `foo[*]bar/baz`, or `foo.*baz` were
            passed to `--extract-pattern`.
    -i, --input <PATH>
            Search `PATH` for torrents. May be a directory or a single torrent
            file.
    -l, --limit <N>
            Stop after processing `N` torrents. Useful when processing large
            collections of `.torrent` files.
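
The key path example from the --extract-pattern help text can be checked with a short Python sketch. The key_paths helper is hypothetical and only mirrors the delimiter rules described above; how Intermodal handles edge cases such as nested lists is an assumption, and plain Python strings stand in for bencoded byte strings.

```python
import re

def key_paths(value, prefix=""):
    """Yield (key_path, leaf) pairs for a decoded bencode value."""
    if isinstance(value, dict):
        for key, child in value.items():
            # Dictionary subkeys are delimited by `/`, except at the top level
            # or directly after a list delimiter (assumed from the help example).
            sep = "/" if prefix and not prefix.endswith("*") else ""
            yield from key_paths(child, prefix + sep + key)
    elif isinstance(value, list):
        for child in value:
            # Values of a list are delimited by `*`.
            yield from key_paths(child, prefix + "*")
    else:
        yield prefix, value

paths = dict(key_paths({"foo": [{"bar": {"baz": 2}}]}))
print(paths)  # {'foo*bar/baz': 2}

# Each pattern from the help text matches the key path as an unanchored regex.
for pattern in ["bar", "foo[*]bar/baz", "foo.*baz"]:
    print(pattern, bool(re.search(pattern, "foo*bar/baz")))
```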

imdl torrent verify

imdl-torrent-verify 0.1.12
Verify files against a .torrent file.

USAGE:
    imdl torrent verify [OPTIONS] <INPUT>

FLAGS:
    -h, --help       Print help message.
    -V, --version    Print version number.

OPTIONS:
    -c, --content <PATH>    Verify torrent content at `PATH` against torrent
                            metainfo. Defaults to `name` field of torrent info
                            dictionary.
    -i, --input <INPUT>     Verify torrent contents against torrent metainfo in
                            `INPUT`. If `INPUT` is `-`, read metainfo from
                            standard input.

ARGS:
    <INPUT>    Verify torrent contents against torrent metainfo in `INPUT`.
               If `INPUT` is `-`, read metainfo from standard input.

BitTorrent

This page intentionally left blank.

BitTorrent Piece Length Selection

BitTorrent .torrent files contain so-called metainfo that allows BitTorrent peers to locate, download, and verify the contents of a torrent.

This metainfo includes the piece list, a list of SHA-1 hashes of fixed-size pieces of the torrent data. The size of these pieces is chosen by the torrent creator.
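
As a rough illustration of what the piece list contains, the following Python sketch hashes fixed-size pieces of a byte string. The piece_hashes helper is hypothetical, and the sketch treats the torrent contents as a single in-memory byte string, which is a simplification.

```python
import hashlib

def piece_hashes(data, piece_length):
    """Return the SHA-1 digest of each fixed-size piece of `data`."""
    return [
        hashlib.sha1(data[offset:offset + piece_length]).digest()
        for offset in range(0, len(data), piece_length)
    ]

content = bytes(40 * 1024)            # 40 KiB of zero bytes as stand-in content
hashes = piece_hashes(content, 16 * 1024)
print(len(hashes))                    # 3 pieces: 16 KiB, 16 KiB, and a final 8 KiB
print(len(b"".join(hashes)))          # 60 bytes, 20 per SHA-1 digest
```

Only the final piece may be shorter than the chosen piece length.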

Intermodal has a simple algorithm that attempts to pick a reasonable piece length for a torrent given the size of the contents.

For compatibility with the BitTorrent v2 specification, the algorithm chooses piece lengths that are powers of two, and that are at least 16KiB.

The maximum automatically chosen piece length is 16MiB, as piece lengths larger than 16MiB have been reported to cause issues for some clients.

Beyond these constraints, there are a number of other factors to consider.

Factors favoring smaller piece length

  • To avoid uploading bad data, peers only upload data from full pieces, which can be verified by hash. Decreasing the piece size allows peers to more quickly obtain a full piece, which decreases the time before they begin uploading, and receiving data in return.

  • Decreasing the piece size decreases the amount of data that must be thrown away in case of corruption.

Factors favoring larger piece length

  • Increasing the piece size decreases the protocol overhead from requesting many pieces.

  • Increasing the piece size decreases the number of pieces, decreasing the size of the metainfo.

  • Increasing the piece size decreases the proportion of disk seeks to disk reads, which can be beneficial for spinning disks.

Intermodal’s Algorithm

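In Python, the algorithm used by Intermodal is:

import math

MIN = 16 * 1024         # smallest piece length: 16 KiB
MAX = 16 * 1024 * 1024  # largest automatically chosen piece length: 16 MiB

def piece_length(content_length):
  # content_length is the size of the torrent contents in bytes (must be > 0)
  exponent = math.log2(content_length)
  # a power of two near 16 * sqrt(content_length), rounded down
  length = 1 << int((exponent / 2 + 4))
  return min(max(length, MIN), MAX)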

Which gives the following piece lengths:

Content -> Piece Length x Count    = Piece List Size
16 KiB  -> 16 KiB       x 1        = 20 bytes
32 KiB  -> 16 KiB       x 2        = 40 bytes
64 KiB  -> 16 KiB       x 4        = 80 bytes
128 KiB -> 16 KiB       x 8        = 160 bytes
256 KiB -> 16 KiB       x 16       = 320 bytes
512 KiB -> 16 KiB       x 32       = 640 bytes
1 MiB   -> 16 KiB       x 64       = 1.25 KiB
2 MiB   -> 16 KiB       x 128      = 2.5 KiB
4 MiB   -> 32 KiB       x 128      = 2.5 KiB
8 MiB   -> 32 KiB       x 256      = 5 KiB
16 MiB  -> 64 KiB       x 256      = 5 KiB
32 MiB  -> 64 KiB       x 512      = 10 KiB
64 MiB  -> 128 KiB      x 512      = 10 KiB
128 MiB -> 128 KiB      x 1024     = 20 KiB
256 MiB -> 256 KiB      x 1024     = 20 KiB
512 MiB -> 256 KiB      x 2048     = 40 KiB
1 GiB   -> 512 KiB      x 2048     = 40 KiB
2 GiB   -> 512 KiB      x 4096     = 80 KiB
4 GiB   -> 1 MiB        x 4096     = 80 KiB
8 GiB   -> 1 MiB        x 8192     = 160 KiB
16 GiB  -> 2 MiB        x 8192     = 160 KiB
32 GiB  -> 2 MiB        x 16384    = 320 KiB
64 GiB  -> 4 MiB        x 16384    = 320 KiB
128 GiB -> 4 MiB        x 32768    = 640 KiB
256 GiB -> 8 MiB        x 32768    = 640 KiB
512 GiB -> 8 MiB        x 65536    = 1.25 MiB
1 TiB   -> 16 MiB       x 65536    = 1.25 MiB
2 TiB   -> 16 MiB       x 131072   = 2.5 MiB
4 TiB   -> 16 MiB       x 262144   = 5 MiB
8 TiB   -> 16 MiB       x 524288   = 10 MiB
16 TiB  -> 16 MiB       x 1048576  = 20 MiB
32 TiB  -> 16 MiB       x 2097152  = 40 MiB
64 TiB  -> 16 MiB       x 4194304  = 80 MiB
128 TiB -> 16 MiB       x 8388608  = 160 MiB
256 TiB -> 16 MiB       x 16777216 = 320 MiB
512 TiB -> 16 MiB       x 33554432 = 640 MiB
1 PiB   -> 16 MiB       x 67108864 = 1.25 GiB
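
Rows of this table can be recomputed with a short Python sketch. It repeats the piece_length function so that it runs on its own; the table_row helper is hypothetical and assumes 20 bytes per SHA-1 hash in the piece list.

```python
import math

MIN = 16 * 1024
MAX = 16 * 1024 * 1024

def piece_length(content_length):
    exponent = math.log2(content_length)
    length = 1 << int(exponent / 2 + 4)
    return min(max(length, MIN), MAX)

def table_row(content_length):
    """Return (piece length, piece count, piece list size in bytes)."""
    length = piece_length(content_length)
    count = (content_length + length - 1) // length  # round up
    return length, count, count * 20                 # each SHA-1 hash is 20 bytes

KIB, MIB, GIB = 1024, 1024 ** 2, 1024 ** 3
print(table_row(4 * MIB))  # (32768, 128, 2560): 32 KiB x 128 = 2.5 KiB
print(table_row(1 * GIB))  # (524288, 2048, 40960): 512 KiB x 2048 = 40 KiB
```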

References

Articles

Implementations

BEP Support

Symbol  Meaning
        Supported
        Unsupported (link to issue)
        Not Applicable

BEP  Status  Title
00           Index of BitTorrent Enhancement Proposals
01           The BitTorrent Enhancement Proposal Process
02           Sample reStructured Text BEP Template
03           The BitTorrent Protocol Specification
04           Assigned Numbers
05           DHT Protocol
06           Fast Extension
07           IPv6 Tracker Extension
08           Tracker Peer Obfuscation
09           Extension for Peers to Send Metadata Files
10           Extension Protocol
11           Peer Exchange (PEX)
12           Multitracker Metadata Extension
14           Local Service Discovery
15           UDP Tracker Protocol for BitTorrent
16           Superseeding
17           HTTP Seeding
18           Search Engine Specification
19           WebSeed - HTTP/FTP Seeding (GetRight style)
20           Peer ID Conventions
21           Extension for partial seeds
22           BitTorrent Local Tracker Discovery Protocol
23           Tracker Returns Compact Peer Lists
24           Tracker Returns External IP
25           An Alternate BitTorrent Cache Discovery Protocol
26           Zeroconf Peer Advertising and Discovery
27           Private Torrents
28           Tracker exchange extension
29           uTorrent transport protocol
30           Merkle hash torrent extension
31           Failure Retry Extension
32           BitTorrent DHT Extensions for IPv6
33           DHT Scrapes
34           DNS Tracker Preferences
35           Torrent Signing
36           Torrent RSS feeds
37           Anonymous BitTorrent over proxies
38           Finding Local Data Via Torrent File Hints
39           Updating Torrents Via Feed URL
40           Canonical Peer Priority
41           UDP Tracker Protocol Extensions
42           DHT Security extension
43           Read-only DHT Nodes
44           Storing arbitrary data in the DHT
45           Multiple-address operation for the BitTorrent DHT
46           Updating Torrents Via DHT Mutable Items
47           Padding files and extended file attributes
48           Tracker Protocol Extension: Scrape
49           Distributed Torrent Feeds
50           Publish/Subscribe Protocol
51           DHT Infohash Indexing
52           The BitTorrent Protocol Specification v2
53           Magnet URI extension - Select specific file indices for download
54           The lt_donthave extension
55           Holepunch extension
Metainfo Utilities

Name                 UI             Language    Notes
torf-cli             CLI            Python      Highly recommended utility for creating torrents and magnet links, as well as displaying information about and editing existing torrents.
mktorrent            CLI            C           Popular but unmaintained torrent file creator.
pmktorrent           CLI            C           Maintained fork of mktorrent.
mktorrent            Library        Ruby        Library for creating torrent files.
py3createtorrent     CLI            Python      Torrent file creator.
create-torrent       Library & CLI  JavaScript  JavaScript library and CLI for creating torrents.
whatmp3              CLI            Python      Torrent file creator that automatically transcodes FLAC files.
torrent-file-editor  GUI            C++         Graphical torrent file editor.
torrent2magnet       CLI            Python      Creates magnet links from torrent files.
h2torrent            CLI            Python      Creates .torrent files from an infohash or magnet URI.
dottorrent           Library        Python      Library for creating torrent files.
dottorrent-cli       CLI            Python      Torrent file creator.
torrent-creator      Web page       TypeScript  Single-page web app torrent file creator.
pyrocore             CLI            Python      Utilities for creating, modifying, and displaying torrent files.
buildtorrent         CLI            C           Torrent file creator packaged for Ubuntu and Debian.
maketorrent          CLI            Rust        Torrent file creator.

Distributing Large Data Sets

Even though BitTorrent is well-suited for distributing large amounts of data, very large torrents can still cause problems. Here are some of the problems you might encounter, as well as suggestions for how to avoid or ameliorate those issues.

Intermodal currently uses a single-threaded piece hashing algorithm. If you’re distributing a large data set and hashing time is a problem, please open an issue! I’m eager to improve hashing performance, but want to make sure I do it in such a way that real workloads benefit.

Background

In order to support incremental download and verification, as well as resumption of partial downloads, the contents of a torrent are broken into pieces.

The length of pieces is configurable, and the ideal choice of piece length depends on many factors, but values between 16KiB and 256KiB are common. Very large torrents may use much larger piece lengths, like 16MiB.

Each piece is hashed, and .torrent files, also referred to as metainfo, contain a list of those hashes.

For all the example commands, I’ll be using dir for the directory containing the data set you want to share.

Issues

.torrent file too large

When the amount of data is large, or the piece length is small, the number of pieces can make the .torrent file very big.

To avoid this, you can either break the data into multiple torrents, or make the piece length larger, so the .torrent file contains fewer pieces.
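
To get a feel for the trade-off, this Python sketch estimates the size of the piece list alone for a given content size and piece length. The piece_list_size helper is hypothetical; it assumes 20 bytes per SHA-1 piece hash and ignores the rest of the .torrent file, such as file names and tracker URLs.

```python
HASH_SIZE = 20  # bytes per SHA-1 piece hash

def piece_list_size(content_length, piece_length):
    """Return the size in bytes of the piece list for the given content."""
    pieces = (content_length + piece_length - 1) // piece_length  # round up
    return pieces * HASH_SIZE

TIB = 1024 ** 4
print(piece_list_size(TIB, 256 * 1024))        # 83886080 bytes (80 MiB)
print(piece_list_size(TIB, 16 * 1024 * 1024))  # 1310720 bytes (1.25 MiB)
```

For 1 TiB of data, raising the piece length from 256 KiB to 16 MiB shrinks the piece list by a factor of 64.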

Breaking data into multiple torrents

imdl torrent create has a --glob option that can be used to control which files are included in a torrent. If your data set is divided into multiple files, ideally with a consistent naming scheme, this can be used to easily create multiple torrents with different subsets of the data.

The name of the created torrent is usually derived from the name of the input, so the output torrent name should be given manually to avoid conflicts:

$ imdl torrent create -i dir -o a.torrent --glob 'dir/0*'
$ imdl torrent create -i dir -o b.torrent --glob 'dir/1*'
$ imdl torrent create -i dir -o c.torrent --glob 'dir/2*'
# etc…

Making the piece length larger

imdl has an automatic piece length picker, which should choose a good piece length. You can see what choices it makes for different torrent sizes with:

$ imdl torrent piece-length

Some torrent clients don’t do well with piece lengths over 16 MiB, so the piece length picker will never pick piece lengths over 16 MiB. This can be overridden by specifying --piece-length manually. --piece-length takes units, like KiB, MiB, and GiB:

$ imdl torrent create -i dir --piece-length 128mib

Too many files

Torrents containing a large number of separate files can cause performance issues. It’s not clear if these performance issues are due to BitTorrent client implementations, host OS file system issues, or both.

Distributing your data set as an ISO image

By distributing your data set as an ISO image, all the files in your torrent will be packed into a single .iso file. Additionally, recipients of the ISO won’t have to decompress the whole data set to browse or extract individual files.

You can create an ISO with genisoimage, which can be installed on Debian or Ubuntu with:

$ sudo apt install genisoimage

To create a compressed ISO containing your data set:

# -transparent-compression: compress data in the ISO
# -untranslated-filenames:  don't mangle filenames
# -verbose:                 verbose output
# -output data.iso:         output path
# -V DATA_SET_NAME:         volume name
# dir:                      input path
$ genisoimage                \
    -transparent-compression \
    -untranslated-filenames  \
    -verbose                 \
    -output data.iso         \
    -V DATA_SET_NAME         \
    dir

The same command, but with short flags:

$ genisoimage -zUvo data.iso -V DATA_SET_NAME dir

A torrent can then be created containing the ISO:

$ imdl torrent create --input data.iso

Users can mount and unmount the ISO on Linux:

$ sudo mkdir -p /mnt                   # create mount point
$ sudo mount --read-only data.iso /mnt # mount ISO
$ sudo umount /mnt                     # unmount when finished

Or macOS:

$ hdiutil mount data.iso                 # mount ISO
$ hdiutil unmount /Volumes/DATA_SET_NAME # unmount when finished

On Windows, macOS, and some Linux desktop environments, ISOs can also be mounted by double-clicking the file.

Torrent Client Issues

Some torrent clients don’t do well with torrents with large piece sizes, many files, or a large amount of data.

Switch to a libtorrent-based client

If you’re experiencing issues downloading a large data set, switching torrent clients may help.

In my personal experience, torrent clients that use Arvid Norberg’s libtorrent have done well with large amounts of data.

libtorrent’s Wikipedia page has a list of torrent clients that use libtorrent.

Conclusion

If you have suggestions for this guide, please don’t hesitate to open an issue.

In particular, if you’ve found particular torrent clients to be good or bad at downloading large data sets, or have run into issues or found solutions not covered by this guide, I would love to know!

UDP Tracker Protocol

This description of the UDP tracker protocol is adapted from this page by Arvid Norberg.

A tracker with the protocol “udp://” in its URI should be contacted with this protocol.

All values are sent in network byte order (big-endian).

If no response to a request is received within 15 seconds, resend the request. If no reply has been received after 60 seconds, stop retrying.
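
The resend schedule might be sketched in Python as follows. The request_with_retry helper and its parameter names are hypothetical; the sketch expects a UDP socket (or any object with the same settimeout, sendto, and recvfrom methods) and does not check that the reply's transaction ID matches the request.

```python
import socket

def request_with_retry(sock, packet, addr, resend_after=15.0, give_up_after=60.0):
    """Send `packet` to `addr`, resending on timeout; return the reply or None."""
    sock.settimeout(resend_after)
    attempts = int(give_up_after // resend_after)  # 4 sends with the defaults
    for _ in range(attempts):
        sock.sendto(packet, addr)
        try:
            data, _sender = sock.recvfrom(65535)
            return data
        except socket.timeout:
            continue  # no reply within resend_after seconds, so send again
    return None  # no reply after give_up_after seconds
```

A caller would create the socket with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) and pass the tracker's (host, port) pair as addr.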

Transaction IDs sent in request messages are returned in response messages.

The connection ID returned by the tracker in a connect response message should be sent by the client in later requests. The initial connection ID sent in connect requests should be 0x41727101980 in network byte order.

Action values in requests and responses should be taken from the Actions table.

When a message is followed by a structure labeled repeating:, the rest of the message is zero or more of that structure.

Fields with type [T; N] are N instances of values of type T with no extra padding.

Fields with type [T; NAME] are NAME instances of values of type T with no extra padding, where NAME is an integer field of the same message.

Fields with type [T] are zero or more instances of values of type T with no extra padding, which make up any trailing bytes of the message.

Actions

name      value
connect   0
announce  1
scrape    2
error     3

Events

name       value
none       0
completed  1
started    2
stopped    3
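In code, the Actions and Events tables map naturally onto integer enumerations. A minimal Python sketch, where the class names `Action` and `Event` are my own and not part of the protocol:

```python
from enum import IntEnum


class Action(IntEnum):
    """Values of the action field, from the Actions table."""
    CONNECT = 0
    ANNOUNCE = 1
    SCRAPE = 2
    ERROR = 3


class Event(IntEnum):
    """Values of the announce event field, from the Events table."""
    NONE = 0
    COMPLETED = 1
    STARTED = 2
    STOPPED = 3
```

Using `IntEnum` means the members can be passed directly to `struct.pack` when building messages.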

Error

type  name            description
i32   action
i32   transaction_id
[i8]  error_string    rest of packet is string describing error
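As an illustration, an error message could be decoded like this in Python. The function name `parse_error` is hypothetical, and decoding the string as UTF-8 is an assumption, since the table does not name an encoding.

```python
import struct

ACTION_ERROR = 3  # from the Actions table


def parse_error(packet: bytes):
    """Return (transaction_id, error_string) from an error message."""
    if len(packet) < 8:
        raise ValueError("error message must be at least 8 bytes")
    # Two big-endian i32 values: action, transaction_id.
    action, transaction_id = struct.unpack_from(">ii", packet)
    if action != ACTION_ERROR:
        raise ValueError("not an error message")
    # Assumption: the trailing bytes are text; undecodable bytes are replaced.
    error_string = packet[8:].decode("utf-8", errors="replace")
    return transaction_id, error_string
```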

Connect

Request

type  name
i64   connection_id
i32   action
i32   transaction_id

Response

type  name
i32   action
i32   transaction_id
i64   connection_id
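The connect exchange can be sketched with Python’s struct module. The function names below are hypothetical. The initial connection ID is the 0x41727101980 constant given earlier, and the response is checked against the transaction ID that was sent.

```python
import os
import struct

INITIAL_CONNECTION_ID = 0x41727101980
ACTION_CONNECT = 0  # from the Actions table


def random_transaction_id() -> int:
    """Pick a random value that fits in an i32."""
    return int.from_bytes(os.urandom(4), "big", signed=True)


def build_connect_request(transaction_id: int) -> bytes:
    # i64 connection_id, i32 action, i32 transaction_id, all big-endian.
    return struct.pack(">qii", INITIAL_CONNECTION_ID, ACTION_CONNECT, transaction_id)


def parse_connect_response(packet: bytes, expected_transaction_id: int) -> int:
    """Return the connection ID to use in later requests."""
    if len(packet) < 16:
        raise ValueError("connect response must be at least 16 bytes")
    action, transaction_id, connection_id = struct.unpack_from(">iiq", packet)
    if action != ACTION_CONNECT:
        raise ValueError("unexpected action in connect response")
    if transaction_id != expected_transaction_id:
        raise ValueError("transaction ID does not match the request")
    return connection_id
```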

Announce

Request

type      name            description
i64       connection_id
i32       action
i32       transaction_id
[i8; 20]  info_hash       torrent infohash
[i8; 20]  peer_id         peer ID
i64       downloaded      bytes downloaded this session
i64       left            bytes left to download
i64       uploaded        bytes uploaded this session
i32       event           from Events table
u32       ip              0 to use sender of this UDP packet
u32       key             randomly generated by client, unknown function
i32       num_want        maximum number of peers to send in reply, use -1 for default
u16       port            listening port
u16       extensions
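A sketch of packing the announce request in Python, following the field order of the table above. The function name `build_announce_request` and its keyword defaults are my own choices; extension bodies, if any, would be appended after the returned bytes.

```python
import struct

ACTION_ANNOUNCE = 1  # from the Actions table

# Field order and widths follow the announce request table; ">" means
# big-endian with no padding, for a total of 100 bytes.
ANNOUNCE_REQUEST = struct.Struct(">qii20s20sqqqiIIiHH")


def build_announce_request(connection_id, transaction_id, info_hash, peer_id,
                           downloaded, left, uploaded, event, key, port,
                           ip=0, num_want=-1, extensions=0):
    if len(info_hash) != 20 or len(peer_id) != 20:
        raise ValueError("info_hash and peer_id must be exactly 20 bytes")
    return ANNOUNCE_REQUEST.pack(
        connection_id, ACTION_ANNOUNCE, transaction_id,
        info_hash, peer_id,
        downloaded, left, uploaded,
        event,       # value from the Events table
        ip,          # 0: tracker uses the sender address of the packet
        key,
        num_want,    # -1: let the tracker pick its default
        port,
        extensions,  # bitmask, 0 when no extensions follow
    )
```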

Response

type  name            description
i32   action
i32   transaction_id
i32   interval        seconds to wait before announcing again
i32   leechers        number of peers in swarm that have not finished downloading
i32   seeders         number of peers in swarm that have finished downloading

repeating:

type  name  description
i32   ip    peer IP
i16   port  peer listening port
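The fixed header followed by a repeating peer structure can be parsed as in the following Python sketch. The function name is hypothetical. The table types the port as i16; the sketch reads it as an unsigned 16-bit value so that ports above 32767 come out positive, which is an assumption on my part.

```python
import ipaddress
import struct

HEADER = struct.Struct(">iiiii")  # action, transaction_id, interval, leechers, seeders
PEER_SIZE = 6                     # 4-byte IP followed by 2-byte port


def parse_announce_response(packet: bytes) -> dict:
    if len(packet) < HEADER.size:
        raise ValueError("announce response must be at least 20 bytes")
    action, transaction_id, interval, leechers, seeders = HEADER.unpack_from(packet)
    body = packet[HEADER.size:]
    if len(body) % PEER_SIZE != 0:
        raise ValueError("trailing bytes are not a whole number of peers")
    peers = []
    for offset in range(0, len(body), PEER_SIZE):
        # Assumption: the port is read unsigned (see the note above).
        ip_bytes, port = struct.unpack_from(">4sH", body, offset)
        peers.append((str(ipaddress.IPv4Address(ip_bytes)), port))
    return {
        "action": action,
        "transaction_id": transaction_id,
        "interval": interval,
        "leechers": leechers,
        "seeders": seeders,
        "peers": peers,
    }
```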

Scrape

Request

type  name
i64   connection_id
i32   action
i32   transaction_id

repeating:

type      name
[i8; 20]  info_hash
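A scrape request is the 16-byte header followed by the infohashes back to back. A minimal Python sketch, with a hypothetical function name:

```python
import struct

ACTION_SCRAPE = 2  # from the Actions table


def build_scrape_request(connection_id, transaction_id, info_hashes):
    """Pack the header, then append each 20-byte infohash with no padding."""
    for info_hash in info_hashes:
        if len(info_hash) != 20:
            raise ValueError("each info_hash must be exactly 20 bytes")
    header = struct.pack(">qii", connection_id, ACTION_SCRAPE, transaction_id)
    return header + b"".join(info_hashes)
```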

Response

type  name
i32   action
i32   transaction_id

repeating:

type  name        description
i32   complete    peers in swarm that have finished downloading
i32   downloaded  times torrent has been downloaded
i32   incomplete  peers that have not finished downloading
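Parsing the scrape response follows the same pattern: an 8-byte header, then zero or more 12-byte entries. The function name in this Python sketch is hypothetical.

```python
import struct

ENTRY = struct.Struct(">iii")  # complete, downloaded, incomplete


def parse_scrape_response(packet: bytes):
    """Return (action, transaction_id, entries) from a scrape response."""
    if len(packet) < 8:
        raise ValueError("scrape response must be at least 8 bytes")
    action, transaction_id = struct.unpack_from(">ii", packet)
    body = packet[8:]
    if len(body) % ENTRY.size != 0:
        raise ValueError("trailing bytes are not a whole number of entries")
    entries = []
    for offset in range(0, len(body), ENTRY.size):
        complete, downloaded, incomplete = ENTRY.unpack_from(body, offset)
        entries.append({
            "complete": complete,
            "downloaded": downloaded,
            "incomplete": incomplete,
        })
    return action, transaction_id, entries
```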

Extensions

The extensions field is a bitmask. The following bits are assigned:

name            bit
authentication  1
request string  2

If multiple bits are present in the extension field, the extension bodies are appended to the packet in the order of least significant bit first. For instance, if both bit 1 and 2 are set, the extension represented by bit 1 comes first, followed by the extension represented by bit 2.
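The ordering rule can be sketched as follows in Python. The sketch assumes the values 1 and 2 in the table are mask values that are OR-ed into the extensions field; the function name `encode_extensions` is hypothetical.

```python
AUTHENTICATION = 1  # assumption: mask value, from the table above
REQUEST_STRING = 2  # assumption: mask value, from the table above


def encode_extensions(bodies: dict):
    """Return (extensions_mask, appended_bytes) for a {mask_value: body} mapping.

    Bodies are concatenated in ascending mask order, so the extension with
    the least significant bit comes first.
    """
    mask = 0
    appended = b""
    for bit in sorted(bodies):
        mask |= bit
        appended += bodies[bit]
    return mask, appended
```

The mask goes in the extensions field of the announce request, and the appended bytes follow the request.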

Authentication

The packet will have authentication information appended to it.

passwd_hash is the first eight bytes of sha1(packet || sha1(password)), where packet is the bytes of the packet, less the final 8 bytes that are passwd_hash.

type                   name
i8                     username_length
[i8; username_length]  username
[u8; 8]                passwd_hash
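One reading of the passwd_hash definition, sketched in Python with hashlib. The function name is hypothetical, and two points are assumptions: that sha1(password) means the raw 20-byte digest rather than its hex form, and that the hashed packet includes the username_length and username fields that precede passwd_hash.

```python
import hashlib
import struct


def authentication_body(packet: bytes, username: bytes, password: bytes) -> bytes:
    """Return username_length + username + passwd_hash to append to packet."""
    if len(username) > 127:
        raise ValueError("username_length is an i8, so at most 127 bytes")
    prefix = struct.pack(">b", len(username)) + username
    # Everything that precedes passwd_hash in the final packet.
    hashed_part = packet + prefix
    # Assumption: the inner sha1(password) is the raw digest, not hex text.
    inner = hashlib.sha1(password).digest()
    passwd_hash = hashlib.sha1(hashed_part + inner).digest()[:8]
    return prefix + passwd_hash
```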

Request String

The request string extension is meant to allow torrent creators to pass cookies back to the tracker. This can be useful, for instance, for authenticating that a torrent is allowed to be tracked by a tracker. It could also be used to authenticate users by generating torrents with unique tokens in the tracker URL for each user. The extension body has the following format:

type                  name
i8                    request_length
[i8; request_length]  request_string

request_string is the string that comes after the hostname and port in the UDP tracker URL. Typically this starts with “/announce”. The BitTorrent client is not expected to append query string arguments for stats reporting, like “uploaded” and “downloaded”, since these are already reported in the UDP tracker protocol. However, the client is free to add arguments as extensions.
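As a sketch, the extension body could be derived from a tracker URL like this in Python. The function names and the example URL are hypothetical, and encoding the string as UTF-8 is an assumption.

```python
import struct
from urllib.parse import urlsplit


def request_string(tracker_url: str) -> str:
    """Return the part of the URL after the hostname and port."""
    parts = urlsplit(tracker_url)
    request = parts.path
    if parts.query:
        request += "?" + parts.query
    return request


def request_string_body(tracker_url: str) -> bytes:
    """Return request_length + request_string for the extension body."""
    encoded = request_string(tracker_url).encode("utf-8")  # assumption: UTF-8
    if len(encoded) > 127:
        raise ValueError("request_length is an i8, so at most 127 bytes")
    return struct.pack(">b", len(encoded)) + encoded
```

For example, `request_string("udp://tracker.example.com:6969/announce?token=abc")` yields `/announce?token=abc`.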

Credits

Protocol designed by Olaf van der Spek and extended by Arvid Norberg.

References

This page intentionally left blank.

BitTorrent

Metadata

  • Media RSS Specification: Media RSS is a new RSS module that supplements the capabilities of RSS 2.0. RSS enclosures are already being used to syndicate audio files and images. Media RSS extends enclosures to handle other media types, such as short films or TV, as well as provide additional metadata with the media. Media RSS enables content publishers and bloggers to syndicate multimedia content such as TV and video clips, movies, images and audio.

Cryptography