BEP:44
Title:Storing arbitrary data in the DHT
Version: 4c187001ec214a60cf86de73ada5a2e53353b9b1
Last-Modified:Wed Feb 1 09:41:55 2017 -0800
Author: Arvid Norberg <arvid@bittorrent.com>, Steven Siloti <ssiloti@bittorrent.com>
Status: Draft
Type:Standards Track
Content-Type:text/x-rst
Created:19-Dec-2014
Post-History:06-Aug-2016 (the8472.bep@infinite-source.de), added backoff algorithm for republishing

Abstract

This extension enables storing and retrieving of arbitrary data in the BitTorrent DHT [1]. It supports both storing immutable items, where the key is the SHA-1 hash of the data itself, and mutable items, where the key is the public key of the key pair used to sign the data.

Rationale

The DHT as defined by BEP 5 [1] only provides the ability to store and retrieve IP addresses associated with a hash. However, there are many other types of data which may be useful to store in the DHT. This extension defines a protocol which enables new features to utilize the DHT as a general purpose key/value store.

Terminology

In this document, a storage node refers to the node in the DHT to which an item is being announced and stored on. A requesting node refers to a node which makes look-ups in the DHT to find the storage nodes, to request items from them, and possibly re-announce those items to keep them alive.

Messages

The proposed new messages get and put are similar to the existing get_peers and announce_peer. Responses to get should always include nodes and nodes6. Those fields have the same semantics as in the get_peers response. It should also include a write token, token, with the same semantics as in get_peers. The write token MAY be tied specifically to the key which get requested. I.e. the token can only be used to store values under that one key. It may also be tied to the node ID and IP address of the requesting node.

The id field in these messages has the same semantics as the standard DHT messages, i.e. the node ID of the node sending the message, to maintain the structure of the DHT network.

The token field also has the same semantics as the standard DHT message get_peers and announce_peer, when requesting an item and to write an item respectively.

The k field is the 32 byte ed25519 public key, which the signature can be authenticated with. When looking up a mutable item, the target field MUST be the SHA-1 hash of this key concatenated with the salt, if present.

The distinction between storing mutable and immutable items is the inclusion of a public key, a sequence number, signature and an optional salt (k, seq, sig and salt).

get requests for mutable items and immutable items may not be distinguishable from each other. An implementation can either store mutable and immutable items in the same hash table internally, or in separate ones and potentially do two lookups for get requests.

The v field is the value to be stored. It is allowed to be any bencoded type (list, dict, string or integer). When it's being hashed (for verifying its signature or to calculate its key), its flattened, bencoded, form is used. If the value in a put request contains invalid bencoding (e.g. a dictionary with unsorted keys), the storing node MUST reject the request and SHOULD return an error message with code 203.

Storing nodes MAY reject put requests where the bencoded form of v is longer than 1000 bytes. In other words, it's not safe to assume storing more than 1000 bytes will succeed.

Immutable Items

Immutable items are stored under their SHA-1 hash, and since they cannot be modified, there is no need to authenticate the origin of them. This makes immutable items simple.

A node making a lookup SHOULD verify the data it receives from the network, to verify that its hash matches the target that was looked up.

put message

Request:

{
    "a":
    {
        "id": <20 byte id of sending node (string)>,
        "token": <write-token (string)>,
        "v": <any bencoded type, whose encoded size <= 1000>
    },
    "t": <transaction-id (string)>,
    "y": "q",
    "q": "put"
}

Response:

{
    "r": { "id": <20 byte id of sending node (string)> },
    "t": <transaction-id (string)>,
    "y": "r",
}

get message

Request:

{
    "a":
    {
        "id": <20 byte id of sending node (string)>,
        "target": <SHA-1 hash of item (string)>,
    },
    "t": <transaction-id (string)>,
    "y": "q",
    "q": "get"
}

Response:

{
    "r":
    {
        "id": <20 byte id of sending node (string)>,
        "token": <write token (string)>,
        "v": <any bencoded type whose SHA-1 hash matches 'target'>,
        "nodes": <IPv4 nodes close to 'target'>,
        "nodes6": <IPv6 nodes close to 'target'>
    },
    "t": <transaction-id>,
    "y": "r",
}

Mutable Items

Mutable items can be updated, without changing their DHT keys. To authenticate that only the original publisher can update an item, it is signed by a private key generated by the original publisher. The target ID mutable items are stored under is the SHA-1 hash of the public key (as it appears in the put message).

In order to avoid a malicious node to overwrite the list head with an old version, the sequence number seq must be monotonically increasing for each update, and a node hosting the list node MUST not downgrade a list head from a higher sequence number to a lower one, only upgrade. The sequence number SHOULD not exceed MAX_INT64, (i.e. 0x7fffffffffffffff). A client MAY reject any message with a sequence number exceeding this. A client MAY also reject any message with a negative sequence number.

The signature is a 64 byte ed25519 signature of the bencoded sequence number concatenated with the v key. e.g. something like this:

3:seqi4e1:v12:Hello world!

If the salt key is present and non-empty, the salt string must be included in what's signed. Note that if salt is specified and an empty string, it is as if it was not specified and nothing in addition to the sequence number and the data is signed. The salt string MUST NOT be longer than 64 bytes.

When a salt is included in what is signed, the key salt with the value of the key is prepended in its bencoded form. For example, if salt is "foobar", the buffer to be signed is:

4:salt6:foobar3:seqi4e1:v12:Hello world!

put message

Request:

{
    "a":
    {
        "cas": <optional expected seq-nr (int)>,
        "id": <20 byte id of sending node (string)>,
        "k": <ed25519 public key (32 bytes string)>,
        "salt": <optional salt to be appended to "k" when hashing (string)>
        "seq": <monotonically increasing sequence number (integer)>,
        "sig": <ed25519 signature (64 bytes string)>,
        "token": <write-token (string)>,
        "v": <any bencoded type, whose encoded size < 1000>
    },
    "t": <transaction-id (string)>,
    "y": "q",
    "q": "put"
}

Storing nodes receiving a put request where seq is lower than or equal to what's already stored on the node, MUST reject the request. If the sequence number is equal, and the value is also the same, the node SHOULD reset its timeout counter.

If the sequence number in the put message is lower than the sequence number associated with the currently stored value, the storing node MAY return an error message with code 302 (see error codes below).

Note that this request does not contain a target hash. The target hash under which this blob is stored is implied by the k argument. The key is the SHA-1 hash of the key (k).

In order to support a single key being used to store separate items in the DHT, an optional salt can be specified in the put request of mutable items.

If the salt entry is not present, it can be assumed to be an empty string, and its semantics should be identical as specifying a salt key with an empty string.

The salt can be any binary string (but probably most conveniently a hash of something). This string is appended to the key, as specified in the k field, when calculating the key to store the blob under (i.e. the key get requests specify to retrieve this data).

This lets a single entity, with a single key, publish any number of unrelated items, with a single key that readers can verify. This is useful if the publisher doesn't know ahead of time how many different items are to be published. It can distribute a single public key for users to authenticate the published blobs.

Note that the salt is not returned in the response to a get request. This is intentional. When issuing a get request for an item is expected to know what the salt is (because it is part of what the target ID that is being looked up is derived from). There is no need to repeat it back for bystanders to see.

CAS

CAS is short for compare and swap, it has similar semantics as CAS CPU instructions. It is used to avoid race conditions when multiple nodes are writing to the same slot in the DHT.

The cas field is optional. If present it specifies the sequence number of the data blob being overwritten by the put. When present, the storing node MUST compare this number to the current sequence number it has stored under this key. Only if the cas matches the stored sequence number is the put performed. If it mismatches, the store fails and an error MUST be returned. See Errors below.

The cas field only applies to mutable puts. If there is no current value, the cas field SHOULD be ignored.

When sending a put request to a node that did not return any data for the get, the cas field SHOULD NOT be included.

Response

Response:

{
    "r": { "id": <20 byte id of sending node (string)> },
    "t": <transaction-id (string)>,
    "y": "r",
}

Errors

If the store fails for any reason an error message is returned instead of the message template above, i.e. one where "y" is "e" and "e" is a tuple of [error-code, message]). Failures include cas mismatches and the sequence number is outdated.

The error message (as specified by BEP 5 [1]) looks like this:

{
    "e": [ <error-code (integer)>, <error-string (string)> ],
    "t": <transaction-id (string)>,
    "y": "e",
}

In addition to the error codes defined in BEP 5, this specification defines some additional error codes.

error-code description
205 message (v field) too big.
206 invalid signature
207 salt (salt field) too big.
301 the CAS hash mismatched, re-read value and try again.
302 sequence number less than current.

An implementation MUST emit 301 errors if the cas mismatches. This is a critical feature in synchronization of multiple agents sharing a mutable item.

get message

Request:

{
    "a":
    {
        "id": <20 byte id of sending node (string)>,
        "seq": <optional sequence number (integer)>,
        "target:" <20 byte SHA-1 hash of public key and salt (string)>
    },
    "t": <transaction-id (string)>,
    "y": "q",
    "q": "get"
}

The optional seq field specifies that an item's value should only be sent if its sequence number is greater than the given value. If a stored item exists but its sequence number is less than or equal to the seq field then the k, v, and sig fields SHOULD be omitted from the response.

Response:

{
    "r":
    {
        "id": <20 byte id of sending node (string)>,
        "k": <ed25519 public key (32 bytes string)>,
        "nodes": <IPv4 nodes close to 'target'>,
        "nodes6": <IPv6 nodes close to 'target'>,
        "seq": <monotonically increasing sequence number (integer)>,
        "sig": <ed25519 signature (64 bytes string)>,
        "token": <write-token (string)>,
        "v": <any bencoded type, whose encoded size <= 1000>
    },
    "t": <transaction-id (string)>,
    "y": "r",
}

Signature Verification

In order to make it maximally difficult to attack the bencoding parser, signing and verification of the value and sequence number should be done as follows:

  1. Encode value and sequence number separately.
  2. Concatenate ("4:salt" length-of-salt ":" salt) "3:seqi" seq "e1:v" len ":" and the encoded value. Sequence number 1 of value "Hello World!" would be converted to: "3:seqi1e1:v12:Hello World!". In this way it is not possible to convince a node that part of the length is actually part of the sequence number even if the parser contains certain bugs. Furthermore it is not possible to have a verification failure if a bencoding serializer alters the order of entries in the dictionary. The salt is in parenthesis because it is optional. It is only prepended if a non-empty salt is specified in the put request.
  3. Sign or verify the concatenated string.

On the storage node, the signature MUST be verified before accepting the store command. The data MUST be stored under the SHA-1 hash of the public key (as it appears in the bencoded dict) and the salt (if present).

On the requesting nodes, the key they get back from a get request MUST be verified to hash to the target ID the lookup was made for, as well as verifying the signature. If any of these fail, the response SHOULD be considered invalid.

Expiration

Without re-announcement, these items MAY expire in 2 hours. In order to keep items alive, they SHOULD be re-announced once an hour.

Any node that's interested in keeping a blob in the DHT alive may announce it. It would simply repeat the signature for a mutable put without having the private key. This allows data to persist even when the initial publisher goes away or only comes online when replacing it with new data.

To reduce write traffic subscribers do not need to republish the value if all of the following conditions are met during a get lookup:

  1. They find more than 8 copies of the data
  2. The 8 nodes closest to the target key which are eligible for a store all have indicated they have the data, either by returning it or through the seq number.
  3. For mutable items only nodes holding values with the most recent known sequence number count towards meeting these conditions

The assumption here is that implementation inaccuracies, churn and packet drops lead different nodes seeing slightly different sets of closest nodes and thus publish to different nodes. Therefore an up-to-date set of closest nodes plus an excess of stored values beyond that set indicates that there are multiple republishers.

Subscribers may also persist the values to permanent storage to republish them after restarts.

Test Vectors

test 1 (mutable)

value:

12:Hello World!

buffer being signed:

3:seqi1e1:v12:Hello World!

public key:

77ff84905a91936367c01360803104f92432fcd904a43511876df5cdf3e7e548

private key:

e06d3183d14159228433ed599221b80bd0a5ce8352e4bdf0262f76786ef1c74d
b7e7a9fea2c0eb269d61e3b38e450a22e754941ac78479d6c54e1faf6037881d

target ID:

4a533d47ec9c7d95b1ad75f576cffc641853b750

signature:

305ac8aeb6c9c151fa120f120ea2cfb923564e11552d06a5d856091e5e853cff
1260d3f39e4999684aa92eb73ffd136e6f4f3ecbfda0ce53a1608ecd7ae21f01

test 2 (mutable with salt)

value:

12:Hello World!

salt:

foobar

buffer being signed:

4:salt6:foobar3:seqi1e1:v12:Hello World!

public key:

77ff84905a91936367c01360803104f92432fcd904a43511876df5cdf3e7e548

private key:

e06d3183d14159228433ed599221b80bd0a5ce8352e4bdf0262f76786ef1c74d
b7e7a9fea2c0eb269d61e3b38e450a22e754941ac78479d6c54e1faf6037881d

target ID:

411eba73b6f087ca51a3795d9c8c938d365e32c1

signature:

6834284b6b24c3204eb2fea824d82f88883a3d95e8b4a21b8c0ded553d17d17d
df9a8a7104b1258f30bed3787e6cb896fca78c58f8e03b5f18f14951a87d9a08

test 3 (immutable)

value:

12:Hello World!

target ID:

e5f96f6f38320f0f33959cb4d3d656452117aadb

Resources

This document was derived heavily from the documentation of the extension included in libtorrent [2]. In many places text was simply copied and modified.

Libraries that implement ed25519 DSA:

References

[1](1, 2, 3) BEP_0005. DHT Protocol (http://www.bittorrent.org/beps/bep_0005.html)
[2]BitTorrent extension for arbitrary DHT store, Arvid Norberg (http://www.libtorrent.org/dht_store.html)