BEP: | 45 |
---|---|
Title: | Multiple-address operation for the BitTorrent DHT |
Version: | 9c5c1dd1b372016e05af84fb34fccac6752ef54a |
Last-Modified: | Thu Jul 21 10:45:38 2016 -0400 |
Author: | The 8472 <the8472.bep@infinite-source.de> |
Status: | Draft |
Type: | Informational |
Content-Type: | text/x-rst |
Created: | 26-Jun-2015 |
Post-History: |
This BEP provides implementation advice for clients that wish to operate a DHT node (BEP 5) on a host that has multiple globally routable unicast addresses.
In large parts it is based on practical experience operating such a node with 256+ IPv4 and IPv6 addresses.
socket address: in this document generally refers to a socket bound to a distinct, specific address and port and thus is used interchangeably with <IP:Port> pair and the DHT state/behavior associated with it.
node: a DHT peer, which uses one or more socket addresses. Remote nodes are assumed to be interchangeable with a single socket address on their end.
A node operator might choose to run a DHT node on multiple addresses for the purpose of source-specific multi-path routing [1] or for research purposes to sample a larger fraction of DHT traffic.
In general any DHT node using multiple socket addresses should appear to the outside world as separate, independent DHT nodes. Exceptions to this rule will be outlined in a later section.
This spares other implementations from having to deal with any idiosyncrasies that might otherwise arise from a single node sending from multiple socket addresses.
The easiest way to achieve this is to run multiple independent DHT nodes, each bound to a separate address, and, if needed, to perform separate announces through each of them.
The most important constraints are as follows:
Each socket address must have a distinct Node ID.
Rationale: To an external observer, a shared node ID would look like a node constantly changing its IP address (e.g. due to an unreliable domestic internet connection or a spoofing attack) or its port (pathological NAT implementations). The observer would therefore deem the node unreliable and evict it from its routing table.
Node IDs should not only be distinct but also widely separated in XOR distance, i.e. differ in some of their high-order bits. This can be most easily achieved by generating a random node ID for each socket address. Alternatively, an implementation can derive new IDs from a base ID by incrementing the base ID in reverse bit order, i.e. by flipping the high-order bits first. Care should be taken to avoid counter reuse across multiple active sockets.
Rationale: Failure of a multi-homed node whose IDs only differ in low-order bits could take a small slice of the keyspace with it into oblivion, thus violating redundancy of the DHT.
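The derivation of IDs from a base ID can be sketched as follows. This is only one possible reading of "incrementing in reverse bit order": a per-socket counter is bit-reversed across the 160-bit ID width and XORed into the base ID, so small counter values change the highest bits first. The names `reverse_bits` and `derive_node_id` are hypothetical, and the scheme is an assumption rather than a prescribed algorithm.

```python
import os

ID_BITS = 160  # BEP 5 node IDs are 160 bits long


def reverse_bits(value: int, width: int) -> int:
    """Return `value` with its lowest `width` bits mirrored."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def derive_node_id(base_id: bytes, counter: int) -> bytes:
    """Derive a node ID by XORing a bit-reversed counter into the base ID.

    Counter 0 returns the base ID unchanged, counter 1 flips the most
    significant bit, counter 2 the second most significant bit, and so on.
    This XOR construction is an illustrative assumption.
    """
    if len(base_id) * 8 != ID_BITS:
        raise ValueError("base ID must be 20 bytes")
    if not 0 <= counter < (1 << ID_BITS):
        raise ValueError("counter out of range")
    mask = reverse_bits(counter, ID_BITS)
    derived = int.from_bytes(base_id, "big") ^ mask
    return derived.to_bytes(ID_BITS // 8, "big")


# One distinct counter value per active socket address.
base = os.urandom(20)
ids = [derive_node_id(base, counter) for counter in range(4)]
```

With four sockets the derived IDs cover all four combinations of the two highest bits, so each ID lands in a different quarter of the keyspace. Each active socket needs its own counter value, otherwise two sockets would end up with the same ID.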
Responses must be sent from the same socket address on which the corresponding request was received.
Write-tokens must only be sent from the same socket on which they were received.
Implementations should avoid querying the same targets from multiple sockets at once as it may cause unnecessary load spikes for remote nodes. This becomes more important as the number of socket addresses increases. Temporally spacing lookups to the same target keys may accomplish this.
Nodes should avoid utilizing multiple ports on a single IP address.
Rationale: Other nodes might throttle or implement other sanitizing behavior based on IP.
Stipulations regarding socket binding from BEP 32 apply, i.e. nodes should avoid binding sockets to the unspecified address (0.0.0.0 or ::/128).
For IPv4 autoconfiguration an implementation may choose the following strategies:
If global unicast addresses are available, bind to all of them.
If no global unicast address is available, determine a local address that can be used to connect to a randomly chosen global unicast address and bind to it, then bootstrap in single-address mode.
Optionally, once bootstrapping has completed and nodes supporting the BEP 42 ip field have been found, those nodes can be used to determine the external addresses of the other available local addresses. The implementation can then switch to multi-address mode if additional external addresses are available.
Implementation Note: A routed local address can be obtained by calling connect() on an unbound UDP socket with a global unicast address as target and then reading the local bind address via getsockname().
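A minimal sketch of the IPv4 strategy and the connect()/getsockname() technique, under a few assumptions: the list of local addresses is supplied by the caller (interface enumeration is platform-specific and omitted), the probe target is a fixed, arbitrarily chosen global address rather than a random one, and the function names are hypothetical.

```python
import ipaddress
import socket
from typing import List, Optional, Tuple

# Assumption: any global unicast address serves as a probe target. A UDP
# connect() only selects a route and local address; it sends no packet.
PROBE_TARGET = ("8.8.8.8", 53)


def routed_local_address(target: Tuple[str, int] = PROBE_TARGET) -> Optional[str]:
    """Return the local IPv4 address the OS would use to reach `target`."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.connect(target)
        except OSError:
            return None  # no usable route to the target
        return sock.getsockname()[0]


def choose_ipv4_bind_addresses(local_addresses: List[str]) -> List[str]:
    """Prefer all global unicast addresses, else fall back to one routed address."""
    global_addrs = [
        addr for addr in local_addresses
        if ipaddress.IPv4Address(addr).is_global
    ]
    if global_addrs:
        return global_addrs  # multi-address mode
    fallback = routed_local_address()
    return [fallback] if fallback else []  # single-address mode
```

When the fallback path is taken, the returned address is typically a private one behind a NAT, which is why the node starts out in single-address mode.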
For IPv6 autoconfiguration nodes should space IPv6 addresses as widely as they can reasonably afford.
Rationale: Other nodes might treat an IPv6 /64 subnet as a single entity for the purpose of detecting spam or malicious behavior, and thus limit the services they offer to nodes within such a block or the number of entries from it that they permit in their routing table.
Rationale 2: If source-based multi-path routing is the goal then it is unlikely that multiple addresses within a /64 will be routed differently.
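One simple way to approximate this spacing, assuming the candidate addresses are already known, is to keep at most one address per /64 prefix. The helper name below is hypothetical, and the example uses documentation-range addresses.

```python
import ipaddress
from typing import Dict, Iterable, List


def one_address_per_slash64(addresses: Iterable[str]) -> List[str]:
    """Keep the first address seen in each /64, preserving input order."""
    chosen: Dict[ipaddress.IPv6Network, str] = {}
    for text in addresses:
        addr = ipaddress.IPv6Address(text)
        # strict=False masks the host bits instead of raising an error.
        prefix = ipaddress.IPv6Network((addr, 64), strict=False)
        chosen.setdefault(prefix, text)
    return list(chosen.values())


candidates = ["2001:db8:0:1::10", "2001:db8:0:1::11", "2001:db8:0:2::10"]
selected = one_address_per_slash64(candidates)
```

Here the second address is dropped because it shares a /64 with the first. An operator with a larger allocation could apply the same grouping at a shorter prefix length to space addresses even further apart.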
Ultimately the exact network configuration of a multi-homed setup is hard to predict; therefore a user should be able to explicitly configure which network interfaces, prefixes or socket addresses a node should bind to. Documentation mirroring these guidelines may also aid users in configuring their network.
Implementations may want to apply some heuristic based on responses and incoming unsolicited requests to determine whether individual socket addresses are reachable. This information can be used to ignore timeouts which occur on non-connected socket addresses for the purpose of routing table maintenance. In other words routing table entries should not be penalized for timeouts on a link known to be down when they are still reachable over other links.
Additionally nodes may preferentially schedule data lookups on known-good socket addresses while performing maintenance pings on all socket addresses - including bad ones - in the hope that they eventually become reachable again.
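The following sketch shows one possible heuristic, assuming that a local socket address counts as reachable if any packet (response or unsolicited request) arrived on it recently. The class name, method names and the 120-second window are hypothetical choices, not values taken from this document.

```python
import time
from typing import Dict, Optional, Tuple

SocketAddress = Tuple[str, int]
REACHABLE_WINDOW = 120.0  # seconds; an arbitrary illustrative value


class ReachabilityTracker:
    """Tracks when each local socket address last received any packet."""

    def __init__(self, window: float = REACHABLE_WINDOW) -> None:
        self.window = window
        self._last_received: Dict[SocketAddress, float] = {}

    def packet_received(self, local: SocketAddress,
                        now: Optional[float] = None) -> None:
        """Record an incoming response or unsolicited request on `local`."""
        self._last_received[local] = time.monotonic() if now is None else now

    def is_reachable(self, local: SocketAddress,
                     now: Optional[float] = None) -> bool:
        """True if `local` received a packet within the window."""
        now = time.monotonic() if now is None else now
        last = self._last_received.get(local)
        return last is not None and now - last <= self.window

    def should_penalize_timeout(self, local: SocketAddress,
                                now: Optional[float] = None) -> bool:
        """Only count a timeout against a routing table entry if the local
        socket address it was sent from appears to be reachable."""
        return self.is_reachable(local, now)
```

A lookup scheduler could use `is_reachable` to prefer known-good socket addresses, while maintenance pings continue on all of them so that a recovered link is noticed.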
Since remote nodes may restrict the validity of write tokens to the IP address to which they have been issued, a node cannot simply announce to other nodes from multiple socket addresses. It has to perform new lookups to obtain new tokens for each of them.
It can, however, reuse the set of responding nodes from a lookup on one socket address as the candidate set for a lookup on another socket address, which may significantly accelerate the second lookup.
As mentioned earlier, individual lookups should be performed with some delay between each to avoid hammering the same nodes from multiple sockets.
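The announce procedure described above can be sketched as a loop over the local socket addresses. The `lookup` callable is a hypothetical stand-in for an implementation's own routine: it is assumed to perform a lookup from the given local socket address seeded with the candidate nodes, obtain fresh write tokens, announce, and return the set of remote nodes that responded. The 30-second spacing is likewise an arbitrary illustrative value.

```python
import time
from typing import Callable, Dict, Iterable, Set, Tuple

SocketAddress = Tuple[str, int]
# Hypothetical signature: (local socket, info hash, candidates) -> responders
Lookup = Callable[[SocketAddress, bytes, Set[SocketAddress]], Set[SocketAddress]]


def announce_from_all(
    sockets: Iterable[SocketAddress],
    info_hash: bytes,
    lookup: Lookup,
    sleep: Callable[[float], None] = time.sleep,
    spacing: float = 30.0,
) -> Dict[SocketAddress, Set[SocketAddress]]:
    """Announce from each socket in turn, reusing responders as candidates."""
    candidates: Set[SocketAddress] = set()
    results: Dict[SocketAddress, Set[SocketAddress]] = {}
    for index, local in enumerate(sockets):
        if index > 0:
            sleep(spacing)  # space out lookups that hit the same remote nodes
        # Tokens are not shared: every socket performs its own lookup, but it
        # starts from the nodes that responded to earlier sockets.
        responders = lookup(local, info_hash, set(candidates))
        results[local] = responders
        candidates |= responders
    return results
```

Passing `sleep` as a parameter keeps the sketch testable; an event-driven implementation would schedule the next lookup with a timer instead of blocking.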
Of course the announced addresses should be consistent with the ones on which the underlying BitTorrent client can accept connections. The easiest way to achieve that is to have the BitTorrent client listen for peer connections on the unspecified address.
Other types of write operations, such as BEP 44 put, may not benefit from being performed on more than one socket address at a given time.
It is therefore sufficient to schedule them on a randomly selected (active) socket address each time they are to be performed.
The traffic a DHT node receives is roughly proportional to its number of socket addresses and also grows with uptime, especially once reaching 24 hours of uptime.
At some point various stateful network components (e.g. firewalls, NATs) may become overwhelmed by the UDP traffic flowing to potentially millions of different remote addresses and start dropping packets.
To prevent these issues, such components should be put into a stateless configuration, at least for any traffic flowing to and from the local ports used by the DHT node.
For firewalls this requires exempting those ports from connection tracking; for NATs, additional static forwarding rules may be required.
[1] | "Source-specific routing", Boutier, Chroboczek (http://arxiv.org/pdf/1403.0445.pdf) |
[2] | "Kademlia: A peer-to-peer information system based on the xor metric", Maymounkov, Mazieres (http://pdos.csail.mit.edu/~petar/papers/maymounkov-kademlia-lncs.pdf) |