--- 1/draft-ietf-p2psip-self-tuning-11.txt 2014-06-08 02:14:26.429997243 -0700 +++ 2/draft-ietf-p2psip-self-tuning-12.txt 2014-06-08 02:14:26.477998426 -0700 @@ -1,19 +1,19 @@ P2PSIP Working Group J. Maenpaa Internet-Draft G. Camarillo Intended status: Standards Track Ericsson -Expires: November 10, 2014 May 9, 2014 +Expires: December 10, 2014 June 8, 2014 Self-tuning Distributed Hash Table (DHT) for REsource LOcation And Discovery (RELOAD) - draft-ietf-p2psip-self-tuning-11.txt + draft-ietf-p2psip-self-tuning-12.txt Abstract REsource LOcation And Discovery (RELOAD) is a peer-to-peer (P2P) signaling protocol that provides an overlay network service. Peers in a RELOAD overlay network collectively run an overlay algorithm to organize the overlay, and to store and retrieve data. This document describes how the default topology plugin of RELOAD can be extended to support self-tuning, that is, to adapt to changing operating conditions such as churn and network size. @@ -26,21 +26,21 @@ Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." - This Internet-Draft will expire on November 10, 2014. + This Internet-Draft will expire on December 10, 2014. Copyright Notice Copyright (c) 2014 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents @@ -57,39 +57,40 @@ 3. Introduction to Stabilization in DHTs . . . . . . . . . . . . 5 3.1. Reactive vs. Periodic Stabilization . . . . . . . . . . . 5 3.2. Configuring Periodic Stabilization . . . . . . . . . . . 6 3.3. Adaptive Stabilization . . . . . . . . . . . . . . . . . 7 4. Introduction to Chord . . . . . . . . . . . . . . . . . . . . 8 5. Extending Chord-reload to Support Self-tuning . . . . . . . . 9 5.1. Update Requests . . . . . . . . . . . . . . . . . . . . . 10 5.2. Neighbor Stabilization . . . . . . . . . . . . . . . . . 10 5.3. Finger Stabilization . . . . . . . . . . . . . . . . . . 11 5.4. Adjusting Finger Table Size . . . . . . . . . . . . . . . 11 - 5.5. Detecting Partitioning . . . . . . . . . . . . . . . . . 12 + 5.5. Detecting Partitioning . . . . . . . . . . . . . . . . . 11 5.6. Leaving the Overlay . . . . . . . . . . . . . . . . . . . 12 6. Self-tuning Chord Parameters . . . . . . . . . . . . . . . . 12 6.1. Estimating Overlay Size . . . . . . . . . . . . . . . . . 12 6.2. Determining Routing Table Size . . . . . . . . . . . . . 13 6.3. Estimating Failure Rate . . . . . . . . . . . . . . . . . 13 6.3.1. Detecting Failures . . . . . . . . . . . . . . . . . 14 6.4. Estimating Join Rate . . . . . . . . . . . . . . . . . . 15 6.5. Estimate Sharing . . . . . . . . . . . . . . . . . . . . 15 6.6. Calculating the Stabilization Interval . . . . . . . . . 17 - 7. Overlay Configuration Document Extension . . . . . . . . . . 17 + 7. Overlay Configuration Document Extension . . . . . . . . . . 18 8. Security Considerations . . . . . . . . . . . . . . . . . . . 18 - 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 18 - 9.1. Message Extensions . . . . . . . . . . . . . . . . . . . 18 + 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 19 + 9.1. Message Extensions . . . . . . . . . . . . . . . . . . . 19 + 9.2. A New IETF XML Registry . . . . . . . . . . . . . . . . . 19 10. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 19 11. References . . . . . . . . . . . . . . . . . . . . . . . . . 19 11.1. Normative References . . . . . . . . . . . . . . . . . . 19 - 11.2. Informative References . . . . . . . . . . . . . . . . . 19 - Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 21 + 11.2. Informative References . . . . . . . . . . . . . . . . . 20 + Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 22 1. Introduction REsource LOcation And Discovery (RELOAD) [RFC6940] is a peer-to-peer signaling protocol that can be used to maintain an overlay network, and to store data in and retrieve data from the overlay. For interoperability reasons, RELOAD specifies one overlay algorithm, called chord-reload, that is mandatory to implement. This document extends the chord-reload algorithm by introducing self-tuning behavior. @@ -148,24 +149,24 @@ on an identifier circle of size 2^numBitsInNodeId. This identifier circle is called the Chord ring. On the Chord ring, the responsibility for a key k is assigned to the node whose identifier equals to or immediately follows k. Finger Table: A data structure with up to (but typically less than) numBitsInNodeId entries maintained by each peer in a Chord-based overlay. The ith entry in the finger table of peer n contains the identity of the first peer that succeeds n by at least 2^(numBitsInNodeId-i) on the Chord ring. This peer is called the - ith finger of peer n. As an example, the first entry in the finger - table of peer n contains a peer half-way around the Chord ring - from peer n. The purpose of the finger table is to accelerate - lookups. + ith finger of peer n. As an example, the first entry in the + finger table of peer n contains a peer half-way around the Chord + ring from peer n. The purpose of the finger table is to + accelerate lookups. n.id: An abbreviation that is in this document used refer to the Node-ID of peer n. O(g(n)): Informally, saying that some equation f(n) = O(g(n)) means that f(n) is less than some constant multiple of g(n). For the formal definition, please refer to [weiss1998]. Omega(g(n)): Informally, saying that some equation f(n) = Omega(g(n)) means that f(n) is more than some constant multiple of @@ -278,22 +279,22 @@ As an example, to configure the Chord DHT algorithm, one needs to select values for the following parameters: size of successor list, stabilization interval, and size of the finger table. To select an appropriate value for the stabilization interval, one needs to know the expected churn rate and overlay size. According to [liben-nowell2002], a Chord network in a ring-like state remains in a ring-like state as long as peers send Omega(square(log(N))) messages before N new peers join or N/2 peers fail. Thus, in a 500-peer overlay churning at a rate such that one peer joins and one peer leaves the network every 30 seconds, an appropriate stabilization - interval would be on the order of 93s. According to [Chord], the size - of the successor list and finger table should be on the order of + interval would be on the order of 93s. According to [Chord], the + size of the successor list and finger table should be on the order of log(N). Already a successor list of a modest size (e.g., log2(N) or 2*log2(N), which is the successor list size used in [Chord]) makes it very unlikely that a peer will lose all of its successors, which would cause the Chord ring to become disconnected. Thus, in a 500-peer network each peer should maintain on the order of nine successors and fingers. However, if the churn rate doubles and the network size remains unchanged, the stabilization rate should double as well. That is, the appropriate maintenance interval would now be on the order of 46s. On the other hand, if the churn rate becomes e.g. six-fold and the size of the network grows to 2000 peers, on the @@ -344,25 +345,24 @@ each peer n maintains a data structure with up to numBitsInNodeId entries, called the finger table. The first entry in the finger table of peer n contains the peer half-way around the ring from peer n. The second entry contains the peer that is 1/4th of the way around, the third entry the peer that is 1/8th of the way around, etc. In other words, the ith entry in the finger table at peer n contains the identity of the first peer s that succeeds n by at least 2^(numBitsInNodeId-i) on the Chord ring. This peer is called the ith finger of peer n. The interval between two consecutive fingers is called a finger interval. The ith finger interval of peer n covers - the range [n.id + 2^(numBitsInNodeId-i), n.id + - 2^(numBitsInNodeId-i+1)) on the Chord ring. In an N-peer network, - each peer maintains information about O(log(N)) other peers in its - finger table. As an example, if N=100000, it is sufficient to - maintain 17 fingers. + the range [n.id + 2^(numBitsInNodeId-i), n.id + 2^(numBitsInNodeId- + i+1)) on the Chord ring. In an N-peer network, each peer maintains + information about O(log(N)) other peers in its finger table. As an + example, if N=100000, it is sufficient to maintain 17 fingers. Chord needs all peers' successor pointers to be up to date in order to ensure that lookups produce correct results as the set of participating peers changes. To achieve this, peers run a stabilization protocol periodically in the background. The stabilization protocol of the original Chord algorithm uses two operations: successor stabilization and finger stabilization. However, the Chord algorithm of RELOAD base defines two additional stabilization components, as will be discussed below. @@ -520,38 +520,38 @@ Similarly, if a given predecessor identifies better successors by investigating the successor list it receives from the leaving peer, it MUST send Attach requests to them. 6. Self-tuning Chord Parameters This section specifies how to determine an appropriate stabilization rate and routing table size in an adaptive fashion. The proposed mechanism is based on [mahajan2003], [liben-nowell2002], and [ghinita2006]. To calculate an appropriate stabilization rate, the - values of three parameters MUST be estimated: overlay size N, failure + values of three parameters must be estimated: overlay size N, failure rate U, and join rate L. To calculate an appropriate routing table size, the estimated network size N can be used. Peers in the overlay MUST re-calculate the values of the parameters to self-tune the chord-reload algorithm at the end of each stabilization period before re-starting the stabilization timer. 6.1. Estimating Overlay Size Techniques for estimating the size of an overlay network have been proposed for instance in [mahajan2003], [horowitz2003], [kostoulas2005], [binzenhofer2006], and [ghinita2006]. In Chord, the density of peer identifiers in the neighborhood set can be used to produce an estimate of the size of the overlay, N [mahajan2003]. Since peer identifiers are picked randomly with uniform probability from the numBitsInNodeId-bit identifier space, the average distance - between peer identifiers in the successor set is (2^numBitsInNodeId)/ - N. + between peer identifiers in the successor set is + (2^numBitsInNodeId)/N. To estimate the overlay network size, a peer MUST compute the average inter-peer distance d between the successive peers starting from the most distant predecessor and ending to the most distant successor in the successor list. The estimated network size MUST be calculated as: 2^numBitsInNodeId N = ------------------- d @@ -605,22 +605,22 @@ smaller than K, the estimate MUST be computed as if there was a failure at the current time. It has been shown that an estimate calculated in a similar manner is accurate within 17% of the real value of U [ghinita2006]. The size of the failure history K affects the accuracy of the estimate of U. One can increase the accuracy by increasing K. However, this has the side effect of decreasing responsiveness to changes in the failure rate. On the other hand, a small history size may cause a peer to overreact each time a new failure occurs. In - [ghinita2006], K is set 25% of the routing table size. Use of this - approach is RECOMMENDED. + [ghinita2006], K is set to 25% of the routing table size. Use of + this approach is RECOMMENDED. 6.3.1. Detecting Failures A new failure MUST be inserted to the failure history in the following cases: 1. A Leave request is received from a neigbhor. 2. A peer fails to reply to a Ping request sent in the situation explained below. If no packets have been received on a @@ -681,27 +681,25 @@ 6.5. Estimate Sharing To improve the accuracy of network size, join rate, and leave rate estimates, peers MUST share their estimates. When the stabilization timer fires, a peer MUST select number-of-peers-to-probe random peers from its finger table and send each of them a Probe request. The targets of Probe requests are selected from the finger table rather than from the neighbor table since neighbors are likely to make similar errors when calculating their estimates. number-of-peers-to- probe is a new element in the overlay configuration document. It is - defined in Section 7 and has a default value of 4. Both the Probe - request and the answer returned by the target peer MUST contain a new - message extension whose MessageExtensionType is 'self_tuning_data'. - - This extension type is defined in Section 9.1. The - extension_contents field of the MessageExtension structure MUST - contain a SelfTuningData structure: + defined in Section 7. Both the Probe request and the answer returned + by the target peer MUST contain a new message extension whose + MessageExtensionType is 'self_tuning_data'. This extension type is + defined in Section 9.1. The extension_contents field of the + MessageExtension structure MUST contain a SelfTuningData structure: struct { uint32 network_size; uint32 join_rate; uint32 leave_rate; } SelfTuningData; The contents of the SelfTuningData structure are as follows: network_size @@ -726,20 +724,35 @@ contain the value 'self_tuning_data'. The 'critical' field of the structure MUST be set to False. A peer MUST store all estimates it receives in Probe requests and answers during a stabilization interval. When the stabilization timer fires, the peer MUST calculate the estimates to be used during the next stabilization interval by taking the 75th percentile (i.e., third quartile) of a data set containing its own estimate and the received estimates. + The default value for number-of-peers-to-probe is 4. This default + value is recommended to allow a peer to receive a sufficiently large + set of estimates from other peers. With a value of 4, a peer + receives four estimates in Probe answers. On the average, each peer + also receives four Probe requests each carrying an estimate. Thus, + on the average, each peer has nine estimates (including its own) that + it can use at the end of the stablization interval. A value smaller + than 4 is NOT RECOMMENDED to keep the number of received estimates + high enough. As an example, if the value were 2, there would be + peers in the overlay that would only receive two estimates during a + stabilization interval. Such peers would only have three estimates + available at the end of the interval, which may not be reliable + enough since even a single exceptionally high or low estimate can + have a large impact. + 6.6. Calculating the Stabilization Interval According to [liben-nowell2002], a Chord network in a ring-like state remains in a ring-like state as long as peers send Omega(square(log(N))) messages before N new peers join or N/2 peers fail. We can use the estimate of peer failure rate, U, to calculate the time Tf in which N/2 peers fail: 1 Tf = ------ @@ -775,82 +788,96 @@ 7. Overlay Configuration Document Extension This document extends the RELOAD overlay configuration document by adding one new element, "number-of-peers-to-probe", inside each "configuration" element. self-tuning:number-of-peers-to-probe: The number of fingers to which Probe requests are sent to obtain their network size, join rate, and leave rate estimates. The default value is 4. - This new element is formally defined as follows: + The Relax NG Grammar for this element is: namespace self-tuning = "urn:ietf:params:xml:ns:p2p:self-tuning" parameter &= element self-tuning:number-of-peers-to-probe { - xsd:unsignedInt } + xsd:unsignedInt }? This namespace is added into the element in the overlay configuration file. 8. Security Considerations In the same way as malicious or compromised peers implementing the RELOAD base protocol [RFC6940] can advertise false network metrics or distribute false routing table information for instance in RELOAD Update messages, malicious peers implementing this specification may share false join rate, leave rate, and network size estimates. For such attacks, the same security concerns apply as in the RELOAD base specification. In addition, as long as the amount of malicious peers in the overlay remains modest, the statistical mechanisms applied in Section 6.5 (i.e., the use of 75th percentiles) to process the shared - estimates a peer obtains help ensuring that estimates that are - clearly different from (i.e., larger or smaller than) other received + estimates a peer obtains help ensure that estimates that are clearly + different from (i.e., larger or smaller than) other received estimates will not significantly influence the process of adapting the stabilization interval and routing table size. However, it should be noted that if an attacker is able to impersonate a high number of other peers in the overlay in strategic locations, it may be able to send a high enough number of false estimates to a victim and therefore influence the victim's choice of a stabilization interval. 9. IANA Considerations 9.1. Message Extensions This document introduces one additional extension to the "RELOAD Extensions" Registry: +------------------+-------+---------------+ | Extension Name | Code | Specification | +------------------+-------+---------------+ - | self_tuning_data | 3 | RFC-AAAA | + | self_tuning_data | 0x3 | RFC-AAAA | +------------------+-------+---------------+ The contents of the extension are defined in Section 6.5. Note to RFC Editor: please replace AAAA with the RFC number for this specification. +9.2. A New IETF XML Registry + + This document registers one new URI for the self-tuning namespace in + the IETF XML registry defined in [RFC3688]. + + URI: urn:ietf:params:xml:ns:p2p:self-tuning + + Registrant Contact: The IESG + + XML: N/A, the requested URI is an XML namespace + 10. Acknowledgments The authors would like to thank Jani Hautakorpi for his contributions to the document. The authors would also like to thank Carlos Bernardos and Martin Durst for their comments on the document. 11. References 11.1. Normative References [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. + [RFC3688] Mealling, M., "The IETF XML Registry", BCP 81, RFC 3688, + January 2004. + [RFC5245] Rosenberg, J., "Interactive Connectivity Establishment (ICE): A Protocol for Network Address Translator (NAT) Traversal for Offer/Answer Protocols", RFC 5245, April 2010. [RFC5389] Rosenberg, J., Mahy, R., Matthews, P., and D. Wing, "Session Traversal Utilities for NAT (STUN)", RFC 5389, October 2008. [RFC6940] Jennings, C., Lowekamp, B., Rescorla, E., Baset, S., and