[Docs] [txt|pdf|xml|html] [Tracker] [Email] [Nits]

Versions: 00 01 02 03 04 05

Network Working Group                                           N. Barry
Internet-Draft                            Stellar Development Foundation
Intended status: Experimental                                D. Mazieres
Expires: September 6, 2018                           Stanford University
                                                              J. McCaleb
                                          Stellar Development Foundation
                                                                 S. Polu
                                                             Stripe Inc.
                                                           March 5, 2018


                  The Stellar Consensus Protocol (SCP)
                      draft-mazieres-dinrg-scp-00

Abstract

   SCP is a protocol for Internet-level Byzantine agreement.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 6, 2018.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of




Barry, et al.           Expires September 6, 2018               [Page 1]


Internet-Draft                     scp                        March 2018


   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
   2.  The Model . . . . . . . . . . . . . . . . . . . . . . . . . .   3
     2.1.  Configuration . . . . . . . . . . . . . . . . . . . . . .   3
     2.2.  Input and output  . . . . . . . . . . . . . . . . . . . .   3
   3.  Protocol  . . . . . . . . . . . . . . . . . . . . . . . . . .   4
     3.1.  Basic types . . . . . . . . . . . . . . . . . . . . . . .   4
     3.2.  Quorum slices . . . . . . . . . . . . . . . . . . . . . .   5
     3.3.  Nomination  . . . . . . . . . . . . . . . . . . . . . . .   5
     3.4.  Prepare messages  . . . . . . . . . . . . . . . . . . . .   7
     3.5.  Confirm messages  . . . . . . . . . . . . . . . . . . . .   9
     3.6.  Externalize messages  . . . . . . . . . . . . . . . . . .  10
     3.7.  Message envelopes . . . . . . . . . . . . . . . . . . . .  11
   4.  Security considerations . . . . . . . . . . . . . . . . . . .  12
   5.  Acknowledgments . . . . . . . . . . . . . . . . . . . . . . .  12
   6.  References  . . . . . . . . . . . . . . . . . . . . . . . . .  13
     6.1.  Normative References  . . . . . . . . . . . . . . . . . .  13
     6.2.  Informative References  . . . . . . . . . . . . . . . . .  13
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  13

1.  Introduction

   Various aspects of Internet infrastructure depend on irreversible and
   transparent updates to data sets such as authenticated mappings [cite
   Li-Man-Watson draft].  Examples include public key certificates
   mapping domain names to public keys, transparency logs [RFC6962],
   HSTS [RFC6797] and HPKP [RFC7469] preload lists.

   The Stellar Consensus Protocol (SCP) specified in this draft allows
   various Internet infrastructure stakeholders to collaborate in
   applying irreversible transactions to public state.  SCP is an open
   Byzantine agreement protocol that resists Sybil attacks by allowing
   individual parties to specify minimum quorum memberships in terms of
   specific trusted peers.  Hence, participants can choose specific
   peers they know to be trustworthy and enjoy safety so long as the
   peers they depend on honestly obey the protocol.

   An more detailed abstract description of the SCP protocol and
   English-language proof of safety is available in [SCP].  This
   document specifies the end-system logic and wire format of the
   messages.






Barry, et al.           Expires September 6, 2018               [Page 2]


Internet-Draft                     scp                        March 2018


2.  The Model

2.1.  Configuration

   Each participant or _node_ in the SCP protocol has a digital
   signature key and is named by the corresponding public key, which we
   term a "NodeID".

   Each node also selects one or more groups of nodes (each of which
   includes itself) called _quorum slices_.  A quorum slice represents a
   large or important enough set of peers that the node selecting the
   quorum slice believes the slice collectively speaks for the whole
   network.

   A _quorum_ is a non-empty set of nodes containing at least one quorum
   slice of each of its members.  For instance, suppose "v1" has the
   single quorum slice "{v1, v2, v3}", and each of "v2", "v3", and "v4"
   has the single quorum slice "{v2, v3, v4}".  In this case, "{v2, v3,
   v4}" is a quorum, because it contains a slice for each member.  On
   the other hand "{v1, v2, v3}" is not a quorum, because it does not
   contain a quorum slice for "v2" or "v3".  The smallest quorum
   including "v1" in this example is the set of all nodes "{v1, v2, v3,
   v4}".

   Unlike traditional Byzantine agreement protocols, nodes in SCP only
   care about quorums to which they belong themselves (and hence which
   contain at least one of their quorum slices).  Intuitively, this is
   what protects nodes from Sybil attacks.  In the example above, if
   "v3" deviates from the protocol maliciously invents 96 Sybils "v5,
   v6, ..., v100", the honest nodes quorums will all still include one
   another.

   Every message in the SCP protocol specifies the sender's quorum
   slices.  Hence, by collecting messages a node dynamically learns what
   constitutes a quorum, and can decide when a particular message has
   been sent by a quorum to which it belongs.  (Again, nodes to not care
   about quorums to which they do not belong themselves.)

2.2.  Input and output

   The SCP protocol outputs a series of _values_ each associated with a
   consecutively numbered _slot_.  5 seconds after SCP outputs the value
   for one slot, nodes restart the protocol to select a value for the
   next slot.  The time between slots is used to construct candidate
   values for the next slot, as well as to amortize the cost of
   consensus over an arbitrary-sized batch of operations.





Barry, et al.           Expires September 6, 2018               [Page 3]


Internet-Draft                     scp                        March 2018


   There must exist a deterministic combining function that reduces
   multiple candidate values into a single _composite_ value.  Hence,
   should nodes nominate multiple values, they can still
   deterministically converge on the same composite value.  Note that
   this does not mean that every node has a say in the composite value
   for every slot.  Whether or not a node can in practice influence a
   given slot depends on how many other nodes include that node in their
   quorum slices as well as how the current slot number perturbs a
   cryptographic hash of nodes' public keys.

3.  Protocol

   The protocol consists of exchanging digitally-signed messages
   containing nodes' quorum slices.  When multiple nodes send the same
   messages, two thresholds have significance throughout the protocol
   for each node "v".

   o  quorum threshold: When "v" node is a member of a quorum in which
      every member (including "v") has issued some signed statement.

   o  blocking threshold: When every one of "v"'s quorum slices has a
      member issuing some signed statement (which doesn't necessarily
      include "v").

   Message formats are specified using XDR [RFC4506].

3.1.  Basic types

   SCP employs 32- and 64-bit integers, as defined below.

                      typedef unsigned int uint32;
                      typedef int int32;
                      typedef unsigned hyper uint64;
                      typedef hyper int64;

   SCP uses the SHA-256 cryptograhpic hash function [RFC6234], and
   represents hash values as a simple array of 32 bytes.

                         typedef opaque Hash[32];

   SCP employes the Ed25519 digital signature algorithm [RFC8032].  For
   purposes of cryptographic agility, however, public keys are
   represented as a union type that can be compatibly extended with
   other key types.







Barry, et al.           Expires September 6, 2018               [Page 4]


Internet-Draft                     scp                        March 2018


     typedef opaque uint256[32];

     enum PublicKeyType
     {
         PUBLIC_KEY_TYPE_ED25519 = 0
     };

     union PublicKey switch (PublicKeyType type)
     {
     case PUBLIC_KEY_TYPE_ED25519:
         uint256 ed25519;
     };

     // variable size as the size depends on the signature scheme used
     typedef opaque Signature<64>;

   Nodes are public keys, while values are simply opaque arrays of byes.

                         typedef PublicKey NodeID;

                         typedef opaque Value<>;

3.2.  Quorum slices

   Theoretically a quorum slice can be an arbitrary set of sets of
   nodes.  However, it would not be possible to represent an arbitrary
   predicate on sets concisely.  Instead we specify quorum slices as any
   set of k-of-n members, where each of the n members can either be an
   individual node ID, or, recursively, another k-of-n predicate.

          // supports things like: A,B,C,(D,E,F),(G,H,(I,J,K,L))
          // only allows 2 levels of nesting
          struct SCPQuorumSet
          {
              uint32 threshold;            // the n in k-of-n
              PublicKey validators<>;
              SCPQuorumSet innerSets<>;
          };

3.3.  Nomination

   Nomination message are sent in the first phase of each slot in an
   attempt to select a value on which to try to agree.








Barry, et al.           Expires September 6, 2018               [Page 5]


Internet-Draft                     scp                        March 2018


                       struct SCPNomination
                       {
                           Hash quorumSetHash; // D
                           Value votes<>;      // X
                           Value accepted<>;   // Y
                       };

   As with all messages, the nomination protocol includes a SHA-256 hash
   of the sender's XDR-encoded SCPQuorumSet.  It also includes two
   monotonically-growing sets of values.

   "votes" consists of candidate values nominated by the sender.  Each
   node progresses through a series of nomination _rounds_ in which it
   potentially adds more and more values to "votes" by adding "votes"
   for valid values received from a growing set of peers to its own set
   of "votes".  Each node computes the set of peers whose nomination
   "votes" it should echo based on its quorum slices as follows for
   round "n" of slot "i":

   o  Let "Gi(m) = SHA-256(i || output[i-1] || m)", where "output[i-1]"
      is the consensus output of slot i-1 or the zero-byte value for
      slot 0.  (Recall values are encoded as an XDR opaque vector, with
      a 32-byte length followed by bytes padded to a multiple of 4
      bytes.)  Treat the output of "Gi" as a 256-bit binary number in
      big-endian format.

   o  For each peer "v", define "weight(v)" as the faction of quorum
      slices containing "v".

   o  Define the set of nodes "neighbors(n)" as the set of nodes v for
      which "Gi("N" || n || v) < 2^{256}-1 * weight(v)".

   o  Define "priority(n, v)" as "Gi("P" || n || v)".

   For each round "n" until nomination starts, a node picks the
   available peer "v" with the highest value of priority in the given
   round from among the nodes in "neighbors(n)", and adds any values in
   that node's "votes" set to its own "votes" set.  Round "n" lasts for
   "2+n" seconds, after which if nomination has not finished (see
   below), a node proceeds to round "n+1".

   If a particular valid value "x" reaches quorum threshold in the
   messages sent by peers (meaning that every node in a quorum contains
   "x" either in the "votes" or the "accepted" field), then a node moves
   "x" from its "votes" field to its "accepted" field and broadcasts a
   new "SCPNomination" message.  Similarly, if "x" reaches blocking
   threshold in a node's peers' "accepted" field (meaning every one of a
   node's quorum slices contains at least one node with "x" in its



Barry, et al.           Expires September 6, 2018               [Page 6]


Internet-Draft                     scp                        March 2018


   "accepted" field), then the node adds "x" to its "accepted" field
   (removing it from "votes" if applicable).

   The nomination process terminates at a node when any value "x"
   reaches quorum threshold in the "accepted" fields, meaning every node
   in a quorum that includes the node has sent a signed "SCPNomination"
   message in which the "accepted" field contains "x".  Once nomination
   terminates, nodes stop adding new values to their "votes" sets, but
   potentially continues adding new values to "accepted" as previously
   described.

3.4.  Prepare messages

   Once the nomination process is complete, meaning at least one
   "accepted" value has reached quorum threshold, a node moves on to the
   balloting phase of the protocol.  A ballot is a pair of a counter
   ("n") and candidate value ("x"):

                         struct SCPBallot
                         {
                             uint32 counter; // n
                             Value value;    // x
                         };

   The counter beings at 1, and is incremented to higher numbers if
   ballot 1 fails to reach consensus on an output value.  The value "x"
   begins as the output of the deterministic combining function on all
   values that have achieved quorum threshold in the "accepted" fields
   of "SCPNomination" messages (meaning all values for which the local
   node is part of a quorum in which every node includes the value in
   the "accepted" field of an "SCPNomination" message).  Note, however,
   that "x" may change in subsequent ballots if more values reach this
   quorum threshold (since nomination keeps running in the background,
   even though a node does not add new values to its "votes" field in
   "SCPNomination").  Moreover, as described below, "x" may change if a
   particular ballot makes it a certain way through the protocol before
   failing.

   Once they have selected "x" for a particular ballot, nodes attempt to
   "prepare" the ballot by sending the following message:











Barry, et al.           Expires September 6, 2018               [Page 7]


Internet-Draft                     scp                        March 2018


                   struct SCPPrepare
                   {
                       Hash quorumSetHash;       // D
                       SCPBallot ballot;         // b
                       SCPBallot* prepared;      // p
                       SCPBallot* preparedPrime; // p'
                       uint32 nC;                // c.n
                       uint32 nH;                // h.n
                   };

   The fields have the following meaning:

   o  "quorumSetHash" - the SHA-256 hash of the sending node's
      "SCPQuorumSet" structure.

   o  "ballot" - the current ballot (see below)

   o  "prepared" - the highest ballot "<n,x>" for which one of the
      following two thresholds has been met:

      *  The sending node is part of a quorum in which each node
         included "<n',x>" for "n' >= n" in one of the "ballot",
         "prepared" or "preparedPrime" fields of an "SCPPrepare"
         message.

      *  Every one of the sending node's quorum slices contains at least
         one node that sent an "SCPPrepare" containing some "<n',x>"
         with "n' >= n" in either the "prepared" or "preparedPrime"
         field.  If no such ballot exists, then "prepared" is "NULL".

   o  "preparedPrime" - the highest ballot "<n,x>" that satisfies the
      same criteria as "prepared", but has a different value "x" from
      that in the "prepared" field.  "NULL" if no such ballot exists.

   o  "nH" - the counter from the highest ballot "<n,x>" for which the
      sender is in a quorum in which each member has sent an
      "SCPPrepare" message with one of the following properties (or "nH
      = 0" if no such ballot exists):

      *  The "prepared" field contains "<n',x>" where "n' >= n", or

      *  The "preparedPrime" field contains "<n',x'>" where "n' >= n"
         and "x'" can be any value.

   o  "nC" - the counter for the lowest ballot the sender is attempting
      to confirm (see below), otherwise 0.





Barry, et al.           Expires September 6, 2018               [Page 8]


Internet-Draft                     scp                        March 2018


   The "value" field "x" of each ballot is selected as follows.  If "nH
   = 0", then use the deterministic combination function on all values
   that have made it all the way through the nomination protocol to
   produce a candidate value.  Otherwise, use the value "x" associated
   with the ballot that produced "nH".

   The "counter" field "n" of each ballot is selected as follows.

   o  Initially, "n = 1".

   o  If a node is still sending "SCPPrepare" or "SCPConfirm" messages
      (meaning it has not yet output a value for a particular slot),
      then a node arms a timer whenever "n" reaches quorum threshold of
      ballots (meaning the node is a member of a quorum in which each
      member is sending explicit or implicit "SCPPrepare" messages with
      ballot "counter" >= "n").

   o  If the timer fires, a node increments "n" and determines a new
      value depending on the latest state of the nomination protocol and
      "nH", as discussed above.

   o  If a "counter" value greater than "n" ever reaches a blocking
      threshold, then a node immediately disables any pending timer and
      increases "n" to the blocking "counter" value, recomputing "value"
      in the usual way.

   The value "nC" is maintained based on an internally-maintained
   "commit ballot" "c", where initially "c = cNULL = <0, NULL>" (for
   some arbitrary or invalid value "NULL").

   o  If either "prepared" or "preparedPrime" has "counter" greater than
      "c"'s and a different "value", then reset "c = cNULL".

   o  If "c = cNULL" and the ballot determining "nH" hasn't been aborted
      by "prepared" or "preparedPrime", then set "c" to the lowest
      ballot containing the value of that ("nH") ballot that is between
      "ballot" and the "nH" ballot.

   o  If some ballot with the same value reaches quorum threshold betwen
      "c" and the "nH" ballot, move to the confirm phase.

3.5.  Confirm messages









Barry, et al.           Expires September 6, 2018               [Page 9]


Internet-Draft                     scp                        March 2018


                      struct SCPConfirm
                      {
                          SCPBallot ballot;   // b
                          uint32 nPrepared;   // p.n
                          uint32 nCommit;     // c.n
                          uint32 nH;          // h.n
                          Hash quorumSetHash; // D
                      };

   This message conveys an implicit "SCPPrepare" message with the
   following fields:

   o  "quorumSetHash" - same

   o  "ballot" - "<infinity, x>" where "x" is the value in the
      "SCPConfirm message's"ballot`.

   o  "prepared" - same

   o  "preparedPrime" - "NULL"

   o  "nC" - same

   o  "nH" - "infinity"

   (Note that "infinity" here just represents 2^{32} -- an integer one
   greater than the largest representable value on the wire.)

3.6.  Externalize messages

        struct SCPExternalize
        {
            SCPBallot commit;         // c
            uint32 nH;                // h.n
            Hash commitQuorumSetHash; // D used before EXTERNALIZE
        } externalize;

   An "SCPExternalize" message conveys two implicit "SCPConfirm"
   messages.  The first one has the following fields:

   o  "ballot" - "<infinity,x>" where "x" is the value from the
      "SCPExternalize" ballot.

   o  "nPrepared" - "infinity"

   o  "nCommit" - the "counter" field from the "SCPExternalize" messages
      "commit" field.




Barry, et al.           Expires September 6, 2018              [Page 10]


Internet-Draft                     scp                        March 2018


   o  "nH" - "infinity"

   o  "quorumSetHash" - the "commitQuorumSetHash" from the
      "SCPExternalize" message.

   The second one has the same fields as the first, except for the last
   two:

   o  "nH" - the "nH" value from the "SCPExternalize" message

   o  "quorumSetHash" - a trivial "SCPQuorumSet" that contains a single
      slice whose only member is the sender of the "SCPExternalize"
      message

3.7.  Message envelopes

   In order to provide full context for each signed message, all signed
   messages are part of an "SCPStatement" union type that includes the
   "slotIndex" naming the slot to which the message applies, as well as
   the "type" of the message.  A signed message and its signature are
   packed together in an "SCPEnvelope" structure.






























Barry, et al.           Expires September 6, 2018              [Page 11]


Internet-Draft                     scp                        March 2018


             enum SCPStatementType
             {
                 SCP_ST_PREPARE = 0,
                 SCP_ST_CONFIRM = 1,
                 SCP_ST_EXTERNALIZE = 2,
                 SCP_ST_NOMINATE = 3
             };

             struct SCPStatement
             {
                 NodeID nodeID;    // v (node signing message)
                 uint64 slotIndex; // i

                 union switch (SCPStatementType type)
                 {
                 case SCP_ST_PREPARE:
                     SCPPrepare prepare;
                 case SCP_ST_CONFIRM:
                     SCPConfirm confirm;
                 case SCP_ST_EXTERNALIZE:
                     SCPExternalize externalize;
                 case SCP_ST_NOMINATE:
                     SCPNomination nominate;
                 }
                 pledges;
             };

             struct SCPEnvelope
             {
                 SCPStatement statement;
                 Signature signature;
             };

4.  Security considerations

   If nodes do not pick quorum slices well, the protocol will not be
   safe.

5.  Acknowledgments

   The Stellar development foundation supported development of the
   protocol and produced the first production deployment of SCP.  The
   IRTF DIN group including Dirk Kutscher, Sydney Li, Colin Man, Melinda
   Shore, and Jean-Luc Watson helped with the framing and motivation for
   this specification.






Barry, et al.           Expires September 6, 2018              [Page 12]


Internet-Draft                     scp                        March 2018


6.  References

6.1.  Normative References

   [RFC4506]  Eisler, M., Ed., "XDR: External Data Representation
              Standard", STD 67, RFC 4506, DOI 10.17487/RFC4506, May
              2006, <https://www.rfc-editor.org/info/rfc4506>.

   [RFC6234]  Eastlake 3rd, D. and T. Hansen, "US Secure Hash Algorithms
              (SHA and SHA-based HMAC and HKDF)", RFC 6234,
              DOI 10.17487/RFC6234, May 2011,
              <https://www.rfc-editor.org/info/rfc6234>.

   [RFC8032]  Josefsson, S. and I. Liusvaara, "Edwards-Curve Digital
              Signature Algorithm (EdDSA)", RFC 8032,
              DOI 10.17487/RFC8032, January 2017,
              <https://www.rfc-editor.org/info/rfc8032>.

6.2.  Informative References

   [RFC6797]  Hodges, J., Jackson, C., and A. Barth, "HTTP Strict
              Transport Security (HSTS)", RFC 6797,
              DOI 10.17487/RFC6797, November 2012,
              <https://www.rfc-editor.org/info/rfc6797>.

   [RFC6962]  Laurie, B., Langley, A., and E. Kasper, "Certificate
              Transparency", RFC 6962, DOI 10.17487/RFC6962, June 2013,
              <https://www.rfc-editor.org/info/rfc6962>.

   [RFC7469]  Evans, C., Palmer, C., and R. Sleevi, "Public Key Pinning
              Extension for HTTP", RFC 7469, DOI 10.17487/RFC7469, April
              2015, <https://www.rfc-editor.org/info/rfc7469>.

   [SCP]      Mazieres, D., "The Stellar Consensus Protocol: A Federated
              Model for Internet-level Consensus", Stellar Development
              Foundation whitepaper , 2015,
              <https://www.stellar.org/papers/
              stellar-consensus-protocol.pdf>.

Authors' Addresses

   Nicolas Barry
   Stellar Development Foundation
   170 Capp St., Suite A
   San Francisco, CA  94110
   US

   Email: nicolas@stellar.org



Barry, et al.           Expires September 6, 2018              [Page 13]


Internet-Draft                     scp                        March 2018


   David Mazieres
   Stanford University
   353 Serra Mall, Room 290
   Stanford, CA  94305
   US

   Email: dm@uun.org


   Jed McCaleb
   Stellar Development Foundation
   170 Capp St., Suite A
   San Francisco, CA  94110
   US

   Email: jed@stellar.org


   Stanislas Polu
   Stripe Inc.
   185 Berry Street, Suite 550
   San Francisco, CA  94107
   US

   Email: stan@stripe.com


























Barry, et al.           Expires September 6, 2018              [Page 14]


Html markup produced by rfcmarkup 1.129b, available from https://tools.ietf.org/tools/rfcmarkup/