draft-ietf-intarea-flow-label-balancing-03.txt   rfc7098.txt 
IntArea B. Carpenter Internet Engineering Task Force (IETF) B. Carpenter
Internet-Draft Univ. of Auckland Request for Comments: 7098 Univ. of Auckland
Intended status: Informational S. Jiang Category: Informational S. Jiang
Expires: May 05, 2014 Huawei Technologies Co., Ltd ISSN: 2070-1721 Huawei Technologies Co., Ltd
W. Tarreau W. Tarreau
HAProxy, Inc. HAProxy Technologies, Inc.
November 01, 2013 January 2014
Using the IPv6 Flow Label for Load Balancing in Server Farms Using the IPv6 Flow Label for Load Balancing in Server Farms
draft-ietf-intarea-flow-label-balancing-03
Abstract Abstract
This document describes how the IPv6 flow label as currently This document describes how the currently specified IPv6 flow label
specified can be used to enhance layer 3/4 load distribution and can be used to enhance layer 3/4 (L3/4) load distribution and
balancing for large server farms. balancing for large server farms.
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This document is not an Internet Standards Track specification; it is
provisions of BCP 78 and BCP 79. published for informational purposes.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months This document is a product of the Internet Engineering Task Force
and may be updated, replaced, or obsoleted by other documents at any (IETF). It represents the consensus of the IETF community. It has
time. It is inappropriate to use Internet-Drafts as reference received public review and has been approved for publication by the
material or to cite them other than as "work in progress." Internet Engineering Steering Group (IESG). Not all documents
approved by the IESG are a candidate for any level of Internet
Standard; see Section 2 of RFC 5741.
This Internet-Draft will expire on May 05, 2014. Information about the current status of this document, any errata,
and how to provide feedback on it may be obtained at
http://www.rfc-editor.org/info/rfc7098.
Copyright Notice Copyright Notice
Copyright (c) 2013 IETF Trust and the persons identified as the Copyright (c) 2014 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
2. Summary of Flow Label Specification . . . . . . . . . . . . . 3 2. Summary of Flow Label Specification . . . . . . . . . . . . . 2
3. Summary of Server Farm Load Balancing Techniques . . . . . . 4 3. Summary of Server Farm Load-Balancing Techniques . . . . . . 4
4. Applying the Flow Label to L3/L4 Load Balancing . . . . . . . 7 4. Applying the Flow Label to Layer 3/4 Load Balancing . . . . . 8
5. Security Considerations . . . . . . . . . . . . . . . . . . . 10 5. Security Considerations . . . . . . . . . . . . . . . . . . . 10
6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 11 6. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 11
7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 11 7. References . . . . . . . . . . . . . . . . . . . . . . . . . 12
8. Change log [RFC Editor: Please remove] . . . . . . . . . . . 12 7.1. Normative References . . . . . . . . . . . . . . . . . . 12
9. References . . . . . . . . . . . . . . . . . . . . . . . . . 12 7.2. Informative References . . . . . . . . . . . . . . . . . 12
9.1. Normative References . . . . . . . . . . . . . . . . . . 12
9.2. Informative References . . . . . . . . . . . . . . . . . 13
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 13
1. Introduction 1. Introduction
The IPv6 flow label has been redefined [RFC6437] and is now a The IPv6 flow label has been redefined [RFC6437] and is now a
recommended IPv6 node requirement [RFC6434]. Its use for load recommended IPv6 node requirement [RFC6434]. Its use for load
sharing in multipath routing has been specified [RFC6438]. Another sharing in multipath routing has been specified [RFC6438]. Another
scenario in which the flow label could be used is in load scenario in which the flow label could be used is in load
distribution for large server farms. Load distribution is a slightly distribution for large server farms. Load distribution is a slightly
more general term than load balancing, but the latter is more more general term than load balancing, but the latter is more
commonly used. In the context of a server farm, both terms refer to commonly used. In the context of a server farm, both terms refer to
mechanisms that distribute the workload of a server farm among mechanisms that distribute the workload of a server farm among
different servers in order to optimize performance. Server load different servers in order to optimize performance. Server load
balancing commonly applies to HTTP traffic, but most of the balancing commonly applies to HTTP traffic, but most of the
techniques described would apply to other upper layer applications as techniques described would apply to other upper-layer applications as
well. This document starts with brief introductions to the flow well. This document starts with brief introductions to the flow
label and to server load balancing techniques, and then describes how label and to server load-balancing techniques, and then describes how
the flow label can be used to enhance load balancers operating on IP the flow label can be used to enhance load balancers operating on IP
packets and TCP sessions, commonly known as layer 3/4 load balancers. packets and TCP sessions, commonly known as layer 3/4 load balancers.
The motivation for this approach is to improve the performance of The motivation for this approach is to improve the performance of
most types of layer 3/4 load balancers, especially for traffic most types of layer 3/4 load balancers, especially for traffic
including multiple IPv6 extension headers and in particular for including multiple IPv6 extension headers and in particular for
fragmented packets. Fragmented packets, often the result of fragmented packets. Fragmented packets, often the result of
customers reaching the load balancer via a VPN with a limited MTU, customers reaching the load balancer via a VPN with a limited MTU,
are a common performance problem. are a common performance problem.
2. Summary of Flow Label Specification 2. Summary of Flow Label Specification
The IPv6 flow label [RFC6437] is a 20 bit field included in every The IPv6 flow label [RFC6437] is a 20-bit field included in every
IPv6 header [RFC2460]. It is recommended to be supported in all IPv6 IPv6 header [RFC2460]. It is recommended to be supported in all IPv6
nodes by [RFC6434]. There is additional background material in nodes by [RFC6434]. There is additional background material in
[RFC6436] and [RFC6294]. According to its definition, the flow label [RFC6436] and [RFC6294]. According to its definition, the flow label
should be set to a constant value for a given traffic flow (such as should be set to a constant value for a given traffic flow (such as
an HTTP connection), and that value will belong to a uniform an HTTP connection), and that value will belong to a uniform
statistical distribution, making it potentially valuable for load statistical distribution, making it potentially valuable for load-
balancing purposes. balancing purposes.
Any device that has access to the IPv6 header has access to the flow Any device that has access to the IPv6 header has access to the flow
label, and it is at a fixed position in every IPv6 packet. In label, and it is at a fixed position in every IPv6 packet. In
contrast, transport layer information, such as the port numbers, is contrast, transport-layer information, such as the port numbers, is
not always in a fixed position, since it follows any IPv6 extension not always in a fixed position, since it follows any IPv6 extension
headers that may be present. In fact, the logic of finding the headers that may be present. In fact, the logic of finding the
transport header is always more complex for IPv6 than for IPv4, due transport header is always more complex for IPv6 than for IPv4, due
to the absence of an Internet Header Length field in IPv6. to the absence of an Internet Header Length field in IPv6.
Additionally, if packets are fragmented, the flow label will be Additionally, if packets are fragmented, the flow label will be
present in all fragments, but the transport header will only be in present in all fragments, but the transport header will only be in
one packet. Therefore, within the lifetime of a given transport one packet. Therefore, within the lifetime of a given transport-
layer connection, the flow label can be a more convenient "handle" layer connection, the flow label can be a more convenient "handle"
than the port number for identifying that particular connection. than the port number for identifying that particular connection.
According to RFC 6437, source hosts should set the flow label, but, According to RFC 6437, source hosts should set the flow label;
if they do not (i.e., its value is zero), forwarding nodes (such as however, if they do not (i.e., its value is zero), forwarding nodes
the first-hop router) may set it instead. In both cases, the flow (such as the first-hop router) may set it instead. In both cases,
label value must be constant for a given transport session, normally the flow label value must be constant for a given transport session,
identified by the IPv6 and Transport header 5-tuple. By default, the normally identified by the IPv6 and Transport header 5-tuple. By
flow label value should be calculated by a stateless algorithm. The default, the flow label value should be calculated by a stateless
resulting value should form part of a statistically uniform algorithm. The resulting value should form part of a statistically
distribution, regardless of which node sets it. uniform distribution, regardless of which node sets it.
It is recognised that at the time of writing, very few traffic flows It is recognized that at the time of writing, very few traffic flows
include a non-zero flow label value. The mechanism described below include a non-zero flow label value. The mechanism described below
is one that can be added to existing load balancing mechanisms, so is one that can be added to existing load-balancing mechanisms, so
that it will become effective as more and more flows contain a non- that it will become effective as more and more flows contain a non-
zero label. Even if the flow label is chosen from an imperfectly zero label. Even if the flow label is chosen from an imperfectly
uniform distribution, it will nevertheless increase the information uniform distribution, it will nevertheless increase the information
entropy of the IPv6 header as a whole. This allows for progressive entropy of the IPv6 header as a whole. This allows for progressive
introduction of load balancing based on the flow label. introduction of load balancing based on the flow label.
If the recommendations in Section 3 of RFC 6437 are followed for If the recommendations in Section 3 of RFC 6437 are followed for
traffic from a given source accessing a well-known TCP port at a traffic from a given source accessing a well-known TCP port at a
given destination, the flow label can act as a substitute for the given destination, the flow label can act as a substitute for the
port numbers as far as a load balancer is concerned, and it can be port numbers as far as a load balancer is concerned, and it can be
found at a fixed position in the layer 3 header even if any extension found at a fixed position in the layer 3 header even if extension
headers are present. headers are present.
The flow label is defined as an end-to-end component of the IPv6 The flow label is defined as an end-to-end component of the IPv6
header, but there are three qualifications to this: header, but there are three qualifications to this:
1. Until the RFC 6437 standard is widely implemented as recommended 1. Until the IPv6 flow label specification in RFC 6437 is widely
by RFC 6434, the flow label will often be set to the default implemented as recommended by RFC 6434, the flow label will often
value of zero. be set to the default value of zero.
2. Because of the recommendation to use a stateless algorithm to 2. Because of the recommendation to use a stateless algorithm to
calculate the label, there is a low (but non-zero) probability calculate the label, there is a low (but non-zero) probability
that two simultaneous flows from the same source to the same that two simultaneous flows from the same source to the same
destination have the same flow label value despite having destination have the same flow label value despite having
different transport protocol port numbers. different transport-protocol port numbers.
3. The flow label field is in an unprotected part of the IPv6 3. The Flow Label field is in an unprotected part of the IPv6
header, which means that intentional or unintentional changes to header, which means that intentional or unintentional changes to
its value cannot be easily detected by a receiver. its value cannot be easily detected by a receiver.
The first two points are addressed below in Section 4 and the third The first two points are addressed below in Section 4 and the third
in Section 5. in Section 5.
3. Summary of Server Farm Load Balancing Techniques 3. Summary of Server Farm Load-Balancing Techniques
Load balancing for server farms is achieved by a variety of methods, Load balancing for server farms is achieved by a variety of methods,
often used in combination [Tarreau]. This section gives a general often used in combination [Tarreau]. This section gives a general
overview of common methods, although the flow label is not relevant overview of common methods, although the flow label is not relevant
to all of them. The actual load balancing algorithm (the choice of to all of them. The actual load-balancing algorithm (the choice of
which server to use for a new client session) is irrelevant to this which server to use for a new client session) is irrelevant to this
discussion. We give examples for HTTP, but analogous techniques may discussion. We give examples for HTTP, but analogous techniques may
be used for other application protocols. be used for other application protocols.
o The simplest method is simply using the DNS to return different o The simplest method is using the DNS to return different server
server addresses for a single name such as www.example.com to addresses for a single name such as www.example.com to different
different users. This is typically done by rotating the order in users. This is typically done by rotating the order in which
which different addresses within the server site are listed by the different addresses within the server site are listed by the
relevant authoritative DNS server, on the assumption that the relevant authoritative DNS server, on the assumption that the
client will pick the first one. Routing may be configured such client will pick the first one. Routing may be configured such
that the different addresses are handled by different ingress that the different addresses are handled by different ingress
routers. Several variants of this load balancing mechanism exist, routers. Several variants of this load-balancing mechanism exist,
such as expecting some clients to use all the advertised addresses such as expecting some clients to use all the advertised addresses
when multiple connections are involved, or directing the traffic when multiple connections are involved, or directing the traffic
to multiple sites, also known as global load balancing. None of to multiple sites, also known as global load balancing. None of
these mechanisms are in the scope of this document, and what this these mechanisms are in the scope of this document, and the
document proposes does not affect their usability nor aim to proposal in this document does not affect their usability nor aim
replace them, so they will not be discussed further. to replace them, so they will not be discussed further.
o Another method, for HTTP servers, is to operate a layer 7 reverse o Another method, for HTTP servers, is to operate a layer 7 reverse
proxy in front of the server farm. The reverse proxy will present proxy in front of the server farm. The reverse proxy will present
a single IP address to the world, communicated to clients by a a single IP address to the world, communicated to clients by a
single AAAA record. For each new client session (an incoming TCP single AAAA record. For each new client session (an incoming TCP
connection and HTTP request), it will pick a particular server and connection and HTTP request), it will pick a particular server and
proxy the session to it. The act of proxying should be more proxy the session to it. The act of proxying should be more
efficient and less resource-intensive than the act of serving the efficient and less resource-intensive than the act of serving the
required content. The proxy must retain TCP state and proxy state required content. The proxy must retain TCP state and proxy state
for the duration of the session. This TCP state could, for the duration of the session. This TCP state could,
potentially, include the incoming flow label value. potentially, include the incoming flow label value.
o A component of some load balancing systems is an SSL reverse proxy o A component of some load-balancing systems is an SSL reverse proxy
farm. The individual SSL proxies handle all cryptographic aspects farm. The individual SSL proxies handle all cryptographic aspects
and exchange unencrypted HTTP with the actual servers. Thus, from and exchange unencrypted HTTP with the actual servers. Thus, from
the load balancing point of view, this really looks just like a the load-balancing point of view, this really looks just like a
server farm, except that it's specialised for HTTPS. Each proxy server farm, except that it's specialized for HTTPS. Each proxy
will retain SSL and TCP and maybe HTTP state for the duration of will retain SSL and TCP and maybe HTTP state for the duration of
the session, and the TCP state could potentially include the flow the session, and the TCP state could potentially include the flow
label. label.
o Finally the "front end" of many load balancing systems is a layer o Finally the "front end" of many load-balancing systems is a layer
3/4 load balancer. While it can be a dedicated device, it is also 3/4 load balancer. While it can be a dedicated device, it is also
a standard function of some network switches or routers (e.g. a standard function of some network switches or routers (e.g.
using Equal Cost Multipath Routing (ECMP) [RFC2991]). In this using Equal-Cost Multipath Routing (ECMP) [RFC2991]). In this
case, it is the layer 3/4 load balancer whose IP address is case, it is the layer 3/4 load balancer whose IP address is
published as the primary AAAA record for the service. All client published as the primary AAAA record for the service. All client
sessions will pass through this device. Depending on the specific sessions will pass through this device. Depending on the specific
scenario, the balancer will assign new sessions among the actual scenario, the balancer will assign new sessions among the actual
application servers, across an SSL proxy farm, or among a set of application servers, across an SSL proxy farm, or among a set of
layer 7 proxies. In all cases, the layer 3/4 load balancer has to layer 7 proxies. In all cases, the layer 3/4 load balancer has to
classify incoming packets very quickly and choose the target classify incoming packets very quickly and choose the target
server or proxy so as to ensure persistence. 'Persistence' is server or proxy so as to ensure persistence. 'Persistence' is
defined as the guarantee that a given client session will run to defined as the guarantee that a given client session will run to
completion on a single server. The layer 3/4 load balancer completion on a single server. The layer 3/4 load balancer
therefore needs to inspect each incoming packet to classify it. therefore needs to inspect each incoming packet to classify it.
There are two common types of layer 3/4 load balancers, the There are two common types of layer 3/4 load balancers, the
totally stateless ones which only act on single packets, generally totally stateless ones which only act on single packets, generally
involving a per-packet hashing of easy-to-find information such as involving a per-packet hashing of easy-to-find information such as
the source address and/or port into a server number, and the the source address and/or port into a server number, and the
stateful ones which take the routing decision on the very first stateful ones that take the routing decision on the very first
packets of a session and maintain the same direction for all packets of a session and maintain the same direction for all
packets belonging to the same session. Clearly, both types of packets belonging to the same session. Clearly, both types of
layer 3/4 balancers could inspect and make use of the flow label layer 3/4 balancers could inspect and make use of the flow label
value. value.
Our focus is on how the balancer identifies a particular flow. Our focus is on how the balancer identifies a particular flow.
For clarity, note that two aspects of layer 3/4 load balancers are For clarity, note that two aspects of layer 3/4 load balancers are
not affected by use of the flow label to identify sessions: not affected by use of the flow label to identify sessions:
1. Balancers use various techniques to redirect traffic to a 1. Balancers use various techniques to redirect traffic to a
specific target server. specific target server.
- All servers are configured with the same IP address, they + All servers are configured with the same IP address, they
are all on the same LAN, and the load balancer sends directly are all on the same LAN, and the load balancer sends
to their individual MAC addresses. In this case, return directly to their individual MAC addresses. In this case,
packets from the server to the client are sent back without return packets from the server to the client are sent back
passing through the balancer, a technique known as direct without passing through the balancer, a technique known as
server return, but we are not concerned here with the return direct server return, but we are not concerned here with
packets. the return packets.
-All servers are configured with the same IP address, treated + All servers are configured with the same IP address,
locally as an anycast address by layer 3 ECMP routing. treated locally as an anycast address by layer 3 ECMP
routing.
- Each server has its own IP address, and the balancer uses an + Each server has its own IP address, and the balancer uses
IP-in-IP tunnel to reach it. an IP-in-IP tunnel to reach it.
- Each server has its own IP address, and the balancer + Each server has its own IP address, and the balancer
performs NAPT (network address and port translation) to performs NAPT (Network Address and Port Translation) to
deliver the client's packets to that address. deliver the client's packets to that address.
The choice between these methods is not affected by use of the + The choice between these methods is not affected by use of
flow label. the flow label.
2. A layer 3/4 balancer must correctly handle Path MTU Discovery 2. A layer 3/4 balancer must correctly handle Path MTU Discovery
by forwarding relevant ICMPv6 packets in both directions. by forwarding relevant ICMPv6 packets in both directions.
This too is not directly affected by use of the flow label. This too is not directly affected by use of the flow label.
It should be noted that there may be difficulty correlating an It should be noted that there may be difficulty correlating an
ICMPv6 "Packet too big" response with the session it refers ICMPv6 "Packet too big" response with the session it refers
to, but that is out of the scope of the present document. to, but that is out of the scope of the present document.
The following diagram, inspired by [Tarreau], shows a layout with The following diagram, inspired by [Tarreau], shows a layout with
various methods in use together. various methods in use together. (Below, "ASIC" stands for
"Application-Specific Integrated Circuit".)
___________________________________________ ___________________________________________
( ) ( )
( Clients in the Internet ) ( Clients in the Internet )
(___________________________________________) (___________________________________________)
| | | |
------------ DNS-based ------------ ------------ DNS-based ------------
| Ingress | load splitting | Ingress | | Ingress | load splitting | Ingress |
| router | affects | router | | router | affects | router |
------------ routing ------------ ------------ routing ------------
___|____________________________|___ ___|____________________________|___
| | | |
| | | |
| | | |
------------ ------------ ------------ ------------
| L3/4 ASIC| | L3/4 ASIC| | L3/4 ASIC| | L3/4 ASIC|
| balancer | | balancer | | balancer | | balancer |
------------ ------------ ------------ ------------
| load | | load |
| spreading | | spreading |
__________|________________________|___________ __________|________________________|___________
| | | | | | | |
------------ ------------ -------- -------- ------------ ------------ -------- --------
|HTTP proxy|...|HTTP proxy| | SSL |...| SSL | |HTTP proxy|...|HTTP proxy| | SSL |...| SSL |
| balancer | | balancer | | proxy| | proxy| | balancer | | balancer | | proxy| | proxy|
------------ ------------ -------- -------- ------------ ------------ -------- --------
____|_____________|_____________|_________|_____ ____|_____________|_____________|_________|_____
| | | | | | | | | |
-------- -------- -------- -------- -------- -------- -------- -------- -------- --------
|HTTP | |HTTP | |HTTP | |HTTP | |HTTP | |HTTP | |HTTP | |HTTP | |HTTP | |HTTP |
|server| |server| |server| |server| |server| |server| |server| |server| |server| |server|
-------- -------- -------- -------- -------- -------- -------- -------- -------- --------
From the previous paragraphs, we can identify several points in this From the previous paragraphs, we can identify several points in this
diagram where the flow label might be relevant: diagram where the flow label might be relevant:
1. Layer 3/4 load balancers. 1. Layer 3/4 load balancers.
2. SSL proxies. 2. SSL proxies.
3. HTTP proxies. 3. HTTP proxies.
However, usage by the proxies seems unlikely to affect performance, However, usage by the proxies seems unlikely to affect performance,
because they must in any case process the application layer header, because they must in any case process the application-layer header,
so in this document we focus only on layer 3/4 balancers. so in this document we focus only on layer 3/4 balancers.
4. Applying the Flow Label to L3/L4 Load Balancing 4. Applying the Flow Label to Layer 3/4 Load Balancing
The suggested model for using the flow label to enhance a L3/L4 load The suggested model for using the flow label to enhance an layer 3/4
balancing mechanism is as follows: load-balancing mechanism is as follows:
o We are only concerned with IPv6 traffic in which the flow label o We are only concerned with IPv6 traffic in which the flow label
value has been set according to [RFC6437]. If the flow label of value has been set according to [RFC6437]. If the flow label of
an incoming packet is zero, load balancers will continue to use an incoming packet is zero, load balancers will continue to use
the transport header in the traditional way. As the use of the the transport header in the traditional way. As the use of the
flow label becomes more prevalent according to RFC 6434, load flow label becomes more prevalent according to RFC 6434, load
balancers, and therefore users, will reap a growing performance balancers, and therefore users, will reap a growing performance
benefit. benefit.
o If the flow label of an incoming packet is non-zero, layer 3/4 o If the flow label of an incoming packet is non-zero, layer 3/4
load balancers can use the 2-tuple {source address, flow label} as load balancers can use the 2-tuple {source address, flow label} as
the session key for whatever load distribution algorithm they the session key for whatever load distribution algorithm they
support. Alternatively, they might use the 3-tuple {dest address, support. Alternatively, they might use the 3-tuple {dest address,
source address, flow label}, especially if the server farm source address, flow label}, especially if the server farm
supports multiple server IP addresses, but this does not affect supports multiple server IP addresses, but using the 3-tuple will
the argument. If any IPv6 extension headers, including fragment be significantly quicker than searching for the transport port
headers, are present, this will be significantly quicker than numbers later in the packet. Moreover, the transport-layer
searching for the transport port numbers later in the packet. information such as the source port is not repeated in fragments,
Moreover, the transport layer information such as the source port which generally prevents stateless load balancers from supporting
is not repeated in fragments, which generally prevents stateless fragmented traffic since they generally cannot reassemble
load balancers from supporting fragmented traffic since they fragments.
generally cannot reassemble fragments.
A stateless layer 3/4 load balancer would simply apply a hash A stateless layer 3/4 load balancer would simply apply a hash
algorithm to the 2-tuple or 3-tuple on all packets, in order to algorithm to the 2-tuple or 3-tuple on all packets in order to
select the same target server consistently for a given flow. select the same target server consistently for a given flow.
Needless to say, the hash algorithm has to be well chosen for its Needless to say, the hash algorithm has to be well chosen for its
purpose, but this problem is common to several forms of stateless purpose, but this problem is common to several forms of stateless
load balancing. The discussion in [RFC6438] applies. load balancing. The discussion in [RFC6438] applies.
A stateful layer 3/4 load balancer would apply its usual load A stateful layer 3/4 load balancer would apply its usual load
distribution algorithm to the first packet of a session, and store distribution algorithm to the first packet of a session, and store
the {tuple, server} association in a table so that subsequent the {tuple, server} association in a table so that subsequent
packets belonging to the same session are forwarded to the same packets belonging to the same session are forwarded to the same
server. Thus, for all subsequent packets of the session, it can server. Thus, for all subsequent packets of the session, it can
ignore all IPv6 extension headers, which should lead to a ignore all IPv6 extension headers, which should lead to a
performance benefit. Whether this benefit is valuable will depend performance benefit. Whether this benefit is valuable will depend
on engineering details of the specific load balancer. on engineering details of the specific load balancer.
Note that such a balancer will not identify new transport sessions Note that such a balancer will not identify new transport sessions
from the same source that use the same flow label; they will be from the same source that use the same flow label; they will be
delivered to the same server. This is like the behavior of delivered to the same server. This is like the behavior of
existing hash-based layer 4 balancers that always send similarly existing hash-based layer 4 balancers that always send similarly
hashed packets to the same destination. However, a global state hashed packets to the same destination. However, a global state
table in a flow label balancer cannot be shared between multiple table in a flow label balancer cannot be shared between multiple
services if these services rely on transport layer information, services if these services rely on transport-layer information,
since the goal of using the flow label is to avoid looking up that since the goal of using the flow label is to avoid looking up that
information. information.
A related issue is that the balancer will not detect FIN/ACK A related issue is that the balancer will not detect FIN/ACK
sequences at the end of sessions. Therefore, it will rely on sequences at the end of sessions. Therefore, it will rely on
inactivity timers to delete session state. However, all existing inactivity timers to delete session state. However, all existing
balancers must maintain such timers to deal with hung sessions, balancers must maintain such timers to deal with hung sessions,
and the practical impact on memory utilisation is unlikely to be and the practical impact on memory utilization is unlikely to be
significant. significant.
o Layer 3/4 balancers that redirect the incoming packets by NAPT are o Layer 3/4 balancers that redirect the incoming packets by NAPT are
not expected to obtain any saving of time by using the flow label, not expected to obtain any saving of time by using the flow label,
because they have no choice but to follow the extension header because they have no choice but to follow the extension header
chain, in order to locate and modify the port number and transport chain in order to locate and modify the port number and transport
checksum. The same would apply to balancers that perform TCP checksum. The same would apply to balancers that perform TCP
state tracking for any reason. state tracking for any reason.
o Note that correct handling of ICMPv6 for Path MTU Discovery o Note that correct handling of ICMPv6 for Path MTU Discovery
requires the layer 3/4 balancer to keep state for the client requires the layer 3/4 balancer to keep state for the client
source address, independently of either the port numbers or the source address, independently of either the port numbers or the
flow label. flow label.
o SSL and HTTP proxies, if present, should forward the flow label o SSL and HTTP proxies, if present, should forward the flow label
value towards the server. This usually has no performance value towards the server. This usually has no performance
benefit, but is consistent with the general RFC 6437 model for the benefit, but it is consistent with the general model for the flow
flow label. label described in RFC 6437.
It should be noted that the performance benefit, if any, depends It should be noted that the performance benefit, if any, depends
entirely on engineering trade-offs in the design of the L3/L4 entirely on engineering trade-offs in the design of the layer 3/4
balancer. An extra test is needed (is the label non-zero?), but if balancer. An extra test is needed to check if the label is non-zero,
there is a non-zero label, all logic for handling extension headers but if there is a non-zero label, all logic for handling extension
can be skipped except for the first packet of a new flow. Since the headers can be skipped except for the first packet of a new flow.
identifying state to be stored is only the tuple and the server Since the identifying state to be stored is only the tuple and the
identifier, storage requirements will be reduced. Additionally, the server identifier, storage requirements will be reduced.
method will work for fragmented traffic and for flows where the Additionally, the method will work for fragmented traffic and for
transport information is missing (unknown transport protocol) or flows where the transport information is missing (unknown transport
obfuscated (e.g., IPsec). Traffic reaching the load balancer via a protocol) or obfuscated (e.g., IPsec). Traffic reaching the load
VPN is particularly prone to the fragmentation issue, due to MTU size balancer via a VPN is particularly prone to the fragmentation issue,
issues. For some load balancer designs, these are very significant due to MTU size issues. For some load-balancer designs, these are
advantages. very significant advantages.
In the unlikely event of two simultaneous flows from the same source In the unlikely event of two simultaneous flows from the same source
address having the same flow label value, the two flows would end up address having the same flow label value, the two flows would end up
assigned to the same server, where they would be distinguished as assigned to the same server, where they would be distinguished as
normal by their port numbers. There are approximately one million normal by their port numbers. There are approximately one million
possible flow label values, and if the rules for flow label possible flow label values, and if the rules for flow label
generation [RFC6437] are followed, this would be a statistically rare generation [RFC6437] are followed, this would be a statistically rare
event, and would not damage the overall load balancing effect. event, and would not damage the overall load-balancing effect.
Moreover, with a million possible label values, it is very likely Moreover, with a million possible label values, it is very likely
that there will be many more flow label values than servers at most that there will be many more flow label values than servers at most
sites, so it is already expected that multiple flow label values will sites, so it is already expected that multiple flow label values will
end up on the same server for a given client IP address. end up on the same server for a given client IP address.
In the case that many thousands of clients are hidden behind the same In the case that many thousands of clients are hidden behind the same
large-scale NAPT (network address and port translator) with a single large-scale NAPT with a single shared IP address, the assumption of
shared IP address, the assumption of low probability of conflicts low probability of conflicts might become incorrect, unless flow
might become incorrect, unless flow label values are random enough to label values are random enough to avoid following similar sequences
avoid following similar sequences for all clients. This is not for all clients. This is not expected to be a factor for IPv6
expected to be a factor for IPv6 anyway, since there is no need to anyway, since there is no need to implement large-scale NAPT with
implement large-scale NAPT with address sharing [RFC4864]. The address sharing [RFC4864]. The probability of conflicts is low for
probability of conflicts is low for sites that implement network sites that implement network prefix translation [RFC6296], since this
prefix translation [RFC6296], since this technique provides a technique provides a different address for each client.
different address for each client.
5. Security Considerations 5. Security Considerations
Security aspects of the flow label are discussed in [RFC6437]. As Security aspects of the flow label are discussed in [RFC6437]. As
noted there, a malicious source or man-in-the-middle could disturb noted there, a malicious source or man-in-the-middle could disturb
load balancing by manipulating flow labels. This risk already exists load balancing by manipulating flow labels. This risk already exists
today where the source address and port are used as hashing key in today where the source address and port are used as a hashing key in
layer 3/4 load balancers, as well as where a persistence cookie is layer 3/4 load balancers, as well as where a persistence cookie is
used in HTTP to designate a server. It even exists on layer 3 used in HTTP to designate a server. It even exists on layer 3
components which only rely on the source address to select a components that only rely on the source address to select a
destination, making them more DDoS-prone. Nevertheless, all these destination, making them more DDoS-prone. Nevertheless, all these
methods are currently used because the benefits for load balancing methods are currently used because the benefits for load balancing
and persistence hugely outweigh the risks. The flow label does not and persistence hugely outweigh the risks. The flow label does not
significantly alter this situation. significantly alter this situation.
Specifically, the standard [RFC6437] states that "stateless Specifically, the IPv6 flow label specification [RFC6437] states that
classifiers should not use the flow label alone to control load "stateless classifiers should not use the flow label alone to control
distribution, and stateful classifiers should include explicit load distribution, and stateful classifiers should include explicit
methods to detect and ignore suspect flow label values." The former methods to detect and ignore suspect flow label values." The former
point is answered by also using the source address. The latter point point is answered by also using the source address. The latter point
is more complex. If the risk is considered serious, the site ingress is more complex. If the risk is considered serious, the site ingress
router or the layer 3/4 balancer should use a suitable heuristic to router or the layer 3/4 balancer should use a suitable heuristic to
verify incoming flows with non-zero flow label values. If a flow verify incoming flows with non-zero flow label values. If a flow
from a given source address and port number does not have a constant from a given source address and port number does not have a constant
flow label value, it is suspect and should be dropped. This would flow label value, it is suspect and should be dropped. This would
deal with both intentional and accidental changes to the flow label. deal with both intentional and accidental changes to the flow label.
A malicious source or man-in-the-middle could generate a flow in A malicious source or man-in-the-middle could generate a flow in
which the flow label is constant but the transport port numbers in which the flow label is constant but the transport port numbers in
some packets are invalid. Such packets, if load-balanced only on the some packets are invalid. Such packets, if load-balanced only on the
basis of the flow label, could reach the target server and create a basis of the flow label, could reach the target server and create a
single-source DOS attack on its TCP engine. single-source DoS attack on its TCP engine.
RFC 6437 notes in its Security Considerations that if the covert RFC 6437 notes in its Security Considerations that if the covert
channel risk is considered significant, a firewall might rewrite non- channel risk is considered significant, a firewall might rewrite non-
zero flow labels. As long as this is done as described in RFC 6437, zero flow labels. As long as this is done as described in RFC 6437,
it will not invalidate the mechanisms described above. it will not invalidate the mechanisms described above.
The flow label may be of use in protecting against distributed denial The flow label may be of use in protecting against DDoS attacks
of service (DDOS) attacks against servers. As noted in RFC 6437, a against servers. As noted in RFC 6437, a source should generate flow
source should generate flow label values that are hard to predict, label values that are hard to predict, most likely by including a
most likely by including a secret nonce in the hash used to generate secret nonce in the hash used to generate each label. The attacker
each label. The attacker does not know the nonce and therefore has does not know the nonce and therefore has no way to invent flow
no way to invent flow labels which will all target the same server, labels that will all target the same server, even with knowledge of
even with knowledge of both the hash algorithm and the load balancing both the hash algorithm and the load-balancing algorithm. Still, it
algorithm. Still, it is important to understand that it is always is important to understand that it is always trivial to force a load
trivial to force a load balancer to stick to the same server during balancer to stick to the same server during an attack, so the
an attack, so the security of the whole solution must not rely on the security of the whole solution must not rely on the unpredictability
unpredicatability of the flow label values alone, but should include of the flow label values alone, but should include defensive measures
defensive measures like most load balancers already have against like most load balancers already have against abnormal use of source
abnormal use of source address or session cookies. addresses or session cookies.
New flows are assigned to a server according to any of the usual New flows are assigned to a server according to any of the usual
algorithms available on the load balancer (e.g., least connections, algorithms available on the load balancer (e.g., least connections,
round robin, etc.). The association between the source address/flow round robin, etc.). The association between the 2-tuple {source
label value and the server is stored in a table (often called stick address, flow label} and the server is stored in a table (often
table) so that future traffic from the same source using the same called stick table) so that future traffic from the same source using
flow label can be sent to the same server. This method is more the same flow label can be sent to the same server. This method is
robust against a loss of server and also makes it harder for an more robust against a loss of server and also makes it harder for an
attacker to target a specific server, because the association between attacker to target a specific server, because the association between
a flow label value and a server is not known externally. a flow label value and a server is not known externally.
In the case that a stateless hash function is used to assign client In the case that a stateless hash function is used to assign client
packets to specific servers, it may be advisable to use a packets to specific servers, it may be advisable to use a
cryptographic hash function of some kind, to ensure that an attacker cryptographic hash function of some kind, to ensure that an attacker
cannot predict the behaviour of the load balancer. cannot predict the behavior of the load balancer.
6. IANA Considerations
This document requests no action by IANA.
7. Acknowledgements 6. Acknowledgements
Valuable comments and contributions were made by Fred Baker, Olivier Valuable comments and contributions were made by Fred Baker, Olivier
Bonaventure, Ben Campbell, Lorenzo Colitti, Linda Dunbar, Donald Bonaventure, Ben Campbell, Lorenzo Colitti, Linda Dunbar, Donald
Eastlake, Joel Jaeggli, Gurudeep Kamat, Warren Kumari, Julia Eastlake, Joel Jaeggli, Gurudeep Kamat, Warren Kumari, Julia
Renouard, Julius Volz, and others. Renouard, Julius Volz, and others.
This document was produced using the xml2rfc tool [RFC2629]. 7. References
8. Change log [RFC Editor: Please remove]
draft-ietf-intarea-flow-label-balancing-03: IESG comments,
2013-11-01.
draft-ietf-intarea-flow-label-balancing-02: Last Call comments,
2013-10-07.
draft-ietf-intarea-flow-label-balancing-01: clarifications based on
WG comments, 2013-05-25.
draft-ietf-intarea-flow-label-balancing-00: WG adoption, minor WG
comments, 2013-01-15.
draft-carpenter-flow-label-balancing-02: updates based on external
review, 2012-12-05.
draft-carpenter-flow-label-balancing-01: update following comments,
2012-06-12.
draft-carpenter-flow-label-balancing-00: restructured after IETF83,
2012-05-08.
draft-carpenter-v6ops-label-balance-02: clarified after WG
discussions, 2012-03-06.
draft-carpenter-v6ops-label-balance-01: updated with community
comments, additional author, 2012-01-17.
draft-carpenter-v6ops-label-balance-00: original version, 2011-10-13.
9. References
9.1. Normative References 7.1. Normative References
[RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 [RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6
(IPv6) Specification", RFC 2460, December 1998. (IPv6) Specification", RFC 2460, December 1998.
[RFC6434] Jankiewicz, E., Loughney, J., and T. Narten, "IPv6 Node [RFC6434] Jankiewicz, E., Loughney, J., and T. Narten, "IPv6 Node
Requirements", RFC 6434, December 2011. Requirements", RFC 6434, December 2011.
[RFC6437] Amante, S., Carpenter, B., Jiang, S., and J. Rajahalme, [RFC6437] Amante, S., Carpenter, B., Jiang, S., and J. Rajahalme,
"IPv6 Flow Label Specification", RFC 6437, November 2011. "IPv6 Flow Label Specification", RFC 6437, November 2011.
9.2. Informative References 7.2. Informative References
[RFC2629] Rose, M., "Writing I-Ds and RFCs using XML", RFC 2629,
June 1999.
[RFC2991] Thaler, D. and C. Hopps, "Multipath Issues in Unicast and [RFC2991] Thaler, D. and C. Hopps, "Multipath Issues in Unicast and
Multicast Next-Hop Selection", RFC 2991, November 2000. Multicast Next-Hop Selection", RFC 2991, November 2000.
[RFC4864] Van de Velde, G., Hain, T., Droms, R., Carpenter, B., and [RFC4864] Van de Velde, G., Hain, T., Droms, R., Carpenter, B., and
E. Klein, "Local Network Protection for IPv6", RFC 4864, E. Klein, "Local Network Protection for IPv6", RFC 4864,
May 2007. May 2007.
[RFC6294] Hu, Q. and B. Carpenter, "Survey of Proposed Use Cases for [RFC6294] Hu, Q. and B. Carpenter, "Survey of Proposed Use Cases for
the IPv6 Flow Label", RFC 6294, June 2011. the IPv6 Flow Label", RFC 6294, June 2011.
skipping to change at page 13, line 43 skipping to change at page 13, line 14
Authors' Addresses Authors' Addresses
Brian Carpenter Brian Carpenter
Department of Computer Science Department of Computer Science
University of Auckland University of Auckland
PB 92019 PB 92019
Auckland 1142 Auckland 1142
New Zealand New Zealand
Email: brian.e.carpenter@gmail.com EMail: brian.e.carpenter@gmail.com
Sheng Jiang Sheng Jiang
Huawei Technologies Co., Ltd Huawei Technologies Co., Ltd
Q14, Huawei Campus Q14, Huawei Campus
No.156 Beiqing Road No.156 Beiqing Road
Hai-Dian District, Beijing 100095 Hai-Dian District, Beijing 100095
P.R. China P.R. China
Email: jiangsheng@huawei.com EMail: jiangsheng@huawei.com
Willy Tarreau Willy Tarreau
HAProxy, Inc. HAProxy Technologies, Inc.
R&D Network Products R&D Network Products
3 rue du petit Robinson 3 rue du petit Robinson
78350 Jouy-en-Josas 78350 Jouy-en-Josas
France France
Email: w@1wt.eu EMail: willy@haproxy.com
 End of changes. 66 change blocks. 
231 lines changed or deleted 186 lines changed or added

This html diff was produced by rfcdiff 1.41. The latest version is available from http://tools.ietf.org/tools/rfcdiff/