draft-ietf-dime-overload-reqs-01.txt   draft-ietf-dime-overload-reqs-02.txt 
Network Working Group E. McMurry Network Working Group E. McMurry
Internet-Draft B. Campbell Internet-Draft B. Campbell
Intended status: Standards Track Tekelec Intended status: Standards Track Tekelec
Expires: May 16, 2013 November 12, 2012 Expires: June 20, 2013 December 17, 2012
Diameter Overload Control Requirements Diameter Overload Control Requirements
draft-ietf-dime-overload-reqs-01 draft-ietf-dime-overload-reqs-02
Abstract Abstract
When a Diameter server or agent becomes overloaded, it needs to be When a Diameter server or agent becomes overloaded, it needs to be
able to gracefully reduce its load, typically by informing clients to able to gracefully reduce its load, typically by informing clients to
reduce sending traffic for some period of time. Otherwise, it must reduce sending traffic for some period of time. Otherwise, it must
continue to expend resources parsing and responding to Diameter continue to expend resources parsing and responding to Diameter
messages, possibly resulting in congestion collapse. The existing messages, possibly resulting in congestion collapse. The existing
mechanisms provided by Diameter are not sufficient for this purpose. mechanisms provided by Diameter are not sufficient for this purpose.
This document describes the limitations of the existing mechanisms, This document describes the limitations of the existing mechanisms,
skipping to change at page 1, line 37 skipping to change at page 1, line 37
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on May 16, 2013. This Internet-Draft will expire on June 20, 2013.
Copyright Notice Copyright Notice
Copyright (c) 2012 IETF Trust and the persons identified as the Copyright (c) 2012 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 2, line 29 skipping to change at page 2, line 29
2.3. Interconnect Scenario . . . . . . . . . . . . . . . . . . 12 2.3. Interconnect Scenario . . . . . . . . . . . . . . . . . . 12
3. Extensibility . . . . . . . . . . . . . . . . . . . . . . . . 13 3. Extensibility . . . . . . . . . . . . . . . . . . . . . . . . 13
4. Existing Mechanisms . . . . . . . . . . . . . . . . . . . . . 14 4. Existing Mechanisms . . . . . . . . . . . . . . . . . . . . . 14
5. Issues with the Current Mechanisms . . . . . . . . . . . . . . 14 5. Issues with the Current Mechanisms . . . . . . . . . . . . . . 14
5.1. Problems with Implicit Mechanism . . . . . . . . . . . . . 15 5.1. Problems with Implicit Mechanism . . . . . . . . . . . . . 15
5.2. Problems with Explicit Mechanisms . . . . . . . . . . . . 15 5.2. Problems with Explicit Mechanisms . . . . . . . . . . . . 15
6. Diameter Overload Case Studies . . . . . . . . . . . . . . . . 16 6. Diameter Overload Case Studies . . . . . . . . . . . . . . . . 16
6.1. Overload in Mobile Data Networks . . . . . . . . . . . . . 16 6.1. Overload in Mobile Data Networks . . . . . . . . . . . . . 16
6.2. 3GPP Study on Core Network Overload . . . . . . . . . . . 17 6.2. 3GPP Study on Core Network Overload . . . . . . . . . . . 17
7. Solution Requirements . . . . . . . . . . . . . . . . . . . . 18 7. Solution Requirements . . . . . . . . . . . . . . . . . . . . 18
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 22 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 23
9. Security Considerations . . . . . . . . . . . . . . . . . . . 22 9. Security Considerations . . . . . . . . . . . . . . . . . . . 23
9.1. Access Control . . . . . . . . . . . . . . . . . . . . . . 23 9.1. Access Control . . . . . . . . . . . . . . . . . . . . . . 24
9.2. Denial-of-Service Attacks . . . . . . . . . . . . . . . . 23 9.2. Denial-of-Service Attacks . . . . . . . . . . . . . . . . 24
9.3. Replay Attacks . . . . . . . . . . . . . . . . . . . . . . 23 9.3. Replay Attacks . . . . . . . . . . . . . . . . . . . . . . 24
9.4. Man-in-the-Middle Attacks . . . . . . . . . . . . . . . . 24 9.4. Man-in-the-Middle Attacks . . . . . . . . . . . . . . . . 25
9.5. Compromised Hosts . . . . . . . . . . . . . . . . . . . . 24 9.5. Compromised Hosts . . . . . . . . . . . . . . . . . . . . 25
10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 24 10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 25
10.1. Normative References . . . . . . . . . . . . . . . . . . . 24 10.1. Normative References . . . . . . . . . . . . . . . . . . . 25
10.2. Informative References . . . . . . . . . . . . . . . . . . 25 10.2. Informative References . . . . . . . . . . . . . . . . . . 26
Appendix A. Contributors . . . . . . . . . . . . . . . . . . . . 25 Appendix A. Contributors . . . . . . . . . . . . . . . . . . . . 26
Appendix B. Acknowledgements . . . . . . . . . . . . . . . . . . 25 Appendix B. Acknowledgements . . . . . . . . . . . . . . . . . . 26
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 25 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 27
1. Introduction 1. Introduction
When a Diameter [I-D.ietf-dime-rfc3588bis] server or agent becomes When a Diameter [RFC6733] server or agent becomes overloaded, it
overloaded, it needs to be able to gracefully reduce its load, needs to be able to gracefully reduce its load, typically by
typically by informing clients to reduce sending traffic for some informing clients to reduce sending traffic for some period of time.
period of time. Otherwise, it must continue to expend resources Otherwise, it must continue to expend resources parsing and
parsing and responding to Diameter messages, possibly resulting in responding to Diameter messages, possibly resulting in congestion
congestion collapse. The existing mechanisms provided by Diameter collapse. The existing mechanisms provided by Diameter are not
are not sufficient for this purpose. This document describes the sufficient for this purpose. This document describes the limitations
limitations of the existing mechanisms, and provides requirements for of the existing mechanisms, and provides requirements for new
new overload management mechanisms. overload management mechanisms.
This document draws on [RFC5390] and the work done on SIP overload This document draws on [RFC5390] and the work done on SIP overload
control as well as on overload practices in SS7 networks and studies control as well as on overload practices in SS7 networks and studies
done by 3GPP. done by 3GPP.
Diameter is not typically an end-user protocol; rather it is Diameter is not typically an end-user protocol; rather it is
generally used as one component in support of some end-user activity. generally used as one component in support of some end-user activity.
For example, a WiFi access point might use Diameter to authenticate For example, a WiFi access point might use Diameter to authenticate
and authorize user access via 802.11. Overload in a network that and authorize user access via 802.11. Overload in a network that
uses Diameter applications will likely spill over into the end-user uses Diameter applications will likely spill over into the end-user
skipping to change at page 6, line 29 skipping to change at page 6, line 29
protocols other than Diameter is out of scope for this document, and protocols other than Diameter is out of scope for this document, and
for the work proposed by this document. for the work proposed by this document.
1.5. Documentation Conventions 1.5. Documentation Conventions
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119]. document are to be interpreted as described in [RFC2119].
The terms "client", "server", "agent", "node", "peer", "upstream", The terms "client", "server", "agent", "node", "peer", "upstream",
and "downstream" are used as defined in [I-D.ietf-dime-rfc3588bis]. and "downstream" are used as defined in [RFC6733].
2. Overload Scenarios 2. Overload Scenarios
Several Diameter deployment scenarios exist that may impact overload Several Diameter deployment scenarios exist that may impact overload
management. The following scenarios help motivate the requirements management. The following scenarios help motivate the requirements
for an overload management mechanism. for an overload management mechanism.
These scenarios are by no means exhaustive, and are in general These scenarios are by no means exhaustive, and are in general
simplified for the sake of clarity. In particular, the authors simplified for the sake of clarity. In particular, the authors
assume for the sake of clarity that the client sends Diameter assume for the sake of clarity that the client sends Diameter
skipping to change at page 12, line 43 skipping to change at page 12, line 43
| Client | | Client |
| | | |
+------------------+ +------------------+
Figure 6: Multiple Application Agent Scenario Figure 6: Multiple Application Agent Scenario
2.3. Interconnect Scenario 2.3. Interconnect Scenario
Another scenario to consider when looking at Diameter overload is Another scenario to consider when looking at Diameter overload is
that of multiple network operators using Diameter components that of multiple network operators using Diameter components
connected through an interconnect service, e.g. using IPX. Figure 7 connected through an interconnect service, e.g. using IPX. IPX (IP
shows two network operators with an interconnect network in-between. eXchange) [IR.34] is an Inter-Operator IP Backbone that provides
There could be any number of these networks between any two network a roaming interconnection network between mobile operators and service
operators' networks. providers. The IPX is also used to transport Diameter signaling
between operators [IR.88]. Figure 7 shows two network operators with
an interconnect network in-between. There could be any number of
these networks between any two network operators' networks.
+-------------------------------------------+ +-------------------------------------------+
| Interconnect | | Interconnect |
| | | |
| +--------------+ +--------------+ | | +--------------+ +--------------+ |
| | Server 3 |------| Server 4 | | | | Server 3 |------| Server 4 | |
| +--------------+ +--------------+ | | +--------------+ +--------------+ |
| .' `. | | .' `. |
+------.-'--------------------------`.------+ +------.-'--------------------------`.------+
.' `. .' `.
skipping to change at page 14, line 16 skipping to change at page 14, line 16
Diameter offers both implicit and explicit mechanisms for a Diameter Diameter offers both implicit and explicit mechanisms for a Diameter
node to learn that a peer is overloaded or unreachable. The implicit node to learn that a peer is overloaded or unreachable. The implicit
mechanism is simply the lack of responses to requests. If a client mechanism is simply the lack of responses to requests. If a client
fails to receive a response in a certain time period, it assumes the fails to receive a response in a certain time period, it assumes the
upstream peer is unavailable, or overloaded to the point of effective upstream peer is unavailable, or overloaded to the point of effective
unavailability. The watchdog mechanism [RFC3539] ensures that a unavailability. The watchdog mechanism [RFC3539] ensures that a
certain rate of transaction responses occurs even when there is certain rate of transaction responses occurs even when there is
otherwise little or no other Diameter traffic. otherwise little or no other Diameter traffic.
The explicit mechanism involves specific protocol error responses, The explicit mechanism can involve specific protocol error responses,
where an agent or server can tell a downstream peer that it is either where an agent or server tells a downstream peer that it is either
too busy to handle a request (DIAMETER_TOO_BUSY) or unable to route a too busy to handle a request (DIAMETER_TOO_BUSY) or unable to route a
request to an upstream destination (DIAMETER_UNABLE_TO_DELIVER), request to an upstream destination (DIAMETER_UNABLE_TO_DELIVER),
perhaps because that destination itself is overloaded to the point of perhaps because that destination itself is overloaded to the point of
unavailability. unavailability.
Another explicit mechanism, a DPR (Disconnect-Peer-Request) message,
can be sent with a Disconnect-Cause of BUSY. This signals the
sender's intent to close the transport connection, and requests the
client not to reconnect.
Once a Diameter node learns that an upstream peer has become Once a Diameter node learns that an upstream peer has become
overloaded via one of these mechanisms, it can then attempt to take overloaded via one of these mechanisms, it can then attempt to take
action to reduce the load. This usually means forwarding traffic to action to reduce the load. This usually means forwarding traffic to
an alternate destination, if available. If no alternate destination an alternate destination, if available. If no alternate destination
is available, the node must either reduce the number of messages it is available, the node must either reduce the number of messages it
originates (in the case of a client) or inform the client to reduce originates (in the case of a client) or inform the client to reduce
traffic (in the case of an agent.) traffic (in the case of an agent.)
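As an illustration only (it appears in neither draft revision), the reaction described above can be sketched in Python. The Result-Code values 3002 and 3004 and the Disconnect-Cause value 1 (BUSY) come from RFC 6733; the `Peer` type, the function names, and the returned action strings are hypothetical, and treating every timeout or DIAMETER_UNABLE_TO_DELIVER answer as a sign of overload is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import List, Optional

# Values defined in RFC 6733.
DIAMETER_UNABLE_TO_DELIVER = 3002
DIAMETER_TOO_BUSY = 3004
DISCONNECT_CAUSE_BUSY = 1


@dataclass
class Peer:
    """Hypothetical record of an upstream peer and its last known state."""
    name: str
    overloaded: bool = False


def suggests_overload(result_code: Optional[int] = None,
                      disconnect_cause: Optional[int] = None,
                      timed_out: bool = False) -> bool:
    """Return True if an existing implicit or explicit signal is present.

    Assumption: a timeout or DIAMETER_UNABLE_TO_DELIVER is treated as
    overload here, although either may have another cause.
    """
    if timed_out:  # implicit mechanism: no answer in the expected time
        return True
    if result_code in (DIAMETER_TOO_BUSY, DIAMETER_UNABLE_TO_DELIVER):
        return True
    return disconnect_cause == DISCONNECT_CAUSE_BUSY


def react(role: str, peers: List[Peer]) -> str:
    """Pick an action once upstream overload has been learned."""
    for peer in peers:
        if not peer.overloaded:
            # Preferred option: an alternate destination is available.
            return f"forward to {peer.name}"
    if role == "client":
        return "reduce originated requests"
    return "tell downstream clients to reduce traffic"
```

For example, `react("client", [Peer("S1", overloaded=True), Peer("S2")])` selects the alternate peer, while an agent whose upstream peers are all overloaded falls back to asking its clients to reduce traffic.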
Diameter requires the use of a congestion-managed transport layer, Diameter requires the use of a congestion-managed transport layer,
currently TCP or SCTP, to mitigate network congestion. It is currently TCP or SCTP, to mitigate network congestion. It is
skipping to change at page 15, line 29 skipping to change at page 15, line 34
shed load early to avoid collapse in the first place. shed load early to avoid collapse in the first place.
Additionally, the implicit mechanism cannot distinguish between Additionally, the implicit mechanism cannot distinguish between
overload of a Diameter node and network congestion. Diameter treats overload of a Diameter node and network congestion. Diameter treats
the failure to receive an answer as a transport failure. the failure to receive an answer as a transport failure.
5.2. Problems with Explicit Mechanisms 5.2. Problems with Explicit Mechanisms
The Diameter specification is ambiguous on how a client should handle The Diameter specification is ambiguous on how a client should handle
receipt of a DIAMETER_TOO_BUSY response. The base specification receipt of a DIAMETER_TOO_BUSY response. The base specification
[I-D.ietf-dime-rfc3588bis] indicates that the sending client should [RFC6733] indicates that the sending client should attempt to send
attempt to send the request to a different peer. It makes no the request to a different peer. It makes no suggestion that the
suggestion that the receipt of a DIAMETER_TOO_BUSY response should receipt of a DIAMETER_TOO_BUSY response should affect future Diameter
affect future Diameter messages in any way. messages in any way.
The Authentication, Authorization, and Accounting (AAA) Transport The Authentication, Authorization, and Accounting (AAA) Transport
Profile [RFC3539] recommends that a AAA node that receives a "Busy" Profile [RFC3539] recommends that a AAA node that receives a "Busy"
response failover all remaining requests to a different agent or response failover all remaining requests to a different agent or
server. But while the Diameter base specification explicitly depends server. But while the Diameter base specification explicitly depends
on RFC3539 to define transport behavior, it does not refer to RFC3539 on RFC3539 to define transport behavior, it does not refer to RFC3539
in the description of behavior on receipt of DIAMETER_TOO_BUSY. in the description of behavior on receipt of DIAMETER_TOO_BUSY.
There's a strong likelihood that at least some implementations will There's a strong likelihood that at least some implementations will
continue to send Diameter requests to an upstream peer even after continue to send Diameter requests to an upstream peer even after
receiving a DIAMETER_TOO_BUSY error. receiving a DIAMETER_TOO_BUSY error.
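The two readings described above can be contrasted with a small Python sketch. It is an illustration under stated assumptions, not behavior defined by either specification: the function names and the request and peer labels are hypothetical, and the first entry of `pending` is assumed to be the request that drew the DIAMETER_TOO_BUSY answer.

```python
from typing import Dict, List


def route_per_request(pending: List[str], busy_peer: str,
                      alternate: str) -> Dict[str, str]:
    """Base-specification reading: only the rejected request is resent
    to a different peer; later requests still go to the busy peer."""
    routes = {req: busy_peer for req in pending}
    if pending:
        routes[pending[0]] = alternate  # the request that was rejected
    return routes


def route_failover_all(pending: List[str], busy_peer: str,
                       alternate: str) -> Dict[str, str]:
    """RFC 3539 reading: all remaining requests fail over."""
    return {req: alternate for req in pending}
```

With three pending requests, the first function keeps two of them on the busy peer, which is the behavior the paragraph above warns about; the second moves all three.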
skipping to change at page 16, line 25 skipping to change at page 16, line 30
as described in the base specification. DIAMETER_TOO_BUSY is defined as described in the base specification. DIAMETER_TOO_BUSY is defined
as a protocol error. If an agent receives a protocol error, it may as a protocol error. If an agent receives a protocol error, it may
either handle it locally or it may forward the response back towards either handle it locally or it may forward the response back towards
the downstream peer. (The Diameter specification is inconsistent the downstream peer. (The Diameter specification is inconsistent
about whether a protocol error MAY or SHOULD be handled by an agent, about whether a protocol error MAY or SHOULD be handled by an agent,
rather than forwarded downstream.) If a downstream peer receives the rather than forwarded downstream.) If a downstream peer receives the
DIAMETER_TOO_BUSY response, it may stop sending all requests to the DIAMETER_TOO_BUSY response, it may stop sending all requests to the
agent for some period of time, even though the agent may still be agent for some period of time, even though the agent may still be
able to deliver requests to other upstream peers. able to deliver requests to other upstream peers.
DIAMETER_UNABLE_TO_DELIVER also has no mechanisms for specifying the DIAMETER_UNABLE_TO_DELIVER and DPR with cause code BUSY also
scope or cause of the failure, or the durational validity. have no mechanisms for specifying the scope or cause of the failure,
or the durational validity.
The issues with error responses in [RFC6733] extend beyond the
particular issues for overload control and have been addressed in an
ad hoc fashion by various implementations. Addressing these in a
standard way would be a useful exercise, but it is beyond the scope
of this document.
6. Diameter Overload Case Studies 6. Diameter Overload Case Studies
6.1. Overload in Mobile Data Networks 6.1. Overload in Mobile Data Networks
As the number of Third Generation (3G) and Long Term Evolution (LTE) As the number of Third Generation (3G) and Long Term Evolution (LTE)
enabled smartphone devices continues to expand in mobility networks, enabled smartphone devices continues to expand in mobility networks,
there have been situations where high signaling traffic load led to there have been situations where high signaling traffic load led to
overload events at the Diameter-based Home Location Registries (HLR) overload events at the Diameter-based Home Location Registries (HLR)
and/or Home Subscriber Servers (HSS). The root causes of the HLR and/or Home Subscriber Servers (HSS) [TR23.843]. The root causes of
congestion events were manifold but included hardware failure and the HLR congestion events were manifold but included hardware failure
procedural errors. The result was high signaling traffic load on the and procedural errors. The result was high signaling traffic load on
HLR and HSS. the HLR and HSS.
The 3GPP standards specification[need citation] for the end-to-end The 3GPP architecture [TS23.002] makes extensive use of Diameter. It
signaling call flows in 3G and LTE, from the end user device is used for mobility management [TS29.272] (and others), IMS
traversing through the radio and the core networks to the HLR/HSS, [TS29.228] (and others), policy and charging control [TS29.212] (and
did not have an equivalent load control mechanism to those provided others) as well as other functions. The details of the architecture
in the more traditional SS7 elements in GSM [need citation]. The are out of scope for this document, but it is worth noting that there
capabilities specified in the 3GPP standards do not adequately are quite a few Diameter applications, some with quite large amounts
address the abnormal condition where excessively high signaling of Diameter signaling in deployed networks.
traffic load situations are experienced.
The 3GPP specifications do not currently address overload for
Diameter applications or provide an equivalent load control mechanism
to those provided in the more traditional SS7 elements in GSM
[TS29.002]. The capabilities specified in the 3GPP standards do not
adequately address the abnormal condition where excessively high
signaling traffic load situations are experienced.
Smartphones contribute much more heavily, relative to non- Smartphones contribute much more heavily, relative to non-
smartphones, to the continuation of a registration surge due to their smartphones, to the continuation of a registration surge due to their
very aggressive registration algorithms. The aggressive smartphone very aggressive registration algorithms. The aggressive smartphone
logic is designed to: logic is designed to:
a. always have voice and data registration, and a. always have voice and data registration, and
b. constantly try to be on 3G or LTE data (and thus on 3G voice or b. constantly try to be on 3G or LTE data (and thus on 3G voice or
VoLTE) for their added benefits. VoLTE) for their added benefits.
skipping to change at page 18, line 18 skipping to change at page 18, line 37
of this document is to provide guidance for a core mechanism that can of this document is to provide guidance for a core mechanism that can
be used to mitigate the scenarios called out by this study. be used to mitigate the scenarios called out by this study.
7. Solution Requirements 7. Solution Requirements
This section proposes requirements for an improved mechanism to This section proposes requirements for an improved mechanism to
control Diameter overload, with the goals of addressing the issues control Diameter overload, with the goals of addressing the issues
described in Section 5 and supporting the scenarios described in described in Section 5 and supporting the scenarios described in
Section 2. Section 2.
REQ 1: The overload mechanism MUST provide a communication method REQ 1: The overload control mechanism MUST provide a communication
for Diameter nodes to exchange overload information. method for Diameter nodes to exchange load and overload
information.
REQ 2: The overload mechanism MUST be useable with any existing or REQ 2: [Open Issue: The following requirement has generated list
future Diameter application. It MUST NOT require discussion that is unresolved at the time of this writing.
specification changes for existing Diameter applications. The discussion concerns whether this requirement is needed
at all, whether it should include the "MUST NOT require
specification changes" language vs saying that it should not
force changes large enough to require new application IDs,
and whether we should include additional language to forbid
assumptions about the behavior of specific implementations.]
The overload control mechanism MUST be useable with any
existing or future Diameter application. It MUST NOT
require specification changes for existing Diameter
applications.
REQ 3: The overload mechanism MUST limit the impact of overload on REQ 3: The overload control mechanism MUST limit the impact of
the overall useful throughput of a Diameter server, even overload on the overall useful throughput of a Diameter
when the incoming load on the network is far in excess of server, even when the incoming load on the network is far in
its capacity. The overall useful throughput under load is excess of its capacity. The overall useful throughput under
the ultimate measure of the value of an overload control load is the ultimate measure of the value of an overload
mechanism. control mechanism.
REQ 4: Diameter allows requests to be sent from either side of a REQ 4: Diameter allows requests to be sent from either side of a
connection and either side of a connection may have need to connection and either side of a connection may have need to
provide its overload status. The mechanism MUST allow each provide its overload status. The mechanism MUST allow each
side of a connection to independently inform the other of side of a connection to independently inform the other of
its overload status. its overload status.
REQ 5: Diameter allows nodes to determine their peers via dynamic REQ 5: Diameter allows nodes to determine their peers via dynamic
discovery or manual configuration. The mechanism MUST work discovery or manual configuration. The mechanism MUST work
consistently without regard to how peers are determined. consistently without regard to how peers are determined.
REQ 6: The mechanism designers SHOULD seek to minimize the amount REQ 6: The mechanism designers SHOULD seek to minimize the amount
of new configuration required in order to work. For of new configuration required in order to work. For
example, it is better to allow peers to advertise or example, it is better to allow peers to advertise or
negotiate support for the mechanism, rather than to require negotiate support for the mechanism, rather than to require
this knowledge to be configured at each node. this knowledge to be configured at each node.
REQ 7: The overload mechanism MUST ensure that the system remains REQ 7: The overload control mechanism and any associated default
stable. When the offered load drops from above the overall algorithm(s) MUST ensure that the system remains stable.
capacity of the network to below the overall capacity, the When the offered load drops from above the overall capacity
throughput MUST stabilize and become equal to the offered of the network to below the overall capacity, the throughput
load. Note that this also requires that the mechanism MUST MUST stabilize and become equal to the offered load. Note
allow nodes to shed load without introducing oscillations. that this also requires that the mechanism MUST allow nodes
to shed load without introducing oscillations.
REQ 8: Supporting nodes MUST be able to distinguish current REQ 8: Supporting nodes MUST be able to distinguish current
overload information from stale information, and SHOULD make overload information from stale information, and SHOULD make
decisions using the most currently available information. decisions using the most currently available information.
REQ 9: The mechanism MUST function across fully loaded as well as REQ 9: The mechanism MUST function across fully loaded as well as
quiescent transport connections. This is partially derived quiescent transport connections. This is partially derived
from the requirements for stability and hysteresis control from the requirements for stability and hysteresis control
above. above.
REQ 10: Consumers of overload state indications MUST be able to REQ 10: Consumers of overload state indications MUST be able to
determine when the overload condition improves or ends. determine when the overload condition improves or ends.
REQ 11: The overload mechanism MUST be scalable. That is, it MUST REQ 11: The overload control mechanism MUST be scalable. That is,
be able to operate in different sized networks. it MUST be able to operate in different sized networks.
REQ 12: When a single network node fails, goes into overload, or REQ 12: When a single network node fails, goes into overload, or
suffers from reduced processing capacity, the mechanism MUST suffers from reduced processing capacity, the mechanism MUST
make it possible to limit the impact of this on other nodes make it possible to limit the impact of this on other nodes
in the network. This helps to prevent a small-scale failure in the network. This helps to prevent a small-scale failure
from becoming a widespread outage. from becoming a widespread outage.
REQ 13: The mechanism MUST NOT introduce substantial additional work REQ 13: The mechanism MUST NOT introduce substantial additional work
for a node in an overloaded state. For example, a requirement for a node in an overloaded state. For example, a requirement
for an overloaded node to send overload information every for an overloaded node to send overload information every
skipping to change at page 20, line 17 skipping to change at page 20, line 46
not, support the mechanism. not, support the mechanism.
REQ 17: In a mixed environment with nodes that support the overload REQ 17: In a mixed environment with nodes that support the overload
control mechanism and that do not, the mechanism MUST result control mechanism and that do not, the mechanism MUST result
in at least as much useful throughput as would have resulted in at least as much useful throughput as would have resulted
if the mechanism were not present. It SHOULD result in less if the mechanism were not present. It SHOULD result in less
severe congestion in this environment. severe congestion in this environment.
REQ 18: In a mixed environment of nodes that support the overload REQ 18: In a mixed environment of nodes that support the overload
control mechanism and that do not, users and operators of control mechanism and that do not, users and operators of
nodes that do not support the mechanism MUST NOT benefit nodes that do not support the mechanism MUST NOT unfairly
from the mechanism more than users and operators of nodes benefit from the mechanism.
that support the mechanism.
REQ 19: It MUST be possible to use the mechanism between nodes in REQ 19: It MUST be possible to use the mechanism between nodes in
different realms and in different administrative domains. different realms and in different administrative domains.
REQ 20: Any explicit overload indication MUST distinguish between REQ 20: Any explicit overload indication MUST distinguish between
actual overload and other, non-overload-related actual overload and other, non-overload-related
failures. failures.
REQ 21: In cases where a network node fails, is so overloaded that REQ 21: In cases where a network node fails, is so overloaded that
it cannot process messages, or cannot communicate due to a it cannot process messages, or cannot communicate due to a
skipping to change at page 21, line 16 skipping to change at page 21, line 45
a load balancing node to divert messages that are rejected a load balancing node to divert messages that are rejected
or otherwise throttled by an overloaded upstream node to or otherwise throttled by an overloaded upstream node to
other upstream nodes that are the most likely to have other upstream nodes that are the most likely to have
sufficient capacity to process them. sufficient capacity to process them.
REQ 25: The mechanism MUST provide a mechanism for indicating load REQ 25: The mechanism MUST provide a mechanism for indicating load
levels even when not in an overloaded condition, to assist levels even when not in an overloaded condition, to assist
nodes making decisions to prevent overload conditions from nodes making decisions to prevent overload conditions from
occurring. occurring.
REQ 26: The specification for the overload mechanism SHOULD offer REQ 26: The base specification for the overload control mechanism
guidance on which message types might be desirable to send SHOULD offer general guidance on which message types might
or process over others during times of overload, based on be desirable to send or process over others during times of
Diameter-specific considerations. For example, it may be overload, based on application-specific considerations. For
more beneficial to process messages for existing sessions example, it may be more beneficial to process messages for
ahead of new sessions. existing sessions ahead of new sessions, or to give priority
to requests associated with emergency sessions or with high
priority users. Any normative or otherwise detailed
definition of the relative priorities of message types
during an overload condition will be the responsibility of
the application specification.
REQ 27: The mechanism MUST NOT prevent a node from prioritizing REQ 27: The mechanism MUST NOT prevent a node from prioritizing
requests based on any local policy, so that certain requests requests based on any local policy, so that certain requests
are given preferential treatment, given additional are given preferential treatment, given additional
retransmission, not throttled, or processed ahead of others. retransmission, not throttled, or processed ahead of others.
REQ 28: The overload mechanism MUST NOT provide new vulnerabilities REQ 28: The overload control mechanism MUST NOT provide new
to malicious attack, or increase the severity of any vulnerabilities to malicious attack, or increase the
existing vulnerabilities. This includes vulnerabilities to severity of any existing vulnerabilities. This includes
DoS and DDoS attacks as well as replay and man-in-the-middle vulnerabilities to DoS and DDoS attacks as well as replay
attacks. and man-in-the-middle attacks. Note that the Diameter base
specification [RFC6733] lacks end-to-end security and this
must be considered.
REQ 29: The mechanism MUST provide a means to match an overload REQ 29: The mechanism MUST provide a means to match an overload
indication with the node that originated it. In particular, indication with the node that originated it. In particular,
the mechanism MUST allow a node to distinguish between the mechanism MUST allow a node to distinguish between
overload at a next-hop peer from overload at a node upstream overload at a next-hop peer from overload at a node upstream
of the peer. For example, in Figure 5, the client must not of the peer. For example, in Figure 5, the client must not
mistake overload at server 1 for overload at the agent, mistake overload at server 1 for overload at the agent,
whether or not the agent supports the mechanism (see REQ whether or not the agent supports the mechanism (see REQ
4). 4).
skipping to change at page 22, line 11 skipping to change at page 22, line 45
share of service. Note that this does not imply any share of service. Note that this does not imply any
responsibility on the mechanism to detect, or take responsibility on the mechanism to detect, or take
countermeasures against, malicious nodes. countermeasures against, malicious nodes.
REQ 31: It MUST be possible for a supporting node to make REQ 31: It MUST be possible for a supporting node to make
authorization decisions about what information will be sent authorization decisions about what information will be sent
to peer nodes based on the identity of those nodes. This to peer nodes based on the identity of those nodes. This
allows a domain administrator who considers the load of allows a domain administrator who considers the load of
their nodes to be sensitive information to restrict access their nodes to be sensitive information to restrict access
to that information. Of course, in such cases, there is no to that information. Of course, in such cases, there is no
expectation that the overload mechanism itself will help expectation that the overload control mechanism itself will
prevent overload from that peer node. help prevent overload from that peer node.
REQ 32: The mechanism MUST NOT interfere with any Diameter compliant REQ 32: The mechanism MUST NOT interfere with any Diameter compliant
method that a node may use to protect itself from overload method that a node may use to protect itself from overload
from non-supporting nodes, or from denial of service from non-supporting nodes, or from denial of service
attacks. attacks.
REQ 33: There are multiple situations where a Diameter node may be REQ 33: There are multiple situations where a Diameter node may be
overloaded for some purposes but not others. For example, overloaded for some purposes but not others. For example,
this can happen to an agent or server that supports multiple this can happen to an agent or server that supports multiple
applications, or when a server depends on multiple external applications, or when a server depends on multiple external
skipping to change at page 22, line 41 skipping to change at page 23, line 32
SHOULD allow extensibility for others to be added in the SHOULD allow extensibility for others to be added in the
future. future.
REQ 34: The mechanism MUST provide a method for extending the REQ 34: The mechanism MUST provide a method for extending the
information communicated and the algorithms used for information communicated and the algorithms used for
overload control. overload control.
REQ 35: The mechanism SHOULD provide a method for exchanging REQ 35: The mechanism SHOULD provide a method for exchanging
overload and load information between elements that are overload and load information between elements that are
connected by intermediaries that do not support the connected by intermediaries that do not support the
mechanism. A separate mechanism or extension of the mechanism.
mechanism to support this may be warranted for this.
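As an illustration only, the following Python sketch shows one way a load balancing node might combine several of the requirements above: it uses load levels reported outside of overload (REQ 25), treats overload as an explicit indication separate from other failures (REQ 20), ties each report to the node that originated it (REQ 29), and diverts a throttled message to the upstream node most likely to have capacity. The names UpstreamReport and pick_upstream, the 0.0 to 1.0 load scale, and the lowest-load selection rule are assumptions made for this sketch; the requirements do not define any data format or algorithm.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class UpstreamReport:
    """Hypothetical per-node report; not a format defined by the draft."""
    node_id: str      # identity of the node that originated the report (REQ 29)
    load: float       # assumed load level in [0.0, 1.0], sent even when not overloaded (REQ 25)
    overloaded: bool  # explicit overload indication, distinct from other failures (REQ 20)


def pick_upstream(reports: Iterable[UpstreamReport],
                  rejected_by: str) -> Optional[str]:
    """Return the node to which a throttled message could be diverted.

    Nodes that report overload, and the node that rejected the message,
    are excluded. Among the rest, the node with the lowest reported load
    is assumed to be the most likely to have sufficient capacity.
    Returns None when no suitable upstream node is known.
    """
    candidates = [
        r for r in reports
        if not r.overloaded and r.node_id != rejected_by
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r.load).node_id


# Example: server1 rejected the request because it is overloaded.
reports = [
    UpstreamReport("server1", 0.95, True),
    UpstreamReport("server2", 0.40, False),
    UpstreamReport("server3", 0.70, False),
]
target = pick_upstream(reports, rejected_by="server1")
```

A real implementation would also need to respect local policy (REQ 27) and authorization of load information (REQ 31), which this sketch leaves out.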
8. IANA Considerations 8. IANA Considerations
This document makes no requests of IANA. This document makes no requests of IANA.
9. Security Considerations 9. Security Considerations
A Diameter overload control mechanism is primarily concerned with the A Diameter overload control mechanism is primarily concerned with the
load and overload related behavior of nodes in a Diameter network, load and overload related behavior of nodes in a Diameter network,
and the information used to affect that behavior. Load and overload and the information used to affect that behavior. Load and overload
skipping to change at page 23, line 17 skipping to change at page 24, line 7
and thus is potentially vulnerable to a number of methods of attack. and thus is potentially vulnerable to a number of methods of attack.
Load and overload information may also be sensitive from both Load and overload information may also be sensitive from both
business and network protection viewpoints. Operators of Diameter business and network protection viewpoints. Operators of Diameter
equipment want to control visibility to load and overload information equipment want to control visibility to load and overload information
to keep it from being used for competitive intelligence or for to keep it from being used for competitive intelligence or for
targeting attacks. It is also important that the Diameter overload targeting attacks. It is also important that the Diameter overload
control mechanism not introduce any way in which any other control mechanism not introduce any way in which any other
information carried by Diameter is sent inappropriately. information carried by Diameter is sent inappropriately.
Note that the Diameter base specification [RFC6733] lacks end-to-end
security, which makes it difficult for non-adjacent nodes to verify
the authenticity and ownership of load and overload information.
Authentication of load and overload information helps to alleviate
several of the security issues listed in this section.
This document includes requirements intended to mitigate the effects This document includes requirements intended to mitigate the effects
of attacks and to protect the information used by the mechanism. of attacks and to protect the information used by the mechanism.
9.1. Access Control 9.1. Access Control
To control the visibility of load and overload information, sending To control the visibility of load and overload information, sending
should be subject to some form of authentication and authorization of should be subject to some form of authentication and authorization of
the receiver. It is also important to the receivers that they are the receiver. It is also important to the receivers that they are
confident the load and overload information they receive is from a confident the load and overload information they receive is from a
legitimate source. Note that this implies a certain amount of legitimate source. Note that this implies a certain amount of
skipping to change at page 24, line 45 skipping to change at page 25, line 41
of other ways. This is out of scope for a Diameter overload control of other ways. This is out of scope for a Diameter overload control
mechanism. mechanism.
10. References 10. References
10.1. Normative References 10.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997. Requirement Levels", BCP 14, RFC 2119, March 1997.
[I-D.ietf-dime-rfc3588bis] [RFC6733] Fajardo, V., Arkko, J., Loughney, J., and G. Zorn,
Fajardo, V., Arkko, J., Loughney, J., and G. Zorn, "Diameter Base Protocol", RFC 6733, October 2012.
"Diameter Base Protocol", draft-ietf-dime-rfc3588bis-34
(work in progress), June 2012.
[RFC2914] Floyd, S., "Congestion Control Principles", BCP 41, [RFC2914] Floyd, S., "Congestion Control Principles", BCP 41,
RFC 2914, September 2000. RFC 2914, September 2000.
[RFC3539] Aboba, B. and J. Wood, "Authentication, Authorization and [RFC3539] Aboba, B. and J. Wood, "Authentication, Authorization and
Accounting (AAA) Transport Profile", RFC 3539, June 2003. Accounting (AAA) Transport Profile", RFC 3539, June 2003.
10.2. Informative References 10.2. Informative References
[RFC5390] Rosenberg, J., "Requirements for Management of Overload in [RFC5390] Rosenberg, J., "Requirements for Management of Overload in
the Session Initiation Protocol", RFC 5390, December 2008. the Session Initiation Protocol", RFC 5390, December 2008.
[TR23.843] [TR23.843]
3GPP, "Study on Core Network Overload Solutions", 3GPP, "Study on Core Network Overload Solutions",
TR 23.843 0.6.0, October 2012. TR 23.843 0.6.0, October 2012.
[IR.34] GSMA, "Inter-Service Provider IP Backbone Guidelines",
IR 34 7.0, January 2012.
[IR.88] GSMA, "LTE Roaming Guidelines", IR 88 7.0, January 2012.
[TS23.002]
3GPP, "Network Architecture", TS 23.002 12.0.0,
September 2012.
[TS29.272]
3GPP, "Evolved Packet System (EPS); Mobility Management
Entity (MME) and Serving GPRS Support Node (SGSN) related
interfaces based on Diameter protocol", TS 29.272 11.4.0,
September 2012.
[TS29.212]
3GPP, "Policy and Charging Control (PCC) over Gx/Sd
reference point", TS 29.212 11.6.0, September 2012.
[TS29.228]
3GPP, "IP Multimedia (IM) Subsystem Cx and Dx interfaces;
Signalling flows and message contents", TS 29.228 11.5.0,
September 2012.
[TS29.002]
3GPP, "Mobile Application Part (MAP) specification",
TS 29.002 11.4.0, September 2012.
Appendix A. Contributors Appendix A. Contributors
Significant contributions to this document were made by Adam Roach Significant contributions to this document were made by Adam Roach
and Eric Noel. and Eric Noel.
Appendix B. Acknowledgements Appendix B. Acknowledgements
Review of, and contributions to, this specification by Martin Dolly, Review of, and contributions to, this specification by Martin Dolly,
Carolyn Johnson, Jianrong Wang, Imtiaz Shaikh, Jouni Korhonen, Robert Carolyn Johnson, Jianrong Wang, Imtiaz Shaikh, Jouni Korhonen, Robert
Sparks, Dieter Jacobsohn, and Janet Gunn were most appreciated. We Sparks, Dieter Jacobsohn, Janet Gunn, Jean-Jacques Trottin, Laurent
would like to thank them for their time and expertise. Thiebaut, and Lionel Morand were most appreciated. We would like to
thank them for their time and expertise.
Authors' Addresses Authors' Addresses
Eric McMurry Eric McMurry
Tekelec Tekelec
17210 Campbell Rd. 17210 Campbell Rd.
Suite 250 Suite 250
Dallas, TX 75252 Dallas, TX 75252
US US
Email: eric.mcmurry@tekelec.com Email: emcmurry@computer.org
Ben Campbell Ben Campbell
Tekelec Tekelec
17210 Campbell Rd. 17210 Campbell Rd.
Suite 250 Suite 250
Dallas, TX 75252 Dallas, TX 75252
US US
Email: ben@nostrum.com Email: ben@nostrum.com
 End of changes. 28 change blocks. 
94 lines changed or deleted 165 lines changed or added
