draft-ietf-ipsecme-failure-detection-03.txt   draft-ietf-ipsecme-failure-detection-04.txt 
IPsecME Working Group Y. Nir, Ed. IPsecME Working Group Y. Nir, Ed.
Internet-Draft Check Point Internet-Draft Check Point
Intended status: Standards Track D. Wierbowski Intended status: Standards Track D. Wierbowski
Expires: July 14, 2011 IBM Expires: August 15, 2011 IBM
F. Detienne F. Detienne
P. Sethi P. Sethi
Cisco Cisco
January 10, 2011 February 11, 2011
A Quick Crash Detection Method for IKE A Quick Crash Detection Method for IKE
draft-ietf-ipsecme-failure-detection-03 draft-ietf-ipsecme-failure-detection-04
Abstract Abstract
This document describes an extension to the IKEv2 protocol that This document describes an extension to the IKEv2 protocol that
allows for faster detection of SA desynchronization using a saved allows for faster detection of SA desynchronization using a saved
token. token.
When an IPsec tunnel between two IKEv2 peers is disconnected due to a When an IPsec tunnel between two IKEv2 peers is disconnected due to a
restart of one peer, it can take as much as several minutes for the restart of one peer, it can take as much as several minutes for the
other peer to discover that the reboot has occurred, thus delaying other peer to discover that the reboot has occurred, thus delaying
skipping to change at page 1, line 42 skipping to change at page 1, line 42
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on July 14, 2011. This Internet-Draft will expire on August 15, 2011.
Copyright Notice Copyright Notice
Copyright (c) 2011 IETF Trust and the persons identified as the Copyright (c) 2011 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.1. Conventions Used in This Document . . . . . . . . . . . . 4 1.1. Conventions Used in This Document . . . . . . . . . . . . 4
2. RFC 5996 Crash Recovery . . . . . . . . . . . . . . . . . . . 5 2. RFC 5996 Crash Recovery . . . . . . . . . . . . . . . . . . . 5
3. Protocol Outline . . . . . . . . . . . . . . . . . . . . . . . 6 3. Protocol Outline . . . . . . . . . . . . . . . . . . . . . . . 6
4. Formats and Exchanges . . . . . . . . . . . . . . . . . . . . 7 4. Formats and Exchanges . . . . . . . . . . . . . . . . . . . . 7
4.1. Notification Format . . . . . . . . . . . . . . . . . . . 7 4.1. Notification Format . . . . . . . . . . . . . . . . . . . 7
4.2. Passing a Token in the AUTH Exchange . . . . . . . . . . . 8 4.2. Passing a Token in the AUTH Exchange . . . . . . . . . . 8
4.3. Replacing Tokens After Rekey or Resumption . . . . . . . . 9 4.3. Replacing Tokens After Rekey or Resumption . . . . . . . 9
4.4. Replacing the Token for an Existing SA . . . . . . . . . . 9 4.4. Replacing the Token for an Existing SA . . . . . . . . . 10
4.5. Presenting the Token in an Unprotected Message . . . . . . 10 4.5. Presenting the Token in an Unprotected Message . . . . . 10
5. Token Generation and Verification . . . . . . . . . . . . . . 11 5. Token Generation and Verification . . . . . . . . . . . . . . 11
5.1. A Stateless Method of Token Generation . . . . . . . . . . 11 5.1. A Stateless Method of Token Generation . . . . . . . . . 11
5.2. A Stateless Method with IP addresses . . . . . . . . . . . 12 5.2. A Stateless Method with IP addresses . . . . . . . . . . 12
5.3. Token Lifetime . . . . . . . . . . . . . . . . . . . . . . 12 5.3. Token Lifetime . . . . . . . . . . . . . . . . . . . . . 12
6. Backup Gateways . . . . . . . . . . . . . . . . . . . . . . . 12 6. Backup Gateways . . . . . . . . . . . . . . . . . . . . . . . 12
7. Interaction with Session Resumption . . . . . . . . . . . . . 13 7. Interaction with Session Resumption . . . . . . . . . . . . . 13
8. Operational Considerations . . . . . . . . . . . . . . . . . . 14 8. Operational Considerations . . . . . . . . . . . . . . . . . . 14
8.1. Who should implement this specification . . . . . . . . . 14 8.1. Who should implement this specification . . . . . . . . . 14
8.2. Response to unknown child SPI . . . . . . . . . . . . . . 15 8.2. Response to unknown child SPI . . . . . . . . . . . . . . 15
9. Security Considerations . . . . . . . . . . . . . . . . . . . 15 9. Security Considerations . . . . . . . . . . . . . . . . . . . 16
9.1. QCD Token Generation and Handling . . . . . . . . . . . . 16 9.1. QCD Token Generation and Handling . . . . . . . . . . . . 16
9.2. QCD Token Transmission . . . . . . . . . . . . . . . . . . 17 9.2. QCD Token Transmission . . . . . . . . . . . . . . . . . 17
9.3. QCD Token Enumeration . . . . . . . . . . . . . . . . . . 17 9.3. QCD Token Enumeration . . . . . . . . . . . . . . . . . . 18
9.4. Selecting an Appropriate Token Generation Method . . . . . 17
10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 18 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 18
11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 18 11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 18
12. Change Log . . . . . . . . . . . . . . . . . . . . . . . . . . 18 12. Change Log . . . . . . . . . . . . . . . . . . . . . . . . . . 18
12.1. Changes from draft-ietf-ipsecme-failure-detection-02 . . . 19 12.1. Changes from draft-ietf-ipsecme-failure-detection-03 . . 19
12.2. Changes from draft-ietf-ipsecme-failure-detection-01 . . . 19 12.2. Changes from draft-ietf-ipsecme-failure-detection-02 . . 19
12.3. Changes from draft-ietf-ipsecme-failure-detection-00 . . . 19 12.3. Changes from draft-ietf-ipsecme-failure-detection-01 . . 19
12.4. Changes from draft-nir-ike-qcd-07 . . . . . . . . . . . . 19 12.4. Changes from draft-ietf-ipsecme-failure-detection-00 . . 19
12.5. Changes from draft-nir-ike-qcd-03 and -04 . . . . . . . . 19 12.5. Changes from draft-nir-ike-qcd-07 . . . . . . . . . . . . 19
12.6. Changes from draft-nir-ike-qcd-02 . . . . . . . . . . . . 20 12.6. Changes from draft-nir-ike-qcd-03 and -04 . . . . . . . . 20
12.7. Changes from draft-nir-ike-qcd-01 . . . . . . . . . . . . 20 12.7. Changes from draft-nir-ike-qcd-02 . . . . . . . . . . . . 20
12.8. Changes from draft-nir-ike-qcd-00 . . . . . . . . . . . . 20 12.8. Changes from draft-nir-ike-qcd-01 . . . . . . . . . . . . 20
12.9. Changes from draft-nir-qcr-00 . . . . . . . . . . . . . . 20 12.9. Changes from draft-nir-ike-qcd-00 . . . . . . . . . . . . 20
12.10. Changes from draft-nir-qcr-00 . . . . . . . . . . . . . . 20
13. References . . . . . . . . . . . . . . . . . . . . . . . . . . 20 13. References . . . . . . . . . . . . . . . . . . . . . . . . . . 20
13.1. Normative References . . . . . . . . . . . . . . . . . . . 20 13.1. Normative References . . . . . . . . . . . . . . . . . . 20
13.2. Informative References . . . . . . . . . . . . . . . . . . 21 13.2. Informative References . . . . . . . . . . . . . . . . . 21
Appendix A. The Path Not Taken . . . . . . . . . . . . . . . . . 21 Appendix A. The Path Not Taken . . . . . . . . . . . . . . . . . 21
A.1. Initiating a new IKE SA . . . . . . . . . . . . . . . . . 21 A.1. Initiating a new IKE SA . . . . . . . . . . . . . . . . . 21
A.2. SIR . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 A.2. SIR . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
A.3. Birth Certificates . . . . . . . . . . . . . . . . . . . . 22 A.3. Birth Certificates . . . . . . . . . . . . . . . . . . . 22
A.4. Reducing Liveness Check Length . . . . . . . . . . . . . . 22 A.4. Reducing Liveness Check Length . . . . . . . . . . . . . 22
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 22 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 22
1. Introduction 1. Introduction
IKEv2, as described in [RFC5996] and its predecessor RFC 4306, has a IKEv2, as described in [RFC5996] and its predecessor RFC 4306, has a
method for recovering from a reboot of one peer. As long as traffic method for recovering from a reboot of one peer. As long as traffic
flows in both directions, the rebooted peer should re-establish the flows in both directions, the rebooted peer should re-establish the
tunnels immediately. However, in many cases the rebooted peer is a tunnels immediately. However, in many cases the rebooted peer is a
VPN gateway that protects only servers, or else the non-rebooted peer VPN gateway that protects only servers, or else the non-rebooted peer
has a dynamic IP address. In such cases, the rebooted peer will not has a dynamic IP address. In such cases, the rebooted peer will not
skipping to change at page 5, line 40 skipping to change at page 5, line 40
"It is "It is
suggested that messages be retransmitted at least a dozen times over suggested that messages be retransmitted at least a dozen times over
a period of at least several minutes before giving up on an SA..." a period of at least several minutes before giving up on an SA..."
Those "at least several minutes" are a time during part of which both Those "at least several minutes" are a time during part of which both
peers are active, but IPsec cannot be used. peers are active, but IPsec cannot be used.
Especially in the case of a reboot (rather than fail-over or Especially in the case of a reboot (rather than fail-over or
administrative clearing of state), the peer does not recover administrative clearing of state), the peer does not recover
immediately. Reboot, depending on the system may take from a few immediately. Reboot, depending on the system, may take from a few
seconds to a few minutes. This means that at first the peer just seconds to a few minutes. This means that at first the peer just
goes silent, i.e. does not send or respond to any messages. IKEv2 goes silent, i.e., does not send or respond to any messages. IKEv2
implementation can detect this situation and follow the rules given implementations can detect this situation and follow the rules given
in the section 2.4: in section 2.4:
If there has only been outgoing traffic on all of If there has only been outgoing traffic on all of
the SAs associated with an IKE SA, it is essential to confirm the SAs associated with an IKE SA, it is essential to confirm
liveness of the other endpoint to avoid black holes. If no liveness of the other endpoint to avoid black holes. If no
cryptographically protected messages have been received on an IKE cryptographically protected messages have been received on an IKE
SA or any of its Child SAs recently, the system needs to perform a SA or any of its Child SAs recently, the system needs to perform a
liveness check in order to prevent sending messages to a dead peer. liveness check in order to prevent sending messages to a dead peer.
[RFC5996] does not mandate any time limits, but it is possible that [RFC5996] does not mandate any time limits, but it is possible that
the peer will start liveness checks even before the other end is the peer will start liveness checks even before the other end is
sending INVALID_SPI notification, as it detected that the other end sending INVALID_SPI notification, as it detected that the other end
is not sending any packets anymore while it is still rebooting or is not sending any packets anymore while it is still rebooting or
recovering from the situation. recovering from the situation.
This means that the several minutes recovery period is overlapping the This means that the several minutes recovery period is overlapping the
actual recovery time of the other peer, i.e. if the security gateway actual recovery time of the other peer, i.e., if the security gateway
requires several minutes to boot up from the crash then the other requires several minutes to boot up from the crash then the other
peers have already finished their liveness checks before the crashing peers have already finished their liveness checks before the crashing
peer even has change to send INVALID_SPI notifications. peer even has a chance to send INVALID_SPI notifications.
There are cases where the peer loses state and is able to recover There are cases where the peer loses state and is able to recover
immediately, in those cases it might take several minutes to recover. immediately; in those cases it might take several minutes to recover.
Note, that IKEv2 specification specifically leaves number of retries Note that the IKEv2 specification specifically gives no guidance for
and lengths of timeouts out from the specification, as they do not the number of retries or the length of timeouts, as these do not
affect interoperability. This means that implementations are allowed affect interoperability. This means that implementations are allowed
to use the hints provided by the INVALID_SPI messages as hints that to use the hints provided by the INVALID_SPI messages to shorten
will shorten those timeouts (i.e. different environment and situation those timeouts (i.e., different environment and situation requiring
requiring different rules). different rules).
Good existing IKEv2 implementations already do that (i.e. both Some existing IKEv2 implementations already do that (i.e., both
shorten timeouts or limit number of retries) based on that kind of shorten timeouts or limit number of retries) based on these kind of
hints and also start liveness checks quickly after the other end goes hints and also start liveness checks quickly after the other end goes
silent. silent. However, see Appendix A.4 for a discussion of why this may
not be enough.
3. Protocol Outline 3. Protocol Outline
Supporting implementations will send a notification, called a "QCD Supporting implementations will send a notification, called a "QCD
token", as described in Section 4.1 in the first IKE_AUTH exchange token", as described in Section 4.1 in the first IKE_AUTH exchange
messages. These are the first IKE_AUTH request and final IKE_AUTH messages. These are the first IKE_AUTH request and final IKE_AUTH
response that contain the AUTH payloads. The generation of these response that contain the AUTH payloads. The generation of these
tokens is a local matter for implementations, but considerations are tokens is a local matter for implementations, but considerations are
described in Section 5. Implementations that send such a token will described in Section 5. Implementations that send such a token will
be called "token makers". be called "token makers".
A supporting implementation receiving such a token MUST store it (or A supporting implementation receiving such a token MUST store it (or
a digest thereof) along with the IKE SA. Implementations that a digest thereof) along with the IKE SA. Implementations that
support this part of the protocol will be called "token takers". support this part of the protocol will be called "token takers".
Section 8.1 has considerations for which implementations need to be Section 8.1 has considerations for which implementations need to be
token takers, and which should be token makers. Implementation that token takers, and which should be token makers. Implementations that
are not token takers will silently ignore QCD tokens. are not token takers will silently ignore QCD tokens.
When a token maker receives a protected IKE request message with When a token maker receives a protected IKE request message with
unknown IKE SPIs, it SHOULD generate a new token that is identical to unknown IKE SPIs, it SHOULD generate a new token that is identical to
the previous token, and send it to the requesting peer in an the previous token, and send it to the requesting peer in an
unprotected IKE message as described in Section 4.5. unprotected IKE message as described in Section 4.5.
When a token taker receives the QCD token in an unprotected When a token taker receives the QCD token in an unprotected
notification, it MUST verify that the TOKEN_SECRET_DATA matches the notification, it MUST verify that the TOKEN_SECRET_DATA matches the
token stored with the matching IKE SA. If the verification fails, or token stored with the matching IKE SA. If the verification fails, or
skipping to change at page 7, line 48 skipping to change at page 8, line 4
~ TOKEN_SECRET_DATA ~ ~ TOKEN_SECRET_DATA ~
! ! ! !
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
o Protocol ID (1 octet) MUST be 1, as this message is related to an o Protocol ID (1 octet) MUST be 1, as this message is related to an
IKE SA. IKE SA.
o SPI Size (1 octet) MUST be zero, in conformance with section 3.10 o SPI Size (1 octet) MUST be zero, in conformance with section 3.10
of [RFC5996]. of [RFC5996].
o QCD Token Notify Message Type (2 octets) - MUST be xxxxx, the o QCD Token Notify Message Type (2 octets) - MUST be xxxxx, the
value assigned for QCD token notifications. TBA by IANA. value assigned for QCD token notifications. TBA by IANA.
o TOKEN_SECRET_DATA (16-128 octets) contains a generated token as
o TOKEN_SECRET_DATA (variable) contains a generated token as
described in Section 5. described in Section 5.
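The field layout above can be sketched in code. The following is a minimal illustration, not a normative encoder: it assumes the generic payload header of [RFC5996] (Next Payload, Critical/Reserved, Payload Length) in front of the Notify fields. The function name and the notify_type parameter are hypothetical. The Notify Message Type is still TBA by IANA, so the caller has to supply it.

```python
import struct

def encode_qcd_token_notify(notify_type: int, token: bytes,
                            next_payload: int = 0) -> bytes:
    """Encode a Notify payload carrying TOKEN_SECRET_DATA (illustrative sketch).

    notify_type is a caller-supplied placeholder: the real value is
    "TBA by IANA" in the draft and is deliberately not hard-coded here.
    """
    if not 0 <= notify_type <= 0xFFFF:
        raise ValueError("Notify Message Type must fit in 2 octets")
    if not token:
        raise ValueError("TOKEN_SECRET_DATA must not be empty")
    protocol_id = 1   # MUST be 1: the message relates to an IKE SA
    spi_size = 0      # MUST be zero, so no SPI field follows
    length = 8 + len(token)  # 4-octet generic header + 4 Notify octets + data
    header = struct.pack("!BBHBBH", next_payload, 0, length,
                         protocol_id, spi_size, notify_type)
    return header + token
```

The eight header octets are followed directly by the token, because the SPI field is empty when SPI Size is zero.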
4.2. Passing a Token in the AUTH Exchange 4.2. Passing a Token in the AUTH Exchange
For brevity, only the EAP version of an AUTH exchange will be For brevity, only the EAP version of an AUTH exchange will be
presented here. The non-EAP version is very similar. The figures presented here. The non-EAP version is very similar. The figures
below are based on appendix C.3 of [RFC5996]. below are based on appendix C.3 of [RFC5996].
first request --> IDi, first request --> IDi,
[N(INITIAL_CONTACT)], [N(INITIAL_CONTACT)],
skipping to change at page 9, line 9 skipping to change at page 9, line 12
token, then a reboot of the other peer will not be recoverable by token, then a reboot of the other peer will not be recoverable by
this method. This may be acceptable if traffic typically originates this method. This may be acceptable if traffic typically originates
from the other peer. from the other peer.
In any case, the lack of a QCD_TOKEN notification MUST NOT be taken In any case, the lack of a QCD_TOKEN notification MUST NOT be taken
as an indication that the peer does not support this standard. as an indication that the peer does not support this standard.
Conversely, if a peer does not understand this notification, it will Conversely, if a peer does not understand this notification, it will
simply ignore it. Therefore a peer may send this notification simply ignore it. Therefore a peer may send this notification
freely, even if it does not know whether the other side supports it. freely, even if it does not know whether the other side supports it.
The QCD_TOKEN notification is related to the IKE SA and MUST follow The QCD_TOKEN notification is related to the IKE SA and should follow
the AUTH payload and precede the Configuration payload and all the AUTH payload and precede the Configuration payload and all
payloads related to the child SA. payloads related to the child SA.
4.3. Replacing Tokens After Rekey or Resumption 4.3. Replacing Tokens After Rekey or Resumption
After rekeying an IKE SA, the IKE SPIs are replaced, so the new SA After rekeying an IKE SA, the IKE SPIs are replaced, so the new SA
also needs to have a token. If only the responder in the rekey also needs to have a token. If only the responder in the rekey
exchange is the token maker, this can be done within the exchange is the token maker, this can be done within the
CREATE_CHILD_SA exchange. If the initiator is a token maker, then we CREATE_CHILD_SA exchange. If the initiator is a token maker, then we
need an extra informational exchange. need an extra informational exchange.
The following figure shows the CREATE_CHILD_SA exchange for rekeying The following figure shows the CREATE_CHILD_SA exchange for rekeying
the IKE SA. Only the responder sends a QCD token. the IKE SA. Only the responder sends a QCD token.
request --> SA, Ni, [KEi] request --> SA, Ni, [KEi]
response <-- SA, Nr, [KEr], N(QCD_TOKEN) response <-- SA, Nr, [KEr], N(QCD_TOKEN)
If the initiator is also a token maker, it SHOULD soon initiate an If the initiator is also a token maker, it SHOULD initiate an
INFORMATIONAL exchange as follows: INFORMATIONAL exchange immediately after the CREATE_CHILD_SA exchange
as follows:
request --> N(QCD_TOKEN) request --> N(QCD_TOKEN)
response <-- response <--
For session resumption, as specified in [RFC5723], the situation is For session resumption, as specified in [RFC5723], the situation is
similar. The responder, which is necessarily the peer that has similar. The responder, which is necessarily the peer that has
crashed, SHOULD send a new ticket within the protected payload of the crashed, SHOULD send a new ticket within the protected payload of the
IKE_SESSION_RESUME exchange. If the Initiator is also a token maker, IKE_SESSION_RESUME exchange. If the Initiator is also a token maker,
it needs to send a QCD_TOKEN in a separate INFORMATIONAL exchange. it needs to send a QCD_TOKEN in a separate INFORMATIONAL exchange.
The INFORMATIONAL exchange described in this section can also be used The INFORMATIONAL exchange described in this section can also be used
if QCD tokens need to be replaced due to a key rollover. However, if QCD tokens need to be replaced due to a key rollover. However,
since token takers are required to verify at least 4 QCD tokens, this since token takers are required to verify at least 4 QCD tokens, this
is only necessary if secret QCD keys are rolled over more than four is only necessary if secret QCD keys are rolled over more than four
times as often as IKE SAs are rekeyed. times as often as IKE SAs are rekeyed. See Section 5.1 for an
example method that uses secret keys which may require rollover.
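As an illustration of the token taker side, the sketch below compares the token stored with the IKE SA against up to four QCD_TOKEN notifications from one packet. It assumes that a single match is enough to consider the token verified, and that the taker stores the token itself rather than a digest. The function and constant names are hypothetical.

```python
import hmac
from typing import Sequence

# The taker MUST support at least four QCD_TOKEN notifications per packet;
# capping at exactly four is a choice made for this sketch only.
MAX_TOKENS_PER_PACKET = 4

def token_verified(stored_token: bytes,
                   presented_tokens: Sequence[bytes]) -> bool:
    """Return True if any presented token equals the stored one (assumption)."""
    matched = False
    for candidate in presented_tokens[:MAX_TOKENS_PER_PACKET]:
        # Constant-time comparison avoids leaking how many octets matched.
        if hmac.compare_digest(stored_token, candidate):
            matched = True
    return matched
```

With this shape, a token maker that has rolled its secret can present tokens derived from its recent secrets, and the taker accepts the packet if one of them is the token it saved.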
4.4. Replacing the Token for an Existing SA 4.4. Replacing the Token for an Existing SA
With some token generation methods, such as that described in With some token generation methods, such as that described in
Section 5.2, a QCD token may sometimes become invalid, although the Section 5.2, a QCD token may sometimes become invalid, although the
IKE SA is still perfectly valid. IKE SA is still perfectly valid.
In such a case, the token maker MUST send the new token in a In such a case, the token maker MUST send the new token in a
protected message under that IKE SA. That exchange could be a simple protected message under that IKE SA. That exchange could be a simple
INFORMATIONAL, such as in the last figure in the previous section, or INFORMATIONAL, such as in the last figure in the previous section, or
skipping to change at page 10, line 37 skipping to change at page 10, line 43
A token taker MUST accept such gratuitous QCD_TOKEN notifications as A token taker MUST accept such gratuitous QCD_TOKEN notifications as
long as they are carried in protected exchanges. A token maker long as they are carried in protected exchanges. A token maker
SHOULD NOT generate them unless it is no longer able to generate the SHOULD NOT generate them unless it is no longer able to generate the
old QCD_TOKEN. old QCD_TOKEN.
4.5. Presenting the Token in an Unprotected Message 4.5. Presenting the Token in an Unprotected Message
This QCD_TOKEN notification is unprotected, and is sent as a response This QCD_TOKEN notification is unprotected, and is sent as a response
to a protected IKE request, which uses an IKE SA that is unknown. to a protected IKE request, which uses an IKE SA that is unknown.
request --> N(INVALID_IKE_SPI), N(QCD_TOKEN)+ message --> N(INVALID_IKE_SPI), N(QCD_TOKEN)+
If child SPIs are persistently mapped to IKE SPIs as described in If child SPIs are persistently mapped to IKE SPIs as described in
Section 8.2, a token taker may get the following unprotected message Section 8.2, a token taker may get the following unprotected message
in response to an ESP or AH packet. in response to an ESP or AH packet.
request --> N(INVALID_SPI), N(QCD_TOKEN)+ message --> N(INVALID_SPI), N(QCD_TOKEN)+
The QCD_TOKEN and INVALID_IKE_SPI notifications are sent together to The QCD_TOKEN and INVALID_IKE_SPI notifications are sent together to
support both implementations that conform to this specification and support both implementations that conform to this specification and
implementations that don't. Similar to the description in section implementations that don't. Similar to the description in section
2.21 of [RFC5996], the IKE SPI and message ID fields in the packet 2.21 of [RFC5996], the IKE SPI and message ID fields in the packet
headers are taken from the protected IKE request. headers are taken from the protected IKE request.
To support a periodic rollover of the secret used for token To support a periodic rollover of the secret used for token
generation, the token taker MUST support at least four QCD_TOKEN generation, the token taker MUST support at least four QCD_TOKEN
notifications in a single packet. The token is considered verified notifications in a single packet. The token is considered verified
skipping to change at page 12, line 13 skipping to change at page 12, line 18
TOKEN_SECRET_DATA = HASH(QCD_SECRET | SPI-I | SPI-R) TOKEN_SECRET_DATA = HASH(QCD_SECRET | SPI-I | SPI-R)
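The formula above might be realized as in the following sketch. SHA-256 is an assumption made here for illustration, since the formula only names a generic HASH, and the function name is hypothetical. The "|" operator is read as concatenation, and the IKE SPIs are taken to be 8 octets each as in [RFC5996].

```python
import hashlib

def make_token(qcd_secret: bytes, spi_i: bytes, spi_r: bytes) -> bytes:
    """Compute HASH(QCD_SECRET | SPI-I | SPI-R), with SHA-256 as the hash.

    The hash function is an assumption of this sketch, not a requirement.
    """
    if len(spi_i) != 8 or len(spi_r) != 8:
        raise ValueError("IKE SPIs are 8 octets each")
    return hashlib.sha256(qcd_secret + spi_i + spi_r).digest()
```

Because the token depends only on the secret and the SPIs, a rebooted token maker that still holds QCD_SECRET can regenerate the same token from the SPIs of an incoming request without any per-SA state.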
5.2. A Stateless Method with IP addresses 5.2. A Stateless Method with IP addresses
This method is similar to the one in the previous section, except This method is similar to the one in the previous section, except
that the IP address of the token taker is also added to the block that the IP address of the token taker is also added to the block
being hashed. This has the disadvantage that the token needs to be being hashed. This has the disadvantage that the token needs to be
replaced (as described in Section 4.4) whenever the token taker replaced (as described in Section 4.4) whenever the token taker
changes its address. changes its address.
The reason to use this method is described in Section 9.4. When See Section 9.2 for a discussion of a use-case for this method. When
using this method, the TOKEN_SECRET_DATA field is calculated as using this method, the TOKEN_SECRET_DATA field is calculated as
follows: follows:
TOKEN_SECRET_DATA = HASH(QCD_SECRET | SPI-I | SPI-R | IPaddr-T) TOKEN_SECRET_DATA = HASH(QCD_SECRET | SPI-I | SPI-R | IPaddr-T)
The IPaddr-T field specifies the IP address of the token taker. The IPaddr-T field specifies the IP address of the token taker.
Secret rollover considerations are similar to those in the previous Secret rollover considerations are similar to those in the previous
section. section.
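A sketch of the address-bound variant follows. As before, SHA-256 and the function name are assumptions for illustration. The address is fed to the hash in its packed binary form (4 octets for IPv4, 16 for IPv6), which is one possible encoding of IPaddr-T and is not specified by the formula.

```python
import hashlib
import ipaddress

def make_token_with_address(qcd_secret: bytes, spi_i: bytes, spi_r: bytes,
                            taker_addr: str) -> bytes:
    """Compute HASH(QCD_SECRET | SPI-I | SPI-R | IPaddr-T) with SHA-256.

    The hash function and the packed address encoding are assumptions.
    """
    if len(spi_i) != 8 or len(spi_r) != 8:
        raise ValueError("IKE SPIs are 8 octets each")
    addr = ipaddress.ip_address(taker_addr).packed
    return hashlib.sha256(qcd_secret + spi_i + spi_r + addr).digest()
```

The sketch makes the stated disadvantage visible: the same SA yields a different token as soon as the token taker's address changes, so the token has to be replaced.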
5.3. Token Lifetime 5.3. Token Lifetime
skipping to change at page 13, line 7 skipping to change at page 13, line 15
the effect of having the crash recovery available immediately. the effect of having the crash recovery available immediately.
Note that this refers to "high availability" configurations, where Note that this refers to "high availability" configurations, where
only one gateway is active at any given moment. This is different only one gateway is active at any given moment. This is different
from "load sharing" configurations where more than one gateway is from "load sharing" configurations where more than one gateway is
active at the same time. For load sharing configurations, please see active at the same time. For load sharing configurations, please see
Section 9.2 for security considerations. Section 9.2 for security considerations.
7. Interaction with Session Resumption 7. Interaction with Session Resumption
Session Resumption, specified in [RFC5723] allows setting up a new Session resumption, specified in [RFC5723], allows the setting up of
IKE SA consume less computing resources. This is particularly useful a new IKE SA to consume less computing resources. This is
in the case of a remote access gateway that has many tunnels. A particularly useful in the case of a remote access gateway that has
failure of such a gateway would require all these many remote access many tunnels. A failure of such a gateway requires all these many
clients to establish an IKE SA either with the rebooted gateway or remote access clients to establish an IKE SA either with the rebooted
with a backup gateway. This tunnel re-establishment should occur gateway or with a backup. This tunnel re-establishment occurs within
within a short period of time, creating a burden on the remote access a short period of time, creating a burden on the remote access
gateway. Session Resumption addresses this problem by having the gateway. Session resumption addresses this problem by having the
clients store an encrypted derivative of the IKE SA for quick re- clients store an encrypted derivative of the IKE SA for quick re-
establishment. establishment.
What Session Resumption does not help is the problem of detecting What Session Resumption does not help is the problem of detecting
that the peer gateway has failed. A failed gateway may go undetected that the peer gateway has failed. A failed gateway may go undetected
for an arbitrarily long time, because IPsec does not have packet for an arbitrarily long time, because IPsec does not have packet
acknowledgement, and applications cannot signal the IPsec layer that acknowledgement, and applications cannot signal the IPsec layer that
the tunnel "does not work". Section 2.4 of RFC 5996 does not specify the tunnel "does not work". Section 2.4 of RFC 5996 does not specify
how long an implementation needs to wait before beginning a liveness how long an implementation needs to wait before beginning a liveness
check, and only says "not recently" (see full quote in Section 2). check, and only says "not recently" (see full quote in Section 2).
skipping to change at page 15, line 4 skipping to change at page 15, line 13
systems should implement backup gateways as described in Section 6. systems should implement backup gateways as described in Section 6.
Implementing the "token maker" side of QCD makes sense for IKE Implementing the "token maker" side of QCD makes sense for IKE
implementation where protected connections originate from the peer, implementation where protected connections originate from the peer,
such as inter-domain VPNs and remote access gateways. Implementing such as inter-domain VPNs and remote access gateways. Implementing
the "token taker" side of QCD makes sense for IKE implementations the "token taker" side of QCD makes sense for IKE implementations
where protected connections originate, such as inter-domain VPNs and where protected connections originate, such as inter-domain VPNs and
remote access clients. remote access clients.
To clarify the this discussion: To clarify the this discussion:
o For remote-access clients it makes sense to implement the token o For remote-access clients it makes sense to implement the token
taker role. taker role.
o For remote-access gateways it makes sense to implement the token o For remote-access gateways it makes sense to implement the token
maker role. maker role.
o For inter-domain VPN gateway it makes sense to implement both o For inter-domain VPN gateways it makes sense to implement both
roles, because it can't be known in advance where the traffic roles, because it can't be known in advance where the traffic
originates. originates.
o It is perfectly valid to implement both roles in any case, for o It is perfectly valid to implement both roles in any case, for
example when using a single library or a single gateway to perform example when using a single library or a single gateway to perform
several roles. several roles.
In order to limit the effects of DoS attacks, a token taker SHOULD In order to limit the effects of DoS attacks, a token taker SHOULD
limit the rate of QCD_TOKENs verified from a particular source. limit the rate of QCD_TOKENs verified from a particular source.
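As an illustration of the rate limit recommended above, here is a minimal sketch of a per-source sliding-window limiter that a token taker could consult before verifying a QCD_TOKEN. The class name, the window size, and the per-window budget are all hypothetical; the draft does not prescribe any particular algorithm or values.

```python
import time
from collections import defaultdict, deque


class TokenVerificationLimiter:
    """Hypothetical per-source limiter for QCD_TOKEN verification attempts."""

    def __init__(self, max_per_window=5, window_seconds=10.0, clock=time.monotonic):
        # Both limits are illustrative assumptions, not values from the draft.
        self.max_per_window = max_per_window
        self.window_seconds = window_seconds
        self.clock = clock
        self._history = defaultdict(deque)  # source address -> attempt timestamps

    def allow(self, source_ip):
        """Return True if a token from source_ip may be verified now."""
        now = self.clock()
        history = self._history[source_ip]
        # Forget attempts that have fallen out of the sliding window.
        while history and now - history[0] >= self.window_seconds:
            history.popleft()
        if len(history) >= self.max_per_window:
            return False  # over budget: drop the token without verifying it
        history.append(now)
        return True
```

A taker would call `allow(source_ip)` for each unprotected message carrying a QCD_TOKEN and silently discard the message when it returns False.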
If excessive amounts of IKE requests protected with unknown IKE SPIs If excessive amounts of IKE requests protected with unknown IKE SPIs
skipping to change at page 17, line 7 skipping to change at page 17, line 11
A message like that is subject to modification, deletion and replay A message like that is subject to modification, deletion and replay
by an attacker. However, these attacks will not compromise the by an attacker. However, these attacks will not compromise the
security of either side. Modification is meaningless because a security of either side. Modification is meaningless because a
modified token is simply an invalid token. Deletion will only cause modified token is simply an invalid token. Deletion will only cause
the protocol not to work, resulting in a delay in tunnel re- the protocol not to work, resulting in a delay in tunnel re-
establishment as described in Section 2. Replay is also meaningless, establishment as described in Section 2. Replay is also meaningless,
because the IKE SA has been deleted after the first transmission. because the IKE SA has been deleted after the first transmission.
9.2. QCD Token Transmission 9.2. QCD Token Transmission
A token maker MUST NOT send a QCD token in an unprotected message for A token maker MUST NOT send a valid QCD token in an
an existing IKE SA. This implies that a conforming QCD token maker unprotected message for an existing IKE SA.
MUST be able to tell whether a particular pair of IKE SPIs represent
a valid IKE SA.
This requirement is obvious and easy in the case of a single gateway. This requirement is obvious and easy in the case of a single gateway.
However, some implementations use a load balancer to divide the load However, some implementations use a load balancer to divide the load
between several physical gateways. It MUST NOT be possible even in between several physical gateways. It MUST NOT be possible even in
such a configuration to trick one gateway into sending a QCD token such a configuration to trick one gateway into sending a valid QCD
for an IKE SA which is valid on another gateway. token for an IKE SA which is valid on another gateway. This is true
whether the attempt to trick the gateway uses the token taker's IP
address or a different IP address.
This document does not specify how a load sharing configuration of Because it includes the token taker's IP address in the token
generation, the method in Section 5.2 prevents revealing the QCD
token for an existing pair of IKE SPIs to an attacker who is using a
different IP address. Note that the use of this method causes the
tokens to be invalidated whenever the token taker's address changes.
It is also important to note that this method does not prevent
revealing the QCD token to a man-in-the-middle attacker who is
spoofing the token taker's IP address, if that attacker is able to
direct messages to a cluster member other than the member responsible
for the IKE SA.
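The effect of binding a token to the taker's address can be sketched as follows. This is only an illustration under stated assumptions: the keyed hash (HMAC-SHA-256), the input ordering, and the function names are hypothetical choices made here, and the authoritative construction is the one given in Section 5.2 of the draft, which this excerpt does not reproduce.

```python
import hashlib
import hmac
import ipaddress


def qcd_token(qcd_secret, spi_i, spi_r, taker_ip=None):
    """Derive an illustrative QCD token from a secret and a pair of IKE SPIs.

    When taker_ip is given, the token also depends on the taker's address.
    HMAC-SHA-256 is an assumption of this sketch, not a requirement of the draft.
    """
    if len(spi_i) != 8 or len(spi_r) != 8:
        raise ValueError("IKEv2 SPIs are 8 octets each")
    message = spi_i + spi_r
    if taker_ip is not None:
        # Bind the token to the taker's address (works for IPv4 and IPv6).
        message += ipaddress.ip_address(taker_ip).packed
    return hmac.new(qcd_secret, message, hashlib.sha256).digest()


def token_matches(stored_token, received_token):
    """Compare tokens in constant time, as a token taker might."""
    return hmac.compare_digest(stored_token, received_token)
```

With an address-bound token, a request carrying the same SPIs from a different source address yields a different value, so the attacker learns nothing usable. The cost is also visible: if the taker's address changes, the stored token no longer matches what the maker would generate.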
This document does not specify how a load-sharing configuration of
IPsec gateways would work, but in order to support this IPsec gateways would work, but in order to support this
specification, all members MUST be able to tell whether a particular specification, all members MUST be able to tell whether a particular
IKE SA is active anywhere in the cluster. One way to do it is to IKE SA is active anywhere in the cluster. One way to do it is to
synchronize a list of active IKE SPIs among all the cluster members. synchronize a list of active IKE SPIs among all the cluster members.
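The synchronized-list approach mentioned above might look like the following sketch. A shared in-memory set stands in for whatever synchronization mechanism a real cluster would use. The class and method names are hypothetical, and the draft deliberately leaves the mechanism unspecified.

```python
class ClusterMember:
    """Hypothetical cluster member that consults a cluster-wide SPI list."""

    def __init__(self, name, shared_active_spis):
        self.name = name
        # In a real cluster this would be replicated state; here every
        # member simply holds a reference to the same set object.
        self.shared_active_spis = shared_active_spis

    def add_sa(self, spi_i, spi_r):
        """Record a newly established IKE SA for the whole cluster."""
        self.shared_active_spis.add((spi_i, spi_r))

    def remove_sa(self, spi_i, spi_r):
        """Remove an IKE SA that has been deleted or has expired."""
        self.shared_active_spis.discard((spi_i, spi_r))

    def may_send_token(self, spi_i, spi_r):
        """A token may be sent in the clear only if no member owns the SA."""
        return (spi_i, spi_r) not in self.shared_active_spis
```

A member receiving a protected request with unknown SPIs would call `may_send_token` before answering with an unprotected QCD_TOKEN notification, and stay silent about the token when another member still owns the SA.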
If an attacker can somehow access a QCD token while the SA's are
still active, this attacker will be able to tear down the sessions at
will. In particular, avoiding false positives is critical to the
security of the proposal and a token maker MUST NOT send a QCD token
in an unprotected message for an existing IKE SA. IPsec Failure
Detection is thus not applicable to deployments where the QCD token
is shared by multiple gateways and the gateways can not assess
whether the token can be legitimately sent in the clear while another
gateway may actually still own the SA's. Load balancer designs
typically fall in this category.
9.3. QCD Token Enumeration 9.3. QCD Token Enumeration
An attacker may try to attack QCD if the generation algorithm An attacker may try to attack QCD if the generation algorithm
described in Section 5.1 is used. The attacker will send several described in Section 5.1 is used. The attacker will send several
fake IKE requests to the gateway under attack, receiving and fake IKE requests to the gateway under attack, receiving and
recording the QCD Tokens in the responses. This will allow the recording the QCD Tokens in the responses. This will allow the
attacker to create a dictionary of IKE SPIs to QCD Tokens, which can attacker to create a dictionary of IKE SPIs to QCD Tokens, which can
later be used to tear down any IKE SA. later be used to tear down any IKE SA.
Three factors mitigate this threat: Three factors mitigate this threat:
skipping to change at page 17, line 48 skipping to change at page 18, line 29
hard. To ensure this, token makers MUST generate unpredictable hard. To ensure this, token makers MUST generate unpredictable
IKE SPIs by using a cryptographically strong pseudo-random number IKE SPIs by using a cryptographically strong pseudo-random number
generator. generator.
o Throttling the amount of QCD_TOKEN notifications sent out, as o Throttling the amount of QCD_TOKEN notifications sent out, as
discussed in Section 8.1, especially when not soon after a crash discussed in Section 8.1, especially when not soon after a crash
will limit the attacker's ability to construct a dictionary. will limit the attacker's ability to construct a dictionary.
o The methods in Section 5.1 and Section 5.2 allow for a periodic o The methods in Section 5.1 and Section 5.2 allow for a periodic
change of the QCD_SECRET. Any such change invalidates the entire change of the QCD_SECRET. Any such change invalidates the entire
dictionary. dictionary.
9.4. Selecting an Appropriate Token Generation Method
This section describes the rationale for token generation methods
such as the one described in Section 5.2. Note that this section
merely provides a possible rationale, and does not specify or
recommend any kind of configuration.
Some configurations of security gateway use a load-sharing cluster of
hosts, all sharing the same IP addresses, where the SAs (IKE and
child) are not synchronized between the cluster members. In such a
configuration, a single member does not know about all the IKE SAs
that are active for the configuration. A load balancer (usually a
networking switch) sends IKE and IPsec packets to the several members
based on source IP address.
In such a configuration, an attacker can send a forged protected IKE
packet with the IKE SPIs of an existing IKE SA, but from a different
IP address. This packet will likely be processed by a different
cluster member from the one that owns the IKE SA. Since no IKE SA
state is stored on this member, it will send a QCD token to the
attacker. If the QCD token does not depend on IP address, this token
can immediately be used to tell the token taker to tear down the IKE
SA using an unprotected QCD_TOKEN notification.
To thwart this possible attack, such configurations should use a
method that considers the taker's IP address, such as the method
described in Section 5.2.
On the other hand, when using this method a change of address
invalidates the tokens, so this method is only recommended when the
configuration involves gateways generating the same tokens without
access to all the IKE SAs.
10. IANA Considerations 10. IANA Considerations
IANA is requested to assign a notify message type from the status IANA is requested to assign a notify message type from the status
types range (16406-40959) of the "IKEv2 Notify Message Types" types range (16406-40959) of the "IKEv2 Notify Message Types"
registry with name "QUICK_CRASH_DETECTION". registry with name "QUICK_CRASH_DETECTION".
11. Acknowledgements 11. Acknowledgements
We would like to thank Hannes Tschofenig and Yaron Sheffer for their We would like to thank Hannes Tschofenig and Yaron Sheffer for their
comments about Session Resumption. comments about Session Resumption.
Others who have contrinuted valuable comments are, in alphabetical Others who have contributed valuable comments are, in alphabetical
order, Lakshminath Dondeti, Tero Kivinen, and Scott C Moonen. order, Lakshminath Dondeti, Paul Hoffman, Tero Kivinen, Scott C
Moonen, and Keith Welter.
12. Change Log 12. Change Log
This section lists all changes in this document This section lists all changes in this document
NOTE TO RFC EDITOR : Please remove this section in the final RFC NOTE TO RFC EDITOR : Please remove this section in the final RFC
12.1. Changes from draft-ietf-ipsecme-failure-detection-02 12.1. Changes from draft-ietf-ipsecme-failure-detection-03
o Merged section 9.4 into section 9.2.
o Multiple typos discovered by Scott Moonen, Keith Welter and Yaron.
12.2. Changes from draft-ietf-ipsecme-failure-detection-02
o Moved section 7 to Appendix A. Also changed some wording. o Moved section 7 to Appendix A. Also changed some wording.
o Fixed some language in the "interaction with session resumption" o Fixed some language in the "interaction with session resumption"
section to say that although liveness check MUST be done, there section to say that although liveness check MUST be done, there
are no time limits to how long an implementation takes before are no time limits to how long an implementation takes before
starting liveness check, or ending it. starting liveness check, or ending it.
12.2. Changes from draft-ietf-ipsecme-failure-detection-01 12.3. Changes from draft-ietf-ipsecme-failure-detection-01
o Fixed the language requiring random IKE SPIs. o Fixed the language requiring random IKE SPIs.
o Some better explanation of the reasons to choose the methods in o Some better explanation of the reasons to choose the methods in
Section 5.2 and the method in Section 5.1, to close issue #193. Section 5.2 and the method in Section 5.1, to close issue #193.
o Added text to the beginning of Section 9 to accomodate issue #194. o Added text to the beginning of Section 9 to accomodate issue #194.
12.3. Changes from draft-ietf-ipsecme-failure-detection-00 12.4. Changes from draft-ietf-ipsecme-failure-detection-00
o Nits pointed out by Scott and Yaron. o Nits pointed out by Scott and Yaron.
o Pratima and Frederic are back on board. o Pratima and Frederic are back on board.
o Changed IKEv2bis draft reference to RFC 5996. o Changed IKEv2bis draft reference to RFC 5996.
o Resolved issues #189, #190, #191, and #192: o Resolved issues #189, #190, #191, and #192:
* Renamed section 4.5 and removed the requirement to send an * Renamed section 4.5 and removed the requirement to send an
acknowledgement for the unprotected message. acknowledgement for the unprotected message.
* Moved the QCD token from the last to the first IKE_AUTH * Moved the QCD token from the last to the first IKE_AUTH
request. request.
* Added a MUST to Section 9.3 to require that IKE SPIs be * Added a MUST to Section 9.3 to require that IKE SPIs be
randomly generated. randomly generated.
* Changed the language in Section 8.1, to not use RFC 2119 * Changed the language in Section 8.1, to not use RFC 2119
terminology. terminology.
* Moved the section describing why one would want the method * Moved the section describing why one would want the method
dependant on IP addresses (in Section 5.2 from operational dependant on IP addresses (in Section 5.2 from operational
considerations to security considerations. considerations to security considerations.
12.4. Changes from draft-nir-ike-qcd-07 12.5. Changes from draft-nir-ike-qcd-07
o First WG version. o First WG version.
o Addressed Scott C Moonen's concern about collisions of QCD tokens. o Addressed Scott C Moonen's concern about collisions of QCD tokens.
o Updated references to point to IKEv2bis instead of RFC 4306 and o Updated references to point to IKEv2bis instead of RFC 4306 and
4718. Also converted draft reference for resumption to RFC 5723. 4718. Also converted draft reference for resumption to RFC 5723.
o Added Dave Wiebrowski as author, and removed Pratima and Frederic. o Added Dave Wiebrowski as author, and removed Pratima and Frederic.
12.5. Changes from draft-nir-ike-qcd-03 and -04 12.6. Changes from draft-nir-ike-qcd-03 and -04
Mostly editorial changes and cleaning up. Mostly editorial changes and cleaning up.
12.6. Changes from draft-nir-ike-qcd-02 12.7. Changes from draft-nir-ike-qcd-02
o Described QCD token enumeration, following a question by o Described QCD token enumeration, following a question by
Lakshminath Dondeti. Lakshminath Dondeti.
o Added the ability to replace the QCD token for an existing IKE SA. o Added the ability to replace the QCD token for an existing IKE SA.
o Added tokens dependent on peer IP address and their interaction o Added tokens dependent on peer IP address and their interaction
with MOBIKE. with MOBIKE.
12.7. Changes from draft-nir-ike-qcd-01 12.8. Changes from draft-nir-ike-qcd-01
o Removed stateless method. o Removed stateless method.
o Added discussion of rekeying and resumption. o Added discussion of rekeying and resumption.
o Added discussion of non-synchronized load-balanced clusters of o Added discussion of non-synchronized load-balanced clusters of
gateways in the security considerations. gateways in the security considerations.
o Other wording fixes. o Other wording fixes.
12.8. Changes from draft-nir-ike-qcd-00 12.9. Changes from draft-nir-ike-qcd-00
o Merged proposal with draft-detienne-ikev2-recovery o Merged proposal with draft-detienne-ikev2-recovery
o Changed the protocol so that the rebooted peer generates the o Changed the protocol so that the rebooted peer generates the
token. This has the effect, that the need for persistent storage token. This has the effect, that the need for persistent storage
is eliminated. is eliminated.
o Added discussion of birth certificates. o Added discussion of birth certificates.
12.9. Changes from draft-nir-qcr-00 12.10. Changes from draft-nir-qcr-00
o Changed name to reflect that this relates to IKE. Also changed o Changed name to reflect that this relates to IKE. Also changed
from quick crash recovery to quick crash detection to avoid from quick crash recovery to quick crash detection to avoid
confusion with IFARE. confusion with IFARE.
o Added more operational considerations. o Added more operational considerations.
o Added interaction with IFARE. o Added interaction with IFARE.
o Added discussion of backup gateways. o Added discussion of backup gateways.
13. References 13. References
skipping to change at page 22, line 44 skipping to change at page 22, line 44
good balance between the need for a timely discovery of a dead peer, good balance between the need for a timely discovery of a dead peer,
and a low probability of false detection. We expect the policy to be and a low probability of false detection. We expect the policy to be
set to take the shortest time such that this probability achieves a set to take the shortest time such that this probability achieves a
certain target. Therefore, we believe that reducing the elapsed time certain target. Therefore, we believe that reducing the elapsed time
and retransmission count may create an unacceptably high probability and retransmission count may create an unacceptably high probability
of false detection, and this can be triggered by a single of false detection, and this can be triggered by a single
INVALID_IKE_SPI notification. INVALID_IKE_SPI notification.
Additionally, even if the retransmission policy is reduced to, say, Additionally, even if the retransmission policy is reduced to, say,
one minute, it is still a very noticeable delay from a human one minute, it is still a very noticeable delay from a human
perspective, from the time that the gateway has come up (i.e. is able perspective, from the time that the gateway has come up (i.e., is
to respond with an INVALID_SPI or INVALID_IKE_SPI notification) and able to respond with an INVALID_SPI or INVALID_IKE_SPI notification)
until the tunnels are active, or from the time the backup gateway has and until the tunnels are active, or from the time the backup gateway
taken over until the tunnels are active. The use of QCD tokens can has taken over until the tunnels are active. The use of QCD tokens
reduce this delay. can reduce this delay.
Authors' Addresses Authors' Addresses
Yoav Nir (editor) Yoav Nir (editor)
Check Point Software Technologies Ltd. Check Point Software Technologies Ltd.
5 Hasolelim st. 5 Hasolelim st.
Tel Aviv 67897 Tel Aviv 67897
Israel Israel
Email: ynir@checkpoint.com Email: ynir@checkpoint.com
 End of changes. 48 change blocks. 
125 lines changed or deleted 124 lines changed or added
