--- 1/draft-ietf-radext-status-server-04.txt 2009-10-12 13:12:08.000000000 +0200 +++ 2/draft-ietf-radext-status-server-05.txt 2009-10-12 13:12:08.000000000 +0200 @@ -1,20 +1,21 @@ Network Working Group Alan DeKok INTERNET-DRAFT FreeRADIUS Category: Informational - -Expires: September 1, 2009 -1 March 2009 + +Expires: April 12, 2009 +12 October 2009 Use of Status-Server Packets in the Remote Authentication Dial In User Service (RADIUS) Protocol + draft-ietf-radext-status-server-05 This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79. This document may contain material from IETF Documents or IETF Contributions published or made publicly available before November 10, 2008. The person(s) controlling the copyright in some of this material may not have granted the IETF Trust the right to allow modifications of such material outside the IETF Standards Process. Without obtaining an adequate license from the person(s) controlling the copyright in such materials, this document may not be modified outside the IETF Standards Process, and @@ -31,164 +32,195 @@ and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt. The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. - This Internet-Draft will expire on September 1, 2009. + This Internet-Draft will expire on April 12, 2009. Copyright Notice Copyright (c) 2009 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents in effect on the date of publication of this document (http://trustee.ietf.org/license-info). 
Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Abstract - RFC 2865 defines a Status-Server code for use in RADIUS, but labels - it as "Experimental" without further discussion. This document - describes a practical use for the Status-Server packet code, which is - to let clients query the status of a RADIUS server. These queries, - and responses (if any) enable the client to make more informed - decisions. The result is a more stable, and more robust RADIUS - architecture. + This document describes a deployed extension to the Remote + Authentication Dial In User Service (RADIUS) protocol, enabling + clients to query the status of a RADIUS server. This extension + utilizes the Status-Server (12) Code, which was reserved for + experimental use in RFC 2865. Table of Contents 1. Introduction ............................................. 4 - 1.1. Terminology ......................................... 4 - 1.2. Requirements Language ............................... 5 + 1.1. Applicability ....................................... 4 + 1.2. Terminology ......................................... 5 + 1.3. Requirements Language ............................... 5 2. Problem Statement ........................................ 6 - 2.1. Overloading Access-Request .......................... 6 + 2.1. Why Access-Request cannot be used ................... 6 2.1.1. Recommendation against Access-Request .......... 7 - 2.2. Overloading Accounting-Request ...................... 7 + 2.2. Why Accounting-Request cannot be used ............... 7 2.2.1. Recommendation against Accounting-Request ...... 8 - 2.3. Status-Server as a Solution ......................... 8 - 2.3.1. Status-Server to the RADIUS Authentication port 8 - 2.3.2. Status-Server to the RADIUS Accounting port .... 9 + 2.3. Why Status-Server is appropriate .................... 8 + 2.3.1. Status-Server Exchange ......................... 8 3. 
Packet Format ............................................ 9 3.1. Single definition for Status-Server ................. 11 4. Implementation notes ..................................... 11 4.1. Client Requirements ................................. 12 - 4.2. Server Requirements ................................. 14 + 4.2. Server Requirements ................................. 13 4.3. More Robust Fail-over with Status-Server ............ 15 - 4.4. Proxy Server handling of Status-Server .............. 16 - 4.5. Realm Routing ....................................... 16 + 4.4. Proxy Server handling of Status-Server .............. 15 + 4.5. Limitations of Status-Server ........................ 16 4.6. Management Information Base (MIB) Considerations .... 18 4.6.1. Interaction with RADIUS Server MIB modules ..... 18 - 4.6.2. Interaction with RADIUS Client MIB modules ..... 19 + 4.6.2. Interaction with RADIUS Client MIB modules ..... 18 5. Table of Attributes ...................................... 19 -6. Examples ................................................. 20 - 6.1. Minimal Query to Authentication Port ................ 20 - 6.2. Minimal Query to Accounting Port .................... 21 - 6.3. Verbose Query and Response .......................... 22 +6. Examples ................................................. 19 + 6.1. Minimal Query to Authentication Port ................ 19 + 6.2. Minimal Query to Accounting Port .................... 20 + 6.3. Verbose Query and Response .......................... 21 7. IANA Considerations ...................................... 22 -8. Security Considerations .................................. 23 -9. References ............................................... 23 - 9.1. Normative references ................................ 23 +8. Security Considerations .................................. 22 +9. References ............................................... 22 + 9.1. Normative references ................................ 22 9.2. 
Informative references .............................. 23 1. Introduction - The RADIUS Working Group was formed in 1995 to document the protocol - of the same name, and created a number of standards surrounding the - protocol. It also defined experimental commands within the protocol, - without elaborating further on the potential uses of those commands. - One of the commands so defined was Status-Server ([RFC2865] Section - 3.). - - This document describes how some current implementations are using - Status-Server packets as a method for querying the status of a RADIUS - server. These queries do not otherwise affect the normal operation - of a server, and do not result in any side effects other than perhaps - incrementing an internal packet counter. + This document specifies a deployed extension to the Remote + Authentication Dial In User Service (RADIUS) protocol, enabling + clients to query the status of a RADIUS server. While the Status- + Server Code (12) was defined as experimental in [RFC2865] Section 3, + details of the operation and potential uses of the Code were not + provided. - These queries are not intended to implement the application-layer - watchdog messages described in [RFC3539] Section 3.4. That document - describes Authentication, Authorization, and Accounting (AAA) - protocols that run over reliable transports which handle - retransmissions internally. Since RADIUS runs over the User Datagram - Protocol (UDP) rather than Transport Control Protocol (TCP), the full - watchdog mechanism is not applicable here. + As with the core RADIUS protocol, the Status-Server extension is + stateless, and queries do not otherwise affect the normal operation + of a server, nor do they result in any side effects, other than + perhaps incrementing of an internal packet counter. 
Most of the + implementations of this extension have utilized it alongside + implementations of RADIUS as defined in [RFC2865], so that this + document focuses solely on the use of this extension with UDP + transport. The rest of this document is laid out as follows. Section 2 contains the problem statement, and explanations as to why some possible solutions can have unwanted side effects. Section 3 defines the Status-Server packet format. Section 4 contains client and server requirements, along with some implementation notes. Section 5 lists additional considerations not covered in the other sections. The remaining text contains a RADIUS table of attributes, and discusses security considerations not covered elsewhere in the document. -1.1. Terminology +1.1. Applicability + + This protocol is being recommended for publication as an + Informational RFC rather than as a standards-track RFC because of + problems with deployed implementations. This includes security + vulnerabilities. The fixes recommended here are compatible with + existing servers that receive Status-Server packets, but impose new + security requirements on clients that send Status-Server packets. + + Some existing implementations of this protocol do not support the + Message-Authenticator attribute. This enables spoofing of Status- + Server packets. In order to remedy this problem, this specification + recommends the use of the Message-Authenticator attribute to provide + per-packet authentication and integrity protection. + + With existing implementations of this protocol, the potential exists + for Status-Server requests to be in conflict with Access-Request or + Accounting-Requests packets using the same Identifier. This + specification recommends techniques to avoid this problem. + + This specification is also limited to being a "hop by hop" query. 
+ + When RADIUS packets transition one or more RADIUS Proxies, any + information about the status of downstream servers is unavailable to + the client. In addition, it queries only the status of a RADIUS + server, cannot carry information about specific realms. + + These limitations are discussed in more detail below. + +1.2. Terminology This document uses the following terms: Network Access Server (NAS) The device providing access to the network. Also known as the - Authenticator (in IEEE 802.1x terminology) or RADIUS client. - -Home Server - A RADIUS server that is authoritative for user authorization and - authentication. + Authenticator (in IEEE 802.1X terminology) or RADIUS client. -Proxy Server - A RADIUS server that acts as a Home Server to the NAS, but in turn - proxies the request to another Proxy Server, or to a Home Server. +RADIUS Proxy + In order to provide for the routing of RADIUS authentication and + accounting requests, a RADIUS proxy can be employed. To the NAS, + the RADIUS proxy appears to act as a RADIUS server, and to the + RADIUS server, the proxy appears to act as a RADIUS client. silently discard This means the implementation discards the packet without further processing. The implementation MAY provide the capability of logging the error, including the contents of the silently discarded packet, and SHOULD record the event in a statistics counter. -1.2. Requirements Language +1.3. Requirements Language In this document, several words are used to signify the requirements of the specification. The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119]. 2. Problem Statement - It is often useful to know if a RADIUS server is alive and responding - to requests. 
The most accurate way to obtain this information is to - query the server via application protocol traffic, as other methods - are either less accurate, or cannot be performed remotely. + A common problem in RADIUS client implementations is the + implementation of a robust fail-over mechanism between servers. A + client may have multiple servers configured, with one server marked + as primary and another marked as secondary. If the client does not + receive a response to a request sent to the primary server, it can + "fail over" to the secondary, and send requests to the secondary + instead of to the primary server. - The reasons for wanting to know the status of a server are many. The - administrator may simply be curious if the server is responding, and - may not have access to NAS or traffic data that would give him that - information. The queries may also be performed automatically by a - NAS or proxy server, which is configured to send packets to a RADIUS - server, and where that server may not be responding. That is, while - [RFC2865] Section 2.6 indicates that sending Keep-Alives is harmful, - it may be useful to send "Are you Alive" queries to a server once it - has been marked "dead" due to prior unresponsiveness. + However, it is possible that the lack of a response to requests sent + to the primary server was due not to a failure within the the + primary, but to alternative causes such as a failed link along the + path to the destination server, or the failure of a downstream proxy + or server. In such a situation, it may be useful for the client to + be able to distinguish between failure causes. For example, if the + primary server is down, then quick failover to the secondary server + would be prudent, whereas if a downstream failure is the cause, then + the value of failing over to a secondary server will depend on + whether packets forwarded by the secondary will utilize independent + links, intermediaries or destination servers. 
- The occasional query to a "dead" server offers little additional load - on the network or server, and permits clients to more quickly - discover when the server returns to a responsive state. Overall, - status queries can be a useful part of the deployment of a RADIUS - server. + Since the Status-Server packet is non-forwardable, lack of a response + may only be due to packet loss or the failure of the server in the + destination IP address, not due to faults in downstream links, + proxies or servers. It therefore provides an unambiguous indication + of the status of a server. -2.1. Overloading Access-Request + We note that this packet is not a "Keep-Alive" as discussed in + [RFC2865] Section 2.6. "Keep-Alives" are sent when an downstream + server is known to be responsive. These packets are sent only when a + server is suspected to be down, and stop being sent as soon as the + server returns to availability. + +2.1. Why Access-Request cannot be used One possible solution to the problem of querying server status is for a NAS to send specially formed Access-Request packets to a RADIUS server's authentication port. The NAS can then look for a response, and use this information to determine if the server is active or unresponsive. However, the server may see the request as a normal login request for a user, and conclude that a real user has logged onto that NAS. The server may then perform actions that are undesirable for a simple @@ -227,21 +259,21 @@ Access-Request packets solely to see if a server is alive. Similarly, site administrators SHOULD NOT configure test users whose sole reason for existence is to enable such queries via Access- Request packets. Note that it still may be useful to configure test users for the purpose of performing end-to-end or in-depth testing of a servers policy. While this practice is widespread, we caution administrators to use it with care. -2.2. Overloading Accounting-Request +2.2. 
Why Accounting-Request cannot be used A similar solution for the problem of querying server status may be for a NAS to send specially formed Accounting-Request packets to a RADIUS servers accounting port. The NAS can then look for a response, and use this information to determine if the server is active or unresponsive. As seen above with Access-Request, the server may then conclude that a real user has logged onto a NAS, and perform local site actions that are undesirable for a simple status query. @@ -259,72 +291,44 @@ Accounting-Request packets solely to see if a server is alive. Similarly, site administrators SHOULD NOT configure accounting policies whose sole reason for existence is to enable such queries via Accounting-Request packets. Note that it still may be useful to configure test users for the purpose of performing end-to-end or in-depth testing of a servers policy. While this practice is widespread, we caution administrators to use it with care. -2.3. Status-Server as a Solution +2.3. Why Status-Server is appropriate A better solution to the above problems is to use the Status-Server packet code. The name of the code leads us to conclude that it was intended for packets that query the status of a server. Since the - packet is otherwise undefined, it does not cause interoperability - issues to create implementation-specific definitions for it. The - difficulty until now has been defining an interoperable method of - performing these queries. - - This document addresses that need. - -2.3.1. Status-Server to the RADIUS Authentication port - - Status-Server SHOULD be used instead of Access-Request to query the - responsiveness of a server. In this use case, the protocol exchange - between client and server is similar to the usual exchange of Access- - Request and Access-Accept, as shown below. 
- - NAS RADIUS server - --- ------------- - Status-Server/ - Message-Authenticator -> - <- Access-Accept/ - Reply-Message - - The Status-Server packet MUST contain a Message-Authenticator - attribute for security. The response (if any) to a Status-Server - packet sent to an authentication port SHOULD be an Access-Accept - packet. Other response packet codes are NOT RECOMMENDED. The list - of attributes that are permitted in the Access-Accept packet is given - in the Table of Attributes in Section 6, below. - -2.3.2. Status-Server to the RADIUS Accounting port + packet is officially undefined, but widely used as specified here, + this document does not create inter-operability issues. - Status-Server MAY be used instead of Accounting-Request to query the - responsiveness of a server. In this use case, the protocol exchange - between client and server is similar to the usual exchange of - Accounting-Request and Accounting-Response, as shown below. +2.3.1. Status-Server Exchange - NAS RADIUS server - --- ------------- - Status-Server/ - Message-Authenticator -> - <- Accounting-Response + Status-Server packets are typically sent to the destination address + and port of a RADIUS server or proxy. A Message-Authenticator + attribute MUST be included so as to provide per-packet authentication + and integrity protection. A single Status-Server packet MUST be + included within a UDP datagram. RADIUS proxies MUST NOT forward + Status-Server packets. - The Status-Server packet MUST contain a Message-Authenticator - attribute for security. The response (if any) to a Status-Server - packet sent to an accounting port SHOULD be an Accounting-Response - packet. Other response packet codes are NOT RECOMMENDED. The list - of attributes that are permitted in the Accounting-Response packet is - given in the Table of Attributes in Section 6, below. 
+ A RADIUS server or proxy implementing this specification SHOULD + respond to a Status-Server packet with an Access-Accept + (authentication port) or Accounting-Message (accounting port). Other + response packet codes (such as Access-Challenge or Access-Reject) are + NOT RECOMMENDED. The list of attributes that are permitted in + Status-Server and Access-Accept packets responding to Status-Server + packets are provided in the Section 6. 3. Packet Format Status-Server packets reuse the RADIUS packet format, with the fields and values for those fields as defined [RFC2865] Section 3. We do not include all of the text or diagrams of that section here, but instead explain the differences required to implement Status-Server. The Authenticator field of Status-Server packets MUST be generated using the same method as that used for the Request Authenticator @@ -615,162 +619,153 @@ Those implementations SHOULD reply to Status-Server packets with an Access-Accept packet. The server MAY increment packet counters as a result of receiving a Status-Server, or sending a Response packet. The server SHOULD NOT perform any other action that is normally performed when it receives a Request packet, other than sending a Response packet. 4.3. More Robust Fail-over with Status-Server - A common problem in RADIUS client implementations is the - implementation of a robust fail-over mechanism between servers. A - client may have multiple servers configured, with one server marked - as primary and another marked as secondary. If the client determines - that the primary is unresponsive, it can "fail over" to the - secondary, and send requests to the secondary instead of to the - primary. - - However, it is difficult in standard RADIUS for a client to know when - it should start sending requests to the primary again. Sending test - Access-Requests or Accounting-Requests to see if the server is alive - has the issues outlined above in Section 2. 
Clients could - alternately send real traffic to the primary, on the hope that it is - responsive. If the server is still unresponsive, however, the result - may be user login failures. The Status-Server solution is an ideal - way to solve this problem. + A client will typically fail over from one server to another because + of a lack of responsiveness to normal RADIUS traffic. However, the + client has few reasons to mark the server as responsive, as it is not + being sent any packets. - When a client fails over from one server to another because of a lack - of responsiveness, it SHOULD send periodic Status-Server packets to - the unresponsive server, using the timer (Tw) defined above. + The solution is that the client SHOULD begin to send periodic Status- + Server packets as soon as a server is determined to be unresponsive. + The inter-packet period is Tw, as defined above in Section 4.1. + These packets will help the client determine if the failure was due + to the server being unresponsive, or if the problem is due to an + downstream server being unresponsive. Once three time periods have passed where Status-Server packets have been sent and responded to, the server should be deemed responsive and RADIUS requests may sent to it again. This determination should be made separately for each server that the client has a relationship with. The same algorithm should be used for both authentication and accounting ports. The client MUST treat each destination (ip, port) combination as a unique server for the purposes of this determination. The above behavior is modelled after [RFC3539] Section 3.4.1. We note that if a reliable transport is used for RADIUS, then the algorithms specified in [RFC3539] MUST be used in preference to the ones given here. 4.4. Proxy Server handling of Status-Server Many RADIUS servers can act as proxy servers, and can forward - requests to home servers. Such servers MUST NOT proxy Status-Server - packets. 
The purpose of Status-Server as specified here is to permit - the client to query the responsiveness of a server that it has a - direct relationship with. Proxying Status-Server queries would - negate any usefulness that may be gained by implementing support for - them. + requests to another RADIUS server. Such servers MUST NOT proxy + Status-Server packets. The purpose of Status-Server as specified + here is to permit the client to query the responsiveness of a server + that it has a direct relationship with. Proxying Status-Server + queries would negate any usefulness that may be gained by + implementing support for them. Proxy servers MAY be configured to respond to Status-Server queries from clients, and MAY act as clients sending Status-Server queries to other servers. However, those activities MUST be independent of one another. -4.5. Realm Routing +4.5. Limitations of Status-Server RADIUS servers are commonly used in an environment where Network Access Identifiers (NAIs) are used as routing identifiers [RFC4282]. In this practice, the User-Name attribute is decorated with realm routing information, commonly in the format of "user@realm". Since a particular RADIUS server may act as a proxy for more than one realm, - the mechanism outlined above may be inadequate. + we need to explain how the behavior defined above in Section 4.3, + above, affects realm routing. The schematic below demonstrates this scenario. - /-> Proxy Server P -----> Home Server for Realm A + /-> RADIUS Proxy P -----> RADIUS Server for Realm A / \ / - NAS X \ / \ - \-> Proxy Server S -----> Home Server for Realm B + \-> RADIUS Proxy S -----> RADIUS Server for Realm B - That is, the NAS has relationships with two Proxy Servers, P and S. - Each Proxy Server has relationships with Home Servers for both Realm + That is, the NAS has relationships with two RADIUS Proxies, P and S. + Each RADIUS Proxy has relationships with RADIUS Servers for both Realm A and Realm B. 
- In this scenario, the Proxy Servers can determine if one or both of - the Home Servers are dead or unreachable. The NAS can determine if - one or both of the Proxy Servers are dead or unreachable. There is + In this scenario, the RADIUS Proxies can determine if one or both of + the RADIUS Servers are dead or unreachable. The NAS can determine if + one or both of the RADIUS Proxies are dead or unreachable. There is an additional case to consider, however. - If Proxy Server P cannot reach the Home Server for Realm A, but the - Proxy Server S can reach that Home Server, then the NAS cannot + If RADIUS Proxy P cannot reach the RADIUS Server for Realm A, but the + RADIUS Proxy S can reach that RADIUS Server, then the NAS cannot discover this information using the Status-Server queries as outlined above. It would therefore be useful for the NAS to know that Realm A - is reachable from Proxy Server S, as it can then route all requests - for Realm A to that Proxy Server. Without this knowledge, the client - may route requests to Proxy Server P, where they may be discarded or + is reachable from RADIUS Proxy S, as it can then route all requests + for Realm A to that RADIUS Proxy. Without this knowledge, the client + may route requests to RADIUS Proxy P, where they may be discarded or rejected. - To complicate matters, the behavior of Proxy Servers P and S in this + To complicate matters, the behavior of RADIUS Proxies P and S in this situation is not well defined. Some implementations simply fail to respond to the request, and other implementations respond with an Access-Reject. If the implementation fails to respond, then the NAS - cannot distinguish between the Proxy Server being down, or the next + cannot distinguish between the RADIUS Proxy being down, or the next server along the proxy chain being unreachable. In the worst case, failures in routing for Realm A may affect users - of Realm B. 
For example, if Proxy Server P can reach Realm B but not - Realm A, and Proxy Server S can reach Realm A but not Realm B, then + of Realm B. For example, if RADIUS Proxy P can reach Realm B but not + Realm A, and RADIUS Proxy S can reach Realm A but not Realm B, then active paths exist to handle all RADIUS requests. However, depending - on the NAS and Proxy Server implementation choices, the NAS may not + on the NAS and RADIUS Proxy implementation choices, the NAS may not be able to determine which server requests may be sent to in order to maintain network stability. This problem cannot, unfortunately be solved by using Status-Server requests. A robust solution would involve either a RADIUS routing table for the NAI realms, or a RADIUS "destination unreachable" response to authentication requests. Either solution would not fit into the traditional RADIUS model, and both are therefore outside of the scope of this specification. The problem is discussed here in order to define how best to use Status-Server in this situation, rather than to define a new solution. When a server has responded recently to a request from a client, that client MUST mark the server as "responsive". In the above case, a - Proxy Server may be responding to requests destined for Realm A, but + RADIUS Proxy may be responding to requests destined for Realm A, but not responding to requests destined for Realm B. The client therefore considers the server to be responsive, as it is receiving responses from the server. - The client will then continue to send requests to the Proxy Server - for destination Realm B, even though the Proxy Server cannot route + The client will then continue to send requests to the RADIUS Proxy + for destination Realm B, even though the RADIUS Proxy cannot route the requests to that destination. This failure is a known limitation of RADIUS, and can be partially addressed through the use of failover - in the Proxy Servers. + in the RADIUS Proxies. 
A more realistic situation than the one outlined above is where each - Proxy Server also has multiple choices of Home Servers for a realm, + RADIUS Proxy also has multiple choices of RADIUS Servers for a realm, as outlined below. - /-> Proxy Server P -----> Home Server P + /-> RADIUS Proxy P -----> RADIUS Server P / \ / NAS X \ / \ - \-> Proxy Server S -----> Home Server S + \-> RADIUS Proxy S -----> RADIUS Server S In this situation, if all participants implement Status-Server as defined herein, any one link may be broken, and all requests from the - NAS will still reach a home server. If two links are broken at + NAS will still reach a RADIUS Server. If two links are broken at different places, (i.e. not both links from the NAS), then all - requests from the NAS will still reach a home server. In many + requests from the NAS will still reach a RADIUS Server. In many situations where three or more links are broken, then requests from - the NAS may still reach a home server. + the NAS may still reach a RADIUS Server. It is RECOMMENDED, therefore, that implementations desiring the most benefit from Status-Server also implement server failover. The combination of these two practices will maximize network reliability and stability. 4.6. Management Information Base (MIB) Considerations 4.6.1. Interaction with RADIUS Server MIB modules