draft-ietf-rserpool-arch-03.txt   draft-ietf-rserpool-arch-04.txt 
Network Working Group M. Tuexen Network Working Group M. Tuexen
Internet-Draft Siemens AG Internet-Draft Siemens AG
Expires: November 30, 2002 Q. Xie Expires: May 5, 2003 Q. Xie
Motorola, Inc. Motorola, Inc.
R. Stewart R. Stewart
M. Shore M. Shore
Cisco Systems, Inc. Cisco Systems, Inc.
L. Ong L. Ong
Ciena Corporation Ciena Corporation
J. Loughney J. Loughney
Nokia Research Center Nokia Research Center
M. Stillman M. Stillman
Nokia Nokia
June 2002 November 4, 2002
Architecture for Reliable Server Pooling Architecture for Reliable Server Pooling
draft-ietf-rserpool-arch-03.txt draft-ietf-rserpool-arch-04.txt
Status of this Memo Status of this Memo
This document is an Internet-Draft and is in full conformance with This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026. all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet- other groups may also distribute working documents as Internet-
Drafts. Drafts.
skipping to change at page 1, line 42 skipping to change at page 1, line 42
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at http:// The list of current Internet-Drafts can be accessed at http://
www.ietf.org/ietf/1id-abstracts.txt. www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
This Internet-Draft will expire on November 30, 2002. This Internet-Draft will expire on May 5, 2003.
Copyright Notice Copyright Notice
Copyright (C) The Internet Society (2002). All Rights Reserved. Copyright (C) The Internet Society (2002). All Rights Reserved.
Abstract Abstract
This document describes an architecture and protocols for the This document describes an architecture and protocols for the
management and operation of server pools supporting highly reliable management and operation of server pools supporting highly reliable
applications, and for client access mechanisms to a server pool. applications, and for client access mechanisms to a server pool.
skipping to change at page 2, line 21 skipping to change at page 2, line 21
1.2 Terminology . . . . . . . . . . . . . . . . . . . . . . . . 3 1.2 Terminology . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3 Abbreviations . . . . . . . . . . . . . . . . . . . . . . . 4 1.3 Abbreviations . . . . . . . . . . . . . . . . . . . . . . . 4
2. Reliable Server Pooling Architecture . . . . . . . . . . . . 5 2. Reliable Server Pooling Architecture . . . . . . . . . . . . 5
2.1 RSerPool Functional Components . . . . . . . . . . . . . . . 5 2.1 RSerPool Functional Components . . . . . . . . . . . . . . . 5
2.2 RSerPool Protocol Overview . . . . . . . . . . . . . . . . . 6 2.2 RSerPool Protocol Overview . . . . . . . . . . . . . . . . . 6
2.2.1 Endpoint Name Resolution Protocol . . . . . . . . . . . . . 6 2.2.1 Endpoint Name Resolution Protocol . . . . . . . . . . . . . 6
2.2.2 Aggregate Server Access Protocol . . . . . . . . . . . . . . 6 2.2.2 Aggregate Server Access Protocol . . . . . . . . . . . . . . 6
2.2.3 PU <-> NS Communication . . . . . . . . . . . . . . . . . . 7 2.2.3 PU <-> NS Communication . . . . . . . . . . . . . . . . . . 7
2.2.4 PE <-> NS Communication . . . . . . . . . . . . . . . . . . 7 2.2.4 PE <-> NS Communication . . . . . . . . . . . . . . . . . . 7
2.2.5 PU <-> PE Communication . . . . . . . . . . . . . . . . . . 8 2.2.5 PU <-> PE Communication . . . . . . . . . . . . . . . . . . 8
2.2.6 NS <-> NS Communication . . . . . . . . . . . . . . . . . . 9 2.2.6 NS <-> NS Communication . . . . . . . . . . . . . . . . . . 8
2.2.7 PE <-> PE Communication . . . . . . . . . . . . . . . . . . 9 2.2.7 PE <-> PE Communication . . . . . . . . . . . . . . . . . . 9
2.3 Failover Support . . . . . . . . . . . . . . . . . . . . . . 9 2.3 Failover Support . . . . . . . . . . . . . . . . . . . . . . 9
2.3.1 Testament . . . . . . . . . . . . . . . . . . . . . . . . . 10 2.3.1 Testament . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.3.2 Cookies . . . . . . . . . . . . . . . . . . . . . . . . . . 10 2.3.2 Cookies . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.3.3 Application level acknowledgements . . . . . . . . . . . . . 11 2.3.3 Application level acknowledgements . . . . . . . . . . . . . 11
2.3.4 Business Cards . . . . . . . . . . . . . . . . . . . . . . . 11 2.3.4 Business Cards . . . . . . . . . . . . . . . . . . . . . . . 11
2.4 Typical Interactions between RSerPool Components . . . . . . 11 2.4 Typical Interactions between RSerPool Components . . . . . . 11
3. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 14 3. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.1 Two File Transfer Examples . . . . . . . . . . . . . . . . . 14 3.1 Two File Transfer Examples . . . . . . . . . . . . . . . . . 14
3.1.1 The RSerPool Aware Client . . . . . . . . . . . . . . . . . 15 3.1.1 The RSerPool Aware Client . . . . . . . . . . . . . . . . . 15
3.1.2 The RSerPool Unaware Client . . . . . . . . . . . . . . . . 16 3.1.2 The RSerPool Unaware Client . . . . . . . . . . . . . . . . 16
3.2 Telephony Signaling Example . . . . . . . . . . . . . . . . 17 3.2 Telephony Signaling Example . . . . . . . . . . . . . . . . 17
3.2.1 Decomposed GWC and GK Scenario . . . . . . . . . . . . . . . 17 3.2.1 Decomposed GWC and GK Scenario . . . . . . . . . . . . . . . 17
3.2.2 Collocated GWC and GK Scenario . . . . . . . . . . . . . . . 19 3.2.2 Collocated GWC and GK Scenario . . . . . . . . . . . . . . . 19
4. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 21 4. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 21
References . . . . . . . . . . . . . . . . . . . . . . . . . 22 References . . . . . . . . . . . . . . . . . . . . . . . . . 22
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . 22 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . 22
Full Copyright Statement . . . . . . . . . . . . . . . . . . 24 Full Copyright Statement . . . . . . . . . . . . . . . . . . 24
1. Introduction 1. Introduction
1.1 Overview 1.1 Overview
This document defines a proposed architecture, which can be used to This document defines an architecture, for providing a highly
provide highly available services. The way this is achieved is by available reliable server function in support of some service. The
using servers grouped into pools. Therefore, if a client wants to way this is achieved is by forming a pool of servers, each of which
access a server pool, it will be able to use any of the servers in is capable of supporting the desired service, and providing a name
the server pool. Several server selection mechanisms, called server service that will resolve requests from a service user to the
pool policies, are supported. identity of a working server in the pool.
To access a server pool, the pool user consults a name server. The To access a server pool, the pool user consults a name server. The
name space for the server pools is flat, rather than hierachical. A name service itself can be provided by a pool of name servers using a
group of fault tolerant name servers are provided to resolve pool shared protocol to make the name resolution function fault-tolerant.
name queries from the pools user. It is assumed that the name space is kept flat and designed for a
limited scale in order to keep the protocols simple, robust and fast.
The server pool itself is supported by a shared protocol between
servers and the name service allowing servers to enter and exit the
pool. Several server selection mechanisms, called server pool
policies, are supported for flexibility.
1.2 Terminology 1.2 Terminology
This document uses the following terms: This document uses the following terms:
Home Name Server: The Name Server a Pool Element has registered with. Home Name Server: The Name Server a Pool Element has registered with.
This Name Server supervises the Pool Element. This Name Server supervises the Pool Element.
Operation scope: The part of the network visible to pool users by a Operation scope: The part of the network visible to pool users by a
specific instance of the reliable server pooling protocols. specific instance of the reliable server pooling protocols.
skipping to change at page 5, line 30 skipping to change at page 5, line 30
the same application functionality. These servers are called Pool the same application functionality. These servers are called Pool
Elements (PEs). PEs form the first class of entities in the RSerPool Elements (PEs). PEs form the first class of entities in the RSerPool
architecture. Multiple PEs in a server pool can be used to provide architecture. Multiple PEs in a server pool can be used to provide
fault tolerance or load sharing, for example. fault tolerance or load sharing, for example.
Each server pool will be identifiable by a unique name which is Each server pool will be identifiable by a unique name which is
simply a byte string, called the pool handle. This allows binary simply a byte string, called the pool handle. This allows binary
names to be used. names to be used.
These names are not valid in the whole internet but only in smaller These names are not valid in the whole internet but only in smaller
parts, called the operational scope. Furthermore, the namespace is domains, called the operational scope. Furthermore, the namespace is
flat. assumed to be flat, so that multiple levels of query are not
necessary to resolve a name request.
The second class of entities in the RSerPool architecture is the The second class of entities in the RSerPool architecture is the
class of the name servers. These name servers can resolve a pool class of name servers (NSs). These name servers can resolve a pool
handle to a list of information which allows the PU to access a PE of handle to a list of information which allows the PU to access a PE of
the server pool identified by the handle. This information includes: the server pool identified by the handle. This information includes:
o A list of IPv4 and/or IPv6 addresses. o A list of IPv4 and/or IPv6 addresses.
o A protocol field of the IP header specifying the upper layer o A protocol field of the IP header specifying the transport layer
protocol. protocol or protocols.
o A port number if the upper layer protocol is SCTP, TCP or UDP. o A port number associated with the transport protocol, e.g.
SCTP, TCP or UDP.
Please note that the RSerPool architecture supports both IPv4 and Please note that the RSerPool architecture supports both IPv4 and
IPv6 addressing. A PE can also support multiple transport layers. IPv6 addressing. A PE can also support multiple transport layers.
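
The resolution result described above (a list of addresses, an IP protocol field and a port per transport) can be pictured as a small record type behind a flat lookup table. The following is only an illustrative sketch, not part of the draft: the names TransportAddress, NameTable, add and resolve are hypothetical, and the example addresses and port are invented for the illustration. The protocol numbers are the IANA-assigned values for SCTP, TCP and UDP.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# IANA-assigned IP protocol numbers for the transports named in the text.
IPPROTO_TCP = 6
IPPROTO_UDP = 17
IPPROTO_SCTP = 132


@dataclass(frozen=True)
class TransportAddress:
    """One way to reach a PE (hypothetical record, not defined by the draft)."""
    ip_addresses: Tuple[str, ...]  # IPv4 and/or IPv6 addresses
    protocol: int                  # value of the IP header protocol field
    port: int                      # port number for SCTP, TCP or UDP


class NameTable:
    """Flat mapping from a pool handle (a byte string) to PE addresses."""

    def __init__(self) -> None:
        self._pools: Dict[bytes, List[TransportAddress]] = {}

    def add(self, pool_handle: bytes, address: TransportAddress) -> None:
        self._pools.setdefault(pool_handle, []).append(address)

    def resolve(self, pool_handle: bytes) -> List[TransportAddress]:
        # Flat namespace: a single lookup, no multi-level query.
        return list(self._pools.get(pool_handle, []))


table = NameTable()
# Pool handles are plain byte strings, so binary names are allowed.
table.add(b"\x00file-pool",
          TransportAddress(("192.0.2.1", "2001:db8::1"), IPPROTO_SCTP, 5000))
table.add(b"\x00file-pool",
          TransportAddress(("192.0.2.2",), IPPROTO_TCP, 5000))
```

Because the namespace is flat, resolve() is a single dictionary lookup. A PE that supports several transport layers simply appears with more than one entry.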
In each operational scope there must be at least one name server. In each operational scope there must be at least one name server.
Most likely there will be more than one. All these name servers have All name servers within the operational scope have knowledge of all
the complete knowledge about all server pools in the operational server pools within the operational scope.
scope. The name servers use a protocol called Endpoint Name
Resolution Protocol (ENRP) for communication with each other to make
sure that all have the same information about the server pools.
A client being served by a PE of a server pool is called a Pool User
(PU). This is the third class of entities in the RSerPool
architecture.
If the PU wants to be served by a PE of a particular server pool it
must know the pool handle of the server pool. The PU then uses the
Aggregate Server Access Protocol (ASAP) to query for transport layer
addresses of PEs belonging to the server pool identified by the pool
handle.
RFC3237 [7] also requires that the name servers should not resolve a
pool handle to a transport layer address of a PE which is not in
operation. Therefore each PE is supervised by one specific name
server, called the home NS of that PE. If it detects that the PE is
out of service all other name servers are informed by using ENRP.
ASAP is also used by a server to join or leave a server pool. It A third class of entities in the architecture is the Pool User (PU)
registers or deregisters itself by communicating with a name server, class, consisting of the clients being served by the PEs of a server
which will normally be the home NS. pool.
2.2 RSerPool Protocol Overview 2.2 RSerPool Protocol Overview
The RSerPool requested features can be obtained with the help of the The RSerPool requested features can be obtained with the help of the
combination of two protocols: ENRP (Endpoint Name Resolution combination of two protocols: ENRP (Endpoint Name Resolution
Protocol) and ASAP (Aggregate Server Access Protocol). Protocol) and ASAP (Aggregate Server Access Protocol).
2.2.1 Endpoint Name Resolution Protocol 2.2.1 Endpoint Name Resolution Protocol
The name servers use a protocol called Endpoint Name Resolution
Protocol (ENRP) for communication with each other to make sure that
all have the same information about the server pools.
ENRP is designed to provide a fully distributed fault-tolerant real- ENRP is designed to provide a fully distributed fault-tolerant real-
time translation service that maps a name to a set of transport time translation service that maps a name to a set of transport
addresses pointing to a specific group of networked communication addresses pointing to a specific group of networked communication
endpoints registered under that name. ENRP employs a client-server endpoints registered under that name. ENRP employs a client-server
model with which a name server will respond to the name translation model with which a name server will respond to the name translation
service requests from endpoint clients running on the same host or service requests from endpoint clients running on the same host or
running on different hosts. running on different hosts.
RFC3237 [7] also requires that the name servers should not resolve a
pool handle to a transport layer address of a PE which is not in
operation. Therefore each PE is supervised by one specific name
server, called the home NS of that PE. If it detects that the PE is
out of service all other name servers are informed by using ENRP.
2.2.2 Aggregate Server Access Protocol 2.2.2 Aggregate Server Access Protocol
The PU wanting service from the pool uses the Aggregate Server Access
Protocol (ASAP) to access members of the pool. Depending on the
level of support desired by the application, use of ASAP may be
limited to an initial query for an active PE, or ASAP may be used to
mediate all communication between the PU and PE, so that automatic
failover from a failed PE to an alternate PE can be supported.
ASAP in conjunction with ENRP provides a fault tolerant data transfer ASAP in conjunction with ENRP provides a fault tolerant data transfer
mechanism over IP networks. ASAP uses a name-based addressing model mechanism over IP networks. ASAP uses a name-based addressing model
which isolates a logical communication endpoint from its IP which isolates a logical communication endpoint from its IP
address(es), thus effectively eliminating the binding between the address(es), thus effectively eliminating the binding between the
communication endpoint and its physical IP address(es) which normally communication endpoint and its physical IP address(es) which normally
constitutes a single point of failure. constitutes a single point of failure.
In addition, ASAP defines each logical communication destination as a In addition, ASAP defines each logical communication destination as a
server pool, providing full transparent support for server-pooling server pool, providing full transparent support for server-pooling
and load sharing. It also allows dynamic system scalability - and load sharing.
members of a server pool can be added or removed at any time without
interrupting the service. ASAP is also used by a server to join or leave a server pool. It
registers or deregisters itself by communicating with a name server,
which will normally be the home NS. ASAP allows dynamic system
scalability, allowing the pool membership to change at any time
without interruption of the service.
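
As a rough illustration of the registration behaviour described in this section, the sketch below models a name server's view of pool membership. A PE registers or deregisters at any time, and a PE that its home NS has found to be out of service is no longer returned by a name query. This is an assumption-laden toy, not the ASAP or ENRP message exchange: the class PoolRegistry and its method names are hypothetical, and the propagation of changes to other name servers is not modelled.

```python
from typing import Dict, List


class PoolRegistry:
    """Toy model of pool membership as seen by one name server."""

    def __init__(self) -> None:
        # pool handle -> {PE identifier: True if the PE is in service}
        self._pools: Dict[bytes, Dict[str, bool]] = {}

    def register(self, pool_handle: bytes, pe_id: str) -> None:
        """A PE joins a pool; the pool is created on first registration."""
        self._pools.setdefault(pool_handle, {})[pe_id] = True

    def deregister(self, pool_handle: bytes, pe_id: str) -> None:
        """A PE leaves a pool; an empty pool disappears."""
        members = self._pools.get(pool_handle, {})
        members.pop(pe_id, None)
        if not members:
            self._pools.pop(pool_handle, None)

    def mark_out_of_service(self, pool_handle: bytes, pe_id: str) -> None:
        """Record that supervision found the PE not in operation."""
        members = self._pools.get(pool_handle, {})
        if pe_id in members:
            members[pe_id] = False

    def resolve(self, pool_handle: bytes) -> List[str]:
        """Return only PEs that are registered and in service."""
        members = self._pools.get(pool_handle, {})
        return sorted(pe for pe, in_service in members.items() if in_service)


registry = PoolRegistry()
registry.register(b"file-pool", "PE 1")
registry.register(b"file-pool", "PE 2")
registry.mark_out_of_service(b"file-pool", "PE 2")
active = registry.resolve(b"file-pool")
registry.deregister(b"file-pool", "PE 1")
after_leave = registry.resolve(b"file-pool")
```

Membership changes only touch the table. Pool users keep issuing the same name query, which is what lets the pool grow or shrink without interrupting the service.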
2.2.3 PU <-> NS Communication 2.2.3 PU <-> NS Communication
The PU <-> NS communication is used for doing name queries. The PU The PU <-> NS communication is used for doing name queries. The PU
sends a pool handle to the NS and gets back the information necessary sends a pool handle to the NS and gets back the information necessary
for accessing a server in a server pool. for accessing a server in a server pool.
******** ******** ******** ********
* PU * * NS * * PU * * NS *
******** ******** ******** ********
+------+ +------+ +------+ +------+
| ASAP | | ASAP | | ASAP | | ASAP |
+------+ +------+ +------+ +------+
| SCTP | | SCTP | | SCTP | | SCTP |
+------+ +------+ +------+ +------+
| IP | | IP | | IP | | IP |
+------+ +------+ +------+ +------+
Protocol stack between PU and NS (SCTP variant) Protocol stack between PU and NS
If the PU does not use SCTP based services it may not be appropriate
to implement SCTP of PUs just to do the name queries. Therefore ASAP
over TCP can be used for doing the name queries. The protocol stack
is shown in the following figure.
******** ********
* PU * * NS *
******** ********
+------+ +------+
| ASAP | | ASAP |
+------+ +------+
| TCP | | TCP |
+------+ +------+
| IP | | IP |
+------+ +------+
Protocol stack between PU and NS (TCP variant)
2.2.4 PE <-> NS Communication 2.2.4 PE <-> NS Communication
The PE <-> NS communication is used for registration and The PE <-> NS communication is used for registration and
deregistration of the PE in one or more pools and for the deregistration of the PE in one or more pools and for the
supervision of the PE by the home NS. This communication is based on supervision of the PE by the home NS. This communication is based on
SCTP, the protocol stack is shown in the following figure. SCTP, the protocol stack is shown in the following figure.
******** ******** ******** ********
* PE * * NS * * PE * * NS *
skipping to change at page 8, line 47 skipping to change at page 8, line 32
o The PE can send cookies to the PU. The PE would store only the o The PE can send cookies to the PU. The PE would store only the
last cookie and send it to the new PE in case of a failover. last cookie and send it to the new PE in case of a failover.
o Both the PE and PU can send application level acknowledgements to o Both the PE and PU can send application level acknowledgements to
provide a user controlled buffer management at the RSerPool layer. provide a user controlled buffer management at the RSerPool layer.
See Section 2.3 for further details. See Section 2.3 for further details.
The control channel is transported using the ASAP protocol making use The control channel is transported using the ASAP protocol making use
of SCTP or TCP as its transport protocol. The control and data of SCTP as its transport protocol. The control and data channel may
channel may be transported over a single transport layer connection. be transported over a single transport layer connection.
2.2.6 NS <-> NS Communication 2.2.6 NS <-> NS Communication
The communication between name servers is used to share the knowledge The communication between name servers is used to share the knowledge
about all server pools between all name servers in an operational about all server pools between all name servers in an operational
scope. scope.
******** ******** ******** ********
* NS * * NS * * NS * * NS *
******** ******** ******** ********
skipping to change at page 10, line 37 skipping to change at page 10, line 33
. . . .
. +-------+ . . +-------+ .
. | | . . | | .
. | PE 3 | . . | PE 3 | .
. | | . . | | .
. +-------+ . . +-------+ .
....................... .......................
Two PE accessing the same PE Two PE accessing the same PE
PU 1 is using PE 2 of the server pool. Assume that PE 1 and PE 2 PU 1 is using PE 2 of the server pool. Assume that PE 1 and PE 2
share state but not PE 2 and PE 3. Using the testament it is share state but not PE 2 and PE 3. Using the testament of PE 2 it is
possible for PE 2 to inform PU 1 that it should fail over to PE 1 in possible for PE 2 to inform PU 1 that it should fail over to PE 1 in
case of a failure. case of a failure.
A slightly more complicated situation is if two pool users, PU 1 and A slightly more complicated situation is if two pool users, PU 1 and
PU 2, use PE 2 but both, PU 1 and PU 2, need to use the same PE. PU 2, use PE 2 but both, PU 1 and PU 2, need to use the same PE.
Then it is important that PU 1 and PU 2 fail over to the same PE. Then it is important that PU 1 and PU 2 fail over to the same PE.
This can be handled in a way such that PE 2 gives the same testament This can be handled in a way such that PE 2 gives the same testament
to PU 1 and PU 2. to PU 1 and PU 2.
2.3.2 Cookies 2.3.2 Cookies
Cookies are sent from the PE to the PU whenever the PE wants this to Cookies may be sent from the PE to the PU if the PE wants to do so.
do. The PU only stores the last received cookie. In case of a fail The PU only stores the last received cookie. In case of a fail over
over it sends this last recveived cookie to the new PE. This method it sends this last received cookie to the new PE. This method
provides a simple way of state sharing between the PE. Please note provides a simple way of state sharing between the PEs. Please note
that the PE should sign the cookie and the receiving PE has to that the old PE should sign the cookie and the receiving PE should
verifiy the signature. For the PU is cookie has no structure and is verify the signature. For the PU, the cookie has no structure and is
does only store it. only stored and transmitted to the new PE.
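
The cookie mechanism can be sketched as follows. The draft only says that the old PE should sign the cookie and the receiving PE should verify it; it does not name an algorithm. The sketch therefore assumes, purely for illustration, an HMAC-SHA256 tag computed with a key shared by the PEs of the pool, used here as a stand-in for a signature. POOL_KEY, make_cookie, open_cookie and PoolUser are hypothetical names.

```python
import hashlib
import hmac
from typing import Optional

# Assumption: all PEs of the pool share this key; the PU never sees it.
POOL_KEY = b"example-shared-pool-key"
TAG_LEN = hashlib.sha256().digest_size  # 32 bytes


def make_cookie(state: bytes, key: bytes = POOL_KEY) -> bytes:
    """Old PE: protect its state and hand the result to the PU."""
    tag = hmac.new(key, state, hashlib.sha256).digest()
    return tag + state


def open_cookie(cookie: bytes, key: bytes = POOL_KEY) -> bytes:
    """New PE: verify the tag before trusting the state."""
    tag, state = cookie[:TAG_LEN], cookie[TAG_LEN:]
    expected = hmac.new(key, state, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("cookie verification failed")
    return state


class PoolUser:
    """The PU treats the cookie as opaque and keeps only the latest one."""

    def __init__(self) -> None:
        self.last_cookie: Optional[bytes] = None

    def on_cookie(self, cookie: bytes) -> None:
        self.last_cookie = cookie  # overwrite: only the last cookie is kept

    def cookie_for_failover(self) -> Optional[bytes]:
        return self.last_cookie


pu = PoolUser()
pu.on_cookie(make_cookie(b"offset=1024"))
pu.on_cookie(make_cookie(b"offset=2048"))
# After a failover the PU sends its last cookie to the new PE.
restored = open_cookie(pu.cookie_for_failover())
```

The PU never parses the cookie, which matches the statement that the cookie has no structure for the PU. Verification on the receiving PE guards against a cookie that was altered while held by the PU.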
2.3.3 Application level acknowledgements 2.3.3 Application level acknowledgements
In case of a failure an upper layer might want to retrieve some data In case of a failure an upper layer might want to retrieve some data
from the communication to the failed PE and transfer it to the new from the communication to the failed PE and transfer it to the new
one. Because this data retrieval problem can not be completely one. Because this data retrieval problem can not be completely
solved in a general way (and provide neither message loss nor message solved in a general way (and provide neither message loss nor message
duplication) the ASAP layer only provides the support of application duplication) the ASAP layer only provides the support of application
layer acknowledgements. ASAP uses this for upper layer supported layer acknowledgements. ASAP uses this for upper layer supported
buffer management in the ASAP layer. buffer management in the ASAP layer.
2.3.4 Business Cards 2.3.4 Business Cards
In case of a PE to PE communication one of the PEs acts as a PU for In case of a PE to PE communication one of the PEs acts as a PU for
establishing the communication. But the peer does not know the pool establishing the communication. The receiving PE may not know the pool
handle of the PE which initiated the communication. A business card handle of the PE which initiated the communication. A business card
can be used for the PE acting as a PE to provide the peer with the can be used for the initiating PE to provide its peer with a pool
pool handle. So even in case the PE which acts as a PU fails the handle, allowing the peer PE to fail over the communication in case
other PE can fail over to a different PE in the pool of the PE which the initiating PE fails.
was initially acting as a PU.
2.4 Typical Interactions between RSerPool Components 2.4 Typical Interactions between RSerPool Components
The following drawing shows the typical RSerPool components and their The following drawing shows the typical RSerPool components and their
possible interactions with each other: possible interactions with each other:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~ operation scope ~ ~ operation scope ~
~ ......................... ......................... ~ ~ ......................... ......................... ~
~ . Server Pool 1 . . Server Pool 2 . ~ ~ . Server Pool 1 . . Server Pool 2 . ~
skipping to change at page 22, line 40 skipping to change at page 22, line 40
Authors' Addresses Authors' Addresses
Michael Tuexen Michael Tuexen
Siemens AG Siemens AG
ICN WN CC SE 7 ICN WN CC SE 7
D-81359 Munich D-81359 Munich
Germany Germany
Phone: +49 89 722 47210 Phone: +49 89 722 47210
EMail: Michael.Tuexen@icn.siemens.de EMail: Michael.Tuexen@siemens.com
Qiaobing Xie Qiaobing Xie
Motorola, Inc. Motorola, Inc.
1501 W. Shure Drive, #2309 1501 W. Shure Drive, #2309
Arlington Heights, IL 60004 Arlington Heights, IL 60004
USA USA
Phone: +1-847-632-3028 Phone: +1-847-632-3028
EMail: qxie1@email.mot.com EMail: qxie1@email.mot.com
Randall R. Stewart Randall R. Stewart
