draft-ietf-nfsv4-rpcrdma-bidirection-05.txt   draft-ietf-nfsv4-rpcrdma-bidirection-06.txt 
Network File System Version 4 C. Lever Network File System Version 4 C. Lever
Internet-Draft Oracle Internet-Draft Oracle
Intended status: Standards Track June 9, 2016 Intended status: Standards Track January 20, 2017
Expires: December 11, 2016 Expires: July 24, 2017
Bi-directional Remote Procedure Call On RPC-over-RDMA Transports Bi-directional Remote Procedure Call On RPC-over-RDMA Transports
draft-ietf-nfsv4-rpcrdma-bidirection-05 draft-ietf-nfsv4-rpcrdma-bidirection-06
Abstract Abstract
Minor versions of NFSv4 newer than NFSv4.0 work best when ONC RPC Minor versions of NFSv4 newer than NFSv4.0 work best when ONC RPC
transports can send Remote Procedure Call transactions in both transports can send Remote Procedure Call transactions in both
directions on the same connection. This document describes how RPC- directions on the same connection. This document describes how RPC-
over-RDMA transport endpoints convey RPCs in both directions on a over-RDMA transport endpoints convey RPCs in both directions on a
single connection. single connection.
Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on December 11, 2016. This Internet-Draft will expire on July 24, 2017.
Copyright Notice Copyright Notice
Copyright (c) 2016 IETF Trust and the persons identified as the Copyright (c) 2017 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1. Requirements Language . . . . . . . . . . . . . . . . . . 3 2. Understanding RPC Direction . . . . . . . . . . . . . . . . . 2
2. Understanding RPC Direction . . . . . . . . . . . . . . . . . 3 3. Immediate Uses Of Bi-Directional RPC-over-RDMA . . . . . . . 4
2.1. Forward Direction . . . . . . . . . . . . . . . . . . . . 3
2.2. Backward Direction . . . . . . . . . . . . . . . . . . . 4
2.3. Bi-directional Operation . . . . . . . . . . . . . . . . 4
2.4. XID Values . . . . . . . . . . . . . . . . . . . . . . . 4
3. Immediate Uses Of Bi-Directional RPC-over-RDMA . . . . . . . 5
3.1. NFSv4.0 Callback Operation . . . . . . . . . . . . . . . 5
3.2. NFSv4.1 Callback Operation . . . . . . . . . . . . . . . 6
4. Flow Control . . . . . . . . . . . . . . . . . . . . . . . . 6 4. Flow Control . . . . . . . . . . . . . . . . . . . . . . . . 6
4.1. Backward Credits . . . . . . . . . . . . . . . . . . . . 7
4.2. Inline Thresholds . . . . . . . . . . . . . . . . . . . . 7
4.3. Managing Receive Buffers . . . . . . . . . . . . . . . . 7
5. Sending And Receiving Backward Operations . . . . . . . . . . 8 5. Sending And Receiving Backward Operations . . . . . . . . . . 8
5.1. Sending A Backward Direction Call . . . . . . . . . . . . 9 6. In the Absence of Backward Direction Support . . . . . . . . 10
5.2. Sending A Backward Direction Reply . . . . . . . . . . . 9
5.3. Backward Direction Chunks . . . . . . . . . . . . . . . . 9
5.4. Backward Direction Retransmission . . . . . . . . . . . . 10
6. In the Absence of Backward Direction Support . . . . . . . . 11
7. Considerations For Upper Layer Bindings . . . . . . . . . . . 11 7. Considerations For Upper Layer Bindings . . . . . . . . . . . 11
8. Security Considerations . . . . . . . . . . . . . . . . . . . 12 8. Security Considerations . . . . . . . . . . . . . . . . . . . 11
9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 12 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 12
10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 12 10. Normative References . . . . . . . . . . . . . . . . . . . . 12
11. Normative References . . . . . . . . . . . . . . . . . . . . 13 Appendix A. Acknowledgements . . . . . . . . . . . . . . . . . . 12
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 13 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 13
1. Introduction 1. Introduction
The purpose of this document is to enable concurrent operation in The purpose of this document is to enable concurrent operation in
both directions on a single transport connection using RPC-over-RDMA both directions on a single transport connection using RPC-over-RDMA
protocol versions that do not have specific facilities for backward protocol versions that do not have specific facilities for backward
direction operation. direction operation.
Backward direction RPC transactions are necessary for the operation Backward direction RPC transactions are necessary for the operation
of NFSv4.1, and in particular, of pNFS, though any Upper Layer of NFSv4.1, and in particular, of Parallel NFS (pNFS) [RFC5661],
Protocol implementation may make use of them. An Upper Layer Binding though any Upper Layer Protocol implementation may make use of them.
for NFSv4.x callback operation is additionally required (see An Upper Layer Binding for NFSv4.x callback operation is additionally
Section 7), but is not provided in this document. required (see Section 7), but is not provided in this document.
For example, using the approach described herein, RPC transactions For example, using the approach described herein, RPC transactions
can be conveyed in both directions on the same RPC-over-RDMA Version can be conveyed in both directions on the same RPC-over-RDMA Version
One connection without changes to the the XDR description of RPC- One connection without changes to the XDR description of RPC-over-
over-RDMA Version One. This document does not modify the XDR or RDMA Version One. This document does not modify the XDR or protocol
protocol described in [I-D.ietf-nfsv4-rfc5666bis]. Future versions described in [I-D.ietf-nfsv4-rfc5666bis]. Future versions of RPC-
of RPC-over-RDMA may adopt the approach described herein, or may over-RDMA may adopt the approach described herein, or may replace it
replace it with a different approach. with a different approach.
1.1. Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
2. Understanding RPC Direction 2. Understanding RPC Direction
The ONC RPC protocol as described in [RFC5531] is architected as a The ONC RPC protocol as described in [RFC5531] is architected as a
message-passing protocol between one server and one or more clients. message-passing protocol between one server and one or more clients.
ONC RPC transactions are made up of two types of messages. ONC RPC transactions are made up of two types of messages.
A CALL message, or "Call", requests work. A Call is designated by A CALL message, or "Call", requests work. A Call is designated by
the value CALL in the message's msg_type field. An arbitrary unique the value CALL in the message's msg_type field. An arbitrary unique
value is placed in the message's xid field. A host that originates a value is placed in the message's xid field. A host that originates a
skipping to change at page 4, line 18 skipping to change at page 3, line 47
in the other direction. An ONC RPC service endpoint can act as a in the other direction. An ONC RPC service endpoint can act as a
Requester, in which case an ONC RPC client endpoint acts as a Requester, in which case an ONC RPC client endpoint acts as a
Responder. This form of message passing is referred to as "backward Responder. This form of message passing is referred to as "backward
direction" operation. direction" operation.
During backward direction operation, the ONC RPC client is During backward direction operation, the ONC RPC client is
responsible for establishing transport connections, even though ONC responsible for establishing transport connections, even though ONC
RPC Calls come from the ONC RPC server. RPC Calls come from the ONC RPC server.
ONC RPC clients and services are optimized to perform and scale well ONC RPC clients and services are optimized to perform and scale well
while handling traffic in the forward direction, and may not be while handling traffic in the forward direction, and might not be
prepared to handle operation in the backward direction. Not until prepared to handle operation in the backward direction. Not until
NFSv4.1 [RFC5661] has there been a strong need to handle backward NFSv4.1 [RFC5661] has there been a strong need to handle backward
direction operation. direction operation.
2.3. Bi-directional Operation 2.3. Bi-directional Operation
A pair of connected RPC endpoints may choose to use only forward or A pair of connected RPC endpoints may choose to use only forward or
only backward direction operations on a particular transport. Or, only backward direction operations on a particular transport. Or,
these endpoints may send Calls in both directions concurrently on the these endpoints may send Calls in both directions concurrently on the
same transport. same transport.
skipping to change at page 6, line 11 skipping to change at page 5, line 39
In this case, the server does not grant file delegations. This might In this case, the server does not grant file delegations. This might
result in a negative performance effect, but correctness is not result in a negative performance effect, but correctness is not
affected. affected.
3.2. NFSv4.1 Callback Operation 3.2. NFSv4.1 Callback Operation
NFSv4.1 supports file delegation in a similar fashion to NFSv4.0, and NFSv4.1 supports file delegation in a similar fashion to NFSv4.0, and
extends the callback mechanism to manage pNFS layouts, as discussed extends the callback mechanism to manage pNFS layouts, as discussed
in Section 12 of [RFC5661]. in Section 12 of [RFC5661].
To facilitate operation through NAT routers, all NFSv4.1 transport NFSv4.1 transport connections are initiated by NFSv4.1 clients.
connections are initiated by NFSv4.1 clients. Therefore NFSv4.1 Therefore NFSv4.1 servers send callbacks to clients in the backward
servers send callbacks to clients in the backward direction on direction on connections established by NFSv4.1 clients.
connections established by NFSv4.1 clients.
NFSv4.1 clients and servers indicate to their peers that a NFSv4.1 clients and servers indicate to their peers that a
backchannel capability is available on a given transport in the backchannel capability is available on a given transport in the
arguments and results of NFS CREATE_SESSION or BIND_CONN_TO_SESSION arguments and results of NFS CREATE_SESSION or BIND_CONN_TO_SESSION
operations. operations.
NFSv4.1 clients may establish distinct transport connections for NFSv4.1 clients may establish distinct transport connections for
forechannel and backchannel operation, or they may combine forechannel and backchannel operation, or they may combine
forechannel and backchannel operation on one transport connection forechannel and backchannel operation on one transport connection
using bi-directional operation. using bi-directional operation.
skipping to change at page 7, line 35 skipping to change at page 7, line 15
During bi-directional operation, each receiver has to decide whether During bi-directional operation, each receiver has to decide whether
an incoming message contains a credit request (the receiver is acting an incoming message contains a credit request (the receiver is acting
as a responder) or a credit grant (the receiver is acting as a as a responder) or a credit grant (the receiver is acting as a
requester) and apply the credit value accordingly. requester) and apply the credit value accordingly.
When message direction is not fully determined by context (e.g., When message direction is not fully determined by context (e.g.,
suggested by the definition of the RPC-over-RDMA version that is in suggested by the definition of the RPC-over-RDMA version that is in
use) or by an accompanying RPC message payload with a call direction use) or by an accompanying RPC message payload with a call direction
field, it is not possible for the receiver to tell with certainty field, it is not possible for the receiver to tell with certainty
whether the header credit value is a request or grant. In such whether the header credit value is a request or grant. In such
cases, the receiver MUST NOT use the header's credit value. cases, the receiver MUST ignore the header's credit value.
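The credit handling rule above can be sketched in C as follows. This is a minimal illustration, not text from either draft revision: the names msg_direction, credit_state, and apply_credit are hypothetical, and the sketch assumes the receiver has already classified the message direction (from context or from the RPC payload's call direction field) before looking at the header's credit value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical classification of an incoming RPC-over-RDMA message. */
enum msg_direction {
    DIR_UNKNOWN, /* direction not determined by context or payload */
    DIR_CALL,    /* payload is a Call: receiver is acting as responder */
    DIR_REPLY    /* payload is a Reply: receiver is acting as requester */
};

/* Hypothetical per-connection credit bookkeeping. */
struct credit_state {
    uint32_t peer_requested; /* credits the peer requested from us */
    uint32_t peer_granted;   /* credits the peer granted to us */
};

/*
 * Apply the header credit value according to message direction.
 * Returns true if the value was used, false if it was ignored
 * because the direction could not be determined.
 */
bool apply_credit(struct credit_state *cs, enum msg_direction dir,
                  uint32_t rdma_credit)
{
    switch (dir) {
    case DIR_CALL:
        /* Receiver acts as responder: the value is a credit request. */
        cs->peer_requested = rdma_credit;
        return true;
    case DIR_REPLY:
        /* Receiver acts as requester: the value is a credit grant. */
        cs->peer_granted = rdma_credit;
        return true;
    default:
        /* Direction unknown: leave the credit state untouched. */
        return false;
    }
}
```

Keeping the request and grant values in separate fields means a credit value that arrives with an undetermined direction cannot disturb either one.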
4.2. Inline Thresholds 4.2. Inline Thresholds
Forward and backward operation on the same connection share the same Forward and backward operation on the same connection share the same
receive buffers. Therefore the inline threshold values for the receive buffers. Therefore the inline threshold values for the
forward direction and the backward direction are the same. The call forward direction and the backward direction are the same. The call
inline threshold for the backward direction is the same as the reply inline threshold for the backward direction is the same as the reply
inline threshold for the forward direction, and vice versa. For more inline threshold for the forward direction, and vice versa. For more
information, see Section 4.3.2 of [I-D.ietf-nfsv4-rfc5666bis]. information, see Section 4.3.2 of [I-D.ietf-nfsv4-rfc5666bis].
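The relationship between forward and backward inline thresholds might be expressed as in the following C sketch. The inline_thresholds structure and the backward_thresholds function are hypothetical names introduced only for illustration; neither draft revision defines them.

```c
#include <stdint.h>

/* Hypothetical container for one direction's inline thresholds, in bytes. */
struct inline_thresholds {
    uint32_t call_max;  /* largest Call that can be sent inline */
    uint32_t reply_max; /* largest Reply that can be sent inline */
};

/*
 * Derive backward direction thresholds from forward direction ones.
 * Both directions share the same receive buffers, so a backward Call
 * is bounded by the forward reply threshold, and vice versa.
 */
struct inline_thresholds backward_thresholds(struct inline_thresholds forward)
{
    struct inline_thresholds backward = {
        .call_max = forward.reply_max,
        .reply_max = forward.call_max
    };
    return backward;
}
```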
skipping to change at page 8, line 35 skipping to change at page 8, line 13
buffers to handle backward direction Calls. buffers to handle backward direction Calls.
4.3.2. Server Receive Buffers 4.3.2. Server Receive Buffers
A forward direction RPC-over-RDMA service endpoint posts as many A forward direction RPC-over-RDMA service endpoint posts as many
receive buffers as it expects incoming forward direction Calls. That receive buffers as it expects incoming forward direction Calls. That
is, it posts no fewer buffers than the number of credits granted in is, it posts no fewer buffers than the number of credits granted in
the rdma_credit field of forward direction RPC replies. the rdma_credit field of forward direction RPC replies.
To receive incoming backward direction replies, an RPC-over-RDMA To receive incoming backward direction replies, an RPC-over-RDMA
server endpoint must pre-post a receive buffer for each backward server endpoint must pre-post enough additional receive buffers to
direction Call it sends. handle replies for each backward direction Call it sends.
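The server-side receive buffer accounting described above can be sketched in C. This is a simplified illustration under stated assumptions: each outstanding backward direction Call is assumed to need exactly one receive buffer for its Reply, and the structure and function names are hypothetical rather than taken from either draft revision.

```c
#include <stdint.h>

/* Hypothetical bookkeeping for a server endpoint's receive buffers. */
struct server_rx_accounting {
    uint32_t forward_credits_granted;    /* credits granted in forward replies */
    uint32_t backward_calls_outstanding; /* backward Calls awaiting a Reply */
};

/*
 * Minimum number of posted receive buffers: one per granted forward
 * credit, plus one per outstanding backward direction Call.
 */
uint32_t min_receive_buffers(const struct server_rx_accounting *a)
{
    return a->forward_credits_granted + a->backward_calls_outstanding;
}

/*
 * Number of additional buffers to pre-post before sending one more
 * backward direction Call, given how many are currently posted.
 */
uint32_t buffers_to_post_before_backward_call(
        const struct server_rx_accounting *a, uint32_t posted)
{
    uint32_t needed = min_receive_buffers(a) + 1;
    return needed > posted ? needed - posted : 0;
}
```

For example, a server that has granted 32 forward credits and has 2 backward Calls outstanding needs at least 34 posted buffers, and would post one more before sending a third backward Call.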
When the existing transport connection is lost, all active receive When the existing transport connection is lost, all active receive
buffers are flushed and are no longer available to receive incoming buffers are flushed and are no longer available to receive incoming
messages. When a fresh transport connection is established, a server messages. When a fresh transport connection is established, a server
endpoint must re-post a receive buffer to handle the Reply for each endpoint must re-post a receive buffer to handle the Reply for each
retransmitted backward direction Call, and a full set of receive retransmitted backward direction Call, and a full set of receive
buffers for receiving forward direction Calls. buffers for receiving forward direction Calls.
5. Sending And Receiving Backward Operations 5. Sending And Receiving Backward Operations
skipping to change at page 11, line 14 skipping to change at page 10, line 40
Forward direction Requesters are responsible for maintaining a Forward direction Requesters are responsible for maintaining a
transport connection as long as there is the possibility of backward transport connection as long as there is the possibility of backward
direction requests. For example, an NFSv4.1 client with open direction requests. For example, an NFSv4.1 client with open
delegated files or active pNFS layouts should maintain a transport delegated files or active pNFS layouts should maintain a transport
connection so the server can send callback operations. connection so the server can send callback operations.
6. In the Absence of Backward Direction Support 6. In the Absence of Backward Direction Support
An RPC-over-RDMA transport endpoint might not support backward An RPC-over-RDMA transport endpoint might not support backward
direction operation (and thus bi-directional operation). There might direction operation (and thus it does not support bi-directional
be no mechanism in the transport implementation to do so. Or in an operation). There might be no mechanism in the transport
implementation that can support operation in the backward direction, implementation to do so. Or in an implementation that can support
the Upper Layer Protocol consumer might not yet have configured or operation in the backward direction, the Upper Layer Protocol
enabled the transport to handle backward direction traffic. consumer might not yet have configured or enabled the transport to
handle backward direction traffic.
If an endpoint is not prepared to receive an incoming backward If an endpoint is not prepared to receive an incoming backward
direction message, loss of the RDMA connection might result. Thus direction message, loss of the RDMA connection might result. Thus
denial of service could result if a sender continues to send backward denial of service could result if a sender continues to send backward
direction messages after every transport reconnect to an endpoint direction messages after every transport reconnect to an endpoint
that is not prepared to receive them. that is not prepared to receive them.
When dealing with the possibility that the remote peer has no When dealing with the possibility that the remote peer has no
transport level support for backward direction operation, the Upper transport level support for backward direction operation, the Upper
Layer Protocol becomes responsible for informing peers when backward Layer Protocol becomes responsible for informing peers when backward
skipping to change at page 12, line 15 skipping to change at page 11, line 41
Backward-only operation requires the client endpoint to establish a Backward-only operation requires the client endpoint to establish a
fresh connection. The Upper Layer Binding can specify appropriate fresh connection. The Upper Layer Binding can specify appropriate
RPC binding parameters for such connections. RPC binding parameters for such connections.
Bi-directional operation occurs on an already-established connection. Bi-directional operation occurs on an already-established connection.
Specification of RPC binding parameters is usually not necessary in Specification of RPC binding parameters is usually not necessary in
this case. this case.
For bi-directional operation, other considerations about sharing an For bi-directional operation, other considerations about sharing an
RPC-over-RDMA transport with another ULP may apply. Consult RPC-over-RDMA transport with another ULP may apply. Consult
Section 7 of [I-D.ietf-nfsv4-rfc5666bis] for details about what else Section 6 of [I-D.ietf-nfsv4-rfc5666bis] for details about what else
may be contained in an Upper Layer Binding. may be contained in an Upper Layer Binding.
8. Security Considerations 8. Security Considerations
Security considerations for operation on RPC-over-RDMA transports are Security considerations for operation on RPC-over-RDMA transports are
outlined in Section 9 of [I-D.ietf-nfsv4-rfc5666bis]. outlined in Section 9 of [I-D.ietf-nfsv4-rfc5666bis].
9. IANA Considerations 9. IANA Considerations
This document does not require actions by IANA. This document does not require actions by IANA.
10. Acknowledgements 10. Normative References
[I-D.ietf-nfsv4-rfc5666bis]
Lever, C., Simpson, W., and T. Talpey, "Remote Direct
Memory Access Transport for Remote Procedure Call, Version
One", draft-ietf-nfsv4-rfc5666bis-09 (work in progress),
January 2017.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC5531] Thurlow, R., "RPC: Remote Procedure Call Protocol
Specification Version 2", RFC 5531, May 2009.
[RFC5661] Shepler, S., Eisler, M., and D. Noveck, "Network File
System (NFS) Version 4 Minor Version 1 Protocol",
RFC 5661, January 2010.
[RFC7530] Haynes, T. and D. Noveck, "Network File System (NFS)
Version 4 Protocol", RFC 7530, March 2015.
Appendix A. Acknowledgements
Tom Talpey was an indispensable resource, in addition to creating the Tom Talpey was an indispensable resource, in addition to creating the
foundation upon which this work is based. Our warmest regards go to foundation upon which this work is based. Our warmest regards go to
him for his help and support. him for his help and support.
Dave Noveck provided excellent review, constructive suggestions, and Dave Noveck provided excellent review, constructive suggestions, and
navigational guidance throughout the process of drafting this navigational guidance throughout the process of drafting this
document. document.
Dai Ngo was a solid partner and collaborator. Together we Dai Ngo was a solid partner and collaborator. Together we
constructed and tested independent prototypes of the changes constructed and tested independent prototypes of the changes
described in this document. described in this document.
The author wishes to thank Bill Baker for his unwavering support of The author wishes to thank Bill Baker for his unwavering support of
this work. In addition, the author gratefully acknowledges the this work. In addition, the author gratefully acknowledges the
expert contributions of Karen Deitke, Chunli Zhang, Mahesh expert contributions of Karen Deitke, Chunli Zhang, Mahesh
Siddheshwar, Steve Wise, and Tom Tucker. Siddheshwar, Steve Wise, and Tom Tucker.
Special thanks go to the nfsv4 Working Group Chair Spencer Shepler Special thanks go to Transport Area Director Spencer Dawkins, nfsv4
and the nfsv4 Working Group Secretary Tom Haynes for their support. Working Group and document shepherd Chair Spencer Shepler, and nfsv4
Working Group Secretary Tom Haynes for their support.
11. Normative References
[I-D.ietf-nfsv4-rfc5666bis]
Lever, C., Simpson, W., and T. Talpey, "Remote Direct
Memory Access Transport for Remote Procedure Call", draft-
ietf-nfsv4-rfc5666bis-04 (work in progress), March 2016.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC5531] Thurlow, R., "RPC: Remote Procedure Call Protocol
Specification Version 2", RFC 5531, May 2009.
[RFC5661] Shepler, S., Eisler, M., and D. Noveck, "Network File
System (NFS) Version 4 Minor Version 1 Protocol", RFC
5661, January 2010.
[RFC7530] Haynes, T. and D. Noveck, "Network File System (NFS)
Version 4 Protocol", RFC 7530, March 2015.
Author's Address Author's Address
Charles Lever Charles Lever
Oracle Corporation Oracle Corporation
1015 Granger Avenue 1015 Granger Avenue
Ann Arbor, MI 48104 Ann Arbor, MI 48104
USA USA
Phone: +1 734 274 2396 Phone: +1 248 816 6463
Email: chuck.lever@oracle.com Email: chuck.lever@oracle.com
 End of changes. 21 change blocks. 
78 lines changed or deleted 66 lines changed or added
