Network File System Version 4                                   C. Lever
Internet-Draft                                                    Oracle
Intended status: Standards Track                           April 8,                             May 2, 2016
Expires: October 10, November 3, 2016

   Size-Limited

    Bi-directional Remote Procedure Call On Remote Direct
                        Memory Access RPC-over-RDMA Transports
                draft-ietf-nfsv4-rpcrdma-bidirection-02
                draft-ietf-nfsv4-rpcrdma-bidirection-03

Abstract

   Recent minor versions of NFSv4 work best when ONC RPC transports can
   send ONC RPC Remote Procedure Call transactions in both directions. directions on the
   same connection.  This document describes conventions that enable how RPC-over-RDMA transport
   endpoints
   to interoperate when operation convey RPCs in both directions is necessary. on a single connection.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on October 10, November 3, 2016.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
     1.1.  Requirements Language . . . . . . . . . . . . . . . . . .   3
   2.  Understanding RPC Direction . . . . . . . . . . . . . . .   2
     1.2.  Rationale For RPC-over-RDMA Bi-Direction . .   3
     2.1.  Forward Direction . . . . . . . . . . . . . . . . . . . .   3
     2.2.  Backward Direction  . . . . . . . . . . . . . . . . . . .   4
     1.3.  Design Considerations
     2.3.  Bi-directional Operation  . . . . . . . . . . . . . . . .   4
     2.4.  XID Values  . . .   6
     1.4.  Requirements Language . . . . . . . . . . . . . . . . . .   8
   2.  Conventions . .   4
   3.  Rationale For Backward Operation Bi-Directional RPC-over-RDMA  . . . . . . . . .   5
     3.1.  NFSv4.0 Callback Operation  . . . .   8
     2.1.  Flow Control . . . . . . . . . . .   5
     3.2.  NFSv4.1 Callback Operation  . . . . . . . . . . .   8
     2.2.  Managing Receive Buffers . . . .   6
   4.  Flow Control  . . . . . . . . . . . .   9
     2.3.  Backward Direction Retransmission . . . . . . . . . . . .  11
     2.4.   6
     4.1.  Backward Direction Message Size Credits  . . . . . . . . . . . . .  11
     2.5.  Sending A Backward Direction Call . . . . . . .   7
     4.2.  Managing Receive Buffers  . . . . .  12
     2.6.  Sending A Backward Direction Reply . . . . . . . . . . .  12
   3.   7
   5.  Protocol For Backward Direction Upper Layer Binding Operation . . . . . . . . . . .  13
   4.  Limits To This Approach . . . .   8
     5.1.  Sending A Backward Direction Call . . . . . . . . . . . .   8
     5.2.  Sending A Backward Direction Reply  . . . .  13
     4.1.  Payload Size . . . . . . .   9
     5.3.  Backward Direction Chunks . . . . . . . . . . . . . . .  13
     4.2.  Preparedness To Handle Backward Requests .   9
     5.4.  Backward Direction Retransmission . . . . . . .  13
     4.3.  Long Term . . . . .  10
   6.  In the Absence of Backward Direction Support  . . . . . . . .  10
   7.  Backward Direction Upper Layer Binding  . . . . . . . . . . .  14
   5.  11
   8.  Security Considerations . . . . . . . . . . . . . . . . . . .  14
   6.  11
   9.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .  14
   7.  11
   10. Acknowledgements  . . . . . . . . . . . . . . . . . . . . . .  14
   8.  11
   11. Normative References  . . . . . . . . . . . . . . . . . . . .  15  12
   Author's Address  . . . . . . . . . . . . . . . . . . . . . . . .  15  12

1.  Introduction

   The purpose of this document is to enable bi-directional RPC
   transactions
   operation on RPC-over-RDMA transports protocol versions that do not already support have
   specific protocol facilities for backward direction operation.  The conventions described herein can
   be used with the RPC-over-RDMA Version One protocol without changes.
   Therefore this document does not update [I-D.ietf-nfsv4-rfc5666bis].
   Backward direction RPC transactions enable the operation of NFSv4.1,
   and in particular pNFS, on RPC-over-RDMA.  Providing an Upper Layer
   Binding pNFS.

   For example, using the protocol described in this document, RPC
   transactions can be conveyed in both directions on the same RPC-over-
   RDMA Version One connection without changes to the Version One header
   XDR description.  Therefore this document does not update
   [I-D.ietf-nfsv4-rfc5666bis].

   Providing an Upper Layer Binding for NFSv4.x callback operations is
   outside the scope of this document.

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.  Understanding RPC Direction

   The ONC RPC protocol as described in [RFC5531] is fundamentally a
   message-passing protocol between one server and one or more clients.
   ONC RPC transactions are made up of two types of messages.

   A CALL message, or "Call", requests work.  A Call is designated by
   the value CALL in the message's msg_type field.  An arbitrary unique
   value is placed in the message's xid field.  A host that originates a
   Call is referred to in this document as a "Requester."

   A REPLY message, or "Reply", reports the results of work requested by
   a Call.  A Reply is designated by the value REPLY in the message's
   msg_type field.  The value contained in the message's xid field is
   copied from the Call whose results are being returned.  A host that
   emits a Reply is referred to as a "Responder."

   Typically, a Call generates a corresponding Reply.  A Reply is never
   sent without a corresponding Call.

   RPC-over-RDMA is a connection-oriented RPC transport.  When a
   connection-oriented transport is used, ONC RPC client endpoints are
   responsible for initiating transport connections, while ONC RPC
   service endpoints wait passively for incoming connection requests.

   RPC direction on connectionless RPC transports is not considered in
   this document.

1.1.1.

2.1.  Forward Direction

   A traditional ONC RPC client is always a Requester.  A traditional
   ONC RPC service is always a Responder.  This traditional form of ONC
   RPC message passing is referred to as operation in the "forward
   direction."

   During forward direction operation, the ONC RPC client is responsible
   for establishing transport connections.

1.1.2.

2.2.  Backward Direction

   The ONC RPC specification [RFC5531] does not forbid passing messages
   in the other direction.  An ONC RPC service endpoint can act as a
   Requester, in which case an ONC RPC client endpoint acts as a
   Responder.  This form of message passing is referred to as operation
   in the "backward direction."

   During backward direction operation, the ONC RPC client is
   responsible for establishing transport connections, even though ONC
   RPC Calls come from the ONC RPC server.

   ONC RPC clients and services are optimized to perform and scale well
   while handling traffic in the forward direction, and may not be
   prepared to handle operation in the backward direction.  Not until
   recently has there been a need to handle backward direction
   operation.

1.1.3.

2.3.  Bi-directional Operation

   A pair of connected RPC endpoints may choose to use only forward or
   only backward direction operations on a particular transport.  Or,
   these endpoints may send Calls in both directions concurrently on the
   same transport.

   "Bi-directional operation" occurs when both transport endpoints act
   as a Requester and a Responder at the same time.  As above, the ONC
   RPC client is always responsible for establishing transport
   connections.

1.1.4.

2.4.  XID Values

   Section 9 of [RFC5531] introduces the ONC RPC transaction identifier,
   or "xid" for short.  The value of an xid is interpreted in the
   context of the message's msg_type field.

   o  The xid of a Call is arbitrary but is unique among outstanding
      Calls from that Requester.

   o  The xid of a Reply always matches that of the initiating Call.

   When receiving a Reply, a Requester matches the xid value in the
   Reply with a Call it previously sent.

1.1.4.1.  XIDs with Bi-direction

2.4.1.  XID Generation

   During bi-directional operation, the forward and backward directions
   use independent xid spaces.

   In other words, direction XIDs
   are typically generated on distinct hosts by possibly different
   algorithms.  There is no co-ordination between forward and backward
   direction XID generation.

   Therefore, a forward direction Requester MAY use the same xid value
   at the same time as a backward direction Requester on the same
   transport connection.  Though such concurrent requests use the same
   xid value, they represent distinct ONC RPC transactions.

1.2.

3.  Rationale For Bi-Directional RPC-over-RDMA Bi-Direction

1.2.1.

3.1.  NFSv4.0 Callback Operation

   An NFSv4.0 client employs a traditional ONC RPC client to send NFS
   requests to an NFSv4.0 server's traditional ONC RPC service
   [RFC7530].  NFSv4.0 requests flow in the forward direction on a
   connection established by the client.  This connection is referred to
   as a "forechannel" connection.

   An NFSv4 "delegation" is simply a promise made by a server that it
   will notify a client when before another client requests agent is allowed access to a
   file.  With this guarantee, that client can operate as sole accessor
   of this
   file, and the file.  In particular, it can manage the file's data and
   metadata caches aggressively.

   To manage administer file delegation, delegations, NFSv4.0 introduces the use of
   callback operations, or "callbacks", in Section 10.2 of [RFC7530].
   An NFSv4.0 server sets up a traditional ONC RPC client, and an
   NFSv4.0 client sets up a traditional ONC RPC service.  Callbacks flow
   in the forward direction on a connection established between the
   server's callback client, and the client's callback server.  This
   connection is distinct from connections being used as forechannels,
   and is referred to as a "backchannel connection."

   When an RDMA transport is used as a forechannel, an NFSv4.0 client
   typically provides a TCP callback service.  The client's SETCLIENTID
   operation advertises the callback service endpoint with a "tcp" or
   "tcp6" netid.  The server then connects to this service using a TCP
   socket.

   NFSv4.0 implementations are fully functional without a backchannel in
   place.  In this case, the server does not grant file delegations.
   This might result in a negative performance effect, but functional
   correctness is unaffected.

1.2.2.

3.2.  NFSv4.1 Callback Operation

   NFSv4.1 supports file delegation in a similar fashion to NFSv4.0, and
   extends the repertoire of callbacks callback mechanism to manage pNFS layouts, as discussed
   in Chapter Section 12 of [RFC5661].

   For various reasons, NFSv4.1 requires that

   To facilitate operation through NAT routers, all NFSv4.1 transport
   connections
   be are initiated by NFSv4.1 clients.  Therefore,  Therefore NFSv4.1
   servers send callbacks to clients in the backward direction on
   connections established by NFSv4.1 clients.

   NFSv4.1 clients and servers indicate to their peers that a
   backchannel capability is available on a given transport in the
   arguments and results of a NFS CREATE_SESSION or BIND_CONN_TO_SESSION
   operation.
   operations.

   NFSv4.1 clients may establish distinct transport connections for
   forechannel and backchannel operation, or they may combine
   forechannel and backchannel operation on one transport connection
   using bi-directional operation.

   Without a backward direction RPC-over-RDMA capability, an NFSv4.1
   client must additionally connect using a transport with backward
   direction capability to use as a backchannel.  TCP is the only choice
   at present
   for an NFSv4.1 backchannel connection. connection in this case.

   Some implementations find it more convenient to use a single combined
   transport (ie. a transport that is capable of bi-directional
   operation).  This simplifies connection establishment, establishment and recovery
   during network partitions or when one endpoint restarts.

   As with NFSv4.0, if a backchannel is not in use, an NFSv4.1 server
   does not grant delegations.  But because of its reliance on callbacks
   to manage pNFS layout state, pNFS operation is not possible without a
   backchannel.

1.3.  Design Considerations

   As of this writing, the only use case for backward direction ONC RPC
   messages is

4.  Flow Control

   For an RDMA Send operation to work, the NFSv4.1 backchannel.  The conventions described receiving peer must have
   posted an RDMA Receive Work Request (WR) to provide a receive buffer
   in
   this document take advantage of certain characteristics of NFSv4.1
   callbacks, namely:

   o  NFSv4.1 callbacks typically bear small arguments and results

   o  NFSv4.1 callback arguments and results are insensitive which to
      alignment relative land the incoming message.  If a receiver hasn't posted
   enough Receive WRs to system pages

   o  NFSv4.1 callbacks are infrequent relative land incoming Send operations, the RDMA
   provider is allowed to forechannel
      operations

1.3.1.  Backward Compatibility

   Existing clients that implement RPC-over-RDMA Version One should
   interoperate correctly with servers that implement RPC-over-RDMA with
   backward direction support, and vice versa.

   The approach taken here avoids altering drop the RDMA connection.

   RPC-over-RDMA XDR
   specification.  Keeping transport protocols provide built-in send flow control
   to prevent overrunning the XDR number of pre-posted receive buffers on a
   connection's receive endpoint.  This is fully discussed in
   Section 4.3 of [I-D.ietf-nfsv4-rfc5666bis].

4.1.  Backward Credits

   Credits work the same enables existing RPC-over-
   RDMA Version One implementations to interoperate with implementations
   that support operation way in the backward direction.

1.3.2.  Performance Impact

   Support for operation direction as they do in the
   forward direction.  However, forward direction credits and backward
   direction should never impact credits are accounted separately.

   In other words, the performance or scalability of forward direction operation, where
   the bulk of ONC RPC transport activity typically occurs.

1.3.3.  Server Memory Security

   RDMA transfers involve one endpoint exposing a section of its memory
   to credit value is the other endpoint, which then drives RDMA Read and Write
   operations to access same
   whether or modify the exposed memory.  RPC-over-RDMA
   client endpoints expose their memory, and RPC-over-RDMA server
   endpoints initiate RDMA data transfer operations.

   If RDMA transfers are not used for backward direction operations, there is no need for servers to expose their memory to clients.

   Further, this avoids the client complexity required to drive RDMA
   transfers.

1.3.4.  Payload Size

   Small RPC-over-RDMA messages are conveyed using only RDMA Send
   operations.  Send is used to transmit both ONC RPC Calls and replies.

   To send a large payload, backward direction resources associated with
   an RPC-over-RDMA client endpoint registers a
   region of memory (known as transport connection.  The backward direction credit
   value MAY be different than the forward direction credit value.  The
   rdma_credit field in a "chunk") and transmits its coordinates
   to an backward direction RPC-over-RDMA server endpoint, who uses an RDMA transfer to
   move data to or from message MUST
   NOT contain the client.  See Section 4.4 of
   [I-D.ietf-nfsv4-rfc5666bis].

   To transmit value zero.

   A backward direction Requester (ie, an RPC-over-RDMA messages larger than service
   endpoint) requests credits from the receive buffer
   size (1024 bytes on Responder (ie, an RPC-over-RDMA Version One transport), a chunk
   must be used.  For example, in an RDMA_NOMSG type message, the entire
   RPC header and Upper Layer payload are contained in one or more
   chunks.  See Section 4.5 of [I-D.ietf-nfsv4-rfc5666bis].  for further
   details.

   If chunks are not allowed to be used for conveying backward direction
   messages, an RDMA_NOMSG type message cannot be used to convey a
   backward direction message using the conventions described in this
   document.  Therefore, backward direction messages sent using the
   conventions in this document can be no larger than a single receive
   buffer.

   Stipulating such a limit on backward direction message size assumes
   that either Upper Layer Protocol consumers of backward direction
   messages can advertise this limit to peers, or that ULP consumers can
   agree by convention on a maximum size of their backchannel payloads.

   In addition, using only inline forms of RPC-over-RDMA messages and
   never populating the RPC-over-RDMA chunk lists means that the RPC
   header's msg_type field is always at a fixed location in messages
   flowing in the backward direction, allowing efficient detection of
   the direction of an RPC-over-RDMA message.

   With few exceptions, NFSv4.1 servers can break down callback requests
   so they fit within this limit.  There are potentially large NFSv4.1
   callback operations, such as a CB_GETATTR operation where a large ACL
   must be conveyed.  Although we are not aware of any NFSv4.1
   implementation that uses CB_GETATTR, this state of affairs is not
   guaranteed in perpetuity.

1.4.  Requirements Language
   client endpoint).  The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.  Conventions For Backward Operation

   Performing backward direction ONC RPC operations over an RPC-over-
   RDMA transport can be accomplished within limits by observing the
   conventions described in the following subsections.  For reference,
   the XDR description of RPC-over-RDMA Version One is contained in
   Section 5.1 of [I-D.ietf-nfsv4-rfc5666bis].

2.1.  Flow Control

   For an RDMA Send operation to work, the receiving peer must have
   posted an RDMA Receive Work Request (WR) to provide a receive buffer
   in which to capture the incoming message.  If a receiver hasn't
   posted enough Receive WRs to catch incoming Send operations, the RDMA
   provider is allowed to drop the RDMA connection.

   RPC-over-RDMA protocols provide built-in send flow control to prevent
   overrunning the number of pre-posted receive buffers on a
   connection's receive endpoint. Responder reports how many credits it has
   granted.  This is fully discussed in
   Section 4.3 of [I-D.ietf-nfsv4-rfc5666bis].

2.1.1.  Forward Credits

   An RPC-over-RDMA credit is the capability to handle one RPC-over-RDMA
   transaction.  Each forward direction RPC-over-RDMA Call requests a number of credits from the Responder.  Each forward backward direction Reply
   informs the Requester how many credits Calls the
   Responder is prepared to handle in total.  The value of the request and grant are carried in
   each RPC-over-RDMA message's rdma_credit field.

   Practically speaking, the critical value is the value of the
   rdma_credit field in RPC-over-RDMA replies.  When a Requester is
   operating correctly, it sends no more outstanding requests at a time
   than the Responder's advertised forward direction credit value.

   The credit value is a guaranteed minimum.  However, a receiver can
   post more receive buffers than its credit value.  There is no
   requirement in the RPC-over-RDMA protocol for a receiver to indicate
   a credit overrun.  Operation continues as long as there are enough
   receive buffers to handle incoming messages.

2.1.2.  Backward Credits

   Credits work the same way in the backward direction as they do in the
   forward direction.  However, forward direction credits and backward
   direction credits are accounted separately.

   In other words, the forward once.

   When message direction credit value is the same
   whether or not there are backward direction resources associated with fully determined by context or by an RPC-over-RDMA transport connection.  The backward direction credit
   value MAY be different than the forward direction credit value.  The
   rdma_credit field in a backward direction RPC-over-RDMA
   accompanying RPC message MUST
   NOT contain the value zero.

   A backward direction Requester (ie, an RPC-over-RDMA service
   endpoint) requests credits from the Responder (ie, an RPC-over-RDMA
   client endpoint).  The Responder reports how many credits it can
   grant.  This is the number of backward with a call direction Calls the Responder field, it is prepared not
   possible to handle at once.

   When an RPC-over-RDMA server endpoint tell whether the header credit value is operating correctly, it
   sends no more outstanding requests at a time than request or
   grant, or whether the client
   endpoint's advertised backward value applies to the forward direction or
   backward direction.  In such cases, the receiver MUST NOT use the
   header's credit value.

2.2.

4.2.  Managing Receive Buffers

   An RPC-over-RDMA transport endpoint must pre-post receive buffers
   before it can receive and process incoming RPC-over-RDMA messages.
   If a sender transmits a message for a receiver which has no prepared
   receive buffer, the RDMA provider is allowed to drop the RDMA
   connection.

2.2.1.

4.2.1.  Client Receive Buffers

   Typically an RPC-over-RDMA Requester posts only as many receive
   buffers as there are outstanding RPC Calls.  A client endpoint
   without backward direction support might therefore at times have no
   pre-posted receive buffers.

   To receive incoming backward direction Calls, an RPC-over-RDMA client
   endpoint must pre-post enough additional receive buffers to match its
   advertised backward direction credit value.  Each outstanding forward
   direction RPC requires an additional receive buffer above this
   minimum.

   When an RDMA transport connection is lost, all active receive buffers
   are flushed and are no longer available to receive incoming messages.
   When a fresh transport connection is established, a client endpoint
   must re-post a receive buffer to handle the Reply for each
   retransmitted forward direction Call, and a full set of receive
   buffers to handle backward direction Calls.

2.2.2.

4.2.2.  Server Receive Buffers

   A forward direction RPC-over-RDMA service endpoint posts as many
   receive buffers as it expects incoming forward direction Calls.  That
   is, it posts no fewer buffers than the number of RPC-over-RDMA credits it advertises granted in
   the rdma_credit field of forward direction RPC replies.

   To receive incoming backward direction replies, an RPC-over-RDMA
   server endpoint must pre-post a receive buffer for each backward
   direction Call it sends.

   When the existing transport connection is lost, all active receive
   buffers are flushed and are no longer available to receive incoming
   messages.  When a fresh transport connection is established, a server
   endpoint must re-post a receive buffer to handle the Reply for each
   retransmitted backward direction Call, and a full set of receive
   buffers for receiving forward direction Calls.

2.2.3.  In the Absense of Backward Direction Support

   An RPC-over-RDMA transport endpoint might not support backward
   direction operation.  There might be no mechanism in the transport
   implementation to do so.  Or the Upper Layer Protocol consumer might
   not yet have configured the transport to handle backward direction
   traffic.

   A loss of the RDMA connection may result if the receiver is not
   prepared to receive an incoming message.  Thus a denial-of-service
   could result if a sender continues to send backchannel messages after
   every transport reconnect to an endpoint that is not prepared to
   receive them.

   Generally, for RPC-over-RDMA Version One transports, the Upper Layer
   Protocol consumer is responsible for informing its peer when it has
   support for the backward direction.  Otherwise even a simple backward
   direction NULL probe from a peer could result in a lost connection.

   An NFSv4.1 server does not send backchannel messages to an NFSv4.1
   client before the NFSv4.1 client has sent a CREATE_SESSION or a
   BIND_CONN_TO_SESSION operation.  As long as an NFSv4.1 client has
   prepared appropriate backchannel resources before sending one of
   these operations, denial-of-service is avoided.  Legacy versions the Reply for each
   retransmitted backward direction Call, and a full set of
   NFS never send backchannel operations.

   Therefore, an Upper Layer receive
   buffers for receiving forward direction Calls.

5.  Protocol consumer MUST NOT perform For Backward Operation

   Performing backward direction ONC RPC operations unless over an RPC-over-
   RDMA transport connection can be accomplished by observing the peer consumer has indicated
   it is prepared to handle them.  A
   protocol described in the following subsections.  For reference, the
   XDR description of Upper Layer Protocol
   mechanisms used for this indication RPC-over-RDMA Version One is outside the scope contained in
   Section 5.1 of this
   document.

2.3. [I-D.ietf-nfsv4-rfc5666bis].

5.1.  Sending A Backward Direction Retransmission

   In rare cases, Call

   To form a backward direction RPC-over-RDMA Call message, an ONC RPC transaction cannot be completed within
   service endpoint constructs an RPC-over-RDMA header containing a
   certain time.  This can be because the transport connection was lost,
   the Call or Reply message was dropped, or because the Upper Layer
   consumer delayed or dropped the ONC
   fresh RPC request.  Typically, the
   Requester sends XID in the transaction again, reusing rdma_xid field (see Section 2.4 for full
   requirements).

   The rdma_vers field MUST contain the same RPC XID.
   This is known as an "RPC retransmission".

   In the value in backward and
   forward direction, the Requester is direction Call messages on the ONC RPC client. same connection.

   The
   client is always responsible for establishing a transport connection
   before sending again.

   In the number of requested backward direction, the Requester direction credits is placed in the ONC RPC server.
   Because an ONC RPC server does not establish transport connections
   with clients, it cannot send a retransmission if there is no
   transport connection.  It must wait for
   rdma_credit field (see Section 4).

   Whether presented inline or as a separate chunk, the ONC RPC client to re-
   establish Call
   header MUST start with the transport connection before it can retransmit ONC RPC
   transactions same XID value that is present in the RPC-
   over-RDMA header, and the header's msg_type field MUST contain the
   value CALL.

5.2.  Sending A Backward Direction Reply

   To form a backward direction.

   If direction RPC-over-RDMA Reply message, an ONC RPC
   client has no work to do, it may be some time before it
   re-establishes a transport connection.  Backward direction Requesters
   must be prepared to wait indefinitely before endpoint constructs an RPC-over-RDMA header containing a connection is
   established before copy
   of the matching ONC RPC Call's RPC XID in the rdma_xid field (see
   Section 2.4 for full requirements).

   The rdma_vers field MUST contain the same value in a pending backward
   direction Reply message as in the matching Call message.

   The number of granted backward direction credits is placed in the
   rdma_credit field (see Section 4).

   Whether presented inline or as a separate chunk, the ONC RPC Call can be
   retransmitted.

2.4. Reply
   header MUST start with the same XID value that is present in the RPC-
   over-RDMA header, and the header's msg_type field MUST contain the
   value REPLY.

5.3.  Backward Direction Message Size

   RPC-over-RDMA Chunks

   Chunks MAY be used in the backward direction messages are transmitted and
   received using direction.  They operate the same buffers
   way as messages in the forward direction.
   Therefore they are constrained to be no larger than receive buffers
   posted direction (see [I-D.ietf-nfsv4-rfc5666bis] for forward messages.
   details).

   An implementation might not support any Upper Layer Protocol that has
   DDP-eligible data items.  The default Receive buffer Upper Layer Protocol may also use only
   small messages, or it may have a native mechanism for restricting the
   size in RPC-
   over-RDMA Version One implementations is 1024 bytes.

   It is expected that of backward direction RPC messages, obviating the need to handle
   Long Messages in the backward direction.

   When there is no Upper Layer Protocol consumer establishes an
   appropriate payload size limit requirement for backward direction operations,
   either by advertising that size limit chunks,
   implementers can choose not to its peers, or by convention.
   If that is done, provide support for chunks in the
   backward direction messages will not exceed direction.  This avoids the size complexity of receive buffers at either endpoint.

   If a sender transmits a adding support for
   performing RDMA Reads and Writes in the backward direction.

   When chunks are not implemented, RPC messages in the backward
   direction message that is are always sent using RDMA_MSG, and therefore can be no
   larger than the receiver is prepared for, the RDMA provider drops the what can be sent inline (that is, without chunks).
   Sending an inline message and larger than the RDMA receiver's inline threshold
   can result in loss of connection.

   If a sender transmits an RDMA message that is too small backward direction requester provides a non-empty chunk list to convey
   a
   complete and valid RPC-over-RDMA and RPC message in either direction,
   the receiver MUST NOT use any value in the fields responder that were
   transmitted.  Namely, does not support chunks, the rdma_credit field responder MUST be ignored, and the reply
   with an RDMA_ERROR message silently discarded.

2.5.  Sending A with rdma_err field set to ERR_CHUNK.

5.4.  Backward Direction Call

   To form a backward direction RPC-over-RDMA Call message, Retransmission

   In rare cases, an ONC RPC
   service endpoint constructs an RPC-over-RDMA header containing transaction cannot be completed within a
   fresh RPC XID in
   certain time.  This can be because the rdma_xid field (see Section 1.1.4 for full
   requirements).

   The number of requested backward direction credits is placed in transport connection was lost,
   the
   rdma_credit field (see Section 2.1).

   The rdma_proc field in Call or Reply message was dropped, or because the RPC-over-RDMA header MUST contain Upper Layer
   consumer delayed or dropped the
   value RDMA_MSG.  All three chunk lists MUST be empty.

   The ONC RPC Call header MUST follow immediately, starting with request.  Typically, the
   Requester sends the transaction again, reusing the same XID value that RPC XID.
   This is present in known as an "RPC retransmission".

   In the RPC-over-RDMA header.  The Call
   header's msg_type field MUST contain forward direction, the value CALL.

2.6.  Sending A Backward Direction Reply

   To form Requester is the ONC RPC client.  The
   client is always responsible for establishing a transport connection
   before sending again.

   In the backward direction RPC-over-RDMA Reply message, an direction, the Requester is the ONC RPC
   client endpoint constructs server.
   Because an RPC-over-RDMA header containing ONC RPC server does not establish transport connections
   with clients, it cannot send a copy
   of retransmission if there is no
   transport connection.  It must wait for the matching ONC RPC Call's client to re-
   establish the transport connection before it can retransmit ONC RPC XID
   transactions in the rdma_xid field (see
   Section 1.1.4 for full requirements).

   The number of granted backward direction.

   If an ONC RPC client has no work to do, it may be some time before it
   re-establishes a transport connection.  Backward direction credits is placed in the
   rdma_credit field (see Section 2.1).

   The rdma_proc field in the RPC-over-RDMA header MUST contain the
   value RDMA_MSG.  All three chunk lists MUST Requesters
   must be empty.

   The prepared to wait indefinitely for a connection to be
   established before a pending backward direction ONC RPC Reply header MUST follow immediately, starting with the
   same XID value that is present in the RPC-over-RDMA header.  The
   Reply header's msg_type field MUST contain Call can be
   retransmitted.

6.  In the value REPLY.

3. Absence of Backward Direction Upper Layer Binding

   RPC programs that operate on Support

   An RPC-over-RDMA transports using the
   conventions described in this document do transport endpoint might not require an Upper Layer
   Binding specification.  Because support backward
   direction operation using
   these conventions cannot transfer data via RMDA Read or Write, there
   can operation.  There might be no RDMA-eligible data items mechanism in the transport
   implementation to do so.  Or the Upper Layer Program on this
   transport.

   In addition, since Protocol consumer might
   not yet have configured the transport to handle backward direction operation occurs on
   traffic.

   If an already-
   established connection, there endpoint is no need to specify RPC bind
   parameters.

4.  Limits To This Approach

4.1.  Payload Size

   The major drawback not prepared to the approach described in this document is the
   limit on payload size in receive an incoming backward
   direction requests.

   o  Some NFSv4.1 callback operations can have potentially large
      arguments or results.  For example, CB_GETATTR on a file with message, loss of the RDMA connection might result.  Thus a
      large ACL; or CB_NOTIFY, which can provide
   denial-of-service could result if a large, complex
      argument.

   o  Any backward direction operation protected by RPCSEC_GSS might
      have additional header information that makes it difficult sender continues to send backward
   direction operations messages after every transport reconnect to an endpoint
   that is not prepared to receive them.

   When dealing with large arguments or results.

   o  Larger payloads could potentially require the use of RDMA data
      transfers, which are complex and make it more difficult to detect possibility that the remote peer has no
   transport level support for backward direction requests.  The msg_type field in operation, the ONC RPC
      header would no longer be at a fixed location in Upper
   Layer Protocol becomes responsible for informing peers when backward
   direction requests.

4.2.  Preparedness To Handle Backward Requests

   A second drawback operation is the exposure of the client transport endpoint to supported.  Otherwise even a simple backward
   direction Calls NULL probe from a peer could result in a lost connection.

   An NFSv4.1 server does not send backchannel messages to an NFSv4.1
   client before it the NFSv4.1 client has posted receive buffers to
   handle them.

   Clients that do not sent a CREATE_SESSION or a
   BIND_CONN_TO_SESSION operation.  As long as an NFSv4.1 client has
   prepared appropriate backchannel resources before sending one of
   these operations announcing support for backchannel operation,
   denial-of-service is avoided.

   Therefore, an Upper Layer Protocol consumer MUST NOT perform backward
   direction operation typically
   drop messages they do not recognize.  However, this does not allow
   bi-direction-capable servers ONC RPC operations unless the peer consumer has indicated
   it is prepared to quickly identify clients that cannot handle backward direction requests.

   The conventions in this document rely on them.  A description of Upper Layer Protocol
   consumers to decide when backward direction transport operation is
   appropriate.

4.3.  Long Term

   To address the limitations described in
   mechanisms used for this section in the long run,
   two approaches are available:

   o  Larger inline thresholds would make indication is outside the transport capable scope of
      conveying larger this
   document.

7.  Backward Direction Upper Layer Binding

   Since backward direction requests

   o  The capability operation occurs on an already-established
   connection, there is no need to move chunks specify RPC bind parameters.

   An Upper Layer Protocol that operates on RPC-over-RDMA transports in
   the backward direction would lift
      the size limit even further by enabling backward direction Long
      Call and Reply messages to be formed

   The latter approach would benefit from changes to the XDR definition
   of the RPC-over-RDMA protocol, and would require significant changes
   to implementations.

   The use of the conventions described may have DDP-eligible data items.  These are
   specified in this document an Upper Layer Binding document.

   By default, no data items in a ULP are DDP-eligible.  If there are no
   DDP-eligible data items to enable
   backward direction operation should document, an explicit Upper Layer Binding
   may not be considered a transitional
   approach needed for an Upper Layer Protocol that is appropriate while operates only in
   the predominantly deployed
   versions backward direction.

   Consult Section 7 of the RPC-over-RDMA protocol do not have native support [I-D.ietf-nfsv4-rfc5666bis] for
   large backward direction messages.

5. details about
   what else may be contained in a binding.

8.  Security Considerations

   When RPCSEC_GSS integrity and confidentiality services (described in
   [I-D.ietf-nfsv4-rpcsec-gssv3])

   Security considerations for operation on RPC-over-RDMA transports are
   outlined in use, additional RPC header
   information is included in each message.  This increases the size of
   each message, further limiting the size Section 9 of backward direction
   operations.

6. [I-D.ietf-nfsv4-rfc5666bis].

9.  IANA Considerations

   This document does not require actions by IANA.

7.

10.  Acknowledgements

   Tom Talpey was an indispensable resource, in addition to creating the
   foundation upon which this work is based.  Our warmest regards go to
   him for his help and support.

   Dave Noveck provided excellent review, constructive suggestions, and
   navigational guidance throughout the process of drafting this
   document.

   Dai Ngo was a solid partner and collaborator.  Together we
   constructed and tested independent prototypes of the conventions changes
   described in this document.

   The author wishes to thank Bill Baker for his unwavering support of
   this work.  In addition, the author gratefully acknowledges the
   expert contributions of Karen Deitke, Chunli Zhang, Mahesh
   Siddheshwar, Steve Wise, and Tom Tucker.

   Special thanks go to the nfsv4 Working Group Chair Spencer Shepler
   and the nfsv4 Working Group Secretary Tom Haynes for their support.

8.

11.  Normative References

   [I-D.ietf-nfsv4-rfc5666bis]
              Lever, C., Simpson, W., and T. Talpey, "Remote Direct
              Memory Access Transport for Remote Procedure Call", draft-
              ietf-nfsv4-rfc5666bis-04 (work in progress), March 2016.

   [I-D.ietf-nfsv4-rpcsec-gssv3]
              Adamson, A. and N. Williams, "Remote Procedure Call (RPC)
              Security Version 3", draft-ietf-nfsv4-rpcsec-gssv3-17
              (work in progress), January 2016.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC5531]  Thurlow, R., "RPC: Remote Procedure Call Protocol
              Specification Version 2", RFC 5531, May 2009.

   [RFC5661]  Shepler, S., Eisler, M., and D. Noveck, "Network File
              System (NFS) Version 4 Minor Version 1 Protocol", RFC
              5661, January 2010.

   [RFC7530]  Haynes, T. and D. Noveck, "Network File System (NFS)
              Version 4 Protocol", RFC 7530, March 2015.

Author's Address

   Charles Lever
   Oracle Corporation
   1015 Granger Avenue
   Ann Arbor, MI  48104
   USA

   Phone: +1 734 274 2396
   Email: chuck.lever@oracle.com