NFSv4                                                         S. Shepler
Internet-Draft                                                 M. Eisler
Intended status: Standards Track                               D. Noveck
Expires: February 26, 2007                                       Editors
                                                         August 25, 2006

                         NFSv4 Minor Version 1
                 draft-ietf-nfsv4-minorversion1-06.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on February 26, 2007.

Copyright Notice

   Copyright (C) The Internet Society (2006).

Abstract

   This Internet-Draft describes NFSv4 minor version one, including
   features retained from the base protocol and protocol extensions made
   subsequently.  The current draft includes a description of the major
   extensions: Sessions, Directory Delegations, and parallel NFS (pNFS).
   This Internet-Draft is an active work item of the NFSv4 working
   group.  Active and resolved issues may be found in the issue tracker
   at: http://www.nfsv4-editor.org/cgi-bin/roundup/nfsv4.  New issues
   related to this document should be raised with the NFSv4 Working
   Group (nfsv4@ietf.org) and logged in the issue tracker.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [1].

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .  10
     1.1.   The NFSv4.1 Protocol . . . . . . . . . . . . . . . . . .  10
     1.2.   NFS Version 4 Goals  . . . . . . . . . . . . . . . . . .  10
     1.3.   Minor Version 1 Goals  . . . . . . . . . . . . . . . . .  11
     1.4.   Inconsistencies of this Document with Section XX . . . .  11
     1.5.   Overview of NFS version 4.1 Features . . . . . . . . . .  11
       1.5.1.  RPC and Security  . . . . . . . . . . . . . . . . . .  12
       1.5.2.  Protocol Structure  . . . . . . . . . . . . . . . . .  12
       1.5.3.  File System Model . . . . . . . . . . . . . . . . . .  14
       1.5.4.  Locking Facilities  . . . . . . . . . . . . . . . . .  15
     1.6.   General Definitions  . . . . . . . . . . . . . . . . . .  16
     1.7.   Differences from NFSv4.0 . . . . . . . . . . . . . . . .  18
   2.  Protocol Data Types  Core Infrastructure . . . . . . . . . . . . . . . . . . . . .  18
     2.1.   Basic Data Types   Introduction . . . . . . . . . . . . . . . . . . . . . .  18
     2.2.   Structured Data Types   RPC and XDR  . . . . . . . . . . . . . . . . .  20
   3.  RPC and Security Flavor . . . . .  18
       2.2.1.  RPC-based Security  . . . . . . . . . . . . . .  29
     3.1.   Ports and Transports . . .  18
     2.3.   Non-RPC-based Security Services  . . . . . . . . . . . .  19
       2.3.1.  Authorization . . .  29
       3.1.1.   Client Retransmission Behavior . . . . . . . . . . .  31
     3.2.   Security Flavors . . . . . .  19
       2.3.2.  Auditing  . . . . . . . . . . . . . .  31
       3.2.1.   Security mechanisms for NFS version 4 . . . . . . .  31
     3.3.   Security Negotiation .  19
       2.3.3.  Intrusion Detection . . . . . . . . . . . . . . . . .  33
       3.3.1.   SECINFO and SECINFO_NO_NAME  19
     2.4.   Transport Layers . . . . . . . . . . . .  33
       3.3.2.   Security Error . . . . . . . .  19
       2.4.1.  Ports . . . . . . . . . . .  33
       3.3.3.   Callback RPC Authentication . . . . . . . . . . . .  34
       3.3.4.   GSS Server Principal .  19
       2.4.2.  Stream Transports . . . . . . . . . . . . . . .  34
   4.  Filehandles . . .  19
       2.4.3.  RDMA Transports . . . . . . . . . . . . . . . . . . .  19
     2.5.   Session  . . . .  35
     4.1.   Obtaining the First Filehandle . . . . . . . . . . . . .  35
       4.1.1.   Root Filehandle . . . . . . .  19
       2.5.1.  Motivation and Overview . . . . . . . . . . .  35
       4.1.2.   Public Filehandle . . . .  19
       2.5.2.  NFSv4 Integration . . . . . . . . . . . . .  35
     4.2.   Filehandle Types . . . . .  19
       2.5.3.  Channels  . . . . . . . . . . . . . . .  36
       4.2.1.   General Properties of a Filehandle . . . . . . .  19
       2.5.4.  Exactly Once Semantics  . .  36
       4.2.2.   Persistent Filehandle . . . . . . . . . . . . .  20
     2.6.   Channel Management . .  37
       4.2.3.   Volatile Filehandle . . . . . . . . . . . . . . . .  37
     4.3.   One Method of Constructing a Volatile Filehandle .  20
       2.6.1.  Buffer Management . . .  39
     4.4.   Client Recovery from Filehandle Expiration . . . . . . .  39
   5.  File Attributes . . . . . . . .  20
       2.6.2.  Data Transfer . . . . . . . . . . . . . . .  40
     5.1.   Mandatory Attributes . . . . .  20
       2.6.3.  Flow Control  . . . . . . . . . . . . .  41
     5.2.   Recommended Attributes . . . . . . .  20
       2.6.4.  COMPOUND Sizing Issues  . . . . . . . . . .  41
     5.3.   Named Attributes . . . . .  20
       2.6.5.  Data Alignment  . . . . . . . . . . . . . . .  42
     5.4.   Classification of Attributes . . . .  20
     2.7.   Sessions Security  . . . . . . . . . .  42
     5.5.   Mandatory Attributes - Definitions . . . . . . . . .  20
       2.7.1.  Denial of Service via Unauthorized State Changes  . .  44
     5.6.   Recommended Attributes  20
     2.8.   Session Mechanics - Definitions Steady State . . . . . . . . . .  45
     5.7.   Time Access . .  20
       2.8.1.  Obligations of the Server . . . . . . . . . . . . . .  20
       2.8.2.  Obligations of the Client . . . . . .  54
     5.8.   Interpreting owner and owner_group . . . . . . . .  20
       2.8.3.  Steps the Client Takes To Establish a Session . . . .  54
     5.9.   Character Case Attributes  20
       2.8.4.  Session Mechanics - Recovery  . . . . . . . . . . . .  20
   3.  RPC and Security Flavor . . . .  56
     5.10.  Quota Attributes . . . . . . . . . . . . . . .  21
     3.1.   Ports and Transports . . . . .  56
     5.11.  mounted_on_fileid . . . . . . . . . . . . .  21
       3.1.1.  Client Retransmission Behavior  . . . . . .  57
     5.12.  send_impl_id and recv_impl_id . . . . .  22
     3.2.   Security Flavors . . . . . . . .  58
     5.13.  fs_layout_type . . . . . . . . . . . .  23
       3.2.1.  Security mechanisms for NFS version 4 . . . . . . . .  23
     3.3.   Security Negotiation .  59
     5.14.  layout_type . . . . . . . . . . . . . . . . .  24
       3.3.1.  SECINFO and SECINFO_NO_NAME . . . . .  59
     5.15.  layout_hint . . . . . . . .  25
       3.3.2.  Security Error  . . . . . . . . . . . . . .  59
     5.16.  mdsthreshold . . . . .  25
       3.3.3.  Callback RPC Authentication . . . . . . . . . . . . .  25
       3.3.4.  GSS Server Principal  . . . .  59
   6.  Access Control Lists . . . . . . . . . . . .  26
   4.  Security Negotiation  . . . . . . . .  60
     6.1.   ACE type . . . . . . . . . . . .  26
   5.  Clarification of Security Negotiation in NFSv4.1  . . . . . .  27
     5.1.   PUTFH + LOOKUP . . . . . .  62
     6.2.   ACE Access Mask . . . . . . . . . . . . . . .  27
     5.2.   PUTFH + LOOKUPP  . . . . .  63
       6.2.1.   ACE4_DELETE vs. ACE4_DELETE_CHILD . . . . . . . . .  67
     6.3.   ACE flag . . . . . .  27
     5.3.   PUTFH + SECINFO  . . . . . . . . . . . . . . . . . .  68
     6.4.   ACE who . .  27
     5.4.   PUTFH + Anything Else  . . . . . . . . . . . . . . . . .  28
   6.  NFSv4.1 Sessions  . . . . .  70
       6.4.1.   Discussion of EVERYONE@ . . . . . . . . . . . . . .  71
       6.4.2.   Discussion of OWNER@ and GROUP@ . . .  28
     6.1.   Sessions Background  . . . . . . .  71
     6.5.   Mode Attribute . . . . . . . . . . .  28
       6.1.1.  Introduction to Sessions  . . . . . . . . . .  71
     6.6.   Interaction Between Mode and ACL Attributes . . . .  28
       6.1.2.  Session Model . .  72
       6.6.1.   Recomputing mode upon SETATTR of ACL . . . . . . . .  73
       6.6.2.   Applying the mode given to CREATE or OPEN to an
                inherited ACL . . . . . . . . . .  29
       6.1.3.  Connection State  . . . . . . . . .  76
       6.6.3.   Applying a Mode to an Existing ACL . . . . . . . . .  77
       6.6.4.   ACL  30
       6.1.4.  NFSv4 Channels, Sessions and mode in the same SETATTR Connections  . . . . . .  31
       6.1.5.  Reconnection, Trunking and Failover . . . . . .  82
       6.6.5.   Inheritance and turning it off . . .  33
       6.1.6.  Server Duplicate Request Cache  . . . . . . . .  83
       6.6.6.   Deficiencies in a Mode Representation of an ACL . .  84
   7.  Single-server Name Space .  33
     6.2.   Session Initialization and Transfer Models . . . . . . .  35
       6.2.1.  Session Negotiation . . . . . . . . . .  85
     7.1.   Server Exports . . . . . . .  35
       6.2.2.  RDMA Requirements . . . . . . . . . . . . . .  85
     7.2.   Browsing Exports . . . .  36
       6.2.3.  RDMA Connection Resources . . . . . . . . . . . . . .  37
       6.2.4.  TCP and RDMA Inline Transfer Model  . .  85
     7.3.   Server Pseudo File System . . . . . . .  37
       6.2.5.  RDMA Direct Transfer Model  . . . . . . . .  86
     7.4.   Multiple Roots . . . . .  40
     6.3.   Connection Models  . . . . . . . . . . . . . . . .  86
     7.5.   Filehandle Volatility . . .  43
       6.3.1.  TCP Connection Model  . . . . . . . . . . . . . .  87
     7.6.   Exported Root . .  44
       6.3.2.  Negotiated RDMA Connection Model  . . . . . . . . . .  45
       6.3.3.  Automatic RDMA Connection Model . . . . . . . . .  87
     7.7.   Mount Point Crossing . .  46
     6.4.   Buffer Management, Transfer, Flow Control  . . . . . . .  46
     6.5.   Retry and Replay . . . . . . . . .  87
     7.8.   Security Policy and Name Space Presentation . . . . . .  88
   8.  File Locking and Share Reservations . . . . .  49
     6.6.   The Back Channel . . . . . . . .  89
     8.1.   Locking . . . . . . . . . . . .  50
     6.7.   COMPOUND Sizing Issues . . . . . . . . . . . .  89
       8.1.1.   Client ID . . . . .  51
     6.8.   Data Alignment . . . . . . . . . . . . . . . .  90
       8.1.2.   Server Release of Clientid . . . . .  51
     6.9.   NFSv4 Integration  . . . . . . . .  93
       8.1.3.   State-owner and Stateid Definition . . . . . . . . .  94
       8.1.4.   Use of the Stateid and Locking . .  53
       6.9.1.  Minor Versioning  . . . . . . . . .  97
     8.2.   Lock Ranges . . . . . . . . .  53
       6.9.2.  Slot Identifiers and Server Duplicate Request Cache .  53
       6.9.3.  Resolving server callback races with sessions . . . .  56
       6.9.4.  COMPOUND and CB_COMPOUND  . . . . . . . .  99
     8.3.   Upgrading and Downgrading Locks . . . . . .  57
     6.10.  Sessions Security Considerations . . . . . .  99
     8.4.   Blocking Locks . . . . . .  59
       6.10.1. Denial of Service via Unauthorized State Changes  . .  59
     6.11.  Session Mechanics - Steady State . . . . . . . . . . . .  63
       6.11.1. Obligations of the Server . 100
     8.5.   Lease Renewal . . . . . . . . . . . . .  63
       6.11.2. Obligations of the Client . . . . . . . . 100
     8.6.   Crash Recovery . . . . . .  63
       6.11.3. Steps the Client Takes To Establish a Session . . . .  64
     6.12.  Session Mechanics - Recovery . . . . . . . . . . . 101
       8.6.1.   Client Failure and Recovery . . .  64
       6.12.1. Events Requiring Client Action  . . . . . . . . . 101
       8.6.2.   Server Failure and Recovery . .  64
       6.12.2. Events Requiring Server Action  . . . . . . . . . . 102
       8.6.3.   Network Partitions and Recovery .  66
   7.  Minor Versioning  . . . . . . . . . 104
     8.7.   Server Revocation of Locks . . . . . . . . . . . . .  66
   8.  Protocol Data Types . . 108
     8.8.   Share Reservations . . . . . . . . . . . . . . . . . . . 109
     8.9.   OPEN/CLOSE Operations  69
     8.1.   Basic Data Types . . . . . . . . . . . . . . . . . 110
     8.10.  Open Upgrade and Downgrade . . .  69
     8.2.   Structured Data Types  . . . . . . . . . . . . 110
     8.11.  Short and Long Leases . . . . .  70
   9.  Filehandles . . . . . . . . . . . . 111
     8.12.  Clocks, Propagation Delay, and Calculating Lease
            Expiration . . . . . . . . . . . . .  80
     9.1.   Obtaining the First Filehandle . . . . . . . . . . 111
     8.13.  Vestigial Locking Infrastructure From V4.0 . . .  80
       9.1.1.  Root Filehandle . . . . 112
   9.  Client-Side Caching . . . . . . . . . . . . . . .  80
       9.1.2.  Public Filehandle . . . . . . 113
     9.1.   Performance Challenges for Client-Side Caching . . . . . 114
     9.2.   Delegation and Callbacks . . . . . . .  80
     9.2.   Filehandle Types . . . . . . . . . 114
       9.2.1.   Delegation Recovery . . . . . . . . . . .  81
       9.2.1.  General Properties of a Filehandle  . . . . . 116
     9.3.   Data Caching . . . .  81
       9.2.2.  Persistent Filehandle . . . . . . . . . . . . . . . .  82
       9.2.3.  Volatile Filehandle . . 118
       9.3.1.   Data Caching and OPENs . . . . . . . . . . . . . . . 118
       9.3.2.   Data Caching and File Locking  82
     9.3.   One Method of Constructing a Volatile Filehandle . . . .  84
     9.4.   Client Recovery from Filehandle Expiration . . . . . . . 119
       9.3.3.   Data Caching and Mandatory  84
   10. File Locking Attributes . . . . . . 121
       9.3.4.   Data Caching and File Identity . . . . . . . . . . . 121
     9.4.   Open Delegation . . . . . .  85
     10.1.  Mandatory Attributes . . . . . . . . . . . . . . 122
       9.4.1.   Open Delegation and Data Caching . . . .  86
     10.2.  Recommended Attributes . . . . . . 125
       9.4.2.   Open Delegation and File Locks . . . . . . . . . . . 126
       9.4.3.   Handling of CB_GETATTR  86
     10.3.  Named Attributes . . . . . . . . . . . . . . . 126
       9.4.4.   Recall of Open Delegation . . . . .  87
     10.4.  Classification of Attributes . . . . . . . . 129
       9.4.5.   Clients that Fail to Honor Delegation Recalls . . . 131
       9.4.6.   Delegation Revocation . . .  87
     10.5.  Mandatory Attributes - Definitions . . . . . . . . . . .  89
     10.6.  Recommended Attributes - Definitions . 132
     9.5.   Data Caching and Revocation . . . . . . . . .  90
     10.7.  Time Access  . . . . . 132
       9.5.1.   Revocation Recovery for Write Open Delegation . . . 133
     9.6.   Attribute Caching . . . . . . . . . . . . . .  99
     10.8.  Interpreting owner and owner_group . . . . . 134
     9.7.   Data and Metadata Caching and Memory Mapped Files . . . 136
     9.8.   Name Caching . . .  99
     10.9.  Character Case Attributes  . . . . . . . . . . . . . . . 101
     10.10. Quota Attributes . . . . 138
     9.9.   Directory Caching . . . . . . . . . . . . . . . . 101
     10.11. mounted_on_fileid  . . . 139
   10. Security Negotiation . . . . . . . . . . . . . . . . 102
     10.12. send_impl_id and recv_impl_id  . . . . 140
   11. Clarification of Security Negotiation in NFSv4.1 . . . . . . 140
     11.1.  PUTFH + LOOKUP . . . 103
     10.13. fs_layout_type . . . . . . . . . . . . . . . . . . 140
     11.2.  PUTFH + LOOKUPP . . . 104
     10.14. layout_type  . . . . . . . . . . . . . . . . . 141
     11.3.  PUTFH + SECINFO . . . . . 104
     10.15. layout_hint  . . . . . . . . . . . . . . . 141
     11.4.  PUTFH + Anything Else . . . . . . . 104
     10.16. mdsthreshold . . . . . . . . . . 141
   12. NFSv4.1 Sessions . . . . . . . . . . . . 104
   11. Access Control Lists  . . . . . . . . . . 142
     12.1.  Sessions Background . . . . . . . . . . 105
     11.1.  Goals  . . . . . . . . 142
       12.1.1.  Introduction to Sessions . . . . . . . . . . . . . . 142
       12.1.2.  Session Model . . . 105
     11.2.  File Attributes Discussion . . . . . . . . . . . . . . . 106
       11.2.1. ACL Attribute . 143
       12.1.3.  Connection State . . . . . . . . . . . . . . . . . . 144
       12.1.4.  NFSv4 Channels, Sessions and Connections . 106
       11.2.2. mode Attribute  . . . . . 145
       12.1.5.  Reconnection, Trunking and Failover . . . . . . . . 146
       12.1.6.  Server Duplicate Request Cache . . . . . . 117
     11.3.  Common Methods . . . . . 147
     12.2.  Session Initialization and Transfer Models . . . . . . . 148
       12.2.1.  Session Negotiation . . . . . . . . . 118
       11.3.1. Interpreting an ACL . . . . . . . 148
       12.2.2.  RDMA Requirements . . . . . . . . . . 118
       11.3.2. Computing a Mode Attribute from an ACL  . . . . . . . 150
       12.2.3.  RDMA Connection Resources 119
     11.4.  Requirements . . . . . . . . . . . . . 150
       12.2.4.  TCP and RDMA Inline Transfer Model . . . . . . . . . 151
       12.2.5.  RDMA Direct Transfer Model 120
       11.4.1. Setting the mode and/or ACL Attributes  . . . . . . . 121
       11.4.2. Retrieving the mode and/or ACL Attributes . . . . . . 154
     12.3.  Connection Models 122
       11.4.3. Creating New Objects  . . . . . . . . . . . . . . . . 122
   12. Single-server Name Space  . . . 157
       12.3.1.  TCP Connection Model . . . . . . . . . . . . . . . 124
     12.1.  Server Exports . 158
       12.3.2.  Negotiated RDMA Connection Model . . . . . . . . . . 159
       12.3.3.  Automatic RDMA Connection Model . . . . . . . . . . 160
     12.4.  Buffer Management, Transfer, Flow Control 124
     12.2.  Browsing Exports . . . . . . . . . 160
     12.5.  Retry and Replay . . . . . . . . . . . 125
     12.3.  Server Pseudo File System  . . . . . . . . . . 163
     12.6.  The Back Channel . . . . . 125
     12.4.  Multiple Roots . . . . . . . . . . . . . . . 164
     12.7.  COMPOUND Sizing Issues . . . . . . 126
     12.5.  Filehandle Volatility  . . . . . . . . . . . 165
     12.8.  Data Alignment . . . . . . 126
     12.6.  Exported Root  . . . . . . . . . . . . . . . 165
     12.9.  NFSv4 Integration . . . . . . 126
     12.7.  Mount Point Crossing . . . . . . . . . . . . . 167
       12.9.1.  Minor Versioning . . . . . 126
     12.8.  Security Policy and Name Space Presentation  . . . . . . 127
   13. File Locking and Share Reservations . . . . . . . 167
       12.9.2.  Slot Identifiers and Server Duplicate Request
                Cache . . . . . . 128
     13.1.  Locking  . . . . . . . . . . . . . . . . . 167
       12.9.3.  Resolving server callback races with sessions . . . 170
       12.9.4.  COMPOUND and CB_COMPOUND . . . . 128
       13.1.1. Client ID . . . . . . . . . . 171
     12.10. Sessions Security Considerations . . . . . . . . . . . . 173
       12.10.1. Denial 129
       13.1.2. Server Release of Service via Unauthorized State Changes Clientid  . . 173
     12.11. Session Mechanics - Steady State . . . . . . . . . . . 132
       13.1.3. State-owner and Stateid Definition  . 177
       12.11.1. Obligations of the Server . . . . . . . . 133
       13.1.4. Use of the Stateid and Locking  . . . . . 177
       12.11.2. Obligations of the Client . . . . . . 136
     13.2.  Lock Ranges  . . . . . . . 177
       12.11.3. Steps the Client Takes To Establish a Session . . . 178
     12.12. Session Mechanics - Recovery . . . . . . . . . . . . 138
     13.3.  Upgrading and Downgrading Locks  . . 178
       12.12.1. Events Requiring Client Action . . . . . . . . . . 138
     13.4.  Blocking Locks . 178
       12.12.2. Events Requiring Server Action . . . . . . . . . . . 180
   13. Multi-server Name Space . . . . . . . . . 139
     13.5.  Lease Renewal  . . . . . . . . . . 180
     13.1.  Location attributes . . . . . . . . . . . 140
     13.6.  Crash Recovery . . . . . . . 180
     13.2.  File System Presence or Absence . . . . . . . . . . . . 181
     13.3.  Getting Attributes for an Absent File System . . 140
       13.6.1. Client Failure and Recovery . . . . 182
       13.3.1.  GETATTR Within an Absent File System . . . . . . . . 182
       13.3.2.  READDIR and Absent File Systems . 140
       13.6.2. Server Failure and Recovery . . . . . . . . . 183
     13.4.  Uses of Location Information . . . . 141
       13.6.3. Network Partitions and Recovery . . . . . . . . . . 184
       13.4.1.  File System Replication . 143
     13.7.  Server Revocation of Locks . . . . . . . . . . . . . 185
       13.4.2.  File System Migration . . 147
     13.8.  Share Reservations . . . . . . . . . . . . . 185
       13.4.3.  Referrals . . . . . . 148
     13.9.  OPEN/CLOSE Operations  . . . . . . . . . . . . . . . 186
     13.5.  Additional Client-side Considerations . . 149
     13.10. Open Upgrade and Downgrade . . . . . . . 187
     13.6.  Effecting File System Transitions . . . . . . . . 150
     13.11. Short and Long Leases  . . . 187
       13.6.1.  Transparent File System Transitions . . . . . . . . 188
       13.6.2.  Filehandles and File System Transitions . . . . . . 190
       13.6.3.  Fileid's 150
     13.12. Clocks, Propagation Delay, and File System Transitions Calculating Lease
            Expiration . . . . . . . . 191
       13.6.4.  Fsid's and File System Transitions . . . . . . . . . 191
       13.6.5.  The Change Attribute and File System Transitions . . 192
       13.6.6.  Lock State and File System Transitions . . . . 151
     13.13. Vestigial Locking Infrastructure From V4.0 . . . 192
       13.6.7.  Write Verifiers and File System Transitions . . . . 196
     13.7.  Effecting File System Referrals 151
   14. Client-Side Caching . . . . . . . . . . . . 196
       13.7.1.  Referral Example (LOOKUP) . . . . . . . . . 152
     14.1.  Performance Challenges for Client-Side Caching . . . . 196
       13.7.2.  Referral Example (READDIR) . 153
     14.2.  Delegation and Callbacks . . . . . . . . . . . . 200
     13.8.  The Attribute fs_absent . . . . 154
       14.2.1. Delegation Recovery . . . . . . . . . . . . 202
     13.9.  The Attribute fs_locations . . . . . 155
     14.3.  Data Caching . . . . . . . . . . 203
     13.10. The Attribute fs_locations_info . . . . . . . . . . . . 205
     13.11. The Attribute fs_status 157
       14.3.1. Data Caching and OPENs  . . . . . . . . . . . . . . . 157
       14.3.2. Data Caching and File Locking . 213
   14. Directory Delegations . . . . . . . . . . . 158
       14.3.3. Data Caching and Mandatory File Locking . . . . . . . 160
       14.3.4. Data Caching and File Identity  . . 216
     14.1.  Introduction to Directory Delegations . . . . . . . . . 217
     14.2.  Directory 160
     14.4.  Open Delegation Design (in brief) .  . . . . . . . . 218
     14.3.  Recommended Attributes in support of Directory
            Delegations . . . . . . . . . . . . 161
       14.4.1. Open Delegation and Data Caching  . . . . . . . . . . 219
     14.4. 164
       14.4.2. Open Delegation Recall and File Locks  . . . . . . . . . . . 165
       14.4.3. Handling of CB_GETATTR  . . . . . . . . 220
     14.5.  Directory Delegation Recovery . . . . . . . 165
       14.4.4. Recall of Open Delegation . . . . . . 220
   15. Parallel NFS (pNFS) . . . . . . . . 168
       14.4.5. Clients that Fail to Honor Delegation Recalls . . . . 170
       14.4.6. Delegation Revocation . . . . . . . . . 220
     15.1.  Introduction . . . . . . . 171
     14.5.  Data Caching and Revocation  . . . . . . . . . . . . . . 171
       14.5.1. Revocation Recovery for Write Open Delegation . 220
     15.2.  General Definitions . . . 172
     14.6.  Attribute Caching  . . . . . . . . . . . . . . . 223
       15.2.1. . . . . 173
     14.7.  Data and Metadata Server Caching and Memory Mapped Files  . . . 175
     14.8.  Name Caching . . . . . . . . . . . . . . . . 223
       15.2.2.  Client . . . . . . 177
     14.9.  Directory Caching  . . . . . . . . . . . . . . . . . 223
       15.2.3.  Storage Device . . 178
   15. Multi-server Name Space . . . . . . . . . . . . . . . . . 223
       15.2.4.  Storage Protocol . . 179
     15.1.  Location attributes  . . . . . . . . . . . . . . . . 223
       15.2.5.  Control Protocol . . 179
     15.2.  File System Presence or Absence  . . . . . . . . . . . . 179
     15.3.  Getting Attributes for an Absent File System . . . . 224
       15.2.6.  Metadata . . 181
       15.3.1. GETATTR Within an Absent File System  . . . . . . . . 181
       15.3.2. READDIR and Absent File Systems . . . . . . . . . . . 182
     15.4.  Uses of Location Information . 224
       15.2.7.  Layout . . . . . . . . . . . . . 183
       15.4.1. File System Replication . . . . . . . . . . 224
     15.3.  pNFS protocol semantics . . . . . 183
       15.4.2. File System Migration . . . . . . . . . . . 225
       15.3.1.  Definitions . . . . . 184
       15.4.3. Referrals . . . . . . . . . . . . . . . 225
       15.3.2.  Guarantees Provided by Layouts . . . . . . . 185
     15.5.  Additional Client-side Considerations  . . . . 228
       15.3.3.  Getting a Layout . . . . . 185
     15.6.  Effecting File System Transitions  . . . . . . . . . . . 186
       15.6.1. Transparent File System Transitions . . 229
       15.3.4.  Committing a Layout . . . . . . . 187
       15.6.2. Filehandles and File System Transitions . . . . . . . 189
       15.6.3. Fileid's and File System Transitions  . . 230
       15.3.5.  Recalling a Layout . . . . . . 189
       15.6.4. Fsid's and File System Transitions  . . . . . . . . . 190
       15.6.5. The Change Attribute and File System Transitions  . . 232
       15.3.6.  Metadata Server Write Propagation 190
       15.6.6. Lock State and File System Transitions  . . . . . . . 191
       15.6.7. Write Verifiers and File System Transitions . . . 237
       15.3.7.  Crash Recovery . . 194
     15.7.  Effecting File System Referrals  . . . . . . . . . . . . 194
       15.7.1. Referral Example (LOOKUP) . . . . . 238
       15.3.8.  Security Considerations . . . . . . . . . 195
       15.7.2. Referral Example (READDIR)  . . . . . 243
     15.4.  The NFSv4 File Layout Type . . . . . . . . 199
     15.8.  The Attribute fs_absent  . . . . . . . 244
       15.4.1.  File Striping and Data Access . . . . . . . . . 201
     15.9.  The Attribute fs_locations . . 244
       15.4.2.  Global Stateid Requirements . . . . . . . . . . . . 253
       15.4.3. . 201
     15.10. The Layout Iomode Attribute fs_locations_info  . . . . . . . . . . . . 203
     15.11. The Attribute fs_status  . . . . . 253
       15.4.4.  Storage Device State Propagation . . . . . . . . . . 253
       15.4.5.  Storage Device Component File Size . 212
   16. Directory Delegations . . . . . . . . 256
       15.4.6.  Crash Recovery Considerations . . . . . . . . . . . 256
       15.4.7.  Security Considerations for the File Layout Type . 215
     16.1.  Introduction to Directory Delegations  . 257
       15.4.8.  Alternate Approaches . . . . . . . . 216
     16.2.  Directory Delegation Design (in brief) . . . . . . . . 257
   16. Minor Versioning . 217
     16.3.  Recommended Attributes in support of Directory
            Delegations  . . . . . . . . . . . . . . . . . . . . . 258
   17. Internationalization . 218
     16.4.  Delegation Recall  . . . . . . . . . . . . . . . . . . . 261
     17.1.  Stringprep profile for the utf8str_cs type 219
     16.5.  Directory Delegation Recovery  . . . . . . . 262
     17.2.  Stringprep profile for the utf8str_cis type . . . . . . 264
     17.3.  Stringprep profile for the utf8str_mixed type 219
   17. Parallel NFS (pNFS) . . . . . 265
     17.4.  UTF-8 Related Errors . . . . . . . . . . . . . . . . 219
     17.1.  Introduction . . 266
   18. Error Values . . . . . . . . . . . . . . . . . . . . 219
     17.2.  General Definitions  . . . . 267
     18.1.  Error Definitions . . . . . . . . . . . . . . 222
       17.2.1. Metadata Server . . . . . 267
     18.2.  Operations and their valid errors . . . . . . . . . . . 279
     18.3.  Callback operations and their valid errors . . . 222
       17.2.2. Client  . . . . 287
     18.4.  Errors and the operations that use them . . . . . . . . 287
   19. NFS version 4.1 Procedures . . . . . . . . . . . 222
       17.2.3. Storage Device  . . . . . . 293
     19.1.  Procedure 0: NULL - No Operation . . . . . . . . . . . . 293
     19.2.  Procedure 1: COMPOUND - Compound Operations . 222
       17.2.4. Storage Protocol  . . . . . 294
   20. NFS version 4.1 Operations . . . . . . . . . . . . . 222
       17.2.5. Control Protocol  . . . . 298
     20.1.  Operation 3: ACCESS - Check Access Rights . . . . . . . 299
     20.2.  Operation 4: CLOSE - Close File . . . . . . . 223
       17.2.6. Metadata  . . . . . 301
     20.3.  Operation 5: COMMIT - Commit Cached Data . . . . . . . . 302
     20.4.  Operation . . . . . . . . . 223
       17.2.7. Layout  . . . . . . . . . . . . . . . . . . . . . . . 223
     17.3.  pNFS protocol semantics  . . . . . . . . . . . . . . . . 224
       17.3.1. Definitions . . . . . . . . . . . . . . . . . . . . . 224
       17.3.2. Guarantees Provided by Layouts  . . . . . . . . . . . 227
       17.3.3. Getting a Layout  . . . . . . . . . . . . . . . . . . 228
       17.3.4. Committing a Layout . . . . . . . . . . . . . . . . . 229
       17.3.5. Recalling a Layout  . . . . . . . . . . . . . . . . . 231
       17.3.6. Metadata Server Write Propagation . . . . . . . . . . 237
       17.3.7. Crash Recovery  . . . . . . . . . . . . . . . . . . . 237
       17.3.8. Security Considerations . . . . . . . . . . . . . . . 243
     17.4.  The NFSv4 File Layout Type . . . . . . . . . . . . . . . 244
       17.4.1. File Striping and Data Access . . . . . . . . . . . . 244
       17.4.2. Global Stateid Requirements . . . . . . . . . . . . . 252
       17.4.3. The Layout Iomode . . . . . . . . . . . . . . . . . . 252
       17.4.4. Storage Device State Propagation  . . . . . . . . . . 253
       17.4.5. Storage Device Component File Size  . . . . . . . . . 255
       17.4.6. Crash Recovery Considerations . . . . . . . . . . . . 256
       17.4.7. Security Considerations for the File Layout Type  . . 256
       17.4.8. Alternate Approaches  . . . . . . . . . . . . . . . . 257
   18. Internationalization  . . . . . . . . . . . . . . . . . . . . 258
     18.1.  Stringprep profile for the utf8str_cs type . . . . . . . 259
     18.2.  Stringprep profile for the utf8str_cis type  . . . . . . 261
     18.3.  Stringprep profile for the utf8str_mixed type  . . . . . 262
     18.4.  UTF-8 Related Errors . . . . . . . . . . . . . . . . . . 263
   19. Error Values  . . . . . . . . . . . . . . . . . . . . . . . . 264
     19.1.  Error Definitions  . . . . . . . . . . . . . . . . . . . 264
     19.2.  Operations and their valid errors  . . . . . . . . . . . 276
     19.3.  Callback operations and their valid errors . . . . . . . 284
     19.4.  Errors and the operations that use them  . . . . . . . . 284
   20. NFS version 4.1 Procedures  . . . . . . . . . . . . . . . . . 290
     20.1.  Procedure 0: NULL - No Operation . . . . . . . . . . . . 290
     20.2.  Procedure 1: COMPOUND - Compound Operations  . . . . . . 291
   21. NFS version 4.1 Operations  . . . . . . . . . . . . . . . . . 295
     21.1.  Operation 3: ACCESS - Check Access Rights  . . . . . . . 296
     21.2.  Operation 4: CLOSE - Close File  . . . . . . . . . . . . 298
     21.3.  Operation 5: COMMIT - Commit Cached Data . . . . . . . . 299
     21.4.  Operation 6: CREATE - Create a Non-Regular File Object . 305
     21.5.  Operation 7: DELEGPURGE - Purge Delegations Awaiting
            Recovery . . . . . . . . . . . . . . . . . . . . . . . . 307
     21.6.  Operation 8: DELEGRETURN - Return Delegation . . . . . . 308
     21.7.  Operation 9: GETATTR - Get Attributes  . . . . . . . . . 309
     21.8.  Operation 10: GETFH - Get Current Filehandle . . . . . . 310
     21.9.  Operation 11: LINK - Create Link to a File . . . . . . . 311
     21.10. Operation 12: LOCK - Create Lock . . . . . . . . . . . . 312
     21.11. Operation 13: LOCKT - Test For Lock  . . . . . . . . . . 316
     21.12. Operation 14: LOCKU - Unlock File  . . . . . . . . . . . 317
     21.13. Operation 15: LOOKUP - Lookup Filename . . . . . . . . . 318
     21.14. Operation 16: LOOKUPP - Lookup Parent Directory  . . . . 320
     21.15. Operation 17: NVERIFY - Verify Difference in
            Attributes . . . . . . . . . . . . . . . . . . . . . . . 321
     21.16. Operation 18: OPEN - Open a Regular File . . . . . . . . 322
     21.17. Operation 19: OPENATTR - Open Named Attribute
            Directory  . . . . . . . . . . . . . . . . . . . . . . . 336
     21.18. Operation 21: OPEN_DOWNGRADE - Reduce Open File Access . 337
     21.19. Operation 22: PUTFH - Set Current Filehandle . . . . . . 338
     21.20. Operation 23: PUTPUBFH - Set Public Filehandle . . . . . 339
     21.21. Operation 24: PUTROOTFH - Set Root Filehandle  . . . . . 341
     21.22. Operation 25: READ - Read from File  . . . . . . . . . . 341
     21.23. Operation 26: READDIR - Read Directory . . . . . . . . . 343
     21.24. Operation 27: READLINK - Read Symbolic Link  . . . . . . 347
     21.25. Operation 28: REMOVE - Remove File System Object . . . . 348
     21.26. Operation 29: RENAME - Rename Directory Entry  . . . . . 350
     21.27. Operation 31: RESTOREFH - Restore Saved Filehandle . . . 351
     21.28. Operation 32: SAVEFH - Save Current Filehandle . . . . . 352
     21.29. Operation 33: SECINFO - Obtain Available Security  . . . 353
     21.30. Operation 34: SETATTR - Set Attributes . . . . . . . . . 356
     21.31. Operation 37: VERIFY - Verify Same Attributes  . . . . . 358
     21.32. Operation 38: WRITE - Write to File  . . . . . . . . . . 360
     21.33. Operation 40: BACKCHANNEL_CTL - Backchannel control  . . 364
     21.34. Operation 41: BIND_CONN_TO_SESSION . . . . . . . . . . . 364
     21.35. Operation 42: CREATE_CLIENTID - Instantiate Clientid . . 368
     21.36. Operation 43: CREATE_SESSION - Create New Session and
            Confirm Clientid . . . . . . . . . . . . . . . . . . . . 374
     21.37. Operation 44: DESTROY_SESSION - Destroy existing
            session  . . . . . . . . . . . . . . . . . . . . . . . . 382
     21.38. Operation 45: FREE_STATEID - Free stateid with no
            locks  . . . . . . . . . . . . . . . . . . . . . . . . . 383
     21.39. Operation 46: GET_DIR_DELEGATION - Get a directory
            delegation . . . . . . . . . . . . . . . . . . . . . . . 384
     21.40. Operation 47: GETDEVICEINFO - Get Device Information . . 388
     21.41. Operation 48: GETDEVICELIST  . . . . . . . . . . . . . . 389
     21.42. Operation 49: LAYOUTCOMMIT - Commit writes made using
            a layout . . . . . . . . . . . . . . . . . . . . . . . . 390
     21.43. Operation 50: LAYOUTGET - Get Layout Information . . . . 394
     21.44. Operation 51: LAYOUTRETURN - Release Layout
            Information  . . . . . . . . . . . . . . . . . . . . . . 396
     21.45. Operation 52: SECINFO_NO_NAME - Get Security on
            Unnamed Object . . . . . . . . . . . . . . . . . . . . . 399
     21.46. Operation 53: SEQUENCE - Supply per-procedure
            sequencing and control . . . . . . . . . . . . . . . . . 400
     21.47. Operation 54: SET_SSV  . . . . . . . . . . . . . . . . . 403
     21.48. Operation 55: TEST_STATEID - Test stateids for
            validity . . . . . . . . . . . . . . . . . . . . . . . . 405
     21.49. Operation 56: WANT_DELEGATION  . . . . . . . . . . . . . 406
     21.50. Operation 10044: ILLEGAL - Illegal operation . . . . . . 409
   22. NFS version 4.1 Callback Procedures . . . . . . . . . . . . . 409
     22.1.  Procedure 0: CB_NULL - No Operation  . . . . . . . . . . 410
     22.2.  Procedure 1: CB_COMPOUND - Compound Operations . . . . . 410
   23. NFS version 4.1 Callback Operations . . . . . . . . . . . . . 412
     23.1.  Operation 3: CB_GETATTR - Get Attributes . . . . . . . . 412
     23.2.  Operation 4: CB_RECALL - Recall an Open Delegation . . . 413
     23.3.  Operation 5: CB_LAYOUTRECALL . . . . . . . . . . . . . . 414
     23.4.  Operation 6: CB_NOTIFY - Notify directory changes  . . . 417
     23.5.  Operation 7: CB_PUSH_DELEG . . . . . . . . . . . . . . . 420
     23.6.  Operation 8: CB_RECALL_ANY - Keep any N delegations  . . 421
     23.7.  Operation 9: CB_RECALLABLE_OBJ_AVAIL . . . . . . . . . . 424
     23.8.  Operation 10: CB_RECALL_CREDIT - change flow control
            limits . . . . . . . . . . . . . . . . . . . . . . . . . 425
     23.9.  Operation 11: CB_SEQUENCE - Supply callback channel
            sequencing and control . . . . . . . . . . . . . . . . . 425
     23.10. Operation 12: CB_WANTS_CANCELLED . . . . . . . . . . . . 427
     23.11. Operation 10044: CB_ILLEGAL - Illegal Callback
            Operation  . . . . . . . . . . . . . . . . . . . . . . . 428
   24. Security Considerations . . . . . . . . . . . . . . . . . . . 428
   25. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 429
     25.1.  Defining new layout types  . . . . . . . . . . . . . . . 429
   26. References  . . . . . . . . . . . . . . . . . . . . . . . . . 429
     26.1.  Normative References . . . . . . . . . . . . . . . . . . 429
     26.2.  Informative References . . . . . . . . . . . . . . . . . 429
   Appendix A.  ACL Algorithm Examples . . . . . . . . . . . . . . . 430
     A.1.   Recomputing mode upon SETATTR of ACL . . . . . . . . . . 430
     A.2.   Computing the Inherited ACL  . . . . . . . . . . . . . . 433
       A.2.1.  Discussion  . . . . . . . . . . . . . . . . . . . . . 434
     A.3.   Applying a Mode to an Existing ACL . . . . . . . . . . . 435
   Appendix B.  Acknowledgments  . . . . . . . . . . . . . . . . . . 439
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . . 440
   Intellectual Property and Copyright Statements  . . . . . . . . . 441

1.  Introduction

1.1.  The NFSv4.1 Protocol

   The NFSv4.1 protocol is a minor version of the NFSv4 protocol
   described in [2].  It generally follows the guidelines for the minor
   versioning model laid out in Section 10 of RFC 3530.  However, it
   diverges from guidelines 11 ("a client and server that supports minor
   version X must support minor versions 0 through X-1"), and 12 ("no
   features may be introduced as mandatory in a minor version").  These
   divergences are due to the introduction of the sessions model for
   managing non-idempotent operations and the RECLAIM_COMPLETE
   operation.  These two new features are infrastructural in nature and
   simplify implementation of existing and other new features.  Making
   them optional would add undue complexity to protocol definition and
   implementation.  NFSv4.1 accordingly updates the Minor Versioning
   guidelines (Section 7).

   NFSv4.1, as a minor version, is consistent with the overall goals for
   NFS Version 4, but extends the protocol so as to better meet those
   goals, based on experiences with NFSv4.0.  In addition, NFSv4.1 has
   adopted some additional goals, which motivate some of the major
   extensions in minor version 1.

1.2.  NFS Version 4 Goals

   The NFS version 4 protocol is a further revision of the NFS protocol
   defined already by versions 2 [17] and 3 [18].  It retains the
   essential characteristics of previous versions: design for easy
   recovery, independent of transport protocols, operating systems and
   file systems, simplicity, and good performance.  The NFS version 4
   revision has the following goals:

   o  Improved access and good performance on the Internet.

      The protocol is designed to transit firewalls easily, perform well
      where latency is high and bandwidth is low, and scale to very
      large numbers of clients per server.

   o  Strong security with negotiation built into the protocol.

      The protocol builds on the work of the ONCRPC working group in
      supporting the RPCSEC_GSS protocol.  Additionally, the NFS version
      4 protocol provides a mechanism to allow clients and servers the
      ability to negotiate security and require clients and servers to
      support a minimal set of security schemes.

   o  Good cross-platform interoperability.

      The protocol features a file system model that provides a useful,
      common set of features that does not unduly favor one file system
      or operating system over another.

   o  Designed for protocol extensions.

      The protocol is designed to accept standard extensions within a
      framework that enables and encourages backward compatibility.

1.3.  Minor Version 1 Goals

   Minor version one has the following goals, within the framework
   established by the overall version 4 goals.

   o  To correct significant structural weaknesses and oversights
      discovered in the base protocol.

   o  To add clarity and specificity to areas left unaddressed or not
      addressed in sufficient detail in the base protocol.

   o  To add specific features based on experience with the existing
      protocol and recent industry developments.

   o  To provide protocol support to take advantage of clustered server
      deployments including the ability to provide scalable parallel
      access to files distributed among multiple servers.

1.4.  Inconsistencies of this Document with Section XX

   Section XX, RPC Definition File, contains the definitions in XDR
   description language of the constructs used by the protocol.  Prior
   to this section, several of the constructs are reproduced for
   purposes of explanation.  Although every effort has been made to
   assure a correct and consistent description, the possibility of
   inconsistencies exists.  For any part of the document that is
   inconsistent with Section XX, Section XX is to be considered
   authoritative.

1.5.  Overview of NFS version 4.1 Features

   To provide a reasonable context for the reader, the major features of
   NFS version 4.1 protocol will be reviewed in brief.  This will be
   done to provide an appropriate context for both the reader who is
   familiar with the previous versions of the NFS protocol and the
   reader that is new to the NFS protocols.  For the reader new to the
   NFS protocols, there is still a set of fundamental knowledge that is
   expected.  The reader should be familiar with the XDR and RPC
   protocols as described in [3] and [4].  A basic knowledge of file
   systems and distributed file systems is expected as well.

   This description of version 4.1 features will not distinguish those
   added in minor version one from those present in the base protocol
   but will treat minor version 1 as a unified whole.  See Section 1.7 for
   a description of the differences between the two minor versions.

1.5.1.  RPC and Security

   As with previous versions of NFS, the External Data Representation
   (XDR) and Remote Procedure Call (RPC) mechanisms used for the NFS
   version 4.1 protocol are those defined in [3] and [4].  To meet end-
   to-end security requirements, the RPCSEC_GSS framework [5] will be
   used to extend the basic RPC security.  With the use of RPCSEC_GSS,
   various mechanisms can be provided to offer authentication,
   integrity, and privacy to the NFS version 4 protocol.  Kerberos V5
   will be used as described in [6] to provide one security framework.
   The LIPKEY GSS-API mechanism described in [7] will be used to provide
   for the use of user password and server public key by the NFS version
   4 protocol.  With the use of RPCSEC_GSS, other mechanisms may also be
   specified and used for NFS version 4.1 security.

   To enable in-band security negotiation, the NFS version 4.1 protocol
   has operations which provide the client a method of querying the
   server about its policies regarding which security mechanisms must be
   used for access to the server's file system resources.  With this,
   the client can securely match the security mechanism that meets the
   policies specified at both the client and server.

1.5.2.  Protocol Structure

1.5.2.1.  Core Protocol

   Unlike NFS Versions 2 and 3, which used a series of ancillary
   protocols (e.g., NLM, NSM, MOUNT), within all minor versions of NFS
   version 4 only a single RPC protocol is used to make requests of the
   server.  Facilities such as locking, which had been separate
   protocols, are now integrated within a single unified protocol.

   A significant departure from the versions of the NFS protocol before
   version 4 is the introduction of the COMPOUND procedure.  For the NFS
   version 4 protocol, in all minor versions, there are two RPC
   procedures, NULL and COMPOUND.  The COMPOUND procedure is defined as
   a series of individual operations and these operations perform the
   sorts of functions performed by traditional NFS procedures.

   The operations combined within a COMPOUND request are evaluated in
   order by the server, without any atomicity guarantees.  A limited set
   of facilities exists to pass results from one operation to another.
   Once an operation returns a failing result, the evaluation ends and
   the results of all evaluated operations are returned to the client.

   With the use of the COMPOUND procedure, the client is able to build
   simple or complex requests.  These COMPOUND requests allow for a
   reduction in the number of RPCs needed for logical file system
   operations.  For example, multi-component lookup requests can be
   constructed by combining multiple LOOKUP operations.  Those can be
   further combined with operations such as GETATTR, READDIR, or OPEN
   plus READ to do more complicated sets of operations without incurring
   additional latency.
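
   The evaluation rule and the multi-component lookup described above
   can be illustrated with a small simulation.  This is a minimal
   sketch, not protocol code: the in-memory TREE, the string
   filehandles, and the helper names build_lookup_compound and
   evaluate_compound are hypothetical and exist only for this example.
   It assumes the current filehandle is the facility that carries
   results from one operation to the next.

```python
from typing import Dict, List, Optional, Tuple

NFS4_OK = "NFS4_OK"
NFS4ERR_NOENT = "NFS4ERR_NOENT"

# Hypothetical namespace: maps a directory "filehandle" to its entries.
TREE: Dict[str, Dict[str, str]] = {
    "root": {"export": "fh-export"},
    "fh-export": {"home": "fh-home"},
    "fh-home": {},
}

Op = Tuple[str, Optional[str]]
Result = Tuple[str, str, Optional[str]]


def build_lookup_compound(path: str) -> List[Op]:
    """Build one COMPOUND for a whole path: PUTROOTFH, LOOKUPs, GETFH."""
    ops: List[Op] = [("PUTROOTFH", None)]
    ops += [("LOOKUP", name) for name in path.strip("/").split("/") if name]
    ops.append(("GETFH", None))
    return ops


def evaluate_compound(ops: List[Op]) -> List[Result]:
    """Evaluate operations in order and stop at the first failure.

    No atomicity is implied: effects of earlier operations stand even
    if a later operation fails.
    """
    current_fh: Optional[str] = None  # state passed between operations
    results: List[Result] = []
    for opname, arg in ops:
        status, value = NFS4_OK, None
        if opname == "PUTROOTFH":
            current_fh = "root"
        elif opname == "LOOKUP":
            child = TREE.get(current_fh or "", {}).get(arg or "")
            if child is None:
                status = NFS4ERR_NOENT
            else:
                current_fh = child
        elif opname == "GETFH":
            value = current_fh
        results.append((opname, status, value))
        if status != NFS4_OK:
            break  # remaining operations are not evaluated
    return results
```

   In this sketch a lookup of "/export/home" costs a single round trip
   of four operations, while a request containing a missing component
   returns only the results up to and including the failing LOOKUP.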

   NFS Version 4.1 also contains a considerable set of callback
   operations in which the server makes an RPC directed at the client.
   Callback RPCs have a similar structure to that of the normal server
   requests.  For the NFS version 4 protocol callbacks in all minor
   versions, there are two RPC procedures, CB_NULL and CB_COMPOUND.
   The CB_COMPOUND procedure is defined in analogous fashion to that of
   COMPOUND with its own set of callback operations.

   The addition of new server and callback operations within the
   COMPOUND and CB_COMPOUND request framework provides a means of
   extending the protocol in subsequent minor versions.

   Except for a small number of operations needed for session creation,
   server requests and callback requests are performed within the
   context of a session.  Sessions provide a client context for every
   request and support robust replay protection for non-idempotent
   requests.

1.5.2.2.  Parallel Access

   Minor version one supports high-performance data access to a
   clustered server implementation by enabling a separation of metadata
   access and data access, with the latter done to multiple servers in
   parallel.

   Such parallel data access is controlled by recallable objects known
   as "layouts", which are integrated into the protocol locking model.
   Clients direct requests for data access to a set of data servers
   specified by the layout via a data storage protocol which may be
   NFSv4.1 or may be another protocol.

1.5.3.  File System Model

   The general file system model used for the NFS version 4.1 protocol
   is the same as previous versions.  The server file system is
   hierarchical with the regular files contained within being treated as
   opaque byte streams.  In a slight departure, file and directory names
   are encoded with UTF-8 to deal with the basics of
   internationalization.

   The NFS version 4.1 protocol does not require a separate protocol to
   provide for the initial mapping between path name and filehandle.
   All file systems exported by a server are presented as a tree so that
   all file systems are reachable from a special per-server global root
   filehandle.  This allows LOOKUP operations to be used to perform
   functions previously provided by the MOUNT protocol.  The server
   provides any necessary pseudo file systems to bridge any gaps that
   arise due to unexported gaps between exported file systems.

1.5.3.1.  Filehandles

   As in previous versions of the NFS protocol, opaque filehandles are
   used to identify individual files and directories.  Lookup-type and
   create operations are used to go from file and directory names to the
   filehandle which is then used to identify the object to subsequent
   operations.

   The NFS version 4.1 protocol provides support for persistent
   filehandles, guaranteed to be valid for the lifetime of the file
   system object designated.  In addition, it allows servers to provide
   filehandles with more limited validity guarantees, called volatile
   filehandles.

1.5.3.2.  File Attributes

   The NFS version 4.1 protocol has a rich and extensible attribute
   structure.  Only a small set of the defined attributes are mandatory
   and must be provided by all server implementations.  The other
   attributes are known as "recommended" attributes.

   One significant recommended file attribute is the Access Control List
   (ACL) attribute.  This attribute provides for directory and file
   access control beyond the model used in NFS Versions 2 and 3.  The
   ACL definition allows for specification of specific sets of
   permissions for individual users and groups.  In addition, ACL
   inheritance allows propagation of access permissions and restrictions
   down a directory tree as file system objects are created.

   One other type of attribute is the named attribute.  A named
   attribute is an opaque byte stream that is associated with a
   directory or file and referred to by a string name.  Named attributes
   are meant to be used by client applications as a method to associate
   application specific data with a regular file or directory.

1.5.3.3.  Multi-server Namespace

   NFS Version 4.1 contains a number of features to allow implementation
   of namespaces that cross server boundaries and that allow and
   facilitate a non-disruptive transfer of support for individual file
   systems between servers.  They are all based upon attributes that
   allow one file system to specify alternate or new locations for that
   file system.

   These attributes may be used together with the concept of absent file
   systems, which provide specifications for additional locations but no
   actual file system content.  This allows a number of important
   facilities:

   o  Location attributes may be used with absent file systems to
      implement referrals whereby one server may direct the client to a
      file system provided by another server.  This allows extensive
      multi-server namespaces to be constructed.

   o  Location attributes may be provided for present file systems to
      provide the locations of alternate file system instances or
      replicas to be used in the event that the current file system
      instance becomes unavailable.

   o  Location attributes may be provided when a previously present file
      system becomes absent.  This allows non-disruptive migration of
      file systems to alternate servers.

1.5.4.  Locking Facilities

   As mentioned previously, NFSv4.1 is a single protocol which includes
   locking facilities.  These locking facilities include support for
   many types of locks including a number of sorts of recallable locks.
   Recallable locks such as delegations allow the client to be assured
   that certain events will not occur so long as that lock is held.
   When circumstances change, the lock is recalled via a callback
   request.  The assurances provided by delegations allow more extensive
   caching to be done safely when circumstances allow it.  The types of
   locks are:

   o  Share reservations as established by OPEN operations.

   o  Byte-range locks.

   o  File delegations which are recallable locks that assure the holder
      that inconsistent opens and file changes cannot occur so long as
      the delegation is held.

   o  Directory delegations which are recallable delegations that assure
      the holder that inconsistent directory modifications cannot occur
      so long as the delegation is held.

   o  Layouts which are recallable objects that assure the holder that
      access to the file data may be performed directly by the client
      and that no change to the data's location inconsistent with that
      access may be made so long as the layout is held.

   All locks for a given client are tied together under a single client-
   wide lease.  All requests made on sessions associated with the client
   renew that lease.  When leases are not promptly renewed, locks are
   subject to revocation.  In the event of server reinitialization,
   clients have the opportunity to safely reclaim their locks within a
   special grace period.
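
   The client-wide lease described above can be sketched as follows.
   This is an illustrative model only, under stated assumptions: the
   class name ClientLease, the string lock labels, and the 90-second
   lease time are hypothetical, and time is passed in explicitly so
   the behavior is deterministic.  The sketch revokes all locks as
   soon as expiry is observed, whereas the text only says that locks
   become subject to revocation.

```python
class ClientLease:
    """One lease covering every lock held by a client (sketch)."""

    def __init__(self, lease_time: float, now: float) -> None:
        self.lease_time = lease_time
        self.expires_at = now + lease_time
        self.locks = set()  # all locks share this single lease

    def renew(self, now: float) -> None:
        """Any request on a session of this client renews the lease."""
        self.expires_at = now + self.lease_time

    def expired(self, now: float) -> bool:
        return now >= self.expires_at

    def revoke_if_expired(self, now: float) -> list:
        """Drop all locks once the lease has lapsed.

        Assumption: immediate revocation is a simplification; a server
        could instead wait until a conflicting request arrives.
        """
        if not self.expired(now):
            return []
        revoked = sorted(self.locks)
        self.locks.clear()
        return revoked
```

   For example, with a lease time of 90 seconds, a request at time 60
   moves expiry from 90 to 150, so the locks survive a check at time
   100 but not one at time 150.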

1.6.  General Definitions

   The following definitions are provided for the purpose of providing
   an appropriate context for the reader.

   Client  The "client" is the entity that accesses the NFS server's
      resources.  The client may be an application which contains the
      logic to access the NFS server directly.  The client may also be
      the traditional operating system client that provides remote file
      system services for a set of applications.

      In the case of file locking the client is the entity that
      maintains a set of locks on behalf of one or more applications.
      This client is responsible for crash or failure recovery for those
      locks it manages.

      Note that multiple clients may share the same transport and
      multiple clients may exist on the same network node.

   Clientid  A 64-bit quantity used as a unique, short-hand reference to
      a client supplied Verifier and ID.  The server is responsible for
      supplying the Clientid.

   Lease  An interval of time defined by the server for which the client
      is irrevocably granted a lock.  At the end of a lease period the
      lock may be revoked if the lease has not been extended.  The lock
      must be revoked if a conflicting lock has been granted after the
      lease interval.

      All leases granted by a server have the same fixed interval.  Note
      that the fixed interval was chosen to alleviate the expense a
      server would have in maintaining state about variable length
      leases across server failures.

   Lock  The term "lock" is used to refer to any of record (byte-range)
      locks, share reservations, delegations or layouts unless
      specifically stated otherwise.

   Server  The "Server" is the entity responsible for coordinating
      client access to a set of file systems.

   Stable Storage  NFS version 4 servers must be able to recover without
      data loss from multiple power failures (including cascading power
      failures, that is, several power failures in quick succession),
      operating system failures, and hardware failure of components
      other than the storage medium itself (for example, disk,
      nonvolatile RAM).

      Some examples of stable storage that are allowable for an NFS
      server include:

      1.  Media commit of data, that is, the modified data has been
          successfully written to the disk media, for example, the disk
          platter.

      2.  An immediate reply disk drive with battery-backed on-drive
          intermediate storage or uninterruptible power system (UPS).

      3.  Server commit of data with battery-backed intermediate storage
          and recovery software.

      4.  Cache commit with uninterruptible power system (UPS) and
          recovery software.

   Stateid  A 128-bit quantity returned by a server that uniquely
      defines the open and locking state provided by the server for a
      specific open or lock owner for a specific file.  Stateids
      composed of all bits 0 or all bits 1 have special meaning and are
      reserved values.

   Verifier  A 64-bit quantity generated by the client that the server
      can use to determine if the client has restarted and lost all
      previous lock state.
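
   The roles of the Verifier and Clientid defined above can be
   sketched as a server-side table.  This is a simplified model under
   stated assumptions: ClientTable, register, and locks_for are
   hypothetical names, and the confirmation and principal checks that
   a real server performs when a client ID is created are omitted.

```python
import itertools

VERIFIER_SIZE = 8  # a verifier is a 64-bit quantity


class ClientTable:
    """Maps a client-supplied ID to its verifier, Clientid, and locks."""

    def __init__(self) -> None:
        self._records = {}
        self._next_clientid = itertools.count(1)  # server-chosen values

    def register(self, client_id: bytes, verifier: bytes):
        """Return (clientid, restarted) for a client presenting its ID.

        A changed verifier for a known ID is taken to mean the client
        restarted and lost its lock state, so the old state is dropped.
        """
        if len(verifier) != VERIFIER_SIZE:
            raise ValueError("verifier must be 8 bytes")
        record = self._records.get(client_id)
        if record is not None and record["verifier"] == verifier:
            return record["clientid"], False  # same client instance
        restarted = record is not None
        clientid = next(self._next_clientid)
        self._records[client_id] = {
            "verifier": verifier,
            "clientid": clientid,
            "locks": set(),  # fresh state for a new or restarted client
        }
        return clientid, restarted

    def locks_for(self, client_id: bytes) -> set:
        return self._records[client_id]["locks"]
```

   Presenting the same ID and verifier twice returns the same
   Clientid and keeps the lock state.  Presenting the same ID with a
   new verifier yields a new Clientid and an empty lock set.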

1.7.  Differences from NFSv4.0

   The following summarizes the differences between minor version one
   and the base protocol:

   o  Implementation of the sessions model.

   o  Support for parallel access to data.

   o  Addition of the RECLAIM_COMPLETE operation to better structure
      the lock reclamation process.

   o  Support for directory delegation.

   o  Operations to re-obtain a delegation.

   o  Support for client and server implementation IDs.

2.  Core Infrastructure

2.1.  Introduction

2.2.  RPC and XDR

2.2.1.  RPC-based Security

2.2.1.1.  RPC Security Flavors

2.2.1.1.1.  RPCSEC_GSS and Security Services

2.2.1.1.1.1.  Authentication, Integrity, Privacy

2.2.1.1.1.2.  GSS Server Principal

2.2.1.2.  NFSv4 Security Tuples

2.2.1.2.1.  Security Service Negotiation

2.2.1.2.1.1.  SECINFO and SECINFO_NO_NAME

2.2.1.2.1.2.  Security Error

2.2.1.2.1.3.  PUTFH + LOOKUP

2.2.1.2.1.4.  PUTFH + LOOKUPP

2.2.1.2.1.5.  PUTFH + SECINFO

2.2.1.2.1.6.  PUTFH + Anything Else

2.3.  Non-RPC-based Security Services

2.3.1.  Authorization

2.3.2.  Auditing

2.3.3.  Intrusion Detection

2.4.  Transport Layers

2.4.1.  Ports

2.4.2.  Stream Transports

2.4.3.  RDMA Transports

2.4.3.1.  RDMA Requirements

2.4.3.2.  RDMA Connection Resources

2.5.  Session

2.5.1.  Motivation and Overview

2.5.2.  NFSv4 Integration

2.5.2.1.  COMPOUND and CB_COMPOUND

2.5.2.2.  SEQUENCE and CB_SEQUENCE

2.5.2.3.  Clientid and Session Association

2.5.3.  Channels

2.5.3.1.  Operation Channel

2.5.3.2.  Back Channel

2.5.3.2.1.  Back Channel RPC Security

2.5.3.3.  Session and Channel Association

2.5.3.4.  Connection and Channel Association

2.5.3.4.1.  Trunking

2.5.4.  Exactly Once Semantics

2.5.4.1.  Slot Identifiers and Server Duplicate Request Cache

2.5.4.2.  Retry and Replay

2.5.4.3.  Resolving server callback races with sessions

2.6.  Channel Management

2.6.1.  Buffer Management

2.6.2.  Data Transfer

2.6.2.1.  Inline Data Transfer (Stream and RDMA)

2.6.2.2.  Direct Data Transfer (RDMA)

2.6.3.  Flow Control

2.6.4.  COMPOUND Sizing Issues

2.6.5.  Data Alignment

2.7.  Sessions Security

2.7.1.  Denial of Service via Unauthorized State Changes

2.8.  Session Mechanics - Steady State

2.8.1.  Obligations of the Server

2.8.2.  Obligations of the Client

2.8.3.  Steps the Client Takes To Establish a Session

2.8.4.  Session Mechanics - Recovery

2.8.4.1.  Reconnection

2.8.4.2.  Failover

2.8.4.3.  Events Requiring Client Action

2.8.4.4.  Events Requiring Server Action

3.  RPC and Security Flavor

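   The NFS version 4.1 protocol is a Remote Procedure Call (RPC)
   application that uses RPC version 2 and the corresponding eXternal
   Data Representation (XDR) as defined in RFC1831 [4] and RFC4506 [3].
   The RPCSEC_GSS security flavor as defined in RFC2203 [5] MUST be used
   as the mechanism to deliver stronger security for the NFS version 4
   protocol.

   The syntax and semantics to describe the data types of the NFS
   version 4 protocol are defined in the XDR RFC4506 [3] and RPC RFC1831
   [4] documents.  The next sections build upon the XDR data types to
   define types and structures specific to this protocol.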

2.1.  Basic Data Types

                   These are the base NFSv4 data types.

   +---------------+---------------------------------------------------+
   | Data Type     | Definition                                        |
   +---------------+---------------------------------------------------+
   | int32_t       | typedef int int32_t;                              |
   | uint32_t      | typedef unsigned int uint32_t;                    |
   | int64_t       | typedef hyper int64_t;                            |
   | uint64_t      | typedef unsigned hyper uint64_t;                  |
   | attrlist4     | typedef opaque attrlist4<>;                       |
   |               | Used for file/directory attributes                |
   | bitmap4       | typedef uint32_t bitmap4<>;                       |
   |               | Used in attribute array encoding.                 |
   | changeid4     | typedef uint64_t changeid4;                       |
   |               | Used in definition of change_info                 |
   | clientid4     | typedef uint64_t clientid4;                       |
   |               | Shorthand reference to client identification      |
   | component4    | typedef utf8str_cs component4;                    |
   |               | Represents path name components                   |
   | count4        | typedef uint32_t count4;                          |
   |               | Various count parameters (READ, WRITE, COMMIT)    |
   | length4       | typedef uint64_t length4;                         |
   |               | Describes LOCK lengths                            |
   | linktext4     | typedef utf8str_cs linktext4;                     |
   |               | Symbolic link contents                            |
   | mode4         | typedef uint32_t mode4;                           |
   |               | Mode attribute data type                          |
   | nfs_cookie4   | typedef uint64_t nfs_cookie4;                     |
   |               | Opaque cookie value for READDIR                   |
   | nfs_fh4       | typedef opaque nfs_fh4<NFS4_FHSIZE>               |
   |               | Filehandle definition; NFS4_FHSIZE is defined as  |
   |               | 128                                               |
   | nfs_ftype4    | enum nfs_ftype4;                                  |
   |               | Various defined file types                        |
   | nfsstat4      | enum nfsstat4;                                    |
   |               | Return value for operations                       |
   | offset4       | typedef uint64_t offset4;                         |
   |               | Various offset designations (READ, WRITE, LOCK,   |
   |               | COMMIT)                                           |
   | pathname4     | typedef component4 pathname4<>;                   |
   |               | Represents path name for fs_locations             |
   | qop4          | typedef uint32_t qop4;                            |
   |               | Quality of protection designation in SECINFO      |
   | sec_oid4      | typedef opaque sec_oid4<>;                        |
   |               | Security Object Identifier. The sec_oid4 data     |
   |               | type is not really opaque. Instead, it contains   |
   |               | an ASN.1 OBJECT IDENTIFIER as used by GSS-API in  |
   |               | the mech_type argument to GSS_Init_sec_context.   |
   |               | See RFC2743 [8] for details.                      |
   | seqid4        | typedef uint32_t seqid4;                          |
   |               | Sequence identifier used for file locking         |
   | utf8string    | typedef opaque utf8string<>;                      |
   |               | UTF-8 encoding for strings                        |
   | utf8str_cis   | typedef opaque utf8str_cis;                       |
   |               | Case-insensitive UTF-8 string                     |
   | utf8str_cs    | typedef opaque utf8str_cs;                        |
   |               | Case-sensitive UTF-8 string                       |
   | utf8str_mixed | typedef opaque utf8str_mixed;                     |
   |               | UTF-8 strings with a case sensitive prefix and a  |
   |               | case insensitive suffix.                          |
   | verifier4     | typedef opaque verifier4[NFS4_VERIFIER_SIZE];     |
   |               | Verifier used for various operations (COMMIT,     |
   |               | CREATE, OPEN, READDIR, SETCLIENTID,               |
   |               | SETCLIENTID_CONFIRM, WRITE) NFS4_VERIFIER_SIZE is |
   |               | defined as 8.                                     |
   +---------------+---------------------------------------------------+

                          End of Base Data Types

                                  Table 1

2.2.  Structured Data Types

2.2.1.  nfstime4

   struct nfstime4 {
       int64_t seconds;
       uint32_t nseconds;
   };

   The nfstime4 structure gives the number of seconds and nanoseconds
   since midnight or 0 hour January 1, 1970 Coordinated Universal Time
   (UTC).  Values greater than zero for the seconds field denote dates
   after the 0 hour January 1, 1970.  Values less than zero for the
   seconds field denote dates before the 0 hour January 1, 1970.  In
   both cases, the nseconds field is to be added to the seconds field
   for the final time representation.  For example, if the time to be
   represented is one-half second before 0 hour January 1, 1970, the
   seconds field would have a value of negative one (-1) and the
   nseconds field would have a value of one-half second (500000000).
   Values greater than 999,999,999 for nseconds are considered invalid.

   This data type is used to pass time and date information.  A server
   converts to and from its local representation of time when processing
   time values, preserving as much accuracy as possible.  If the
   precision of timestamps stored for a file system object is less than
   defined, loss of precision can occur.  An adjunct time maintenance
   protocol is recommended to reduce client and server time skew.
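
   The seconds/nseconds rule above can be illustrated with a minimal
   sketch.  It assumes the time is available as a signed count of
   nanoseconds relative to the epoch; the function names are
   hypothetical and are not part of the protocol.

```python
NSEC_PER_SEC = 1_000_000_000


def to_nfstime4(total_ns):
    """Split signed nanoseconds since the epoch into (seconds, nseconds).

    Floor division keeps nseconds in 0..999,999,999, so nseconds is
    always added to seconds, including for times before the epoch.
    """
    seconds, nseconds = divmod(total_ns, NSEC_PER_SEC)
    return seconds, nseconds


def from_nfstime4(seconds, nseconds):
    """Recombine the two fields, rejecting an out-of-range nseconds."""
    if not 0 <= nseconds <= 999_999_999:
        raise ValueError("nseconds greater than 999,999,999 is invalid")
    return seconds * NSEC_PER_SEC + nseconds


# One-half second before the epoch, as in the example in the text.
half_second_before = to_nfstime4(-500_000_000)
```

   With this sketch, one-half second before the epoch splits into a
   seconds value of -1 and an nseconds value of 500000000, matching the
   example in the text.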

2.2.2.  time_how4

   enum time_how4 {
       SET_TO_SERVER_TIME4 = 0,
       SET_TO_CLIENT_TIME4 = 1
   };

2.2.3.  settime4

   union settime4 switch (time_how4 set_it) {
       case SET_TO_CLIENT_TIME4:
           nfstime4       time;
       default:
           void;
   };

   The above definitions are used as the attribute definitions to set
   time values.  If set_it is SET_TO_SERVER_TIME4, then the server uses
   its local representation of time for the time value.

2.2.4.  specdata4

   struct specdata4 {
       uint32_t specdata1; /* major device number */
       uint32_t specdata2; /* minor device number */
   };

   This data type represents additional information for the device file
   types NF4CHR and NF4BLK.

2.2.5.  fsid4

   struct fsid4 {
       uint64_t        major;
       uint64_t        minor;
   };

2.2.6.  fs_location4

   struct fs_location4 {
       utf8str_cis    server<>;
       pathname4     rootpath;
   };

2.2.7.  fs_locations4

   struct fs_locations4 {
       pathname4     fs_root;
       fs_location4  locations<>;
   };

   The fs_location4 and fs_locations4 data types are used for the
   fs_locations recommended attribute which is used for migration and
   replication support.

2.2.8.  fattr4

   struct fattr4 {
       bitmap4       attrmask;
       attrlist4     attr_vals;
   };

   The fattr4 structure is used to represent file and directory
   attributes.

   The bitmap is a counted array of 32 bit integers used to contain bit
   values.  The position of the integer in the array that contains bit n
   can be computed from the expression (n / 32) and its bit within that
   integer is (n mod 32).

   0            1
   +-----------+-----------+-----------+--
   |  count    | 31  ..  0 | 63  .. 32 |
   +-----------+-----------+-----------+--
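
   The word and bit arithmetic for the bitmap can be sketched as
   follows.  This is an illustration only: it models the counted array
   as a Python list of 32-bit words, and the helper names are
   hypothetical.

```python
def bitmap4_set(words, n):
    """Return a copy of the word list with bit n set.

    Bit n lives in word (n / 32), at bit position (n mod 32).
    The list is grown with zero words if it is too short.
    """
    index, bit = divmod(n, 32)
    words = list(words) + [0] * (index + 1 - len(words))
    words[index] |= 1 << bit
    return words


def bitmap4_test(words, n):
    """Report whether bit n is set; bits past the end count as clear."""
    index, bit = divmod(n, 32)
    return index < len(words) and bool((words[index] >> bit) & 1)


# Bit 33 is bit 1 of the second word.
example = bitmap4_set([], 33)
```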

2.2.9.  change_info4

   struct change_info4 {
       bool          atomic;
       changeid4     before;
       changeid4     after;
   };

   This structure is used with the CREATE, LINK, REMOVE, RENAME
   operations to let the client know the value of the change attribute
   for the directory in which the target file system object resides.

2.2.10.  netaddr4

   struct netaddr4 {
       /* see struct rpcb in RFC1833 */
       string r_netid<>;    /* network id */
       string r_addr<>;     /* universal address */
   };

   The netaddr4 structure is used to identify TCP/IP based endpoints.
   The r_netid and r_addr fields are specified in RFC1833 [19], but they
   are underspecified in RFC1833 [19] as far as what they should look
   like for specific protocols.

   For TCP over IPv4 and for UDP over IPv4, the format of r_addr is the
   US-ASCII string:

   h1.h2.h3.h4.p1.p2

   The prefix, "h1.h2.h3.h4", is the standard textual form for
   representing an IPv4 address, which is always four octets long.
   Assuming big-endian ordering, h1, h2, h3, and h4, are respectively,
   the first through fourth octets each converted to ASCII-decimal.
   Assuming big-endian ordering, p1 and p2 are, respectively, the first
   and second octets each converted to ASCII-decimal.  For example, if a
   host, in big-endian order, has an address of 0x0A010307 and there is
   a service listening on, in big endian order, port 0x020F (decimal
   527), then the complete universal address is "10.1.3.7.2.15".

   For TCP over IPv4 the value of r_netid is the string "tcp".  For UDP
   over IPv4 the value of r_netid is the string "udp".
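
   The IPv4 universal address format described above can be sketched
   as follows.  This is a minimal illustration with hypothetical helper
   names; it assumes the host part is already in dotted-decimal form
   and validates only the octet ranges.

```python
def format_uaddr_ipv4(host, port):
    """Build "h1.h2.h3.h4.p1.p2" from a dotted IPv4 host and a port.

    p1 and p2 are the high and low octets of the 16-bit port,
    each written in ASCII-decimal.
    """
    if not 0 <= port <= 0xFFFF:
        raise ValueError("port must fit in 16 bits")
    return f"{host}.{port >> 8}.{port & 0xFF}"


def parse_uaddr_ipv4(uaddr):
    """Split a universal address back into (host, port)."""
    parts = uaddr.split(".")
    if len(parts) != 6:
        raise ValueError("expected six dot-separated fields")
    octets = [int(p) for p in parts]
    if any(not 0 <= o <= 255 for o in octets):
        raise ValueError("each field must be a single octet")
    host = ".".join(parts[:4])
    port = (octets[4] << 8) | octets[5]
    return host, port


# Address 0x0A010307 with a service on port 0x020F (decimal 527).
uaddr = format_uaddr_ipv4("10.1.3.7", 0x020F)
```

   For the address and port used in the example in the text, the sketch
   produces "10.1.3.7.2.15", and parsing that string recovers port 527.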

   For TCP over IPv6 server
   implementations have assumed such a model and for UDP over IPv6, thus scale the format
   implementation of r_addr is the
   US-ASCII string:

   x1:x2:x3:x4:x5:x6:x7:x8.p1.p2

   The suffix "p1.p2" is TCP connection management in proportion to the service port, and is computed
   number of expected client machines.  NFS version 4.1 will not modify
   this connection management model.  NFS version 4.1 clients that
   violate this assumption can expect scaling issues on the same way
   as with universal addresses for TCP server and UDP over IPv4.  The prefix,
   "x1:x2:x3:x4:x5:x6:x7:x8", is the standard textual form
   hence reduced service.

   Note that for
   representing an IPv6 address as defined in Section 2.2 of RFC1884
   [9].  Additionally, various timers, the two alternative forms specified in Section
   2.2 client and server should avoid
   inadvertent synchronization of RFC1884 [9] are also acceptable. those timers.  For TCP over IPv6 the value further discussion
   of r_netid is the string "tcp6".  For UDP general issue refer to [Floyd].

3.1.1.  Client Retransmission Behavior

   When processing a request received over IPv6 the value of r_netid is the string "udp6".

2.2.11.  clientaddr4

   typedef netaddr4 clientaddr4;

   The clientaddr4 structure is used a reliable transport such as part of
   TCP, the SETCLIENTID
   operation to either specify NFS version 4.1 server MUST NOT silently drop the address of request,
   except if the client that is using transport connection has been broken.  Given such a
   clientid
   contract between NFS version 4.1 clients and servers, clients MUST
   NOT retry a request unless one or as part both of the callback registration.

2.2.12.  cb_client4

   struct cb_client4 {
       unsigned int  cb_program;
       netaddr4      cb_location;
   };

   This structure following are true:

   o  The transport connection has been broken

   o  The procedure being retried is used by the client to NULL procedure

   Since reliable transports, such as TCP, do not always synchronously
   inform a peer when the other peer has broken the connection (for
   example, when an NFS server of its call
   back address; includes reboots), the program number and NFS version 4.1 client address.

2.2.13.  nfs_client_id4

   struct nfs_client_id4 {
       verifier4     verifier;
       opaque        id<NFS4_OPAQUE_LIMIT>
   };

   This structure is part of the arguments may
   want to actively "probe" the SETCLIENTID operation.
   NFS4_OPAQUE_LIMIT is defined as 1024.

2.2.14.  open_owner4

   struct open_owner4 {
       clientid4     clientid;
       opaque        owner<NFS4_OPAQUE_LIMIT>
   };

   This structure is used connection to identify the owner see if has been broken.
   Use of open state.
   NFS4_OPAQUE_LIMIT is defined as 1024.

2.2.15.  lock_owner4

   struct lock_owner4 {
       clientid4     clientid;
       opaque        owner<NFS4_OPAQUE_LIMIT>
   };

   This structure is used to identify the owner of file locking state.
   NFS4_OPAQUE_LIMIT is defined as 1024.

2.2.16.  open_to_lock_owner4

   struct open_to_lock_owner4 {
       seqid4          open_seqid;
       stateid4        open_stateid;
       seqid4          lock_seqid;
       lock_owner4     lock_owner;
   };

   This structure NULL procedure is used for the first LOCK operation done for an
   open_owner4.  It provides both the open_stateid and lock_owner such
   that one recommended way to do so.  So, when
   a client experiences a remote procedure call timeout (of some
   arbitrary implementation specific amount), rather than retrying the transition is made from
   remote procedure call, it could instead issue a valid open_stateid sequence NULL procedure call
   to
   that of the new lock_stateid sequence.  Using this mechanism avoids server.  If the confirmation of server has died, the lock_owner/lock_seqid pair since it is tied transport connection
   break will eventually be indicated to established state in the form of NFS version 4.1 client.
   The client can then reconnect, and then retry the open_stateid/open_seqid.

2.2.17.  stateid4

   struct stateid4 {
       uint32_t        seqid;
       opaque          other[12];
   };

   This structure is used for original request.
   If the various state sharing mechanisms
   between NULL procedure call gets a response, the connection has not
   broken.  The client can decide to wait longer for the original
   request's response, or it can break the transport connection and server.
   reconnect before re-sending the original request.

   For callbacks from the server to the client, this data structure
   is read-only.  The starting value of the seqid field is undefined.
   The same rules apply,
   but the server is required to increment doing the seqid field monotonically at
   each transition of callback becomes the stateid.  This is important since client, and the client
   will inspect
   receiving the seqid in OPEN stateids to determine callback becomes the order server.

3.2.  Security Flavors

   Traditional RPC implementations have included AUTH_NONE, AUTH_SYS,
   AUTH_DH, and AUTH_KRB4 as security flavors.  With RFC2203 [5] an
   additional security flavor of
   OPEN processing done RPCSEC_GSS has been introduced which
   uses the functionality of GSS-API RFC2743 [8].  This allows for the
   use of various security mechanisms by the server.

2.2.18.  layouttype4

   enum layouttype4 {
       LAYOUT_NFSV4_FILES  = 1,
       LAYOUT_OSD2_OBJECTS = 2,
       LAYOUT_BLOCK_VOLUME = 3
   };

   A layout type specifies RPC layer without the layout being used.  The implication is
   that clients have "layout drivers" that support one or more layout
   types.  The file server advertises
   additional implementation overhead of adding RPC security flavors.
   For NFS version 4, the layout types it supports
   through RPCSEC_GSS security flavor MUST be implemented
   to enable the LAYOUT_TYPES file system attribute.  A client asks mandatory security mechanism.  Other flavors, such as,
   AUTH_NONE, AUTH_SYS, and AUTH_DH MAY be implemented as well.

3.2.1.  Security mechanisms for
   layouts NFS version 4

   The use of a particular type in LAYOUTGET, RPCSEC_GSS requires selection of: mechanism, quality of
   protection, and passes those layouts
   to its layout driver.

   The layouttype4 structure is 32 bits in length. service (authentication, integrity, privacy).  The range
   represented by the layout type is split into two parts.  Types within
   the range 0x00000000-0x7FFFFFFF are globally unique and are assigned
   according
   remainder of this document will refer to these three parameters of
   the description in Section 24.1; they are maintained by
   IANA.  Types within the range 0x8000000-0xFFFFFFFF are site specific
   and for "private use" only.

   The LAYOUT_NFSV4_FILES enumeration specifies that RPCSEC_GSS security as the NFSv4 file
   layout type is to be used. security triple.

3.2.1.1.  Kerberos V5

   The LAYOUT_OSD2_OBJECTS enumeration
   specifies that the object layout, Kerberos V5 GSS-API mechanism as defined described in [20], is to RFC1964 [6] MUST be used.
   Similarly, the LAYOUT_BLOCK_VOLUME enumeration
   implemented.

    column descriptions:
    1 == number of pseudo flavor
    2 == name of pseudo flavor
    3 == mechanism's OID
    4 == RPCSEC_GSS service

    1      2     3                    4
    --------------------------------------------------------------------
    390003 krb5  1.2.840.113554.1.2.2 rpc_gss_svc_none
    390004 krb5i 1.2.840.113554.1.2.2 rpc_gss_svc_integrity
    390005 krb5p 1.2.840.113554.1.2.2 rpc_gss_svc_privacy

   Note that the block/volume
   layout, as defined in [21], pseudo flavor is presented here as a mapping aid to be used.

2.2.19.  deviceid4

   typedef uint32_t deviceid4;  /* 32-bit device ID */

   Layout information the
   implementor.  Because this NFS protocol includes device IDs that specify a storage device
   through a compact handle.  Addressing method to
   negotiate security and type information is
   obtained with it understands the GETDEVICEINFO operation.  A client must GSS-API mechanism, the
   pseudo flavor is not assume
   that device IDs are valid across metadata server reboots. needed.  The device
   ID pseudo flavor is qualified by the layout type and are unique per file system
   (FSID).  This allows different layout drivers to generate device IDs
   without the need for co-ordination.  See Section 15.3.1.4 needed for more
   details.

2.2.20.  devlist_item4

   struct devlist_item4 {
           deviceid4          dli_id;
           opaque             dli_device_addr<>;
   };

   An array of these values is returned by NFS
   version 3 since the GETDEVICELIST operation.
   They define security negotiation is done via the set MOUNT
   protocol.

   For a discussion of devices associated with NFS' use of RPCSEC_GSS and Kerberos V5, please
   see RFC2623 [21].

3.2.1.2.  LIPKEY as a file system for the
   layout type specified security triple

   The LIPKEY GSS-API mechanism as described in RFC2847 [7] MUST be
   implemented and provide the GETDEVICELIST4args. following security triples.  The device address is used to set up a communication channel with the
   storage device.  Different layout types will require different types
   definition of structures to define how they communicate with storage devices. the columns matches the previous subsection "Kerberos
   V5 as security triple"

    1      2        3                   4
    --------------------------------------------------------------------
    390006 lipkey   1.3.6.1.5.5.9       rpc_gss_svc_none
    390007 lipkey-i 1.3.6.1.5.5.9       rpc_gss_svc_integrity
    390008 lipkey-p 1.3.6.1.5.5.9       rpc_gss_svc_privacy

3.2.1.3.  SPKM-3 as a security triple

   The opaque device_addr field must SPKM-3 GSS-API mechanism as described in RFC2847 [7] MUST be interpreted based on
   implemented and provide the
   specified layout type.

   This document defines following security triples.  The
   definition of the device address for columns matches the NFSv4 file layout
   (struct netaddr4 (Section 2.2.10)), which identifies previous subsection "Kerberos
   V5 as security triple".

    1      2        3                   5
    --------------------------------------------------------------------
    390009 spkm3    1.3.6.1.5.5.1.3     rpc_gss_svc_none
    390010 spkm3i   1.3.6.1.5.5.1.3     rpc_gss_svc_integrity
    390011 spkm3p   1.3.6.1.5.5.1.3     rpc_gss_svc_privacy

3.3.  Security Negotiation

   With the NFS version 4 server potentially offering multiple security
   mechanisms, the client needs a storage device
   by network IP address and port number.  This method to determine or negotiate which
   mechanism is sufficient for the
   clients to communicate be used for its communication with the NFSv4 storage devices, and server.  The
   NFS server may be
   sufficient for other layout types as well.  Device types have multiple points within its file system name space
   that are available for object
   storage devices and block storage devices (e.g., SCSI volume labels)
   will be defined use by their respective layout specifications.

2.2.21.  layout4

   struct layout4 {
       offset4                 lo_offset;
       length4                 lo_length;
       layoutiomode4           lo_iomode;
       layouttype4             lo_type;
       opaque                  lo_layout<>;
   };

   The layout4 structure defines a layout for a file. NFS clients.  In turn the NFS server
   may be configured such that each of these entry points may have
   different or multiple security mechanisms in use.

   The layout type
   specific data is opaque within this structure security negotiation between client and server must be
   interepreted based on the layout type.  Currently, only the NFSv4
   file layout type is defined; see Section 15.4.1 for its definition.
   Since layouts are sub-dividable, the offset and length together done with
   a secure channel to eliminate the file's filehandle, possibility of a third party
   intercepting the clientid, iomode, negotiation sequence and layout type,
   identifies the layout.

2.2.22.  layoutupdate4

   struct layoutupdate4 {
       layouttype4             lou_type;
       opaque                  lou_data<>;
   };

   The layoutupdate4 structure is used by forcing the client to return 'updated'
   layout information to the metadata and
   server at LAYOUTCOMMIT time.  This
   structure provides a channel to pass layout type specific information
   back to the metadata server.  E.g., for block/volume layout types
   this could include the list of reserved blocks that were written.
   The contents choose a lower level of security than required or desired.
   See the opaque lou_data argument are determined by the
   layout type section "Security Considerations" for further discussion.

3.3.1.  SECINFO and are defined in their context. SECINFO_NO_NAME

   The NFSv4 file-based
   layout does not use this structure, thus SECINFO and SECINFO_NO_NAME operations allow the update_data field should
   have client to
   determine, on a zero length.

2.2.23.  layouthint4

   struct layouthint4 {
       layouttype4           loh_type;
       opaque                loh_data<>;
   };

   The layouthint4 structure per filehandle basis, what security triple is to be
   used by for server access.  In general, the client will not have to pass in a hint
   about the type of layout it would like created for a particular file.
   It is the structure specified by use
   either operation except during initial communication with the FILE_LAYOUT_HINT attribute
   described below.  The metadata server may ignore the hint,
   or may
   selectively ignore fields within when the hint.  This hint should be
   provided client crosses policy boundaries at create time as part of the initial attributes within
   OPEN.  The NFSv4 file-based layout uses server.  It is
   possible that the "nfsv4_file_layouthint"
   structure as defined in Section 15.4.1.

2.2.24.  layoutiomode4

   enum layoutiomode4 {
       LAYOUTIOMODE_READ          = 1,
       LAYOUTIOMODE_RW            = 2,
       LAYOUTIOMODE_ANY           = 3
   };

   The iomode specifies whether server's policies change during the client's
   interaction therefore forcing the client intends to read or write
   (with negotiate a new security
   triple.

3.3.2.  Security Error

   Based on the possibility assumption that each NFS version 4 client and server
   must support a minimum set of reading) security (i.e., LIPKEY, SPKM-3, and
   Kerberos-V5 all under RPCSEC_GSS), the data represented by NFS client will start its
   communication with the layout.
   The ANY iomode MUST NOT be used for LAYOUTGET, however, it can be
   used for LAYOUTRETURN and LAYOUTRECALL.  The ANY iomode specifies
   that layouts pertaining to both READ and RW iomodes are being
   returned or recalled, respectively.  The metadata server's use server with one of the
   iomode may depend on minimal security
   triples.  During communication with the layout type being used.  The storage devices
   may validate I/O accesses against server, the iomode and reject invalid
   accesses.

2.2.25.  nfs_impl_id4

   struct nfs_impl_id4 {
       utf8str_cis   nii_domain;
       utf8str_cs    nii_name;
       nfstime4      nii_date;
   };

   This structure is used to identify client and may
   receive an NFS error of NFS4ERR_WRONGSEC.  This error allows the
   server implementation
   detail.  The nii_domain field is to notify the DNS domain name client that the
   implementer security triple currently being
   used is associated with. not appropriate for access to the server's file system
   resources.  The nii_name field client is then responsible for determining what
   security triples are available at the server and choose one which is
   appropriate for the product
   name client.  See the section for the "SECINFO"
   operation for further discussion of how the implementation client will respond to
   the NFS4ERR_WRONGSEC error and is completely free form. use SECINFO.

3.3.3.  Callback RPC Authentication

   Callback authentication has changed in NFSv4.1 from NFSv4.0.

   NFSv4.0 required the NFS server to create a security context for
   RPCSEC_GSS, AUTH_DH, and AUTH_KERB4, and any other security flavor
   that had a security context.  It is
   encouraged also required that principal issuing
   the nii_name callback be used to distinguish machine
   architecture, machine platforms, revisions, versions, the same as the principal that accepted the callback
   parameters (via SETCLIENTID), and patch
   levels.  The nii_date field is that the timestamp of when client principal accepting
   the software
   instance was published or built.

2.2.26.  impl_ident4

   struct impl_ident4 {
       clientid4           ii_clientid;
       struct nfs_impl_id4 ii_impl_id;
   }; callback be the same as that which issued the SETCLIENTID.  This is used for exchanging implementation identification between
   required the NFS client and server.

2.2.27.  threshold_item4

   struct threshold_item4 {
           layouttype4     thi_layout_type;
           bitmap4         thi_hintset;
           opaque          thi_hintlist<>;
   };

   This structure contains a list of hints specific to have an assigned machine credential.
   NFSv4.1 does not require a layout type for
   helping machine credential.  Instead, NFSv4.1
   allows an RPCSEC_GSS security context initiated by the client determine when it should issue I/O directly
   through and
   eswtablished on both the metadata client and server vs. to be used on callback
   RPCs sent by the data servers. server to the client.  The hint structure
   consists of BIND_BACKCHANNEL
   operation is used establish RPCSEC_GSS contexts (if the layout type, a bitmap describing client so
   desires) on the set server.  No support for AUTH_DH, or AUTH_KERB4 is
   specified.

3.3.4.  GSS Server Principal

   Regardless of hints
   supported by what security mechanism under RPCSEC_GSS is being used,
   the NFS server, they may differ based on the layout type,
   and MUST identify itself in GSS-API via a list
   GSS_C_NT_HOSTBASED_SERVICE name type.  GSS_C_NT_HOSTBASED_SERVICE
   names are of hints, whose structure is determined by the hintset
   bitmap.  See form:

   service@hostname

   For NFS, the mdsthreshold attribute for more details.

   The hintset "service" element is a bitmap

   nfs

   Implementations of security mechanisms will convert nfs@hostname to
   various different forms.  For Kerberos V5, LIPKEY, and SPKM-3, the
   following values:

   +-------------------------+---+---------+---------------------------+
   | name                    | # | Data    | Description               |
   |                         |   | Type    |                           |
   +-------------------------+---+---------+---------------------------+
   | threshold4_read_size    | 0 | length4 | The file size below which |
   |                         |   |         | it form is recommended to read |
   |                         |   |         | data through the MDS.     |
   | threshold4_write_size   | 1 | length4 | RECOMMENDED:

   nfs/hostname

4.  Security Negotiation

   The file size below which |
   |                         |   |         | it is recommended NFSv4.0 specification contains three oversights and ambiguities
   with respect to      |
   |                         |   |         | write data through the    |
   |                         |   |         | MDS.                      |
   | threshold4_read_iosize  | 2 | length4 | For read I/O sizes below  |
   |                         |   |         | this threshold SECINFO operation.

   First, it is      |
   |                         |   |         | recommended impossible for the client to read data  |
   |                         |   |         | through use the MDS           |
   | threshold4_write_iosize | 3 | length4 | For write I/O sizes below |
   |                         |   |         | this threshold it is      |
   |                         |   |         | recommended SECINFO operation
   to write data |
   |                         |   |         | through determine the MDS           |
   +-------------------------+---+---------+---------------------------+

2.2.28.  mdsthreshold4

   struct mdsthreshold4 {
           threshold_item4 mth_hints<>;
   };

   This structure holds an array of threshold_item4 structures each of
   which is valid correct security triple for accessing a particular layout type.  An array parent
   directory.  This is necessary
   since a server can support multiple layout types for a single file.

3.  RPC because SECINFO takes as arguments the current
   file handle and Security Flavor

   The NFS version 4.1 protocol is a Remote Procedure Call (RPC)
   application that component name.  However, NFSv4.0 uses RPC version 2 and the corresponding eXternal
   Data Representation (XDR) as defined in RFC1831 [4] and RFC4506 [3].
   The RPCSEC_GSS security flavor as defined in RFC2203 [5] MUST be used
   as the mechanism LOOKUPP
   operation to deliver stronger get the parent directory of the current filehandle.  If
   the client uses the wrong security for when issuing the NFS version 4
   protocol.

3.1.  Ports and Transports

   LOOKUPP, and gets back an NFS4ERR_WRONGSEC error, SECINFO is useless
   to the client.  The client is left with guessing which security the
   server will accept.  This defeats the purpose of SECINFO, which was
   to provide an efficient method of negotiating security.

   Second, there is ambiguity as to what the server should do when it is
   passed a LOOKUP operation such that the server restricts access to
   the current file handle with one security triple, and access to the
   component with a different triple, and the remote procedure call uses
   one of the security triples.  Should the server allow the LOOKUP?

   Third, there is a problem as to what the client must do (or can do),
   whenever the server returns NFS4ERR_WRONGSEC in response to a PUTFH
   operation.  The NFSv4.0 specification says that the client should
   issue a SECINFO using the parent filehandle and the component name of
   the filehandle that PUTFH was issued with.  This may not be
   convenient for the client.

   This document resolves the above three issues in the context of
   NFSv4.1.

   Historically, NFS version 2 and version 3 servers have resided on
   port 2049.  The registered port 2049 RFC3232 [22] for the NFS
   protocol should be the default configuration.  NFSv4 clients SHOULD
   NOT use the RPC binding protocols as described in RFC1833 [19].

   Where an NFS version 4 implementation supports operation over the IP
   network protocol, the supported transports between NFS and IP MUST
   have the following two attributes:

   1.  The transport must support reliable delivery of data in the order
       it was sent.

   2.  The transport must be among the IETF-approved congestion control
       transport protocols.

   At the time this document was written, the only two transports that
   had the above attributes were TCP and SCTP.  To enhance the
   possibilities for interoperability, an NFS version 4 implementation
   MUST support operation over the TCP transport protocol.

   If TCP is used as the transport, the client and server SHOULD use
   persistent connections for at least two reasons:

   1.  This will prevent the weakening of TCP's congestion control via
       short lived connections and will improve performance for the WAN
       environment by eliminating the need for SYN handshakes.

   2.  The NFSv4.1 callback model has changed from NFSv4.0, and requires
       the client and server to maintain a client-created channel for
       the server to use.

   As noted in the Security Considerations section, the authentication
   model for NFS version 4 has moved from machine-based to principal-
   based.  However, this modification of the authentication model does
   not imply a technical requirement to move the transport connection
   management model from whole machine-based to one based on a per user
   model.  In particular, NFS over TCP client implementations have
   traditionally multiplexed traffic for multiple users over a common
   TCP connection between an NFS client and server.  This has been true,
   regardless of whether the NFS client is using AUTH_SYS, AUTH_DH,
   RPCSEC_GSS or any other flavor.  Similarly, NFS over TCP server
   implementations have assumed such a model and thus scale the
   implementation of TCP connection management in proportion to the
   number of expected client machines.  NFS version 4.1 will not modify
   this connection management model.  NFS version 4.1 clients that
   violate this assumption can expect scaling issues on the server and
   hence reduced service.

   Note that for various timers, the client and server should avoid
   inadvertent synchronization of those timers.  For further discussion
   of the general issue refer to [Floyd].

5.  Clarification of Security Negotiation in NFSv4.1

   This section attempts to clarify NFSv4.1 security negotiation issues.
   Unless noted otherwise, for any mention of PUTFH in this section, the
   reader should interpret it as applying to PUTROOTFH and PUTPUBFH in
   addition to PUTFH.

5.1.  PUTFH + LOOKUP

   The server implementation may decide whether to impose any
   restrictions on export security administration.  There are at least
   three approaches (Sc is the flavor set of the child export, Sp that
   of the parent),

     a)  Sc <= Sp (<= for subset)

     b)  Sc ^ Sp != {} (^ for intersection, {} for the empty set)

     c)  free form
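
   As a non-normative illustration, approaches a and b can be written as
   set relations between the two flavor sets.  The function names and
   the string flavor labels below are assumptions made for this sketch;
   Python spells intersection as "&" where the list above uses "^".

```python
# Illustrative sketch only: classify how a child export's flavor set (Sc)
# relates to its parent's flavor set (Sp).  All names are hypothetical.

def is_subset_policy(sc: frozenset, sp: frozenset) -> bool:
    """Approach a): every child flavor is also a parent flavor."""
    return sc <= sp


def is_intersect_policy(sc: frozenset, sp: frozenset) -> bool:
    """Approach b): child and parent share at least one flavor."""
    return len(sc & sp) > 0


parent = frozenset({"krb5", "krb5i"})
child = frozenset({"krb5", "krb5p"})
# child is not a subset of parent, but the two sets do intersect.
child_is_subset = is_subset_policy(child, parent)
child_intersects = is_intersect_policy(child, parent)
```

   Approach c places no constraint on the two sets, so it needs no
   check at all.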

   To support b (when the client chooses a flavor that is not a member
   of Sp) and c, PUTFH must NOT return NFS4ERR_WRONGSEC in case of
   security mismatch.  Instead, it should be returned from the LOOKUP
   that follows.

   Since the above guideline does not contradict a, it should be
   followed in general.

5.2.  PUTFH + LOOKUPP

   Since SECINFO only works its way down, there is no way LOOKUPP can
   return NFS4ERR_WRONGSEC without the server implementing
   SECINFO_NO_NAME.  SECINFO_NO_NAME solves this issue because via style
   "parent", it works in the opposite direction as SECINFO (component
   name is implicit in this case).

5.3.  PUTFH + SECINFO

   This case should be treated specially.

   A security sensitive client should be allowed to choose a strong
   flavor when querying a server to determine a file object's permitted
   security flavors.  The security flavor chosen by the client does not
   have to be included in the flavor list of the export.  Of course the
   server has to be configured for whatever flavor the client selects,
   otherwise the request will fail at RPC authentication.

   In theory, there is no connection between the security flavor used by
   SECINFO and those supported by the export.  But in practice, the
   client may start looking for strong flavors from those supported by
   the export, followed by those in the mandatory set.

5.4.  PUTFH + Anything Else

   PUTFH must return NFS4ERR_WRONGSEC in case of security mismatch.
   This is the most straightforward approach without having to add
   NFS4ERR_WRONGSEC to every other operation.

   PUTFH + SECINFO_NO_NAME (style "current_fh") is needed for the client
   to recover from NFS4ERR_WRONGSEC.

3.1.1.  Client Retransmission Behavior

   When processing a request received over a reliable transport such as
   TCP, the NFS version 4.1 server MUST NOT silently drop the request,
   except if the transport connection has been broken.  Given such a
   contract between NFS version 4.1 clients and servers, clients MUST
   NOT retry a request unless one or both of the following are true:

   o  The transport connection has been broken

   o  The procedure being retried is the NULL procedure

   Since reliable transports, such as TCP, do not always synchronously
   inform a peer when the other peer has broken the connection (for
   example, when an NFS server reboots), the NFS version 4.1 client may
   want to actively "probe" the connection to see if it has been broken.
   Use of the NULL procedure is one recommended way to do so.  So, when
   a client experiences a remote procedure call timeout (of some
   arbitrary implementation specific amount), rather than retrying the
   remote procedure call, it could instead issue a NULL procedure call
   to the server.  If the server has died, the transport connection
   break will eventually be indicated to the NFS version 4.1 client.
   The client can then reconnect, and then retry the original request.
   If the NULL procedure call gets a response, the connection has not
   broken.  The client can decide to wait longer for the original
   request's response, or it can break the transport connection and
   reconnect before re-sending the original request.

   For callbacks from the server to the client, the same rules apply,
   but the server doing the callback becomes the client, and the client
   receiving the callback becomes the server.

3.2.  Security Flavors

   Traditional RPC implementations have included AUTH_NONE, AUTH_SYS,
   AUTH_DH, and AUTH_KRB4 as security flavors.  With RFC2203 [5] an
   additional security flavor of RPCSEC_GSS has been introduced which
   uses the functionality of GSS-API RFC2743 [8].  This allows for the
   use of various security mechanisms by the RPC layer without the
   additional implementation overhead of adding RPC security flavors.
   For NFS version 4, the RPCSEC_GSS security flavor MUST be implemented
   to enable the mandatory security mechanism.  Other flavors, such as,
   AUTH_NONE, AUTH_SYS, and AUTH_DH MAY be implemented as well.

3.2.1.  Security mechanisms for NFS version 4

   The use of RPCSEC_GSS requires selection of: mechanism, quality of
   protection, and service (authentication, integrity, privacy).  The
   remainder of this document will refer to these three parameters of
   the RPCSEC_GSS security as the security triple.
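
   As a non-normative illustration, a security triple can be modeled as
   a small record holding the three RPCSEC_GSS parameters (mechanism,
   quality of protection, service).  The type and field names are
   assumptions of this sketch.  The OID and service names are the
   Kerberos V5 values listed in this document, and the numeric quality
   of protection value is only a placeholder.

```python
from typing import NamedTuple


class SecurityTriple(NamedTuple):
    """Hypothetical record for one RPCSEC_GSS security triple."""
    mechanism_oid: str  # GSS-API mechanism, written as a dotted OID
    qop: int            # quality of protection (placeholder value below)
    service: str        # RPCSEC_GSS service name


# Two triples that share a mechanism but differ in the service requested.
krb5i = SecurityTriple("1.2.840.113554.1.2.2", 0, "rpc_gss_svc_integrity")
krb5p = krb5i._replace(service="rpc_gss_svc_privacy")
```

   Because the record compares by value, two triples are equal only when
   all three parameters match.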

3.2.1.1.  Kerberos V5

   The Kerberos V5 GSS-API mechanism as described in RFC1964 [6] MUST be
   implemented.

    column descriptions:
    1 == number of pseudo flavor
    2 == name of pseudo flavor
    3 == mechanism's OID
    4 == RPCSEC_GSS service

    1      2     3                    4
    --------------------------------------------------------------------
    390003 krb5  1.2.840.113554.1.2.2 rpc_gss_svc_none
    390004 krb5i 1.2.840.113554.1.2.2 rpc_gss_svc_integrity
    390005 krb5p 1.2.840.113554.1.2.2 rpc_gss_svc_privacy

   Note that the pseudo flavor is presented here as a mapping aid to the
   implementor.  Because this NFS protocol includes a method to
   negotiate security and it understands the GSS-API mechanism, the
   pseudo flavor is not needed.  The pseudo flavor is needed for NFS
   version 3 since the security negotiation is done via the MOUNT
   protocol.

   For a discussion of NFS' use of RPCSEC_GSS and Kerberos V5, please
   see RFC2623 [23].

3.2.1.2.  LIPKEY as a security triple

   The LIPKEY GSS-API mechanism as described in RFC2847 [7] MUST be
   implemented and provide the following security triples.  The
   definition of the columns matches the previous subsection "Kerberos
   V5 as security triple".

    1      2        3                   4
    --------------------------------------------------------------------
    390006 lipkey   1.3.6.1.5.5.9       rpc_gss_svc_none
    390007 lipkey-i 1.3.6.1.5.5.9       rpc_gss_svc_integrity
    390008 lipkey-p 1.3.6.1.5.5.9       rpc_gss_svc_privacy

3.2.1.3.  SPKM-3 as a security triple

   The SPKM-3 GSS-API mechanism as described in RFC2847 [7] MUST be
   implemented and provide the following security triples.  The
   definition of the columns matches the previous subsection "Kerberos
   V5 as security triple".

    1      2        3                   4
    --------------------------------------------------------------------
    390009 spkm3    1.3.6.1.5.5.1.3     rpc_gss_svc_none
    390010 spkm3i   1.3.6.1.5.5.1.3     rpc_gss_svc_integrity
    390011 spkm3p   1.3.6.1.5.5.1.3     rpc_gss_svc_privacy

3.3.  Security Negotiation

   With the NFS version 4 server potentially offering multiple security
   mechanisms, the client needs a method to determine or negotiate which
   mechanism is to be used for its communication with the server.  The
   NFS server may have multiple points within its file system name space
   that are available for use by NFS clients.  In turn the NFS server
   may be configured such that each of these entry points may have
   different or multiple security mechanisms in use.

   The security negotiation between client and server must be done with
   a secure channel to eliminate the possibility of a third party
   intercepting the negotiation sequence and forcing the client and
   server to choose a lower level of security than required or desired.
   See the section "Security Considerations" for further discussion.

6.  NFSv4.1 Sessions

6.1.  Sessions Background

6.1.1.  Introduction to Sessions

   [[Comment.1: Noveck: Anyway, I think that trying to hack at the
   existing text is basically hopeless.  I think you have to figure out
   what a new chapter (on sessions or basic protocol structure) should
   say and then write it, pulling in text from the existing chapter when
   appropriate.  Apart from the issues you have found, that document was
   written with a whole different purpose in mind.  It discusses the
   sessions "feature" and justifies it and talks about integrating it
   into v4.0, etc.  Instead, it is not a feature but is a basic
   underpinning of v4.1 and we just explain what client and server need
   to do, and some why but it is why this works not why we have made
   these design choices vs. others we might have made.  It's a totally
   different story and I don't think you can get there incrementally.]]

   NFSv4.1 adds extensions which allow NFSv4 to support sessions and
   endpoint management, and to support operation atop RDMA-capable RPC
   over transports such as iWARP [RDMAP, DDP].  These extensions enable
   support for exactly-once semantics by NFSv4 servers, multipathing and
   trunking of transport connections, and enhanced security.  The
   ability to operate over RDMA enables greatly enhanced performance.
   Operation over existing TCP is enhanced as well.

   While discussed here with respect to IETF-chartered transports, the
   intent is that NFSv4.1 will function over other standards, such as
   Infiniband [IB].

   The following are the major aspects of the session feature:

   o  An explicit session is introduced to NFSv4, and new operations are
      added to support it.  The session allows for enhanced trunking,
      failover and recovery, and support for RDMA.  The session is
      implemented as operations within NFSv4 COMPOUND and does not
      impact layering or interoperability with existing NFSv4
      implementations.  The NFSv4 callback channel is dynamically
      associated and is connected by the client and not the server,
      enhancing security and operation through firewalls.  [[Comment.2:
      XXX is the following true:]] In fact, the callback channel will be
      enabled to share the same connection as the operations channel.

   o  An enhanced RPC layer enables NFSv4 operation atop RDMA.  The
      session assists RDMA-mode connection, and additional facilities
      are provided for managing RDMA resources at both NFSv4 server and
      client.  Existing NFSv4 operations continue to function as before,
      though certain size limits are negotiated.  A companion draft to
      this specification, "RDMA Transport for ONC RPC" [RPCRDMA] is to
      be referenced for details of RPC RDMA support.

   o  Support for exactly-once semantics ("EOS") is enabled by the new
      session facilities, by providing to the server a way to bound the
      size of the duplicate request cache for a single client, and to
      manage its persistent storage.
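
   The bounded duplicate request cache named in the last item can be
   sketched as follows.  This is a non-normative illustration only: the
   class name, the use of a "slot" index as the cache key, and the
   error handling are assumptions of this example and are not defined
   by this text.  It shows only that a negotiated maximum request count
   and maximum response size together give a fixed storage bound.

```python
# Hypothetical sketch of a per-session reply cache whose size is bounded
# by negotiated limits.  The slot index used as a key is an assumption.

class SessionReplyCache:
    def __init__(self, max_requests: int, max_response_size: int):
        self.max_requests = max_requests
        self.max_response_size = max_response_size
        self._replies = {}  # slot index -> last cached reply bytes

    def storage_bound(self) -> int:
        """Upper bound on cached bytes: entry count times largest reply."""
        return self.max_requests * self.max_response_size

    def store(self, slot: int, reply: bytes) -> None:
        if not 0 <= slot < self.max_requests:
            raise ValueError("slot outside negotiated request limit")
        if len(reply) > self.max_response_size:
            raise ValueError("reply exceeds negotiated maximum size")
        self._replies[slot] = reply  # replaces any earlier reply in the slot

    def lookup(self, slot: int):
        """Return the cached reply for a slot, or None if there is none."""
        return self._replies.get(slot)


cache = SessionReplyCache(max_requests=4, max_response_size=1024)
cache.store(0, b"reply-0")
```

   Since each slot holds at most one reply, the cache can never grow
   past the value returned by storage_bound().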

                                   Block Diagram

             +-----------------+-------------------------------------+
             |     NFSv4       |     NFSv4 + session extensions      |
             +-----------------+------+----------------+-------------+
             |      Operations        |   Session      |             |
             +------------------------+----------------+             |
             |                RPC/XDR                  |             |
             +-------------------------------+---------+             |
             |       Stream Transport        |    RDMA Transport     |
             +-------------------------------+-----------------------+

6.1.2.  Session Model

   A session is a dynamically created, long-lived server object created
   by a client, used over time from one or more transport connections.
   Its function is to maintain the server's state relative to the
   connection(s) belonging to a client instance.  This state is entirely
   independent of the connection itself.  The session in effect becomes
   the object representing an active client on a connection or set of
   connections.

   Clients may create multiple sessions for a single clientid, and may
   wish to do so for optimization of transport resources, buffers, or
   server behavior.  A session could be created by the client to
   represent a single mount point, for separate read and write
   "channels", or for any number of other client-selected parameters.

   The session enables several things immediately.  Clients may
   disconnect and reconnect (voluntarily or not) without loss of context
   at the server.  (Of course, locks, delegations and related
   associations require special handling, and generally expire in the
   extended absence of an open connection.)  Clients may connect
   multiple transport endpoints to this common state.  The endpoints may
   have all the same attributes, for instance when trunked on multiple
   physical network links for bandwidth aggregation or path failover.
   Or, the endpoints can have specific, special purpose attributes such
   as callback channels.

   The NFSv4.0 specification does not provide for any form of flow
   control; instead it relies on the windowing provided by TCP to
   throttle requests.  This unfortunately does not work with RDMA, which
   in general provides no operation flow control and will terminate a
   connection in error when limits are exceeded.  Limits are therefore
   exchanged when a session is created.  These limits then provide
   maxima within which each session's connections must operate; they are
   managed within these limits as described in [RPCRDMA].  The limits
   may also be modified dynamically at the server's choosing by
   manipulating certain parameters present in each NFSv4.1 request.

   The presence of a maximum request limit on the session bounds the
   requirements of the duplicate request cache.  This can be used by a
   server to accurately determine any storage needs, to enable it to
   maintain duplicate request cache persistence, and to provide reliable
   exactly-once semantics.

3.3.1.  SECINFO and SECINFO_NO_NAME

   The SECINFO and SECINFO_NO_NAME operations allow the client to
   determine, on a per filehandle basis, what security triple is to be
   used for server access.  In general, the client will not have to use
   either operation except during initial communication with the server
   or when the client crosses policy boundaries at the server.  It is
   possible that the server's policies change during the client's
   interaction therefore forcing the client to negotiate a new security
   triple.

3.3.2.  Security Error

   Based on the assumption that each NFS version 4 client and server
   must support a minimum set of security (i.e., LIPKEY, SPKM-3, and
   Kerberos-V5 all under RPCSEC_GSS), the NFS client will start its
   communication with the server with one of the minimal security
   triples.  During communication with the server, the client may
   receive an NFS error of NFS4ERR_WRONGSEC.  This error allows the
   server to notify the client that the security triple currently being
   used is not appropriate for access to the server's file system
   resources.  The client is then responsible for determining what
   security triples are available at the server and choose one which is
   appropriate for the client.  See the section for the "SECINFO"
   operation for further discussion of how the client will respond to
   the NFS4ERR_WRONGSEC error and use SECINFO.

3.3.3.  Callback RPC Authentication

   Callback authentication has changed in NFSv4.1 from NFSv4.0.

   NFSv4.0 required the NFS server to create a security context for
   RPCSEC_GSS, AUTH_DH, and AUTH_KERB4, and any other security flavor
   that had a security context.  It also required that the principal
   issuing the callback be the same as the principal that accepted the
   callback parameters (via SETCLIENTID), and that the client principal
   accepting the callback be the same as that which issued the
   SETCLIENTID.  This required the NFS client to have an assigned
   machine credential.  NFSv4.1 does not require a machine credential.
   Instead, NFSv4.1 allows an RPCSEC_GSS security context initiated by
   the client and established on both the client and server to be used
   on callback RPCs sent by the server to the client.  The
   BIND_BACKCHANNEL operation is used to establish RPCSEC_GSS contexts
   (if the client so desires) on the server.  No support for AUTH_DH, or
   AUTH_KERB4 is specified.

3.3.4.  GSS Server Principal

   Regardless of what security mechanism under RPCSEC_GSS is being used,
   the NFS server MUST identify itself in GSS-API via a
   GSS_C_NT_HOSTBASED_SERVICE name type.  GSS_C_NT_HOSTBASED_SERVICE
   names are of the form:

   service@hostname

   For NFS, the "service" element is

   nfs

   Implementations of security mechanisms will convert nfs@hostname to
   various different forms.  For Kerberos V5, LIPKEY, and SPKM-3, the
   following form is RECOMMENDED:

   nfs/hostname
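
   As a non-normative illustration of the name forms for the GSS server
   principal, the conversion from the "service@hostname" form to the
   "service/hostname" form can be sketched as below.  The function name
   and the example host name are assumptions of this sketch; real
   security mechanism implementations perform this mapping internally.

```python
# Illustrative only: rewrite a host-based service name such as
# "nfs@hostname" into the "nfs/hostname" form.

def hostbased_to_slash_form(name: str) -> str:
    service, sep, hostname = name.partition("@")
    if not sep or not service or not hostname:
        raise ValueError("expected a name of the form service@hostname")
    return f"{service}/{hostname}"


principal = hostbased_to_slash_form("nfs@server.example.com")
```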

4.  Filehandles

   The filehandle in the NFS protocol is a per server unique identifier
   for a file system object.  The contents of the filehandle are opaque
   to the client.  Therefore, the server is responsible for translating
   the filehandle to an internal representation of the file system
   object.

4.1.  Obtaining the First Filehandle

   The operations of the NFS protocol are defined in terms of one or
   more filehandles.  Therefore, the client needs a filehandle to
   initiate communication with the server.  With the NFS version 2
   protocol RFC1094 [17] and the NFS version 3 protocol RFC1813 [18],
   there exists an ancillary protocol to obtain this first filehandle.
   The MOUNT protocol, RPC program number 100005, provides the mechanism
   of translating a string based file system path name to a filehandle
   which can then be used by the NFS protocols.

   The MOUNT protocol has deficiencies in the area of security and use
   via firewalls.  This is one reason that the use of the public
   filehandle was introduced in RFC2054 [24] and RFC2055 [25].  With the
   use of the public filehandle in combination with the LOOKUP operation
   in the NFS version 2 and 3 protocols, it has been demonstrated that
   the MOUNT protocol is unnecessary for viable interaction between NFS
   client and server.

   Therefore, the NFS version 4 protocol will not use an ancillary
   protocol for translation from string based path names to a
   filehandle.  Two special filehandles will be used as starting points
   for the NFS client.

4.1.1.  Root Filehandle

   The first of the special filehandles is the ROOT filehandle.  The
   ROOT filehandle is the "conceptual" root of the file system name
   space at the NFS server.  The client uses or starts with the ROOT
   filehandle by employing the PUTROOTFH operation.  The PUTROOTFH
   operation instructs the server to set the "current" filehandle to the
   ROOT of the server's file tree.  Once this PUTROOTFH operation is
   used, the client can then traverse the entirety of the server's file
   tree with the LOOKUP operation.  A complete discussion of the server
   name space is in the section "NFS Server Name Space".

4.1.2.  Public Filehandle

   The second special filehandle is the PUBLIC filehandle.  Unlike the
   ROOT filehandle, the PUBLIC filehandle may be bound or represent an
   arbitrary file system object at the server.  The server is
   responsible for this binding.  It may be that the PUBLIC filehandle
   and the ROOT filehandle refer to the same file system object.
   However, it is up to the administrative software at the server and
   the policies of the server administrator to define the binding of the
   PUBLIC filehandle and server file system object.  The client may not
   make any assumptions about this binding.  The client uses the PUBLIC
   filehandle via the PUTPUBFH operation.

6.1.3.  Connection State

   In NFSv4.0, the combination of a connected transport endpoint and a
   clientid forms the basis of connection state.  While this has been
   made to be workable with certain limitations, there are difficulties
   in correct and robust implementation.  The NFSv4.0 protocol must
   provide a server-initiated connection for the callback channel, and
   must carefully specify the persistence of client state at the server
   in the face of transport interruptions.  The server has only the
   client's transport address binding (the IP 4-tuple) to identify the
   client RPC transaction stream and to use as a lookup tag on the
   duplicate request cache.  (A useful overview of this is in [RW96].)
   If the server listens on multiple addresses, and the client connects
   to more than one, it must employ different clientid's on each,
   negating its ability to aggregate bandwidth and redundancy.  In
   effect, each transport connection is used as the server's
   representation of client state.  But, transport connections are
   potentially fragile and transitory.

   In this specification, a session identifier is assigned by the server
   upon initial session negotiation on each connection.  This identifier
   is used to associate additional connections, to renegotiate after a
   reconnect, to provide an abstraction for the various session
   properties, and to address the duplicate request cache.  No
   transport-specific information is used in the duplicate request cache
   implementation of an NFSv4.1 server, nor in fact the RPC XID itself.
   The session identifier is unique within the server's scope and may be
   subject to certain server policies such as being bounded in time.

6.1.4.  NFSv4 Channels, Sessions and Connections

   There are two types of NFSv4 channels: the "operations" or "fore"
   channel used for ordinary requests from client to server, and the
   "back" channel, used for callback requests from server to client.

   Different NFSv4 operations on these channels can lead to different
   resource needs.  For example, server callback operations (CB_RECALL)
   are specific, small messages which flow from server to client at
   arbitrary times, while data transfers such as read and write have
   very different sizes and asymmetric behaviors.  It is sometimes
   impractical for the RDMA peers (NFSv4 client and NFSv4 server) to
   post buffers for these various operations on a single connection.
   Commingling of requests with responses at the client receive queue is
   particularly troublesome, due both to the need to manage both
   solicited and unsolicited completions, and to provision buffers for
   both purposes.  Due to the lack of any ordering of callback requests
   versus response arrivals, without any other mechanisms, the client
   would be forced to allocate all buffers sized to the worst case.

   The callback requests are likely to be handled by a different task
   context from that handling the responses.  Significant demultiplexing
   and thread management may be required if both are received on the
   same connection.  The client and server have full control as to
   whether a connection will service one channel or both channels.

   [[Comment.3: I think trunking remains an open issue as there is no
   way yet for clients to determine whether two different server network
   addresses refer to the same server]].  Also, the client may wish to
   perform trunking of operations channel requests for performance
   reasons, or multipathing for availability.  This specification
   permits both, as well as many other session and connection
   possibilities, by permitting each operation to carry session
   membership information and to share session (and clientid) state in
   order to draw upon the appropriate resources.  For example, reads and
   writes may be assigned to specific, optimized connections, or sorted
   and separated by any or all of size, idempotency, etc.

   To address the problems described above, this specification allows
   multiple sessions to share a clientid, as well as for multiple
   connections to share a session.

   Single Connection model:

                            NFSv4.1 Session
                               /      \
                Operations_Channel   [Back_Channel]
                                \    /
                             Connection
                                  |

   Multi-connection trunked model (2 operations channels shown):

                            NFSv4.1 Session
                               /      \
                Operations_Channels  [Back_Channel]
                    |          |               |
                Connection Connection     [Connection]
                    |          |               |

   Multi-connection split-use model (2 mounts shown):

                                     NFSv4.1 Session
                                   /                 \
                            (/home)        (/usr/local - readonly)
                            /      \                    |
             Operations_Channel  [Back_Channel]         |
                     |                 |          Operations_Channel
                 Connection       [Connection]          |
                     |                 |            Connection
                                                        |
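
   As a non-normative illustration, the first two diagrams above can be
   expressed as a small data model.  The class and field names are
   assumptions of this sketch and do not correspond to protocol
   structures; the split-use model is not shown.

```python
# Hypothetical data model for the single-connection and trunked diagrams.
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Connection:
    label: str  # opaque label standing in for a transport endpoint


@dataclass
class Session:
    session_id: str
    operations: List[Connection] = field(default_factory=list)
    back: Optional[Connection] = None  # the back channel is optional

    def shares_connection(self) -> bool:
        """True when the back channel reuses an operations connection."""
        return self.back is not None and any(
            conn is self.back for conn in self.operations
        )


# Single connection model: both channels ride on one connection.
only = Connection("conn-0")
single = Session("s1", operations=[only], back=only)

# Trunked model: two operations connections and a separate back channel.
trunked = Session(
    "s2",
    operations=[Connection("conn-1"), Connection("conn-2")],
    back=Connection("conn-3"),
)
```

   The identity test ("is") is deliberate: two connections with the same
   label are still distinct transport endpoints in this model.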

   In this way, implementation as well as resource management may be
   optimized.  Each session will have its own response caching and
   buffering, and each connection or channel will have its own transport
   resources, as appropriate.  Clients which do not require certain
   behaviors may optimize such resources away completely, by using
   specific sessions and not even creating the additional channels and
   connections.

6.1.5.  Reconnection, Trunking and Failover

   Reconnection after failure references stored state on the server
   associated with lease recovery during the grace period.  The session
   provides a convenient handle for storing and managing information
   regarding the client's previous state on a per-connection basis, e.g.
   to be used upon reconnection.  Reconnection to a previously existing
   session, and its stored resources, are covered in Section 6.3.

   One important aspect of reconnection is that of RPC library support.
   Traditionally, an Upper Layer RPC-based Protocol such as NFS leaves
   all transport knowledge to the RPC layer implementation below it.
   This allows NFS to operate over a wide variety of transports and has
   proven to be a highly successful approach.  The session, however,
   introduces an abstraction which is, in a way, "between" RPC and
   NFSv4.1.  It is important that the session abstraction not have
   ramifications within the RPC layer.

   One such issue arises within the reconnection logic of RPC.
   Previously, an explicit session binding operation, which established
   session context for each new connection, was explored.  This however
   required that the session binding also be performed during reconnect,
   which in turn required an RPC request.  This additional request
   requires new RPC semantics, both in implementation and the fact that
   a new request is inserted into the RPC stream.  Also, the binding of
   a connection to a session required the upper layer to become "aware"
   of connections, something the RPC layer abstraction architecturally
   abstracts away.  Therefore the session binding is not handled in
   connection scope but instead explicitly carried in each request.

   For Reliability Availability and Serviceability (RAS) issues such as
   bandwidth aggregation and multipathing, clients frequently seek to
   make multiple connections through multiple logical or physical
   channels.  The session is a convenient point to aggregate and manage
   these resources.

4.2.  Filehandle Types

   In the NFS version 2 and 3 protocols, there was one type of
   filehandle with a single set of semantics.  This type of filehandle
   is termed "persistent" in NFS Version 4.  The semantics of a
   persistent filehandle remain the same as before.  A new type of
   filehandle introduced in NFS Version 4 is the "volatile" filehandle,
   which attempts to accommodate certain server environments.

   The volatile filehandle type was introduced to address server
   functionality or implementation issues which make correct
   implementation of a persistent filehandle infeasible.  Some server
   environments do not provide a file system level invariant that can be
   used to construct a persistent filehandle.  The underlying server
   file system may not provide the invariant or the server's file system
   programming interfaces may not provide access to the needed
   invariant.  Volatile filehandles may ease the implementation of
   server functionality such as hierarchical storage management or file
   system reorganization or migration.  However, the volatile filehandle
   increases the implementation burden for the client.

   Since the client will need to handle persistent and volatile
   filehandles differently, a file attribute is defined which may be
   used by the client to determine the filehandle types being returned
   by the server.

4.2.1.  General Properties of a Filehandle

   The filehandle contains all the information the server needs to
   distinguish an individual file.  To the client, the filehandle is
   opaque.  The client stores filehandles for use in a later request and
   can compare two filehandles from the same server for equality by
   doing a byte-by-byte comparison.  However, the client MUST NOT
   otherwise interpret the contents of filehandles.  If two filehandles
   from the same server are equal, they MUST refer to the same file.
   Servers SHOULD try to maintain a one-to-one correspondence between
   filehandles and files but this is not required.  Clients MUST use
   filehandle comparisons only to improve performance, not for correct
   behavior.  All clients need to be prepared for situations in which it
   cannot be determined whether two filehandles denote the same object
   and in such cases, avoid making invalid assumptions which might cause
   incorrect behavior.  Further discussion of filehandle and attribute
   comparison in the context of data caching is presented in the section
   "Data Caching and File Identity".

   As an example, in the case that two different path names when
   traversed at the server terminate at the same file system object, the
   server SHOULD return the same filehandle for each path.  This can
   occur if a hard link is used to create two file names which refer to
   the same underlying file object and associated data.  For example, if
   paths /a/b/c and /a/d/c refer to the same file, the server SHOULD
   return the same filehandle for both path names traversals.

4.2.2.  Persistent Filehandle

   A persistent filehandle is defined as having a fixed value for the
   lifetime of the file system object to which it refers.  Once the
   server creates the filehandle for a file system object, the server
   MUST accept

6.1.6.  Server Duplicate Request Cache

   RPC-based server duplicate request caches, while not a part of an NFS
   protocol, have become a de-facto requirement of any NFS
   implementation.  First described in [CJ89], the duplicate request
   cache was initially found to reduce work at the server by avoiding
   duplicate processing for retransmitted requests.  A second, and in
   the long run more important benefit, was improved correctness, as the
   cache avoided certain destructive non-idempotent requests from being
   reinvoked.

   However, RPC-based caches do not provide correctness guarantees; they
   cannot be managed in a reliable, persistent fashion.  The reason is
   understandable: their storage requirement is unbounded due to the
   lack of any such bound in the NFS protocol, and they are dependent on
   transport addresses for request matching.

   The session model, the presence of maximum request count limits and
   negotiated maximum sizes allow the size and duration of the cache to
   be bounded and, coupled with a long-lived session identifier, enable
   its persistent storage on a per-session basis.

   This provides a single unified mechanism which provides the following
   guarantees required in the NFSv4 specification, while extending them
   to all requests, rather than limiting them only to a subset of
   state-related requests:

   "It is critical the server maintain the last response sent to the
   client to provide a more reliable cache of duplicate non-idempotent
   requests than that of the traditional cache described in [CJ89]..."
   RFC3530 [2]

   The maximum request count limit is the count of active operations,
   which bounds the number of entries in the cache.  Constraining the
   size of operations additionally serves to limit the required storage
   to the product of the current maximum request count and the maximum
   response size.  This storage requirement enables server-side
   efficiencies.

   Session negotiation allows the server to maintain other state.  An
   NFSv4.1 client invoking the session destroy operation will cause the
   server to close the session, allowing the server to deallocate cache
   entries.  Clients can potentially specify that such caches not be
   kept for appropriate types of sessions (for example, read-only
   sessions).  This can enable more efficient server operation resulting
   in improved response times, and more efficient sizing of buffers and
   response caches.

   Similarly, it is important for the client to explicitly learn whether
   the server is able to implement reliable semantics.  Knowledge of
   whether these semantics are in force is critical for a highly
   reliable client, one which must provide transactional integrity
   guarantees.  When clients request that the same filehandle semantics be enabled for a
   given session, the object for session reply must inform the lifetime of client if the object.  If mode
   is in fact enabled.  In this way the client can confidently proceed
   with operations without having to implement consistency facilities of
   its own.
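
   The reply cache storage bound stated in this section (the product of
   the maximum request count and the maximum response size) can be made
   concrete with a small calculation.  The following is only a sketch:
   the class and function names are hypothetical and are not defined by
   the protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionLimits:
    """Hypothetical holder for two negotiated per-session limits."""
    max_requests: int        # maximum request count (active operations)
    max_response_size: int   # maximum response size, in bytes


def reply_cache_bound(limits: SessionLimits) -> int:
    """Upper bound, in bytes, on storage for cached replies in one session."""
    return limits.max_requests * limits.max_response_size


if __name__ == "__main__":
    limits = SessionLimits(max_requests=64, max_response_size=32 * 1024)
    print(reply_cache_bound(limits))  # 2097152
```

   With 64 outstanding requests and 32 KB responses the bound is 2 MB
   per session.  A server that does not cache full-size replies would
   need less.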

   If the server restarts or reboots, the NFS server must honor the same
   persistent filehandle value as it did in the server's previous
   instantiation.  Similarly, if the file system is migrated, the new
   NFS server must honor the same filehandle as the old NFS server.

   The persistent filehandle will become stale or invalid when the file
   system object is removed.  When the server is presented with a
   persistent filehandle that refers to a deleted object, it MUST return
   an error of NFS4ERR_STALE.  A filehandle may become stale when the
   file system containing the object is no longer available.  The file
   system may become unavailable if it exists on removable media and the
   media is no longer available at the server or the file system in
   whole has been destroyed or the file system has simply been removed
   from the server's name space (i.e. unmounted in a UNIX environment).

4.2.3.  Volatile Filehandle

   A volatile filehandle does not share the same longevity
   characteristics of a persistent filehandle.  The server may determine
   that a volatile filehandle is no longer valid at many different
   points in time.

6.2.  Session Initialization and Transfer Models

   Session initialization issues and data transfer models relevant to
   both TCP and RDMA are discussed in this section.

6.2.1.  Session Negotiation

   The following parameters are exchanged between client and server at
   session creation time.  Their values allow the server to properly
   size resources allocated in order to service the client's requests,
   and to provide the server with a way to communicate limits to the
   client for proper and optimal operation.  They are exchanged prior to
   all session-related activity, over any transport type.  Discussion of
   their use is found in their descriptions as well as throughout this
   section.

   Maximum Requests

      The client's desired maximum number of concurrent requests is
      passed, in order to allow the server to size its reply cache
      storage.  The server may modify the client's requested limit
      downward (or upward) to match its local policy and/or resources.
      Over RDMA-capable RPC transports, the per-request management of
      low-level transport message credits is handled within the RPC
      layer.  [RPCRDMA]

   Maximum Request/Response Sizes

      The maximum request and response sizes are exchanged in order to
      permit allocation of appropriately sized buffers and request cache
      entries.  The size must allow for certain protocol minima,
      allowing the receipt of maximally sized operations (e.g.  RENAME
      requests which contain two name strings).  Note the maximum
      request/response sizes cover the entire request/response message
      and not simply the data payload as traditional NFS maximum read or
      write size.  Also note the server implementation may not, in fact
      probably does not, require the reply cache entries to be sized as
      large as the maximum response.  The server may reduce the client's
      requested sizes.
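
   As an illustration of how a server might answer the Maximum Requests
   and Maximum Request/Response Sizes parameters, the sketch below
   lowers each requested value to a local policy limit.  The type and
   function names are hypothetical.  The text also permits the server
   to raise the request limit, which this sketch deliberately does not
   model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionParams:
    """Hypothetical subset of the values exchanged at session creation."""
    max_requests: int
    max_request_size: int
    max_response_size: int


def grant(requested: SessionParams, policy: SessionParams) -> SessionParams:
    """Return what a server with the given policy limits might grant.

    Assumption: the server only ever lowers a requested value to its own
    limit and otherwise accepts the client's value unchanged.
    """
    return SessionParams(
        max_requests=min(requested.max_requests, policy.max_requests),
        max_request_size=min(requested.max_request_size,
                             policy.max_request_size),
        max_response_size=min(requested.max_response_size,
                              policy.max_response_size),
    )


if __name__ == "__main__":
    wanted = SessionParams(1000, 1 << 20, 1 << 20)
    policy = SessionParams(128, 64 * 1024, 256 * 1024)
    print(grant(wanted, policy))
```

   The client would then size its own buffers and limit its outstanding
   requests according to the granted values rather than the ones it
   asked for.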

   Inline Padding/Alignment

      The server can inform the client of any padding which can be used
      to deliver NFSv4 inline WRITE payloads into aligned buffers.  Such
      alignment can be used to avoid data copy operations at the server
      for both TCP and inline RDMA transfers.  For RDMA, the client
      informs the server in each operation when padding has been
      applied.  [RPCRDMA]

   Transport Attributes

      A placeholder for transport-specific attributes is provided, with
      a format to be determined.  Possible examples of information to be
      passed in this parameter include transport security attributes to
      be used on the connection, RDMA-specific attributes, legacy
      "private data" as used on existing RDMA fabrics, transport Quality
      of Service attributes, etc.  This information is to be passed to
      the peer's transport layer by local means which is currently
      outside the scope of this draft, however one attribute is provided
      in the RDMA case:

   RDMA Read Resources

      RDMA implementations must explicitly provision resources to
      support RDMA Read requests from connected peers.  These values
      must be explicitly specified, to provide adequate resources for
      matching the peer's expected needs and the connection's
      delay-bandwidth parameters.

   If the server can definitively determine that a volatile filehandle
   refers to an object that has been removed, the server should return
   NFS4ERR_STALE to the client (as is the case for persistent
   filehandles).  In all other cases where the server determines that a
   volatile filehandle can no longer be used, it should return an error
   of NFS4ERR_FHEXPIRED.

   The mandatory attribute "fh_expire_type" is used by the client to
   determine what type of filehandle the server is providing for a
   particular file system.  This attribute is a bitmask with the
   following values:

   FH4_PERSISTENT  The value of FH4_PERSISTENT is used to indicate a
      persistent filehandle, which is valid until the object is removed
      from the file system.  The server will not return
      NFS4ERR_FHEXPIRED for this filehandle.  FH4_PERSISTENT is defined
      as a value in which none of the bits specified below are set.

   FH4_VOLATILE_ANY  The filehandle may expire at any time, except as
      specifically excluded (i.e.  FH4_NOEXPIRE_WITH_OPEN).

   FH4_NOEXPIRE_WITH_OPEN  May only be set when FH4_VOLATILE_ANY is set.
      If this bit is set, then the meaning of FH4_VOLATILE_ANY is
      qualified to exclude any expiration of the filehandle when it is
      open.

   FH4_VOL_MIGRATION  The filehandle will expire as a result of a file
      system transition (migration or replication), in those cases in
      which the continuity of filehandle use is not specified by
      _handle_ class information within the fs_locations_info attribute.
      When this bit is set, clients without access to fs_locations_info
      information should assume filehandles will expire on file system
      transitions.

   FH4_VOL_RENAME  The filehandle will expire during rename.  This
      includes a rename by the requesting client or a rename by any
      other client.  If FH4_VOLATILE_ANY is set, FH4_VOL_RENAME is
      redundant.
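
   The fh_expire_type bit definitions can be exercised with a few
   helper functions.  This is a sketch only.  The numeric bit values
   are those assigned in the NFSv4 XDR description (RFC 3530) and are
   not stated in this section, so they should be checked against the
   XDR.  The helper names are hypothetical.  The last helper encodes
   the condition under which the specification treats a filehandle as
   one that "may expire while open"; it does not model the additional
   case of a transition target in a different _handle_ class.

```python
# Bit values as assigned in the NFSv4 XDR description (RFC 3530); an
# assumption of this sketch, since this section does not list them.
FH4_PERSISTENT = 0x00000000
FH4_NOEXPIRE_WITH_OPEN = 0x00000001
FH4_VOLATILE_ANY = 0x00000002
FH4_VOL_MIGRATION = 0x00000004
FH4_VOL_RENAME = 0x00000008


def is_persistent(expire_type: int) -> bool:
    """FH4_PERSISTENT is the value with none of the volatile bits set."""
    return expire_type == FH4_PERSISTENT


def is_well_formed(expire_type: int) -> bool:
    """FH4_NOEXPIRE_WITH_OPEN may only be set when FH4_VOLATILE_ANY is set."""
    if expire_type & FH4_NOEXPIRE_WITH_OPEN:
        return bool(expire_type & FH4_VOLATILE_ANY)
    return True


def may_expire_while_open(expire_type: int) -> bool:
    """True if a filehandle of this type can expire while the file is open."""
    if expire_type & (FH4_VOL_MIGRATION | FH4_VOL_RENAME):
        return True
    return bool(expire_type & FH4_VOLATILE_ANY) and not (
        expire_type & FH4_NOEXPIRE_WITH_OPEN)


if __name__ == "__main__":
    t = FH4_VOLATILE_ANY | FH4_NOEXPIRE_WITH_OPEN
    print(is_persistent(t), is_well_formed(t), may_expire_while_open(t))
```

   A client could apply such a predicate once per file system, when it
   first fetches fh_expire_type, to decide whether it must retain path
   components for later recovery.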

   For RDMA Read Resources, the client provides its chosen value to the
   server in the initial session creation; the value must be provided
   in each client RDMA endpoint.  The values are asymmetric and should
   be set to zero at the server in order to conserve RDMA resources,
   since clients do not issue RDMA Read operations in this
   specification.  The result is communicated in the session response,
   to permit matching of values across the connection.  The value may
   not be changed in the duration of the session, although a new value
   may be requested as part of a new session.

   Servers which provide volatile filehandles that may expire while open
   (i.e. if FH4_VOL_MIGRATION or FH4_VOL_RENAME is set or if
   FH4_VOLATILE_ANY is set and FH4_NOEXPIRE_WITH_OPEN not set), should
   deny a RENAME or REMOVE that would affect an OPEN file of any of the
   components leading to the OPEN file.  In addition, the server should
   deny all RENAME or REMOVE requests during the grace period upon
   server restart.

   Servers which provide volatile filehandles that may expire while open
   require special care as regards handling of RENAMEs and REMOVEs.
   This situation can arise if FH4_VOL_MIGRATION or FH4_VOL_RENAME is
   set, if FH4_VOLATILE_ANY is set and FH4_NOEXPIRE_WITH_OPEN is not
   set, or if a non-readonly file system has a transition target in a
   different _handle_ class.  In these cases, the server should deny a
   RENAME or REMOVE that would affect an OPEN file of any of the
   components leading to the OPEN file.  In addition, the server should
   deny all RENAME or REMOVE requests during the grace period, in order
   to make sure that reclaims of files where filehandles may have
   expired do not do a reclaim for the wrong file.

4.3.  One Method of Constructing a Volatile Filehandle

   A volatile filehandle, while opaque to the client, could contain:

   [volatile bit = 1 | server boot time | slot | generation number]

   o  slot is an index in the server volatile filehandle table

   o  generation number is the generation number for the table entry/
      slot

   When the client presents a volatile filehandle, the server makes the
   following checks, which assume that the check for the volatile bit
   has passed.  If the server boot time is less than the current server
   boot time, return NFS4ERR_FHEXPIRED.  If slot is out of range, return
   NFS4ERR_BADHANDLE.  If the generation number does not match, return
   NFS4ERR_FHEXPIRED.

   When the server reboots, the table is gone (it is volatile).

   If volatile bit is 0, then it is a persistent filehandle with a
   different structure following it.

6.2.2.  RDMA Requirements

   A complete discussion of the operation of RPC-based protocols atop
   RDMA transports is in [RPCRDMA].  Where RDMA is considered, this
   specification assumes the use of such a layering; it addresses only
   the upper layer issues relevant to making best use of RPC/RDMA.

   A connection oriented (reliable sequenced) RDMA transport will be
   required.  There are several reasons for this.  First, this model
   most closely reflects the general NFSv4 requirement of long-lived and
   congestion-controlled transports.  Second, to operate correctly over
   either an unreliable or unsequenced RDMA transport, or both, would
   require significant complexity in the implementation and protocol not
   appropriate for a strict minor version.  For example, retransmission
   on connected endpoints is explicitly disallowed in the current NFSv4
   draft; it would again be required with these alternate transport
   characteristics.  Third, this specification assumes a specific RDMA
   ordering semantic, which presents the same set of ordering and
   reliability issues to the RDMA layer over such transports.

   The RDMA implementation provides for making connections to other
   RDMA-capable peers.  In the case of the current proposals before the
   RDDP working group, these RDMA connections are preceded by a
   "streaming" phase, where ordinary TCP (or NFS) traffic might flow.
   However, this is not assumed here and sizes and other parameters are
   explicitly exchanged upon a session entering RDMA mode.

6.2.3.  RDMA Connection Resources

   On transport endpoints which support automatic RDMA mode, that is,
   endpoints which are created in the RDMA-enabled state, a single,
   preposted buffer must initially be provided by both peers, and the
   client session negotiation must be the first exchange.

   On transport endpoints supporting dynamic negotiation, a more
   sophisticated negotiation is possible, but is not discussed in the
   current draft.

   RDMA imposes several requirements on upper layer consumers.
   Registration of memory and the need to post buffers of a specific
   size and number for receive operations are a primary consideration.

   Registration of memory can be a relatively high-overhead operation,
   since it requires pinning of buffers, assignment of attributes (e.g.
   readable/writable), and initialization of hardware translation.
   Preregistration is desirable to reduce overhead.  These registrations
   are specific to hardware interfaces and even to RDMA connection
   endpoints, therefore negotiation of their limits is desirable to
   manage resources effectively.

   Following the basic registration, these buffers must be posted by the
   RPC layer to handle receives.  These buffers remain in use by the
   RPC/NFSv4 implementation; the size and number of them must be known
   to the remote peer in order to avoid RDMA errors which would cause a
   fatal error on the RDMA connection.

   The session provides a natural way for the server to manage resource
   allocation to each client rather than to each transport connection
   itself.

4.4.  Client Recovery from Filehandle Expiration

   If possible, the client SHOULD recover from the receipt of an
   NFS4ERR_FHEXPIRED error.  The client must take on additional
   responsibility so that it may prepare itself to recover from the
   expiration of a volatile filehandle.  If the server returns
   persistent filehandles, the client does not need these additional
   steps.

   For volatile filehandles, most commonly the client will need to store
   the component names leading up to and including the file system
   object in question.  With these names, the client should be able to
   recover by finding a filehandle in the name space that is still
   available or by starting at the root of the server's file system name
   space.

   If the expired filehandle refers to an object that has been removed
   from the file system, obviously the client will not be able to
   recover from the expired filehandle.

   It is also possible that the expired filehandle refers to a file that
   has been renamed.  If the file was renamed by another client, again
   it is possible that the original client will not be able to recover.
   However, in the case that the client itself is renaming the file and
   the file is open, it is possible that the client may be able to
   recover.  The client can determine the new path name based on the
   processing of the rename request.  The client can then regenerate the
   new filehandle based on the new path name.  The client could also use
   the compound operation mechanism to construct a set of operations
   like:

             RENAME A B
             LOOKUP B
             GETFH

   Note that the COMPOUND procedure does not provide atomicity.  This
   example only reduces the overhead of recovering from an expired
   filehandle.
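
   The server-side checks of Section 4.3 and the client-side recovery of
   Section 4.4 can be tied together in a toy model.  This is a sketch
   under stated assumptions: ToyServer, ToyClient and their methods are
   hypothetical, the "lookup" method stands in for a real LOOKUP chain
   followed by GETFH, and error codes are represented as strings.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

NFS4_OK = "NFS4_OK"
NFS4ERR_FHEXPIRED = "NFS4ERR_FHEXPIRED"
NFS4ERR_BADHANDLE = "NFS4ERR_BADHANDLE"


@dataclass(frozen=True)
class VolatileFH:
    """[volatile bit = 1 | server boot time | slot | generation number]"""
    boot_time: int
    slot: int
    generation: int


class ToyServer:
    """Hypothetical server keeping a volatile filehandle table in memory."""

    def __init__(self, boot_time: int) -> None:
        self.boot_time = boot_time
        self.table: List[Tuple[int, str]] = []  # slot -> (generation, path)

    def lookup(self, components: List[str]) -> VolatileFH:
        """Stand-in for LOOKUP + GETFH; no real name space is consulted."""
        self.table.append((1, "/".join(components)))
        return VolatileFH(self.boot_time, len(self.table) - 1, 1)

    def check(self, fh: VolatileFH) -> str:
        """Apply the three checks in the order the text gives them."""
        if fh.boot_time < self.boot_time:
            return NFS4ERR_FHEXPIRED
        if not 0 <= fh.slot < len(self.table):
            return NFS4ERR_BADHANDLE
        if self.table[fh.slot][0] != fh.generation:
            return NFS4ERR_FHEXPIRED
        return NFS4_OK

    def reboot(self, boot_time: int) -> None:
        self.boot_time = boot_time
        self.table = []  # the table is volatile and does not survive


class ToyClient:
    """Hypothetical client that remembers component names for recovery."""

    def __init__(self, server: ToyServer) -> None:
        self.server = server
        self.names: Dict[VolatileFH, List[str]] = {}

    def open_path(self, components: List[str]) -> VolatileFH:
        fh = self.server.lookup(components)
        self.names[fh] = list(components)  # stored for later recovery
        return fh

    def use(self, fh: VolatileFH) -> Tuple[VolatileFH, str]:
        """Present fh; on NFS4ERR_FHEXPIRED, look the path up again once."""
        status = self.server.check(fh)
        if status == NFS4ERR_FHEXPIRED and fh in self.names:
            fh = self.open_path(self.names.pop(fh))
            status = self.server.check(fh)
        return fh, status


if __name__ == "__main__":
    server = ToyServer(boot_time=100)
    client = ToyClient(server)
    fh = client.open_path(["a", "b", "c"])
    server.reboot(boot_time=200)
    print(client.use(fh))
```

   The model omits the cases the text calls unrecoverable, such as an
   object that was removed or renamed by another client.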

5.  File Attributes

   To meet the requirements of extensibility and increased
   interoperability with non-UNIX platforms, attributes must be handled
   in a flexible manner.  The NFS version 3 fattr3 structure contains a
   fixed list of attributes that not all clients and servers are able to
   support or care about.  The fattr3 structure can not be extended as
   new needs arise and it provides no way to indicate non-support.  With
   the NFS version 4 protocol, the client is able to query what
   attributes the server supports and construct requests with only those
   supported attributes (or a subset thereof).

   To this end, attributes are divided into three groups: mandatory,
   recommended, and named.  Both mandatory and recommended attributes
   are supported in the NFS version 4 protocol by a specific and
   well-defined encoding and are identified by number.  They are
   requested by setting a bit in the bit vector sent in the GETATTR
   request; the server response includes a bit vector to list what
   attributes were returned in the response.  New mandatory or
   recommended attributes may be added to the NFS protocol between major
   revisions by publishing a standards-track RFC which allocates a new
   attribute number value and defines the encoding for the attribute.
   See the section "Minor Versioning" for further discussion.

   Named attributes are accessed by the new OPENATTR operation, which
   accesses a hidden directory of attributes associated with a file
   system object.  OPENATTR takes a filehandle for the object and
   returns the filehandle for the attribute hierarchy.  The filehandle
   for the named attributes is a directory object accessible by LOOKUP
   or READDIR and contains files whose names represent the named
   attributes and whose data bytes are the value of the attribute.  For
   example:

        +----------+-----------+---------------------------------+
        | LOOKUP   | "foo"     | ; look up file                  |
        | GETATTR  | attrbits  |                                 |
        | OPENATTR |           | ; access foo's named attributes |
        | LOOKUP   | "x11icon" | ; look up specific attribute    |
        | READ     | 0,4096    | ; read stream of bytes          |
        +----------+-----------+---------------------------------+

   Named attributes are intended for data needed by applications rather
   than by an NFS client implementation.  NFS implementors are strongly
   encouraged to define their new attributes as recommended attributes
   by bringing them to the IETF standards-track process.

   The set of attributes which are classified as mandatory is
   deliberately small since servers must do whatever it takes to support
   them.  A server should support as many of the recommended attributes
   as possible but by their definition, the server is not required to
   support all of them.  Attributes are deemed mandatory if the data is
   both needed by a large number of clients and is not otherwise
   reasonably computable by the client when support is not provided on
   the server.

   Allocating resources per session rather than per transport connection
   enables considerable flexibility in the administration of transport
   endpoints.

6.2.4.  TCP and RDMA Inline Transfer Model

   The basic transfer model for both TCP and RDMA is referred to as
   "inline".  For TCP, this is the only transfer model supported, since
   TCP carries both the RPC header and data together in the data stream.

   For RDMA, the RDMA Send transfer model is used for all NFS requests
   and replies, but data is optionally carried by RDMA Writes or RDMA
   Reads.  Use of Sends is required to ensure consistency of data and to
   deliver completion notifications.  The pure-Send method is typically
   used where the data payload is small, or where for whatever reason
   target memory for RDMA is not available.

        Inline message exchange

               Client                                Server
                  :                Request              :
             Send :   ------------------------------>   : untagged
                  :                                     :  buffer
                  :               Response              :
         untagged :   <------------------------------   : Send
          buffer  :                                     :

               Client                                Server
                  :            Read request             :
             Send :   ------------------------------>   : untagged
                  :                                     :  buffer
                  :       Read response with data       :
         untagged :   <------------------------------   : Send
          buffer  :                                     :

               Client                                Server
                  :       Write request with data       :
             Send :   ------------------------------>   : untagged
                  :                                     :  buffer
                  :            Write response           :
         untagged :   <------------------------------   : Send
          buffer  :                                     :

   Responses must be sent to the client on the same connection on which
   the request was sent.  It is important that the server does not
   assume any specific client implementation, in particular whether
   connections within a session share any state at the client.  This is
   also important to preserve ordering of RDMA operations, and
   especially RDMA consistency.  Additionally, it ensures that the RPC
   RDMA layer makes no requirement of the RDMA provider to open its
   memory registration handles (Steering Tags) beyond the scope of a
   single RDMA connection.  This is an important security consideration.

   Two values must be known to each peer prior to issuing Sends: the
   maximum number of sends which may be posted, and their maximum size.
   These values are referred to, respectively, as the message credits
   and the maximum message size.  While the message credits might vary
   dynamically over the duration of the session, the maximum message
   size does not.  The server must commit to preserving this number of
   duplicate request cache entries, and preparing a number of receive
   buffers equal to or greater than its currently advertised credit
   value, each of the advertised size.  These ensure that transport
   resources are allocated sufficient to receive the full advertised
   limits.
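
   The relationship between message credits, maximum message size and
   posted receive buffers can be sketched as a small bookkeeping class.
   This is an illustration only: ReceiveSide and its methods are
   hypothetical names, buffers are plain byte arrays rather than
   registered RDMA memory, and no transport is involved.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ReceiveSide:
    """Toy model of one peer's receive resources (hypothetical names)."""
    max_message_size: int                 # fixed for the life of the session
    credits: int                          # may vary over the session
    posted: List[bytearray] = field(default_factory=list)

    def post_buffers(self) -> None:
        """Post buffers until at least `credits` buffers are available."""
        while len(self.posted) < self.credits:
            self.posted.append(bytearray(self.max_message_size))

    def advertise(self, new_credits: int) -> None:
        """Post any extra buffers first, then record the new credit value."""
        while len(self.posted) < new_credits:
            self.posted.append(bytearray(self.max_message_size))
        self.credits = new_credits

    def can_accept(self, message_len: int, in_flight: int) -> bool:
        """A Send fits if it is within the size limit and a credit is free."""
        return (message_len <= self.max_message_size
                and in_flight < self.credits)


if __name__ == "__main__":
    server = ReceiveSide(max_message_size=4096, credits=8)
    server.post_buffers()
    print(len(server.posted), server.can_accept(4096, in_flight=7))
```

   Posting before advertising preserves the invariant from the text:
   the number of posted buffers is never smaller than the currently
   advertised credit value.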

   Note that the hidden directory returned by OPENATTR is a convenience
   for protocol processing.  The client should not make any assumptions
   about the server's implementation of named attributes and whether the
   underlying file system at the server has a named attribute directory
   or not.  Therefore, operations such as SETATTR and GETATTR on the
   named attribute directory are undefined.

5.1.  Mandatory Attributes

   These MUST be supported by every NFS version 4 client and server in
   order to ensure a minimum level of interoperability.  The server must
   store and return these attributes and the client must be able to
   function with an attribute set limited to these attributes.  With
   just the mandatory attributes some client functionality may be
   impaired or limited in some ways.  A client may ask for any of these
   attributes to be returned by setting a bit in the GETATTR request and
   the server must return their value.

5.2.  Recommended Attributes

   These attributes are understood well enough to warrant support in the
   NFS version 4 protocol.  However, they may not be supported on all
   clients and servers.  A client may ask for any of these attributes to
   be returned by setting a bit in the GETATTR request but must handle
   the case where the server does not return them.  A client may ask for
   the set of attributes the server supports and should not request
   attributes the server does not support.  A server should be tolerant
   of requests for unsupported attributes and simply not return them
   rather than considering the request an error.  It is expected that
   servers will support all attributes they comfortably can and only
   fail to support attributes which are difficult to support in their
   operating environments.  A server should provide attributes whenever
   they don't have to "tell lies" to the client.  For example, a file
   modification time should be either an accurate time or should not be
   supported by the server.  This will not always be comfortable to
   clients but the client is better positioned to decide whether and how
   to fabricate or construct an attribute or whether to do without the
   attribute.

5.3.  Named Attributes

   These attributes are not supported by direct encoding in the NFS
   Version 4 protocol but are accessed by string names rather than
   numbers and correspond to an uninterpreted stream of bytes which are
   stored with the file system object.  The name space for these
   attributes may be accessed by using the OPENATTR operation.  The
   OPENATTR operation returns a filehandle for a virtual "attribute
   directory" and further perusal of the name space may be done using
   READDIR and LOOKUP operations on this filehandle.  Named attributes
   may then be examined or changed by normal READ and WRITE and CREATE
   operations on the filehandles returned from READDIR and LOOKUP.
   Named attributes may have attributes.

   It is recommended that servers support arbitrary named attributes.  A
   client should not depend on the ability to store any named attributes
   in the server's file system.  If a server does support named
   attributes, a client which is also able to handle them should be able
   to copy a file's data and meta-data with complete transparency from
   one location to another; this would imply that names allowed for
   regular directory entries are valid for named attribute names as
   well.

   Names of attributes will not be controlled by this document or other
   IETF standards track documents.  See the section "IANA
   Considerations" for further discussion.

5.4.  Classification of Attributes

   Each of the Mandatory and Recommended attributes can be classified in
   one of three categories: per server, per file system, or per file
   system object.  Note that it is possible that some per file system
   attributes may vary within the file system.  See the "homogeneous"
   attribute for its definition.  Note that the attributes
   time_access_set and time_modify_set are not listed in this section
   because they are write-only attributes corresponding to time_access
   and time_modify, and are used in a special instance of SETATTR.

   o  The per server attribute is:

         lease_time

   o  The per file system attributes are:

         supp_attr, fh_expire_type, link_support, symlink_support,
         unique_handles, aclsupport, cansettime, case_insensitive,
         case_preserving, chown_restricted, files_avail, files_free,
         files_total, fs_locations, homogeneous, maxfilesize, maxname,
         maxread, maxwrite, no_trunc, space_avail, space_free,
         space_total, time_delta, fs_layout_type, send_impl_id,
         recv_impl_id

   o  The per file system object attributes are:

         type, change, size, named_attr, fsid, rdattr_error, filehandle,
         ACL, archive, fileid, hidden, maxlink, mimetype, mode,
         numlinks, owner, owner_group, rawdev, space_used, system,
         time_access, time_backup, time_create, time_metadata,
         time_modify, mounted_on_fileid, layout_type, layout_hint,
         layout_blksize, layout_alignment

   For quota_avail_hard, quota_avail_soft, and quota_used see their
   definitions below for the appropriate classification.

   Note that the server must post the maximum number of session requests
   to each client operations channel.  The client is not required to
   spread its requests in any particular fashion across connections
   within a session.  If the client wishes, it may create multiple
   sessions, each with a single or small number of operations channels
   to provide the server with this resource advantage.  Or, over RDMA
   the server may employ a "shared receive queue".  The server can in
   any case protect its resources by restricting the client's request
   credits.

   While tempting to consider, it is not possible to use the TCP window
   as an RDMA operation flow control mechanism.  First, to do so would
   violate layering, requiring both senders to be aware of the existing
   TCP outbound window at all times.  Second, since requests are of
   variable size, the TCP window can hold a widely variable number of
   them, and since it cannot be reduced without actually receiving data,
   the receiver cannot limit the sender.  Third, any middlebox
   interposing on the connection would wreck any possible scheme.
   [MIDTAX]  In this specification, maximum request count limits are
   exchanged at the session level to allow correct provisioning of
   receive buffers by transports.

   When operating over TCP or other similar transport, request limits
   and sizes are still employed in NFSv4.1, but instead of being
   required for correctness, they provide the basis for efficient server
   implementation of the duplicate request cache.  The limits are chosen
   based upon the expected needs and capabilities of the client and
   server, and are in fact arbitrary.  Sizes may be specified by the
   client as zero (requesting the server's preferred or optimal value),
   and request limits may be chosen in proportion to the client's
   capabilities.  For example, a limit of 1000 allows 1000 requests to
   be in progress, which may generally be far more than adequate to keep
   local networks and servers fully utilized.

   Both client and server have independent sizes and buffering, but over
   RDMA fabrics client credits are easily managed by posting a receive
   buffer prior to sending each request.  Each such buffer may not be
   completed with the corresponding reply, since responses from NFSv4
   servers arrive in arbitrary order.  When an operations channel is
   also used for callbacks, the client must account for callback
   requests by posting additional buffers.  Note that
   implementation-specific facilities such as a shared receive queue may
   also allow optimization of these allocations.

   When a session is created, the client requests a preferred buffer
   size, and the server provides its answer.  The server posts all
   buffers of at least this size.  The client must comply by not sending
   requests greater than this size.  It is recommended that server
   implementations do all they can to accommodate a useful range of
   possible client requests.  There is a provision in [RPCRDMA] to allow
   the sending of client requests which exceed the server's receive
   buffer size, but it requires the server to "pull" the client's
   request as a "read chunk" via RDMA Read.  This introduces at least
   one additional network roundtrip, plus other overhead such as
   registering memory for RDMA Read at the client and additional RDMA
   operations at the server, and is to be avoided.

   An issue therefore arises when considering the NFSv4 COMPOUND
   procedures.  Since an arbitrary number (total size) of operations can
   be specified in a single COMPOUND procedure, its size is effectively
   unbounded.  This cannot be supported by RDMA Sends, and therefore
   this size negotiation places a restriction on the construction and
   maximum size of both COMPOUND requests and responses.  If a COMPOUND
   results in a reply at the server that is larger than can be sent in
   an RDMA Send to the client, then the COMPOUND must terminate and the
   operation which causes the overflow will provide a TOOSMALL error
   status result.

6.2.5.  RDMA Direct Transfer Model

   Placement of data by explicitly tagged RDMA operations is referred to
   as "direct" transfer.  This method is typically used where the data
   payload is relatively large, that is, when RDMA setup has been
   performed prior to the operation, or when any overhead for setting up
   and performing the transfer is regained by avoiding the overhead of
   processing an ordinary receive.

   The client advertises RDMA buffers, and not the server.  This means
   the "XDR Decoding with Read Chunks" described in [RPCRDMA] is not
   employed by NFSv4.1 replies, and instead all results transferred via
   RDMA to the client employ "XDR Decoding with Write Chunks".  There
   are several reasons for this.

   First, it allows for a correct and secure mode of transfer.  The
   client may advertise specific memory buffers only during specific
   times, and may revoke access when it pleases.  The server is not
   required to expose copies of local file buffers for individual
   clients, or to lock or copy them for each client access.

   Second, client credits based on fixed-size request buffers are easily
   managed on the server, but for the
5.5.  Mandatory Attributes - Definitions

   +-----------------+----+------------+--------+----------------------+
   | name            | #  | Data Type  | Access | Description          |
   +-----------------+----+------------+--------+----------------------+
   | supp_attr       | 0  | bitmap     | READ   | The bit vector which |
   |                 |    |            |        | server additional management of
   buffers for client RDMA Reads is not well-bounded.  For example, the
   client may not perform these RDMA Read operations in a timely
   fashion, therefore the server would retrieve all   |
   |                 |    |            |        | mandatory have to protect itself against
   denial-of-service on these resources.

   Third, it reduces network traffic, since buffer exposure outside the
   scope and        |
   |                 |    |            |        | recommended          |
   |                 |    |            |        | attributes that duration of a single request/response exchange necessitates
   additional memory management exchanges.

   There are  |
   |                 |    |            |        | supported for costs associated with this   |
   |                 |    |            |        | object. decision.  Primary among them is
   the need for the server to employ RDMA Read for operations such as
   large WRITE.  The scope of |
   |                 |    |            |        | RDMA Read operation is a two-way exchange at the
   RDMA layer, which incurs additional overhead relative to RDMA Write.
   Additionally, RDMA Read requires resources at the data source (the
   client in this attribute       |
   |                 |    |            |        | applies specification) to all       |
   |                 |    |            |        | objects maintain state and to generate
   replies.  These costs are overcome through use of pipelining with a       |
   |                 |    |            |        | matching fsid.       |
   | type            | 1  | nfs4_ftype | READ   | The type
   credits, with sufficient RDMA Read resources negotiated at session
   initiation, and appropriate use of RDMA for writes by the      |
   |                 |    |            |        | object (file,        |
   |                 |    |            |        | directory, symlink,  |
   |                 |    |            |        | etc.)                |
   | fh_expire_type  | 2  | uint32     | client -
   for example only for transfers above a certain size.

   A description of which NFSv4 operation results are eligible for data
   transfer via RDMA Write is in [NFSDDP].  There are only two such
   operations: READ   | Server uses this and READLINK.  When XDR encoding these requests on
   an RDMA transport, the NFSv4.1 client must insert the appropriate
   xdr_write_list entries to  |
   |                 |    |            |        | specify filehandle   |
   |                 |    |            |        | expiration behavior  |
   |                 |    |            |        | indicate to the client. See   |
   |                 |    |            |        | server whether the section          |
   |                 |    |            |        | "Filehandles" results
   should be transferred via RDMA or inline with a Send.  As described
   in [NFSDDP], a zero-length write chunk is used to indicate an inline
   result.  In this way, it is unnecessary to create new operations for    |
   |                 |    |            |        | additional           |
   |                 |    |            |        | description.         |
   | change          | 3  | uint64     |
   RDMA-mode versions of READ   | A value created by   |
   |                 |    |            |        | and READLINK.

   Another tool to avoid creation of new, RDMA-mode operations is the server that
   Reply Chunk [RPCRDMA], which is used by RPC in RDMA mode to return
   large replies via RDMA as if they were inline.  Reply chunks are used
   for operations such as READDIR, which returns large amounts of
   information, but in many small XDR segments.  Reply chunks are
   offered by the  |
   |                 |    |            |        | client and the server can use them in preference to    |
   |                 |    |            |        | determine if file    |
   |                 |    |            |        | data, directory      |
   |                 |    |            |        | contents or          |
   |                 |    |            |        | attributes of
   inline.  Reply chunks are transparent to upper layers such as NFSv4.

   In any very rare cases where another NFSv4.1 operation requires
   larger buffers than were negotiated when the    |
   |                 |    |            |        | object have been     |
   |                 |    |            |        | modified. The server |
   |                 |    |            |        | session was created (for
   example extraordinarily large RENAMEs), the underlying RPC layer may return
   support the       |
   |                 |    |            |        | object's             |
   |                 |    |            |        | time_metadata        |
   |                 |    |            |        | attribute use of "Message as an RDMA Read Chunk" and "RDMA Write of
   Long Replies" as described in [RPCRDMA].  No additional support is
   required in the NFSv4.1 client for this.  The client should be
   certain that its requested buffer sizes are not so small as to make
   this   |
   |                 |    |            |        | attribute's value    |
   |                 |    |            |        | a frequent occurrence, however.

   All operations are initiated by a Send, and are completed with a
   Send.  This is exactly as in conventional NFSv4, but only if the file |
   |                 |    |            |        | system object can    |
   |                 |    |            |        | under RDMA has a
   significant purpose: RDMA operations are not be updated more  |
   |                 |    |            |        | frequently than complete, that is,
   guaranteed consistent, at the  |
   |                 |    |            |        | resolution of        |
   |                 |    |            |        | time_metadata.       |
   | size            | 4  | uint64     | R/W    | The size of data sink until followed by a
   successful Send completion (i.e. a receive).  These events provide a
   natural opportunity for the      |
   |                 |    |            |        | object in bytes.     |
   | link_support    | 5  | bool       | READ   | True, if initiator (client) to enable and later
   disable RDMA access to the         |
   |                 |    |            |        | object's file system |
   |                 |    |            |        | supports hard links. |
   | symlink_support | 6  | bool       | READ   | True, if memory which is the         |
   |                 |    |            |        | object's file system |
   |                 |    |            |        | supports symbolic    |
   |                 |    |            |        | links.               |
   | named_attr      | 7  | bool       | READ   | True, if this object |
   |                 |    |            |        | has named            |
   |                 |    |            |        | attributes. In other |
   |                 |    |            |        | words, object has a  |
   |                 |    |            |        | non-empty named      |
   |                 |    |            |        | attribute directory. |
   | fsid            | 8  | fsid4      | READ   | Unique file system   |
   |                 |    |            |        | identifier target of each
   operation, in order to provide for the   |
   |                 |    |            |        | file system holding  |
   |                 |    |            |        | this object. fsid    |
   |                 |    |            |        | contains major consistent and   |
   |                 |    |            |        | minor components     |
   |                 |    |            |        | each secure operation.
   The RDMAP Send with Invalidate operation may be worth employing in
   this respect, as it relieves the client of which are    |
   |                 |    |            |        | uint64.              |
   | unique_handles  | 9  | bool       | READ   | True, if two         |
   |                 |    |            |        | distinct filehandles |
   |                 |    |            |        | guaranteed certain overhead in this
   case.

   A "onetime" boolean advisory to refer  |
   |                 |    |            |        | each RDMA region might become a hint
   to two different     |
   |                 |    |            |        | file system objects. |
   | lease_time      | 10 | nfs_lease4 | READ   | Duration of leases   |
   |                 |    |            |        | at the server that the client will use the three-tuple for only one
   NFSv4 operation.  For a transport such as iWARP, the server can
   assist the client in         |
   |                 |    |            |        | seconds.             |
   | rdattr_error    | 11 | enum       | READ   | Error returned from  |
   |                 |    |            |        | getattr during       |
   |                 |    |            |        | readdir.             |
   | filehandle      | 19 | nfs_fh4    | READ   | invalidating the three-tuple by performing a
   Send with Solicited Event and Invalidate.  The filehandle server may ignore this
   hint, in which case the client must perform a local invalidate after
   receiving the indication from the server that the NFSv4 operation is
   complete.  This may be considered in a future version of    |
   |                 |    |            |        | this object          |
   |                 |    |            |        | (primarily draft
   and [NFSDDP].

   In a trusted environment, it may be desirable for       |
   |                 |    |            |        | readdir requests).   |
   +-----------------+----+------------+--------+----------------------+

5.6.  Recommended Attributes - Definitions
   +--------------------+----+---------------+--------+----------------+
   | name               | #  | Data Type     | Access | Description    |
   +--------------------+----+---------------+--------+----------------+
   | ACL                | 12 | nfsace4<>     | R/W    | The the client to
   persistently enable RDMA access     |
   |                    |    |               |        | control list   |
   |                    |    |               |        | by the server.  Such a model is
   desirable for the        |
   |                    |    |               |        | object.        |
   | aclsupport         | 13 | uint32        | READ   | Indicates what |
   |                    |    |               |        | types highest level of ACLs  |
   |                    |    |               |        | efficiency and lowest overhead.

        RDMA message exchanges

               Client                                Server
                  :         Direct Read Request         :
             Send :   ------------------------------>   : untagged
                  :                                     :  buffer
                  :               Segment               :
          tagged  :   <------------------------------   :  RDMA Write
          buffer  :                  :                  :
                  :              [Segment]              :
          tagged  :   <------------------------------   : [RDMA Write]
          buffer  :                                     :
                  :         Direct Read Response        :
         untagged :   <------------------------------   :  Send (w/Inv.)
          buffer  :                                     :

               Client                                Server
                  :        Direct Write Request         :
             Send :   ------------------------------>   : untagged
                  :                                     :  buffer
                  :               Segment               :
          tagged  :   v------------------------------   :  RDMA Read
          buffer  :   +----------------------------->   :
                  :                  :                  :
                  :              [Segment]              :
          tagged  :   v------------------------------   : [RDMA Read]
          buffer  :   +----------------------------->   :
                  :                                     :
                  :        Direct Write Response        :
         untagged :   <------------------------------   :  Send (w/Inv.)
          buffer  :                                     :
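
   As an informal illustration only, the two exchanges pictured above
   can be written out as an ordered list of steps.  The sketch below is
   not part of the protocol definition; the function name
   rdma_exchange and the step labels are hypothetical and chosen for
   this example.  It assumes one RDMA Write per segment for a direct
   read, and one RDMA Read (a two-way exchange answered by the client)
   per segment for a direct write.

```python
def rdma_exchange(kind, segments=1):
    """Return the ordered (sender, message) steps for one direct transfer.

    kind is "read" or "write"; segments is the number of data segments.
    The labels are illustrative, not protocol element names.
    """
    if kind not in ("read", "write"):
        raise ValueError("kind must be 'read' or 'write'")
    if segments < 1:
        raise ValueError("at least one segment is required")

    # Every operation is initiated by a Send from the client.
    steps = [("client", "Send request")]
    for _ in range(segments):
        if kind == "read":
            # The server places each segment in the client's tagged buffer.
            steps.append(("server", "RDMA Write"))
        else:
            # The server pulls each segment; RDMA Read is a two-way
            # exchange, so the client (the data source) answers it.
            steps.append(("server", "RDMA Read"))
            steps.append(("client", "RDMA Read response"))
    # Every operation is completed by a Send from the server.
    steps.append(("server", "Send response"))
    return steps
```

   For the same number of segments, the direct write lists one more
   step per segment than the direct read, which reflects the additional
   overhead of RDMA Read relative to RDMA Write.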

   |                    |    |               |        | are supported  |
   |                    |    |               |        | on the current |
   |                    |    |               |        | file system.   |
   | archive            | 14 | bool          | R/W    | True, if this  |
   |                    |    |               |        | file has been  |
   |                    |    |               |        | archived since |
   |                    |    |               |        | the time of    |
   |                    |    |               |        | last           |
   |                    |    |               |        | modification   |
   |                    |    |               |        | (deprecated in |
   |                    |    |               |        | favor of       |
   |                    |    |               |        | time_backup).  |
   | cansettime         | 15 | bool          | READ   | True, if the   |
   |                    |    |               |        | server is able |
   |                    |    |               |        | to change the  |
   |                    |    |               |        | times for a    |
   |                    |    |               |        | file system    |
   |                    |    |               |        | object as      |
   |                    |    |               |        | specified in a |
   |                    |    |               |        | SETATTR        |
   |                    |    |               |        | operation.     |
   | case_insensitive   | 16 | bool          | READ   | True, if       |
   |                    |    |               |        | filename       |
   |                    |    |               |        | comparisons on |
   |                    |    |               |        | this file      |
   |                    |    |               |        | system are     |
   |                    |    |               |        | case           |
   |                    |    |               |        | insensitive.   |
   | case_preserving    | 17 | bool          | READ   | True, if       |
   |                    |    |               |        | filename case  |
   |                    |    |               |        | on this file   |
   |                    |    |               |        | system is      |
   |                    |    |               |        | preserved.     |
   | chown_restricted   | 18 | bool          | READ   | If TRUE, the   |
   |                    |    |               |        | server will    |
   |                    |    |               |        | reject any     |
   |                    |    |               |        | request to     |
   |                    |    |               |        | change either  |
   |                    |    |               |        | the owner or   |
   |                    |    |               |        | the group      |
   |                    |    |               |        | associated     |
   |                    |    |               |        | with a file if |
   |                    |    |               |        | the caller     |

6.3.  Connection Models

   There are three scenarios in which to discuss the connection model.
   Each will be discussed individually, after describing the common case
   encountered at initial connection establishment.

   After a successful connection, the first request proceeds, in the
   case of a new client association, to initial session creation, and
   then optionally to session callback channel binding, prior to regular
   operation.

   Commonly, each new client "mount" will be the action which drives
   creation of a new session.  However there are any number of other
   approaches.  Clients may choose to share a single connection and
   session among all their mount points.  Or, clients may support
   trunking, where additional connections are created but all within a
   single session.  Alternatively, the client may choose to create
   multiple sessions, each tuned to the buffering and reliability needs
   of the mount point.  For example, a readonly mount can sharply reduce
   its write buffering and also makes no requirement for the server to
   support reliable duplicate request caching.

   Similarly, the client can choose among several strategies for
   clientid usage.  Sessions can share a single clientid, or create new
   clientids as the client deems appropriate.  For kernel-based clients
   which service multiple authenticated users, a single clientid shared
   across all mount points is generally the most appropriate and
   flexible approach.  For example, all the client's file operations may
   wish to share locking state and the local client kernel takes the
   responsibility for arbitrating access locally.  For clients choosing
   to support other authentication models, for example userspace
   implementations, a new clientid is indicated.  Through use of session
   create options, both models are supported at the client's choice.

   Since the session is explicitly created and destroyed by the client,
   and each client is uniquely identified, the server may be
   specifically instructed to discard unneeded persistent state.  For
   this reason, it is possible that a server will retain any previous
   state indefinitely, and place its destruction under administrative
   control.  Or, a server may choose to retain state for some
   configurable period, provided that the period meets other NFSv4
   requirements such as lease reclamation time, etc.  However, since
   discarding this state at the server may affect the correctness of the
   server as seen by the client across network partitioning, such
   discarding of state should be done only in a conservative manner.

   Each client request to the server carries a new SEQUENCE operation
   within each COMPOUND, which provides the session context.  This
   session context then governs the request control, duplicate request
   caching, and other persistent parameters managed by the server for a
   session.
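
   The role of the session context in request control and duplicate
   request caching can be pictured with a small sketch.  It is a
   simplified model written for this example, not the server behavior
   defined by this specification.  The class name SessionContext and
   the request identifiers slot and seq are assumptions introduced
   here; the text above does not name the fields carried by SEQUENCE.

```python
class SessionContext:
    """Hypothetical per-session state kept by a server."""

    def __init__(self, max_requests):
        # Assumed layout: one cache entry per request slot,
        # mapping slot -> (seq, cached reply).
        self.max_requests = max_requests
        self.slots = {}

    def handle(self, slot, seq, execute):
        """Run execute() once per (slot, seq); replay the reply on a retry."""
        if not 0 <= slot < self.max_requests:
            # Request control: refuse requests beyond the session limit.
            raise ValueError("slot outside the session request limit")
        cached = self.slots.get(slot)
        if cached is not None and cached[0] == seq:
            # Duplicate request: return the cached reply, do not re-execute.
            return cached[1]
        reply = execute()
        self.slots[slot] = (seq, reply)
        return reply
```

   In this model a retransmitted request finds its earlier reply in the
   cache, so a non-idempotent operation is executed only once.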

   |                    |    |               |        | is             |
   |                    |    |               |        | not a          |
   |                    |    |               |        | privileged     |
   |                    |    |               |        | user (for      |
   |                    |    |               |        | example,       |
   |                    |    |               |        | "root" in UNIX |
   |                    |    |               |        | operating      |
   |                    |    |               |        | environments   |
   |                    |    |               |        | or in Windows  |
   |                    |    |               |        | 2000 the "Take |
   |                    |    |               |        | Ownership"     |
   |                    |    |               |        | privilege).    |
   | dir_notif_delay    | 56 | nfstime4      | READ   | notification   |
   |                    |    |               |        | delays on      |
   |                    |    |               |        | directory      |
   |                    |    |               |        | attributes     |
   | dirent_notif_delay | 57 | nfstime4      | READ   | notification   |
   |                    |    |               |        | delays on      |
   |                    |    |               |        | child          |
   |                    |    |               |        | attributes     |
   | fileid             | 20 | uint64        | READ   | A number       |
   |                    |    |               |        | uniquely       |
   |                    |    |               |        | identifying    |
   |                    |    |               |        | the file       |
   |                    |    |               |        | within the     |
   |                    |    |               |        | file system.   |
   | files_avail        | 21 | uint64        | READ   | File slots     |
   |                    |    |               |        | available to   |
   |                    |    |               |        | this user on   |
   |                    |    |               |        | the file       |
   |                    |    |               |        | system         |
   |                    |    |               |        | containing     |

6.3.1.  TCP Connection Model

   The following is a schematic diagram of the NFSv4.1 protocol
   exchanges leading up to normal operation on a TCP stream.

               Client                                Server
          TCPmode :   Create Clientid(nfs_client_id4)   : TCPmode
                  :   ------------------------------>   :
                  :                                     :
                  :     Clientid reply(clientid, ...)   :
                  :   <------------------------------   :
                  :                                     :
                  :   Create Session(clientid, size S,  :
                  :      maxreq N, STREAM, ...)         :
                  :   ------------------------------>   :
                  :                                     :
                  :   Session reply(sessionid, size S', :
                  :      maxreq N')                     :
                  :   <------------------------------   :
                  :                                     :
                  :          <normal operation>         :
                  :   ------------------------------>   :
                  :   <------------------------------   :
                  :                  :                  :

   No net additional exchange is added to the initial negotiation.  In
   the NFSv4.1 exchange, the CREATE_CLIENTID replaces SETCLIENTID
   (eliding the callback "clientaddr4" addressing) and CREATE_SESSION
   subsumes the function of SETCLIENTID_CONFIRM, as described elsewhere
   in this specification.  Callback channel binding is optional, as in
   NFSv4.0.  Note that the
   |                    |    |               |        | STREAM transport type is shown above, but
   since the transport mode remains unchanged and transport attributes
   are not necessarily exchanged, DEFAULT could also be passed.
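
   The Create Session exchange pictured above, in which the client
   offers "size S" and "maxreq N" and the server answers with "size S'"
   and "maxreq N'", can be sketched as follows.  This is an
   illustration under a stated assumption, not the negotiation rule of
   this specification: it assumes the server grants no more than the
   client requested and no more than its own limit.  The names
   SessionParams and negotiate are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionParams:
    size: int    # buffer size, "size S" in the diagram
    maxreq: int  # maximum request count, "maxreq N" in the diagram


def negotiate(requested, server_limit):
    """Return the values a server might place in its session reply.

    Assumption: each granted value is the smaller of what the client
    requested and what the server is willing to support.
    """
    return SessionParams(
        size=min(requested.size, server_limit.size),
        maxreq=min(requested.maxreq, server_limit.maxreq),
    )
```

   Under this assumption a client asking for size 65536 and maxreq 64
   from a server limited to size 32768 and maxreq 128 would be granted
   size 32768 and maxreq 64.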

6.3.2.  Negotiated RDMA Connection Model

   One possible design which has been considered is to have a
   "negotiated" RDMA connection model, supported via use of a session
   bind operation as a required first step.  However due to issues
   mentioned earlier, this proved problematic.  This section remains as
   a reminder of that fact, and it is possible such a mode can be
   supported.

   It is not considered critical that this be supported for two reasons.
   One, the session persistence provides a way for the server to
   remember important session parameters, such as sizes and maximum
   request counts.  These values can be used to restore the endpoint
   prior to making the

   |                    |    |               |        | this object -  |
   |                    |    |               |        | this should be |
   |                    |    |               |        | the smallest   |
   |                    |    |               |        | relevant       |
   |                    |    |               |        | limit.         |
   | files_free         | 22 | uint64        | READ   | Free file      |
   |                    |    |               |        | slots on the   |
   |                    |    |               |        | file system    |
   |                    |    |               |        | containing     |
   |                    |    |               |        | this object -  |
   |                    |    |               |        | this should be |
   |                    |    |               |        | the smallest   |
   |                    |    |               |        | relevant       |
   |                    |    |               |        | limit.         |
   | files_total        | 23 | uint64        | READ   | Total file     |
   |                    |    |               |        | slots on the   |
   |                    |    |               |        | file system    |
   |                    |    |               |        | containing     |
   |                    |    |               |        | this object.   |
   | fs_absent          | 60 | bool          | READ   | Is current     |
   |                    |    |               |        | file system    |
   |                    |    |               |        | present or     |
   |                    |    |               |        | absent.        |
   | fs_layout_type     | 62 | layouttype4   | READ   | Layout types   |
   |                    |    |               |        | available for  |
   |                    |    |               |        | the file       |
   |                    |    |               |        | system.        |
   | fs_locations       | 24 | fs_locations  | READ   | Locations      |
   |                    |    |               |        | where this     |
   |                    |    |               |        | file system    |
   |                    |    |               |        | may be found.  |
   |                    |    |               |        | If the server  |
   |                    |    |               |        | returns        |
   |                    |    |               |        | NFS4ERR_MOVED  |
   |                    |    |               |        | as an error,   |
   |                    |    |               |        | this attribute |
   |                    |    |               |        | MUST be        |
   |                    |    |               |        | supported.     |
   | fs_locations_info  | 67 |               | READ   | Full function  |
   |                    |    |               |        | file system    |
   |                    |    |               |        | location.      |
   | fs_status          | 61 | fs4_status    | READ   | Generic file   |
   |                    |    |               |        | system type    |
   |                    |    |               |        | information.   |
   | hidden             | 25 | bool          | R/W    | True, if the   |
   |                    |    |               |        | file is        |
   |                    |    |               |        | considered     |
   |                    |    |               |        | hidden with    |
   |                    |    |               |        | respect to the |
   |                    |    |               |        | Windows API?   |
   | homogeneous        | 26 | bool          | READ   | True, if this  |
   |                    |    |               |        | object's file  |
   |                    |    |               |        | system is      |
   |                    |    |               |        | homogeneous,   |
   |                    |    |               |        | i.e. per file  |
   |                    |    |               |        | system         |
   |                    |    |               |        | attributes are |
   |                    |    |               |        | the same for   |
   |                    |    |               |        | all of the     |
   |                    |    |               |        | file system's  |
   |                    |    |               |        | objects.       |
   | layout_alignment   | 66 | uint32_t      | READ   | Preferred      |
   |                    |    |               |        | alignment for  |
   |                    |    |               |        | layout related |
   |                    |    |               |        | I/O.           |
   | layout_blksize     | 65 | uint32_t      | READ   | Preferred      |
   |                    |    |               |        | block size for |
   |                    |    |               |        | layout related |
   |                    |    |               |        | I/O.           |
   | layout_hint        | 63 | layouthint4   | WRITE  | Client         |
   |                    |    |               |        | specified hint |
   |                    |    |               |        | for file       |
   |                    |    |               |        | layout.        |
   | layout_type        | 64 | layouttype4   | READ   | Layout types   |
   |                    |    |               |        | available for  |
   |                    |    |               |        | the file.      |
   | maxfilesize        | 27 | uint64        | READ   | Maximum        |
   |                    |    |               |        | supported file |
   |                    |    |               |        | size for the   |
   |                    |    |               |        | file system of |
   |                    |    |               |        | this object.   |
   | maxlink            | 28 | uint32        | READ   | Maximum number |
   |                    |    |               |        | of links for   |
   |                    |    |               |        | this object.   |
   | maxname            | 29 | uint32        | READ   | Maximum        |
   |                    |    |               |        | filename size  |
   |                    |    |               |        | supported for  |
   |                    |    |               |        | this object.   |
   | maxread            | 30 | uint64        | READ   | Maximum read   |
   |                    |    |               |        | size supported |
   |                    |    |               |        | for this       |
   |                    |    |               |        | object.        |
   | maxwrite           | 31 | uint64        | READ   | Maximum write  |
   |                    |    |               |        | size supported |
   |                    |    |               |        | for this       |
   |                    |    |               |        | object. This   |
   |                    |    |               |        | attribute      |
   |                    |    |               |        | SHOULD be      |
   |                    |    |               |        | supported if   |
   |                    |    |               |        | the file is    |
   |                    |    |               |        | writable. Lack |
   |                    |    |               |        | of this        |
   |                    |    |               |        | attribute can  |
   |                    |    |               |        | lead to the    |
   |                    |    |               |        | client either  |
   |                    |    |               |        | wasting        |
   |                    |    |               |        | bandwidth or   |
   |                    |    |               |        | not receiving  |
   |                    |    |               |        | the best       |
   |                    |    |               |        | performance.   |
   | mdsthreshold       | 68 | mdsthreshold4 | READ   | Hint to client |
   |                    |    |               |        | as to when to  |
   |                    |    |               |        | write through  |
   |                    |    |               |        | the pNFS       |
   |                    |    |               |        | metadata       |
   |                    |    |               |        | server.        |
   | mimetype           | 32 | utf8<>        | R/W    | MIME body      |
   |                    |    |               |        | type/subtype   |
   |                    |    |               |        | of this        |
   |                    |    |               |        | object.        |
   | mode               | 33 | mode4         | R/W    | UNIX-style     |
   |                    |    |               |        | mode and       |
   |                    |    |               |        | permission     |
   |                    |    |               |        | bits for this  |
   |                    |    |               |        | object.        |
   | mounted_on_fileid  | 55 | uint64        | READ   | Like fileid,   |
   |                    |    |               |        | but if the     |
   |                    |    |               |        | target         |
   |                    |    |               |        | filehandle is  |
   |                    |    |               |        | the root of a  |
   |                    |    |               |        | file system,   |
   |                    |    |               |        | return the     |
   |                    |    |               |        | fileid of the  |
   |                    |    |               |        | underlying     |
   |                    |    |               |        | directory.     |
   | no_trunc           | 34 | bool          | READ   | True, if a     |
   |                    |    |               |        | name longer    |
   |                    |    |               |        | than name_max  |
   |                    |    |               |        | is used, an    |
   |                    |    |               |        | error will be  |
   |                    |    |               |        | returned and   |
   |                    |    |               |        | the name is    |
   |                    |    |               |        | not truncated. |
   | numlinks           | 35 | uint32        | READ   | Number of hard |
   |                    |    |               |        | links to this  |
   |                    |    |               |        | object.        |
   | owner              | 36 | utf8<>        | R/W    | The string     |
   |                    |    |               |        | name of the    |
   |                    |    |               |        | owner of this  |
   |                    |    |               |        | object.        |
   | owner_group        | 37 | utf8<>        | R/W    | The string     |
   |                    |    |               |        | name of the    |
   |                    |    |               |        | group          |
   |                    |    |               |        | ownership of   |
   |                    |    |               |        | this object.   |
   | quota_avail_hard   | 38 | uint64        | READ   | For definition |
   |                    |    |               |        | see "Quota     |
   |                    |    |               |        | Attributes"    |
   |                    |    |               |        | section below. |
   | quota_avail_soft   | 39 | uint64        | READ   | For definition |
   |                    |    |               |        | see "Quota     |
   |                    |    |               |        | Attributes"    |
   |                    |    |               |        | section below. |
   | quota_used         | 40 | uint64        | READ   | For definition |
   |                    |    |               |        | see "Quota     |
   |                    |    |               |        | Attributes"    |
   |                    |    |               |        | section below. |
   | rawdev             | 41 | specdata4     | READ   | Raw device     |
   |                    |    |               |        | identifier.    |
   |                    |    |               |        | UNIX device    |
   |                    |    |               |        | major/minor    |
   |                    |    |               |        | node           |
   |                    |    |               |        | information.   |
   |                    |    |               |        | If the value   |
   |                    |    |               |        | of type is not |
   |                    |    |               |        | NF4BLK or      |
   |                    |    |               |        | NF4CHR, the    |
   |                    |    |               |        | value returned |
   |                    |    |               |        | SHOULD NOT be  |
   |                    |    |               |        | considered     |
   |                    |    |               |        | useful.        |
   | recv_impl_id       | 59 | nfs_impl_id4  | READ   | Client obtains |
   |                    |    |               |        | server         |
   |                    |    |               |        | implementation |
   |                    |    |               |        | via GETATTR.   |
   | send_impl_id       | 58 | impl_ident4   | WRITE  | Client         |
   |                    |    |               |        | provides       |
   |                    |    |               |        | server with    |
   |                    |    |               |        | implementation |
   |                    |    |               |        | identity via   |
   |                    |    |               |        | SETATTR.       |
   | space_avail        | 42 | uint64        | READ   | Disk space in  |
   |                    |    |               |        | bytes          |
   |                    |    |               |        | available to   |
   |                    |    |               |        | this user on   |
   |                    |    |               |        | the file       |
   |                    |    |               |        | system         |
   |                    |    |               |        | containing     |
   |                    |    |               |        | this object -  |
   |                    |    |               |        | this should be |
   |                    |    |               |        | the smallest   |
   |                    |    |               |        | relevant       |
   |                    |    |               |        | limit.         |
   | space_free         | 43 | uint64        | READ   | Free disk      |
   |                    |    |               |        | space in bytes |
   |                    |    |               |        | on the file    |
   |                    |    |               |        | system         |
   |                    |    |               |        | containing     |
   |                    |    |               |        | this object -  |
   |                    |    |               |        | this should be |
   |                    |    |               |        | the smallest   |
   |                    |    |               |        | relevant       |
   |                    |    |               |        | limit.         |
   | space_total        | 44 | uint64        | READ   | Total disk     |
   |                    |    |               |        | space in bytes |
   |                    |    |               |        | on the file    |
   |                    |    |               |        | system         |
   |                    |    |               |        | containing     |
   |                    |    |               |        | this object.   |
   | space_used         | 45 | uint64        | READ   | Number of file |
   |                    |    |               |        | system bytes   |
   |                    |    |               |        | allocated to   |
   |                    |    |               |        | this object.   |
   | system             | 46 | bool          | R/W    | True, if this  |
   |                    |    |               |        | file is a      |
   |                    |    |               |        | "system" file  |
   |                    |    |               |        | with respect   |
   |                    |    |               |        | to the Windows |
   |                    |    |               |        | API.           |
   | time_access        | 47 | nfstime4      | READ   | The time of    |
   |                    |    |               |        | last access to |
   |                    |    |               |        | the object by  |
   |                    |    |               |        | a read that    |
   |                    |    |               |        | was satisfied  |
   |                    |    |               |        | by the server. |
   | time_access_set    | 48 | settime4      | WRITE  | Set the time   |
   |                    |    |               |        | of last access |
   |                    |    |               |        | to the object. |
   |                    |    |               |        | SETATTR use    |
   |                    |    |               |        | only.          |
   | time_backup        | 49 | nfstime4      | R/W    | The time of    |
   |                    |    |               |        | last backup of |
   |                    |    |               |        | the object.    |
   | time_create        | 50 | nfstime4      | R/W    | The time of    |
   |                    |    |               |        | creation of    |
   |                    |    |               |        | the object.    |
   |                    |    |               |        | This attribute |
   |                    |    |               |        | does not have  |
   |                    |    |               |        | any relation   |
   |                    |    |               |        | to the         |
   |                    |    |               |        | traditional    |
   |                    |    |               |        | UNIX file      |
   |                    |    |               |        | attribute      |
   |                    |    |               |        | "ctime" or     |
   |                    |    |               |        | "change time". |
   | time_delta         | 51 | nfstime4      | READ   | Smallest       |
   |                    |    |               |        | useful server  |
   |                    |    |               |        | time           |
   |                    |    |               |        | granularity.   |
   | time_metadata      | 52 | nfstime4      | READ   | The time of    |
   |                    |    |               |        | last meta-data |
   |                    |    |               |        | modification   |
   |                    |    |               |        | of the object. |
   | time_modify        | 53 | nfstime4      | READ   | The time of    |
   |                    |    |               |        | last           |
   |                    |    |               |        | modification   |
   |                    |    |               |        | to the object. |
   | time_modify_set    | 54 | settime4      | WRITE  | Set the time   |
   |                    |    |               |        | of last        |
   |                    |    |               |        | modification   |
   |                    |    |               |        | to the object. |
   |                    |    |               |        | SETATTR use    |
   |                    |    |               |        | only.          |
   +--------------------+----+---------------+--------+----------------+
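
   As an illustration of how the attribute numbers in the table above
   are used: in the base NFSv4 protocol, a GETATTR request names the
   attributes it wants through a bitmap4, an array of 32-bit words in
   which attribute n corresponds to bit (n mod 32) of word (n div 32).
   The Python sketch below builds such a bitmap for a few of the
   attributes listed above.  The ATTR_NUMBERS table and the
   attrs_to_bitmap() helper are hypothetical names chosen for this
   example and are not defined by the protocol.

```python
# Attribute numbers copied from the table above (subset only).
ATTR_NUMBERS = {
    "maxread": 30,
    "maxwrite": 31,
    "mimetype": 32,
    "mode": 33,
    "mdsthreshold": 68,
}


def attrs_to_bitmap(names):
    """Return a list of 32-bit words with one bit set per attribute."""
    words = []
    for name in names:
        word_index, bit = divmod(ATTR_NUMBERS[name], 32)
        # Grow the word array until it can hold this attribute's word.
        while len(words) <= word_index:
            words.append(0)
        words[word_index] |= 1 << bit
    return words


if __name__ == "__main__":
    # maxread lands in word 0, mode in word 1.
    print([hex(w) for w in attrs_to_bitmap(["maxread", "mode"])])
```

   Attributes numbered 64 and above, such as mdsthreshold, simply
   extend the bitmap to a third word.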

5.7.  Time Access

   As defined above, the time_access attribute represents the time of
   last access to the object by a read that was satisfied by the server.
   The notion of what is an "access" depends on the server's operating
   environment and/or the server's file system semantics.  For example,
   for servers obeying POSIX semantics, time_access would be updated
   only by the READLINK, READ, and READDIR operations and not any of the
   operations that modify the content of the object.  Of course, setting
   the corresponding time_access_set attribute is another way to modify
   the time_access attribute.

   Whenever the file object resides on a writable file system, the
   server should make best efforts to record time_access into stable
   storage.  However, to mitigate the performance effects of doing so,
   and most especially whenever the server is satisfying the read of the
   object's content from its cache, the server MAY cache access time
   updates and lazily write them to stable storage.  It is also
   acceptable to give administrators of the server the option to disable
   time_access updates.
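
   The lazy update behavior permitted above can be sketched as a small
   write-back cache.  This is only an illustration under stated
   assumptions, not a prescribed implementation: the AccessTimeCache
   class, its method names, and the use of a dictionary to stand in for
   stable storage are all hypothetical.

```python
import time


class AccessTimeCache:
    """Hypothetical write-back cache for time_access updates."""

    def __init__(self, stable_store, enabled=True):
        self.stable_store = stable_store  # dict standing in for stable storage
        self.enabled = enabled            # administrator switch for updates
        self.dirty = {}                   # fileid -> newest unflushed time

    def record_read(self, fileid, now=None):
        """Note a read satisfied by the server without touching storage."""
        if not self.enabled:
            return
        self.dirty[fileid] = time.time() if now is None else now

    def time_access(self, fileid):
        """Return the freshest known value, preferring unflushed updates."""
        if fileid in self.dirty:
            return self.dirty[fileid]
        return self.stable_store.get(fileid)

    def flush(self):
        """Lazily write cached updates to stable storage; return the count."""
        count = len(self.dirty)
        self.stable_store.update(self.dirty)
        self.dirty.clear()
        return count
```

   A server following this pattern would call flush() periodically or
   at shutdown, trading a bounded loss of access-time precision after a
   crash for fewer synchronous writes on the read path.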

5.8.  Interpreting owner and owner_group

   The recommended attributes "owner" and "owner_group" (and also users
   and groups within the "acl" attribute) are represented in terms of a
   UTF-8 string.  To avoid a representation that is tied to a particular
   underlying implementation at the client or server, the use of the
   UTF-8 string has been chosen.  Note that section 6.1 of RFC2624 [26]
   provides additional rationale.  It is expected that the client and
   server will have their own local representation of owner and
   owner_group that is used for local storage or presentation to the end
   user.  Therefore, it is expected that when these attributes are
   transferred between the client and server, the local representation
   is translated to a syntax of the form "user@dns_domain".  This allows
   a client and server that do not use the same local representation to
   translate to a common syntax that can be interpreted by both.

   Similarly, security principals may be represented in different ways
   by different security mechanisms.  Servers normally translate these
   representations into a common format, generally that used by local
   storage, to serve as a means of identifying the users corresponding
   to these security principals.  When these local identifiers are
   translated to the form of the owner attribute associated with files
   created by such principals, they identify, in a common format, the
   users associated with each corresponding set of security principals.
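
   As a minimal sketch of translating between a local representation
   and the "user@dns_domain" form, the Python fragment below uses an
   in-memory table.  The table contents, the domain "example.org", and
   the function names are assumptions made for illustration; the
   protocol does not specify how the translation is performed.

```python
# Hypothetical local table: numeric id -> user name.
LOCAL_USERS = {1001: "alice", 1002: "bob"}
# Hypothetical DNS domain served by this translation.
DNS_DOMAIN = "example.org"


def uid_to_owner(uid, users=LOCAL_USERS, domain=DNS_DOMAIN):
    """Translate a local numeric id to "user@dns_domain", or None."""
    name = users.get(uid)
    if name is None:
        return None  # no translation known for this id
    return f"{name}@{domain}"


def owner_to_uid(owner, users=LOCAL_USERS, domain=DNS_DOMAIN):
    """Translate "user@dns_domain" back to a local numeric id, or None."""
    name, sep, owner_domain = owner.rpartition("@")
    if not sep or owner_domain != domain:
        return None  # not in user@dns_domain form, or a foreign domain
    for uid, known_name in users.items():
        if known_name == name:
            return uid
    return None
```

   A None result here only signals that this particular table has no
   translation; what is then sent or returned is a policy decision
   outside the scope of the sketch.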

   The translation used to interpret owner and group strings is not
   specified as part of the protocol.  This allows various solutions to
   be employed.  For example, a local translation table may be consulted
   that maps between a numeric id and the user@dns_domain syntax.  A
   name service may also be used to accomplish the translation.  A
   server may provide a more general service, not limited by any
   particular translation (which would only translate a limited set of
   possible strings) by storing the owner and owner_group attributes in
   local storage without any translation or it may augment a translation
   method by storing the entire string for attributes for which no
   translation is available while using the local representation for
   those cases in which a translation is available.

   Servers that do not provide support for all possible values of the
   owner and owner_group attributes, should return an error
   (NFS4ERR_BADOWNER) when a string is presented that has no
   translation, as the value to be set for a SETATTR of the owner,
   owner_group, or acl attributes.  When a server does accept an owner
   or owner_group value as valid on a SETATTR (and similarly for the
   owner and group strings in an acl), it is promising to return that
   same string when a corresponding GETATTR is done.  Configuration
   changes and ill-constructed name translations (those that contain
   aliasing) may make that promise impossible to honor.  Servers should
   make appropriate efforts to avoid a situation in which these
   attributes have their values changed when no real change to ownership
   has occurred.

   The "dns_domain" portion of the owner string is meant to be a DNS
   domain name.  For example, user@ietf.org.  Servers should accept as
   valid a set of users for at least one domain.  A server may treat
   other domains as having no valid translations.  A more general
   service is provided when a server is capable of accepting users for
   multiple domains, or for all domains, subject to security
   constraints.

   In the case where there is no translation available to the client or
   server, the attribute value must be constructed without the "@".
   Therefore, the absence of the @ from the owner or owner_group
   attribute signifies that no translation was available at the sender
   and that the receiver of the attribute should not use that string as
   a basis for translation into its own internal format.  Even though
   the attribute value cannot be translated, it may still be useful.
   In the case of a client, the attribute string may be used for local
   display of ownership.

   To provide a greater degree of compatibility with previous versions
   of NFS (i.e. v2 and v3), which identified users and groups by 32-bit
   unsigned uid's and gid's, owner and group strings that consist of
   decimal numeric values with no leading zeros can be given a special
   interpretation by clients and servers which choose to provide such
   support.  The receiver may treat such a user or group string as
   representing the same user as would be represented by a v2/v3 uid or
   gid having the corresponding numeric value.  A server is not
   obligated to accept such a string, but may return an NFS4ERR_BADOWNER
   instead.  To avoid this mechanism being used to subvert user and
   group translation, so that a client might pass all of the owners and
   groups in numeric form, a server SHOULD return an NFS4ERR_BADOWNER
   error when there is a valid translation for the user or owner
   designated in this way.  In that case, the client must use the
   appropriate name@domain string and not the special form for
   compatibility.

   The owner string "nobody" may be used to designate an anonymous user,
   which will be associated with a file created by a security principal
   that cannot be mapped through normal means to the owner attribute.

5.9.  Character Case Attributes

   With respect to the case_insensitive and case_preserving attributes,
   each UCS-4 character (which UTF-8 encodes) has a "long descriptive
   name" RFC1345 [27] which may or may not include the word "CAPITAL"
   or "SMALL".  The presence of SMALL or CAPITAL allows an NFS server to
   implement unambiguous and efficient table driven mappings for case
   insensitive comparisons, and non-case-preserving storage.  For
   general character handling and internationalization issues, see the
   section "Internationalization".

5.10.  Quota Attributes

   For the attributes related to file system quotas, the following
   definitions apply:

   quota_avail_soft  The value in bytes which represents the amount of
      additional disk space that can be allocated to this file or
      directory before the user may reasonably be warned.  It is
      understood that this space may be consumed by allocations to other
      files or directories though there is a rule as to which other
      files or directories.

   quota_avail_hard  The value in bytes which represents the amount of
      additional disk space beyond the current allocation that can be
      allocated to this file or directory before further allocations
      will be refused.  It is understood that this space may be consumed
      by allocations to other files or directories.

   quota_used  The value in bytes which represents the amount of disk
      space used by this file or directory and possibly a number of
      other similar files or directories, where the set of "similar"
      meets at least the criterion that allocating space to any file or
      directory in the set will reduce the "quota_avail_hard" of every
      other file or directory in the set.

      Note that there may be a number of distinct but overlapping sets
      of files or directories for which a quota_used value is
      maintained.  E.g. "all files with a given owner", "all files with
      a given group owner", etc.  The server is at liberty to choose any
      of those sets but should do so in a repeatable way.  The rule may
      be configured per file system or may be "choose the set with the
      smallest quota".

5.11.  mounted_on_fileid

   UNIX-based operating environments connect a file system into the
   namespace by connecting (mounting) the file system onto the existing
   file object (the mount point, usually a directory) of an existing
   file system.  When the mount point's parent directory is read via an
   API like readdir(), the return results are directory entries, each
   with a component name and a fileid.  The fileid of the mount point's
   directory entry will be different from the fileid that the stat()
   system call returns.  The stat() system call is returning the fileid
   of the root of the mounted file system, whereas readdir() is
   returning the fileid stat() would have returned before any file
   systems were mounted on the mount point.

   Unlike NFS version 3, NFS version 4 allows a client's LOOKUP request
   to cross other file systems.  The client detects the file system
   crossing whenever the filehandle argument of LOOKUP has an fsid
   attribute different from that of the filehandle returned by LOOKUP.
   A UNIX-based client will consider this a "mount point crossing".
   UNIX has a legacy scheme for allowing a process to determine its
   current working directory.  This relies on readdir() of a mount
   point's parent and stat() of the mount point returning fileids as
   previously described.  The mounted_on_fileid attribute corresponds to
   the fileid that readdir() would have returned as described
   previously.

   While the NFS version 4 client could simply fabricate a fileid
   corresponding to what mounted_on_fileid provides (and if the server
   does not support mounted_on_fileid, the client has no choice), there
   is a risk that the client will generate a fileid that conflicts with
   one that is already assigned to another object in the file system.
   Instead, if the server can provide the mounted_on_fileid, the
   potential for client operational problems in this area is eliminated.

   If the server detects that there is no mount point at the target
   file object, then the value for mounted_on_fileid that it returns is
   the same as that of the fileid attribute.

   The mounted_on_fileid attribute is RECOMMENDED, so the server SHOULD
   provide it if possible, and for a UNIX-based server, this is
   straightforward.  Usually, mounted_on_fileid will be requested during
   a READDIR operation, in which case it is trivial (at least for UNIX-
   based servers) to return mounted_on_fileid since it is equal to the
   fileid of a directory entry returned by readdir().  If
   mounted_on_fileid is requested in a GETATTR operation, the server
   should obey an invariant that has it returning a value that is equal
   to the file object's entry in the object's parent directory, i.e.
   what readdir() would have returned.  Some operating environments
   allow a series of two or more file systems to be mounted onto a
   single mount point.  In this case, for the server to obey the
   aforementioned invariant, it will need to find the base mount point,
   and not the intermediate mount points.

5.12.  send_impl_id and recv_impl_id

   These recommended attributes are used to identify the client and
   server.  In the case of the send_impl_id attribute, the client sends
   its clientid4 value along with the nfs_impl_id4.  The use of the
   clientid4 value allows the server to identify and match specific

   first reply.  Two, there are currently no critical RDMA parameters to
   set in the endpoint at the server side of the connection.  RDMA Read
   resources, which are in general not settable after entering RDMA
   mode, are set only at the client - the originator of the connection.
   Therefore as long as the RDMA provider supports an automatic RDMA
   connection mode, no further support is required from the NFSv4.1
   protocol for reconnection.

   Note, the client must provide at least as many RDMA Read resources to
   its local queue for the benefit of the server when reconnecting, as
   it used when negotiating the session.  If this value is no longer
   appropriate, the client should resynchronize its session state,
   destroy the existing session, and start over with the more
   appropriate values.

6.3.3.  Automatic RDMA Connection Model

   The following is a schematic diagram of the NFSv4.1 protocol
   exchanges performed on an RDMA connection.

             Client                                Server
       RDMAmode :                  :                  : RDMAmode
                :                  :                  :
       Prepost  :                  :                  : Prepost
       receive  :                  :                  : receive
                :                                     :
                :   Create Clientid(nfs_client_id4)   :
                :   ------------------------------>   :
                :                                     : Prepost
                :     Clientid reply(clientid, ...)   : receive
                :   <------------------------------   :
       Prepost  :                                     :
       receive  :   Create Session(clientid, size S,  :
                :      maxreq N, RDMA ...)            :
                :   ------------------------------>   :
                :                                     : Prepost <=N'
                :   Session reply(sessionid, size S', :     receives of
                :      maxreq N')                     :     size S'
                :   <------------------------------   :
                :                                     :
                :          <normal operation>         :
                :   ------------------------------>   :
                :   <------------------------------   :
                :                  :                  :

6.4.  Buffer Management, Transfer, Flow Control

   Inline operations in NFSv4.1 behave effectively the same as TCP
   sends.  Procedure results are passed in a single message, and its
   completion at the client signals the receiving process to inspect the
   message.

   RDMA operations are performed solely by the server in NFSv4.1, as
   described in Section 6.2.5 RDMA Direct Transfer Model.  Since server
   RDMA operations do not result in a completion at the client, and due
   to ordering rules in RDMA transports, after all required RDMA
   operations are complete, a Send (Send with Solicited Event for iWARP)
   containing the procedure results is performed from server to client.
   This Send operation will result in a completion which will signal the
   client to inspect the message.

   In the case of client read-type NFSv4 operations, the server will
   have issued RDMA Writes to transfer the resulting data into client-
   advertised buffers.  The subsequent Send operation performs two
   necessary functions: finalizing any active or pending DMA at the
   client, and signaling the client to inspect the message.

   In the case of client write-type NFSv4 operations, the server will
   have issued RDMA Reads to fetch the data from the client-advertised
   buffers.  No data consistency issues arise at the client, but the
   completion of the transfer must be acknowledged, again by a Send from
   server to client.

   In either case, the client advertises buffers for direct (RDMA style)
   operations.  The client may desire certain advertisement limits, and
   may wish the server to perform remote invalidation on its behalf when
   the server has completed its RDMA.  This may be considered in a
   future version of this draft.

   In the absence of remote invalidation, the client may perform its
   own, local invalidation after the operation completes.  This
   invalidation should occur prior to any RPCSEC GSS integrity checking,
   since a validly remotely accessible buffer can possibly be modified
   by the peer.  However, after invalidation and once the contents have
   been integrity checked, the contents are locally secure.

   Credit updates over RDMA transports are supported at the RPC layer as
   described in [RPCRDMA].  In each request, the client requests a
   desired number of credits to be made available to the connection on
   which it sends the request.  The client must not send more requests
   than the number which the server has previously advertised, or in the
   case of the first request, only one.  If the client exceeds its
   credit limit, the connection may close with a fatal RDMA error.

   The server then executes the request, and replies with an updated
   credit count accompanying its results.  Since replies are sequenced
   by their RDMA Send order, the most recent results always reflect the
   server's limit.  In this way the client will always know the maximum
   number of requests it may safely post.

   Because the client requests an arbitrary credit count in each
   request, it is relatively easy for the client to request more, or
   fewer, credits to match its expected need.  A client that discovered
   itself frequently queuing outgoing requests due to lack of server
   credits might increase its requested credits proportionately in
   response.  Or, a client might have a simple, configurable number.
   The protocol also provides a per-operation "maxslot" exchange to
   assist in dynamic adjustment at the session level, described in a
   later section.

   Occasionally, a server may wish to reduce the total number of credits
   it offers a certain client on a connection.  This could be
   encountered if a client were found to be consuming its credits
   slowly, or not at all.  A client might notice this itself, and reduce
   its requested credits in advance, for instance requesting only the
   count of operations it currently has queued, plus a few as a base for
   starting up again.  Such mechanisms can, however, be potentially
   complicated and are implementation-defined.  The protocol does not
   require them.

   Because of the way in which RDMA fabrics function, it is not possible
   for the server (or client back channel) to cancel outstanding receive
   operations.  Therefore, effectively only one credit can be withdrawn
   per receive completion.  The server (or client back channel) would
   simply not replenish a receive operation when replying.  The server
   can still reduce the available credit advertisement in its replies to
   the target value it desires, as a hint to the client that its credit
   target is lower and it should expect it to be reduced accordingly.
   Of course, even if the server could cancel outstanding receives, it
   cannot do so, since the client may have already sent requests in
   expectation of the previous limit.

   This brings out an interesting scenario similar to that of client
   reconnect discussed in Section 6.3.  How does the server reduce the
   credits of an inactive client?

   One approach is for the server to simply close such a connection and
   require the client to reconnect at a new credit limit.  This is
   acceptable, if inefficient, when the connection setup time is short
   and where the server supports persistent session semantics.

   A better approach is to provide a back channel request to return the
   operations channel credits.  The server may request the client to
   return some number of credits; the client must comply by performing
   operations on the operations channel, provided of course that the
   request does not drop the client's credit count to zero (in which
   case the connection would deadlock).  If the client finds that it has
   no requests with which to consume the credits it was previously
   granted, it must send zero-length Send RDMA operations, or NULL NFSv4
   operations in order to return the resources to the server.  If the
   client fails to comply in a timely fashion, the server can recover
   the resources by breaking the connection.

   While in principle, the back channel credits could be subject to a
   similar resource adjustment, in practice this is not an issue, since
   the back channel is used purely for control and is expected to be
   statically provisioned.

   It is important to note that in addition to maximum request counts,
   the sizes of buffers are negotiated per-session.  This permits the
   most efficient allocation of resources on both peers.  There is an
   important requirement on reconnection: the sizes posted by the server
   at reconnect must be at least as large as previously used, to allow
   recovery.  Any replies that are replayed from the server's duplicate
   request cache must be able to be received into client buffers.  In
   the case where a client has received replies to all its retried
   requests (and therefore received all its expected responses), then
   the client may disconnect and reconnect with different buffers at
   will, since no cache replay will be required.

6.5.  Retry and Replay

   NFSv4.0 forbids retransmission on active connections over reliable
   transports; this includes connected-mode RDMA.  This restriction must
   be maintained in NFSv4.1.

   If one peer were to retransmit a request (or reply), it would consume
   an additional credit on the other.  If the server retransmitted a
   reply, it would certainly result in an RDMA connection loss, since
   the client would typically only post a single receive buffer for each
   request.  If the client retransmitted a request, the additional
   credit consumed on the server might lead to RDMA connection failure
   unless the client accounted for it and decreased its available
   credit, leading to wasted resources.

   RDMA credits present a new issue to the duplicate request cache in
   NFSv4.1.  The request cache may be used when a connection within a
   session is lost, such as after the client reconnects.  Credit
   information is a dynamic property of the connection, and stale values
   must not be replayed from the cache.  This implies that the request
   cache contents must not be blindly used when replies are issued from
   it, and credit information appropriate to the channel must be
   refreshed by the RPC layer.

   Finally, RDMA fabrics do not guarantee that the memory handles
   (Steering Tags) within each rdma three-tuple are valid on a scope
   outside that of a single connection.  Therefore, handles used by the
   direct operations become invalid after connection loss.  The server
   must ensure that any RDMA operations which must be replayed from the
   request cache use the newly provided handle(s) from the
   client interaction.  In most recent
   request.

6.6.  The Back Channel

   The NFSv4 callback operations present a significant resource problem
   for the case of RDMA enabled client.  Clearly, callbacks must be negotiated
   in the recv_impl_id attribute, way credits are for the ordinary operations channel for
   requests flowing from client receives the nfs_impl_id4 value.

   Access to this identification information can be most useful at both
   client and server.  Being able to identify specific implementations
   can help in planning by administrators or implementers.  For example,
   diagnostic software may extract this information in an attempt  But, for callbacks to
   identify implementation problems, performance workload behaviors or
   general usage statistics.  Since arrive
   on the intent of having access same RDMA endpoint as operation replies would require
   dedicating additional resources, and specialized demultiplexing and
   event handling.  Or, callbacks may not require RDMA sevice at all
   (they do not normally carry substantial data payloads).  It is highly
   desirable to streamline this
   information critical path via a second
   communications channel.

   The session callback channel binding facility is designed for planning or general diagnosis only, the client and
   server MUST NOT interpret this implementation identity information in exactly
   such a way that affects interoperational behavior of situation, by dynamically associating a new connected endpoint
   with the implementation. session, and separately negotiating sizes and counts for
   active callback channel operations.  The reason binding operation is
   firewall-friendly since it does not require the if server to initiate
   the connection.

   This same method serves as well for ordinary TCP connection mode.  It
   is expected that all NFSv4.1 clients and servers did such a thing, they might may make use fewer capabilities of the protocol than session
   facility to streamline their design.

   The back channel functions exactly the peer can support, or same as the client and server might refuse operations channel
   except that no RDMA operations are required to interoperate.

   Because it is likely some implementations will violate perform transfers,
   instead the protocol
   specification sizes are required to be sufficiently large to carry all
   data inline, and interpret the identity information, implementations
   MUST allow the users of course the NFSv4 client and server reverse their roles
   with respect to set the
   contents which is in control of credit management.  The same
   rules apply for all transfers, with the sent nfs_impl_id structure server being required to any value.

   Even though these attributes are recommended, if flow
   control its callback requests.

   The back channel is optional.  If not bound on a given session, the
   server supports
   one of them it MUST support must not issue callback operations to the other.

5.13.  fs_layout_type client.  This attribute applies to in
   turn implies that such a file system and indicates what layout
   types are supported by client must never put itself in the file system.  We expect this attribute
   situation where the server will need to
   be queried when a do so, lest the client encounters lose
   its connection by force, or its operation be incorrect.  For the same
   reason, if a new fsid.  This attribute back channel is
   used by bound, the client is subject to determine
   revocation of its delegations if it has applicable layout drivers.

5.14.  layout_type

   This attribute indicates the particular layout type(s) used for a
   file.  This back channel is lost.  Any
   connection loss should be corrected by the client as soon as
   possible.

   This can be convenient for informational purposes only.  The the NFSv4.1 client; if the client needs expects
   to make no use the LAYOUTGET operation in order to get enough information (e.g.,
   specific device information) in order of back channel facilities such as delegations, then
   there is no need to perform I/O.

5.15.  layout_hint create it.  This attribute may save significant resources
   and complexity at the client.

   For these reasons, if the client wishes to use the back channel, that
   channel must be set bound first, before using the operations channel.  In
   this way, the server will not find itself in a position where it will
   send callbacks on newly created files to influence the
   metadata server's choice for operations channel when the file's layout.  It client is suggested that
   this attribute not
   prepared for them.

   [[Comment.4: [XXX - do we want to support this?]]]  There is set as one of the initial attributes within
   special case, that where the
   OPEN call.  The metadata server may ignore this attribute.  This
   attribute back channel is a sub-set of bound in fact to the layout structure returned by LAYOUTGET.
   For example, instead of specifying particular devices, this
   operations channel's connection.  This configuration would be used
   normally over a TCP stream connection to suggest exactly implement the stripe width
   NFSv4.0 behavior, but over RDMA would require complex resource and
   event management at both sides of a file.  It is up to the connection.  The server
   implementation is not
   required to determine which fields within the layout it uses.

5.16.  mdsthreshold

   This attribute acts as accept such a hint to the client to help it determine when bind request on an RDMA connection for this
   reason, though it is more efficient to issue read and write requests recommended.

6.7.  COMPOUND Sizing Issues

   Very large responses may pose duplicate request cache issues.  Since
   servers will want to bound the metadata
   server vs. storage required for such a cache, the dataserver.  Two types of thresholds are described:
   file size thresholds and I/O
   unlimited size thresholds. of response data in COMPOUND may be troublesome.  If a file's size
   COMPOUND is
   smaller than used in all its generality, then the inclusion of certain
   non-idempotent operations within a single COMPOUND request may render
   the entire request non-idempotent.  (For example, a single COMPOUND
   request which read a file size threshold, data accesses should or symbolic link, then removed it, would be issued
   obliged to cache the metadata server.  If an I/O data in order to allow identical replay).
   Therefore, many requests might include operations that return any
   amount of data.

   It is below not satisfactory for the I/O size threshold, server to reject COMPOUNDs at will
   with NFS4ERR_RESOURCE when they pose such difficulties for the I/O should
   server, as this results in serious interoperability problems.
   Instead, any such limits must be issued to explicitly exposed as attributes of
   the metadata server.  Each threshold session, ensuring that the server can
   be specified independently explicitly support any
   duplicate request cache needs at all times.

6.8.  Data Alignment

   A negotiated data alignment enables certain scatter/gather
   optimizations.  A facility for read this is supported by [RPCRDMA].  Where
   NFS file data is the payload, specific optimizations become highly
   attractive.

   Header padding is requested by each peer at session initiation, and write requests.  For either
   threshold type, a value of 0 indicates no read or write should
   may be
   issued to zero (no padding).  Padding leverages the metadata server, while a value useful property that
   RDMA receives preserve alignment of all 1s indicates all
   reads or data, even when they are placed
   into anonymous (untagged) buffers.  If requested, client inline
   writes should be issued will insert appropriate pad bytes within the request header to
   align the metadata server. data payload on the specified boundary.  The attribute client is available on a per filehandle basis.  If
   encouraged to be optimistic and simply pad all WRITEs within the current
   filehandle refers RPC
   layer to a non-pNFS file or directory, the metadata
   server should return an attribute negotiated size, in the expectation that is representative of the
   filehandle's file system. server can
   use them efficiently.
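
   The pad calculation described above can be sketched as follows.  This
   is a minimal illustration, not part of the protocol: the function
   names header_pad_bytes and build_inline_write are hypothetical, the
   pad bytes are assumed to be zero-filled, and an alignment of zero is
   taken to mean "no padding", as stated above.

```python
def header_pad_bytes(header_len: int, alignment: int) -> int:
    """Return the number of pad bytes needed after a header of
    header_len bytes so that the payload starts on a multiple of
    alignment.  An alignment of 0 (or 1) means no padding."""
    if alignment <= 1:
        return 0
    # Python's % always yields a non-negative result for a positive
    # modulus, so this is the distance up to the next boundary.
    return (-header_len) % alignment


def build_inline_write(header: bytes, payload: bytes, alignment: int) -> bytes:
    """Assemble header + pad + payload (hypothetical message layout,
    assumed here only to show where the pad bytes go)."""
    pad = header_pad_bytes(len(header), alignment)
    return header + b"\x00" * pad + payload


# Example: a 100-byte header with a negotiated 64-byte alignment
# needs 28 pad bytes, so the payload begins at offset 128.
message = build_inline_write(b"H" * 100, b"DATA", 64)
payload_offset = message.index(b"DATA")
```

   Because the payload offset is a multiple of the negotiated alignment,
   a receiver that posts suitably sized buffers can have the data land
   directly at the start of a buffer.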

   It is highly recommended that clients offer to pad headers to an
   appropriate size.  Most servers can make good use of such padding,
   which allows them to chain receive buffers in such a way that any
   data carried by client requests will be placed into appropriate
   buffers at the server, ready for file system processing.  The
   receiver's RPC layer encounters no overhead from skipping over pad
   bytes, and the RDMA layer's high performance makes the insertion and
   transmission of padding on the sender a significant optimization.  In
   this way, the need for servers to perform RDMA Read to satisfy all
   but the largest client writes is obviated.  An added benefit is the
   reduction of message roundtrips on the network - a potentially good
   trade, where latency is present.

   It is suggested that the mdsthreshold attribute be queried as part of
   the OPEN operation.  Due to dynamic system changes, the client should
   not assume that the attribute will remain constant for any specific
   time period, thus it should be periodically refreshed.

6.  Access Control Lists

   The NFS version 4 ACL attribute is an array of access control entries
   (ACEs).  Although, the client can read and write the ACL attribute,
   the server is responsible for using the ACL to perform access
   control.  The client can use the OPEN or ACCESS operations to check
   access without modifying or reading data or metadata.

   The NFS ACE attribute is defined as follows:

            typedef uint32_t        acetype4;
            typedef uint32_t        aceflag4;
            typedef uint32_t        acemask4;

            struct nfsace4 {
                    acetype4        type;
                    aceflag4        flag;
                    acemask4        access_mask;
                    utf8str_mixed   who;
            };

   To determine if a request succeeds, the server processes each nfsace4
   entry in order.  Only ACEs which have a "who" that matches the
   requester are considered.  Each ACE is processed until all of the
   bits of the requester's access have been ALLOWED.  Once a bit (see
   below) has been ALLOWED by an ACCESS_ALLOWED_ACE, it is no longer
   considered in the processing of later ACEs.  If an ACCESS_DENIED_ACE
   is encountered where the requester's access still has unALLOWED bits
   in common with the "access_mask" of the ACE, the request is denied.
   However, unlike the ALLOWED and DENIED ACE types, the ALARM and AUDIT
   ACE types do not affect a requester's access, and instead are for
   triggering events as a result of a requester's access attempt.

   Therefore, all AUDIT and ALARM ACEs are processed until end of the
   ACL.  When the ACL is fully processed, if there are bits in the
   requester's mask that have not been ALLOWED or DENIED, access is
   denied.

   This is not intended to limit the ability of server implementations
   to implement alternative access policies.  For example:

   o  A server implementation might always grant ACE4_WRITE_ACL and
      ACE4_READ_ACL permissions.  This would prevent the user from
      getting into the situation where they can't ever modify the ACL.

   o  If a file system is mounted read only, then the server may deny
      ACE4_WRITE_DATA even though the ACL grants it.

   As mentioned before, this is one of the reasons that client
   implementations are not recommended to do their own access checks
   based on their interpretation of the ACL, but rather use the OPEN and
   ACCESS to do access checks.  This allows the client to act on the
   results of having the server determine whether or not access should
   be granted based on its interpretation of the ACL.

   Clients must be aware of situations in which an object's ACL will
   define a certain access even though the server will not enforce it.
   In general, but especially in these situations, the client needs to
   do its part in the enforcement of access as defined by the ACL.  To
   do this, the client may issue the appropriate ACCESS operation prior
   to servicing the request of the user or application in order to
   determine whether the user or application should be granted the
   access requested.

   Some situations in which the ACL may define accesses that the server
   doesn't enforce:

   o  All servers will allow a user the ability to read the data of the
      file when only the execute permission is granted (i.e., if the ACL
      denies the user the ACE4_READ_DATA access and allows the user
      ACE4_EXECUTE, the server will allow the user to read the data of
      the file).

   o  Many servers have the notion of owner-override in which the owner
      of the object is allowed to override accesses that are denied by
      the ACL.

   The NFS version 4 ACL model is quite rich.  Some server platforms may
   provide access control functionality that goes beyond the UNIX-style
   mode attribute, but which is not as rich as the NFS ACL model.  So
   that users can take advantage of this more limited functionality, the
   server may indicate that it supports ACLs as long as it follows the
   guidelines for mapping between its ACL model and the NFS version 4
   ACL model.

   The value to choose for padding is subject to a number of criteria.
   A primary source of variable-length data in the RPC header is the
   authentication information, the form of which is client-determined,
   possibly in response to server specification.  The contents of
   COMPOUNDs, sizes of strings such as those passed to RENAME, etc. all
   go into the determination of a maximal NFSv4 request size and
   therefore minimal buffer size.  The client must select its offered
   value carefully, so as not to overburden the server, and vice versa.
   The payoff of an appropriate padding value is higher performance.

                    Sender gather:
        |RPC Request|Pad bytes|Length| -> |User data...|
        \------+---------------------/       \
                \                             \
                 \    Receiver scatter:        \-----------+- ...
            /-----+----------------\            \           \
            |RPC Request|Pad|Length|   ->  |FS buffer|->|FS buffer|->...

   In the above case, the server may recycle unused buffers to the next
   posted receive if unused by the actual received request, or may pass
   the now-complete buffers by reference for normal write processing.
   For a server which can make use of it, this removes any need for data
   copies of incoming data, without resorting to complicated end-to-end
   buffer advertisement and management.  This includes most kernel-based
   and integrated server designs, among many others.  The client may
   perform similar optimizations, if desired.

   Padding is negotiated by the session creation operation, and
   subsequently used by the RPC RDMA layer, as described in [RPCRDMA].

6.9.  NFSv4 Integration

   The following section discusses the integration of the session
   infrastructure into NFSv4.1.

6.9.1.  Minor Versioning

   Minor versioning of NFSv4 is relatively restrictive, and allows for
   tightly limited changes only.  In particular, it does not permit
   adding new "procedures" (it permits adding only new "operations").
   Interoperability concerns make it impossible to consider additional
   layering to be a minor revision.  This somewhat limits the changes
   that can be introduced when considering extensions.

   To support the duplicate request cache integrated with sessions and
   request control, it is desirable to tag each request with an
   identifier to be called a Slotid.  This identifier must be passed by
   NFSv4.1 when running atop any transport, including traditional TCP.
   Therefore it is not desirable to add the Slotid to a new RPC
   transport, even though such a transport is indicated for support of
   RDMA.  This specification and [RPCRDMA] do not specify such an
   approach.

   Instead, this specification conforms to the requirements of NFSv4
   minor versioning, through the use of a new operation within NFSv4
   COMPOUND procedures as detailed below.

   If sessions are in use for a given clientid, this same clientid
   cannot be used for non-session NFSv4 operation, including NFSv4.0.
   Because the server will have allocated session-specific state to the
   active clientid, it would be an unnecessary burden on the server
   implementor to support and account for additional, non-session
   traffic, in addition to being of no benefit.  Therefore this
   specification prohibits a single clientid from doing this.
   Nevertheless, employing a new clientid for such traffic is supported.

6.9.2.  Slot Identifiers and Server Duplicate Request Cache

   The presence of deterministic maximum request limits on a session
   enables in-progress requests to be assigned unique values with useful
   properties.

   The RPC layer provides a transaction ID (xid), which, while required
   to be unique, is not especially convenient for tracking requests.
   The transaction ID is only meaningful to the issuer (client); it
   cannot be interpreted at the server except to test for equality with
   previously issued requests.  Because RPC operations may be completed
   by the server in any order, many transaction IDs may be outstanding
   at any time.  The client may therefore perform a computationally
   expensive lookup operation in the process of demultiplexing each
   reply.

   In the specification, there is a limit to the number of active
   requests.  This immediately enables a convenient, computationally
   efficient index for each request which is designated as a Slot
   Identifier, or slotid.

   When the client issues a new request, it selects a slotid in the
   range 0..N-1, where N is the server's current "totalrequests" limit
   granted the client on the session over which the request is to be
   issued.  The slotid must be unused by any of the requests which the
   client has already active on the session.  "Unused" here means the
   client has no outstanding request for that slotid.  Because the slot
   id is always an integer in the range 0..N-1, client implementations
   can use the slotid from a server response to efficiently match
   responses with outstanding requests, such as, for example, by using
   the slotid to index into an outstanding request array.  This can be
   used to avoid expensive hashing and lookup functions in the
   performance-critical receive path.
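
   A client-side outstanding request array of the kind mentioned above
   might look like the following sketch.  The class and method names
   (SlotTable, acquire, complete) are hypothetical, and the sketch picks
   the lowest unused slotid, which is one valid way of choosing an
   unused slot in the range 0..N-1.

```python
class SlotTable:
    """Outstanding-request array indexed directly by slotid.

    total_requests plays the role of N, the "totalrequests" limit
    granted by the server for the session (assumed fixed here).
    """

    def __init__(self, total_requests: int):
        # pending[slotid] holds the in-flight request, or None if unused.
        self.pending = [None] * total_requests

    def acquire(self, request):
        """Assign the lowest unused slotid to request.

        Returns the slotid, or None when all N slots are in use and
        the caller must wait for a reply before issuing more work.
        """
        for slotid, entry in enumerate(self.pending):
            if entry is None:
                self.pending[slotid] = request
                return slotid
        return None

    def complete(self, slotid: int):
        """Match a reply to its request by plain array indexing (no
        hashing or searching), and mark the slot unused again."""
        request = self.pending[slotid]
        self.pending[slotid] = None
        return request


table = SlotTable(total_requests=2)
first = table.acquire("OPEN")      # uses slot 0
second = table.acquire("READ")     # uses slot 1
third = table.acquire("WRITE")     # no slot free yet
matched = table.complete(first)    # reply for slot 0 arrives
retry = table.acquire("WRITE")     # slot 0 is reused
```

   The reply path is a single index operation, which is the property the
   text above relies on to avoid XID lookups.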

   The situation is complicated by the fact that a server may have
   multiple modules that enforce ACLs.  For example, the enforcement for
   NFS version 4 access may be different from the enforcement for local
   access, and both may be different from the enforcement for access
   through other protocols such as SMB.  So it may be useful for a
   server to accept an ACL even if not all of its modules are able to
   support it.

   The guiding principle in all cases is that the server must not accept
   ACLs that appear to make the file more secure than it really is.

6.1.  ACE type

      Type         Description
      _____________________________________________________
      ALLOW        Explicitly grants the access defined in
                   acemask4 to the file or directory.

      DENY         Explicitly denies the access defined in
                   acemask4 to the file or directory.

      AUDIT        LOG (system dependent) any access
                   attempt to a file or directory which
                   uses any of the access methods specified
                   in acemask4.

      ALARM        Generate a system ALARM (system
                   dependent) when any access attempt is
                   made to a file or directory for the
                   access methods specified in acemask4.

   A server need not support all of the above ACE types.  The bitmask
   constants used to represent the above definitions within the
   aclsupport attribute are as follows:

         const ACL4_SUPPORT_ALLOW_ACL    = 0x00000001;
         const ACL4_SUPPORT_DENY_ACL     = 0x00000002;
         const ACL4_SUPPORT_AUDIT_ACL    = 0x00000004;
         const ACL4_SUPPORT_ALARM_ACL    = 0x00000008;

   The semantics of the "type" field follow the descriptions provided
   above.

   The constants used for the type field (acetype4) are as follows:

         const ACE4_ACCESS_ALLOWED_ACE_TYPE      = 0x00000000;
         const ACE4_ACCESS_DENIED_ACE_TYPE       = 0x00000001;
         const ACE4_SYSTEM_AUDIT_ACE_TYPE        = 0x00000002;
         const ACE4_SYSTEM_ALARM_ACE_TYPE        = 0x00000003;

   Clients should not attempt to set an ACE unless the server claims
   support for that ACE type.  If the server receives a request to set
   an ACE that it cannot store, it MUST reject the request with
   NFS4ERR_ATTRNOTSUPP.  If the server receives a request to set an ACE
   that it can store but cannot enforce, the server SHOULD reject the
   request with NFS4ERR_ATTRNOTSUPP.

   The sequenceid, which accompanies the slotid in each request, is
   important for a second, important check at the server: it must be
   able to be determined efficiently whether a request using a certain
   slotid is a retransmit or a new, never-before-seen request.  It is
   not feasible for the client to assert that it is retransmitting to
   implement this, because for any given request the client cannot know
   the server has seen it unless the server actually replies.  Of
   course, if the client has seen the server's reply, the client would
   not retransmit!

   The sequenceid must increase monotonically for each new transmit of a
   given slotid, and must remain unchanged for any retransmission.  The
   server must in turn compare each newly received request's sequenceid
   with the last one previously received for that slotid, to see if the
   new request is:

   o  A new request, in which the sequenceid is one greater than that
      previously seen in the slot (accounting for sequence wraparound).
      The server proceeds to execute the new request.

   o  A retransmitted request, in which the sequenceid is equal to that
      last seen in the slot.  Note that this request may be either
      complete, or in progress.  The server performs replay processing
      in these cases.

   o  A misordered duplicate, in which the sequenceid is less than
      (accounting for sequence wraparound) that previously seen in the
      slot.  The server MUST return NFS4ERR_SEQ_MISORDERED.

   o  A misordered new request, in which the sequenceid is two or more
      greater than (accounting for sequence wraparound) that previously
      seen in the slot.  Note that because the sequenceid must wrap
      around once it reaches 0xFFFFFFFF, a misordered new request and a
      misordered duplicate cannot be distinguished.  Thus, the server
      MUST return NFS4ERR_SEQ_MISORDERED.
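
   The per-slot comparison above reduces to a single modular
   subtraction.  The sketch below is illustrative only: the function
   name classify_sequenceid and the string return values are
   hypothetical, and it assumes a 32-bit sequenceid whose successor
   after 0xFFFFFFFF is 0.

```python
SEQ_MODULUS = 1 << 32  # assumption: sequenceid is an unsigned 32-bit value


def classify_sequenceid(last_seen: int, received: int) -> str:
    """Classify a request on one slot by comparing its sequenceid
    with the last sequenceid seen on that slot."""
    # Distance from last_seen forward to received, modulo 2**32, so
    # that 0xFFFFFFFF -> 0 counts as an increment of one.
    delta = (received - last_seen) % SEQ_MODULUS
    if delta == 1:
        return "NEW"        # execute the request
    if delta == 0:
        return "REPLAY"     # retransmission: perform replay processing
    # Any other distance is either "two or more ahead" or "behind";
    # with wraparound the two cannot be told apart, so both cases
    # map to the same error.
    return "NFS4ERR_SEQ_MISORDERED"


new_case = classify_sequenceid(5, 6)
replay_case = classify_sequenceid(5, 5)
wrap_case = classify_sequenceid(0xFFFFFFFF, 0)
behind_case = classify_sequenceid(5, 4)
ahead_case = classify_sequenceid(5, 7)
```

   Note that the last two cases yield the same result, which reflects
   the observation above that a misordered duplicate and a misordered
   new request are indistinguishable once wraparound is allowed.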

   Unlike the XID, the slotid is always within a specific range; this
   has two implications.  The first implication is that for a given
   session, the server need only cache the results of a limited number
   of COMPOUND requests.  The second implication derives from the first,
   which is that, unlike XID-indexed DRCs, the slotid DRC by its nature
   cannot be overflowed.  Through use of the sequenceid to identify
   retransmitted requests, it is notable that the server does not need
   to actually cache the request itself, reducing the storage
   requirements of the DRC further.  These new facilities make it
   practical to maintain all the required entries for an effective DRC.

   The slotid and sequenceid therefore take over the traditional role of
   the XID and port number in the server DRC implementation, and the
   session replaces the IP address.  This approach is considerably more
   portable and completely robust - it is not subject to the frequent
   reassignment of ports as clients reconnect over IP networks.  In
   addition, the RPC XID is not used in the reply cache, enhancing
   robustness of the cache in the face of any rapid reuse of XIDs by the
   client.  [[Comment.5: We need to discuss the requirements of the
   client for changing the XID.]].

   It is required to encode the slotid information into each request in
   a way that does not violate the minor versioning rules of the NFSv4.0
   specification.  This is accomplished here by encoding it in a control
   operation (SEQUENCE) within each NFSv4.1 COMPOUND and CB_COMPOUND
   procedure.  The operation easily piggybacks within existing messages.

   In general, the receipt of a new sequenced request arriving on any
   valid slot is an indication that the previous DRC contents of that
   slot may be discarded.  In order to further assist the server in slot
   management, the client is required to use the lowest available slot
   when issuing a new request.  In this way, the server may be able to
   retire additional entries.

   However, in the case where the server is actively adjusting its
   granted maximum request count to the client, it may not be able to
   use receipt of the slotid to retire cache entries.  The slotid used
   in an incoming request may not reflect the server's current idea of
   the client's session limit, because the request may have been sent
   from the client before the update was received.  Therefore, in the
   downward adjustment case, the server may have to retain a number of
   duplicate request cache entries at least as large as the old value,
   until operation sequencing rules allow it to infer that the client
   has seen its reply.

   The SEQUENCE (and CB_SEQUENCE) operation also carries a "maxslot"
   value which carries additional client slot usage information.  The
   client must always provide its highest-numbered outstanding slot
   value in the maxslot argument, and the server may reply with a new
   recognized value.  The client should in all cases provide the most
   conservative value possible, although it can be increased somewhat
   above the actual instantaneous usage to maintain some minimum or
   optimal level.  This provides a way for the client to yield unused
   request slots back to the server, which in turn can use the
   information to reallocate resources.  Obviously, maxslot can never be
   zero, or the session would deadlock.

   The server also provides a target maxslot value to the client, which
   is an indication to the client of the maxslot the server wishes the
   client to be using.  This permits the server to withdraw (or add)
   resources from a client that has been found to

   Example: suppose a server can enforce NFS ACLs for NFS access but
   cannot enforce ACLs for local access.  If arbitrary processes can run
   on the server, then the server SHOULD NOT indicate ACL support.  On
   the other hand, if only trusted administrative programs run locally,
   then the server may indicate ACL support.

6.2.  ACE Access Mask

   The access_mask field contains values based on the following:

      ACE4_READ_DATA
          Operation(s) affected:
              READ
              OPEN
          Discussion:
              Permission to read the data of the file.

              Servers SHOULD allow a user the ability to read the data
              of the file when only the ACE4_EXECUTE access mask bit is
              allowed.

      ACE4_LIST_DIRECTORY
          Operation(s) affected:
              READDIR
          Discussion:
              Permission to list the contents of a directory.

      ACE4_WRITE_DATA
          Operation(s) affected:
              WRITE
              OPEN
          Discussion:
              Permission to modify a file's data anywhere in the file's
              offset range.  This includes the ability to write to any
              arbitrary offset and as a result to grow the file.

      ACE4_ADD_FILE
          Operation(s) affected:
              CREATE
              OPEN
          Discussion:
              Permission to add a new file in a directory.  The CREATE
              operation is affected when nfs_ftype4 is NF4LNK, NF4BLK,
              NF4CHR, NF4SOCK, or NF4FIFO. (NF4DIR is not listed because
              it is covered by ACE4_ADD_SUBDIRECTORY.) OPEN is affected
              when used to create a regular file.

      ACE4_APPEND_DATA
          Operation(s) affected:
              WRITE
              OPEN
          Discussion:
              The ability to modify a file's data, but only starting at
              EOF.  This allows for the notion of append-only files, by
              allowing ACE4_APPEND_DATA and denying ACE4_WRITE_DATA to
              the same user or group.  If a file has an ACL such as the
              one described above and a WRITE request is made for
              somewhere other than EOF, the server SHOULD return
              NFS4ERR_ACCESS.

      ACE4_ADD_SUBDIRECTORY
          Operation(s) affected:
              CREATE
          Discussion:
              Permission to create a subdirectory in a directory.  The
              CREATE operation is affected when nfs_ftype4 is NF4DIR.

      ACE4_READ_NAMED_ATTRS
          Operation(s) affected:
              OPENATTR
          Discussion:
              Permission to read the named attributes of a file or to
              lookup the named attributes directory.  OPENATTR is
              affected when it is not used to create a named attribute
              directory.  This is when 1.) createdir is TRUE, but a
              named attribute directory already exists, or 2.) createdir
              is FALSE.

      ACE4_WRITE_NAMED_ATTRS
          Operation(s) affected:
              OPENATTR
          Discussion:
              Permission to write the named attributes of a file or
              to create a named attribute directory.  OPENATTR is
              affected when it is used to create a named attribute
              directory.  This is when createdir is TRUE and no named
              attribute directory exists.  The ability to check whether
              or not a named attribute directory exists depends on the
              ability to look it up, therefore, users also need the
              ACE4_READ_NAMED_ATTRS permission in order to create a
              named attribute directory.

      ACE4_EXECUTE
          Operation(s) affected:
              LOOKUP
              READ
              OPEN
          Discussion:
              Permission to execute a file or traverse/search a
              directory.

              Servers SHOULD allow a user the ability to read the data
              of the file when only the ACE4_EXECUTE access mask bit is
              allowed.  This is because there is no way to execute a
              file without reading the contents.  Though a server may
              treat ACE4_EXECUTE and ACE4_READ_DATA bits identically
              when deciding to permit a READ operation, it SHOULD still
              allow the two bits to be set independently in ACLs, and
              MUST distinguish between them when replying to ACCESS
              operations.  In particular, servers SHOULD NOT silently
              turn on one of the two bits when the other is set, as
              that would make it impossible for the client to correctly
              enforce the distinction between read and execute
              permissions.

                As an example, following a SETATTR of the following ACL:
                        nfsuser:ACE4_EXECUTE:ALLOW

                A subsequent GETATTR of ACL for that file SHOULD return:
                        nfsuser:ACE4_EXECUTE:ALLOW
                Rather than:
                        nfsuser:ACE4_EXECUTE/ACE4_READ_DATA:ALLOW

      ACE4_DELETE_CHILD
          Operation(s) affected:
              REMOVE
          Discussion:
              Permission to delete a file or directory within a
              directory.  See section "ACE4_DELETE vs.
              ACE4_DELETE_CHILD" for information on how these two access
              mask bits interact.

      ACE4_READ_ATTRIBUTES
          Operation(s) affected:
              GETATTR of file system object attributes
          Discussion:
              The ability to read basic attributes (non-ACLs) of a file.
              On a UNIX system, basic attributes can be thought of as
              the stat level attributes.  Allowing this access mask bit
              would mean the entity can execute "ls -l" and stat.

      ACE4_WRITE_ATTRIBUTES
          Operation(s) affected:
              SETATTR of time_access_set, time_backup,
              time_create, time_modify_set
          Discussion:
              Permission to change the times associated with a file
              or directory to an arbitrary value.  A user having
              ACE4_WRITE_DATA permission, but lacking
              ACE4_WRITE_ATTRIBUTES must not be allowed using them, in
   order to implicitly set
              the times associated with more fairly share resources among a file.

      ACE4_DELETE
          Operation(s) affected:
              REMOVE
          Discussion:
              Permission to delete varying level of demand
   from other clients.  The client must always comply with the file or directory.  See section
              "ACE4_DELETE vs. ACE4_DELETE_CHILD" for information server's
   value updates, since they indicate newly established hard limits on how
              these two
   the client's access mask bits interact.

      ACE4_READ_ACL
          Operation(s) affected:
              GETATTR of acl
          Discussion:
              Permission to read the ACL.

      ACE4_WRITE_ACL
          Operation(s) affected:
              SETATTR session resources.  However, because of acl and mode
          Discussion:
              Permission to write
   request pipelining, the acl and mode attributes.

      ACE4_WRITE_OWNER
          Operation(s) affected:
              SETATTR of owner and owner_group
          Discussion:
              Permission to write client may have active requests in flight
   reflecting prior values, therefore the owner and owner_group attributes.
              On UNIX systems, this is server must not immediately
   require the ability client to execute chown or
              chgrp.

      ACE4_SYNCHRONIZE
          Operation(s) affected:
              NONE
          Discussion:
              Permission comply.

   It is worthwhile to note that Sprite RPC [BW87] defined a "channel"
   which in some ways is similar to access file locally at the slotid defined here.  Sprite RPC
   used channels to implement parallel request processing and request/
   response cache retirement.
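
   The slot usage rules in this section (use the lowest available slot,
   report the highest-numbered outstanding slot, and honor the server's
   target value for new requests) can be sketched as follows.  This is
   a minimal illustration, not a definitive implementation: the class
   name SlotTable, its method names, the 0-based slot numbering, and the
   treatment of the limit as a slot count are all assumptions made for
   the example.

```python
class SlotTable:
    """Client-side view of one session's request slots (hypothetical helper)."""

    def __init__(self, limit):
        if limit < 1:
            raise ValueError("a session needs at least one usable slot")
        self.limit = limit      # assumed: number of slots currently allowed
        self.in_use = set()     # slotids with a request still outstanding

    def acquire(self):
        """Return the lowest free slotid, or None if all allowed slots are busy."""
        for slotid in range(self.limit):
            if slotid not in self.in_use:
                self.in_use.add(slotid)
                return slotid
        return None

    def release(self, slotid):
        """Mark a slot free once the reply to its request has been seen."""
        self.in_use.discard(slotid)

    def highest_outstanding(self):
        """Highest slotid with a request in flight, or None when idle."""
        return max(self.in_use) if self.in_use else None

    def apply_target(self, target):
        """Adopt the server's target for new requests.

        Requests already in flight on higher slots are left alone; they
        simply drain, which mirrors the pipelining caveat in the text.
        The limit is never allowed to drop below one slot.
        """
        self.limit = max(1, target)
```

   Because acquire() always scans from slot 0, unused high-numbered
   slots stay idle, which is what allows the other side to retire their
   cache entries.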

6.9.3.  Resolving server callback races with
              synchronized reads and writes.

   The bitmask constants used sessions

   It is possible for server callbacks to arrive at the access mask field are as follows:

      const ACE4_READ_DATA            = 0x00000001;
      const ACE4_LIST_DIRECTORY       = 0x00000001;
      const ACE4_WRITE_DATA           = 0x00000002;
      const ACE4_ADD_FILE             = 0x00000002;
      const ACE4_APPEND_DATA          = 0x00000004;
      const ACE4_ADD_SUBDIRECTORY     = 0x00000004;
      const ACE4_READ_NAMED_ATTRS     = 0x00000008;
      const ACE4_WRITE_NAMED_ATTRS    = 0x00000010;
      const ACE4_EXECUTE              = 0x00000020;
      const ACE4_DELETE_CHILD         = 0x00000040;
      const ACE4_READ_ATTRIBUTES      = 0x00000080;
      const ACE4_WRITE_ATTRIBUTES     = 0x00000100;
      const ACE4_DELETE               = 0x00010000;
      const ACE4_READ_ACL             = 0x00020000;
      const ACE4_WRITE_ACL            = 0x00040000;
      const ACE4_WRITE_OWNER          = 0x00080000;
      const ACE4_SYNCHRONIZE          = 0x00100000;

   Server implementations need not provide client before
   the granularity of control
   that is implied by this list of masks. reply from related forward channel operations.  For example, POSIX-based
   systems might not distinguish APPEND_DATA (the ability to append a
   client may have been granted a delegation to a
   file) from WRITE_DATA (the ability file it has opened,
   but the reply to modify existing contents); both
   masks would the OPEN (informing the client of the granting of
   the delegation) may be tied to a single "write" permission.  When such delayed in the network.  If a
   server returns attributes to conflicting
   operation arrives at the client, server, it would show both
   APPEND_DATA and WRITE_DATA if and only will recall the delegation using
   the callback channel, which may be on a different transport
   connection, perhaps even a different network.  In NFSv4.0, if the write permission is
   enabled.

   If
   callback request arrives before the related reply, the client may
   reply to the server with an error.

   The presence of a session between client and server receives alleviates this
   issue.  When a SETATTR request that it cannot accurately
   implement, it should error session is in place, each client request is uniquely
   identified by its { slotid, sequenceid } pair.  By the direction rules under
   which slot entries (duplicate request cache entries) are retired, the
   server has knowledge whether the client has "seen" each of more restricted
   access.  For example, suppose a the
   server's replies.  The server cannot distinguish overwriting
   data from appending new data, as described in can therefore provide sufficient
   information to the previous paragraph.
   If a client submits to allow it to disambiguate between an ACE where APPEND_DATA is set but WRITE_DATA is
   not (or vice versa),
   erroneous or conflicting callback and a race condition.

   For each client operation which might result in some sort of server
   callback, the server should reject "remember" the { slotid, sequenceid }
   pair of the client request with
   NFS4ERR_ATTRNOTSUPP.  Nonetheless, if until the ACE has type DENY, slotid retirement rules allow
   the server may silently turn on the other bit, so that both APPEND_DATA
   and WRITE_DATA are denied.
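
   As an illustration of the APPEND_DATA and WRITE_DATA handling
   described above, the following sketch shows how a server that has
   only a single "write" permission might treat the two bits.  The
   function names, the is_deny parameter, and the AttrNotSupp exception
   (standing in for an NFS4ERR_ATTRNOTSUPP reply) are hypothetical and
   chosen only for this example.

```python
ACE4_WRITE_DATA = 0x00000002
ACE4_APPEND_DATA = 0x00000004
WRITE_PAIR = ACE4_WRITE_DATA | ACE4_APPEND_DATA


class AttrNotSupp(Exception):
    """Stands in for an NFS4ERR_ATTRNOTSUPP reply (hypothetical name)."""


def store_mask(mask, is_deny):
    """Return the access mask a single-write-bit server would store.

    If exactly one of the two bits is set, a DENY ACE may be widened to
    cover both (erring toward more restricted access); any other ACE
    type is rejected because it cannot be represented accurately.
    """
    pair = mask & WRITE_PAIR
    if pair in (0, WRITE_PAIR):
        return mask                 # nothing to reconcile
    if is_deny:
        return mask | WRITE_PAIR    # silently deny both bits
    raise AttrNotSupp("APPEND_DATA and WRITE_DATA cannot be split")


def report_mask(mask, write_enabled):
    """Show both bits if and only if the single write permission is on."""
    if write_enabled:
        return mask | WRITE_PAIR
    return mask & ~WRITE_PAIR
```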

6.2.1.  ACE4_DELETE vs. ACE4_DELETE_CHILD

   Two access mask bits govern the ability to delete a file or directory
   object: ACE4_DELETE on determine that the object itself, and ACE4_DELETE_CHILD on client has, in fact, seen the object's parent directory.

   Many systems also consult
   server's reply.  Until the "sticky bit" (MODE4_SVTX) and write
   mode bit on time the parent directory when determining whether to allow a
   file to { slotid, sequenceid } request
   pair can be deleted.  The mode bit retired, any recalls of the associated object MUST carry
   an array of these referring identifiers (in the CB_SEQUENCE
   operation's arguments), for write corresponds to
   ACE4_WRITE_DATA, which is the same physical bit as ACE4_ADD_FILE.
   Therefore, ACE4_ADD_FILE can come into play when determining
   permission to delete.

   In benefit of the algorithm below, client.  After this
   time, it is not necessary for the strategy server to provide this information
   in related callbacks, since it is certain that ACE4_DELETE and
   ACE4_DELETE_CHILD take precedence over the sticky bit, and a race condition can
   no longer occur.

   The CB_SEQUENCE operation which begins each server callback carries a
   list of "referring" { slotid, sequenceid } tuples.  If the sticky
   bit takes precedence over client
   finds the "write" mode bits (reflected in
   ACE4_ADD_FILE).

   Server implementations SHOULD grant or deny permission request corresponding to delete
   based on the following algorithm.

          if ACE4_EXECUTE is denied by referring slotid and sequenceid
   to be currently outstanding (i.e. the parent directory ACL:
              deny delete
          else if ACE4_DELETE is allowed server's reply has not been
   seen by the target object ACL:
              allow delete
          else if ACE4_DELETE_CHILD is allowed by client), it can determine that the parent
          directory ACL:
              allow delete
          else if ACE4_DELETE_CHILD is denied by callback has raced the
          parent directory ACL:
              deny delete
          else if ACE4_ADD_FILE is allowed by
   reply, and act accordingly.

   The client must not simply wait forever for the parent directory ACL:
              if MODE4_SVTX expected server reply
   to arrive on any of the session's operations channels, because it is set
   possible that they will be delayed indefinitely.  However, it should
   wait for the parent directory: a period of time, and if the principal owns the parent directory OR time expires it can provide a
   more meaningful error such as NFS4ERR_DELAY.

   [[Comment.6: XXX ...  We need to consider the principal owns clients' options here,
   and describe them...  NFS4ERR_DELAY has been discussed as a legal
   reply to CB_RECALL?]]

   There are other scenarios under which callbacks may race replies,
   among them pNFS layout recalls, described in Section 17.3.5.3
   [[Comment.7: XXX fill in the target object OR
                      ACE4_WRITE_DATA blanks w/others, etc...]]
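
   The race check that the referring { slotid, sequenceid } list makes
   possible can be sketched as follows.  This is only an illustration
   under stated assumptions: the function names are hypothetical, the
   client is assumed to track its outstanding requests as a set of
   (slotid, sequenceid) pairs, and what the client does after detecting
   a race (waiting, or replying with an error) is left out because the
   text above treats it as an open question.

```python
def raced_pairs(referring, outstanding):
    """Return the referring (slotid, sequenceid) pairs still awaiting a reply.

    referring   -- pairs carried in the callback's CB_SEQUENCE arguments
    outstanding -- set of pairs the client has sent but not yet seen
                   replies for
    """
    return [pair for pair in referring if pair in outstanding]


def classify_callback(referring, outstanding):
    """Label a callback 'raced' if it refers to an unanswered request."""
    return "raced" if raced_pairs(referring, outstanding) else "process"
```

   For example, with requests (0, 7) and (3, 2) outstanding, a recall
   that refers to (3, 2) is classified as having raced the reply, while
   one that refers to (3, 1) is processed normally.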

6.9.4.  COMPOUND and CB_COMPOUND

   [[Comment.8: Noveck: This is allowed by the target
                      object ACL:
                          allow delete
                      else:
                          deny delete
              else:
                  allow delete
          else:
              deny delete
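
   The delete-permission algorithm given in this passage (from the
   ACE4_EXECUTE check on the parent directory through the final "deny
   delete") can be sketched in runnable form as shown below.  The
   representation is an assumption made for the example: each ACL is
   reduced to a dictionary mapping a mask name to "allow" or "deny"
   (absent meaning undetermined), and the function and parameter names
   are hypothetical.

```python
ALLOW, DENY = "allow", "deny"


def may_delete(parent_acl, target_acl, sticky, owns_parent, owns_target):
    """Return True if deletion is permitted under the algorithm in the text.

    parent_acl / target_acl -- dicts such as {"ACE4_DELETE": ALLOW}
    sticky                  -- MODE4_SVTX is set on the parent directory
    owns_parent/owns_target -- the principal owns that object
    """
    if parent_acl.get("ACE4_EXECUTE") == DENY:
        return False
    if target_acl.get("ACE4_DELETE") == ALLOW:
        return True
    if parent_acl.get("ACE4_DELETE_CHILD") == ALLOW:
        return True
    if parent_acl.get("ACE4_DELETE_CHILD") == DENY:
        return False
    if parent_acl.get("ACE4_ADD_FILE") == ALLOW:
        if sticky:
            # With the sticky bit set, ownership or write access to the
            # target is also required.
            return (owns_parent or owns_target
                    or target_acl.get("ACE4_WRITE_DATA") == ALLOW)
        return True
    return False
```

   The order of the tests reflects the stated strategy: ACE4_DELETE and
   ACE4_DELETE_CHILD take precedence over the sticky bit, which in turn
   takes precedence over the "write" mode bit (ACE4_ADD_FILE).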

6.3.  ACE flag

   The "flag" field contains values based on about the following descriptions.

   ACE4_FILE_INHERIT_ACE
      Can be placed on a directory and indicates twelfth time we say that this ACE
   is minor version.  The diagram makes sense if you are explaining
   which should be done somewhere, but this is supposedly explaining
   sessions.]]

   Support for per-operation control is added to each NFSv4 COMPOUNDs by
   placing such facilities into their own, new non-directory file created.

   ACE4_DIRECTORY_INHERIT_ACE
      Can be placed on a directory operation, and indicates that placing
   this ACE should be
      added to operation first in each COMPOUND under the new directory created.

   ACE4_INHERIT_ONLY_ACE
      Can be placed on a directory but does not NFSv4 minor
   protocol revision.  The contents of the operation would then apply to
   the directory,
      only to newly created files/directories as specified by entire COMPOUND.

   Recall that the above
      two flags.

   ACE4_NO_PROPAGATE_INHERIT_ACE
      Can be placed on a directory.  Normally when a new directory NFSv4 minor version number is
      created and an ACE exists on contained within the parent directory which is marked
      ACE4_DIRECTORY_INHERIT_ACE, two ACEs are placed on
   COMPOUND header, encoded prior to the COMPOUNDed operations.  By
   simply requiring that the new
      directory.  One for operation always be contained in NFSv4
   minor COMPOUNDs, the directory itself control protocol can piggyback perfectly with
   each request and one which is an
      inheritable ACE for newly created directories.  This flag tells response.

   In this way, the server to not place an ACE on NFSv4 Session Extensions may stay in compliance with
   the newly created directory
      which is inheritable by subdirectories minor versioning requirements specified in section 10 of the created directory.

   ACE4_SUCCESSFUL_ACCESS_ACE_FLAG

   ACE4_FAILED_ACCESS_ACE_FLAG
      The ACE4_SUCCESSFUL_ACCESS_ACE_FLAG (SUCCESS) and
      ACE4_FAILED_ACCESS_ACE_FLAG (FAILED) flag bits relate only RFC3530
   [2].

   Referring to
      ACE4_SYSTEM_AUDIT_ACE_TYPE (AUDIT) section 13.1 of RFC3530 [2], the specified session-
   enabled COMPOUND and ACE4_SYSTEM_ALARM_ACE_TYPE
      (ALARM) ACE types.  If during CB_COMPOUND have the processing of form:

      +-----+--------------+-----------+------------+-----------+----
      | tag | minorversion | numops    | control op | op + args | ...
      |     |   (== 1)     | (limited) |  + args    |           |
      +-----+--------------+-----------+------------+-----------+----

      and the file's ACL, reply's structure is:

      +------------+-----+--------+-------------------------------+--//
      |last status | tag | numres | status + control op + results |  //
      +------------+-----+--------+-------------------------------+--//
              //-----------------------+----
              // status + op + results | ...
              //-----------------------+----

   [[Comment.9: The artwork above doesn't mention callback_ident that is
   used for CB_COMPOUND.  We need to mention that for NFSv4.1,
   callback_ident is superfluous]] The single control operation,
   SEQUENCE, within each NFSv4.1 COMPOUND defines the server encounters an AUDIT or ALARM ACE context and
   operational session parameters which govern that matches COMPOUND request and
   reply.  Placing it first in the
      principal attempting COMPOUND encoding is required in
   order to allow its processing before other operations in the OPEN,
   COMPOUND.
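
   The placement rule stated above can be checked mechanically.  The
   sketch below is illustrative only: the function name and the list-
   of-names representation of a COMPOUND are assumptions, and it
   models nothing beyond the rule as worded in this passage (in
   particular, operations that establish or bind a session are not
   considered).

```python
def control_op_is_first(minorversion, ops, callback=False):
    """Check that a minor version 1 COMPOUND leads with its control op.

    ops      -- operation names in request order, e.g. ["SEQUENCE", "PUTFH"]
    callback -- True for CB_COMPOUND, where the control op is CB_SEQUENCE
    """
    if minorversion != 1:
        return True                      # rule applies to minor version 1 only
    control = "CB_SEQUENCE" if callback else "SEQUENCE"
    return len(ops) > 0 and ops[0] == control
```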

6.10.  Sessions Security Considerations

   NFSv4 minor version 1 retains all of the existing NFSv4 security; all
   security considerations present in NFSv4.0 apply to it equally.

   Security considerations of any underlying RDMA transport are
   additionally important, all the server notes that fact, and more so due to the
      presence, if any, emerging nature of
   such transports.  Examining these issues is outside the SUCCESS and FAILED flags encountered scope of this
   specification.

   When protecting a connection with RPCSEC_GSS, all data in each
   request and response (whether transferred inline or via RDMA)
   continues to receive this protection over RDMA fabrics [RPCRDMA].
   However when performing data transfers via RDMA, RPCSEC_GSS
   protection of the AUDIT or ALARM ACE.  Once data transfer portion works against the server completes efficiency
   which RDMA is typically employed to achieve.  This is because such
   data is normally managed solely by the ACL
      processing, RDMA fabric, and intentionally
   is not touched by software.  The means by which the share reservation processing, and local RPCSEC_GSS
   implementation is integrated with the OPEN
      call, it then notes if RDMA data protection facilities
   are outside the OPEN succeeded or failed. scope of this specification.

   If the OPEN
      succeeded, and if NFS client wishes to maintain full control over RPCSEC_GSS
   protection, it may still perform its transfer operations using either
   the SUCCESS flag was set for a matching AUDIT inline or
      ALARM, then the appropriate AUDIT RDMA transfer model, or ALARM event occurs.  If the
      OPEN failed, and if the FAILED flag was set for of course employ traditional
   TCP stream operation.  In the matching AUDIT
      or ALARM, then RDMA inline case, header padding is
   recommended to optimize behavior at the appropriate AUDIT or ALARM event occurs.
      Clearly either or both of server.  At the SUCCESS or FAILED can client, close
   attention should be set, but if
      neither is set, paid to the AUDIT or ALARM ACE is not useful.

      The previously described implementation of RPCSEC_GSS
   processing applies to minimize memory referencing and especially copying.

   The session callback channel binding improves security over that of
   provided by NFSv4 for the ACCESS
      operation as well. callback channel.  The difference being that "success" or
      "failure" does not mean whether ACCESS returns NFS4_OK or not.
      Success means whether ACCESS returns all requested connection is
   client-initiated, and supported
      bits.  Failure means whether ACCESS failed subject to return a bit that
      was requested the same firewall and supported.

   ACE4_IDENTIFIER_GROUP
      Indicates that routing checks
   as the "who" refers operations channel.  The connection cannot be hijacked by an
   attacker who connects to a GROUP the client port prior to the intended
   server.  The connection is set up by the client with its desired
   attributes, such as defined under UNIX optionally securing with IPsec or similar.  The
   binding is fully authenticated before being activated.

6.10.1.  Denial of Service via Unauthorized State Changes

   Under some conditions, NFSv4.0 is vulnerable to a GROUP ACCOUNT as defined under Windows.  Clients and servers
      must ignore the ACE4_IDENTIFIER_GROUP flag on ACEs denial of service
   issue with respect to its state management.

   The attack works via an unauthorized client faking an open_owner4, an
   open_owner/lock_owner pair, or stateid, combined with a who
      value equal seqid.  The
   operation is sent to one of the special identifiers outlined in section
      "ACE who". NFSv4 server.  The bitmask constants used for NFSv4 server accepts the flag field are
   state information, and as long as follows:

      const ACE4_FILE_INHERIT_ACE             = 0x00000001;
      const ACE4_DIRECTORY_INHERIT_ACE        = 0x00000002;
      const ACE4_NO_PROPAGATE_INHERIT_ACE     = 0x00000004;
      const ACE4_INHERIT_ONLY_ACE             = 0x00000008;
      const ACE4_SUCCESSFUL_ACCESS_ACE_FLAG   = 0x00000010;
      const ACE4_FAILED_ACCESS_ACE_FLAG       = 0x00000020;
      const ACE4_IDENTIFIER_GROUP             = 0x00000040;
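
   The SUCCESS and FAILED flag semantics for AUDIT and ALARM ACEs
   described earlier in this section reduce to a simple test, sketched
   below.  The function name and its boolean argument are hypothetical;
   the sketch assumes the caller has already matched the ACE to the
   principal and determined whether the operation succeeded.

```python
ACE4_SUCCESSFUL_ACCESS_ACE_FLAG = 0x00000010
ACE4_FAILED_ACCESS_ACE_FLAG = 0x00000020


def audit_event_fires(ace_flags, operation_succeeded):
    """Decide whether a matching AUDIT or ALARM ACE generates an event.

    An event occurs on success only if the SUCCESS flag is set, and on
    failure only if the FAILED flag is set.  With neither flag set the
    ACE never fires.
    """
    if operation_succeeded:
        return bool(ace_flags & ACE4_SUCCESSFUL_ACCESS_ACE_FLAG)
    return bool(ace_flags & ACE4_FAILED_ACCESS_ACE_FLAG)
```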

   A server need not support any of these flags.  If status code from the server supports
   flags that are similar to, but result of
   this operation is not exactly the same as, these flags,
   the implementation may define a mapping between the protocol-defined
   flags and the implementation-defined flags.  Again, NFS4ERR_STALE_CLIENTID, NFS4ERR_STALE_STATEID,
   NFS4ERR_BAD_STATEID, NFS4ERR_BAD_SEQID, NFS4ERR_BADXDR,
   NFS4ERR_RESOURCE, or NFS4ERR_NOFILEHANDLE, the guiding
   principle sequence number is that
   incremented.  When the file not appear to be more secure than it
   really is.

   For example, suppose a authorized client tries to set issues an ACE with
   ACE4_FILE_INHERIT_ACE set but not ACE4_DIRECTORY_INHERIT_ACE.  If the
   server does not support any form operation, it gets
   back NFS4ERR_BAD_SEQID, because its idea of ACL inheritance, the server
   should reject the request current sequence
   number is off by one.  The authorized client's recovery options are
   pretty limited, with NFS4ERR_ATTRNOTSUPP. SETCLIENTID, followed by complete reclaim of
   state, which may or may not succeed completely.  That qualifies as a
   denial of service attack.

   If the server
   supports a single "inherit ACE" flag that applies to both files client uses RPCSEC_GSS authentication and integrity, and every
   client maps each open_owner and lock_owner to one and only one
   principal, and
   directories, the server may reject the request (i.e., requiring enforces this binding, then the
   client conditions
   leading to vulnerability to set both the file denial of service do not exist.  One
   should keep in mind that if AUTH_SYS is being used, far simpler,
   easier denial of service and directory inheritance flags).  The
   server may also accept other attacks are possible.

   With NFSv4.1 sessions, the request and silently turn on per-operation sequence number is ignored
   (see Section 13.13) therefore the
   ACE4_DIRECTORY_INHERIT_ACE flag.

6.4.  ACE who

   There are several special identifiers ("who") which need NFSv4.0 denial of service
   vulnerability described above does not apply.  However as described
   to be
   understood universally, rather than this point in the context of a particular
   DNS domain.  Some of these identifiers cannot be understood when specification, an
   NFS client accesses attacker could forge the server, but have meaning when
   sessionid and issue a local process
   accesses SEQUENCE with a slot id that he expects the file.  The ability
   legitimate client to display and modify these
   permissions is permitted over NFS, even if none of the access methods
   on the server understands the identifiers.

      Who                    Description
      _______________________________________________________________
      "OWNER" use next.  The owner of legitimate client could then use
   the file.
      "GROUP"                The group associated slotid with the file.
      "EVERYONE"             The world, including the owner same sequence number, and
                             owning group.
      "INTERACTIVE"          Accessed from an interactive terminal.
      "NETWORK"              Accessed via the network.
      "DIALUP"               Accessed as a dialup user to server returns the server.
      "BATCH"                Accessed
   attacker's result from a batch job.
      "ANONYMOUS"            Accessed without any authentication.
      "AUTHENTICATED"        Any authenticated the replay cache, thereby disrupting the
   legitimate client.

   If we give each NFSv4.1 user (opposite of
                             ANONYMOUS)
      "SERVICE"              Access from a system service.

   To avoid conflict, these special identifiers are distinguished by an
   appended "@" their own session, and should appear in each user uses
   RPCSEC_GSS authentication and integrity, then the form "xxxx@" (note: no domain
   name after denial of service
   issue is solved, at the "@").  For example: ANONYMOUS@.

6.4.1.  Discussion cost of EVERYONE@

   It additional per session state.  The
   alternative NFSv4.1 specifies is important described as follows.

   Transport connections MUST be bound to note that "EVERYONE@" is not equivalent to the
   UNIX "other" entity.  This is because, a session by definition, UNIX "other"
   does not include the owner or owning group of client.
   The server MUST return an error to an operation (other than the
   operation that binds the connection to the session) that uses an
   unbound connection.  As a file.  "EVERYONE@"
   means literally everyone, including simplification, the owner or owning group.

6.4.2.  Discussion of OWNER@ and GROUP@

   The ACL itself cannot be transport connection
   used by CREATE_SESSION is automatically bound to determine the owner and owning group session.
   Additional connections are bound to a session via a new operation,
   BIND_CONN_TO_SESSION.

   To prevent attackers from issuing BIND_CONN_TO_SESSION operations,
   the arguments to BIND_CONN_TO_SESSION include a digest of a file.  This information should be indicated by shared
   secret called the values of secret session verifier (SSV) that only the
   owner client
   and owner_group file attributes returned by the server.

6.5.  Mode Attribute server know.  The NFS version 4 mode attribute digest is based on created via a one-way, collision
   resistant hash function, making it intractable for the UNIX mode bits. attacker to
   forge.

   The
   following bits are defined:

         const MODE4_SUID = 0x800;  /* set user id on execution */
         const MODE4_SGID = 0x400;  /* set group id on execution */
         const MODE4_SVTX = 0x200;  /* save text even after use */
         const MODE4_RUSR = 0x100;  /* read permission: owner */
         const MODE4_WUSR = 0x080;  /* write permission: owner */
         const MODE4_XUSR = 0x040;  /* execute permission: owner */
         const MODE4_RGRP = 0x020;  /* read permission: group */
         const MODE4_WGRP = 0x010;  /* write permission: group */
         const MODE4_XGRP = 0x008;  /* execute permission: group */
         const MODE4_ROTH = 0x004;  /* read permission: other */
         const MODE4_WOTH = 0x002;  /* write permission: other */
         const MODE4_XOTH = 0x001;  /* execute permission: other */
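
   As a small illustration of how the nine permission bits above map
   onto owner, group, and other, the following sketch renders a mode
   value in the familiar "rwx" form.  The function name mode_string is
   hypothetical, and the three upper bits (MODE4_SUID, MODE4_SGID,
   MODE4_SVTX) are deliberately ignored to keep the example short.

```python
MODE4_RUSR, MODE4_WUSR, MODE4_XUSR = 0x100, 0x080, 0x040
MODE4_RGRP, MODE4_WGRP, MODE4_XGRP = 0x020, 0x010, 0x008
MODE4_ROTH, MODE4_WOTH, MODE4_XOTH = 0x004, 0x002, 0x001

_PERMISSION_BITS = (
    (MODE4_RUSR, "r"), (MODE4_WUSR, "w"), (MODE4_XUSR, "x"),   # owner
    (MODE4_RGRP, "r"), (MODE4_WGRP, "w"), (MODE4_XGRP, "x"),   # group
    (MODE4_ROTH, "r"), (MODE4_WOTH, "w"), (MODE4_XOTH, "x"),   # other
)


def mode_string(mode):
    """Render the owner/group/other permission bits as nine characters."""
    return "".join(ch if mode & bit else "-" for bit, ch in _PERMISSION_BITS)
```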

   Bits MODE4_RUSR, MODE4_WUSR, and MODE4_XUSR apply SSV is sent to the principal
   identified in server via SET_SSV.  To prevent eavesdropping,
   a SET_SSV for the owner attribute.  Bits MODE4_RGRP, MODE4_WGRP, and
   MODE4_XGRP apply to SSV can be protected via RPCSEC_GSS with the principals identified in
   privacy service.  The SSV can be changed by the owner_group
   attribute.  Bits MODE4_ROTH, MODE4_WOTH, MODE4_XOTH apply to client at any
   principal that does not match that time,
   by any principal.  However several aspects of SSV changing prevent an
   attacker from engaging in the owner group, and does not
   have a group matching that successful denial of service attack:

   1.  A SET_SSV on the owner_group attribute.

   The remaining bits are SSV does not defined by this protocol and replace the SSV with the argument
        to SET_SSV.  Instead, the current SSV on the server is logically
       exclusive ORed (XORed) with the argument to SET_SSV.  SET_SSV
       MUST NOT be
   used. called with an SSV value that is zero.

   2.  The minor version mechanism must be used arguments to define further bit
   usage.

6.6.  Interaction Between Mode and ACL Attributes

   As defined, there is a certain amount results of overlap between ACL SET_SSV include digests of the
       old and mode
   file attributes.  Even though there new SSV, respectively.

   3.  Because the initial value of the SSV is overlap, ACLs don't contain
   all zero, therefore known,
       the information specified by client MUST issue at least one SET_SSV operation before the
       first BIND_CONN_TO_SESSION operation.  A client SHOULD issue
       SET_SSV as soon as a mode and modes can't possibly
   contain all session is created.
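
   The three SET_SSV properties listed above can be sketched as
   follows.  This is an illustration under explicit assumptions, not
   the protocol's definition: the 32-byte SSV length, the use of
   SHA-256 as the one-way hash, and the names SsvState, set_ssv, and
   digest are all choices made for the example; the text here does not
   specify them.

```python
import hashlib
import hmac

SSV_LEN = 32  # assumed length in bytes; not taken from the text


class SsvState:
    """Server-side SSV for one session (hypothetical helper)."""

    def __init__(self):
        self.value = bytes(SSV_LEN)          # the initial SSV is all zero

    def digest(self):
        """One-way digest of the current SSV (SHA-256 assumed)."""
        return hashlib.sha256(self.value).hexdigest()

    def set_ssv(self, arg, old_digest):
        """XOR arg into the SSV; return the digest of the new SSV.

        The caller proves knowledge of the current SSV by supplying its
        digest, and the argument must not be zero.
        """
        if len(arg) != SSV_LEN:
            raise ValueError("argument has the wrong length")
        if not any(arg):
            raise ValueError("SET_SSV argument must not be zero")
        if not hmac.compare_digest(old_digest, self.digest()):
            raise PermissionError("digest of the old SSV does not match")
        self.value = bytes(a ^ b for a, b in zip(self.value, arg))
        return self.digest()
```

   Because the argument is XORed into the existing value rather than
   replacing it, a caller who does not know the current SSV cannot
   predict the result of its own SET_SSV.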

   If a connection is disconnected, BIND_CONN_TO_SESSION is required to
   bind a connection to the information specified by an ACL.

   For servers session, even if the connection that support both mode and ACL, was
   disconnected was the mode's MODE4_R*,
   MODE4_W* and MODE4_X* values should be computed from one CREATE_SESSION was created with.

   If a client is assigned a machine principal then the ACL and
   should be recomputed upon each SETATTR of ACL.  Similarly, upon
   SETATTR of mode, client SHOULD
   use the ACL should be modified in order machine principal's RPCSEC_GSS context to allow privacy protect the
   mode computed
   SSV from eavesdropping during the ACL to be SET_SSV operation.  If a machine
   principal is not being used, then the same as client MAY use the mode given non-machine
   principal's RPCSEC_GSS context to
   SETATTR. privacy protect the SSV.  The mode computed from any given ACL should be
   deterministic.  This means that given an ACL,
   server MUST accept either type of principal.  A client SHOULD change
   the same mode will
   always be computed.

   For servers SSV each time a new principal uses the session.

   Here are the types of attacks that support ACL can be attempted by an attacker named
   Eve, and how the connection to session binding approach addresses
   each attack:

   o  If the Eve creates a connection after the legitimate client
      establishes an SSV via privacy protection from a machine
      principal's RPCSEC_GSS session, she does not mode, clients may handle
   applications which set know the SSV and get so
      cannot compute a digest that BIND_CONN_TO_SESSION will accept.
      Users on the mode legitimate client cannot be disrupted by creating Eve.

   o  If Eve first logs into the correct ACL
   to send legitimate client, and the client does
      not use machine principals, then Eve can cause an SSV to be
      created via the server and legitimate client's NFSv4.1 implementation,
      protected by computing the mode from RPCSEC_GSS context created by the ACL,
   respectively.  In this case, legitimate
      client (which uses Eve's GSS principal and credentials).  Eve can
      eavesdrop on the methods used by network, and because she knows her credentials,
      she can decrypt the server SSV.  Eve can compute a digest
      BIND_CONN_TO_SESSION will accept, and so bind a new connection to keep
   the mode in sync with
      the ACL session.  Eve can also be used by change the client.  These
   methods are explained slotid, sequence state, and/or
      the SSV state in Section 6.6.3, Section 6.6.1, and
   Section 6.6.2.

   Since such a way that when Bob accesses the mode can't possibly represent all of server via
      the information that legitimate client, the legitimate client will be unable to use
      the session.  The client's only recourse is defined by an ACL, there are some discrepancies to create a new
      session, which will cause any state Eve created on the legitimate
      client over the old (but hijacked) session to be aware of.
   As explained in lost.  This
      disrupts Eve, but because she is the section "Deficiencies in a Mode Representation of attacker, this is acceptable.
   ACL", the mode bits computed from the ACL could potentially convey
   more restrictive permissions than what would be granted via the ACL.
   Because of this, it is not recommended that clients do their own
   access checks based on the mode of a file.

   Because the mode attribute includes bits (i.e., MODE4_SUID,
   MODE4_SGID, MODE4_SVTX) that have nothing to do with ACL semantics,
   it is permitted for clients to specify both the ACL attribute and
   mode in the same SETATTR operation.  However, because there is no
   prescribed order for processing the attributes in a SETATTR, clients
   may see differing results.  For recommendations on how to achieve
   consistent behavior, see Section 6.6.4.

      Once the legitimate client establishes an SSV over the new session
      using Bob's RPCSEC_GSS context, Eve can use the new session via
      the legitimate client, but she cannot disrupt Bob.  Moreover,
      because the client SHOULD have modified the SSV due to Eve using
      the new session, Bob cannot get revenge on Eve by binding a rogue
      connection to the session.  The question is how does the
      legitimate client detect that Eve has hijacked the old session?
      When the client detects that a new principal, Bob, wants to use
      the session, it SHOULD have issued a SET_SSV.

      *  Let us suppose that from the rogue connection, Eve issued a
         SET_SSV with the same slotid and sequence that the legitimate
         client later uses.  The server will assume this is a replay,
         and return to the legitimate client the reply it sent Eve.
         However, unless Eve can correctly guess the SSV the legitimate
         client will use, the digest verification checks in the SET_SSV
         response will fail.  That is the clue to the client that the
         session has been hijacked.

      *  Alternatively, Eve issued a SET_SSV with a different slotid
         than the legitimate client uses for its SET_SSV.  Then the
         digest verification on the server fails, and the client is
         again clued that the session has been hijacked.

      *  Alternatively, Eve issued an operation other than SET_SSV, but
         with the same slotid and sequence that the legitimate client
         uses for its SET_SSV.  The server returns to the legitimate
         client the response it sent Eve.  The client sees that the
         response is not at all what it expects.  The client assumes
         either session hijacking or server bug, and either way destroys
         the old session.

   o  Eve binds a rogue connection to the session as above, and then
      destroys the session.  Again, Bob goes to use the server from the
      legitimate client.  The client has a very clear indication that
      its session was hijacked, and does not even have to destroy the
      old session before creating a new session, which Eve will be
      unable to hijack because it will be protected with an SSV created
      via Bob's RPCSEC_GSS protection.

   o  If Eve creates a connection before the legitimate client
      establishes an SSV, because the initial value of the SSV is zero
      and therefore known, Eve can issue a SET_SSV that will pass the
      digest verification check.  However, because the new connection
      has not been bound to the session, the SET_SSV is rejected for
      that reason.

   o  The connection to session binding model does not prevent
      connection hijacking.  However, if an attacker can perform
      connection hijacking, it can issue denial of service attacks that
      are less difficult than attacks based on forging sessions.

6.6.1.  Recomputing mode upon SETATTR of ACL

   Keeping the mode and ACL attributes synchronized is important, but as
   mentioned previously, the mode cannot possibly represent all of the
   information in the ACL.  Still, the mode should be modified to
   represent the access as accurately as possible.

   The general algorithm to assign a new mode attribute to an object
   based on a new ACL being set is:

   1.  Walk through the ACEs in order, looking for ACEs with a "who"
       value of OWNER@, GROUP@, or EVERYONE@.

   2.  It is understood that ACEs with a "who" value of OWNER@ affect
       the *USR bits of the mode, GROUP@ affect *GRP bits, and EVERYONE@
       affect *USR, *GRP, and *OTH bits.

   3.  If such an ACE specifies ALLOW or DENY for ACE4_READ_DATA,
       ACE4_WRITE_DATA, or ACE4_EXECUTE, and the mode bits affected have
       not been determined yet, set them to one (if ALLOW) or zero (if
       DENY).

   4.  Upon completion, any mode bits as yet undetermined have a value
       of zero.

   This pseudocode more precisely describes the algorithm:

          /* octal constants for the mode bits */

          RUSR = 0400
          WUSR = 0200
          XUSR = 0100
          RGRP = 0040
          WGRP = 0020
          XGRP = 0010
          ROTH = 0004
          WOTH = 0002
          XOTH = 0001

         /*
          * old_mode represents the previous value
          * of the mode of the object.
          */
          mode_t mode = 0, seen = 0;
          for each ACE a {
              if a.type is ALLOW or DENY and
              ACE4_INHERIT_ONLY_ACE is not set in a.flags {
                  if a.who is OWNER@ {
                      if ((a.mask & ACE4_READ_DATA) &&
                          (! (seen & RUSR))) {
                              seen |= RUSR;
                              if a.type is ALLOW {
                                  mode |= RUSR;
                              }
                      }
                      if ((a.mask & ACE4_WRITE_DATA) &&
                          (! (seen & WUSR))) {
                              seen |= WUSR;
                              if a.type is ALLOW {
                                  mode |= WUSR;
                              }
                      }
                      if ((a.mask & ACE4_EXECUTE) &&
                          (! (seen & XUSR))) {
                              seen |= XUSR;
                              if a.type is ALLOW {
                                  mode |= XUSR;
                              }
                      }
                  } else if a.who is GROUP@ {
                      if ((a.mask & ACE4_READ_DATA) &&
                          (! (seen & RGRP))) {
                              seen |= RGRP;
                              if a.type is ALLOW {
                                  mode |= RGRP;
                              }
                      }
                      if ((a.mask & ACE4_WRITE_DATA) &&
                          (! (seen & WGRP))) {
                              seen |= WGRP;
                              if a.type is ALLOW {
                                  mode |= WGRP;
                              }
                      }
                      if ((a.mask & ACE4_EXECUTE) &&
                          (! (seen & XGRP))) {
                              seen |= XGRP;
                              if a.type is ALLOW {
                                  mode |= XGRP;
                              }
                      }
                  } else if a.who is EVERYONE@ {
                      if (a.mask & ACE4_READ_DATA) {
                          if ! (seen & RUSR) {
                              seen |= RUSR;
                              if a.type is ALLOW {
                                  mode |= RUSR;
                              }
                          }
                          if ! (seen & RGRP) {
                              seen |= RGRP;
                              if a.type is ALLOW {
                                  mode |= RGRP;
                              }
                          }
                          if ! (seen & ROTH) {
                              seen |= ROTH;
                              if a.type is ALLOW {
                                  mode |= ROTH;
                              }
                          }
                      }
                      if (a.mask & ACE4_WRITE_DATA) {
                          if ! (seen & WUSR) {
                              seen |= WUSR;
                              if a.type is ALLOW {
                                  mode |= WUSR;
                              }
                          }
                          if ! (seen & WGRP) {
                              seen |= WGRP;
                              if a.type is ALLOW {
                                  mode |= WGRP;
                              }
                          }
                          if ! (seen & WOTH) {
                              seen |= WOTH;
                              if a.type is ALLOW {
                                  mode |= WOTH;
                              }
                          }
                      }
                      if (a.mask & ACE4_EXECUTE) {
                          if ! (seen & XUSR) {
                              seen |= XUSR;
                              if a.type is ALLOW {
                                  mode |= XUSR;
                              }
                          }
                          if ! (seen & XGRP) {
                              seen |= XGRP;
                              if a.type is ALLOW {
                                  mode |= XGRP;
                              }
                          }
                          if ! (seen & XOTH) {
                              seen |= XOTH;
                              if a.type is ALLOW {
                                  mode |= XOTH;
                              }
                          }
                      }
                  }
              }
          }
          /* keep the non-permission bits of the previous mode */
          return mode | (old_mode &
                         (MODE4_SUID | MODE4_SGID | MODE4_SVTX))

6.6.2.  Applying the mode given to CREATE or OPEN to an inherited ACL

   The goal of implementing ACL inheritance is for newly created objects
   to inherit the ACLs they were intended to inherit, but without
   disregarding the mode that is given with the arguments to the CREATE
   or OPEN operations.  The general algorithm is as follows:

   1.  Form an ACL on the newly created object that is the concatenation
       of all inheritable ACEs from its parent directory.  Note that
       there may be zero inheritable ACEs; thus, an object may start
       with an empty ACL.

   2.  For each ACE in the new ACL, adjust its flags if necessary, and
       possibly create two ACEs in place
       of one.  This is necessary to honor the intent of the
       inheritance-related flags and to preserve information about the
       original inheritable ACEs in the case that they will be modified
       by other steps.  The algorithm is as follows:

       A.  If the ACE4_NO_PROPAGATE_INHERIT_ACE is set, or if the object
           being created is not a directory, then clear the following
           flags:

              ACE4_NO_PROPAGATE_INHERIT_ACE

              ACE4_FILE_INHERIT_ACE

              ACE4_DIRECTORY_INHERIT_ACE

              ACE4_INHERIT_ONLY_ACE

           Continue on to the next ACE.

       B.  If the object being created is a directory and
           ACE4_FILE_INHERIT_ACE is set, but ACE4_DIRECTORY_INHERIT_ACE
           is NOT set, then we ensure that ACE4_INHERIT_ONLY_ACE is set.
           Continue on to the next ACE.  Otherwise:

       C.  If the type of the ACE is neither ALLOW nor DENY, then
           continue on to the next ACE.

       D.  Copy the original ACE into a second, adjacent ACE.

       E.  On the first ACE, ensure that ACE4_INHERIT_ONLY_ACE is set.

       F.  On the second ACE, clear the following flags:

              ACE4_NO_PROPAGATE_INHERIT_ACE

              ACE4_FILE_INHERIT_ACE

              ACE4_DIRECTORY_INHERIT_ACE

              ACE4_INHERIT_ONLY_ACE

       G.  On the second ACE, if the type field is ALLOW, an
           implementation MAY clear the following mask bits:

              ACE4_WRITE_ACL

              ACE4_WRITE_OWNER

   3.  To ensure that the mode is honored, apply the algorithm for
       applying a mode to a file/directory with an existing ACL on the
       new object as described in Section 6.6.3, using the mode that is
       to be used for file creation.

6.11.  Session Mechanics - Steady State

6.11.1.  Obligations of the Server

   [[Comment.10: XXX - TBD]]

6.11.2.  Obligations of the Client

   The client has the following obligations in order to utilize the
   session:

   o  Keep a necessary session from going idle on the server.  A client
      that requires a session, but nonetheless is not sending operations
      risks having the session be destroyed by the server.  This is
      because sessions consume resources, and resource limitations may
      force the server to cull the least recently used session.

   o  Destroy the session when idle.  When a session has no state other
      than the session, and no outstanding requests, the client should
      consider destroying the session.

   o  Maintain GSS contexts for callback.  If the client requires the
      server to use the RPCSEC_GSS security flavor for callbacks, then
      it needs to be sure the contexts handed to the server via
      BACKCHANNEL_CTL are unexpired.  A good practice is to keep at
      least two contexts outstanding, where the expiration time of the
      newest context at the time it was created, is N times that of the
      oldest context, where N is the number of contexts available for
      callbacks.

   o  Maintain an active connection.  The server requires a callback
      path in order to gracefully recall recallable state, or notify the
      client of certain events.

6.11.3.  Steps the Client Takes To Establish a Session

   The client issues CREATE_CLIENTID to
   establish a clientid.

   The client uses the clientid to issue a CREATE_SESSION on a
   connection to the server.  The results of CREATE_SESSION indicate
   whether the server will persist the session replay cache through a
   server reboot or not, and the client notes this for future reference.

   The client SHOULD issue SET_SSV in the first COMPOUND after the
   session is created.  If it is not using machine credentials, then
   each time a new principal goes to use the session, it SHOULD issue a
   SET_SSV again.

   If the client wants to use delegations, layouts, directory
   notifications, or any other state that requires a callback channel,
   then it must add a connection to the backchannel if CREATE_SESSION
   did not already do so.  The client creates a connection, and calls
   BIND_CONN_TO_SESSION to bind the connection to the session and the
   session's backchannel.  If CREATE_SESSION did not already do so, the
   client MUST tell the server what security is required in order for
   the client to accept callbacks.  The client does this via
   BACKCHANNEL_CTL.

   If the client wants to use additional connections for operations
   and back channels, then it MUST call BIND_CONN_TO_SESSION on each
   connection it wants to use with the session.

   At this point the client has reached a steady state as far as session
   use.

6.12.  Session Mechanics - Recovery

   This section discusses session related events that require
   recovery.

6.12.1.  Events Requiring Client Action

   The following events require client action to recover.

6.12.1.1.  RPCSEC_GSS Context Loss by Callback Path

   If all RPCSEC_GSS contexts granted by the client to the server for
   callback use have expired, the client MUST establish a new context
   via BIND_CONN_TO_SESSION.  The sr_status field of SEQUENCE results
   indicates when callback contexts are nearly expired, or fully expired
   (see Section 21.46.4).

6.12.1.2.  Connection Disconnect

   If the client loses the last connection of the session, then it MUST
   create a new connection, and bind it to the session via
   BIND_CONN_TO_SESSION.

6.12.1.3.  Loss of Session

   The server may lose a record of the session.  Causes include:

   o  Server crash and reboot

   o  A catastrophe that causes the cache to be corrupted or lost on the
      media it was stored on.  This applies even if the server indicated
      in the CREATE_SESSION results that it would persist the cache.

   o  The server purges the session of a client that has been inactive
      for a very extended period of time.  [[Comment.11: XXX - Should we
      add a value to the CREATE_SESSION results that tells a client how
      long he can let a session stay idle before losing it?]].

   Loss of replay cache is equivalent to loss of session.  The server
   indicates loss of session to the client by returning
   NFS4ERR_BADSESSION on the next operation that uses the sessionid
   associated with the lost session.

   After an event like a server reboot, the client may have lost its
   connections.  The client assumes for the moment that the session has
   not been lost.  It reconnects, and invokes BIND_CONN_TO_SESSION using
   the sessionid.  If BIND_CONN_TO_SESSION returns NFS4ERR_BADSESSION,
   the client knows the session was lost.  If the connection survives
   session loss, then the next SEQUENCE operation the client issues over
   the connection will get back NFS4ERR_BADSESSION.  The client again
   knows the session was lost.

   When the client detects session loss, it must call CREATE_SESSION to
   recover.  Any non-idempotent operations that were in progress may
   have been performed on the server at the time of session loss.  The
   client has no general way to recover from this.

   Note that loss of session does not imply loss of lock, open,
   delegation, or layout state.  Nor does loss of lock, open,
   delegation, or layout state imply loss of session state.[[Comment.12:
   Add reference to lock recovery section]].  A session can survive a
   server reboot, but lock recovery may still be needed.  The converse
   is also true.

   It is possible CREATE_SESSION will fail with NFS4ERR_STALE_CLIENTID
   (for example the server reboots and does not preserve clientid
   state).  If so, the client needs to call CREATE_CLIENTID, followed by
   CREATE_SESSION.

6.12.2.  Events Requiring Server Action

   The following events require server action to recover.

6.12.2.1.  Client Crash and Reboot

   As described in Section 21.35, a rebooted client causes the server to
   delete any sessions it had.

6.12.2.2.  Client Crash with No Reboot

   If a client crashes and never comes back, it will never issue
   CREATE_CLIENTID with its old clientid.  Thus the server has session
   state that will never be used again.  After an extended period of
   time and if the server has resource constraints, it MAY destroy the
   old session.

6.12.2.2.1.  Extended Network Partition

   To the server, the extended network partition may be no different
   than a client crash with no reboot (see Section 6.12.2.2 Client Crash
   with No Reboot).  Unless the server can discern that there is a
   network partition, it is free to treat the situation as if the client
   has crashed for good.

6.6.3.  Applying a Mode to an Existing ACL

   An existing ACL can mean two things in this context.  One, that a
   file/directory already exists and it has an ACL.  Two, that a
   directory has inheritable ACEs that will make up the ACL for any new
   files or directories created therein.  The high-level goal of the
   behavior when a mode is set on a file with an existing ACL is to take
   the new mode into account, without needing to delete a pre-existing
   ACL.

   When a mode is applied to an object, e.g. via SETATTR or CREATE/OPEN,
   the ACL must be modified to accommodate the mode.

   1.  The ACL is traversed, one ACE at a time.  For each ACE:

       1.  If the type of the ACE is neither ALLOW nor DENY, the ACE is
           left unchanged.  Continue to the next ACE.

       2.  If the ACE4_INHERIT_ONLY_ACE flag is set on the ACE, it is
           left unchanged.  Continue to the next ACE.

       3.  If either or both of ACE4_FILE_INHERIT_ACE or
           ACE4_DIRECTORY_INHERIT_ACE are set:

           1.  A copy of the ACE is made, and placed in the ACL
               immediately following the current ACE.

           2.  In the first ACE, the flag ACE4_INHERIT_ONLY_ACE is set.

           3.  In the second ACE, the following flags are cleared:

                  ACE4_FILE_INHERIT_ACE

                  ACE4_DIRECTORY_INHERIT_ACE

                  ACE4_NO_PROPAGATE_INHERIT_ACE

           The algorithm continues on with the second ACE.

       4.  If the "who" field is one of the following:

              OWNER@

              GROUP@

              EVERYONE@

           then the following mask bits are cleared:

              ACE4_READ_DATA / ACE4_LIST_DIRECTORY

              ACE4_WRITE_DATA / ACE4_ADD_FILE

              ACE4_APPEND_DATA / ACE4_ADD_SUBDIRECTORY

              ACE4_EXECUTE

           At this point, we proceed to the next ACE.

       5.  Otherwise, if the "who" field did not match one of OWNER@,
           GROUP@, or EVERYONE@, the following steps SHOULD be
           performed.

           1.  If the type of the ACE is ALLOW, we check the preceding
               ACE (if any).  If it does not meet all of the following
               criteria:

               1.  The type field is DENY.

               2.  The who field is the same as the current ACE.

               3.  The flag bit ACE4_IDENTIFIER_GROUP is the same as it
                   is in the current ACE, and no other flag bits are
                   set.

               4.  The mask bits are a subset of the mask bits of the
                   current ACE, and are also a subset of the following:

                      ACE4_READ_DATA / ACE4_LIST_DIRECTORY

                      ACE4_WRITE_DATA / ACE4_ADD_FILE

                      ACE4_APPEND_DATA / ACE4_ADD_SUBDIRECTORY

                      ACE4_EXECUTE

               then an ACE of type DENY, with a who equal to the current
               ACE, flag bits equal to (<current-ACE-flags> &
               ACE4_IDENTIFIER_GROUP), and no mask bits, is prepended.

           2.  The following modifications are made to the prepended
               ACE.  The intent is to mask the following ACE to disallow
               ACE4_READ_DATA, ACE4_WRITE_DATA, ACE4_APPEND_DATA, or
               ACE4_EXECUTE, based upon the group permissions in the new
               mode.  As a special case, if the ACE matches the current
               owner of the file, the owner bits are used, rather than
               the group bits.  This is reflected in the algorithm
               below.

          Let there be three bits defined:

          #define READ    04
          #define WRITE   02
          #define EXEC    01

          Let "amode" be the new mode, right-shifted three
          bits, in order to have the group permission bits
          placed in the three low order bits of amode,
          i.e. amode = mode >> 3

          If ACE4_IDENTIFIER_GROUP is not set in the flags,
          and the "who" field of the ACE matches the owner
          of the file, we shift amode three more bits, in
          order to have the owner permission bits placed in
          the three low order bits of amode:

          amode = amode >> 3

          amode is now used as follows:

              If ACE4_READ_DATA is set on the current ACE:
                     If READ is set on amode:
                         ACE4_READ_DATA is cleared on the prepended ACE
                     else:
                         ACE4_READ_DATA is set on the prepended ACE

              If ACE4_WRITE_DATA is set on the current ACE:
                     If WRITE is set on amode:
                         ACE4_WRITE_DATA is cleared on the prepended ACE
                     else:
                         ACE4_WRITE_DATA is set on the prepended ACE

              If ACE4_APPEND_DATA is set on the current ACE:
                     If WRITE is set on amode:
                         ACE4_APPEND_DATA is cleared on the
                         prepended ACE
                     else:
                         ACE4_APPEND_DATA is set on the prepended ACE

              If ACE4_EXECUTE is set on the current ACE:
                     If EXEC is set on amode:
                         ACE4_EXECUTE is cleared on the prepended ACE
                     else:
                         ACE4_EXECUTE is set on the prepended ACE

           3.  To conform with POSIX, and prevent cases where the owner
               of the file is given permissions via an explicit group,
               we implement the following step.

                  If ACE4_IDENTIFIER_GROUP is set in the flags field of
                  the ALLOW ACE:
                      Let "mode" be the mode that we are chmoding to:
                          extramode = (mode >> 3) & 07
                          ownermode = mode >> 6
                          extramode &= ~ownermode
                      If extramode is not zero:
                          If extramode & READ:
                              Clear ACE4_READ_DATA in both the
                              prepended DENY ACE and the ALLOW ACE
                          If extramode & WRITE:
                              Clear ACE4_WRITE_DATA and ACE4_APPEND_DATA
                              in both the prepended DENY ACE and the
                              ALLOW ACE
                          If extramode & EXEC:
                              Clear ACE4_EXECUTE in both the prepended
                              DENY ACE and the ALLOW ACE

   2.  If there are at least six ACEs, the final six ACEs are examined.
       If they are not equal to the following ACEs:

          A1) OWNER@:::DENY
          A2) OWNER@:ACE4_WRITE_ACL/ACE4_WRITE_OWNER/
              ACE4_WRITE_ATTRIBUTES/ACE4_WRITE_NAMED_ATTRIBUTES::ALLOW
          A3) GROUP@::ACE4_IDENTIFIER_GROUP:DENY
          A4) GROUP@::ACE4_IDENTIFIER_GROUP:ALLOW
          A5) EVERYONE@:ACE4_WRITE_ACL/ACE4_WRITE_OWNER/
              ACE4_WRITE_ATTRIBUTES/ACE4_WRITE_NAMED_ATTRIBUTES::DENY
          A6) EVERYONE@:ACE4_READ_ACL/ACE4_READ_ATTRIBUTES/
              ACE4_READ_NAMED_ATTRIBUTES/ACE4_SYNCHRONIZE::ALLOW

       Then six ACEs matching the above are appended.

   3.  The final six ACEs are adjusted according to the incoming mode.

          /* octal constants for the mode bits */

          RUSR = 0400
          WUSR = 0200
          XUSR = 0100
          RGRP = 0040
          WGRP = 0020
          XGRP = 0010
          ROTH = 0004
          WOTH = 0002
          XOTH = 0001

          If RUSR is set: set ACE4_READ_DATA in A2
              else: set ACE4_READ_DATA in A1
          If WUSR is set: set ACE4_WRITE_DATA and ACE4_APPEND_DATA in A2
              else: set ACE4_WRITE_DATA and ACE4_APPEND_DATA in A1
          If XUSR is set: set ACE4_EXECUTE in A2
              else: set ACE4_EXECUTE in A1
          If RGRP is set: set ACE4_READ_DATA in A4
              else: set ACE4_READ_DATA in A3
          If WGRP is set: set ACE4_WRITE_DATA and ACE4_APPEND_DATA in A4
              else: set ACE4_WRITE_DATA and ACE4_APPEND_DATA in A3
          If XGRP is set: set ACE4_EXECUTE in A4
              else: set ACE4_EXECUTE in A3
          If ROTH is set: set ACE4_READ_DATA in A6
              else: set ACE4_READ_DATA in A5
          If WOTH is set: set ACE4_WRITE_DATA and ACE4_APPEND_DATA in A6
              else: set ACE4_WRITE_DATA and ACE4_APPEND_DATA in A5
          If XOTH is set: set ACE4_EXECUTE in A6
              else: set ACE4_EXECUTE in A5

7.  Minor Versioning

   To address the requirement of an NFS protocol that can evolve as the
   need arises, the NFS version 4 protocol contains the rules and
   framework to allow for future minor changes or versioning.

   The base assumption with respect to minor versioning is that any
   future accepted minor version must follow the IETF process and be
   documented in a standards track RFC.  Therefore, each minor version
   number will correspond to an RFC.  Minor version zero of the NFS
   version 4 protocol is represented by this RFC.  The COMPOUND
   procedure will support the encoding of the minor version being
   requested by the client.

   The following items represent the basic rules for the development of
   minor versions.  Note that a future minor version may decide to
   modify or add to the following rules as part of the minor version
   definition.

   1.   Procedures are not added or deleted

        To maintain the general RPC model, NFS version 4 minor versions
        will not add to or delete procedures from the NFS program.

   2.   Minor versions may add operations to the COMPOUND and
        CB_COMPOUND procedures.

        The addition of operations to the COMPOUND and
        CB_COMPOUND procedures does not affect the RPC model.

        *  Minor versions may append attributes to GETATTR4args,
           bitmap4, and GETATTR4res.

           This allows for the expansion of the attribute model to allow
           for future growth or adaptation.

        *  Minor version X must append any new attributes after the last
           documented attribute.

           Since attribute results are specified as an opaque array of
           per-attribute XDR encoded results, the complexity of adding
           new attributes in the midst of the current definitions will
           be too burdensome.

   3.   Minor versions must not modify the structure of an existing
        operation's arguments or results.

        Again the complexity of handling multiple structure definitions
        for a single operation is too burdensome.  New operations should
        be added instead of modifying existing structures for a minor
        version.

        This rule does not preclude the following adaptations in a minor
        version.

        *  adding bits to flag fields such as new attributes to
           GETATTR's bitmap4 data type

        *  adding bits to existing attributes like ACLs that have flag
           words

        *  extending enumerated types (including NFS4ERR_*) with new
           values

   4.   Minor versions may not modify the structure of existing
        attributes.

   5.   Minor versions may not delete operations.

        This prevents the potential reuse of a particular operation
        "slot" in a future minor version.

   6.   Minor versions may not delete attributes.

   7.   Minor versions may not delete flag bits or enumeration values.

   8.   Minor versions may declare an operation as mandatory to NOT
        implement.

        Specifying an operation as "mandatory to not implement" is
        equivalent to obsoleting an operation.  For the client, it means
        that the operation should not be sent to the server.  For the
        server, an NFS error can be returned as opposed to "dropping"
        the request as an XDR decode error.  This approach allows for
        the obsolescence of an operation while maintaining its structure
        so that a future minor version can reintroduce the operation.

        1.  Minor versions may declare attributes mandatory to NOT
            implement.

        2.  Minor versions may declare flag bits or enumeration values
            as mandatory to NOT implement.

   9.   Minor versions may downgrade features from mandatory to
        recommended, or recommended to optional.

   10.  Minor versions may upgrade features from optional to recommended
        or recommended to mandatory.

   11.  A client and server that support minor version X must support
        minor versions 0 (zero) through X-1 as well.

   12.  No new features may be introduced as mandatory in a minor
        version.

        This rule allows for the introduction of new functionality and
        forces the use of implementation experience before designating a
        feature as mandatory.

   13.  A client MUST NOT attempt to use a stateid, filehandle, or
        similar returned object from the COMPOUND procedure with minor
        version X for another COMPOUND procedure with minor version Y,
        where X != Y.

6.6.4.  ACL and mode in the same SETATTR

   The only reason that a mode and ACL should be set in the same SETATTR
   is if the user wants to set the SUID, SGID and SVTX bits along with
   setting the permissions by means of an ACL.  There is still no way to
   enforce which order the attributes will be set in, and it is likely
   that different orders of operations will produce different results.

6.6.4.1.  Client Side Recommendations

   If an application needs to enforce a certain behavior, it is
   recommended that client implementations set mode and ACL in separate
   SETATTR requests.  This will produce consistent and expected results.

   If an application wants to set SUID, SGID and SVTX bits and an ACL:

      In the first SETATTR, set the mode

8.  Protocol Data Types

   The syntax and semantics to describe the data types of the NFS
   version 4 protocol are defined in the XDR RFC4506 [3] and RPC RFC1831
   [4] documents.  The next sections build upon the XDR data types to
   define types and structures specific to this protocol.
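
   As a rough illustration of how types built on XDR end up on the wire,
   the sketch below encodes a few of them in Python.  It relies only on
   the general XDR rules of RFC4506 (big-endian 4-byte and 8-byte
   integers; variable-length data prefixed by a 4-byte count and padded
   to a multiple of four bytes).  All function names are hypothetical,
   and the attribute-number-to-bit mapping in bitmap4_from_attrs (bit
   n mod 32 of word n div 32) is an assumption made for this example,
   not something stated in this section.

```python
import struct


def xdr_uint32(value):
    """XDR unsigned int: 4 bytes, big-endian."""
    return struct.pack(">I", value)


def xdr_uint64(value):
    """XDR unsigned hyper: 8 bytes, big-endian."""
    return struct.pack(">Q", value)


def xdr_opaque_var(data):
    """Variable-length opaque<>: 4-byte length, the bytes themselves,
    then zero padding up to the next multiple of 4."""
    pad = (4 - len(data) % 4) % 4
    return xdr_uint32(len(data)) + data + b"\x00" * pad


def xdr_bitmap4(words):
    """'uint32_t bitmap4<>': element count followed by each 32-bit word."""
    return xdr_uint32(len(words)) + b"".join(xdr_uint32(w) for w in words)


def bitmap4_from_attrs(attr_numbers):
    """Hypothetical helper (assumed mapping): attribute number n sets
    bit (n % 32) of word (n // 32)."""
    words = []
    for n in attr_numbers:
        index, bit = divmod(n, 32)
        while len(words) <= index:
            words.append(0)
        words[index] |= 1 << bit
    return words


if __name__ == "__main__":
    # Attribute numbers 1 and 33 are arbitrary sample values.
    words = bitmap4_from_attrs([1, 33])
    print(xdr_bitmap4(words).hex())
    print(xdr_opaque_var(b"abcde").hex())
```

   A variable-length type such as bitmap4 or sec_oid4 therefore always
   occupies a multiple of four bytes on the wire, which is why the
   opaque helper pads its payload.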

8.1.  Basic Data Types

                   These are the base NFSv4 data types.

   +---------------+---------------------------------------------------+
   | Data Type     | Definition                                        |
   +---------------+---------------------------------------------------+
   | int32_t       | typedef int int32_t;                              |
   | uint32_t      | typedef unsigned int uint32_t;                    |
   | int64_t       | typedef hyper int64_t;                            |
   | uint64_t      | typedef unsigned hyper uint64_t;                  |
   | attrlist4     | typedef opaque attrlist4<>;                       |
   |               | Used for file/directory attributes                |
   | bitmap4       | typedef uint32_t bitmap4<>;                       |
   |               | Used in attribute array encoding.                 |
   | changeid4     | typedef uint64_t changeid4;                       |
   |               | Used in definition of change_info                 |
   | clientid4     | typedef uint64_t clientid4;                       |
   |               | Shorthand reference to client identification      |
   | component4    | typedef utf8str_cs component4;                    |
   |               | Represents path name components                   |
   | count4        | typedef uint32_t count4;                          |
   |               | Various count parameters (READ, WRITE, COMMIT)    |
   | length4       | typedef uint64_t length4;                         |
   |               | Describes LOCK lengths                            |
   | linktext4     | typedef utf8str_cs linktext4;                     |
   |               | Symbolic link contents                            |
   | mode4         | typedef uint32_t mode4;                           |
   |               | Mode attribute data type                          |
   | nfs_cookie4   | typedef uint64_t nfs_cookie4;                     |
   |               | Opaque cookie value for READDIR                   |
   | nfs_fh4       | typedef opaque nfs_fh4<NFS4_FHSIZE>;              |
   |               | Filehandle definition; NFS4_FHSIZE is defined as  |
   |               | 128                                               |
   | nfs_ftype4    | enum nfs_ftype4;                                  |
   |               | Various defined file types                        |
   | nfsstat4      | enum nfsstat4;                                    |
   |               | Return value for operations                       |
   | offset4       | typedef uint64_t offset4;                         |
   |               | Various offset designations (READ, WRITE, LOCK,   |
   |               | COMMIT)                                           |
   | pathname4     | typedef component4 pathname4<>;                   |
   |               | Represents path name for fs_locations             |
   | qop4          | typedef uint32_t qop4;                            |
   |               | Quality of protection designation in SECINFO      |
   | sec_oid4      | typedef opaque sec_oid4<>;                        |
   |               | Security Object Identifier. The sec_oid4 data     |
   |               | type is not really opaque. Instead, it contains   |
   |               | an ASN.1 OBJECT IDENTIFIER as used by GSS-API in  |
   |               | the mech_type argument to GSS_Init_sec_context.   |
   |               | See RFC2743 [8] for details.                      |
   | seqid4        | typedef uint32_t seqid4;                          |
   |               | Sequence identifier used for file locking         |
   | utf8string    | typedef opaque utf8string<>;                      |
   |               | UTF-8 encoding for strings                        |
   | utf8str_cis   | typedef opaque utf8str_cis;                       |
   |               | Case-insensitive UTF-8 string                     |
   | utf8str_cs    | typedef opaque utf8str_cs;                        |
   |               | Case-sensitive UTF-8 string                       |
   | utf8str_mixed | typedef opaque utf8str_mixed;                     |
   |               | UTF-8 strings with a case sensitive prefix and a  |
   |               | case insensitive suffix.                          |
   | verifier4     | typedef opaque verifier4[NFS4_VERIFIER_SIZE];     |
   |               | Verifier used for various operations (COMMIT,     |
   |               | CREATE, OPEN, READDIR, SETCLIENTID,               |
   |               | SETCLIENTID_CONFIRM, WRITE) NFS4_VERIFIER_SIZE is |
   |               | defined as 8.                                     |
   +---------------+---------------------------------------------------+

                          End of Base Data Types

                                  Table 1

8.2.  Structured Data Types

8.2.1.  nfstime4

   struct nfstime4 {
       int64_t seconds;
       uint32_t nseconds;
   };

   The nfstime4 structure gives the number of seconds and nanoseconds
   since midnight or 0 hour January 1, 1970 Coordinated Universal Time
   (UTC).  Values greater than zero for the seconds field denote dates
   after the 0 hour January 1, 1970.  Values less than zero for the
   seconds field denote dates before the 0 hour January 1, 1970.  In
   both cases, the nseconds field is to be added to the seconds field
   for the final time representation.  For example, if the time to be
   represented is one-half second before 0 hour January 1, 1970, the
   seconds field would have a value of negative one (-1) and the
   nseconds field would have a value of one-half second (500000000).
   Values greater than 999,999,999 for nseconds are considered invalid.

   This data type is used to pass time and date information.  A server
   converts to and from its local representation of time when processing
   time values, preserving as much accuracy as possible.  If the
   precision of timestamps stored for a file system object is less than
   defined, loss of precision can occur.  An adjunct time maintenance
   protocol is recommended to reduce client and server time skew.
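
   The rule that nseconds is always added to seconds, even for dates
   before the epoch, can be illustrated with a minimal Python sketch.
   The function names below are hypothetical and are not part of the
   protocol; exact rational arithmetic is used only to avoid rounding
   in the illustration.

```python
from fractions import Fraction

NSEC_PER_SEC = 1_000_000_000


def nfstime4_to_offset(seconds, nseconds):
    """Return the exact offset from the epoch, in seconds, as a Fraction."""
    if not 0 <= nseconds < NSEC_PER_SEC:
        # Values greater than 999,999,999 are considered invalid.
        raise ValueError("nseconds out of range")
    # nseconds is added to seconds in both the positive and negative case.
    return Fraction(seconds) + Fraction(nseconds, NSEC_PER_SEC)


def offset_to_nfstime4(offset):
    """Split an exact epoch offset into a (seconds, nseconds) pair."""
    offset = Fraction(offset)
    # Floor division keeps nseconds non-negative for pre-epoch times.
    seconds = offset.numerator // offset.denominator
    nseconds = int((offset - seconds) * NSEC_PER_SEC)
    return seconds, nseconds
```

   With these helpers, one-half second before the epoch maps to the
   pair (-1, 500000000), matching the example in the text.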

8.2.2.  time_how4

   enum time_how4 {
       SET_TO_SERVER_TIME4 = 0,
       SET_TO_CLIENT_TIME4 = 1
   };

8.2.3.  settime4

   union settime4 switch (time_how4 set_it) {
       case SET_TO_CLIENT_TIME4:
           nfstime4       time;
       default:
           void;
   };

   The above definitions are used as the attribute definitions to set
   time values.  If set_it is SET_TO_SERVER_TIME4, then the server uses
   its local representation of time for the time value.
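
   As an illustration of how the settime4 union is laid out on the
   wire, the following Python sketch encodes it using the standard XDR
   rules (a 4-byte big-endian discriminant, followed by the nfstime4
   arm only for SET_TO_CLIENT_TIME4).  The function name
   encode_settime4 is an assumption made for this example.

```python
import struct

SET_TO_SERVER_TIME4 = 0
SET_TO_CLIENT_TIME4 = 1


def encode_settime4(set_it, seconds=0, nseconds=0):
    """Encode a settime4 union as XDR bytes (hypothetical helper)."""
    # XDR enums are encoded as 4-byte big-endian signed integers.
    body = struct.pack(">i", set_it)
    if set_it == SET_TO_CLIENT_TIME4:
        # nfstime4: int64_t seconds, then uint32_t nseconds.
        body += struct.pack(">qI", seconds, nseconds)
    # The default arm is void, so nothing follows the discriminant.
    return body
```

   A SET_TO_SERVER_TIME4 value therefore occupies 4 bytes, while a
   SET_TO_CLIENT_TIME4 value occupies 16 bytes.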

8.2.4.  specdata4

   struct specdata4 {
       uint32_t specdata1; /* major device number */
       uint32_t specdata2; /* minor device number */
   };

   This data type represents additional information for the device file
   types NF4CHR and NF4BLK.

8.2.5.  fsid4

   struct fsid4 {
       uint64_t        major;
       uint64_t        minor;
   };

8.2.6.  fs_location4

   struct fs_location4 {
       utf8str_cis    server<>;
       pathname4     rootpath;
   };

8.2.7.  fs_locations4

   struct fs_locations4 {
       pathname4     fs_root;
       fs_location4  locations<>;
   };

   The fs_location4 and fs_locations4 data types are used for the
   fs_locations recommended attribute which is used for migration and
   replication support.

8.2.8.  fattr4

   struct fattr4 {
       bitmap4       attrmask;
       attrlist4     attr_vals;
   };

   The fattr4 structure is used to represent file and directory
   attributes.

   The bitmap is a counted array of 32 bit integers used to contain bit
   values.  The position of the integer in the array that contains bit n
   can be computed from the expression (n / 32) and its bit within that
   integer is (n mod 32).

   0            1
   +-----------+-----------+-----------+--
   |  count    | 31  ..  0 | 63  .. 32 |
   +-----------+-----------+-----------+--
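
   The (n / 32) and (n mod 32) rule can be sketched in Python as
   follows.  The helper names are hypothetical; a bitmap4 is modelled
   here as a plain list of 32 bit words, with the XDR count implied by
   the list length.

```python
def bitmap4_set(bitmap, n):
    """Set bit n in a list of 32-bit words, growing the list if needed."""
    word, bit = divmod(n, 32)  # word index is n / 32, bit index is n mod 32
    if word >= len(bitmap):
        bitmap.extend([0] * (word + 1 - len(bitmap)))
    bitmap[word] |= 1 << bit
    return bitmap


def bitmap4_test(bitmap, n):
    """Return True if bit n is set; bits beyond the array count as clear."""
    word, bit = divmod(n, 32)
    return word < len(bitmap) and bool(bitmap[word] & (1 << bit))
```

   For example, setting bit 33 in an empty bitmap yields two words,
   with bit 1 set in the second word.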

8.2.9.  change_info4

   struct change_info4 {
       bool          atomic;
       changeid4     before;
       changeid4     after;
   };

   This structure is used with the CREATE, LINK, REMOVE, RENAME
   operations to let the client know the value of the change attribute
   for the directory in which the target file system object resides.

8.2.10.  netaddr4

   struct netaddr4 {
       /* see struct rpcb in RFC1833 */
       string r_netid<>;    /* network id */
       string r_addr<>;     /* universal address */
   };

   The netaddr4 structure is used to identify TCP/IP based endpoints.
   The r_netid and r_addr fields are specified in RFC1833 [20], but they
   are underspecified in RFC1833 [20] as far as what they should look
   like for specific protocols.

   For TCP over IPv4 and for UDP over IPv4, the format of r_addr is the
   US-ASCII string:

   h1.h2.h3.h4.p1.p2

   The prefix, "h1.h2.h3.h4", is the standard textual form for
   representing an IPv4 address, which is always four octets long.
   Assuming big-endian ordering, h1, h2, h3, and h4 are, respectively,
   the first through fourth octets each converted to ASCII-decimal.
   Assuming big-endian ordering, p1 and p2 are, respectively, the first
   and second octets each converted to ASCII-decimal.  For example, if a
   host, in big-endian order, has an address of 0x0A010307 and there is
   a service listening on, in big-endian order, port 0x020F (decimal
   527), then the complete universal address is "10.1.3.7.2.15".

   For TCP over IPv4 the value of r_netid is the string "tcp".  For UDP
   over IPv4 the value of r_netid is the string "udp".

   For TCP over IPv6 and for UDP over IPv6, the format of r_addr is the
   US-ASCII string:

   x1:x2:x3:x4:x5:x6:x7:x8.p1.p2

   The suffix "p1.p2" is the service port, and is computed the same way
   as with universal addresses for TCP and UDP over IPv4.  The prefix,
   "x1:x2:x3:x4:x5:x6:x7:x8", is the standard textual form for
   representing an IPv6 address as defined in Section 2.2 of RFC1884
   [9].  Additionally, the two alternative forms specified in Section
   2.2 of RFC1884 [9] are also acceptable.

   For TCP over IPv6 the value of r_netid is the string "tcp6".  For UDP
   over IPv6 the value of r_netid is the string "udp6".
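
   The construction of r_addr and r_netid described above can be
   sketched in Python.  The function names are hypothetical.  The
   sketch relies on the standard ipaddress module, which prints IPv6
   addresses in the compressed ("::") form; this is assumed here to be
   one of the acceptable alternative forms.

```python
import ipaddress


def universal_address(host, port):
    """Build an r_addr string from a host address and a port number."""
    if not 0 <= port <= 0xFFFF:
        raise ValueError("port out of range")
    addr = ipaddress.ip_address(host)
    # p1 and p2 are the high and low octets of the port, in decimal.
    p1, p2 = port >> 8, port & 0xFF
    return f"{addr}.{p1}.{p2}"


def netid(transport, host):
    """Return the r_netid for 'tcp' or 'udp' over IPv4 or IPv6."""
    if transport not in ("tcp", "udp"):
        raise ValueError("unsupported transport")
    version = ipaddress.ip_address(host).version
    return transport if version == 4 else transport + "6"
```

   Calling universal_address("10.1.3.7", 527) reproduces the
   "10.1.3.7.2.15" example given for IPv4.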

8.2.11.  clientaddr4

   typedef netaddr4 clientaddr4;

   The clientaddr4 structure is used as part of the SETCLIENTID
   operation to either specify the address of the client that is using a
   clientid or as part of the callback registration.

8.2.12.  cb_client4

   struct cb_client4 {
       unsigned int  cb_program;
       netaddr4      cb_location;
   };

   This structure is used by the client to inform the server of its
   callback address; it includes the program number and client address.

8.2.13.  nfs_client_id4

   struct nfs_client_id4 {
       verifier4     verifier;
       opaque        id<NFS4_OPAQUE_LIMIT>;
   };

   This structure is part of the arguments to the SETCLIENTID operation.
   NFS4_OPAQUE_LIMIT is defined as 1024.

8.2.14.  open_owner4

   struct open_owner4 {
       clientid4     clientid;
       opaque        owner<NFS4_OPAQUE_LIMIT>;
   };

   This structure is used to identify the owner of open state.
   NFS4_OPAQUE_LIMIT is defined as 1024.

8.2.15.  lock_owner4

   struct lock_owner4 {
       clientid4     clientid;
       opaque        owner<NFS4_OPAQUE_LIMIT>;
   };

   This structure is used to identify the owner of file locking state.
   NFS4_OPAQUE_LIMIT is defined as 1024.

8.2.16.  open_to_lock_owner4

   struct open_to_lock_owner4 {
       seqid4          open_seqid;
       stateid4        open_stateid;
       seqid4          lock_seqid;
       lock_owner4     lock_owner;
   };

   This structure is used for the first LOCK operation done for an
   open_owner4.  It provides both the open_stateid and lock_owner such
   that the transition is made from a valid open_stateid sequence to
   that of the new lock_stateid sequence.  Using this mechanism avoids
   the confirmation of the lock_owner/lock_seqid pair since it is tied
   to established state in the form of the open_stateid/open_seqid.

8.2.17.  stateid4

   struct stateid4 {
       uint32_t        seqid;
       opaque          other[12];
   };

   This structure is used for the various state sharing mechanisms
   between the client and server.  For the client, this data structure
   is read-only.  The starting value of the seqid field is undefined.
   The server is required to increment the seqid field monotonically at
   each transition of the stateid.  This is important since the client
   will inspect the seqid in OPEN stateids to determine the order of
   OPEN processing done by the server.
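
   One way a client might use the monotonic seqid to order OPEN
   results is sketched below in Python.  The Stateid4 class and the
   latest_stateid function are hypothetical names.  The sketch assumes
   that all candidates refer to the same state (identical "other"
   field) and that the seqid has not wrapped, which the text above
   does not address.

```python
from typing import Iterable, NamedTuple


class Stateid4(NamedTuple):
    seqid: int    # uint32_t seqid, incremented by the server
    other: bytes  # opaque other[12], never interpreted by the client


def latest_stateid(stateids: Iterable[Stateid4]) -> Stateid4:
    """Return the stateid produced by the most recent transition."""
    candidates = list(stateids)
    if not candidates:
        raise ValueError("no stateids given")
    # Ordering by seqid is only meaningful for the same underlying state.
    if len({s.other for s in candidates}) != 1:
        raise ValueError("stateids refer to different state")
    return max(candidates, key=lambda s: s.seqid)
```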

8.2.18.  layouttype4

   enum layouttype4 {
       LAYOUT_NFSV4_FILES  = 1,
       LAYOUT_OSD2_OBJECTS = 2,
       LAYOUT_BLOCK_VOLUME = 3
   };

   A layout type specifies the layout being used.  The implication is
   that clients have "layout drivers" that support one or more layout
   types.  The file server advertises the layout types it supports
   through the LAYOUT_TYPES file system attribute.  A client asks for
   layouts of a particular type in LAYOUTGET, and passes those layouts
   to its layout driver.

   The layouttype4 structure is 32 bits in length.  The range
   represented by the layout type is split into two parts.  Types within
   the range 0x00000000-0x7FFFFFFF are globally unique and are assigned
   according to the description in Section 25.1; they are maintained by
   IANA.  Types within the range 0x80000000-0xFFFFFFFF are site specific
   and for "private use" only.

   The LAYOUT_NFSV4_FILES enumeration specifies that the NFSv4 file
   layout type is to be used.  The LAYOUT_OSD2_OBJECTS enumeration
   specifies that the object layout, as defined in [22], is to be used.
   Similarly, the LAYOUT_BLOCK_VOLUME enumeration specifies that the
   block/volume layout, as defined in [23], is to be used.
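
   The split of the 32 bit layout type range into an IANA-maintained
   half and a private-use half can be expressed as a small Python
   check.  The function name and the returned labels are assumptions
   made for illustration only.

```python
LAYOUT_NFSV4_FILES = 1
LAYOUT_OSD2_OBJECTS = 2
LAYOUT_BLOCK_VOLUME = 3


def layouttype_scope(layout_type):
    """Classify a layouttype4 value by the range it falls in."""
    if not 0 <= layout_type <= 0xFFFFFFFF:
        raise ValueError("layouttype4 is a 32 bit value")
    # 0x00000000-0x7FFFFFFF: globally unique, maintained by IANA.
    # 0x80000000-0xFFFFFFFF: site specific, private use only.
    return "iana-assigned" if layout_type <= 0x7FFFFFFF else "private-use"
```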

8.2.19.  deviceid4

   typedef uint32_t deviceid4;  /* 32-bit device ID */

   Layout information includes device IDs that specify a storage device
   through a compact handle.  Addressing and type information is
   obtained with the GETDEVICEINFO operation.  A client must not assume
   that device IDs are valid across metadata server reboots.  The device
   ID is qualified by the layout type and is unique per file system
   (FSID).  This allows different layout drivers to generate device IDs
   without the need for co-ordination.  See Section 17.3.1.4 for more
   details.

8.2.20.  devlist_item4

   struct devlist_item4 {
           deviceid4          dli_id;
           opaque             dli_device_addr<>;
   };

   An array of these values is returned by the GETDEVICELIST operation.
   They define the set of devices associated with a file system for the
   layout type specified in the GETDEVICELIST4args.

   The device address is used to set up a communication channel with the
   storage device.  Different layout types will require different types
   of structures to define how they communicate with storage devices.
   The opaque dli_device_addr field must be interpreted based on the
   specified layout type.

   This document defines the device address for the NFSv4 file layout
   (struct netaddr4 (Section 8.2.10)), which identifies a storage device
   by network IP address and port number.  This is sufficient for the
   clients to communicate with the NFSv4 storage devices, and may be
   sufficient for other layout types as well.  Device types for object
   storage devices and block storage devices (e.g., SCSI volume labels)
   will be defined by their respective layout specifications.

8.2.21.  layout4

   struct layout4 {
       offset4                 lo_offset;
       length4                 lo_length;
       layoutiomode4           lo_iomode;
       layouttype4             lo_type;
       opaque                  lo_layout<>;
   };

   The layout4 structure defines a layout for a file.  The layout type
   specific data is opaque within this structure and must be
   interpreted based on the layout type.  Currently, only the NFSv4
   file layout type is defined; see Section 17.4.1 for its definition.
   Since layouts are sub-dividable, the offset and length together with
   the file's filehandle, the clientid, iomode, and layout type,
   identify the layout.

8.2.22.  layoutupdate4

   struct layoutupdate4 {
       layouttype4             lou_type;
       opaque                  lou_data<>;
   };

   The layoutupdate4 structure is used by the client to return 'updated'
   layout information to the metadata server at LAYOUTCOMMIT time.  This
   structure provides a channel to pass layout type specific information
   back to the metadata server.  E.g., for block/volume layout types
   this could include the list of reserved blocks that were written.
   The contents of the opaque lou_data argument are determined by the
   layout type and are defined in their context.  The NFSv4 file-based
   layout does not use this structure, thus the lou_data field should
   have a zero length.

8.2.23.  layouthint4

   struct layouthint4 {
       layouttype4           loh_type;
       opaque                loh_data<>;
   };

   The layouthint4 structure is used by the client to pass in a hint
   about the type of layout it would like created for a particular file.
   It is the structure specified by the FILE_LAYOUT_HINT attribute
   described below.  The metadata server may ignore the hint, or may
   selectively ignore fields within the hint.  This hint should be
   provided at create time as part of the initial attributes within
   OPEN.  The NFSv4 file-based layout uses the "nfsv4_file_layouthint"
   structure as defined in Section 17.4.1.

8.2.24.  layoutiomode4

   enum layoutiomode4 {
       LAYOUTIOMODE_READ          = 1,
       LAYOUTIOMODE_RW            = 2,
       LAYOUTIOMODE_ANY           = 3
   };

   The iomode specifies whether the client intends to read or write
   (with the possibility of reading) the data represented by the layout.
   The ANY iomode MUST NOT be used for LAYOUTGET, however, it can be
   used for LAYOUTRETURN and LAYOUTRECALL.  The ANY iomode specifies
   that layouts pertaining to both READ and RW iomodes are being
   returned or recalled, respectively.  The metadata server's use of the
   iomode may depend on the layout type being used.  The storage devices
   may validate I/O accesses against the iomode and reject invalid
   accesses.
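
   The restriction on the ANY iomode can be captured in a short Python
   sketch.  The function iomode_allowed is a hypothetical helper, and
   operations are identified by the names used in the text.

```python
LAYOUTIOMODE_READ = 1
LAYOUTIOMODE_RW = 2
LAYOUTIOMODE_ANY = 3

_VALID_IOMODES = (LAYOUTIOMODE_READ, LAYOUTIOMODE_RW, LAYOUTIOMODE_ANY)


def iomode_allowed(operation, iomode):
    """Return True if the iomode may be used with the named operation."""
    if iomode not in _VALID_IOMODES:
        return False
    if operation == "LAYOUTGET":
        # ANY MUST NOT be used when requesting a layout.
        return iomode != LAYOUTIOMODE_ANY
    if operation in ("LAYOUTRETURN", "LAYOUTRECALL"):
        # ANY covers layouts of both the READ and RW iomodes.
        return True
    raise ValueError("operation does not take an iomode")
```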

8.2.25.  nfs_impl_id4

   struct nfs_impl_id4 {
       utf8str_cis   nii_domain;
       utf8str_cs    nii_name;
       nfstime4      nii_date;
   };

   This structure is used to identify client and server implementation
   detail.  The nii_domain field is the DNS domain name that the
   implementer is associated with.  The nii_name field is the product
   name of the implementation and is completely free form.  It is
   encouraged that the nii_name be used to distinguish machine
   architecture, machine platforms, revisions, versions, and patch
   levels.  The nii_date field is the timestamp of when the software
   instance was published or built.

8.2.26.  impl_ident4

   struct impl_ident4 {
       clientid4           ii_clientid;
       struct nfs_impl_id4 ii_impl_id;
   };

   This is used for exchanging implementation identification between
   client and server.

8.2.27.  threshold_item4

   struct threshold_item4 {
           layouttype4     thi_layout_type;
           bitmap4         thi_hintset;
           opaque          thi_hintlist<>;
   };

   This structure contains a list of hints specific to a layout type for
   helping the client determine when it should issue I/O directly
   through the metadata server vs. the data servers.  The hint structure
   consists of the layout type, a bitmap describing the set of hints
   supported by the server (these may differ based on the layout type),
   and a list of hints, whose structure is determined by the hintset
   bitmap.  See the mdsthreshold attribute for more details.

   The hintset is a bitmap of the following values:

   +-------------------------+---+---------+---------------------------+
   | name                    | # | Data    | Description               |
   |                         |   | Type    |                           |
   +-------------------------+---+---------+---------------------------+
   | threshold4_read_size    | 0 | length4 | The file size below which |
   |                         |   |         | it is recommended to read |
   |                         |   |         | data through the MDS.     |
   | threshold4_write_size   | 1 | length4 | The file size below which |
   |                         |   |         | it is recommended to      |
   |                         |   |         | write data through the    |
   |                         |   |         | MDS.                      |
   | threshold4_read_iosize  | 2 | length4 | For read I/O sizes below  |
   |                         |   |         | this threshold it is      |
   |                         |   |         | recommended to read data  |
   |                         |   |         | through the MDS.          |
   | threshold4_write_iosize | 3 | length4 | For write I/O sizes below |
   |                         |   |         | this threshold it is      |
   |                         |   |         | recommended to write data |
   |                         |   |         | through the MDS.          |
   +-------------------------+---+---------+---------------------------+

8.2.28.  mdsthreshold4

   struct mdsthreshold4 {
           threshold_item4 mth_hints<>;
   };

   This structure holds an array of threshold_item4 structures each of
   which is valid for a particular layout type.  An array is necessary
   since a server can support multiple layout types for a single file.
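
   A client's use of the read hints might look like the following
   Python sketch.  The function name and the dictionary representation
   of the decoded hints are assumptions.  The text does not say how
   the file size and I/O size hints combine, so this sketch assumes
   that either hint alone is enough to prefer the metadata server and
   that an absent hint expresses no preference.

```python
def read_via_mds(hints, file_size, io_size):
    """Decide whether a read should go through the metadata server.

    hints maps hint names (for one layout type) to length4 values.
    """
    size_limit = hints.get("threshold4_read_size")
    io_limit = hints.get("threshold4_read_iosize")
    # Small files: the file size is below the advertised threshold.
    small_file = size_limit is not None and file_size < size_limit
    # Small requests: the I/O size is below the advertised threshold.
    small_io = io_limit is not None and io_size < io_limit
    return small_file or small_io
```

   The write hints could be handled the same way with the
   threshold4_write_size and threshold4_write_iosize values.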

9.  Filehandles

   The filehandle in the NFS protocol is a per server unique identifier
   for a file system object.  The contents of the filehandle are opaque
   to the client.  Therefore, the server is responsible for translating
   the filehandle to an internal representation of the file system
   object.

   To support Win32 share reservations it is necessary to provide
   operations which atomically OPEN or CREATE files.  Having a separate
   share/unshare operation would not allow correct implementation of the
   Win32 OpenFile API.  In order to correctly implement share semantics,
   the previous NFS protocol mechanisms used when a file is opened or
   created (LOOKUP, CREATE, ACCESS) need to be replaced.  The NFS
   version 4.1 protocol defines an OPEN operation which looks up or
   creates a file and establishes locking state on the server.

8.1.  Locking

   It is assumed that manipulating a lock is rare when compared to READ
   and WRITE operations.  It is also assumed that crashes and network
   partitions are relatively rare.  Therefore it is important that the
   READ and WRITE operations have a lightweight mechanism to indicate if
   they possess a held lock.  A lock request contains the heavyweight
   information required to establish a lock and uniquely define the lock
   owner.

   The following sections describe the transition from the heavyweight
   information to the eventual lightweight stateid used for most client
   and server locking interactions.

8.1.1.  Client ID

   For each operation that obtains or depends on locking state, the
   specific client must be determinable by the server.  In NFSv4, each
   distinct client instance is represented by a clientid, which is a
   64-bit identifier that identifies a specific client at a given time
   and which is changed whenever the client or the server re-
   initializes.  Clientids are used to support lock identification and
   crash recovery.

   In NFSv4.1, the clientid associated with each operation is derived
   from the session on which the operation is issued.

9.1.  Obtaining the First Filehandle

   The operations of the NFS protocol are defined in terms of one or
   more filehandles.  Therefore, the client needs a filehandle to
   initiate communication with the server.  With the NFS version 2
   protocol RFC1094 [17] and the NFS version 3 protocol RFC1813 [18],
   there exists an ancillary protocol to obtain this first filehandle.
   The MOUNT protocol, RPC program number 100005, provides the mechanism
   of translating a string based file system path name to a filehandle
   which can then be used by the NFS protocols.

   The MOUNT protocol has deficiencies in the area of security and use
   via firewalls.  This is one reason that the use of the public
   filehandle was introduced in RFC2054 [24] and RFC2055 [25].  With the
   use of the public filehandle in combination with the LOOKUP operation
   in the NFS version 2 and 3 protocols, it has been demonstrated that
   the MOUNT protocol is unnecessary for viable interaction between NFS
   client and server.

   Therefore, the NFS version 4 protocol will not use an ancillary
   protocol for translation from string based path names to a
   filehandle.  Two special filehandles will be used as starting points
   for the NFS client.

9.1.1.  Root Filehandle

   The first of the special filehandles is the ROOT filehandle.  The
   ROOT filehandle is the "conceptual" root of the file system name
   space at the NFS server.  The client uses or starts with the ROOT
   filehandle by employing the PUTROOTFH operation.  The PUTROOTFH
   operation instructs the server to set the "current" filehandle to the
   ROOT of the server's file tree.  Once this PUTROOTFH operation is
   used, the client can then traverse the entirety of the server's file
   tree with the LOOKUP operation.  A complete discussion of the server
   name space is in the section "NFS Server Name Space".

9.1.2.  Public Filehandle

   The second special filehandle is the PUBLIC filehandle.  Unlike the
   ROOT filehandle, the PUBLIC filehandle may be bound or represent an
   arbitrary file system object at the server.  The server is
   responsible for this binding.  It may be that the PUBLIC filehandle
   and the ROOT filehandle refer to the same file system object.
   However, it is up to the administrative software at the server and
   the policies of the server administrator to define the binding of the
   PUBLIC filehandle and server file system object.  The client may not
   make any assumptions about this binding.  The client uses the PUBLIC
   filehandle via the PUTPUBFH operation.

   Each session is associated with a specific clientid at session
   creation and that clientid then becomes the clientid associated with
   all requests issued using it.

   A sequence of a CREATE_CLIENTID operation followed by a
   CREATE_SESSION operation using that clientid is required to establish
   the identification on the server.  Establishment of identification by
   a new incarnation of the client also has the effect of immediately
   releasing any locking state that a previous incarnation of that same
   client might have had on the server.  Such released state would
   include all lock, share reservation, and, where the server is not
   supporting the CLAIM_DELEGATE_PREV claim type, all delegation state
   associated with the same client with the same identity.  For
   discussion of delegation state recovery, see the section "Delegation
   Recovery".

   Releasing such state requires that the server be able to determine
   that one client instance is the successor of another.  Where this
   cannot be done, for any of a number of reasons, the locking state
   will remain for a time subject to lease expiration (see Section 8.5)
   and the new client will need to wait for such state to be removed, if
   it makes conflicting lock requests.

   Client identification is encapsulated in the following structure:

               struct nfs_client_id4 {
                       verifier4     verifier;
                       opaque        id<NFS4_OPAQUE_LIMIT>;
               };

   The first field, verifier, is a client incarnation verifier that is
   used to detect client reboots.  Only if the verifier is different
   from that the server had previously recorded for the client (as
   identified by the second field of the structure, id) does the server
   start the process of canceling the client's leased state.

   The second field, id, is a variable length string that uniquely
   defines the client so that subsequent instances of the same client
   bear the same id with a different verifier.
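
   The way the verifier and id fields of nfs_client_id4 work together
   can be illustrated with a minimal sketch, written here in Python for
   brevity.  It is not part of the protocol definition.  The names
   ClientTable and ClientRecord, the returned strings, and the in-memory
   dictionary are all hypothetical, and the principal check and lease
   handling described elsewhere in this document are deliberately
   omitted.

```python
from dataclasses import dataclass, field


@dataclass
class ClientRecord:
    """Hypothetical per-client record kept by a server."""
    verifier: bytes
    leased_state: list = field(default_factory=list)


class ClientTable:
    """Hypothetical table of clients, keyed by the opaque id string."""

    def __init__(self):
        self._by_id = {}

    def present(self, verifier: bytes, client_id: bytes) -> str:
        """Handle an nfs_client_id4 (verifier, id) pair from a client."""
        record = self._by_id.get(client_id)
        if record is None:
            # Unknown id string: remember the verifier for this client.
            self._by_id[client_id] = ClientRecord(verifier)
            return "new"
        if record.verifier == verifier:
            # Same id and same verifier: same client incarnation.
            return "same-incarnation"
        # Same id, different verifier: the client has restarted, so the
        # state leased to the previous incarnation is canceled.
        record.leased_state.clear()
        record.verifier = verifier
        return "rebooted"
```

   In this sketch, leased state is discarded only on the third path,
   where a known id string arrives with a verifier that differs from the
   recorded one.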

9.2.  Filehandle Types

   In the NFS version 2 and 3 protocols, there was one type of
   filehandle with a single set of semantics.  This type of filehandle
   is termed "persistent" in NFS Version 4.  The semantics of a
   persistent filehandle remain the same as before.  A new type of
   filehandle introduced in NFS Version 4 is the "volatile" filehandle,
   which attempts to accommodate certain server environments.

   The volatile filehandle type was introduced to address server
   functionality or implementation issues which make correct
   implementation of a persistent filehandle infeasible.  Some server
   environments do not provide a file system level invariant that can be
   used to construct a persistent filehandle.  The underlying server
   file system may not provide the invariant or the server's file system
   programming interfaces may not provide access to the needed
   invariant.  Volatile filehandles may ease the implementation of
   server functionality such as hierarchical storage management or file
   system reorganization or migration.  However, the volatile filehandle
   increases the implementation burden for the client.

   Since the client will need to handle persistent and volatile
   filehandles differently, a file attribute is defined which may be
   used by the client to determine the filehandle types being returned
   by the server.

9.2.1.  General Properties of a Filehandle

   The filehandle contains all the information the server needs to
   distinguish an individual file.  To the client, the filehandle is
   opaque.  The client stores filehandles for use in a later request and
   can compare two filehandles from the same server for equality by
   doing a byte-by-byte comparison.  However, the client MUST NOT
   otherwise interpret the contents of filehandles.  If two filehandles
   from the same server are equal, they MUST refer to the same file.
   Servers SHOULD try to maintain a one-to-one correspondence between
   filehandles and files but this is not required.  Clients MUST use
   filehandle comparisons only to improve performance, not for correct
   behavior.  All clients need to be prepared for situations in which it
   cannot be determined whether two filehandles denote the same object
   and in such cases, avoid making invalid assumptions which might cause
   incorrect behavior.  Further discussion of filehandle and attribute
   comparison in the context of data caching is presented in the section
   "Data Caching and File Identity".

   As an example, in the case that two different path names when
   traversed at the server terminate at the same file system object, the
   server SHOULD return the same filehandle for each path.  This can
   occur if a hard link is used to create two file names which refer to
   the same underlying file object and associated data.  For example, if
   paths /a/b/c and /a/d/c refer to the same file, the server SHOULD
   return the same filehandle for both path name traversals.

   There are several considerations for how the client generates the id
   string:

   o  The string should be unique so that multiple clients do not
      present the same string.  The consequences of two clients
      presenting the same string range from one client getting an error
      to one client having its leased state abruptly and unexpectedly
      canceled.

   o  The string should be selected so the subsequent incarnations (e.g.
      reboots) of the same client cause the client to present the same
      string.  The implementor is cautioned against an approach that
      requires the string to be recorded in a local file because this
      precludes the use of the implementation in an environment where
      there is no local disk and all file access is from an NFS version
      4 server.

   o  The string should be different for each server network address
      that the client accesses, rather than common to all server network
      addresses.  The reason is that it may not be possible for the
      client to tell if the same server is listening on multiple network
      addresses.  If the client issues CREATE_CLIENTID with the same id
      string to each network address of such a server, the server will
      think it is the same client, and each successive CREATE_CLIENTID
      will cause the server to remove the client's previous leased
      state.

   o  The algorithm for generating the string should not assume that the
      client's network address won't change.  This includes changes
      between client incarnations and even changes while the client is
      still running in its current incarnation.  This means that if the
      client includes just the client's and server's network address in
      the id string, there is a real risk, after the client gives up the
      network address, that another client, using a similar algorithm
      for generating the id string, would generate a conflicting id
      string.

   Given the above considerations, an example of a well generated id
   string is one that includes:

   o  The server's network address.

   o  The client's network address.

   o  For a user level NFS version 4 client, it should contain
      additional information to distinguish the client from other user
      level clients running on the same host, such as a process id or
      other unique sequence.

   o  Additional information that tends to be unique, such as one or
      more of:

      *  The client machine's serial number (for privacy reasons, it is
         best to perform some one way function on the serial number).

      *  A MAC address.

      *  The timestamp of when the NFS version 4 software was first
         installed on the client (though this is subject to the
         previously mentioned caution about using information that is
         stored in a file, because the file might only be accessible
         over NFS version 4).

      *  A true random number.  However since this number ought to be
         the same between client incarnations, this shares the same
         problem as that of using the timestamp of the software
         installation.

   As a security measure, the server MUST NOT cancel a client's leased
   state if the principal that established the state for a given id
   string is not the same as the principal issuing the CREATE_CLIENTID.

   A server may compare an nfs_client_id4 in a CREATE_CLIENTID with an
   nfs_client_id4 established using SETCLIENTID using NFSv4 minor
   version 0, so that an NFSv4.1 client is not forced to delay until
   lease expiration for locking state established by the earlier client
   using minor version 0.

   Once a CREATE_CLIENTID has been done, and the resulting clientid
   established as associated with a session, all requests made on that
   session implicitly identify that clientid, which in turn designates
   the client specified using the long-form nfs_client_id4 structure.

9.2.2.  Persistent Filehandle

   A persistent filehandle is defined as having a fixed value for the
   lifetime of the file system object to which it refers.  Once the
   server creates the filehandle for a file system object, the server
   MUST accept the same filehandle for the object for the lifetime of
   the object.  If the server restarts or reboots the NFS server must
   honor the same filehandle value as it did in the server's previous
   instantiation.  Similarly, if the file system is migrated, the new
   NFS server must honor the same filehandle as the old NFS server.

   The persistent filehandle will become stale or invalid when the file
   system object is removed.  When the server is presented with a
   persistent filehandle that refers to a deleted object, it MUST return
   an error of NFS4ERR_STALE.  A filehandle may become stale when the
   file system containing the object is no longer available.  The file
   system may become unavailable if it exists on removable media and the
   media is no longer available at the server or the file system in
   whole has been destroyed or the file system has simply been removed
   from the server's name space (i.e. unmounted in a UNIX environment).

   The shorthand client identifier (a clientid) is assigned by the
   server and should be chosen so that it will not conflict with a
   clientid previously assigned by the server.  This applies across
   server restarts or reboots.

   In the event of a server restart, a client will find out that its
   current clientid is no longer valid when it receives an
   NFS4ERR_STALE_CLIENTID error.  The precise circumstances depend on
   the characteristics of the sessions involved, specifically whether
   the session is persistent.

   When a session is not persistent, the client will need to create a
   new session.  When the existing clientid is presented to a server as
   part of creating a session and that clientid is not recognized, as
   would happen after a server reboot, the server will reject the
   request with the error NFS4ERR_STALE_CLIENTID.  When this happens,
   the client must obtain a new clientid by use of the CREATE_CLIENTID
   operation and then use that clientid as the basis of a new session
   and then proceed to any other necessary recovery for the server
   reboot case (See Section 8.6.2).

   In the case of the session being persistent, the client will re-
   establish communication using the existing session after the reboot.
   This session will be associated with a stale clientid and the client
   will receive an indication of that fact in the status field returned
   by the SEQUENCE operation.  The client can then use the existing
   session to do whatever operations are necessary to determine the
   status of requests outstanding at the time of reboot, while avoiding
   issuing new requests, particularly any involving locking on that
   session.  Such requests would fail with an NFS4ERR_STALE_CLIENTID
   error or an NFS4ERR_STALE_STATEID error, if attempted.  In any case,
   the client would create a new clientid using CREATE_CLIENTID, create
   a new session based on that clientid, and proceed to other necessary
   recovery for the server reboot case.

   See the detailed descriptions of CREATE_CLIENTID and CREATE_SESSION
   for a complete specification of these operations.

9.2.3.  Volatile Filehandle

   A volatile filehandle does not share the same longevity
   characteristics of a persistent filehandle.  The server may determine
   that a volatile filehandle is no longer valid at many different
   points in time.  If the server can definitively determine that a
   volatile filehandle refers to an object that has been removed, the
   server should return NFS4ERR_STALE to the client (as is the case for
   persistent filehandles).  In all other cases where the server
   determines that a volatile filehandle can no longer be used, it
   should return an error of NFS4ERR_FHEXPIRED.

   The mandatory attribute "fh_expire_type" is used by the client to
   determine what type of filehandle the server is providing for a
   particular file system.  This attribute is a bitmask with the
   following values:

   FH4_PERSISTENT  The value of FH4_PERSISTENT is used to indicate a
      persistent filehandle, which is valid until the object is removed
      from the file system.  The server will not return
      NFS4ERR_FHEXPIRED for this filehandle.  FH4_PERSISTENT is defined
      as a value in which none of the bits specified below are set.

   FH4_VOLATILE_ANY  The filehandle may expire at any time, except as
      specifically excluded (i.e.  FH4_NOEXPIRE_WITH_OPEN).

   FH4_NOEXPIRE_WITH_OPEN  May only be set when FH4_VOLATILE_ANY is set.
      If this bit is set, then the meaning of FH4_VOLATILE_ANY is
      qualified to exclude any expiration of the filehandle when it is
      open.

   FH4_VOL_MIGRATION  The filehandle will expire as a result of a file
      system transition (migration or replication), in those cases in
      which the continuity of filehandle use is not specified by
      _handle_ class information within the fs_locations_info attribute.
      When this bit is set, clients without access to fs_locations_info
      information should assume filehandles will expire on file system
      transitions.

   FH4_VOL_RENAME  The filehandle will expire during rename.  This
      includes a rename by the requesting client or a rename by any
      other client.  If FH4_VOLATILE_ANY is set, FH4_VOL_RENAME is
      redundant.

   Servers which provide volatile filehandles that may expire while open
   (i.e. if FH4_VOL_MIGRATION or FH4_VOL_RENAME is set or if
   FH4_VOLATILE_ANY is set and FH4_NOEXPIRE_WITH_OPEN not set), should
   deny a RENAME or REMOVE that would affect an OPEN file of any of the
   components leading to the OPEN file.  In addition, the server should
   deny all RENAME or REMOVE requests during the grace period upon
   server restart.

   Servers which provide volatile filehandles that may expire while open
   require special care as regards handling of RENAMEs and REMOVEs.
   This situation can arise if FH4_VOL_MIGRATION or FH4_VOL_RENAME is
   set, if FH4_VOLATILE_ANY is set and FH4_NOEXPIRE_WITH_OPEN not set,
   or if a non-readonly file system has a transition target in a
   different _handle_ class.  In these cases, the server should deny a
   RENAME or REMOVE that would affect an OPEN file of any of the
   components leading to the OPEN file.  In addition, the server should
   deny all RENAME or REMOVE requests during the grace period, in order
   to make sure that reclaims of files where filehandles may have
   expired do not do a reclaim for the wrong file.
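
   The fh_expire_type rules can be illustrated with a minimal sketch,
   written here in Python for brevity.  It is not part of the protocol
   definition.  The numeric bit values are not given in this section;
   they are assumed here from the published NFSv4 XDR definitions and
   should be checked against the XDR in use.  The function names are
   hypothetical.

```python
# Bit values assumed from the published NFSv4 XDR definitions; this
# section of the document does not state them.
FH4_PERSISTENT = 0x00000000
FH4_NOEXPIRE_WITH_OPEN = 0x00000001
FH4_VOLATILE_ANY = 0x00000002
FH4_VOL_MIGRATION = 0x00000004
FH4_VOL_RENAME = 0x00000008

_VOLATILE_BITS = (FH4_NOEXPIRE_WITH_OPEN | FH4_VOLATILE_ANY |
                  FH4_VOL_MIGRATION | FH4_VOL_RENAME)


def is_persistent(fh_expire_type: int) -> bool:
    """Persistent means none of the volatile bits are set."""
    return (fh_expire_type & _VOLATILE_BITS) == 0


def is_well_formed(fh_expire_type: int) -> bool:
    """FH4_NOEXPIRE_WITH_OPEN may only be set with FH4_VOLATILE_ANY."""
    if fh_expire_type & FH4_NOEXPIRE_WITH_OPEN:
        return bool(fh_expire_type & FH4_VOLATILE_ANY)
    return True


def may_expire_while_open(fh_expire_type: int) -> bool:
    """True when a filehandle may expire while the file is open."""
    if fh_expire_type & (FH4_VOL_MIGRATION | FH4_VOL_RENAME):
        return True
    return bool(fh_expire_type & FH4_VOLATILE_ANY) and not (
        fh_expire_type & FH4_NOEXPIRE_WITH_OPEN)
```

   A server for which may_expire_while_open() is true falls under the
   RENAME and REMOVE restrictions described above.  The sketch leaves
   out the transition-target case, which depends on fs_locations_info
   rather than on this bitmask.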

8.1.2.  Server Release of Clientid

   If the server determines that the client holds no associated state
   for its clientid, the server may choose to release the clientid.  The
   server may make this choice for an inactive client so that resources
   are not consumed by those intermittently active clients.  If the
   client contacts the server after this release, the server must ensure
   the client receives the appropriate error so that it will use the
   CREATE_CLIENTID/CREATE_SESSION sequence to establish a new identity.
   It should be clear that the server must be very hesitant to release a
   clientid since the resulting work on the client to recover from such
   an event will be the same burden as if the server had failed and
   restarted.  Typically a server would not release a clientid unless
   there had been no activity from that client for many minutes.

   Note that if the id string in a CREATE_CLIENTID request is properly
   constructed, and if the client takes care to use the same principal
   for each successive use of CREATE_CLIENTID, then, barring an active
   denial of service attack, NFS4ERR_CLID_INUSE should never be
   returned.

   However, client bugs, server bugs, or perhaps a deliberate change of
   the principal owner of the id string (such as the case of a client
   that changes security flavors, and under the new flavor, there is no
   mapping to the previous owner) will in rare cases result in
   NFS4ERR_CLID_INUSE.

   In that event, when the server gets a CREATE_CLIENTID for a client id
   that currently has no state, or it has state, but the lease has
   expired, rather than returning NFS4ERR_CLID_INUSE, the server MUST
   allow the CREATE_CLIENTID, and confirm the new clientid if followed
   by the appropriate CREATE_SESSION.

9.3.  One Method of Constructing a Volatile Filehandle

   A volatile filehandle, while opaque to the client could contain:

   [volatile bit = 1 | server boot time | slot | generation number]

   o  slot is an index in the server volatile filehandle table

   o  generation number is the generation number for the table entry/
      slot

   When the client presents a volatile filehandle, the server makes the
   following checks, which assume that the check for the volatile bit
   has passed.  If the server boot time is less than the current server
   boot time, return NFS4ERR_FHEXPIRED.  If slot is out of range, return
   NFS4ERR_BADHANDLE.  If the generation number does not match, return
   NFS4ERR_FHEXPIRED.

   When the server reboots, the table is gone (it is volatile).

   If the volatile bit is 0, then it is a persistent filehandle with a
   different structure following it.
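
   The checks above can be illustrated with a minimal sketch, written
   here in Python for brevity.  It is one possible reading of this
   method, not a definitive implementation.  The field widths (one byte
   for the volatile flag, eight for the boot time, four each for slot
   and generation), the class name VolatileFhTable, and the use of
   strings for status codes are all assumptions made for the example.

```python
import struct

# Hypothetical wire layout: flag (1 byte), boot time (8 bytes),
# slot (4 bytes), generation (4 bytes), all big-endian.
_FH_FORMAT = ">BQII"


class VolatileFhTable:
    """Hypothetical server-side volatile filehandle table."""

    def __init__(self, boot_time: int, slots: int):
        self.boot_time = boot_time
        self.generations = [0] * slots  # lost when the server reboots

    def issue(self, slot: int) -> bytes:
        """Allocate the slot and return a filehandle naming it."""
        self.generations[slot] += 1
        return struct.pack(_FH_FORMAT, 1, self.boot_time, slot,
                           self.generations[slot])

    def check(self, fh: bytes) -> str:
        """Apply the three checks, in the order given in the text."""
        volatile, boot_time, slot, generation = struct.unpack(
            _FH_FORMAT, fh)
        if volatile != 1:
            raise ValueError("not a volatile filehandle")
        if boot_time < self.boot_time:
            return "NFS4ERR_FHEXPIRED"
        if slot >= len(self.generations):
            return "NFS4ERR_BADHANDLE"
        if generation != self.generations[slot]:
            return "NFS4ERR_FHEXPIRED"
        return "NFS4_OK"
```

   Reusing a slot bumps its generation number, so a filehandle issued
   for the earlier occupant of the slot fails the last check.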

9.4.  Client Recovery from Filehandle Expiration

   If possible, the client SHOULD recover from the receipt of an
   NFS4ERR_FHEXPIRED error.  The client must take on additional
   responsibility so that it may prepare itself to recover from the
   expiration of a volatile filehandle.  If the server returns
   persistent filehandles, the client does not need these additional
   steps.

   For volatile filehandles, most commonly the client will need to store
   the component names leading up to and including the file system
   object in question.  With these names, the client should be able to
   recover by finding a filehandle in the name space that is still
   available or by starting at the root of the server's file system name
   space.

   If the expired filehandle refers to an object that has been removed
   from the file system, obviously the client will not be able to
   recover from the expired filehandle.

   It is also possible that the expired filehandle refers to a file that
   has been renamed.  If the file was renamed by another client, again
   it is possible that the original client will not be able to recover.
   However, in the case that the client itself is renaming the file and
   the file is open, it is possible that the client may be able to
   recover.  The client can determine the new path name based on the
   processing of the rename request.  The client can then regenerate the
   new filehandle based on the new path name.  The client could also use
   the compound operation mechanism to construct a set of operations
   like:

             RENAME A B
             LOOKUP B
             GETFH

   Note that the COMPOUND procedure does not provide atomicity.  This
   example only reduces the overhead of recovering from an expired
   filehandle.

8.1.3.  State-owner and Stateid Definition

   When opening a file or requesting a byte-range lock, the client must
   specify an identifier which represents the owner of the requested
   lock.  This identifier is in the form of a state-owner, represented
   in the protocol by a state_owner4, a variable-length opaque array
   which, when concatenated with the current clientid uniquely defines
   the owner of a lock managed by the client.  This may be a thread id,
   process id, or other unique value.

   Owners of opens and owners of byte-range locks are separate entities
   and remain separate even if the same opaque arrays are used to
   designate owners of each.  The protocol distinguishes between open-
   owners (represented by open_owner4 structures) and lock-owners
   (represented by lock_owner4 structures).

   Each open is associated with a specific open-owner while each byte-
   range lock is associated with a lock-owner and an open-owner, the
   latter being the open-owner associated with the open file under which
   the LOCK operation was done.  Delegations and layouts, on the other
   hand, are not associated with a specific owner but are associated
   with the client as a whole.

   When the server grants a lock of any type (including opens, byte-
   range locks, delegations, and layouts) it responds with a unique
   stateid, that represents a set of locks (often a single lock) for the
   same file, of the same type, and sharing the same ownership
   characteristics.  Thus opens of the same file by different open-
   owners each have an identifying stateid.  Similarly, each set of
   byte-range locks on a file owned by a specific lock-owner and gotten
   via an open for a specific open-owner, has its own identifying
   stateid.  Delegations and layouts also have associated stateids by
   which they may be referenced.  The stateid is used as a shorthand
   reference to a lock or set of locks and given a stateid the client
   can determine the associated state-owner or state-owners (in the case
   of an open-owner/lock-owner pair) and the associated filehandle.

10.  File Attributes

   To meet the associated.  Clients,
   however, must not assume any such mapping requirements of extensibility and increased
   interoperability with non-UNIX platforms, attributes must not use be handled
   in a stateid
   returned for flexible manner.  The NFS version 3 fattr3 structure contains a given filehandle and state-owner in the context
   fixed list of a
   different filehandle attributes that not all clients and servers are able to
   support or a different state-owner. care about.  The server is free to form the stateid in any manner that it chooses
   as long fattr3 structure can not be extended as
   new needs arise and it is able provides no way to recognize invalid and out-of-date stateids.
   Although indicate non-support.  With
   the protocol XDR definition divides NFS version 4 protocol, the stateid into client is able query what attributes
   the server supports and construct requests with only those supported
   attributes (or a subset thereof).

   To this end, attributes are divided into
   'seqid' three groups: mandatory,
   recommended, and 'other' fields, for named.  Both mandatory and recommended attributes
   are supported in the purposes of minor NFS version one,
   this distinction is not important 4 protocol by a specific and well-
   defined encoding and are identified by number.  They are requested by
   setting a bit in the bit vector sent in the GETATTR request; the
   server may use response includes a bit vector to list what attributes were
   returned in the
   available space as it chooses, with one exception.

   The exception is that stateids whose 'other' field is either all
   zeros response.  New mandatory or all ones are reserved and recommended attributes
   may not be generated added to the NFS protocol between major revisions by
   publishing a standards-track RFC which allocates a new attribute
   number value and defines the
   server.  Clients may use encoding for the protocol-defined special stateid values attribute.  See the
   section "Minor Versioning" for their defined purposes, but any use of stateid's in this reserved
   class that further discussion.

   Named attributes are not specially defined accessed by the protocol MUST result in
   an NFS4ERR_BAD_STATED being returned.

   Clients may not compare stateids associated with different
   filehandles, so that a server might use stateids with the same bit
   pattern for all opens with new OPENATTR operation, which
   accesses a given open-owner or for all sets hidden directory of
   byte-range locks attributes associated with a given lock-owner/open-owner pair.
   However, if it does so, it must recognize and reject any use of
   stateid when the current file
   system object.  OPENATTR takes a filehandle is such that no lock for that
   filehandle by that open owner (or lock-owner/open-owner pair) exists.

   Stateid's must remain valid until either a client reboot or a sever
   reobot or until the client object and
   returns all of the locks associated with filehandle for the stateid attribute hierarchy.  The filehandle
   for the named attributes is a directory object accessible by means of an operation such as CLOSE LOOKUP
   or DELEGRETURN.
   If READDIR and contains files whose names represent the locks named
   attributes and whose data bytes are lost due to revocation the sateid remains usable
   until value of the client frees it by using FREE_STATEID.  Stateid's
   associated with byte-range locks attribute.  For
   example:

        +----------+-----------+---------------------------------+
        | LOOKUP   | "foo"     | ; look up file                  |
        | GETATTR  | attrbits  |                                 |
        | OPENATTR |           | ; access foo's named attributes |
        | LOOKUP   | "x11icon" | ; look up specific attribute    |
        | READ     | 0,4096    | ; read stream of bytes          |
        +----------+-----------+---------------------------------+

   Named attributes are intended for data needed by applications rather
   than by an exception.  They remain valid
   even if a LOCKU free all remaining locks, so long as the opefile with
   which they are associated remains open, unless the NFS client does a
   FREE_STATEID to caused the stateid implementation.  NFS implementors are strongly
   encouraged to be freed.

   Because each operation using a stateid occurs define their new attributes as part of a session,
   each stateid is implicitly associated with the clientid assigned recommended attributes
   by bringing them to
   that session.  Use of a stateid in the context IETF standards-track process.

   The set of attributes which are classified as mandatory is
   deliberately small since servers must do whatever it takes to support
   them.  A server should support as many of the recommended attributes
   as possible but by their definition, the server is not required to
   support all of them.  Attributes are deemed mandatory if the data is
   both needed by a large number of clients and is not otherwise
   reasonably computable by the client when support is not provided on
   the server.

   Note that the hidden directory returned by OPENATTR is a convenience
   for protocol processing.  The client should not make any assumptions
   about the server's implementation of named attributes and whether the
   underlying file system at the server has a named attribute directory
   or not.  Therefore, operations such as SETATTR and GETATTR on the
   named attribute directory are undefined.

10.1.  Mandatory Attributes

   These MUST be supported by every NFS version 4 client and server in
   order to ensure a minimum level of interoperability.  The server must
   store and return these attributes and the client must be able to
   function with an attribute set limited to these attributes.  With
   just the mandatory attributes some client functionality may be
   impaired or limited in some ways.  A client may ask for any of these
   attributes to be returned by setting a bit in the GETATTR request and
   the server must return their value.

10.2.  Recommended Attributes

   These attributes are understood well enough to warrant support in the
   NFS version 4 protocol.  However, they may not be supported on all
   clients and servers.  A client may ask for any of these attributes to
   be returned by setting a bit in the GETATTR request but must handle
   the case where the server does not return them.  A client may ask for
   the set of attributes the server supports and should not request
   attributes the server does not support.  A server should be tolerant
   of requests for unsupported attributes and simply not return them
   rather than considering the request an error.  It is expected that
   servers will support all attributes they comfortably can and only
   fail to support attributes which are difficult to support in their
   operating environments.  A server should provide attributes whenever
   they don't have to "tell lies" to the client.  For example, a file
   modification time should be either an accurate time or should not be
   supported by the server.  This will not always be comfortable to
   clients but the client is better positioned to decide whether and how
   to fabricate or construct an attribute or whether to do without the
   attribute.
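
   The request pattern described above (set one bit per wanted
   attribute, and avoid asking for attributes the server does not
   support) can be sketched as follows.  This is a minimal illustration
   only.  It assumes the bitmap layout used elsewhere in the protocol,
   in which attribute number n occupies bit (n mod 32) of 32-bit word
   (n / 32).  The function names make_bitmap and restrict_to_supported
   are hypothetical, and the attribute numbers are those given in the
   definition tables of this section.

```python
# Attribute numbers as listed in the definition tables of this section.
SUPP_ATTR, TYPE, CHANGE, SIZE, ACL, HIDDEN, FS_LAYOUT_TYPE = 0, 1, 3, 4, 12, 25, 62


def make_bitmap(attr_numbers):
    """Pack attribute numbers into a list of 32-bit words.

    Assumed layout: attribute n is bit (n % 32) of word (n // 32).
    """
    words = []
    for n in attr_numbers:
        idx = n // 32
        while len(words) <= idx:
            words.append(0)
        words[idx] |= 1 << (n % 32)
    return words


def restrict_to_supported(wanted, supported):
    """Clear every requested bit that the supp_attr bitmap does not contain.

    Both arguments are word lists as produced by make_bitmap(); a word
    missing from 'supported' is treated as all zeros.
    """
    return [
        w & (supported[i] if i < len(supported) else 0)
        for i, w in enumerate(wanted)
    ]


# Hypothetical server that supports only a handful of attributes.
server_supports = make_bitmap([SUPP_ATTR, TYPE, CHANGE, SIZE, ACL])
# The client would like size, ACL and hidden; hidden is dropped.
request = restrict_to_supported(make_bitmap([SIZE, ACL, HIDDEN]), server_supports)
```

   A client that skips the filtering step must still cope with the
   server omitting unsupported attributes from its reply, as the text
   above requires.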

10.3.  Named Attributes

   These attributes are not supported by direct encoding in the NFS
   Version 4 protocol but are accessed by string names rather than
   numbers and correspond to an uninterpreted stream of bytes which are
   stored with the file system object.  The name space for these
   attributes may be accessed by using the OPENATTR operation.  The
   OPENATTR operation returns a filehandle for a virtual "attribute
   directory" and further perusal of the name space may be done using
   READDIR and LOOKUP operations on this filehandle.  Named attributes
   may then be examined or changed by normal READ and WRITE and CREATE
   operations on the filehandles returned from READDIR and LOOKUP.
   Named attributes may have attributes.

   It is recommended that servers support arbitrary named attributes.  A
   client should not depend on the ability to store any named attributes
   in the server's file system.  If a server does support named
   attributes, a client which is also able to handle them should be able
   to copy a file's data and meta-data with complete transparency from
   one location to another; this would imply that names allowed for
   regular directory entries are valid for named attribute names as
   well.

   Names of attributes will not be controlled by this document or other
   IETF standards track documents.  See the section "IANA
   Considerations" for further discussion.
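
   The access pattern for named attributes (obtain the attribute
   directory, enumerate it, then read and write each attribute as a
   byte stream) can be illustrated with a toy in-memory model.  This is
   a sketch under stated assumptions, not a client implementation: the
   class NamedAttrDir and the function copy_named_attrs are
   hypothetical, no network operations are issued, and the methods only
   mimic the roles that READDIR, CREATE, READ and WRITE play on the
   filehandle returned by OPENATTR.

```python
class NamedAttrDir:
    """Toy stand-in for the virtual "attribute directory" of one object."""

    def __init__(self):
        self._streams = {}  # attribute name -> bytearray (uninterpreted bytes)

    def readdir(self):
        """List attribute names, as READDIR on the attribute directory would."""
        return sorted(self._streams)

    def create(self, name):
        """Create an empty attribute if it does not already exist."""
        self._streams.setdefault(name, bytearray())

    def read(self, name, offset, count):
        """Return up to 'count' bytes of the attribute starting at 'offset'."""
        return bytes(self._streams[name][offset:offset + count])

    def write(self, name, offset, data):
        """Write 'data' at 'offset', zero-filling any gap before it."""
        buf = self._streams[name]
        if len(buf) < offset:
            buf.extend(b"\0" * (offset - len(buf)))
        buf[offset:offset + len(data)] = data


def copy_named_attrs(src, dst, chunk=4096):
    """Copy every named attribute from src to dst in fixed-size reads."""
    for name in src.readdir():
        dst.create(name)
        offset = 0
        while True:
            data = src.read(name, offset, chunk)
            if not data:
                break
            dst.write(name, offset, data)
            offset += len(data)


source = NamedAttrDir()
source.create("x11icon")
source.write("x11icon", 0, b"\x89PNG" * 2000)
target = NamedAttrDir()
copy_named_attrs(source, target)
```

   Copying by name in this way is only transparent if the destination
   accepts every name the source uses, which is why the text above
   implies that names valid for regular directory entries should also
   be valid for named attributes.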

10.4.  Classification of Attributes

   Each of the Mandatory and Recommended attributes can be classified in
   one of three categories: per server, per file system, or per file
   system object.  Note that it is possible that some per file system
   attributes may vary within the file system.  See the "homogeneous"
   attribute for its definition.  Note that the attributes
   time_access_set and time_modify_set are not listed in this section
   because they are write-only attributes corresponding to time_access
   and time_modify, and are used in a special instance of SETATTR.

   o  The per server attribute is:

         lease_time

   o  The per file system attributes are:

         supp_attr, fh_expire_type, link_support, symlink_support,
         unique_handles, aclsupport, cansettime, case_insensitive,
         case_preserving, chown_restricted, files_avail, files_free,
         files_total, fs_locations, homogeneous, maxfilesize, maxname,
         maxread, maxwrite, no_trunc, space_avail, space_free,
         space_total, time_delta, fs_layout_type, send_impl_id,
         recv_impl_id

   o  The per file system object attributes are:

         type, change, size, named_attr, fsid, rdattr_error, filehandle,
         ACL, archive, fileid, hidden, maxlink, mimetype, mode,
         numlinks, owner, owner_group, rawdev, space_used, system,
         time_access, time_backup, time_create, time_metadata,
         time_modify, mounted_on_fileid, layout_type, layout_hint,
         layout_blksize, layout_alignment

   For quota_avail_hard, quota_avail_soft, and quota_used see their
   definitions below for the appropriate classification.
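
   One way a client might make use of this classification is to decide
   how widely a cached attribute value can be shared.  The sketch below
   is illustrative only.  The attribute lists are copied from the
   classification above.  The functions scope_of and cache_key, the
   shape of the cache key, and the idea of keying per file system
   attributes by object when "homogeneous" is false are all assumptions
   made for this example and are not defined by the protocol.

```python
PER_SERVER = {"lease_time"}

PER_FILE_SYSTEM = set("""
    supp_attr fh_expire_type link_support symlink_support unique_handles
    aclsupport cansettime case_insensitive case_preserving chown_restricted
    files_avail files_free files_total fs_locations homogeneous maxfilesize
    maxname maxread maxwrite no_trunc space_avail space_free space_total
    time_delta fs_layout_type send_impl_id recv_impl_id
""".split())

PER_OBJECT = set("""
    type change size named_attr fsid rdattr_error filehandle ACL archive
    fileid hidden maxlink mimetype mode numlinks owner owner_group rawdev
    space_used system time_access time_backup time_create time_metadata
    time_modify mounted_on_fileid layout_type layout_hint layout_blksize
    layout_alignment
""".split())


def scope_of(name):
    """Return 'server', 'file system' or 'object'; None if not listed here."""
    if name in PER_SERVER:
        return "server"
    if name in PER_FILE_SYSTEM:
        return "file system"
    if name in PER_OBJECT:
        return "object"
    return None  # e.g. the quota attributes, classified in their definitions


def cache_key(name, server, fsid, filehandle, homogeneous=True):
    """Build a hypothetical cache key that is no wider than the scope allows."""
    scope = scope_of(name)
    if scope is None:
        raise KeyError(name)
    if scope == "server":
        return (server, name)
    if scope == "file system" and homogeneous:
        return (server, fsid, name)
    # Per object, or per file system on a non-homogeneous file system.
    return (server, fsid, filehandle, name)
```

   For example, under these assumptions a cached maxread value would be
   shared by all objects with the same fsid on a homogeneous file
   system, while fileid would always be cached per object.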

10.5.  Mandatory Attributes - Definitions

   +-----------------+----+------------+--------+----------------------+
   | name            | #  | Data Type  | Access | Description          |
   +-----------------+----+------------+--------+----------------------+
   | supp_attr       | 0  | bitmap     | READ   | The bit vector which |
   |                 |    |            |        | would retrieve all   |
   |                 |    |            |        | mandatory and        |
   |                 |    |            |        | recommended          |
   |                 |    |            |        | attributes that are  |
   |                 |    |            |        | supported for this   |
   |                 |    |            |        | object. The scope of |
   |                 |    |            |        | this attribute       |
   |                 |    |            |        | applies to all       |
   |                 |    |            |        | objects with a       |
   |                 |    |            |        | matching fsid.       |
   | type            | 1  | nfs4_ftype | READ   | The type of the      |
   |                 |    |            |        | object (file,        |
   |                 |    |            |        | directory, symlink,  |
   |                 |    |            |        | etc.)                |
   | fh_expire_type  | 2  | uint32     | READ   | Server uses this to  |
   |                 |    |            |        | specify filehandle   |
   |                 |    |            |        | expiration behavior  |
   |                 |    |            |        | to the client. See   |
   |                 |    |            |        | the section          |
   |                 |    |            |        | "Filehandles" for    |
   |                 |    |            |        | additional           |
   |                 |    |            |        | description.         |
   | change          | 3  | uint64     | READ   | A value created by   |
   |                 |    |            |        | the server that the  |
   |                 |    |            |        | client can use to    |
   |                 |    |            |        | determine if file    |
   |                 |    |            |        | data, directory      |
   |                 |    |            |        | contents or          |
   |                 |    |            |        | attributes of the    |
   |                 |    |            |        | object have been     |
   |                 |    |            |        | modified. The server |
   |                 |    |            |        | may return the       |
   |                 |    |            |        | object's             |
   |                 |    |            |        | time_metadata        |
   |                 |    |            |        | attribute for this   |
   |                 |    |            |        | attribute's value    |
   |                 |    |            |        | but only if the file |
   |                 |    |            |        | system object can    |
   |                 |    |            |        | not be updated more  |
   |                 |    |            |        | frequently than the  |
   |                 |    |            |        | resolution of        |
   |                 |    |            |        | time_metadata.       |
   | size            | 4  | uint64     | R/W    | The size of the      |
   |                 |    |            |        | object in bytes.     |
   | link_support    | 5  | bool       | READ   | True, if the         |
   |                 |    |            |        | object's file system |
   |                 |    |            |        | supports hard links. |
   | symlink_support | 6  | bool       | READ   | True, if the         |
   |                 |    |            |        | object's file system |
   |                 |    |            |        | supports symbolic    |
   |                 |    |            |        | links.               |
   | named_attr      | 7  | bool       | READ   | True, if this object |
   |                 |    |            |        | has named            |
   |                 |    |            |        | attributes. In other |
   |                 |    |            |        | words, object has a  |
   |                 |    |            |        | non-empty named      |
   |                 |    |            |        | attribute directory. |
   | fsid            | 8  | fsid4      | READ   | Unique file system   |
   |                 |    |            |        | identifier for the   |
   |                 |    |            |        | file system holding  |
   |                 |    |            |        | this object. fsid    |
   |                 |    |            |        | contains major and   |
   |                 |    |            |        | minor components     |
   |                 |    |            |        | each of which are    |
   |                 |    |            |        | uint64.              |
   | unique_handles  | 9  | bool       | READ   | True, if two         |
   |                 |    |            |        | distinct filehandles |
   |                 |    |            |        | guaranteed to refer  |
   |                 |    |            |        | to two different     |
   |                 |    |            |        | file system objects. |
   | lease_time      | 10 | nfs_lease4 | READ   | Duration of leases   |
   |                 |    |            |        | at server in         |
   |                 |    |            |        | seconds.             |
   | rdattr_error    | 11 | enum       | READ   | Error returned from  |
   |                 |    |            |        | getattr during       |
   |                 |    |            |        | readdir.             |
   | filehandle      | 19 | nfs_fh4    | READ   | The filehandle of    |
   |                 |    |            |        | this object          |
   |                 |    |            |        | (primarily for       |
   |                 |    |            |        | readdir requests).   |
   +-----------------+----+------------+--------+----------------------+
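
   Because every mandatory attribute must be supported, a client could
   sanity-check the supp_attr bitmap it receives against the numbers in
   the table above.  The following is a minimal sketch under stated
   assumptions: the bitmap is taken to be a list of 32-bit words with
   attribute n at bit (n mod 32) of word (n / 32), and the names
   bit_is_set and missing_mandatory are hypothetical helpers, not
   protocol elements.

```python
# Attribute numbers and names from the Mandatory Attributes table above.
MANDATORY = {
    0: "supp_attr", 1: "type", 2: "fh_expire_type", 3: "change",
    4: "size", 5: "link_support", 6: "symlink_support", 7: "named_attr",
    8: "fsid", 9: "unique_handles", 10: "lease_time", 11: "rdattr_error",
    19: "filehandle",
}


def bit_is_set(bitmap, n):
    """True if attribute n is present in the word-list bitmap."""
    idx = n // 32
    return idx < len(bitmap) and bool(bitmap[idx] & (1 << (n % 32)))


def missing_mandatory(bitmap):
    """Names of mandatory attributes absent from the bitmap, in number order."""
    return [name for n, name in sorted(MANDATORY.items())
            if not bit_is_set(bitmap, n)]


# Bits 0-11 plus bit 19: every mandatory attribute is advertised.
complete = [0xFFF | (1 << 19)]
# Bits 0-11 only: the filehandle attribute (19) is not advertised.
incomplete = [0xFFF]
```

   With these inputs, missing_mandatory(complete) yields an empty list
   and missing_mandatory(incomplete) yields a list containing only
   "filehandle".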

10.6.  Recommended Attributes - Definitions
   +--------------------+----+---------------+--------+----------------+
   | name               | #  | Data Type     | Access | Description    |
   +--------------------+----+---------------+--------+----------------+
   | ACL                | 12 | nfsace4<>     | R/W    | The access     |
   |                    |    |               |        | control list   |
   |                    |    |               |        | for the        |
   |                    |    |               |        | object.        |
   | aclsupport         | 13 | uint32        | READ   | Indicates what |
   |                    |    |               |        | types of ACLs  |
   |                    |    |               |        | are supported  |
   |                    |    |               |        | on the current |
   |                    |    |               |        | file system.   |
   | archive            | 14 | bool          | R/W    | True, if this  |
   |                    |    |               |        | file has been  |
   |                    |    |               |        | archived since |
   |                    |    |               |        | the time of    |
   |                    |    |               |        | last           |
   |                    |    |               |        | modification   |
   |                    |    |               |        | (deprecated in |
   |                    |    |               |        | favor of       |
   |                    |    |               |        | time_backup).  |
   | cansettime         | 15 | bool          | READ   | True, if the   |
   |                    |    |               |        | server is able |
   |                    |    |               |        | to change the  |
   |                    |    |               |        | times for a    |
   |                    |    |               |        | file system    |
   |                    |    |               |        | object as      |
   |                    |    |               |        | specified in a |
   |                    |    |               |        | SETATTR        |
   |                    |    |               |        | operation.     |
   | case_insensitive   | 16 | bool          | READ   | True, if       |
   |                    |    |               |        | filename       |
   |                    |    |               |        | comparisons on |
   |                    |    |               |        | this file      |
   |                    |    |               |        | system are     |
   |                    |    |               |        | case           |
   |                    |    |               |        | insensitive.   |
   | case_preserving    | 17 | bool          | READ   | True, if       |
   |                    |    |               |        | filename case  |
   |                    |    |               |        | on this file   |
   |                    |    |               |        | system are     |
   |                    |    |               |        | preserved.     |
   | chown_restricted   | 18 | bool          | READ   | If TRUE, the   |
   |                    |    |               |        | server will    |
   |                    |    |               |        | reject any     |
   |                    |    |               |        | request to     |
   |                    |    |               |        | change either  |
   |                    |    |               |        | the owner or   |
   |                    |    |               |        | the group      |
   |                    |    |               |        | associated     |
   |                    |    |               |        | with a file if |
   |                    |    |               |        | the caller is  |
   |                    |    |               |        | not a          |
   |                    |    |               |        | privileged     |
   |                    |    |               |        | user (for      |
   |                    |    |               |        | example,       |
   |                    |    |               |        | "root" in UNIX |
   |                    |    |               |        | operating      |
   |                    |    |               |        | environments   |
   |                    |    |               |        | or in Windows  |
   |                    |    |               |        | 2000 the "Take |
   |                    |    |               |        | Ownership"     |
   |                    |    |               |        | privilege).    |
   | dir_notif_delay    | 56 | nfstime4      | READ   | notification   |
   |                    |    |               |        | delays on      |
   |                    |    |               |        | directory      |
   |                    |    |               |        | attributes     |
   | dirent_notif_delay | 57 | nfstime4      | READ   | notification   |
   |                    |    |               |        | delays on      |
   |                    |    |               |        | child          |
   |                    |    |               |        | attributes     |
   | fileid             | 20 | uint64        | READ   | A number       |
   |                    |    |               |        | uniquely       |
   |                    |    |               |        | identifying    |
   |                    |    |               |        | the file       |
   |                    |    |               |        | within the     |
   |                    |    |               |        | file system.   |
   | files_avail        | 21 | uint64        | READ   | File slots     |
   |                    |    |               |        | available to   |
   |                    |    |               |        | this user on   |
   |                    |    |               |        | the file       |
   |                    |    |               |        | system         |
   |                    |    |               |        | containing     |
   |                    |    |               |        | this object -  |
   |                    |    |               |        | this should be |
   |                    |    |               |        | the smallest   |
   |                    |    |               |        | relevant       |
   |                    |    |               |        | limit.         |
   | files_free         | 22 | uint64        | READ   | Free file      |
   |                    |    |               |        | slots on the   |
   |                    |    |               |        | file system    |
   |                    |    |               |        | containing     |
   |                    |    |               |        | this object -  |
   |                    |    |               |        | this should be |
   |                    |    |               |        | the smallest   |
   |                    |    |               |        | relevant       |
   |                    |    |               |        | limit.         |
   | files_total        | 23 | uint64        | READ   | Total file     |
   |                    |    |               |        | slots on the   |
   |                    |    |               |        | file system    |
   |                    |    |               |        | containing     |
   |                    |    |               |        | this object.   |
   | fs_absent          | 60 | bool          | READ   | Is current     |
   |                    |    |               |        | file system    |
   |                    |    |               |        | present or     |
   |                    |    |               |        | absent.        |
   | fs_layout_type     | 62 | layouttype4   | READ   | Layout types   |
   |                    |    |               |        | available for  |
   |                    |    |               |        | the file       |
   |                    |    |               |        | system.        |
   | fs_locations       | 24 | fs_locations  | READ   | Locations      |
   |                    |    |               |        | where this     |
   |                    |    |               |        | file system    |
   |                    |    |               |        | may be found.  |
   |                    |    |               |        | If the server  |
   |                    |    |               |        | returns        |
   |                    |    |               |        | NFS4ERR_MOVED  |
   |                    |    |               |        | as an error,   |
   |                    |    |               |        | this attribute |
   |                    |    |               |        | MUST be        |
   |                    |    |               |        | supported.     |
   | fs_locations_info  | 67 |               | READ   | Full function  |
   |                    |    |               |        | file system    |
   |                    |    |               |        | location.      |
   | fs_status          | 61 | fs4_status    | READ   | Generic file   |
   |                    |    |               |        | system type    |
   |                    |    |               |        | information.   |
   | hidden             | 25 | bool          | R/W    | True, if the   |
   |                    |    |               |        | file is        |
   |                    |    |               |        | considered     |
   |                    |    |               |        | hidden with    |
   |                    |    |               |        | respect to the |
   |                    |    |               |        | Windows API?   |
   | homogeneous        | 26 | bool          | READ   | True, if this  |
   |                    |    |               |        | object's file  |
   |                    |    |               |        | system is      |
   |                    |    |               |        | homogeneous,   |
   |                    |    |               |        | i.e. are per   |
   |                    |    |               |        | file system    |
   |                    |    |               |        | attributes the |
   |                    |    |               |        | same for all   |
   |                    |    |               |        | file system's  |
   |                    |    |               |        | objects.       |
   | layout_alignment   | 66 | uint32_t      | READ   | Preferred      |
   |                    |    |               |        | alignment for  |
   |                    |    |               |        | layout related |
   |                    |    |               |        | I/O.           |
   | layout_blksize     | 65 | uint32_t      | READ   | Preferred      |
   |                    |    |               |        | block size for |
   |                    |    |               |        | layout related |
   |                    |    |               |        | I/O.           |
   | layout_hint        | 63 | layouthint4   | WRITE  | Client         |
   |                    |    |               |        | specified hint |
   |                    |    |               |        | for file       |
   |                    |    |               |        | layout.        |
   | layout_type        | 64 | layouttype4   | READ   | Layout types   |
   |                    |    |               |        | available for  |
   |                    |    |               |        | the
   requesting application.

8.4.  Blocking Locks

   Some clients require file.      |
   | maxfilesize        | 27 | uint64        | READ   | Maximum        |
   |                    |    |               |        | supported file |
   |                    |    |               |        | size for the support   |
   |                    |    |               |        | file system of blocking locks.  NFSv4.1 does not
   provide a callback when a previously unavailable lock becomes
   available.  Clients thus have no choice but to continually poll |
   |                    |    |               |        | this object.   |
   | maxlink            | 28 | uint32        | READ   | Maximum number |
   |                    |    |               |        | of links for   |
   |                    |    |               |        | this object.   |
   | maxname            | 29 | uint32        | READ   | Maximum        |
   |                    |    |               |        | filename size  |
   |                    |    |               |        | supported for  |
   |                    |    |               |        | this object.   |
   | maxread            | 30 | uint64        | READ   | Maximum read   |
   |                    |    |               |        | size supported |
   |                    |    |               |        | for this       |
   |                    |    |               |        | object.        |
   | maxwrite           | 31 | uint64        | READ   | Maximum write  |
   |                    |    |               |        | size supported |
   |                    |    |               |        | for this       |
   |                    |    |               |        | object. This   |
   |                    |    |               |        | attribute      |
   |                    |    |               |        | SHOULD be      |
   |                    |    |               |        | supported if   |
   |                    |    |               |        | the lock.  This presents a fairness problem.  Two new lock types are
   added, READW and WRITEW, and are used to indicate to the server that
   the client file is requesting a blocking lock.  The server should maintain
   an ordered list    |
   |                    |    |               |        | writable. Lack |
   |                    |    |               |        | of pending blocking locks.  When the conflicting lock
   is released, the server may wait the lease period for the first
   waiting client this        |
   |                    |    |               |        | attribute can  |
   |                    |    |               |        | lead to re-request the lock.  After the lease period
   expires the next waiting    |
   |                    |    |               |        | client request is allowed the lock.  Clients
   are required to poll at an interval sufficiently small that it is
   likely to acquire the lock in a timely manner.  The server is either  |
   |                    |    |               |        | wasting        |
   |                    |    |               |        | bandwidth or   |
   |                    |    |               |        | not
   required receiving  |
   |                    |    |               |        | the best       |
   |                    |    |               |        | performance.   |
   | mdsthreshold       | 68 | mdsthreshold4 | READ   | Hint to maintain a list of pending blocked locks client |
   |                    |    |               |        | as it is used to
   increase fairness and not correct operation.  Because of the
   unordered nature of crash recovery, storing of lock state to stable
   storage would be required when to guarantee ordered granting of blocking
   locks.

   Servers may also note  |
   |                    |    |               |        | write through  |
   |                    |    |               |        | the lock types and delay returning denial pnfs       |
   |                    |    |               |        | metadata       |
   |                    |    |               |        | server.        |
   | mimetype           | 32 | utf8<>        | R/W    | MIME body      |
   |                    |    |               |        | type/subtype   |
   |                    |    |               |        | of
   the request to allow extra time this        |
   |                    |    |               |        | object.        |
   | mode               | 33 | mode4         | R/W    | UNIX-style     |
   |                    |    |               |        | mode and       |
   |                    |    |               |        | permission     |
   |                    |    |               |        | bits for a conflicting lock to be
   released, allowing a successful return.  In this way, clients can
   avoid  |
   |                    |    |               |        | object.        |
   | mounted_on_fileid  | 55 | uint64        | READ   | Like fileid,   |
   |                    |    |               |        | but if the burden of needlessly frequent polling for blocking locks.
   The server should take care in     |
   |                    |    |               |        | target         |
   |                    |    |               |        | filehandle is  |
   |                    |    |               |        | the length root of delay in the event the
   client retransmits a  |
   |                    |    |               |        | file system    |
   |                    |    |               |        | return the request.

8.5.  Lease Renewal

   The purpose     |
   |                    |    |               |        | fileid of a lease is to allow a server to remove stale locks
   that are held by a client that has crashed or is otherwise
   unreachable.  It is not a mechanism for cache consistency and lease
   renewals may not be denied if the lease interval has not expired.

   Since each session is associated with  |
   |                    |    |               |        | underlying     |
   |                    |    |               |        | directory.     |
   | no_trunc           | 34 | bool          | READ   | True, if a specific client, any
   operation issued on that session     |
   |                    |    |               |        | name longer    |
   |                    |    |               |        | than name_max  |
   |                    |    |               |        | is used, an indication that the associated
   client is reachable.  When a request is issued for a given session,
   execution of a SEQUENCE operation will result in all leases for the
   associated client to    |
   |                    |    |               |        | error be implicitly renewed.  This approach allows for
   low overhead lease renewal which scales well.  In the typical case no
   extra RPC calls are required for lease renewal       |
   |                    |    |               |        | returned and in the worst case
   one RPC   |
   |                    |    |               |        | name is required every lease period, via a COMPOUND that consists
   solely not    |
   |                    |    |               |        | truncated.     |
   | numlinks           | 35 | uint32        | READ   | Number of a single SEQUENCE operation. hard |
   |                    |    |               |        | links to this  |
   |                    |    |               |        | object.        |
   | owner              | 36 | utf8<>        | R/W    | The number string     |
   |                    |    |               |        | name of locks held by
   the client is not a factor since all state for the client is involved
   with the lease renewal action.

   Since all operations that create a new lease also renew existing
   leases, the server must maintain a common lease expiration time for
   all valid leases for a given client.  This lease time can then be
   easily updated upon implicit lease renewal actions.

8.6.  Crash Recovery    |
   |                    |    |               |        | owner of this  |
   |                    |    |               |        | object.        |
   | owner_group        | 37 | utf8<>        | R/W    | The important requirement in crash recovery is that both the client
   and string     |
   |                    |    |               |        | name of the server know when    |
   |                    |    |               |        | group          |
   |                    |    |               |        | ownership of   |
   |                    |    |               |        | this object.   |
   | quota_avail_hard   | 38 | uint64        | READ   | For definition |
   |                    |    |               |        | see "Quota     |
   |                    |    |               |        | Attributes"    |
   |                    |    |               |        | section below. |
   | quota_avail_soft   | 39 | uint64        | READ   | For definition |
   |                    |    |               |        | see "Quota     |
   |                    |    |               |        | Attributes"    |
   |                    |    |               |        | section below. |
   | quota_used         | 40 | uint64        | READ   | For definition |
   |                    |    |               |        | see "Quota     |
   |                    |    |               |        | Attributes"    |
   |                    |    |               |        | section below. |
   | rawdev             | 41 | specdata4     | READ   | Raw device     |
   |                    |    |               |        | identifier.    |
   |                    |    |               |        | UNIX device    |
   |                    |    |               |        | major/minor    |
   |                    |    |               |        | node           |
   |                    |    |               |        | information.   |
   |                    |    |               |        | If the other has failed.  Additionally, it is
   required that a client sees a consistent view value   |
   |                    |    |               |        | of data across server
   restarts or reboots.  All READ and WRITE operations that may have
   been queued within the client type is not |
   |                    |    |               |        | NF4BLK or network buffers must wait until the
   client has successfully recovered the locks protecting      |
   |                    |    |               |        | NF4CHR, the    |
   |                    |    |               |        | value return   |
   |                    |    |               |        | SHOULD NOT be  |
   |                    |    |               |        | considered     |
   |                    |    |               |        | useful.        |
   | recv_impl_id       | 59 | nfs_impl_id4  | READ and   | Client obtains |
   |                    |    |               |        | server         |
   |                    |    |               |        | implementation |
   |                    |    |               |        | via GETATTR.   |
   | send_impl_id       | 58 | impl_ident4   | WRITE operations.

8.6.1.  | Client Failure and Recovery

   In the event that a client fails, the         |
   |                    |    |               |        | provides       |
   |                    |    |               |        | server may release the client's
   locks when with    |
   |                    |    |               |        | implementation |
   |                    |    |               |        | identity via   |
   |                    |    |               |        | SETATTR.       |
   | space_avail        | 42 | uint64        | READ   | Disk space in  |
   |                    |    |               |        | bytes          |
   |                    |    |               |        | available to   |
   |                    |    |               |        | this user on   |
   |                    |    |               |        | the associated leases have expired.  Conflicting locks
   from another client may only be granted after file       |
   |                    |    |               |        | system         |
   |                    |    |               |        | containing     |
   |                    |    |               |        | this lease expiration.
   When a client has not not failed and re-establishes his lease before
   expiration occurs, requests for conflicting locks will not object -  |
   |                    |    |               |        | this should be
   granted.

   To minimize client delay upon restart, lock requests are associated
   with an instance of |
   |                    |    |               |        | the client by a client supplied verifier.  This
   verifier is part of smallest   |
   |                    |    |               |        | relevant       |
   |                    |    |               |        | limit.         |
   | space_free         | 43 | uint64        | READ   | Free disk      |
   |                    |    |               |        | space in bytes |
   |                    |    |               |        | on the initial CREATE_CLIENTID call made by file    |
   |                    |    |               |        | system         |
   |                    |    |               |        | containing     |
   |                    |    |               |        | this object -  |
   |                    |    |               |        | this should be |
   |                    |    |               |        | the
   client.  The server returns a clientid as a result smallest   |
   |                    |    |               |        | relevant       |
   |                    |    |               |        | limit.         |
   | space_total        | 44 | uint64        | READ   | Total disk     |
   |                    |    |               |        | space in bytes |
   |                    |    |               |        | on the file    |
   |                    |    |               |        | system         |
   |                    |    |               |        | containing     |
   |                    |    |               |        | this object.   |
   | space_used         | 45 | uint64        | READ   | Number of file |
   |                    |    |               |        | system bytes   |
   |                    |    |               |        | allocated to   |
   |                    |    |               |        | this object.   |
   | system             | 46 | bool          | R/W    | True, if this  |
   |                    |    |               |        | file is a      |
   |                    |    |               |        | "system" file  |
   |                    |    |               |        | with respect   |
   |                    |    |               |        | to the
   CREATE_CLIENTID operation. Windows |
   |                    |    |               |        | API?           |
   | time_access        | 47 | nfstime4      | READ   | The client then confirms the use time of    |
   |                    |    |               |        | last access to |
   |                    |    |               |        | the
   clientid object by establishing  |
   |                    |    |               |        | a session associated with that clientid.
   All locks, including opens, byte-range locks, delegations, and layout
   obtained by sessions using that clientid are associated with read that
   clientid.

   Since the verifier will be changed    |
   |                    |    |               |        | was satisfied  |
   |                    |    |               |        | by the client upon each
   initialization, the server can compare a new verifier to the verifier
   associated with currently held locks and determine that they do not
   match.  This signifies server. |
   | time_access_set    | 48 | settime4      | WRITE  | Set the client's new instantiation and subsequent
   loss time   |
   |                    |    |               |        | of locking state.  As a result, the server is free last access |
   |                    |    |               |        | to release
   all locks held which are associated with the old clientid which was
   derived from object. |
   |                    |    |               |        | SETATTR use    |
   |                    |    |               |        | only.          |
   | time_backup        | 49 | nfstime4      | R/W    | The time of    |
   |                    |    |               |        | last backup of |
   |                    |    |               |        | the old verifier.  At this point conflicting locks from
   other clients, kept waiting while object.    |
   | time_create        | 50 | nfstime4      | R/W    | The time of    |
   |                    |    |               |        | creation of    |
   |                    |    |               |        | the leaser had object.    |
   |                    |    |               |        | This attribute |
   |                    |    |               |        | does not yet expired, can
   be granted.

   Note that the verifier must have  |
   |                    |    |               |        | any relation   |
   |                    |    |               |        | to the same uniqueness properties of
   the verifier for the COMMIT operation.

8.6.2.  Server Failure and Recovery

   If the         |
   |                    |    |               |        | traditional    |
   |                    |    |               |        | UNIX file      |
   |                    |    |               |        | attribute      |
   |                    |    |               |        | "ctime" or     |
   |                    |    |               |        | "change time". |
   | time_delta         | 51 | nfstime4      | READ   | Smallest       |
   |                    |    |               |        | useful server loses locking state (usually as a result  |
   |                    |    |               |        | time           |
   |                    |    |               |        | granularity.   |
   | time_metadata      | 52 | nfstime4      | READ   | The time of a restart
   or reboot), it must allow clients    |
   |                    |    |               |        | last meta-data |
   |                    |    |               |        | modification   |
   |                    |    |               |        | of the object. |
   | time_modify        | 53 | nfstime4      | READ   | The time of    |
   |                    |    |               |        | last           |
   |                    |    |               |        | modification   |
   |                    |    |               |        | to discover this fact and re-
   establish the lost locking state.  The client must be able object. |
   | time_modify_set    | 54 | settime4      | WRITE  | Set the time   |
   |                    |    |               |        | of last        |
   |                    |    |               |        | modification   |
   |                    |    |               |        | to re-
   establish the locking state without having object. |
   |                    |    |               |        | SETATTR use    |
   |                    |    |               |        | only.          |
   +--------------------+----+---------------+--------+----------------+

   the server deny valid requests because the server has granted
   conflicting access to another client.  Likewise, if there is a
   possibility that clients have not yet re-established their locking
   state for a file, the server must disallow READ and WRITE operations
   for that file.

   A client can determine that server failure (and thus loss of locking
   state) has occurred, when it receives one of two errors.  The
   NFS4ERR_STALE_STATEID error indicates a stateid invalidated by a
   reboot or restart.  The NFS4ERR_STALE_CLIENTID error indicates a
   clientid invalidated by reboot or restart.  When either of these is
   received, the client must establish a new clientid (see
   Section 8.1.1) and re-establish its locking state.

   Once a session is established using the new clientid, the client will
   use reclaim-type locking requests (i.e.  LOCK requests with reclaim
   set to true and OPEN operations with a claim type of CLAIM_PREVIOUS)
   to re-establish its locking state.  Once this is done, or if there is
   no such locking state to reclaim, the client does a RECLAIM_COMPLETE
   operation to indicate that it has reclaimed all of the locking state
   that it will reclaim.  Once a client does a RECLAIM_COMPLETE
   operation, it may attempt non-reclaim locking operations, although it
   may get NFS4ERR_GRACE errors on these until the period of special
   handling is over.

   The period of special handling of locking and READs and WRITEs is
   referred to as the "grace period".  During the grace period, clients
   recover locks and the associated state using reclaim-type locking
   requests.  During this period, the server must reject READ and WRITE
   operations and non-reclaim locking requests (i.e. other LOCK and OPEN
   operations) with an error of NFS4ERR_GRACE, unless it is able to
   guarantee that these may be done safely, as described below.

   The grace period may last until all clients who are known to possibly
   have had locks have done a RECLAIM_COMPLETE operation, indicating
   that they have finished reclaiming the locks they held before the
   server reboot.  The server is assumed to maintain in stable storage a
   list of clients who may have such locks.  The server may also
   terminate the grace period before all clients have done
   RECLAIM_COMPLETE.  The server SHOULD NOT terminate the grace period
   before a time equal to the lease period in order to give clients an
   opportunity to find out about the server reboot.  Some additional
   time in order to allow time to establish a new clientid and session
   and to effect lock reclaims may be added.

   If the server can reliably determine that granting a non-reclaim
   request will not conflict with reclamation of locks by other clients,
   the NFS4ERR_GRACE error does not have to be returned even within the
   grace period, although NFS4ERR_GRACE must always be returned to
   clients attempting a non-reclaim lock request before doing their own
   RECLAIM_COMPLETE.  For the server to be able to service READ and
   WRITE operations during the grace period, it must again be able to
   guarantee that no possible conflict could arise between a potential
   reclaim locking request and the READ or WRITE operation.  If the
   server is unable to offer that guarantee, the NFS4ERR_GRACE error
   must be returned to the client.

   For a server to provide simple, valid handling during the grace
   period, the easiest method is to simply reject all non-reclaim
   locking requests and READ and WRITE operations by returning the
   NFS4ERR_GRACE error.  However, a server may keep information about
   granted locks in stable storage.  With this information, the server
   could determine if a regular lock or READ or WRITE operation can be
   safely processed.

   For example, if the server maintained on stable storage summary
   information on whether mandatory locks exist, either mandatory byte-
   range locks, or share reservations specifying deny modes, many
   requests could be allowed during the grace period.  If it is known
   that no such share reservations exist, OPEN requests that do not
   specify deny modes may be safely granted.  If, in addition, it is
   known that no mandatory byte-range locks exist, either through
   information stored on stable storage or simply because the server
   does not support such locks, READ and WRITE requests may be safely
   processed during the grace period.

   To reiterate, for a server that allows non-reclaim lock and I/O
   requests to be processed during the grace period, it MUST determine
   that no lock subsequently reclaimed will be rejected and that no lock
   subsequently reclaimed would have prevented any I/O operation
   processed during the grace period.

   Clients should be prepared for the return of NFS4ERR_GRACE errors for
   non-reclaim lock and I/O requests.  In this case the client should
   employ a retry mechanism for the request.  A delay (on the order of
   several seconds) between retries should be used to avoid overwhelming
   the server.  Further discussion of the general issue is included in
   [Floyd].  The client must account for the server that is able to
   perform I/O and non-reclaim locking requests within the grace period
   as well as those that can not do so.

   A reclaim-type locking request outside the server's grace period can
   only succeed if the server can guarantee that no conflicting lock or
   I/O request has been granted since reboot or restart.

   A server may, upon restart, establish a new value for the lease
   period.  Therefore, clients should, once a new clientid is
   established, refetch the lease_time attribute and use it as the basis
   for lease renewal for the lease associated with that server.
   However, the server must establish, for this restart event, a grace
   period at least as long as the lease period for the previous server
   instantiation.  This allows the client state obtained during the
   previous server instance to be reliably re-established.

8.6.3.  Network Partitions and Recovery

   If the duration of a network partition is greater than the lease
   period provided by the server, the server will not have received a
   lease renewal from the client.  If this occurs, the server may free
   all locks held for the client, or it may allow the lock state to
   remain for a considerable period, subject to the constraint that if a
   request for a conflicting lock is made, locks associated with expired
   leases do not prevent such a conflicting lock from being granted but
   are revoked as necessary so as not to interfere with such conflicting
   requests.

   If the server chooses to delay freeing of lock state until there is a
   conflict, it may either free all of the client's locks once there is
   a conflict, or it may only revoke the minimum set of locks necessary
   to allow conflicting requests.  When it adopts the finer-grained
   approach, it must revoke all locks associated with a given stateid,
   as long as it revokes a single such lock.

   When the server chooses to free all of a client's lock state, either
   immediately upon lease expiration, or as a result of the first
   attempt to get a lock, all stateids held by the client will become
   invalid or stale.  Once the client is able to reach the server after
   such a network partition, the status returned by the SEQUENCE
   operation will indicate a loss of locking state.  In addition, all
   I/O submitted by the client with the now invalid stateids will fail
   with the server returning the error NFS4ERR_EXPIRED.  Once the client
   learns of the loss of locking state, it will suitably notify the
   applications that held the invalidated locks.  The client should then
   take action to free invalidated stateids, either by establishing a
   new clientid using a new verifier or by doing a FREE_STATEID
   operation to release each of the invalidated stateids.

   When the server adopts a finer-grained approach to revocation of
   locks when leases have expired, only a subset of stateids will
   normally become invalid during a network partition.  When the client
   is able to communicate with the server after such a network
   partition, the status returned by the SEQUENCE operation will
   indicate a partial loss of locking state.  In addition, operations,
   including I/O submitted by the client with the now invalid stateids
   will fail with the server returning the error NFS4ERR_EXPIRED.  Once
   the client learns of the loss of locking state, it will use the
   TEST_STATEID operation on

10.7.  Time Access

   As defined above, the time_access attribute represents the time of
   last access to the object by a read that was satisfied by the server.
   The notion of what is an "access" depends on the server's operating
   environment and/or the server's file system semantics.  For example,
   for servers obeying POSIX semantics, time_access would be updated
   only by the READLINK, READ, and READDIR operations and not any of the
   operations that modify the content of the object.  Of course, setting
   the corresponding time_access_set attribute is another way to modify
   the time_access attribute.

   Whenever the file object resides on a writable file system, the
   server should make best efforts to record time_access into stable
   storage.  However, to mitigate the performance effects of doing so,
   and most especially whenever the server is satisfying the read of the
   object's content from its cache, the server MAY cache access time
   updates and lazily write them to stable storage.  It is also
   acceptable to give administrators of the server the option to disable
   time_access updates.

10.8.  Interpreting owner and owner_group

   The recommended attributes "owner" and "owner_group" (and also users
   and groups within the "acl" attribute) are represented in terms of a
   UTF-8 string.  To avoid a representation that is tied to a particular
   underlying implementation at the client or server, the use of the
   UTF-8 string has been chosen.  Note that section 6.1 of RFC2624 [26]
   provides additional rationale.  It is expected that the client and
   server will have their own local representation of owner and
   owner_group that is used for local storage or presentation to the end
   user.  Therefore, it is expected that when these attributes are
   transferred between the client and server, the local representation
   is translated to a syntax of the form "user@dns_domain".  This will
   allow a client and server that do not use the same local
   representation to translate to a common syntax that can be
   interpreted by both.

   Similarly, security principals may be represented in different ways
   by different security mechanisms.  Servers normally translate these
   representations into a common format, generally that used by local
   storage, to serve as a means of identifying the users corresponding
   to these security principals.  When these local identifiers are
   translated to the form of the owner attribute, associated with files
   created by such principals, they identify, in a common format, the
   users associated with each corresponding set of security principals.

   The translation used to interpret owner and group strings is not
   specified as part of the protocol.  This allows various solutions to
   be employed.  For example, a local translation table may be consulted
   that maps a numeric id to the user@dns_domain syntax.  A name service
   may also be used to accomplish the translation.  A server may provide
   a more general service, not limited by any particular translation
   (which would only translate a limited set of possible strings) by
   storing the owner and owner_group attributes in local storage without
   any translation or it may augment a translation method by storing the
   entire string for attributes for which no translation is available
   while using the local representation for those cases in which a
   translation is available.

   Servers that do not provide support for all possible values of the
   owner and owner_group attributes should return an error
   (NFS4ERR_BADOWNER) when a string is presented that has no
   translation, as the value to be set for a SETATTR of the owner,
   owner_group, or acl attributes.  When a server does accept an owner
   or owner_group value as valid on a SETATTR (and similarly for the
   owner and group strings in an acl), it is promising to return that
   same string when a corresponding GETATTR is done.  Configuration
   changes and ill-constructed name translations (those that contain
   aliasing) may make that promise impossible to honor.  Servers should
   make appropriate efforts to avoid a situation in which these
   attributes have their values changed when no real change to ownership
   has occurred.

   The "dns_domain" portion of the owner string is meant to be a DNS
   domain name.  For example, user@ietf.org.  Servers should accept as
   valid a set of users for at least one domain.  A server may treat
   other domains as having no valid translations.  A more general
   service is provided when a server is capable of accepting users for
   multiple domains, or for all domains, subject to security
   constraints.

   In the case where there is no translation available to the client or
   server, the attribute value must be constructed without the "@".
   Therefore, the absence of the @ from the owner or owner_group
   attribute signifies that no translation was available at the sender
   and that the receiver of the attribute should not use that string as
   a basis for translation into its own internal format.  Even though
   the attribute value can not be translated, it may still be useful.
   In the case of a client, the attribute string may be used for local
   display of ownership.

   To provide a greater degree of compatibility with previous versions
   of NFS (i.e. v2 and v3), which identified users and groups by 32-bit
   unsigned uid's and gid's, owner and group strings that consist of
   decimal numeric values with no leading zeros can be given a special
   interpretation by clients and servers which choose to provide such
   support.  The receiver may treat such a user or group string as
   representing the same user as would be represented by a v2/v3 uid or
   gid having the corresponding numeric value.  A server is not
   obligated to accept such a string, but may return an NFS4ERR_BADOWNER
   instead.  To avoid this mechanism being used to subvert user and
   group translation, so that a client might pass all of its stateid's to determine which
   locks have been lost the owners and them suitably notify
   groups in numeric form, a server SHOULD return an NFS4ERR_BADOWNER
   error when there is a valid translation for the applications user or owner
   designated in this way.  In that
   held case, the invalidated locks.  The client can then release must use the
   invalidated locking state
   appropriate name@domain string and acknowledge the revocation of not the special form for
   compatibility.

   The owner string "nobody" may be used to designate an anonymous user,
   which will be associated locks by doing a FREE_STATEID operation on each of the
   invalidated stateid's.

   When a network partition is combined with a server reboot, there are
   edge conditions file created by a security principal
   that place requirements on cannot be mapped through normal means to the server in order owner attribute.

10.9.  Character Case Attributes

   With respect to
   avoid silent data corruption following the server reboot.  Two of
   these edge conditions are known, case_insensitive and are discussed below.

   The first edge condition arises as case_preserving attributes,
   each UCS-4 character (which UTF-8 encodes) has a result of the scenarios such as "long descriptive
   name" RFC1345 [27] which may or may not included the follwing:

   1.  Client A acquires a lock.

   2.  Client A and word "CAPITAL"
   or "SMALL".  The presence of SMALL or CAPITAL allows an NFS server experience mutual network partition, such
       that client A is unable to renew its lease.

   3.  Client A's lease expires,
   implement unambiguous and the server releases lock.

   4.  Client B acquires a lock that would have conflicted with that of
       Client A.

   5.  Client B releases its lock.

   6.  Server reboots.

   7.  Network partition between client A efficient table driven mappings for case
   insensitive comparisons, and server heals.

   8.  Client A connects to new server instance non-case-preserving storage.  For
   general character handling and finds out about
       server reboot.

   9.  Client A reclaims its lock within the server's grace period.

   Thus, at the final step, internationalization issues, see the server has erroneously granted client
   A's lock reclaim.  If client B modified
   section "Internationalization".

10.10.  Quota Attributes

   For the object attributes related to file system quotas, the lock was
   protecting, client A will experience object corruption. following
   definitions apply:

   quota_avail_soft  The value in bytes which represents the amount of
      additional disk space that can be allocated to this file or
      directory before the user may reasonably be warned.  It is
      understood that this space may be consumed by allocations to other
      files or directories though there is a rule as to which other
      files or directories.

   quota_avail_hard  The value in bytes which represents the amount of
      additional disk space beyond the current allocation that can be
      allocated to this file or directory before further allocations
      will be refused.  It is understood that this space may be consumed
      by allocations to other files or directories.

   quota_used  The value in bytes which represents the amount of disk
      space used by this file or directory and possibly a number of
      other similar files or directories, where the set of "similar"
      meets at least the criterion that allocating space to any file or
      directory in the set will reduce the "quota_avail_hard" of every
      other file or directory in the set.

      Note that there may be a number of distinct but overlapping sets
      of files or directories for which a quota_used value is
      maintained.  E.g. "all files with a given owner", "all files with
      a given group owner", etc.

      The server is at liberty to choose any of those sets but should do
      so in a repeatable way.  The rule may be configured per file
      system or may be "choose the set with the smallest quota".

10.11.  mounted_on_fileid

   UNIX-based operating environments connect a file system into the
   namespace by connecting (mounting) the file system onto the existing
   file object (the mount point, usually a directory) of an existing
   file system.  When the mount point's parent directory is read via an
   API like readdir(), the return results are directory entries, each
   with a component name and a fileid.  The fileid of the mount point's
   directory entry will be different from the fileid that the stat()
   system call returns.  The stat() system call is returning the fileid
   of the root of the mounted file system, whereas readdir() is
   returning the fileid stat() would have returned before any file
   systems were mounted on the mount point.

   The second known edge condition arises in situations such as the
   following:

   1.   Client A acquires one or more locks.

   2.   Server reboots.

   3.   Client A and server experience mutual network partition, such
        that client A is unable to reclaim all of its locks within the
        grace period.

   4.   Server's reclaim grace period ends.  Client A has either no
        locks or an incomplete set of locks known to the server.

   5.   Client B acquires a lock that would have conflicted with a lock
        of client A that was not reclaimed.

   6.   Client B releases the lock.

   7.   Server reboots a second time.

   8.   Network partition between client A and server heals.

   9.   Client A connects to new server instance and finds out about
        server reboot.

   10.  Client A reclaims its lock within the server's grace period.

   As with the first edge condition, the final step of the scenario of
   the second edge condition has the server erroneously granting client
   A's lock reclaim.

   Solving the first and second edge conditions requires that the server
   either always assumes after it reboots that some edge condition
   occurs, and thus returns NFS4ERR_NO_GRACE for all reclaim attempts,
   or that the server record some information in stable storage.  The
   amount of information the server records in stable storage is in
   inverse proportion to how harsh the server intends to be whenever
   edge conditions arise.

   A server that is completely tolerant of all edge conditions will
   record in stable storage every lock that is acquired, removing the
   lock record from stable storage only when the lock is released.  For
   the two edge conditions discussed above, the harshest a server can
   be, and still support a grace period for reclaims, requires that the
   server record in stable storage some minimal information.  For
   example, a server implementation could, for each client, save in
   stable storage a record containing:

   o  the client's id string

   o  a boolean that indicates if the client's lease expired or if there
      was administrative intervention (see Section 8.7) to revoke a
      record lock, share reservation, or delegation and there has been
      no acknowledgement (via FREE_STATEID) of such revocation.

   o  a boolean that indicates whether the client may have locks that it
      believes to be reclaimable in situations in which the grace period
      was terminated, making the server's view of lock reclaimability
      suspect.

      The server will set this for any client record in stable storage
      where the client has not done a RECLAIM_COMPLETE, before it grants
      any new (i.e. not reclaimed) lock to any client.

   Assuming the above record keeping, for the first edge condition,
   after the server reboots, the record that client A's lease expired
   means that another client could have acquired a conflicting record
   lock, share reservation, or delegation.  Hence the server must
   reject a reclaim from client A with the error NFS4ERR_NO_GRACE.

   For the second edge condition, after the server reboots for a second
   time, the indication that the client had not completed its reclaims
   at the time at which the grace period ended means that the server
   must reject a reclaim from client A with the error NFS4ERR_NO_GRACE.

   When either edge condition occurs, the client's attempt to reclaim
   locks will result in the error NFS4ERR_NO_GRACE.  When this is
   received, or after the client reboots with no lock state, the client
   will issue a RECLAIM_COMPLETE.  When the RECLAIM_COMPLETE is
   received, the server and client are again in agreement regarding
   reclaimable locks and both booleans in persistent storage can be
   reset, to be set again only when there is a subsequent event that
   causes lock reclaim operations to be questionable.

   Regardless of the level and approach to record keeping, the server
   MUST implement one of the following strategies (which apply to
   reclaims of share reservations, record locks, and delegations):

   1.  Reject all reclaims with NFS4ERR_NO_GRACE.  This is extremely
       unforgiving, but necessary if the server does not record lock
       state in stable storage.

   2.  Record sufficient state in stable storage such that all known
       edge conditions involving server reboot, including the two noted
       in this section, are detected.  False positives are acceptable.
       Note that at this time, it is not known if there are other edge
       conditions.

       In the event that, after a server reboot, the server determines
       that there is unrecoverable damage or corruption to the
       information in stable storage, then for all clients and/or locks
       which may be affected, the server MUST return NFS4ERR_NO_GRACE.

   A mandate for the client's handling of the NFS4ERR_NO_GRACE error is
   outside the scope of this specification, since the strategies for
   such handling are very dependent on the client's operating
   environment.  However, one potential approach is described below.

   When the client receives NFS4ERR_NO_GRACE, it could examine the
   change attribute of the objects the client is trying to reclaim state
   for, and use that to determine whether to re-establish the state via
   normal OPEN or LOCK requests.  This is acceptable provided the
   client's operating environment allows it.  In other words, the client
   implementor is advised to document for his users the behavior.  The
   client could also inform the application that its record lock or
   share reservations (whether they were delegated or not) have been
   lost, such as via a UNIX signal, a GUI pop-up window, etc.  See the
   section, "Data Caching and Revocation" for a discussion of what the
   client should do for dealing with unreclaimed delegations on client
   state.

   For further discussion of revocation of locks see Section 8.7.

   Unlike NFS version 3, NFS version 4 allows a client's LOOKUP request
   to cross other file systems.  The client detects the file system
   crossing whenever the filehandle argument of LOOKUP has an fsid
   attribute different from that of the filehandle returned by LOOKUP.
   A UNIX-based client will consider this a "mount point crossing".
   UNIX has a legacy scheme for allowing a process to determine its
   current working directory.  This relies on readdir() of a mount
   point's parent and stat() of the mount point returning fileids as
   previously described.  The mounted_on_fileid attribute corresponds to
   the fileid that readdir() would have returned as described
   previously.

   While the NFS version 4 client could simply fabricate a fileid
   corresponding to what mounted_on_fileid provides (and if the server
   does not support mounted_on_fileid, the client has no choice), there
   is a risk that the client will generate a fileid that conflicts with
   one that is already assigned to another object in the file system.
   Instead, if the server can provide the mounted_on_fileid, the
   potential for client operational problems in this area is
   eliminated.

   If the server detects that there is no mounted point at the target
   file object, then the value for mounted_on_fileid that it returns is
   the same as that of the fileid attribute.

   The mounted_on_fileid attribute is RECOMMENDED, so the server SHOULD
   provide it if possible, and for a UNIX-based server, this is
   straightforward.  Usually, mounted_on_fileid will be requested during
   a READDIR operation, in which case it is trivial (at least for UNIX-
   based servers) to return mounted_on_fileid since it is equal to the
   fileid of a directory entry returned by readdir().  If
   mounted_on_fileid is requested in a GETATTR operation, the server
   should obey an invariant that has it returning a value that is equal
   to the file object's entry in the object's parent directory, i.e.
   what readdir() would have returned.  Some operating environments
   allow a series of two or more file systems to be mounted onto a
   single mount point.  In this case, for the server to obey the
   aforementioned invariant, it will need to find the base mount point,
   and not the intermediate mount points.

10.12.  send_impl_id and recv_impl_id

   These recommended attributes are used to identify the client and
   server.  In the case of the send_impl_id attribute, the client sends
   its clientid4 value along with the nfs_impl_id4.  The use of the
   clientid4 value allows the server to identify and match specific
   client interaction.  In the case of the recv_impl_id attribute, the
   client receives the nfs_impl_id4 value.

   Access to this identification information can be most useful at both
   client and server.  Being able to identify specific implementations
   can help in planning by administrators or implementers.  For example,
   diagnostic software may extract this information in an attempt to
   identify implementation problems, performance workload behaviors or
   general usage statistics.  Since the intent of having access to this
   information is for planning or general diagnosis only, the client and
   server MUST NOT interpret this implementation identity information in
   a way that affects interoperational behavior of the implementation.

   The reason is that if clients and servers did such a thing, they
   might use fewer capabilities of the protocol than the peer can
   support, or the client and server might refuse to interoperate.

   Because it is likely some implementations will violate the protocol
   specification and interpret the identity information, implementations
   MUST allow the users of the NFSv4 client and server to set the
   contents of the sent nfs_impl_id structure to any value.

   Even though these attributes are recommended, if the server supports
   one of them it MUST support the other.

10.13.  fs_layout_type

   This attribute applies to a file system and indicates what layout
   types are supported by the file system.  We expect this attribute to
   be queried when a client encounters a new fsid.  This attribute is
   used by the client to determine if it has applicable layout drivers.

10.14.  layout_type

   This attribute indicates the particular layout type(s) used for a
   file.  This is for informational purposes only.  The client needs to
   use the LAYOUTGET operation in order to get enough information (e.g.,
   specific device information) in order to perform I/O.

10.15.  layout_hint

   This attribute may be set on newly created files to influence the
   metadata server's choice for the file's layout.  It is suggested that
   this attribute is set as one of the initial attributes within the
   OPEN call.  The metadata server may ignore this attribute.  This
   attribute is a sub-set of the layout structure returned by LAYOUTGET.
   For example, instead of specifying particular devices, this would be
   used to suggest the stripe width of a file.  It is up to the server
   implementation to determine which fields within the layout it uses.

10.16.  mdsthreshold

   This attribute acts as a hint to the client to help it determine when
   it is more efficient to issue read and write requests to the metadata
   server vs. the dataserver.  Two types of thresholds are described:
   file size thresholds and I/O size thresholds.  If a file's size is
   smaller than the file size threshold, data accesses should be issued
   to the metadata server.  If an I/O is below the I/O size threshold,
   the I/O should be issued to the metadata server.  Each threshold can
   be specified independently for read and write requests.  For either
   threshold type, a value of 0 indicates no read or write should be
   issued to the metadata server, while a value of all 1s indicates all
   reads or writes should be issued to the metadata server.

   The attribute is available on a per filehandle basis.  If the current
   filehandle refers to a non-pNFS file or directory, the metadata
   server should return an attribute that is representative of the
   filehandle's file system.  It is suggested that this attribute is
   queried as part of the OPEN operation.  Due to dynamic system
   changes, the client should not assume that the attribute will remain
   constant for any specific time period, thus it should be periodically
   refreshed.

11.  Access Control Lists

   Access Control Lists (ACLs) are a file attribute that specifies
   fine-grained access control.  This chapter covers the "acl",
   "aclsupport", and "mode" file attributes, and their interactions.

11.1.  Goals

   ACLs and modes represent two well established but different models
   for specifying permissions.  This chapter specifies requirements that
   attempt to meet the following goals:

   o  If a server supports the mode attribute, it should provide
      reasonable semantics to clients that only set and retrieve the
      mode attribute.

   o  If a server supports the ACL attribute, it should provide
      reasonable semantics to clients that only set and retrieve the ACL
      attribute.

   o  On servers that support the mode attribute, if the ACL attribute
      has never been set on an object, via inheritance or explicitly,
      the behavior should be traditional UNIX-like behavior.

   o  On servers that support the mode attribute, if the ACL attribute
      has been previously set on an object, either explicitly or via
      inheritance:

      *  Setting only the mode attribute should effectively control the
         traditional UNIX-like permissions of read, write, and execute
         on owner, owner_group, and other.

      *  Setting only the mode attribute should provide reasonable
         security.  For example, setting a mode of 000 should be enough
         to ensure that future opens for read or write by any principal
         should fail, regardless of a previously existing or inherited
         ACL.

   o  It must be possible to implement a server such that its clients
      can have POSIX compliant semantics.

   o  This minor version of NFSv4 should not introduce significantly
      different semantics relating to the mode and ACL attributes, nor
      should it render invalid any existing conformant implementations.

      Rather, this chapter provides clarifications based on previous
      implementations and discussions around them.

   o  If a server supports the ACL attribute, then at any time, the
      server can provide an ACL attribute when requested.  The ACL
      attribute will describe all permissions on the file object, except
      for the three high-order bits of the mode attribute (described in
      Section 11.2.2).  The ACL attribute will not conflict with the
      mode attribute, on servers that support the mode attribute.

   o  If a server supports the mode attribute, then at any time, the
      server can provide a mode attribute when requested.  The mode
      attribute will not conflict with the ACL attribute, on servers
      that support the ACL attribute.

   o  When a mode attribute is set on an object, the ACL attribute may
      need to be modified so as to not conflict with the new mode.  In
      such cases, it is desirable that the ACL keep as much information
      as possible.  This includes information about inheritance, AUDIT
      and ALARM ACEs, and permissions granted and denied that do not
      conflict with the new mode.

8.7.  Server Revocation of Locks

   At any point, the server can revoke locks held by a client and the
   client must be prepared for this event.  When the client detects that
   its locks have been or may have been revoked, the client is
   responsible for validating the state information between itself and
   the server.  Validating locking state for the client means that it
   must verify or reclaim state for each lock currently held.

   The first occasion of lock revocation is upon server reboot or re-
   initialization.  In this instance the client will receive an error
   (NFS4ERR_STALE_STATEID or NFS4ERR_STALE_CLIENTID) and the client will
   proceed with normal crash recovery as described in the previous
   section.

   The second occasion of lock revocation is the inability to renew the
   lease before expiration, as discussed above.  While this is
   considered a rare or unusual event, the client must be prepared to
   recover.  The server is responsible for determining lease expiration,
   and deciding exactly how to deal with it, informing the client of the
   scope of the lock revocation.  The client then uses the status
   information provided by the server to synchronize its locking state
   with that of the server, in order to recover.

   The third occasion of lock revocation can occur as a result of
   revocation of locks within the lease period, either because of
   administrative intervention, or because a recallable lock (a
   delegation or layout) was not returned within the lease period after
   having been recalled.  While these are considered rare events, they
   are possible and the client must be prepared to deal with them.  When
   either of these events occurs, the client finds out about the
   situation through the status returned by the SEQUENCE operation.  Any
   use of stateids associated with revoked locks will receive the error
   NFS4ERR_ADMIN_REVOKED or NFS4ERR_DELEG_REVOKED, as appropriate.

   In all situations in which a subset of locking state may have been
   revoked, which include all cases in which locking state is revoked
   within the lease period, it is up to the client to determine which
   locks have been revoked and which have not.  It does this by using
   the TEST_STATEID operation on the appropriate set of stateid's.  Once
   the set of revoked locks has been determined, the applications can be
   notified, and the invalidated stateid's can be freed and lock
   revocation acknowledged by using FREE_STATEID.

8.8.  Share Reservations

   A share reservation is a mechanism to control access to a file.  It
   is a separate and independent mechanism from record locking.  When a
   client opens a file, it issues an OPEN operation to the server
   specifying the type of access required (READ, WRITE, or BOTH) and the
   type of access to deny others (deny NONE, READ, WRITE, or BOTH).  If
   the OPEN fails the client will fail the application's open request.

   Pseudo-code definition of the semantics:

           if (request.access == 0)
               return (NFS4ERR_INVAL)
           else if ((request.access & file_state.deny) ||
                    (request.deny & file_state.access))
               return (NFS4ERR_DENIED)

   This checking of share reservations on OPEN is done with no exception
   for an existing OPEN for the same open-owner.

   The constants used for the OPEN and OPEN_DOWNGRADE operations for the
   access and deny fields are as follows:

           const OPEN4_SHARE_ACCESS_READ   = 0x00000001;
           const OPEN4_SHARE_ACCESS_WRITE  = 0x00000002;
           const OPEN4_SHARE_ACCESS_BOTH   = 0x00000003;

           const OPEN4_SHARE_DENY_NONE     = 0x00000000;
           const OPEN4_SHARE_DENY_READ     = 0x00000001;
           const OPEN4_SHARE_DENY_WRITE    = 0x00000002;
           const OPEN4_SHARE_DENY_BOTH     = 0x00000003;

8.9.  OPEN/CLOSE Operations

   To provide correct share semantics, a client MUST use the OPEN
   operation to obtain the initial filehandle and indicate the desired
   access and what if any access to deny.  Even if the client intends to
   use a stateid of all 0's or all 1's, it must still obtain the
   filehandle for the regular file with the OPEN operation so the
   appropriate share semantics can be applied.  For clients that do not
   have a deny mode built into their open programming interfaces, deny
   equal to NONE should be used.

   The OPEN operation with the CREATE flag, also subsumes the CREATE
   operation for regular files as used in previous versions of the NFS
   protocol.  This allows a create with a share to be done atomically.

   The CLOSE operation removes all share reservations held by the open-
   owner on that file.  If record locks are held, the client SHOULD
   release all locks before issuing a CLOSE.  The server MAY free all
   outstanding locks on CLOSE but some servers may not support the CLOSE
   of a file that still has record locks held.  The server MUST return
   failure, NFS4ERR_LOCKS_HELD, if any locks would exist after the
   CLOSE.

   The LOOKUP operation will return a filehandle without establishing
   any lock state on the server.  Without a valid stateid, the server
   will assume the client has the least access.  For example, a file
   opened with deny READ/WRITE cannot be accessed using a filehandle
   obtained through LOOKUP because it would not have a valid stateid
   (i.e. using a stateid of all bits 0 or all bits 1).

8.10.  Open Upgrade and Downgrade

   When an OPEN is done for a file and the open-owner for which the open
   is being done already has the file open, the result is to upgrade the
   open file status maintained on the server to include the access and
   deny bits specified by the new OPEN as well as those for the existing
   OPEN.  The result is that there is one open file, as far as the
   protocol is concerned, and it includes the union of the access and
   deny bits for all of the OPEN requests completed.  Only a single
   CLOSE will be done to reset the effects of both OPENs.  Note that the
   client, when issuing the OPEN, may not know that the same file is in
   fact being opened.
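
   As an illustration only, the share reservation check and the open
   upgrade rule for OPEN can be sketched in C as follows.  The names
   share_status, file_share_state, share_check and share_upgrade are
   hypothetical and do not come from the protocol; the status values
   are local stand-ins for NFS4ERR_INVAL and NFS4ERR_DENIED, and the
   sketch assumes the server keeps the union of access and deny bits
   for each open file.

```c
#include <stdint.h>

/* Access and deny bit values as listed in the share constants. */
enum {
    OPEN4_SHARE_ACCESS_READ  = 0x00000001,
    OPEN4_SHARE_ACCESS_WRITE = 0x00000002,
    OPEN4_SHARE_ACCESS_BOTH  = 0x00000003,
    OPEN4_SHARE_DENY_NONE    = 0x00000000,
    OPEN4_SHARE_DENY_READ    = 0x00000001,
    OPEN4_SHARE_DENY_WRITE   = 0x00000002,
    OPEN4_SHARE_DENY_BOTH    = 0x00000003
};

/* Hypothetical stand-ins for success, NFS4ERR_INVAL, NFS4ERR_DENIED. */
enum share_status { SHARE_OK, SHARE_INVAL, SHARE_DENIED };

/* Assumed bookkeeping: union of the access and deny bits held. */
struct file_share_state {
    uint32_t access;
    uint32_t deny;
};

/* Apply the pseudo-code check to a new OPEN request. */
enum share_status share_check(const struct file_share_state *fs,
                              uint32_t access, uint32_t deny)
{
    if (access == 0)
        return SHARE_INVAL;
    if ((access & fs->deny) != 0 || (deny & fs->access) != 0)
        return SHARE_DENIED;
    return SHARE_OK;
}

/* Open upgrade: the open file carries the union of all granted bits. */
void share_upgrade(struct file_share_state *open_state,
                   uint32_t access, uint32_t deny)
{
    open_state->access |= access;
    open_state->deny   |= deny;
}
```

   A server written along these lines would call share_check before
   granting an OPEN and share_upgrade once the OPEN has been granted to
   an open-owner that already has the file open.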

   This open upgrade behavior only applies if both OPENs result in the
   OPENed object being designated by the same filehandle.

   When the server chooses to export multiple filehandles corresponding
   to the same file object and returns different filehandles on two
   different OPENs of the same file object, the server MUST NOT "OR"
   together the access and deny bits and coalesce the two open files.
   Instead the server must maintain separate OPENs with separate
   stateids and will require separate CLOSEs to free them.

   When multiple open files on the client are merged into a single open
   file object on the server, the close of one of the open files (on the
   client) may necessitate change of the access and deny status of the
   open file on the server.  This is because the union of the access and
   deny bits for the remaining opens may be smaller (i.e. a proper
   subset) than previously.  The OPEN_DOWNGRADE operation is used to
   make the necessary change and the client should use it to update the
   server so that share reservation requests by other clients are
   handled properly.

8.11.  Short and Long Leases

   When determining the time period for the server lease, the usual
   lease tradeoffs apply.  Short leases are good for fast server
   recovery at a cost of increased operations to effect lease renewal
   (when there are no other operations during the period to effect lease
   renewal as a side-effect).  Long leases are certainly kinder and
   gentler to servers trying to handle very large numbers of clients.
   The number of extra requests to effect lock renewal drops in inverse
   proportion to the lease time.

11.2.  File Attributes Discussion

11.2.1.  ACL Attribute

   The NFS version 4 ACL attribute is an array of access control entries
   (ACEs).  Although the client can read and write the ACL attribute,
   the server is responsible for using the ACL to perform access
   control.  The client can use the OPEN or ACCESS operations to check
   access without modifying or reading data or metadata.

   The NFS ACE attribute is defined as follows:

                       typedef uint32_t   acetype4;
                       typedef uint32_t   aceflag4;
                       typedef uint32_t   acemask4;

                       struct nfsace4 {
                           acetype4       type;
                           aceflag4       flag;
                           acemask4       access_mask;
                           utf8str_mixed  who;
                       };

   To determine if a request succeeds, the server processes each nfsace4
   entry in order.  Only ACEs which have a "who" that matches the
   requester are considered.  Each ACE is processed until all of the
   bits of the requester's access have been ALLOWED.  Once a bit (see
   below) has been ALLOWED by an ACCESS_ALLOWED_ACE, it is no longer
   considered in the processing of later ACEs.  If an ACCESS_DENIED_ACE
   is encountered where the requester's access still has unALLOWED bits
   in common with the "access_mask" of the ACE, the request is denied.
   When the ACL is fully processed, if there are bits in the requester's
   mask that have not been ALLOWED or DENIED, access is denied.

   Unlike the ALLOW and DENY ACE types, the ALARM and AUDIT ACE types do
   not affect a requester's access, and instead are for triggering
   events as a result of a requester's access attempt.  Therefore, all
   AUDIT and ALARM ACEs are processed until the end of the ACL.
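
   The ordered processing of ALLOW and DENY ACEs for an access request
   can be illustrated with the following minimal C sketch.  It is not
   a normative algorithm: struct ace is a simplified stand-in for
   nfsace4, the function name acl_allows is hypothetical, and matching
   the "who" field is assumed to be plain string equality, which
   ignores group membership and any special identifiers.  AUDIT and
   ALARM ACEs are skipped because they do not affect the access
   decision.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ACE type values as listed in the ACE Type constants. */
#define ACE4_ACCESS_ALLOWED_ACE_TYPE 0x00000000u
#define ACE4_ACCESS_DENIED_ACE_TYPE  0x00000001u

/* Simplified stand-in for nfsace4: "who" is a plain C string here. */
struct ace {
    uint32_t    type;
    uint32_t    flag;
    uint32_t    access_mask;
    const char *who;
};

/*
 * Returns 1 if every requested bit is ALLOWED before a matching DENY
 * ACE covers a bit that is still unALLOWED, and 0 otherwise.
 */
int acl_allows(const struct ace *acl, size_t count,
               const char *requester, uint32_t requested)
{
    uint32_t remaining = requested;   /* bits not yet ALLOWED */
    size_t i;

    for (i = 0; i < count && remaining != 0; i++) {
        /* Assumption: "who" matching is simple string equality. */
        if (strcmp(acl[i].who, requester) != 0)
            continue;
        if (acl[i].type == ACE4_ACCESS_ALLOWED_ACE_TYPE)
            remaining &= ~acl[i].access_mask;
        else if (acl[i].type == ACE4_ACCESS_DENIED_ACE_TYPE &&
                 (remaining & acl[i].access_mask) != 0)
            return 0;
    }
    /* Bits that were neither ALLOWED nor DENIED result in denial. */
    return remaining == 0;
}
```

   Because ACEs are evaluated in order, a DENY ACE placed ahead of an
   ALLOW ACE for the same requester takes effect first, while a DENY
   ACE that follows an ALLOW ACE covering the same bits has no effect
   on those bits.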

   The NFS version 4 ACL model is quite rich.  Some server platforms may
   provide access control functionality that goes beyond the UNIX-style
   mode attribute, but which is not as rich as the NFS ACL model.  So
   that users can take advantage of this more limited functionality, the
   server may indicate that it supports ACLs as long as it follows the
   guidelines for mapping between its ACL model and the NFS version 4
   ACL model.

   The situation is complicated by the fact that a server may have
   multiple modules that enforce ACLs.  For example, the enforcement for
   NFS version 4 access may be different from the enforcement for local
   access, and both may be different from the enforcement for access
   through other protocols such as SMB.  So it may be useful for a
   server to accept an ACL even if not all of its modules are able to
   support it.

   The disadvantages of long leases include the possibility of slower
   recovery after certain failures.  After server failure, a longer
   grace period may be required when some clients do not promptly
   reclaim their locks and do a RECLAIM_COMPLETE.  In the event of
   client failure, it can take a longer period for leases to expire
   thus forcing conflicting requests to wait.

   Long leases are usable if the server is able to store lease state in
   non-volatile memory.  Upon recovery, the server can reconstruct the
   lease state from its non-volatile memory and continue operation with
   its clients and therefore long leases would not be an issue.

8.12.  Clocks, Propagation Delay, and Calculating Lease Expiration

   To avoid the need for synchronized clocks, lease times are granted by
   the server as a time delta.  However, there is a requirement that the
   client and server clocks do not drift excessively over the duration
   of the lock.  There is also the issue of propagation delay across the
   network which could easily be several hundred milliseconds as well as
   the possibility that requests will be lost and need to be
   retransmitted.

   To take propagation delay into account, the client should subtract it
   from lease times (e.g. if the client estimates the one-way
   propagation delay as 200 msec, then it can assume that the lease is
   already 200 msec old when it gets it).  In addition, it will take
   another 200 msec to get a response back to the server.  So the client
   must send a lock renewal or write data back to the server 400 msec
   before the lease would expire.

   The server's lease period configuration should take into account the
   network distance of the clients that will be accessing the server's
   resources.  It guiding principle in all cases is expected that the lease period will take into
   account server must not accept
   ACLs that appear to make the network propagation delays and other network delay
   factors file more secure than it really is.

11.2.1.1.  ACE Type

   The constants used for the type field (acetype4) are as follows:

                     const ACE4_ACCESS_ALLOWED_ACE_TYPE = 0x00000000;
                     const ACE4_ACCESS_DENIED_ACE_TYPE  = 0x00000001;
                     const ACE4_SYSTEM_AUDIT_ACE_TYPE   = 0x00000002;
                     const ACE4_SYSTEM_ALARM_ACE_TYPE   = 0x00000003;

   +------------------------------+--------------+---------------------+
   | Value                        | Abbreviation | Description         |
   +------------------------------+--------------+---------------------+
   | ACE4_ACCESS_ALLOWED_ACE_TYPE | ALLOW        | Explicitly grants   |
   |                              |              | the access defined  |
   |                              |              | in acemask4 to the  |
   |                              |              | file or directory.  |
   | ACE4_ACCESS_DENIED_ACE_TYPE  | DENY         | Explicitly denies   |
   |                              |              | the access defined  |
   |                              |              | in acemask4 to the  |
   |                              |              | file or directory.  |
   | ACE4_SYSTEM_AUDIT_ACE_TYPE   | AUDIT        | LOG (system         |
   |                              |              | dependent) any      |
   |                              |              | access attempt to a |
   |                              |              | file or directory   |
   |                              |              | which uses any of   |
   |                              |              | the access methods  |
   |                              |              | specified in        |
   |                              |              | acemask4.           |
   | ACE4_SYSTEM_ALARM_ACE_TYPE   | ALARM        | Generate a system   |
   |                              |              | ALARM (system       |
   |                              |              | dependent) when any |
   |                              |              | access attempt is   |
   |                              |              | made to a file or   |
   |                              |              | directory for the   |
   |                              |              | access methods      |
   |                              |              | specified in        |
   |                              |              | acemask4.           |
   +------------------------------+--------------+---------------------+

    The "Abbreviation" column denotes how the types will be referred to
                   throughout the rest of this document.

11.2.1.2.  The aclsupport Attribute

   A server need not support all of the above ACE types.  The bitmask
   constants used to represent the above definitions within the
   aclsupport attribute are as follows:

                     const ACL4_SUPPORT_ALLOW_ACL    = 0x00000001;
                     const ACL4_SUPPORT_DENY_ACL     = 0x00000002;
                     const ACL4_SUPPORT_AUDIT_ACL    = 0x00000004;
                     const ACL4_SUPPORT_ALARM_ACL    = 0x00000008;

   Clients should not attempt to set an ACE unless the server claims
   support for that ACE type.  If the server receives a request to set
   an ACE that it cannot store, it MUST reject the request with
   NFS4ERR_ATTRNOTSUPP.  If the server receives a request to set an ACE
   that it can store but cannot enforce, the server SHOULD reject the
   request with NFS4ERR_ATTRNOTSUPP.

   Example: suppose a server can enforce NFS ACLs for NFS access but
   cannot enforce ACLs for local access.  If arbitrary processes can run
   on the server, then the server SHOULD NOT indicate ACL support.  On
   the other hand, if only trusted administrative programs run locally,
   then the server may indicate ACL support.
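
   A client-side check of the aclsupport attribute before setting an ACL
   might look like the sketch below.  The constant values are those
   given in this section and in the ACE Type section; the helper name
   and the idea of passing the ACL as a plain list of acetype4 values
   are assumptions made for illustration.

```python
# ACE type values (acetype4) and aclsupport bits as defined in the text.
ACE4_ACCESS_ALLOWED_ACE_TYPE = 0x00000000
ACE4_ACCESS_DENIED_ACE_TYPE = 0x00000001
ACE4_SYSTEM_AUDIT_ACE_TYPE = 0x00000002
ACE4_SYSTEM_ALARM_ACE_TYPE = 0x00000003

ACL4_SUPPORT_ALLOW_ACL = 0x00000001
ACL4_SUPPORT_DENY_ACL = 0x00000002
ACL4_SUPPORT_AUDIT_ACL = 0x00000004
ACL4_SUPPORT_ALARM_ACL = 0x00000008

# Which aclsupport bit advertises which ACE type.
SUPPORT_BIT = {
    ACE4_ACCESS_ALLOWED_ACE_TYPE: ACL4_SUPPORT_ALLOW_ACL,
    ACE4_ACCESS_DENIED_ACE_TYPE: ACL4_SUPPORT_DENY_ACL,
    ACE4_SYSTEM_AUDIT_ACE_TYPE: ACL4_SUPPORT_AUDIT_ACL,
    ACE4_SYSTEM_ALARM_ACE_TYPE: ACL4_SUPPORT_ALARM_ACL,
}


def unsupported_ace_types(aclsupport, ace_types):
    """Return the sorted ACE types in ace_types that the server does not
    claim to support (hypothetical client-side helper)."""
    return sorted({t for t in ace_types if not aclsupport & SUPPORT_BIT[t]})
```

   A client would skip the SETATTR, or drop the offending ACEs, when the
   returned list is non-empty.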

11.2.1.3.  ACE Access Mask

   The bitmask constants used for the access mask field are as follows:

              const ACE4_READ_DATA            = 0x00000001;
              const ACE4_LIST_DIRECTORY       = 0x00000001;
              const ACE4_WRITE_DATA           = 0x00000002;
              const ACE4_ADD_FILE             = 0x00000002;
              const ACE4_APPEND_DATA          = 0x00000004;
              const ACE4_ADD_SUBDIRECTORY     = 0x00000004;
              const ACE4_READ_NAMED_ATTRS     = 0x00000008;
              const ACE4_WRITE_NAMED_ATTRS    = 0x00000010;
              const ACE4_EXECUTE              = 0x00000020;
              const ACE4_DELETE_CHILD         = 0x00000040;
              const ACE4_READ_ATTRIBUTES      = 0x00000080;
              const ACE4_WRITE_ATTRIBUTES     = 0x00000100;
              const ACE4_DELETE               = 0x00010000;
              const ACE4_READ_ACL             = 0x00020000;
              const ACE4_WRITE_ACL            = 0x00040000;
              const ACE4_WRITE_OWNER          = 0x00080000;
              const ACE4_SYNCHRONIZE          = 0x00100000;
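
   Several of the constants above share a value: the file-oriented and
   directory-oriented names denote the same bit, read according to the
   type of the object.  The sketch below decodes a mask into names; the
   table layout and the helper name mask_names are illustrative only.

```python
# (bit, name on a file, name on a directory), copied from the constants
# above.  Bits with a single name repeat it in both columns.
ACE4_MASK_BITS = [
    (0x00000001, "ACE4_READ_DATA", "ACE4_LIST_DIRECTORY"),
    (0x00000002, "ACE4_WRITE_DATA", "ACE4_ADD_FILE"),
    (0x00000004, "ACE4_APPEND_DATA", "ACE4_ADD_SUBDIRECTORY"),
    (0x00000008, "ACE4_READ_NAMED_ATTRS", "ACE4_READ_NAMED_ATTRS"),
    (0x00000010, "ACE4_WRITE_NAMED_ATTRS", "ACE4_WRITE_NAMED_ATTRS"),
    (0x00000020, "ACE4_EXECUTE", "ACE4_EXECUTE"),
    (0x00000040, "ACE4_DELETE_CHILD", "ACE4_DELETE_CHILD"),
    (0x00000080, "ACE4_READ_ATTRIBUTES", "ACE4_READ_ATTRIBUTES"),
    (0x00000100, "ACE4_WRITE_ATTRIBUTES", "ACE4_WRITE_ATTRIBUTES"),
    (0x00010000, "ACE4_DELETE", "ACE4_DELETE"),
    (0x00020000, "ACE4_READ_ACL", "ACE4_READ_ACL"),
    (0x00040000, "ACE4_WRITE_ACL", "ACE4_WRITE_ACL"),
    (0x00080000, "ACE4_WRITE_OWNER", "ACE4_WRITE_OWNER"),
    (0x00100000, "ACE4_SYNCHRONIZE", "ACE4_SYNCHRONIZE"),
]


def mask_names(mask, is_directory=False):
    """Return the names of the bits set in mask (hypothetical helper)."""
    return [dir_name if is_directory else file_name
            for bit, file_name, dir_name in ACE4_MASK_BITS
            if mask & bit]
```

   The same mask 0x00000003 therefore reads as READ_DATA and WRITE_DATA
   on a file, and as LIST_DIRECTORY and ADD_FILE on a directory.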

11.2.1.3.1.  Discussion of Mask Attributes

    ACE4_READ_DATA
       Operation(s) affected:
            READ
            OPEN
       Discussion:
            Permission to read the data of the file.

            Servers SHOULD allow a user the ability to read the data
            of the file when only the ACE4_EXECUTE access mask bit is
            allowed.

    ACE4_LIST_DIRECTORY
        Operation(s) affected:
            READDIR
        Discussion:
            Permission to list the contents of a directory.

    ACE4_WRITE_DATA
        Operation(s) affected:
            WRITE
            OPEN
            SETATTR of size
        Discussion:
            Permission to modify a file's data anywhere in the file's
            offset range.  This includes the ability to write to any
            arbitrary offset and as a result to grow the file.

    ACE4_ADD_FILE
        Operation(s) affected:
            CREATE
            OPEN
        Discussion:
            Permission to add a new file in a directory.  The CREATE
            operation is affected when nfs_ftype4 is NF4LNK, NF4BLK,
            NF4CHR, NF4SOCK, or NF4FIFO. (NF4DIR is not listed because
            it is covered by ACE4_ADD_SUBDIRECTORY.) OPEN is affected
            when used to create a regular file.

    ACE4_APPEND_DATA
        Operation(s) affected:
            WRITE
            OPEN
            SETATTR of size
        Discussion:
            The ability to modify a file's data, but only starting at
            EOF.  This allows for the notion of append-only files, by
            allowing ACE4_APPEND_DATA and denying ACE4_WRITE_DATA to
            the same user or group.  If a file has an ACL such as the
            one described above and a WRITE request is made for
            somewhere other than EOF, the server SHOULD return
            NFS4ERR_ACCESS.

    ACE4_ADD_SUBDIRECTORY
        Operation(s) affected:
            CREATE
        Discussion:
            Permission to create a subdirectory in a directory.  The
            CREATE operation is affected when nfs_ftype4 is NF4DIR.

    ACE4_READ_NAMED_ATTRS
        Operation(s) affected:
            OPENATTR
        Discussion:
            Permission to read the named attributes of a file or to
            lookup the named attributes directory.  OPENATTR is
            affected when it is not used to create a named attribute
            directory.  This is when 1.) createdir is TRUE, but a
            named attribute directory already exists, or 2.) createdir
            is FALSE.

    ACE4_WRITE_NAMED_ATTRS
        Operation(s) affected:
            OPENATTR
        Discussion:
            Permission to write the named attributes of a file or
            to create a named attribute directory.  OPENATTR is
            affected when it is used to create a named attribute
            directory.  This is when createdir is TRUE and no named
            attribute directory exists.  The ability to check whether
            or not a named attribute directory exists depends on the
            ability to look it up, therefore, users also need the
            ACE4_READ_NAMED_ATTRS permission in order to create a
            named attribute directory.

    ACE4_EXECUTE
        Operation(s) affected:
            LOOKUP
            READ
            OPEN
        Discussion:
            Permission to execute a file or traverse/search a
            directory.

            Servers SHOULD allow a user the ability to read the data
            of the file when only the ACE4_EXECUTE access mask bit is
            allowed.  This is because there is no way to execute a
            file without reading the contents.  Though a server may
            treat ACE4_EXECUTE and ACE4_READ_DATA bits identically
            when deciding to permit a READ operation, it SHOULD still
            allow the two bits to be set independently in ACLs, and
            MUST distinguish between them when replying to ACCESS
            operations.  In particular, servers SHOULD NOT silently
            turn on one of the two bits when the other is set, as
            that would make it impossible for the client to correctly
            enforce the distinction between read and execute
            permissions.

             As an example, following a SETATTR of the following ACL:
                     nfsuser:ACE4_EXECUTE:ALLOW

             A subsequent GETATTR of ACL for that file SHOULD return:

                     nfsuser:ACE4_EXECUTE:ALLOW
             Rather than:
                     nfsuser:ACE4_EXECUTE/ACE4_READ_DATA:ALLOW

    ACE4_DELETE_CHILD
        Operation(s) affected:
            REMOVE
        Discussion:
            Permission to delete a file or directory within a
            directory.  See section "ACE4_DELETE vs. ACE4_DELETE_CHILD"
            for information on how these two access mask bits interact.

    ACE4_READ_ATTRIBUTES
        Operation(s) affected:
            GETATTR of file system object attributes
        Discussion:
            The ability to read basic attributes (non-ACLs) of a file.
            On a UNIX system, basic attributes can be thought of as
            the stat level attributes.  Allowing this access mask bit
            would mean the entity can execute "ls -l" and stat.

    ACE4_WRITE_ATTRIBUTES
        Operation(s) affected:
            SETATTR of time_access_set, time_backup,
            time_create, time_modify_set, mimetype, hidden, system
        Discussion:
            Permission to change the times associated with a file
            or directory to an arbitrary value.  Also permission
            to change the mimetype, hidden and system attributes.
            A user having ACE4_WRITE_DATA permission, but lacking
            ACE4_WRITE_ATTRIBUTES must be allowed to implicitly set
            the times associated with a file.

    ACE4_DELETE
        Operation(s) affected:
            REMOVE
        Discussion:
            Permission to delete the file or directory.  See section
            "ACE4_DELETE vs. ACE4_DELETE_CHILD" for information on how
            these two access mask bits interact.

    ACE4_READ_ACL
        Operation(s) affected:
            GETATTR of acl
        Discussion:
            Permission to read the ACL.

    ACE4_WRITE_ACL
        Operation(s) affected:
            SETATTR of acl and mode
        Discussion:
            Permission to write the acl and mode attributes.

    ACE4_WRITE_OWNER
        Operation(s) affected:
            SETATTR of owner and owner_group
        Discussion:
            Permission to write the owner and owner_group attributes.
            On UNIX systems, this is the ability to execute chown().

    ACE4_SYNCHRONIZE
        Operation(s) affected:
            NONE
        Discussion:
            Permission to access the file locally at the server with
            synchronized reads and writes.

   Server implementations need not provide the granularity of control
   that is implied by this list of masks.  For example, POSIX-based
   systems might not distinguish ACE4_APPEND_DATA (the ability to append
   to a file) from ACE4_WRITE_DATA (the ability to modify existing
   contents); both masks would be tied to a single "write" permission.
   When such a server returns attributes to the client, it would show
   both ACE4_APPEND_DATA and ACE4_WRITE_DATA if and only if the write
   permission is enabled.

   If a server receives a SETATTR request that it cannot accurately
   implement, it should err in the direction of more restricted
   access.  For example, suppose a server cannot distinguish overwriting
   data from appending new data, as described in the previous paragraph.
   If a client submits an ACE where ACE4_APPEND_DATA is set but
   ACE4_WRITE_DATA is not (or vice versa), the server should reject the
   request with NFS4ERR_ATTRNOTSUPP.  Nonetheless, if the ACE has type
   DENY, the server may silently turn on the other bit, so that both
   ACE4_APPEND_DATA and ACE4_WRITE_DATA are denied.
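
   For a server with a single "write" permission, the handling of an
   incoming ACE could be sketched as below.  The exception class stands
   in for an NFS4ERR_ATTRNOTSUPP reply, and the function name and the
   string encoding of the ACE type are assumptions for this example.
   The sketch takes the permitted option of widening a DENY ACE; a
   server may equally reject it.

```python
ACE4_WRITE_DATA = 0x00000002
ACE4_APPEND_DATA = 0x00000004


class AttrNotSupp(Exception):
    """Stand-in for replying NFS4ERR_ATTRNOTSUPP (illustrative only)."""


def normalize_write_bits(ace_type, mask):
    """Map an ACE mask onto a server that cannot tell append from write.

    ace_type is "ALLOW", "DENY", "AUDIT" or "ALARM" (hypothetical
    encoding).  Returns the mask the server would store.
    """
    both = ACE4_WRITE_DATA | ACE4_APPEND_DATA
    present = mask & both
    if present == 0 or present == both:
        return mask            # representable as-is
    if ace_type == "DENY":
        return mask | both     # erring toward more restricted access
    # Widening any other ACE type would grant or report more than asked.
    raise AttrNotSupp("cannot store append and write separately")
```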

11.2.1.3.2.  ACE4_DELETE vs. ACE4_DELETE_CHILD

   Two access mask bits govern the ability to delete a file or directory
   object: ACE4_DELETE on the object itself, and ACE4_DELETE_CHILD on
   the object's parent directory.

   Many systems also consult the "sticky bit" (MODE4_SVTX) and write
   mode bit on the parent directory when determining whether to allow a
   file to be deleted.  The mode bit for write corresponds to
   ACE4_WRITE_DATA, which is the same physical bit as ACE4_ADD_FILE.
   Therefore, ACE4_ADD_FILE can come into play when determining
   permission to delete.

   In the algorithm below, the strategy is that ACE4_DELETE and
   ACE4_DELETE_CHILD take precedence over the sticky bit, and the sticky
   bit takes precedence over the "write" mode bits (reflected in
   ACE4_ADD_FILE).

   Server implementations SHOULD grant or deny permission to delete
   based on the following algorithm.

       if ACE4_EXECUTE is denied by the parent directory ACL:
           deny delete
       else if ACE4_DELETE is allowed by the target object ACL:
           allow delete
       else if ACE4_DELETE_CHILD is allowed by the parent
       directory ACL:
           allow delete
       else if ACE4_DELETE_CHILD is denied by the
       parent directory ACL:
           deny delete
       else if ACE4_ADD_FILE is allowed by the parent directory ACL:
           if MODE4_SVTX is set for the parent directory:
               if the principal owns the parent directory OR
                   the principal owns the target object OR
                   ACE4_WRITE_DATA is allowed by the target
                   object ACL:
                       allow delete
                   else:
                       deny delete
           else:
               allow delete
       else:
           deny delete
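
   The algorithm above translates almost line for line into code.  In
   the sketch below, the result of evaluating each ACL for the
   requesting principal is assumed to be summarized as a dictionary
   mapping a mask bit name to "allow" or "deny" (absent when the ACL
   does not decide the bit); that representation and the function name
   are illustrative, not part of the protocol.

```python
def may_delete(parent_acl, target_acl, parent_sticky,
               owns_parent, owns_target):
    """Return True if deletion is permitted, following the algorithm above.

    parent_acl, target_acl: dicts mapping a mask bit name to "allow" or
    "deny" for the requesting principal (hypothetical summary form).
    parent_sticky: True if MODE4_SVTX is set on the parent directory.
    """
    if parent_acl.get("ACE4_EXECUTE") == "deny":
        return False
    if target_acl.get("ACE4_DELETE") == "allow":
        return True
    if parent_acl.get("ACE4_DELETE_CHILD") == "allow":
        return True
    if parent_acl.get("ACE4_DELETE_CHILD") == "deny":
        return False
    if parent_acl.get("ACE4_ADD_FILE") == "allow":
        if parent_sticky:
            # The sticky bit limits deletion to owners, or to principals
            # allowed to write the target object.
            return (owns_parent or owns_target
                    or target_acl.get("ACE4_WRITE_DATA") == "allow")
        return True
    return False
```

   Note that an explicit ACE4_DELETE allowance on the target wins even
   when the parent denies ACE4_DELETE_CHILD, because it is tested first.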

11.2.1.4.  ACE flag

   The bitmask constants used for the flag field are as follows:

              const ACE4_FILE_INHERIT_ACE             = 0x00000001;
              const ACE4_DIRECTORY_INHERIT_ACE        = 0x00000002;
              const ACE4_NO_PROPAGATE_INHERIT_ACE     = 0x00000004;
              const ACE4_INHERIT_ONLY_ACE             = 0x00000008;
              const ACE4_SUCCESSFUL_ACCESS_ACE_FLAG   = 0x00000010;
              const ACE4_FAILED_ACCESS_ACE_FLAG       = 0x00000020;
              const ACE4_IDENTIFIER_GROUP             = 0x00000040;

   A server need not support any of these flags.  If the server supports
   flags that are similar to, but not exactly the same as, these flags,
   the implementation may define a mapping between the protocol-defined
   flags and the implementation-defined flags.  Again, the guiding
   principle is that the file not appear to be more secure than it
   really is.

   For example, suppose a client tries to set an ACE with
   ACE4_FILE_INHERIT_ACE set but not ACE4_DIRECTORY_INHERIT_ACE.  If the
   server does not support any form of ACL inheritance, the server
   should reject the request with NFS4ERR_ATTRNOTSUPP.  If the server
   supports a single "inherit ACE" flag that applies to both files and
   directories, the server may reject the request (i.e., requiring the
   client to set both the file and directory inheritance flags).  The
   server may also accept the request and silently turn on the
   ACE4_DIRECTORY_INHERIT_ACE flag.

11.2.1.4.1.  Discussion of Flag Bits

   ACE4_FILE_INHERIT_ACE
      Can be placed on a directory and indicates that this ACE should be
      added to each new non-directory file created.

   ACE4_DIRECTORY_INHERIT_ACE
      Can be placed on a directory and indicates that this ACE should be
      added to each new directory created.

   ACE4_INHERIT_ONLY_ACE
      Can be placed on a directory but does not apply to the directory;
      ALLOW and DENY ACEs with this bit set do not affect access to the
      directory, and AUDIT and ALARM ACEs with this bit set do not
      trigger log or alarm events.  Such ACEs only take effect once they
      are applied (with this bit cleared) to newly created files and
      directories as specified by the above two flags.

   ACE4_NO_PROPAGATE_INHERIT_ACE
      Can be placed on a directory.  This flag tells the server that
      inheritance of this ACE should stop at newly created child
      directories.

   ACE4_SUCCESSFUL_ACCESS_ACE_FLAG

   ACE4_FAILED_ACCESS_ACE_FLAG
      The ACE4_SUCCESSFUL_ACCESS_ACE_FLAG (SUCCESS) and
      ACE4_FAILED_ACCESS_ACE_FLAG (FAILED) flag bits relate only to
      ACE4_SYSTEM_AUDIT_ACE_TYPE (AUDIT) and ACE4_SYSTEM_ALARM_ACE_TYPE
      (ALARM) ACE types.  If during the processing of the file's ACL,
      the server encounters an AUDIT or ALARM ACE that matches the
      principal attempting the OPEN, the server notes that fact, and the
      presence, if any, of the SUCCESS and FAILED flags encountered in
      the AUDIT or ALARM ACE.  Once the server completes the ACL
      processing, it then notes if the operation succeeded or failed.
      If the operation succeeded, and if the SUCCESS flag was set for a
      matching AUDIT or ALARM ACE, then the appropriate AUDIT or ALARM
      event occurs.  If the operation failed, and if the FAILED flag was
      set for the matching AUDIT or ALARM ACE, then the appropriate
      AUDIT or ALARM event occurs.  Either or both of the SUCCESS or
      FAILED can be set, but if neither is set, the AUDIT or ALARM ACE
      is not useful.

      The previously described processing applies to that of the ACCESS
      operation as well, the difference being that "success" or
      "failure" does not mean whether ACCESS returns NFS4_OK or not.
      Success means whether ACCESS returns all requested and supported
      bits.  Failure means whether ACCESS failed to return a bit that
      was requested and supported.
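
   The SUCCESS/FAILED rule reduces to two small predicates, sketched
   below.  The flag values are the ones defined in this section; the
   function names and argument layout are assumptions introduced for
   the example.

```python
ACE4_SUCCESSFUL_ACCESS_ACE_FLAG = 0x00000010
ACE4_FAILED_ACCESS_ACE_FLAG = 0x00000020


def event_fires(ace_flags, succeeded):
    """True if a matching AUDIT or ALARM ACE with these flags triggers
    its event, given the outcome of the operation."""
    wanted = (ACE4_SUCCESSFUL_ACCESS_ACE_FLAG if succeeded
              else ACE4_FAILED_ACCESS_ACE_FLAG)
    return bool(ace_flags & wanted)


def access_op_succeeded(requested, supported, returned):
    """Outcome of an ACCESS operation for audit purposes: success means
    every bit that was both requested and supported was returned."""
    expected = requested & supported
    return (expected & ~returned) == 0
```

   An ACE with neither flag set never fires, which is why the text calls
   such an ACE "not useful".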

   ACE4_IDENTIFIER_GROUP
      Indicates that the "who" refers to a GROUP as defined under UNIX
      or a GROUP ACCOUNT as defined under Windows.  Clients and servers
      must ignore the ACE4_IDENTIFIER_GROUP flag on ACEs with a who
      value equal to one of the special identifiers outlined in
      Section 11.2.1.5.

11.2.1.5.  ACE Who

   The "who" field of an ACE is an identifier that specifies the
   principal or principals to whom the ACE applies.  It may refer to a
   user or a group, with the flag bit ACE4_IDENTIFIER_GROUP specifying
   which.

   There are several special identifiers which need to be understood
   universally, rather than in the context of a particular DNS domain.
   Some of these identifiers cannot be understood when an NFS client
   accesses the server, but have meaning when a local process accesses
   the file.  The ability to display and modify these permissions is
   permitted over NFS, even if none of the access methods on the server
   understands the identifiers.

   +---------------+--------------------------------------------------+
   | Who           | Description                                      |
   +---------------+--------------------------------------------------+
   | OWNER         | The owner of the file                            |
   | GROUP         | The group associated with the file.              |
   | EVERYONE      | The world, including the owner and owning group. |
   | INTERACTIVE   | Accessed from an interactive terminal.           |
   | NETWORK       | Accessed via the network.                        |
   | DIALUP        | Accessed as a dialup user to the server.         |
   | BATCH         | Accessed from a batch job.                       |
   | ANONYMOUS     | Accessed without any authentication.             |
   | AUTHENTICATED | Any authenticated user (opposite of ANONYMOUS)   |
   | SERVICE       | Access from a system service.                    |
   +---------------+--------------------------------------------------+

                                  Table 7

   To avoid conflict, these special identifiers are distinguished by an
   appended "@" and should appear in the form "xxxx@" (note: no domain
   name after the "@").  For example: ANONYMOUS@.
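
   As an illustration only, the following sketch shows how an
   implementation might tell the special identifiers of Table 7 apart
   from ordinary "name@domain" strings.  The function names and the
   three result labels are hypothetical and are not part of the
   protocol.

```python
# Special identifiers from Table 7, in their on-the-wire "xxxx@" form.
SPECIAL_WHO = {
    "OWNER@", "GROUP@", "EVERYONE@", "INTERACTIVE@", "NETWORK@",
    "DIALUP@", "BATCH@", "ANONYMOUS@", "AUTHENTICATED@", "SERVICE@",
}


def is_special_who(who: str) -> bool:
    """Return True if `who` is one of the special identifiers."""
    return who in SPECIAL_WHO


def classify_who(who: str) -> str:
    """Classify a who string (hypothetical helper, labels are ours)."""
    if is_special_who(who):
        return "special"
    name, sep, domain = who.rpartition("@")
    if sep and name and domain:
        # A domain follows the "@", so this is not a special identifier.
        return "name@domain"
    return "unrecognized"
```

   Note that "ANONYMOUS@example.org" is classified as an ordinary
   name, because a special identifier carries no domain after the "@".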

11.2.1.5.1.  Discussion of EVERYONE@

   It is important to note that "EVERYONE@" is not equivalent to the
   UNIX "other" entity.  This is because, by definition, UNIX "other"
   does not include the owner or owning group of a file.  "EVERYONE@"
   means literally everyone, including the owner or owning group.

11.2.2.  mode Attribute

   The NFS version 4 mode attribute is based on the UNIX mode bits.  The
   following bits are defined:

           const MODE4_SUID = 0x800;  /* set user id on execution */
           const MODE4_SGID = 0x400;  /* set group id on execution */
           const MODE4_SVTX = 0x200;  /* save text even after use */
           const MODE4_RUSR = 0x100;  /* read permission: owner */
           const MODE4_WUSR = 0x080;  /* write permission: owner */
           const MODE4_XUSR = 0x040;  /* execute permission: owner */
           const MODE4_RGRP = 0x020;  /* read permission: group */
           const MODE4_WGRP = 0x010;  /* write permission: group */
           const MODE4_XGRP = 0x008;  /* execute permission: group */
           const MODE4_ROTH = 0x004;  /* read permission: other */
           const MODE4_WOTH = 0x002;  /* write permission: other */
           const MODE4_XOTH = 0x001;  /* execute permission: other */

   Bits MODE4_RUSR, MODE4_WUSR, and MODE4_XUSR apply to the principal
   identified in the owner attribute.  Bits MODE4_RGRP, MODE4_WGRP, and
   MODE4_XGRP apply to principals identified in the owner_group
   attribute but who are not identified in the owner attribute.  Bits
   MODE4_ROTH, MODE4_WOTH, MODE4_XOTH apply to any principal that does
   not match that in the owner attribute, and does not have a group
   matching that of the owner_group attribute.

   The remaining bits are not defined by this protocol.  A server MUST
   NOT return bits other than those defined above in a GETATTR or
   READDIR operation, and it MUST return NFS4ERR_INVAL if bits other
   than those defined above are set in a SETATTR, CREATE, or OPEN
   operation.
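
   A minimal sketch of the two rules above, assuming the twelve
   constants listed in this section.  The helper names are
   hypothetical, and the numeric value used for NFS4ERR_INVAL (22) is
   taken from the protocol's error code list, which lies outside this
   section.

```python
MODE4_SUID, MODE4_SGID, MODE4_SVTX = 0x800, 0x400, 0x200
MODE4_RUSR, MODE4_WUSR, MODE4_XUSR = 0x100, 0x080, 0x040
MODE4_RGRP, MODE4_WGRP, MODE4_XGRP = 0x020, 0x010, 0x008
MODE4_ROTH, MODE4_WOTH, MODE4_XOTH = 0x004, 0x002, 0x001

# Union of every bit this protocol defines (0xFFF).
MODE4_DEFINED = (MODE4_SUID | MODE4_SGID | MODE4_SVTX |
                 MODE4_RUSR | MODE4_WUSR | MODE4_XUSR |
                 MODE4_RGRP | MODE4_WGRP | MODE4_XGRP |
                 MODE4_ROTH | MODE4_WOTH | MODE4_XOTH)

NFS4_OK = 0
NFS4ERR_INVAL = 22  # assumed value, from the protocol's error list


def check_mode_on_set(mode: int) -> int:
    """Status for a mode supplied in SETATTR, CREATE, or OPEN."""
    return NFS4ERR_INVAL if mode & ~MODE4_DEFINED else NFS4_OK


def mode_for_reply(stored_mode: int) -> int:
    """Mode to return in GETATTR or READDIR: defined bits only."""
    return stored_mode & MODE4_DEFINED
```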

11.3.  Common Methods

   The requirements in this section will be referred to in future
   sections, especially Section 11.4.

11.3.1.  Interpreting an ACL

11.3.1.1.  Server Considerations

   The server uses the algorithm described in Section 11.2.1 to
   determine whether an ACL allows access to an object.  However, the
   ACL may not be the sole determiner of access.  For example:

   o  In the case of a file system exported as read-only, the server may
      deny write permissions even though an object's ACL grants it.

   o  Server implementations MAY grant ACE4_WRITE_ACL and ACE4_READ_ACL
      permissions in order to prevent the owner from getting into the
      situation where they can't ever modify the ACL.

   o  All servers will allow a user the ability to read the data of the
      file when only the execute permission is granted (i.e., if the ACL
      denies the user the ACE4_READ_DATA access and allows the user
      ACE4_EXECUTE, the server will allow the user to read the data of
      the file).

   o  Many servers have the notion of owner-override in which the owner
      of the object is allowed to override accesses that are denied by
      the ACL.  This may be helpful, for example, to allow users
      continued access to open files on which the permissions have
      changed.

11.3.1.2.  Client Considerations

   Clients SHOULD NOT do their own access checks based on their
   interpretation of the ACL, but rather use the OPEN and ACCESS
   operations to do access checks.  This allows the client to act on the
   results of having the server determine whether or not access should
   be granted based on its interpretation of the ACL.

   Clients must be aware of situations in which an object's ACL will
   define a certain access even though the server will not enforce it.
   In general, but especially in these situations, the client needs to
   do its part in the enforcement of access as defined by the ACL.  To
   do this, the client MAY issue the appropriate ACCESS operation prior
   to servicing the request of the user or application in order to
   determine whether the user or application should be granted the
   access requested.  For examples in which the ACL may define accesses
   that the server doesn't enforce, see Section 11.3.1.1.

11.3.2.  Computing a Mode Attribute from an ACL

   The following method can be used to calculate the MODE4_R*, MODE4_W*
   and MODE4_X* bits of a mode attribute, based upon an ACL.

   1.  To determine MODE4_ROTH, MODE4_WOTH, and MODE4_XOTH:

       1.  If the special identifier EVERYONE@ is granted
           ACE4_READ_DATA, then the bit MODE4_ROTH SHOULD be set.
           Otherwise, MODE4_ROTH SHOULD NOT be set.

       2.  If the special identifier EVERYONE@ is granted
           ACE4_WRITE_DATA or ACE4_APPEND_DATA, then the bit MODE4_WOTH
           SHOULD be set.  Otherwise, MODE4_WOTH SHOULD NOT be set.

       3.  If the special identifier EVERYONE@ is granted ACE4_EXECUTE,
           then the bit MODE4_XOTH SHOULD be set.  Otherwise, MODE4_XOTH
           SHOULD NOT be set.

   2.  To determine MODE4_RGRP, MODE4_WGRP, and MODE4_XGRP, note that
       the EVERYONE@ special identifier SHOULD be taken into account.
       In other words, when determining if the GROUP@ special identifier
       is granted a permission, ACEs with the identifier EVERYONE@
       should take effect just as ACEs with the special identifier
       GROUP@ would.

       1.  If the special identifier GROUP@ is granted ACE4_READ_DATA,
           then the bit MODE4_RGRP SHOULD be set.  Otherwise, MODE4_RGRP
           SHOULD NOT be set.

       2.  If the special identifier GROUP@ is granted ACE4_WRITE_DATA
           or ACE4_APPEND_DATA, then the bit MODE4_WGRP SHOULD be set.
           Otherwise, MODE4_WGRP SHOULD NOT be set.

       3.  If the special identifier GROUP@ is granted ACE4_EXECUTE,
           then the bit MODE4_XGRP SHOULD be set.  Otherwise, MODE4_XGRP
           SHOULD NOT be set.

   3.  To determine MODE4_RUSR, MODE4_WUSR, and MODE4_XUSR, note that
       the EVERYONE@ special identifier SHOULD be taken into account.
       In other words, when determining if the OWNER@ special identifier
       is granted a permission, ACEs with the identifier EVERYONE@
       should take effect just as ACEs with the special identifier
       OWNER@ would.

       1.  If the special identifier OWNER@ is granted ACE4_READ_DATA,
           then the bit MODE4_RUSR SHOULD be set.  Otherwise, MODE4_RUSR
           SHOULD NOT be set.

       2.  If the special identifier OWNER@ is granted ACE4_WRITE_DATA
           or ACE4_APPEND_DATA, then the bit MODE4_WUSR SHOULD be set.
           Otherwise, MODE4_WUSR SHOULD NOT be set.

       3.  If the special identifier OWNER@ is granted ACE4_EXECUTE,
           then the bit MODE4_XUSR SHOULD be set.  Otherwise, MODE4_XUSR
           SHOULD NOT be set.
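
   The method above can be sketched as follows.  This is one possible
   reading, not a normative implementation.  It assumes that "granted"
   means the first ALLOW or DENY ACE naming the permission decides it
   (the ordered evaluation of Section 11.2.1), and that inherit-only
   ACEs are skipped.  The Ace tuple, the helper names, and the numeric
   ACE constants are stand-ins for definitions given elsewhere in the
   document.

```python
from typing import List, NamedTuple, Tuple

ACE4_ACCESS_ALLOWED_ACE_TYPE = 0   # assumed values, defined elsewhere
ACE4_ACCESS_DENIED_ACE_TYPE = 1
ACE4_INHERIT_ONLY_ACE = 0x08
ACE4_READ_DATA = 0x01
ACE4_WRITE_DATA = 0x02
ACE4_APPEND_DATA = 0x04
ACE4_EXECUTE = 0x20


class Ace(NamedTuple):
    ace_type: int
    flag: int
    mask: int
    who: str


def granted(acl: List[Ace], whos: Tuple[str, ...], bit: int) -> bool:
    """First ALLOW/DENY ACE for `whos` that names `bit` decides."""
    for ace in acl:
        if ace.flag & ACE4_INHERIT_ONLY_ACE:
            continue
        if ace.who not in whos or not ace.mask & bit:
            continue
        if ace.ace_type == ACE4_ACCESS_ALLOWED_ACE_TYPE:
            return True
        if ace.ace_type == ACE4_ACCESS_DENIED_ACE_TYPE:
            return False
    return False


def acl_to_mode(acl: List[Ace]) -> int:
    """Compute the nine low-order mode bits from an ACL."""
    mode = 0
    # EVERYONE@ ACEs also count when evaluating GROUP@ and OWNER@.
    classes = ((("EVERYONE@",), 0),
               (("GROUP@", "EVERYONE@"), 3),
               (("OWNER@", "EVERYONE@"), 6))
    for whos, shift in classes:
        if granted(acl, whos, ACE4_READ_DATA):
            mode |= 0o4 << shift
        if (granted(acl, whos, ACE4_WRITE_DATA) or
                granted(acl, whos, ACE4_APPEND_DATA)):
            mode |= 0o2 << shift
        if granted(acl, whos, ACE4_EXECUTE):
            mode |= 0o1 << shift
    return mode
```

   Because EVERYONE@ is folded into the owner and group evaluation, an
   ACL holding a single ALLOW ACE for EVERYONE@ sets the same bits in
   all three classes.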

11.3.2.1.  Discussion

   The nine low-order mode bits (MODE4_R*, MODE4_W*, MODE4_X*)
   correspond to ACE4_READ_DATA, ACE4_WRITE_DATA/ACE4_APPEND_DATA, and
   ACE4_EXECUTE for OWNER@, GROUP@, and EVERYONE@.  On some
   implementations, mode bits may represent a superset of these
   permissions, e.g. if a specific user is granted ACE4_WRITE_DATA, then
   MODE4_WGRP will be set, even though the file's owner_group is not
   granted ACE4_WRITE_DATA.

   Server implementations are discouraged from doing this, as experience
   has shown that this is confusing and annoying to end users.  The
   specifications above also discourage this practice to enforce the
   semantic that setting the mode attribute effectively specifies read,
   write, and execute for owner, group, and other.

11.4.  Requirements

   The server that supports both mode and ACL must take care to
   synchronize the MODE4_*USR, MODE4_*GRP, and MODE4_*OTH bits with the
   ACEs which have respective who fields of "OWNER@", "GROUP@", and
   "EVERYONE@" so that the client can see semantically equivalent access
   permissions exist whether the client asks for the owner, owner_group
   and mode attributes, or for just the ACL.

   In this section, much is made of the methods in Section 11.3.2.  Many
   requirements refer to this section.  But note that the methods have
   behaviors specified with "SHOULD".  This is intentional, to avoid
   invalidating existing implementations that compute the mode according
   to the withdrawn POSIX ACL draft (1003.1e draft 17), rather than by
   actual permissions on owner, group, and other.

11.4.1.  Setting the mode and/or ACL Attributes

11.4.1.1.  Setting mode and not ACL

   When setting a mode attribute and not an ACL attribute, the mode
   attribute MUST be set as given.  The ACL attribute MUST be modified
   such that the mode computed via the method in Section 11.3.2 yields
   the low-order nine bits (MODE4_R*, MODE4_W*, MODE4_X*) of the newly
   set mode attribute.  The ACL SHOULD also be modified such that:

   1.  If MODE4_RGRP is not set, entities explicitly listed in the ACL
       other than OWNER@ and EVERYONE@ SHOULD NOT be granted
       ACE4_READ_DATA.

   2.  If MODE4_WGRP is not set, entities explicitly listed in the ACL
       other than OWNER@ and EVERYONE@ SHOULD NOT be granted
       ACE4_WRITE_DATA or ACE4_APPEND_DATA.

   3.  If MODE4_XGRP is not set, entities explicitly listed in the ACL
       other than OWNER@ and EVERYONE@ SHOULD NOT be granted
       ACE4_EXECUTE.

   Access mask bits other than those listed above, appearing in ALLOW
   ACEs, MAY also be disabled.

   Note that ACEs with the flag ACE4_INHERIT_ONLY_ACE set do not affect
   the permissions of the ACL itself, nor do ACEs of the type AUDIT and
   ALARM.  As such, it is desirable to leave these ACEs unmodified when
   modifying the ACL attribute.

   Also note that the requirement may be met by discarding the ACL, in
   favor of an ACL that represents the mode and only the mode.  This is
   permitted, but it is preferable for a server to preserve as much of
   the ACL as possible without violating the above requirements.
   Discarding the ACL makes it effectively impossible for a file created
   with a mode attribute to inherit an ACL (see Section 11.4.3).
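
   The three SHOULD rules of this section can be sketched as below.
   This sketch covers only those rules: it strips the read, write,
   append, and execute bits from ALLOW ACEs for entities other than
   OWNER@ and EVERYONE@ when the matching group bit is clear, and it
   leaves inherit-only, AUDIT, and ALARM ACEs alone.  It does not by
   itself satisfy the MUST that the computed mode match the new mode.
   The Ace tuple, the function name, and the numeric ACE constants are
   assumptions standing in for definitions given elsewhere.

```python
from typing import List, NamedTuple

MODE4_RGRP, MODE4_WGRP, MODE4_XGRP = 0x020, 0x010, 0x008

ACE4_ACCESS_ALLOWED_ACE_TYPE = 0   # assumed values, defined elsewhere
ACE4_SYSTEM_AUDIT_ACE_TYPE = 2
ACE4_INHERIT_ONLY_ACE = 0x08
ACE4_READ_DATA = 0x01
ACE4_WRITE_DATA = 0x02
ACE4_APPEND_DATA = 0x04
ACE4_EXECUTE = 0x20


class Ace(NamedTuple):
    ace_type: int
    flag: int
    mask: int
    who: str


def restrict_named_entities(acl: List[Ace], mode: int) -> List[Ace]:
    """Apply rules 1-3 for a newly set mode; returns a new ACL."""
    strip = 0
    if not mode & MODE4_RGRP:
        strip |= ACE4_READ_DATA
    if not mode & MODE4_WGRP:
        strip |= ACE4_WRITE_DATA | ACE4_APPEND_DATA
    if not mode & MODE4_XGRP:
        strip |= ACE4_EXECUTE

    result = []
    for ace in acl:
        untouched = (
            ace.ace_type != ACE4_ACCESS_ALLOWED_ACE_TYPE  # DENY/AUDIT/ALARM
            or bool(ace.flag & ACE4_INHERIT_ONLY_ACE)     # not effective here
            or ace.who in ("OWNER@", "EVERYONE@")         # exempt by the rules
        )
        if untouched:
            result.append(ace)
        else:
            result.append(ace._replace(mask=ace.mask & ~strip))
    return result
```

   Note that GROUP@ is treated like any other listed entity here,
   since the rules exempt only OWNER@ and EVERYONE@.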

11.4.1.2.  Setting ACL and not mode

   When setting an ACL attribute and not a mode attribute, the ACL
   attribute SHOULD be set as given.  The nine low-order bits of the
   mode attribute (MODE4_R*, MODE4_W*, MODE4_X*) MUST be modified to
   match the result of the method in Section 11.3.2.  The three
   high-order bits of the mode (MODE4_SUID, MODE4_SGID, MODE4_SVTX)
   SHOULD remain unchanged.

11.4.1.3.  Setting both ACL and mode

   When setting both the mode and the ACL attribute in the same
   operation, the attributes MUST be applied in this order: mode, then
   ACL.  The mode attribute is set as given, then the ACL attribute is
   set as given, possibly changing the final mode, as described above in
   Section 11.4.1.2.
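
   A short sketch of the ordering rule, assuming a caller-supplied
   function acl_to_mode that implements the method of Section 11.3.2.
   Both function names and the two mask constants are hypothetical.

```python
from typing import Callable, List, Tuple

MODE4_HIGH = 0xE00  # MODE4_SUID | MODE4_SGID | MODE4_SVTX
MODE4_LOW = 0x1FF   # the nine MODE4_R*, MODE4_W*, MODE4_X* bits


def set_acl_only(old_mode: int, new_acl: List,
                 acl_to_mode: Callable[[List], int]) -> Tuple[int, List]:
    """Section 11.4.1.2: keep the high bits, recompute the low nine."""
    low = acl_to_mode(new_acl) & MODE4_LOW
    return (old_mode & MODE4_HIGH) | low, new_acl


def set_mode_and_acl(given_mode: int, given_acl: List,
                     acl_to_mode: Callable[[List], int]) -> Tuple[int, List]:
    """Section 11.4.1.3: apply the mode first, then the ACL."""
    mode = given_mode                                    # step 1: as given
    return set_acl_only(mode, given_acl, acl_to_mode)    # step 2: ACL wins
```

   The effect is that the high-order bits come from the supplied mode
   while the low-order nine bits are whatever the supplied ACL yields.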

11.4.2.  Retrieving the mode and/or ACL Attributes

   This section applies only to servers that support both the mode and
   the ACL attribute.

   Some server implementations may have a concept of "objects without
   ACLs", meaning that all permissions are granted and denied according
   to the mode attribute, and that no ACL attribute is stored for that
   object.  If an ACL attribute is requested of such a server, the
   server SHOULD return an ACL that does not conflict with the mode;
   that is to say, the ACL returned SHOULD represent the nine low-order
   bits of the mode attribute (MODE4_R*, MODE4_W*, MODE4_X*) as
   described in Section 11.3.2.

   For other server implementations, the ACL attribute is always present
   for every object.  Such servers SHOULD store at least the three high-
   order bits of the mode attribute (MODE4_SUID, MODE4_SGID,
   MODE4_SVTX).  The server SHOULD return a mode attribute if one is
   requested, and the low-order nine bits of the mode (MODE4_R*,
   MODE4_W*, MODE4_X*) MUST match the result of applying the method in
   Section 11.3.2 to the ACL attribute.
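
   For a server with "objects without ACLs", one way to synthesize an
   ACL that represents only the mode is sketched below.  The layout
   chosen here (an ALLOW and a DENY ACE for OWNER@ and for GROUP@,
   followed by an ALLOW ACE for EVERYONE@) is an assumption made for
   illustration; the DENY ACEs keep EVERYONE@ from granting the owner
   or group more than their own mode bits allow.  The Ace tuple,
   function names, and numeric constants are likewise assumptions.

```python
from typing import List, NamedTuple

ACE4_ACCESS_ALLOWED_ACE_TYPE = 0   # assumed values, defined elsewhere
ACE4_ACCESS_DENIED_ACE_TYPE = 1
ACE4_READ_DATA = 0x01
ACE4_WRITE_DATA = 0x02
ACE4_APPEND_DATA = 0x04
ACE4_EXECUTE = 0x20
MODE_RELATED = (ACE4_READ_DATA | ACE4_WRITE_DATA |
                ACE4_APPEND_DATA | ACE4_EXECUTE)


class Ace(NamedTuple):
    ace_type: int
    flag: int
    mask: int
    who: str


def rwx_to_mask(rwx: int) -> int:
    """Map one rwx triplet (0-7) to an ACE access mask."""
    mask = 0
    if rwx & 4:
        mask |= ACE4_READ_DATA
    if rwx & 2:
        mask |= ACE4_WRITE_DATA | ACE4_APPEND_DATA
    if rwx & 1:
        mask |= ACE4_EXECUTE
    return mask


def mode_to_acl(mode: int) -> List[Ace]:
    """Build an ACL representing the nine low-order mode bits."""
    acl = []
    for who, shift in (("OWNER@", 6), ("GROUP@", 3)):
        allow = rwx_to_mask((mode >> shift) & 7)
        deny = MODE_RELATED & ~allow
        if allow:
            acl.append(Ace(ACE4_ACCESS_ALLOWED_ACE_TYPE, 0, allow, who))
        if deny:
            acl.append(Ace(ACE4_ACCESS_DENIED_ACE_TYPE, 0, deny, who))
    everyone = rwx_to_mask(mode & 7)
    if everyone:
        acl.append(Ace(ACE4_ACCESS_ALLOWED_ACE_TYPE, 0, everyone,
                       "EVERYONE@"))
    return acl
```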

11.4.3.  Creating New Objects

   If a server supports the ACL attribute, it may use the ACL attribute
   on the parent directory to compute an initial ACL attribute for a
   newly created object.  This will be referred to as the inherited ACL
   within this section.  The act of adding one or more ACEs to the
   inherited ACL that are based upon ACEs in the parent directory's ACL
   will be referred to as inheriting an ACE within this section.

   Implementors should standardize on what the behavior of CREATE and
   OPEN must be depending on the presence or absence of the mode and ACL
   attributes.

   1.  If GETATTR directed just mode is given:

       In this case, inheritance SHOULD take place, but the mode MUST be
       applied to the inherited ACL as described in Section 11.4.1.1,
       thereby modifying the ACL.

   2.  If just ACL is given:

       In this case, inheritance SHOULD NOT take place, and the ACL as
       defined in the CREATE or OPEN will be set without modification,
       and the mode modified as in Section 11.4.1.2.

   3.  If both mode and ACL are given:

       In this case, inheritance SHOULD NOT take place, and both
       attributes will be set as described in Section 11.4.1.3.

   4.  If neither mode nor ACL are given:

       In the case where an object is being created without any initial
       attributes at all, e.g. an OPEN operation with an opentype4 of
       OPEN4_CREATE and a createmode4 of EXCLUSIVE4, inheritance SHOULD
       NOT take place.  Instead, the server SHOULD set permissions to
       deny all access to the newly created object.  It is expected that
       the appropriate client will set the desired attributes in a
       subsequent SETATTR operation, and the server SHOULD allow that
       operation to succeed, regardless of what permissions the object
       is created with.  For example, an empty ACL denies all
       permissions, but the server should allow the owner's SETATTR to
       succeed even though WRITE_ACL is implicitly denied.

       In other cases, inheritance SHOULD take place, and no
       modifications to the ACL will happen.  The mode attribute, if
       supported, MUST be as computed in Section 11.3.2, with the
       MODE4_SUID, MODE4_SGID and MODE4_SVTX bits clear.  It is worth
       noting that if no inheritable ACEs exist on the parent directory,
       the file will be created with an empty ACL, thus granting no
       access.
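
   The four cases above reduce to a small decision table, sketched
   here.  The CreatePlan tuple, its field names, and the labels used
   for the follow-up step are hypothetical; only the inherit/do not
   inherit outcomes and the section references come from the text.

```python
from typing import NamedTuple


class CreatePlan(NamedTuple):
    inherit: bool    # whether ACEs are inherited from the parent
    follow_up: str   # how mode and ACL are then settled


def plan_create(mode_given: bool, acl_given: bool,
                any_attrs_given: bool = True) -> CreatePlan:
    """Pick the CREATE/OPEN behavior for the supplied attributes."""
    if mode_given and acl_given:
        return CreatePlan(False, "set both per 11.4.1.3")
    if acl_given:
        return CreatePlan(False, "set ACL, adjust mode per 11.4.1.2")
    if mode_given:
        return CreatePlan(True, "apply mode to inherited ACL per 11.4.1.1")
    if not any_attrs_given:
        # e.g. OPEN4_CREATE with EXCLUSIVE4: no attributes at all.
        return CreatePlan(False, "deny all; await SETATTR")
    return CreatePlan(True, "compute mode per 11.3.2, high bits clear")
```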

11.4.3.1.  The Inherited ACL

   If the
      delegation object being requested.

   o  The probability of future conflicting open requests should be low
      based on created is not a directory, the recent history of inherited ACL
   SHOULD NOT inherit ACEs from the file.

   o  The existence of any server-specific semantics of OPEN/CLOSE that
      would make parent directory ACL unless the required handling incompatible with
   ACE4_FILE_INHERIT_FLAG is set.

   If the prescribed
      handling object being created is a directory, the inherited ACL should
   inherit all inheritable ACEs from the parent directory, those that
   have ACE4_FILE_INHERIT_ACE or ACE4_DIRECTORY_INHERIT_ACE flag set.
   If the delegated client would apply (see below).

   There are two types of open delegations, read and write.  A read open
   delegation allows a client to handle, inheritable ACE has ACE4_FILE_INHERIT_ACE set, but
   ACE4_DIRECTORY_INHERIT_ACE is clear, the inherited ACE on its own, requests the newly
   created directory MUST have the ACE4_INHERIT_ONLY_ACE flag set to open a
   file
   prevent the directory from being affected by ACEs meant for reading that do not deny read access to others.  Multiple
   read open delegations may be outstanding simultaneously non-
   directories.

   When a new directory is created and it inherits ACEs from its parent,
   then for each inheritable ACE which affects the directory's
   permissions, a server MAY create two ACEs on the directory being
   created: one effective and one which is only inheritable (i.e. has
   the ACE4_INHERIT_ONLY_ACE flag set).  This gives the user, and the
   server in the cases in which it must mask certain permissions upon
   creation, the ability to modify the effective permissions without
   modifying the ACE which is to be inherited by the new directory's
   children.

   When a newly created object is created with attributes, and those
   attributes contain an ACL attribute and/or a mode attribute, the
   server MUST apply those attributes to the newly created object, as
   described in Section 11.4.1.
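
   The inheritance rules in this section can be illustrated with a
   minimal sketch.  The Ace class and the inherited_acl function are
   hypothetical names used only for illustration; the flag values are
   those of the NFSv4 aceflag4 type.  The sketch models only the choice
   of which ACEs are inherited and when ACE4_INHERIT_ONLY_ACE is added;
   it does not model the optional two-ACE split or any adjustment of
   inheritance flags on the child.

```python
from dataclasses import dataclass, replace

# Flag values of the NFSv4 aceflag4 type.
ACE4_FILE_INHERIT_ACE = 0x00000001
ACE4_DIRECTORY_INHERIT_ACE = 0x00000002
ACE4_INHERIT_ONLY_ACE = 0x00000008


@dataclass(frozen=True)
class Ace:
    """Simplified stand-in for nfsace4 (type and access mask omitted)."""
    who: str
    flag: int


def inherited_acl(parent_acl, child_is_directory):
    """Return the ACEs a newly created object inherits from parent_acl."""
    result = []
    for ace in parent_acl:
        file_inherit = bool(ace.flag & ACE4_FILE_INHERIT_ACE)
        dir_inherit = bool(ace.flag & ACE4_DIRECTORY_INHERIT_ACE)
        if not child_is_directory:
            # Non-directories take only ACEs marked for file inheritance.
            if file_inherit:
                result.append(ace)
        elif file_inherit or dir_inherit:
            flag = ace.flag
            if file_inherit and not dir_inherit:
                # Keep the ACE for the directory's children, but stop it
                # from affecting access to the directory itself.
                flag |= ACE4_INHERIT_ONLY_ACE
            result.append(replace(ace, flag=flag))
    return result
```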

12.  Single-server Name Space

   This chapter describes the NFSv4 single-server name space.  Single-
   server namespaces may be presented directly to clients, or they may
   be used as a basis to form larger multi-server namespaces (e.g. site-
   wide or organization-wide) to be presented to clients, as described
   in Section 15.

12.1.  Server Exports

   On a UNIX server, the name space describes all the files reachable by
   pathnames under the root directory or "/".  On a Windows NT server
   the name space constitutes all the files on disks named by mapped
   disk letters.  NFS server administrators rarely make the entire
   server's file system name space available to NFS clients.  More often
   portions of the name space are made available via an "export"
   feature.  In previous versions of the NFS protocol, the root
   filehandle for each export is obtained through the MOUNT protocol;
   the client sends a string that identifies the export of name space
   and the server returns the root filehandle for it.  The MOUNT
   protocol supports an EXPORTS procedure that will enumerate the
   server's exports.

12.2.  Browsing Exports

   The NFS version 4 protocol provides a root filehandle that clients
   can use to obtain filehandles for the exports of a particular server,
   via a series of LOOKUP operations within a COMPOUND, to traverse a
   path.  A common user experience is to use a graphical user interface
   (perhaps a file "Open" dialog window) to find a file via progressive
   browsing through a directory tree.  The client must be able to move
   from one export to another export via single-component, progressive
   LOOKUP operations.

   This style of browsing is not well supported by the NFS version 2 and
   3 protocols.  The client expects all LOOKUP operations to remain
   within a single server file system.  For example, the device
   attribute will not change.  This prevents a client from taking name
   space paths that span exports.

   An automounter on the client can obtain a snapshot of the server's
   name space using the EXPORTS procedure of the MOUNT protocol.  If it
   understands the server's pathname syntax, it can create an image of
   the server's name space on the client.  The parts of the name space
   that are not exported by the server are filled in with a "pseudo file
   system" that allows the user to browse from one mounted file system
   to another.  There is a drawback to this representation of the
   server's name space on the client: it is static.  If the server
   administrator adds a new export the client will be unaware of it.

12.3.  Server Pseudo File System

   NFS version 4 servers avoid this name space inconsistency by
   presenting all the exports for a given server within the framework of
   a single namespace, for that server.  An NFS version 4 client uses
   LOOKUP and READDIR operations to browse seamlessly from one export to
   another.  Portions of the server name space that are not exported are
   bridged via a "pseudo file system" that provides a view of exported
   directories only.  A pseudo file system has a unique fsid and behaves
   like a normal, read only file system.

   Based on the construction of the server's name space, it is possible
   that multiple pseudo file systems may exist.  For example,

           /a              pseudo file system
           /a/b            real file system
           /a/b/c          pseudo file system
           /a/b/c/d        real file system

   Each of the pseudo file systems is considered a separate entity and
   therefore will have its own unique fsid.

12.4.  Multiple Roots

   The DOS and Windows operating environments are sometimes described as
   having "multiple roots".  File systems are commonly represented as
   disk letters.  MacOS represents file systems as top level names.  NFS
   version 4 servers for these platforms can construct a pseudo file
   system above these root names so that disk letters or volume names
   are simply directory names in the pseudo root.

12.5.  Filehandle Volatility

   The nature of the server's pseudo file system is that it is a logical
   representation of file system(s) available from the server.
   Therefore, the pseudo file system is most likely constructed
   dynamically when the server is first instantiated.  It is expected
   that the pseudo file system may not have an on disk counterpart from
   which persistent filehandles could be constructed.  Even though it is
   preferable that the server provide persistent filehandles for the
   pseudo file system, the NFS client should expect that pseudo file
   system filehandles are volatile.  This can be confirmed by checking
   the associated "fh_expire_type" attribute for those filehandles in
   question.  If the filehandles are volatile, the NFS client must be
   prepared to recover a filehandle value (e.g. with a series of LOOKUP
   operations) when receiving an error of NFS4ERR_FHEXPIRED.
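
   As an illustration of the recovery described above, the following
   sketch retries an operation after re-walking the path when a
   filehandle has expired.  ToyServer, FhExpired, walk, and
   getattr_with_recovery are hypothetical names; the toy server simply
   invalidates every handle on restart and is not a model of any real
   implementation.

```python
class FhExpired(Exception):
    """Stands in for an NFS4ERR_FHEXPIRED reply."""


class ToyServer:
    """Hypothetical in-memory server whose filehandles expire on restart."""

    def __init__(self):
        self.generation = 0

    def restart(self):
        self.generation += 1  # invalidates every handle issued so far

    def root_fh(self):
        return (self.generation, ())

    def lookup(self, fh, name):
        self._check(fh)
        return (self.generation, fh[1] + (name,))

    def getattr(self, fh):
        self._check(fh)
        return {"path": "/" + "/".join(fh[1])}

    def _check(self, fh):
        if fh[0] != self.generation:
            raise FhExpired()


def walk(server, components):
    """Obtain a filehandle with a series of single-component LOOKUPs."""
    fh = server.root_fh()
    for name in components:
        fh = server.lookup(fh, name)
    return fh


def getattr_with_recovery(server, fh, components):
    """Try GETATTR; on expiry, recover the handle by path and retry once."""
    try:
        return server.getattr(fh), fh
    except FhExpired:
        fresh = walk(server, components)
        return server.getattr(fresh), fresh
```

   The client in this sketch has to remember the path components that
   led to each filehandle, since that is the only information from which
   an expired handle can be rebuilt.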

12.6.  Exported Root

   If the server's root file system is exported, one might conclude that
   a pseudo-file system is unneeded.  This is not necessarily so.  Assume
   the following file systems on a server:

           /       disk1  (exported)
           /a      disk2  (not exported)
           /a/b    disk3  (exported)

   Because disk2 is not exported, disk3 cannot be reached with simple
   LOOKUPs.  The server must bridge the gap with a pseudo-file system.

12.7.  Mount Point Crossing

   The server file system environment may be constructed in such a way
   that one file system contains a directory which is 'covered' or
   mounted upon by a second file system.  For example:

           /a/b            (file system 1)
           /a/b/c/d        (file system 2)

   The pseudo file system for this server may be constructed to look
   like:

           /               (place holder/not exported)
           /a/b            (file system 1)
           /a/b/c/d        (file system 2)

   It is the server's responsibility to present a pseudo file system
   that is complete to the client.  If the client sends a lookup request
   for the path "/a/b/c/d", the server's response is the filehandle of
   the file system "/a/b/c/d".  In previous versions of the NFS
   protocol, the server would respond with the filehandle of directory
   "/a/b/c/d" within the file system "/a/b".

   The NFS client will be able to determine if it crosses a server mount
   point by a change in the value of the "fsid" attribute.
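
   A client-side check for mount point crossing could look like the
   following sketch.  The Fsid tuple mirrors the major/minor pair of the
   fsid attribute; the crossings function and the sample fsid values are
   assumptions made for illustration only.

```python
from typing import NamedTuple


class Fsid(NamedTuple):
    """Mirror of the fsid attribute's major/minor pair."""
    major: int
    minor: int


def crossings(path_fsids):
    """Return the path components at which the fsid value changed.

    path_fsids is a list of (component, fsid) pairs in traversal order,
    as a client might collect with LOOKUP followed by GETATTR(fsid).
    """
    crossed = []
    previous = None
    for name, fsid in path_fsids:
        if previous is not None and fsid != previous:
            crossed.append(name)
        previous = fsid
    return crossed
```

   For the example above, a walk of "/a/b/c/d" would report a crossing
   at "b" (pseudo file system to file system 1) and at "d" (file system
   1 to file system 2), provided the server reports a distinct fsid for
   each.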

12.8.  Security Policy and Name Space Presentation

   The application of the server's security policy needs to be carefully
   considered by the implementor.  One may choose to limit the
   viewability of portions of the pseudo file system based on the
   server's perception of the client's ability to authenticate itself
   properly.  However, with the support of multiple security mechanisms
   and the ability to negotiate the appropriate use of these mechanisms,
   the server is unable to properly determine if a client will be able
   to authenticate itself.  If, based on its policies, the server
   chooses to limit the contents of the pseudo file system, the server
   may effectively hide file systems from a client that may otherwise
   have legitimate access.

   As suggested practice, the server should apply the security policy of
   a shared resource in the server's namespace to the components of the
   resource's ancestors.  For example:

           /
           /a/b
           /a/b/c

   The /a/b/c directory is a real file system and is the shared
   resource.  The security policy for /a/b/c is Kerberos with integrity.
   The server should apply the same security policy to /, /a, and /a/b.
   This allows for the extension of the protection of the server's
   namespace to the ancestors of the real shared resource.

   For the case of the use of multiple, disjoint security mechanisms in
   the server's resources, the security for a particular object in the
   server's namespace should be the union of all security mechanisms of
   all direct descendants.
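
   The suggested practice can be sketched as a computation over the
   export list.  The function names and the mechanism labels ("krb5i",
   "sys") are illustrative assumptions, as is the choice to leave the
   policy of an exported file system unchanged and to compute unions
   only for the non-exported ancestors that bridge to it.

```python
import posixpath


def ancestors(path):
    """Yield every proper ancestor of an absolute path, nearest first."""
    while path != "/":
        path = posixpath.dirname(path)
        yield path


def effective_security(exports):
    """Map each export and each bridging ancestor to its mechanism set.

    exports maps an export path to the set of security mechanisms
    configured for it.
    """
    policy = {path: set(mechs) for path, mechs in exports.items()}
    for path, mechs in exports.items():
        for parent in ancestors(path):
            if parent in exports:
                break  # assumption: a real export keeps its own policy
            # A bridging directory accepts the union of what lies below.
            policy.setdefault(parent, set()).update(mechs)
    return policy
```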

13.  File Locking and Share Reservations

   Integrating locking into the NFS protocol necessarily causes it to be
   stateful.  With the inclusion of such features as share reservations,
   file and directory delegations, recallable layouts, and support for
   mandatory byte-range locking the protocol becomes substantially more
   dependent on state than the traditional combination of NFS and NLM
   [XNFS].  There are three components to making this state manageable:

   o  Clear division between client and server

   o  Ability to reliably detect inconsistency in state between client
      and server

   o  Simple and robust recovery mechanisms

   In this model, the server owns the state information.  The client
   requests changes in locks and the server responds with the changes
   made.  Non-client-initiated changes in locking state are infrequent
   and the client receives prompt notification of them and can adjust
   its view of the locking state to reflect the server's changes.

   To support Win32 share reservations it is necessary to provide
   operations which atomically OPEN or CREATE files.  Having a separate
   share/unshare operation would not allow correct implementation of the
   Win32 OpenFile API.  In order to correctly implement share semantics,
   the previous NFS protocol mechanisms used when a file is opened or
   created (LOOKUP, CREATE, ACCESS) need to be replaced.  The NFS
   version 4.1 protocol defines an OPEN operation which looks up or
   creates a file and establishes locking state on the server.

13.1.  Locking

   It is assumed that manipulating a lock is rare when compared to READ
   and WRITE operations.  It is also assumed that crashes and network
   partitions are relatively rare.  Therefore it is important that the
   READ and WRITE operations have a lightweight mechanism to indicate if
   they possess a held lock.  A lock request contains the heavyweight
   information required to establish a lock and uniquely define the lock
   owner.

   The following sections describe the transition from the heavyweight
   information to the eventual lightweight stateid used for most client
   and server locking interactions.

13.1.1.  Client ID

   For each operation that obtains or depends on locking state, the
   specific client must be determinable by the server.  In NFSv4, each
   distinct client instance is represented by a clientid, which is a
   64-bit identifier that identifies a specific client at a given time
   and which is changed whenever the client or the server re-
   initializes.  Clientids are used to support lock identification and
   crash recovery.

   In NFSv4.1, the clientid associated with each operation is derived
   from the session on which the operation is issued.  Each session is
   associated with a specific clientid at session creation and that
   clientid then becomes the clientid associated with all requests
   issued using it.

   A sequence of a CREATE_CLIENTID operation followed by a
   CREATE_SESSION operation using that clientid is required to establish
   the identification on the server.  Establishment of identification by
   a new incarnation of the client also has the effect of immediately
   releasing any locking state that a previous incarnation of that same
   client might have had on the server.  Such released state would
   include all lock, share reservation, and, where the server is not
   supporting the CLAIM_DELEGATE_PREV claim type, all delegation state
   associated with the same client with the same identity.  For
   discussion of delegation state recovery, see the section "Delegation
   Recovery".

   Releasing such state requires that the server be able to determine
   that one client instance is the successor of another.  Where this
   cannot be done, for any of a number of reasons, the locking state
   will remain for a time subject to lease expiration (see Section 13.5)
   and the new client will need to wait for such state to be removed, if
   it makes conflicting lock requests.

   Client identification is encapsulated in the following structure:

               struct nfs_client_id4 {
                       verifier4     verifier;
                       opaque        id<NFS4_OPAQUE_LIMIT>;
               };

   The first field, verifier, is a client incarnation verifier that is
   used to detect client reboots.  Only if the verifier is different
   from that which the server had previously recorded for the client (as
   identified by the second field of the structure, id) does the server
   start the process of canceling the client's leased state.

   The second field, id, is a variable length string that uniquely
   defines the client so that subsequent instances of the same client
   bear the same id with a different verifier.

   There are several considerations for how the client generates the id
   string:

   o  The string should be unique so that multiple clients do not
      present the same string.  The consequences of two clients
      presenting the same string range from one client getting an error
      to one client having its leased state abruptly and unexpectedly
      canceled.

   o  The string should be selected so that subsequent incarnations
      (e.g. reboots) of the same client cause the client to present the
      same string.  The implementor is cautioned against an approach
      that requires the string to be recorded in a local file because
      this precludes the use of the implementation in an environment
      where there is no local disk and all file access is from an NFS
      version 4 server.

   o  The string should be different for each server network address
      that the client accesses, rather than common to all server network
      addresses.  The reason is that it may not be possible for the
      client to tell if the same server is listening on multiple network
      addresses.  If the client issues CREATE_CLIENTID with the same id
      string to each network address of such a server, the server will
      think it is the same client, and each successive CREATE_CLIENTID
      will cause the server to remove the client's previous leased
      state.

   o  The algorithm for generating the string should not assume that the
      client's network address won't change.  This includes changes
      between client incarnations and even changes while the client is
      still running in its current incarnation.  This means that if the
      client includes just the client's and server's network address in
      the id string, there is a real risk, after the client gives up the
      network address, that another client, using a similar algorithm
      for generating the id string, would generate a conflicting id
      string.

   Given the above considerations, an example of a well generated id
   string is one that includes:

   o  The server's network address.

   o  The client's network address.

   o  For a user level NFS version 4 client, it should contain
      additional information to distinguish the client from other user
      level clients running on the same host, such as a process id or
      other unique sequence.

   o  Additional information that tends to be unique, such as one or
      more of:

      *  The client machine's serial number (for privacy reasons, it is
         best to perform some one way function on the serial number).

      *  A MAC address.

      *  The timestamp of when the NFS version 4 software was first
         installed on the client (though this is subject to the
         previously mentioned caution about using information that is
         stored in a file, because the file might only be accessible
         over NFS version 4).

      *  A true random number.  However since this number ought to be
         the same between client incarnations, this shares the same
         problem as that of using the timestamp of the software
         installation.
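
   One way to combine the ingredients listed above is sketched below.
   The function name, the "/" separator, the choice of SHA-256 as the
   one way function, and the value used for NFS4_OPAQUE_LIMIT are
   assumptions made for this illustration; nothing here is mandated by
   the protocol.

```python
import hashlib

NFS4_OPAQUE_LIMIT = 1024  # assumed upper bound on the opaque id field


def make_client_id(server_addr, client_addr, serial_number, instance=None):
    """Build an id string from the ingredients listed above."""
    # One way function over the serial number, for privacy.
    hashed_serial = hashlib.sha256(serial_number.encode("utf-8")).hexdigest()
    parts = [server_addr, client_addr, hashed_serial]
    if instance is not None:
        # For example a process id, to tell user level clients on one
        # host apart.
        parts.append(str(instance))
    id_string = "/".join(parts).encode("utf-8")
    if len(id_string) > NFS4_OPAQUE_LIMIT:
        raise ValueError("id string exceeds NFS4_OPAQUE_LIMIT")
    return id_string
```

   Because every input is stable across reboots, successive incarnations
   of the client present the same string, while a different server
   address yields a different string.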

   As a write delegation, security measure, the server will cache MUST NOT cancel a copy of
      the change attribute in client's leased
   state if the data structure it uses to record principal established the
      delegation.  Let this value be represented by sc.

   o  When a second client sends state for a GETATTR operation on given id string is
   not the same file to as the server, principal issuing the CREATE_CLIENTID.
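
   The server-side handling of the verifier, the id string, and the
   principal check can be sketched as follows.  ClientTable, its method,
   and the outcome labels are hypothetical; in particular, returning the
   existing record when the verifier is unchanged is an assumption of
   this sketch rather than something stated above.

```python
import itertools


class ClientTable:
    """Toy server-side table keyed by the nfs_client_id4 id string."""

    def __init__(self):
        self._records = {}
        self._next_clientid = itertools.count(1)

    def create_clientid(self, id_string, verifier, principal):
        """Return (clientid, outcome) for a CREATE_CLIENTID-style request."""
        record = self._records.get(id_string)
        if record is not None:
            if record["principal"] != principal:
                # Security measure: a different principal must not be
                # able to cancel the existing client's leased state.
                return None, "refused"
            if record["verifier"] == verifier:
                # Assumption: same incarnation, so nothing is canceled.
                return record["clientid"], "unchanged"
            outcome = "state_canceled"  # new incarnation of a known client
        else:
            outcome = "new"
        record = {
            "verifier": verifier,
            "principal": principal,
            "clientid": next(self._next_clientid),
            "locks": [],  # fresh, empty leased state
        }
        self._records[id_string] = record
        return record["clientid"], outcome
```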

   A server may compare an nfs_client_id4 in a CREATE_CLIENTID with an
   nfs_client_id4 established using SETCLIENTID using NFSv4 minor
   version 0, so that an NFSv4.1 client is not forced to delay until
   lease expiration for locking state established by the earlier client
   using minor version 0.

   Once a CREATE_CLIENTID has been done, and the resulting clientid
   established as associated with a session, all requests made on that
   session implicitly identify that clientid, which in turn designates
   the client specified using the long-form nfs_client_id4 structure.
   The shorthand client identifier (a clientid) is assigned by the
   server and should be chosen so that it will not conflict with a
   clientid previously assigned by the server.  This applies across
   server restarts or reboots.

   In the event of a server restart, a client will find out that its
   current clientid is no longer valid when it receives an
   NFS4ERR_STALE_CLIENTID error.  The precise circumstances depend on
   the characteristics of the sessions involved, specifically whether
   the session is persistent.

   When a session is not persistent, the client will need to create a
   new session.  When the existing clientid is presented to a server as
   part of creating a session and that clientid is not recognized, as
   would happen after a server reboot, the server will reject the
   request with the error NFS4ERR_STALE_CLIENTID.  When this happens,
   the client must obtain a new clientid by use of the CREATE_CLIENTID
   operation and then use that clientid as the basis of a new session
   and then proceed to any other necessary recovery for the server
   reboot case (see Section 13.6.2).

   In the case of the session being persistent, the client will re-
   establish communication using the existing session after the reboot.
   This session will be associated with a stale clientid and the client
   will receive an indication of that fact in the status field returned
   by the SEQUENCE operation.  The client can then use the existing
   session to do whatever operations are necessary to determine the
   status of requests outstanding at the time of reboot, while avoiding
   issuing new requests, particularly any involving locking on that
   session.  Such requests would fail with an NFS4ERR_STALE_CLIENTID
   error or an NFS4ERR_STALE_STATEID error, if attempted.  In any case,
   the client would create a new clientid using CREATE_CLIENTID, create
   a new session based on that clientid, and proceed to other necessary
   recovery for the server reboot case.

   See the
      RENAME

   Whether a RENAME detailed descriptions of CREATE_CLIENTID and CREATE_SESSION
   for a directory in the path leading to the file
   results in recall complete specification of an open delegation depends on the semantics these operations.

13.1.2.  Server Release of Clientid

   If the server file system.  If determines that file system denies such RENAMEs when
   a file is open, the recall must be performed to determine whether the
   file in question is, in fact, open.

   In addition to the situations above, client holds no associated state
   for its clientid, the server may choose to recall
   open delegations at any time if resource constraints release the clientid.  The
   server may make it
   advisable to do so.  Clients should always be prepared this choice for an inactive client so that resources
   are not consumed by those intermittently active clients.  If the
   possibility of recall.

   When a
   client receives a recall for an open delegation, it needs to
   update state on contacts the server before returning after this release, the delegation.  These
   same updates server must be done whenever a ensure
   the client chooses to return a
   delegation voluntarily.  The following items of state need to be
   dealt with:

   o  If receives the file associated with appropriate error so that it will use the delegation is no longer open and
      no previous CLOSE operation has been sent
   CREATE_CLIENTID/CREATE_SESSION sequence to the server, establish a CLOSE
      operation must new identity.
   It should be sent to the server.

   o  If a file has other open references at clear that the client, then OPEN
      operations server must be sent very hesitant to release a
   clientid since the server.  The appropriate stateids
      will be provided by the server for subsequent use by resulting work on the client
      since the delegation stateid to recover from such
   an event will not longer be valid.  These OPEN
      requests are done with the claim type of CLAIM_DELEGATE_CUR.  This
      will allow the presentation of same burden as if the delegation stateid so server had failed and
   restarted.  Typically a server would not release a clientid unless
   there had been no activity from that the client can establish the appropriate rights to perform the OPEN.
      (see the section "Operation 18: OPEN" for details.)

   o  If there are granted file locks, the corresponding LOCK operations
      need to be performed.  This applies to many minutes.

   Note that if the write open delegation
      case only.

   o  For id string in a write open delegation, CREATE_CLIENTID request is properly
   constructed, and if at the time of recall client takes care to use the file is
      not open for write, all modified data same principal
   for the file must each successive use of CREATE_CLIENTID, then, barring an active
   denial of service attack, NFS4ERR_CLID_INUSE should never be flushed
      to
   returned.

   However, client bugs, server bugs, or perhaps a deliberate change of
   the server.  If principal owner of the delegation had not existed, id string (such as the case of a client
      would have done this data flush before
   that changes security flavors, and under the CLOSE operation.

   o  For a write open delegation when a file new flavor, there is still open at the time
      of recall, any modified data for the file needs to be flushed no
   mapping to the server.

   o  With the write open delegation previous owner) will in place, it is possible rare cases result in
   NFS4ERR_CLID_INUSE.

   In that event, when the
      file was truncated during the duration of the delegation.  For
      example, the truncation could have occurred as a result of an OPEN
      UNCHECKED with server gets a size attribute value of zero.  Therefore, if CREATE_CLIENTID for a
      truncation of client id
   that currently has no state, or it has state, but the file lease has occurred
   expired, rather than returning NFS4ERR_CLID_INUSE, the server MUST
   allow the CREATE_CLIENTID, and this operation has not
      been propagated to confirm the server, new clientid if followed
   by the truncation must occur before
      any modified data is written to appropriate CRREATESESSION.
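
   The items of state listed for a delegation return can be sketched as
   an ordered plan, as in the hypothetical Python below.  The class, its
   field names, and the operation strings are illustrative assumptions.
   The text only requires that truncation precede any write of modified
   data and that the flush precede the CLOSE; the remaining ordering,
   and the use of a SETATTR of size to propagate the truncation, are
   design choices of this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class DelegatedFileState:
    """Client-side view of one delegated file (all names hypothetical)."""
    write_delegation: bool
    local_opens: int = 0          # opens still held by local applications
    close_pending: bool = False   # file no longer open, CLOSE never sent
    local_locks: List[Tuple[int, int]] = field(default_factory=list)
    dirty: bool = False           # modified data not yet on the server
    truncated: bool = False       # truncation not yet propagated


def plan_delegation_return(st):
    """Return the operations to send, in order, ending with DELEGRETURN."""
    ops = []
    if st.write_delegation:
        if st.truncated:
            # truncation must reach the server before modified data
            ops.append("SETATTR size (truncate)")
        if st.dirty:
            ops.append("WRITE (flush modified data)")
    if st.local_opens > 0:
        # re-establish opens using the delegation stateid
        ops.append("OPEN CLAIM_DELEGATE_CUR")
        if st.write_delegation:
            # locks granted locally apply to the write delegation case only
            ops.extend(f"LOCK {off}+{length}" for off, length in st.local_locks)
    elif st.close_pending:
        ops.append("CLOSE")
    ops.append("DELEGRETURN")
    return ops
```

   For example, a file that is still open under a write delegation, with
   one local lock, unflushed data, and an unpropagated truncation, gives
   a plan of SETATTR, WRITE, OPEN, LOCK, and finally DELEGRETURN.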

   In the case of write open delegation, file locking imposes some
   additional requirements.  To precisely maintain the associated
   invariant, it is required to flush any modified data in any region
   for which a write lock was released while the write delegation was in
   effect.  However, because the write open delegation implies no other
   locking by other clients, a simpler implementation is to flush all
   modified data for the file (as described just above) if any write
   lock has been released while the write open delegation was in effect.

   An implementation need not wait until delegation recall (or deciding
   to voluntarily return a delegation) to perform any of the above
   actions, if implementation considerations (e.g. resource availability
   constraints) make that desirable.  Generally, however, the fact that
   the actual open state of the file may continue to change makes it not
   worthwhile to send information about opens and closes to the server,
   except as part of delegation return.  Only in the case of closing the
   open that resulted in obtaining the delegation would clients be
   likely to do this early, since, in that case, the close once done
   will not be undone.  Regardless of the client's choices on scheduling
   these actions, all must be performed before the delegation is
   returned, including (when applicable) the close that corresponds to
   the open that resulted in the delegation.  These actions can be
   performed either in previous requests or in previous operations in
   the same COMPOUND request.

9.4.5.  Clients that Fail to Honor Delegation Recalls

   A client may fail to respond to a recall for various reasons, such as
   a failure of the callback path from server to the client.  The client
   may be unaware of a failure in the callback path.  This lack of
   awareness could result in the client finding out long after the
   failure that its delegation has been revoked, and another client has
   modified the data for which the client had a delegation.  This is
   especially a problem for the client that held a write delegation.

   The server also has a dilemma in that the client that fails to
   respond to the recall might also be sending other NFS requests,
   including those that renew the lease before the lease expires.
   Without returning an error for those lease renewing operations, the
   server leads the client to believe that the delegation it has is in
   force.

   This difficulty is solved by the following rules:

   o  When the callback path is down, the server MUST NOT revoke the
      delegation if one of the following occurs:

      *  The client has issued a RENEW operation and the server has
         returned an NFS4ERR_CB_PATH_DOWN error.  The server MUST renew
         the lease for any record locks and share reservations the
         client has that the server has known about (as opposed to those
         locks and share reservations the client has established but not
         yet sent to the server, due to the delegation).  The server
         SHOULD give the client a reasonable time to return its
         delegations to the server before revoking the client's
         delegations.

      *  The client has not issued a RENEW operation for some period of
         time after the server attempted to recall the delegation.  This
         period of time MUST NOT be less than the value of the
         lease_time attribute.

   o  When the client holds a delegation, it can not rely on operations,
      except for RENEW, that take a stateid, to renew delegation leases
      across callback path failures.  The client that wants to keep
      delegations in force across callback path failures must use RENEW
      to do so.

9.4.6.  Delegation Revocation

   At the point a delegation is revoked, if there are associated opens
   on the client, the applications holding these opens need to be
   notified.  This notification usually occurs by returning errors for
   READ/WRITE operations or when a close is attempted for the open file.

   If no opens exist for the file at the point the delegation is
   revoked, then notification of the revocation is unnecessary.
   However, if there is modified data present at the client for the
   file, the user of the application should be notified.  Unfortunately,
   it may not be possible to notify the user since active applications
   may not be present at the client.  See the section "Revocation
   Recovery for Write Open Delegation" for additional details.

9.5.  Data Caching and Revocation

   When locks and delegations are revoked, the assumptions upon which
   successful caching depends are no longer guaranteed.  For any locks
   or share reservations that have been revoked, the corresponding owner
   needs to be notified.  This notification includes applications with a
   file open that has a corresponding delegation which has been revoked.
   Cached data associated with the revocation must be removed from the
   client.  In the case of modified data existing in the client's cache,
   that data must be removed from the client without it being written to
   the server.  As mentioned, the assumptions made by the client are no
   longer valid at the point when a lock or delegation has been revoked.
   For example, another client may have been granted a conflicting lock
   after the revocation of the lock at the first client.  Therefore, the
   data within the lock range may have been modified by the other
   client.  Obviously, the first client is unable to guarantee to the
   application what has occurred to the file in the case of revocation.

   Notification to a lock owner will in many cases consist of simply
   returning an error on the next and all subsequent READs/WRITEs to the
   open file or on the close.  Where the methods available to a client
   make such notification impossible because errors for certain
   operations may not be returned, more drastic action such as signals
   or process termination may be appropriate.  The justification for
   this is that an invariant on which an application depends may be
   violated.  Depending on how errors are typically treated for the
   client operating environment, further levels of notification
   including logging, console messages, and GUI pop-ups may be
   appropriate.

13.1.3.  State-owner and Stateid Definition

   When opening a file or requesting a byte-range lock, the client must
   specify an identifier which represents the owner of the requested
   lock.  This identifier is in the form of a state-owner, represented
   in the protocol by a state_owner4, a variable-length opaque array
   which, when concatenated with the current clientid, uniquely defines
   the owner of a lock managed by the client.  This may be a thread id,
   process id, or other unique value.

   Owners of opens and owners of byte-range locks are separate entities
   and remain separate even if the same opaque arrays are used to
   designate owners of each.  The protocol distinguishes between
   open-owners (represented by open_owner4 structures) and lock-owners
   (represented by lock_owner4 structures).

   Each open is associated with a specific open-owner while each
   byte-range lock is associated with a lock-owner and an open-owner,
   the latter being the open-owner associated with the open file under
   which the LOCK operation was done.  Delegations and layouts, on the
   other hand, are not associated with a specific owner but are
   associated with the client as a whole.

   When the server grants a lock of any type (including opens,
   byte-range locks, delegations, and layouts) it responds with a unique
   stateid that represents a set of locks (often a single lock) for the
   same file, of the same type, and sharing the same ownership
   characteristics.  Thus opens of the same file by different
   open-owners each have an identifying stateid.  Similarly, each set of
   byte-range locks on a file owned by a specific lock-owner and gotten
   via an open for a specific open-owner, has its own identifying
   stateid.  Delegations and layouts also have associated stateids by
   which they may be referenced.  The stateid is used as a shorthand
   reference to a lock or set of locks and given a stateid the client
   can determine the associated state-owner or state-owners (in the case
   of an open-owner/lock-owner pair) and the associated filehandle.
   Clients, however, must not assume any such mapping and must not use a
   stateid returned for a given filehandle and state-owner in the
   context of a different filehandle or a different state-owner.

   The server is free to form the stateid in any manner that it chooses
   as long as it is able to recognize invalid and out-of-date stateids.
   Although the protocol XDR definition divides the stateid into 'seqid'
   and 'other' fields, for the purposes of minor version one, this
   distinction is not important and the server may use the available
   space as it chooses, with one exception.

   The exception is that stateids whose 'other' field is either all
   zeros or all ones are reserved and may not be generated by the
   server.  Clients may use the protocol-defined special stateid values
   for their defined purposes, but any use of stateids in this reserved
   class that are not specially defined by the protocol MUST result in
   an NFS4ERR_BAD_STATEID being returned.

   Clients may not compare stateids associated with different
   filehandles, so that a server might use stateids with the same bit
   pattern for all opens with a given open-owner or for all sets of
   byte-range locks associated with a given lock-owner/open-owner pair.
   However, if it does so, it must recognize and reject any use of
   stateid when the current filehandle is such that no lock for that
   filehandle by that open owner (or lock-owner/open-owner pair) exists.

   Stateids must remain valid until either a client reboot or a server
   reboot or until the client returns all of the locks associated with
   the stateid by means of an operation such as CLOSE or DELEGRETURN.
   If the locks are lost due to revocation the stateid remains usable
   until the client frees it by using FREE_STATEID.  Stateids associated
   with byte-range locks are an exception.  They remain valid even if a
   LOCKU frees all remaining locks, so long as the open file with which
   they are associated remains open, unless the client does a
   FREE_STATEID to cause the stateid to be freed.

   Because each operation using a stateid occurs as part of a session,
   each stateid is implicitly associated with the clientid assigned to
   that session.  Use of a stateid in the context of a session where the
   clientid is invalid should result in the error NFS4ERR_STALE_STATEID.
   Servers MUST NOT do any validation or return other errors in this
   case, even if they have sufficient information available to validate
   stateids associated with an out-of-date client.

   One mechanism that may be used to satisfy the requirement that the
   server recognize invalid and out-of-date stateids is for the server
   to divide the stateid into two fields.  This division may coincide
   with the documented division into 'seqid' and 'other' fields or it
   may divide the stateid field up in any other way it chooses.

   o  An index into a table of locking-state structures.

   o  A generation number which is incremented each time a table entry
      is allocated to a stateid.

   And then store in each table entry,

   o  The current generation number.

   o  The clientid with which the stateid is associated.

   o  The filehandle of the file on which the locks are taken.

   o  An indication of the type of stateid (open, byte-range lock, file
      delegation, directory delegation, layout).

   With this information, the following procedure would be used to
   validate an incoming stateid and return an appropriate error, when
   necessary:

   o  If the current session is associated with an invalid clientid,
      return NFS4ERR_STALE_STATEID.

   o  If the table index field is outside the range of the associated
      table, return NFS4ERR_BAD_STATEID.

   o  If the selected table entry is of a different generation than that
      specified in the incoming stateid, return NFS4ERR_BAD_STATEID.

   o  If the selected table entry does not match the current file
      handle, return NFS4ERR_BAD_STATEID.

   o  If the clientid in the table entry does not match the clientid
      associated with the current session, return NFS4ERR_BAD_STATEID.

   o  If the stateid type is not valid for the context in which the
      stateid appears, return NFS4ERR_BAD_STATEID.

   o  Otherwise, the stateid is valid and the table entry should contain
      any additional information about the associated set of locks, such
      as open-owner and lock-owner information, as well as information
      on the specific locks, such as open modes and byte ranges.
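
   The index-plus-generation mechanism and the validation procedure
   listed for it can be sketched in Python as below.  This is one
   possible layout under stated assumptions, not a required
   implementation: TableEntry, Stateid, validate_stateid, and the
   string error values are hypothetical names, and treating an
   unallocated table slot as a bad stateid is an added assumption.

```python
from dataclasses import dataclass
from typing import List, Optional, Set

NFS4_OK = "NFS4_OK"
NFS4ERR_STALE_STATEID = "NFS4ERR_STALE_STATEID"
NFS4ERR_BAD_STATEID = "NFS4ERR_BAD_STATEID"


@dataclass
class TableEntry:
    generation: int     # current generation number of this slot
    clientid: int       # clientid with which the stateid is associated
    filehandle: bytes   # file on which the locks are taken
    kind: str           # "open", "lock", "deleg", "dir_deleg" or "layout"


@dataclass
class Stateid:
    index: int          # index into the locking-state table
    generation: int     # generation handed to the client


def validate_stateid(table: List[Optional[TableEntry]], sid: Stateid,
                     session_clientid: int, clientid_valid: bool,
                     current_fh: bytes, allowed_kinds: Set[str]) -> str:
    """Apply the checks in the order given in the text."""
    if not clientid_valid:
        return NFS4ERR_STALE_STATEID
    if not 0 <= sid.index < len(table):
        return NFS4ERR_BAD_STATEID
    entry = table[sid.index]
    # An empty (None) slot is treated as a bad stateid: an assumption.
    if entry is None or entry.generation != sid.generation:
        return NFS4ERR_BAD_STATEID
    if entry.filehandle != current_fh:
        return NFS4ERR_BAD_STATEID
    if entry.clientid != session_clientid:
        return NFS4ERR_BAD_STATEID
    if entry.kind not in allowed_kinds:
        return NFS4ERR_BAD_STATEID
    return NFS4_OK
```

   The stale-clientid check comes first so that, as the text requires,
   no other validation result is reported for an out-of-date client.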

9.5.1.  Revocation Recovery for Write Open Delegation

   Revocation recovery for Locking

   All READ, WRITE and SETATTR operations contain a write open delegation poses stateid.  For the special
   issue
   purposes of modified data in this section, SETATTR operations which change the client cache while size
   attribute of a file are treated as if they are writing the area
   between the old and new size (i.e. the range truncated or added to
   the file by means of the SETATTR), even where SETATTR is not
   open.  In this situation, any client which does not flush modified
   data to the server on each close must ensure that
   explicitly mentioned in the user receives
   appropriate notification of text.

   If the failure as state-owner performs a result of the
   revocation.  Since such situations may require human action to
   correct problems, notification schemes READ or WRITE in a situation in which the appropriate user
   it has established a lock or administrator is notified may share reservation on the server (any
   OPEN constitutes a share reservation) the stateid (previously
   returned by the server) must be necessary.  Logging used to indicate what locks,
   including both record locks and console
   messages share reservations, are typical examples. held by the
   state-owner.  If there no state is modified data on established by the client, it must not be flushed
   normally to the server.  A client may attempt to provide either record
   lock or share reservation, a copy special stateid of all bits 0 (including
   all fields of the file data as modified during stateid) is used.  Regardless whether a stateid of
   all bits 0, or a stateid returned by the delegation under server is used, if there is
   a different
   name in conflicting share reservation or mandatory record lock held on the file system name space to ease recovery.  Note that when
   file, the client can determine that server MUST refuse to service the file has not been modified by any
   other client, READ or WRITE operation.

   Share reservations are established by OPEN operations and by their
   nature are mandatory in that when the client has a complete cached copy of file OPEN denies READ or WRITE
   operations, that denial results in question, such a saved copy of operations being rejected
   with error NFS4ERR_LOCKED.  Record locks may be implemented by the client's view of
   server as either mandatory or advisory, or the file choice of mandatory or
   advisory behavior may be of particular value for recovery.  In other case, recovery using a
   copy determined by the server on the basis of the
   file based partially being accessed (for example, some UNIX-based servers support a
   "mandatory lock bit" on the client's cached data and
   partially mode attribute such that if set, record
   locks are required on the server copy as modified by other clients, will be
   anything but straightforward, so clients may avoid saving file
   contents in these situations or mark before I/O is possible).  When record
   locks are advisory, they only prevent the results specially to warn
   users of possible problems.

   Saving of such modified data in delegation revocation situations may
   be limited to files granting of a certain size conflicting
   lock requests and have no effect on READs or might be used only when
   sufficient disk space is available within the target file system.
   Such saving may also be restricted to situations when WRITEs.  Mandatory
   record locks, however, prevent conflicting I/O operations.  When they
   are attempted, they are rejected with NFS4ERR_LOCKED.  When the
   client gets NFS4ERR_LOCKED on a file it knows it has
   sufficient buffering resources to keep the cached copy available
   until proper share
   reservation for, it is properly stored will need to issue a LOCK request on the target file system.

9.6.  Attribute Caching

   The attributes discussed in this section do not include named
   attributes.  Individual named attributes are analogous to files and
   caching region
   of the data for these needs file that includes the region the I/O was to be handled just as data
   caching is for ordinary files.  Similarly, LOOKUP results from performed on,
   with an
   OPENATTR directory are to be cached on the same basis as any other
   pathnames and similarly appropriate locktype (i.e.  READ*_LT for directory contents.

   Clients may cache file attributes obtained from the server and use
   them to avoid subsequent GETATTR requests.  Such caching is write
   through in a READ operation,
   WRITE*_LT for a WRITE operation).

   Note that modification to for UNIX environments that support mandatory file attributes is always done by
   means of requests to locking,
   the server distinction between advisory and should not be done locally mandatory locking is subtle.  In
   fact, advisory and
   cached.  The exception to this are modifications to attributes that mandatory record locks are intimately connected with data caching.  Therefore, extending a
   file by writing data to exactly the local data cache is reflected immediately same in the size so
   far as seen the APIs and requirements on implementation.  If the client without this change being
   immediately reflected mandatory
   lock attribute is set on the file, the server checks to see if the
   lock-owner has an appropriate shared (read) or exclusive (write)
   record lock on the server.  Normally such changes are not
   propagated directly region it wishes to read or write to.  If there is
   no appropriate lock, the server but when the modified data checks if there is
   flushed a conflicting lock
   (which can be done by attempting to acquire the server, analogous attribute changes are made conflicting lock on
   the
   server.  When open delegation behalf of the lock-owner, and if successful, release the lock
   after the READ or WRITE is in effect, done), and if there is, the modified attributes
   may be returned to server returns
   NFS4ERR_LOCKED.

   For Windows environments, there are no advisory record locks, so the
   server in always checks for record locks during I/O requests.

   Thus, the response NFS version 4 LOCK operation does not need to a CB_RECALL call.

   The result of local caching of attributes distinguish
   between advisory and mandatory record locks.  It is that the attribute
   caches maintained on individual clients will not be coherent.
   Changes made in one order on NFS version 4
   server's processing of the server may be seen in a different
   order on one client READ and WRITE operations that introduces
   the distinction.

   Every stateid other than the special stateid values noted in this
   section, whether returned by an OPEN-type operation (i.e.  OPEN,
   OPEN_DOWNGRADE), or by a third order on a different client.

   The typical file system application programming interfaces do not
   provide means to atomically modify LOCK-type operation (i.e.  LOCK or interrogate attributes LOCKU),
   defines an access mode for
   multiple files at the same time.  The following rules provide an
   environment where file (i.e.  READ, WRITE, or READ-
   WRITE) as established by the potential incoherences mentioned above can be
   reasonably managed.  These rules are derived from original OPEN which caused the practice
   allocation of
   previous NFS protocols.

   o  All attributes for a given file (per-fsid attributes excepted) are
      cached the open stateid and as a unit at modified by subsequent OPENs
   and OPEN_DOWNGRADEs for the client so that no non-serializability can
      arise within same open-owner/file pair.  Stateids
   returned by byte-range lock operations imply the context of a single file.

   o  An upper time boundary is maintained access mode for the
   open stateid associated with the lock set represented by the stateid.
   Delegation stateids have an access mode based on how long a client cache
      entry can be kept without being refreshed from the server.

   o type of
   delegation.  When operations are performed that change attributes at a READ, WRITE, or SETATTR which specifies the
      server, size
   attribute, is done, the updated attribute set operation is requested as part of subject to checking against the
      containing RPC.  This includes directory operations
   access mode to verify that update
      attributes indirectly.  This is accomplished by following the
      modifying operation is appropriate given the
   OPEN with a GETATTR which the operation and then using is associated.

   In the
      results case of WRITE-type operations (i.e.  WRITEs and SETATTRs which
   set size), the GETATTR to update the client's cached attributes.

   Note server must verify that the access mode allows writing
   and return an NFS4ERR_OPENMODE error if it does not.  In the full set case, of attributes to be cached is requested by
   READDIR,
   READ, the results can be cached by server may perform the client corresponding check on the same basis as
   attributes obtained via GETATTR.

   A client access
   mode, or it may validate its cached version of attributes choose to allow READ on opens for a file by
   fetching just both the change and time_access attributes and assuming
   that WRITE only, to
   accommodate clients whose write implementation may unavoidably do
   reads (e.g. due to buffer cache constraints).  However, even if the change attribute has the same value as it did when the
   attributes were cached, then no attributes other than time_access
   have changed.  The reason why time_access is also fetched is because
   many servers operate READs
   are allowed in environments where these circumstances, the operation server MUST still check for
   locks that updates
   change does not update time_access.  For example, POSIX file
   semantics do not update access time when a file is modified by the
   write system call.  Therefore, conflict with the client READ (e.g. another open specify denial
   of READs).  Note that wants a current
   time_access value should fetch it with change during server which does enforce the attribute
   cache validation processing and update its cached time_access.

   The client may maintain a cache access mode
   check on READs need not explicitly check for conflicting share
   reservations since the existence of modified attributes OPEN for those
   attributes intimately connected with data read access guarantees
   that no conflicting share reservation can exist.

   A special stateid of modified regular files
   (size, time_modify, and change).  Other than those three attributes, all bits 1 (one), including all fields in the client
   stateid indicates a desire to bypass locking checks.  The server MAY
   allow READ operations to bypass locking checks at the server, when
   this special stateid is used.  However, WRITE operations with with
   this special stateid value MUST NOT maintain a cache of modified attributes.
   Instead, attribute changes bypass locking checks and are immediately sent to
   treated exactly the server.

   In some operating environments, same as if a stateid of all bits 0 were used.

   A lock may not be granted while a READ or WRITE operation using one
   of the equivalent to time_access special stateids is
   expected to be implicitly updated by each read being performed and the range of the content lock
   request conflicts with the range of the
   file object.  If an NFS client is caching READ or WRITE operation.  For
   the content purposes of this paragraph, a file
   object, whether it conflict occurs when a shared lock
   is requested and a regular file, directory, or symbolic link,
   the client SHOULD NOT update the time_access attribute (via SETATTR WRITE operation is being performed, or an
   exclusive lock is requested and either a small READ or READDIR request) on the server with each read that
   is satisfied from cache.  The reason a WRITE operation is that this can defeat the
   performance benefits of caching content, especially since an explicit
   being performed.  A SETATTR of time_access may alter the change attribute on the server.
   If the change attribute changes, clients that are caching the content
   will think the content has changed, and will re-read unmodified data
   from the server.  Nor sets size is the client encouraged treated similarly to maintain a modified
   version of time_access in its cache, since this would mean that
   WRITE as discussed above.

13.2.  Lock Ranges

   The protocol allows a lock owner to request a lock with a byte range
   and then either upgrade, downgrade, or unlock a sub-range of the
   client
   initial lock.  It is expected that this will be an uncommon type of
   request.  In any case, servers or server filesystems may not be able
   to support sub-range lock semantics.  In the event that a server
   receives a locking request that represents a sub-range of current
   locking state for the lock owner, the server is allowed to return the
   error NFS4ERR_LOCK_RANGE to signify that it does not support sub-
   range lock operations.  Therefore, the client should be prepared to
   receive this error and, if appropriate, report the error to the
   requesting application.

   The client is discouraged from combining multiple independent locking
   ranges that happen to be adjacent into a single request since the
   server may not support sub-range requests and for reasons related to
   the recovery of file locking state in the event of server failure.
   As discussed in the section "Server Failure and Recovery" below, the
   server may employ certain optimizations during recovery that work
   effectively only when the client's behavior during lock recovery is
   similar to the client's locking behavior prior to server failure.

13.3.  Upgrading and Downgrading Locks

   If a client has a write lock on a record, it can request an atomic
   downgrade of the lock to a read lock via the LOCK request, by setting
   the type to READ_LT.  If the server supports atomic downgrade, the
   request will succeed.  If not, it will return NFS4ERR_LOCK_NOTSUPP.
   The client should be prepared to receive this error, and if
   appropriate, report the error to the requesting application.

   either eventually have to write the access time to the server with
   bad performance effects, or it would never update the server's
   time_access, thereby resulting in a situation where an application
   that caches access time between a close and open of the same file
   observes the access time oscillating between the past and present.
   The time_access attribute always means the time of last access to a
   file by a read that was satisfied by the server.  This way clients
   will tend to see only time_access changes that go forward in time.

9.7.  Data and Metadata Caching and Memory Mapped Files

   Some operating environments include the capability for an application
   to map a file's content into the application's address space.  Each
   time the application accesses a memory location that corresponds to a
   block that has not been loaded into the address space, a page fault
   occurs and the file is read (or if the block does not exist in the
   file, the block is allocated and then instantiated in the
   application's address space).

   As long as each memory mapped access to the file requires a page
   fault, the relevant attributes of the file that are used to detect
   access and modification (time_access, time_metadata, time_modify, and
   change) will be updated.  However, in many operating environments,
   when page faults are not required these attributes will not be
   updated on reads or updates to the file via memory access (regardless
   of whether the file is a local file or is being accessed remotely).
   A client or server MAY fail to update attributes of a file that is
   being accessed via memory mapped I/O.  This has several implications:

   o  If there is an application on the server that has memory mapped a
      file that a client is also accessing, the client may not be able
      to get a consistent value of the change attribute to determine
      whether its cache is stale or not.  A server that knows that the
      file is memory mapped could always pessimistically return updated
      values for change so as to force the application to always get the
      most up to date data and metadata for the file.  However, due to
      the negative performance implications of this, such behavior is
      OPTIONAL.

   o  If the memory mapped file is not being modified on the server, and
      instead is just being read by an application via the memory mapped
      interface, the client will not see an updated time_access
      attribute.  However, in many operating environments, neither will
      any process running on the server.  Thus NFS clients are at no
      disadvantage with respect to local processes.

   o  If there is another client that is memory mapping the file, and if
      that client is holding a write delegation, the same set of issues
      as discussed in the previous two bullet items apply.  So, when a
      server does a CB_GETATTR to a file that the client has modified in
      its cache, the response from CB_GETATTR will not necessarily be
      accurate.  As discussed earlier, the client's obligation is to
      report that the file has been modified since the delegation was
      granted, not whether it has been modified again between successive
      CB_GETATTR calls, and the server MUST assume that any file the
      client has modified in cache has been modified again between
      successive CB_GETATTR calls.  Depending on the nature of the
      client's memory management system, this weak obligation may not be
      possible.  A client MAY return stale information in CB_GETATTR
      whenever the file is memory mapped.

   If a client has a read lock on a record, it can request an atomic
   upgrade of the lock to a write lock via the LOCK request by setting
   the type to WRITE_LT or WRITEW_LT.  If the server does not support
   atomic upgrade, it will return NFS4ERR_LOCK_NOTSUPP.  If the upgrade
   can be achieved without an existing conflict, the request will
   succeed.  Otherwise, the server will return either NFS4ERR_DENIED or
   NFS4ERR_DEADLOCK.  The error NFS4ERR_DEADLOCK is returned if the
   client issued the LOCK request with the type set to WRITEW_LT and the
   server has detected a deadlock.  The client should be prepared to
   receive such errors and if appropriate, report the error to the
   requesting application.

13.4.  Blocking Locks

   Some clients require the support of blocking locks.  NFSv4.1 does not
   provide a callback when a previously unavailable lock becomes
   available.  Clients thus have no choice but to continually poll for
   the lock.  This presents a fairness problem.  Two new lock types are
   added, READW and WRITEW, and are used to indicate to the server that
   the client is requesting a blocking lock.  The server should maintain
   an ordered list of pending blocking locks.  When the conflicting lock
   is released, the server may wait the lease period for the first
   waiting client to re-request the lock.  After the lease period
   expires the next waiting client request is
   allowed the lock.  Clients are required to poll at an interval
   sufficiently small that it is likely to acquire the lock in a timely
   manner.  The server is not required to maintain a list of pending
   blocked locks as it is used to increase fairness and not correct
   operation.  Because of the unordered nature of crash recovery,
   storing of lock state to stable storage would be required to
   guarantee ordered granting of blocking locks.

   Servers may also note the lock types and delay returning denial of
   the request to allow extra time for a conflicting lock to be
   released, allowing a successful return.  In this way, clients can
   avoid the burden of needlessly frequent polling for blocking locks.
   The server should take care in the length of delay in the event the
   client retransmits the request.

   If a server receives a blocking lock request, denies it, and then
   later receives a nonblocking request for the same lock, which is also
   denied, then it should remove the lock in question from its list of
   pending blocking locks.  Clients should use such a nonblocking
   request to indicate to the server that this is the last time they
   intend to poll for the lock, as may happen when the process
   requesting the lock is interrupted.  This is a courtesy to the
   server, to prevent it from unnecessarily waiting a lease period
   before granting other lock requests.  However, clients are not
   required to perform this courtesy, and servers must not depend on
   them doing so.  Also, clients must be prepared for the possibility
   that this final locking request will be accepted.

   o  The mixture of memory mapping and file locking on the same file is
      problematic.  Consider the following scenario, where a page size
      on each client is 8192 bytes.

      *  Client A memory maps first page (8192 bytes) of file X

      *  Client B memory maps first page (8192 bytes) of file X

      *  Client A write locks first 4096 bytes

      *  Client B write locks second 4096 bytes

      *  Client A, via a STORE instruction modifies part of its locked
         region.

      *  Simultaneous to client A, client B issues a STORE on part of
         its locked region.

   Here the challenge is for each client to resynchronize to get a
   correct view of the first page.  In many operating environments, the
   virtual memory management systems on each client only know a page is
   modified, not that a subset of the page corresponding to the
   respective lock regions has been modified.  So it is not possible for
   each client to do the right thing, which is to only write that
   portion of the page that is locked.  For example, if client A simply
   writes out the page, and then client B writes out the page, client
   A's data is lost.

   Moreover, if mandatory locking is enabled on the file, then we have a
   different problem.  When clients A and B issue the STORE
   instructions, the resulting page faults require a record lock on the
   entire page.  Each client then tries to extend their locked range to
   be the entire page, which results in
   a deadlock.  Communicating the NFS4ERR_DEADLOCK error to a STORE
   instruction is difficult at best.

   If a client is locking the entire memory mapped file, there is no
   problem with advisory or mandatory record locking, at least until the
   client unlocks a region in the middle of the file.

   Given the above issues the following are permitted:

   o  Clients and servers MAY deny memory mapping a file they know there
      are record locks for.

   o  Clients and servers MAY deny a record lock on a file they know is
      memory mapped.

   o  A client MAY deny memory mapping a file that it knows requires
      mandatory locking for I/O.  If mandatory locking is enabled after
      the file is opened and mapped, the client MAY deny the application
      further access to its mapped file.

13.5.  Lease Renewal

   The purpose of a lease is to allow a server to remove stale locks
   that are held by a client that has crashed or is otherwise
   unreachable.  It is not a mechanism for cache consistency and lease
   renewals may not be denied if the lease interval has not expired.

   Since each session is associated with a specific client, any
   operation issued on that session is an indication that the associated
   client is reachable.  When a request is issued for a given session,
   execution of a SEQUENCE operation will result in all leases for the
   associated client
   to be implicitly renewed.  This approach allows for low overhead
   lease renewal which scales well.  In the typical case no extra RPC
   calls are required for lease renewal and in the worst case one RPC is
   required every lease period, via a COMPOUND that consists solely of a
   single SEQUENCE operation.  The number of locks held by the client is
   not a factor since all state for the client is involved with the
   lease renewal action.

   Since all operations that create a new lease also renew existing
   leases, the server must maintain a common lease expiration time for
   all valid leases for a given client.  This lease time can then be
   easily updated upon implicit lease renewal actions.

9.8.  Name Caching

   The results of LOOKUP and READDIR operations may be cached to avoid
   the cost of subsequent LOOKUP operations.  Just as in the case of
   attribute caching, inconsistencies may arise among the various client
   caches.  To mitigate the effects of these inconsistencies and given
   the context of typical file system APIs, an upper time boundary is
   maintained on how long a client name cache entry can be kept without
   verifying that the entry has not been made invalid by a directory
   change operation performed by another client.

   When a client is not making changes to a directory for which there
   exist name cache entries, the client needs to periodically fetch
   attributes for that directory to ensure that it is not being
   modified.  After determining that no modification has occurred, the
   expiration time for the associated name cache entries may be updated
   to be
   the current time plus the name cache staleness bound.

   When a client is making changes to a given directory, it needs to
   determine whether there have been changes made to the directory by
   other clients.  It does this by using the change attribute as
   reported before and after the directory operation in the associated
   change_info4 value returned for the operation.  The server is able to
   communicate to the client whether the change_info4 data is provided
   atomically with respect to the directory operation.  If the change
   values are provided atomically, the client is then able to compare
   the pre-operation change value with the change value in the client's
   name cache.  If the comparison indicates that the directory was
   updated by another client, the name cache associated with the
   modified directory is purged from the client.  If the comparison
   indicates no modification, the name cache can be updated on the
   client to reflect the directory operation and the associated timeout
   extended.  The post-operation change value needs to be saved as the
   basis for future change_info4 comparisons.

13.6.  Crash Recovery

   The important requirement in crash recovery is that both the client
   and the server know when the other has failed.  Additionally, it is
   required that a client sees a consistent view of data across server
   restarts or reboots.  All READ and WRITE operations that may have
   been queued within the client or network buffers must wait until the
   client has successfully recovered the locks protecting the READ and
   WRITE operations.

13.6.1.  Client Failure and Recovery

   In the event that a client fails, the server may release the client's
   locks when the associated leases have expired.  Conflicting locks
   from another client may only be granted after this lease expiration.
   When a client has not failed and re-establishes its lease before
   expiration occurs, requests for conflicting locks will not be
   granted.

   To minimize client delay upon restart, lock requests are associated
   with an instance of the client by a client supplied verifier.  This
   verifier is part of the initial CREATE_CLIENTID call made by the
   client.  The server returns a clientid as a result of the
   CREATE_CLIENTID operation.  The client then confirms the use of the
   clientid by establishing a session associated with that clientid.
   All locks, including opens, byte-range locks, delegations, and
   layouts obtained by sessions using that clientid are associated with
   that clientid.

   Since the verifier will be changed by the client upon each
   initialization, the server can compare a new verifier to the verifier
   associated with currently held locks and determine that they do not
   match.  This signifies the client's new instantiation and subsequent
   loss of locking state.  As a result, the server is free to release
   all locks held which are associated with the old clientid which was
   derived from the old verifier.  At this point conflicting locks from
   other clients, kept waiting while the lease had not yet expired, can
   be granted.

   Note that the verifier must have the same uniqueness properties as
   the verifier for
   the COMMIT operation.

   As demonstrated by the scenario above, name caching requires that the
   client revalidate name cache data by inspecting the change attribute
   of a directory at the point when the name cache item was cached.
   This requires that the server update the change attribute for
   directories when the contents of the corresponding directory are
   modified.  For a client to use the change_info4 information
   appropriately and correctly, the server must report the pre and post
   operation change attribute values atomically.

13.6.2.  Server Failure and Recovery

   If the server loses locking state (usually as a result of a restart
   or reboot), it must allow clients time to discover this fact and re-
   establish the lost locking state.  The client must be able to re-
   establish the locking state without having the server deny valid
   requests because the server has granted conflicting access to another
   client.  Likewise, if there is a possibility that clients have not
   yet re-established their locking state for a file, the server must
   disallow READ and WRITE operations for that file.

   A client can determine that server failure (and thus loss of locking
   state) has occurred, when it receives one of two errors.  The
   NFS4ERR_STALE_STATEID error indicates a stateid invalidated by a
   reboot or restart.  The NFS4ERR_STALE_CLIENTID error indicates a
   clientid invalidated by reboot or restart.  When either of these is
   received, the client must establish a new clientid (see
   Section 13.1.1) and re-establish its locking state.
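
   As a rough illustration of the detection step described above, the
   following Python sketch classifies a reply status and triggers
   recovery.  It is not part of the protocol: the Status enumeration
   uses symbolic stand-ins rather than wire values, and the names
   server_lost_state, handle_reply and start_recovery are hypothetical.

```python
from enum import Enum, auto


class Status(Enum):
    """Symbolic stand-ins for a few NFSv4 status codes (wire values omitted)."""
    NFS4_OK = auto()
    NFS4ERR_STALE_STATEID = auto()
    NFS4ERR_STALE_CLIENTID = auto()
    NFS4ERR_GRACE = auto()


# The two errors that tell the client the server has lost locking state.
SERVER_RESTART_ERRORS = {
    Status.NFS4ERR_STALE_STATEID,
    Status.NFS4ERR_STALE_CLIENTID,
}


def server_lost_state(status):
    """True if the status indicates a server reboot or restart."""
    return status in SERVER_RESTART_ERRORS


def handle_reply(status, start_recovery):
    """Call start_recovery() (a hypothetical hook that would establish a
    new clientid and re-establish locking state) when a restart is seen.
    Returns True if recovery was started."""
    if server_lost_state(status):
        start_recovery()
        return True
    return False
```

   A caller would pass the status of each reply through handle_reply
   and stop issuing ordinary requests once it returns True.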

   When the server is unable to report the before and after values
   atomically with respect to the directory operation, the server must
   indicate that fact in the change_info4 return value.  When the
   information is not atomically reported, the client should not assume
   that other clients have not changed the directory.

   Once a session is established using the new clientid, the client will
   use reclaim-type locking requests (i.e., LOCK requests with reclaim
   set to true and OPEN operations with a claim type of CLAIM_PREVIOUS)
   to re-establish its locking state.  Once this is done, or if there is
   no such locking state to reclaim, the client does a RECLAIM_COMPLETE
   operation to indicate that it has reclaimed all of the locking state
   that it will reclaim.  Once a client does a RECLAIM_COMPLETE
   operation, it may attempt non-reclaim locking operations, although it
   may get NFS4ERR_GRACE errors on these until the
   period of special handling is over.

9.9.  Directory Caching

   The results of READDIR operations may be used to avoid subsequent
   READDIR operations.  Just as in the cases of attribute and name
   caching, inconsistencies may arise among the various client caches.
   To mitigate the effects of these inconsistencies, and given the
   context of typical file system APIs, the following rules should be
   followed:

   o  Cached READDIR information for a directory which is not obtained
      in a single READDIR operation must always be a consistent snapshot
      of directory contents.  This is determined by using a GETATTR
      before the first READDIR and after the last READDIR that
      contributes to the cache.

   o  An upper time boundary is maintained to indicate the length of
      time a directory cache entry is considered valid before the client
      must revalidate the cached information.

   The period of special handling of locking and READs and WRITEs is
   referred to as the "grace period".  During the grace period, clients
   recover locks and the associated state using reclaim-type locking
   requests.  During this period, the server must reject READ and WRITE
   operations and non-reclaim locking requests (i.e., other LOCK and
   OPEN operations) with an error of NFS4ERR_GRACE, unless it is able to
   guarantee that these may be done safely, as described below.

   The grace period may last until all clients who are known to possibly
   have had locks have done a RECLAIM_COMPLETE operation, indicating
   that they have finished reclaiming the locks they held before the
   server reboot.  The server is assumed to maintain in stable storage a
   list of clients who may have such locks.  The server may also
   terminate the grace period before all clients have done
   RECLAIM_COMPLETE.  The server SHOULD NOT terminate the grace period
   before a time equal to the lease period in order to give clients an
   opportunity to find out about the server reboot.  Some additional
   time in order to allow time to establish a new clientid and session
   and to effect lock reclaims may be added.
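
   One possible reading of the rules for ending the grace period can be
   sketched in Python as follows.  This is an illustration under stated
   assumptions, not a normative algorithm: the function name, its
   parameters, and the choice to end early once every client on the
   stable-storage list has done RECLAIM_COMPLETE are assumptions of the
   sketch.

```python
def grace_period_may_end(elapsed, lease_period, known_clients,
                         completed_clients, extra_time=0.0):
    """Decide whether a server following this sketch's policy may end
    the grace period.

    elapsed           -- seconds since the server restarted
    lease_period      -- lease period in seconds
    known_clients     -- ids recorded in stable storage as possibly
                         holding locks before the restart
    completed_clients -- ids that have done RECLAIM_COMPLETE
    extra_time        -- optional slack for clientid and session setup
                         and for the reclaims themselves
    """
    # Every client that might reclaim has said it is finished.
    if set(known_clients) <= set(completed_clients):
        return True
    # Otherwise wait at least one lease period (plus any slack) so that
    # clients have a chance to notice the restart.
    return elapsed >= lease_period + extra_time
```

   For example, with a 90 second lease and 10 seconds of slack, a
   server with one client still outstanding would keep the grace period
   open until 100 seconds have elapsed.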

   The revalidation technique parallels that discussed in the case of
   name caching.  When the client is not changing the directory in
   question, checking the change attribute of the directory with GETATTR
   is adequate.  The lifetime of the cache entry can be extended at
   these checkpoints.  When a client is modifying the directory, the
   client needs to use the change_info4 data to determine whether there
   are other clients modifying the directory.  If it is determined that
   no other client modifications are occurring, the client may update
   its directory cache to reflect its own changes.

   If the server can reliably determine that granting a non-reclaim
   request will not conflict with reclamation of locks by other clients,
   the NFS4ERR_GRACE error does not have to be returned even within the
   grace period, although NFS4ERR_GRACE must always be returned to
   clients attempting a non-reclaim lock request before doing their own
   RECLAIM_COMPLETE.  For the server to be able to service READ and
   WRITE operations during the grace period, it must again be able to
   guarantee that no possible conflict could arise between a potential
   reclaim locking request and the READ or WRITE operation.  If the
   server is unable to offer that guarantee, the NFS4ERR_GRACE error
   must be returned to the client.

   For a server to provide simple, valid handling during the grace
   period, the easiest method is to simply reject all non-reclaim
   locking requests and READ and WRITE operations by returning the
   NFS4ERR_GRACE error.  However, a server may keep information about
   granted locks in stable storage.  With this information, the server
   could determine if a regular lock or READ or WRITE operation can be
   safely processed.

   For example, if the server maintained on stable storage summary
   information on whether mandatory locks exist, either mandatory byte-
   range locks, or share reservations specifying deny modes, many
   requests could be allowed during the grace period.  If it is known
   that no such share reservations exist, OPEN requests that do not
   specify deny modes may be safely granted.  If, in addition, it is
   known that no mandatory byte-range locks exist, either through
   information stored on stable storage or simply because the server
   does not support such locks, READ and WRITE requests may be safely
   processed during the grace period.
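
   The summary-information example above might be sketched in Python as
   follows.  The GraceSummary record, its field names, and the
   allowed_during_grace function are hypothetical; a return value of
   False stands for replying with NFS4ERR_GRACE.  The defaults are
   deliberately conservative, so a server with no summary information
   rejects everything except reclaims.

```python
from dataclasses import dataclass


@dataclass
class GraceSummary:
    """Hypothetical summary a server might keep on stable storage."""
    deny_share_reservations_exist: bool = True
    mandatory_byte_range_locks_exist: bool = True


def allowed_during_grace(op, summary, *, is_reclaim=False,
                         open_has_deny=False,
                         client_reclaim_complete=False):
    """Return True if the request may be processed inside the grace
    period, False if NFS4ERR_GRACE should be returned."""
    if is_reclaim:
        # Reclaim-type requests are the purpose of the grace period.
        return True
    if op in ("READ", "WRITE"):
        # Safe only if neither kind of mandatory lock can exist.
        return (not summary.deny_share_reservations_exist
                and not summary.mandatory_byte_range_locks_exist)
    if not client_reclaim_complete:
        # Non-reclaim lock requests are refused until the client has
        # done its own RECLAIM_COMPLETE.
        return False
    if op == "OPEN":
        # OPENs without deny modes are safe when no deny-mode share
        # reservations were held before the restart.
        return (not summary.deny_share_reservations_exist
                and not open_has_deny)
    # Other non-reclaim locking requests (e.g., LOCK) are refused.
    return False
```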

   To reiterate, for a server that allows non-reclaim lock and I/O
   requests to be processed during the grace period, it MUST determine
   that no lock subsequently reclaimed will be rejected and that no lock
   subsequently reclaimed would have prevented any I/O operation
   processed during the grace period.

   Clients should be prepared for the return of NFS4ERR_GRACE errors for
   non-reclaim lock and I/O requests.  In this case the client should
   employ a retry mechanism for the request.  A delay (on the order of
   several seconds) between retries should be used to avoid overwhelming
   the server.  Further discussion of the general issue is included in
   [Floyd].  The client must account for the server that is able to
   perform I/O and non-reclaim locking requests within the grace period
   as well as those that cannot do so.

   A reclaim-type locking request outside the server's grace period can
   only succeed if the server can guarantee that no conflicting lock or
   I/O request has been granted since reboot or restart.

   A server may, upon restart, establish a new value for the lease
   period.  Therefore, clients should, once a new clientid is
   established, refetch the lease_time attribute and use it as the basis
   for lease renewal for the lease associated with that server.
   However, the server must establish, for this restart event, a grace
   period at least as long as the lease period for the previous server
   instantiation.  This allows the client state obtained during the
   previous server instance to

   As demonstrated previously, directory caching requires that the
   client revalidate directory cache data by inspecting the change
   attribute of a directory at the point when the directory was cached.
   This requires that the server update the change attribute for
   directories when the contents of the corresponding directory are
   modified.  For a client to use the change_info4 information
   appropriately and correctly, the server must report the pre and post
   operation change attribute values atomically.  When the server is
   unable to report the before and after values atomically with respect
   to the directory operation, the server must indicate that fact in the
   change_info4 return value.  When the information is not atomically
   reported, the client should not assume that other clients have not
   changed the directory.

10.  Security Negotiation

   The NFSv4.0 specification contains three oversights and ambiguities
   with respect to the SECINFO operation.

   First, it is impossible to use the SECINFO operation to