NFSv4                                                     T. Haynes, Ed.
Internet-Draft                                                    NetApp
Obsoletes: 3530 (if approved)                             D. Noveck, Ed.
Intended status: Standards Track                                     EMC
Expires: November 9, 2013                                   May 08, February 17, 2014                               August 16, 2013

              Network File System (NFS) Version 4 Protocol
                   draft-ietf-nfsv4-rfc3530bis-26.txt
                   draft-ietf-nfsv4-rfc3530bis-27.txt

Abstract

   The Network File System (NFS) version 4 is a distributed filesystem file system
   protocol which owes builds on the heritage to of NFS protocol version 2, RFC
   1094, and version 3, RFC 1813.  Unlike earlier versions, the NFS
   version 4 protocol supports traditional file access while integrating
   support for file locking and the mount protocol.  In addition,
   support for strong security (and its negotiation), compound
   operations, client caching, and internationalization have been added.
   Of course, attention has been applied to making NFS version 4 operate
   well in an Internet environment.

   This document, together with the companion XDR description document,
   RFCNFSv4XDR, obsoletes RFC 3530 as the definition of the NFS version
   4 protocol.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on November 9, 2013. February 17, 2014.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008.  The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   9   8
     1.1.   NFS Version 4 Goals  . . . . . . . . . . . . . . . . . .   9   8
     1.2.   Inconsistencies of this Document with   Definitions in the companion document NFS Version 4
            Protocol are Authoritative . . . . . . . . . . . .  10 . . .   8
     1.3.   Overview of NFSv4 Features . . . . . . . . . . . . . . .  10   9
       1.3.1.   RPC and Security . . . . . . . . . . . . . . . . . .  10   9
       1.3.2.   Procedure and Operation Structure  . . . . . . . . .  10   9
       1.3.3.   Filesystem Model . . . . . . . . . . . . . . . . . .  11  10
       1.3.4.   OPEN and CLOSE . . . . . . . . . . . . . . . . . . .  13  12
       1.3.5.   File Locking . . . . . . . . . . . . . . . . . . . .  13  12
       1.3.6.   Client Caching and Delegation  . . . . . . . . . . .  14  12
     1.4.   General Definitions  . . . . . . . . . . . . . . . . . .  14  13
     1.5.   Changes since RFC 3530 . . . . . . . . . . . . . . . . .  16  15
     1.6.   Changes since RFC 3010 . . . . . . . . . . . . . . . . .  17  16
   2.  Protocol Data Types . . . . . . . . . . . . . . . . . . . . .  18  17
     2.1.   Basic Data Types . . . . . . . . . . . . . . . . . . . .  18  17
     2.2.   Structured Data Types  . . . . . . . . . . . . . . . . .  20  19
   3.  RPC and Security Flavor . . . . . . . . . . . . . . . . . . .  24  23
     3.1.   Ports and Transports . . . . . . . . . . . . . . . . . .  25  23
       3.1.1.   Client Retransmission Behavior . . . . . . . . . . .  26  24
     3.2.   Security Flavors . . . . . . . . . . . . . . . . . . . .  26  25
       3.2.1.   Security mechanisms for NFSv4  . . . . . . . . . . .  27  25
     3.3.   Security Negotiation . . . . . . . . . . . . . . . . . .  28  26
       3.3.1.   SECINFO  . . . . . . . . . . . . . . . . . . . . . .  28  27
       3.3.2.   Security Error . . . . . . . . . . . . . . . . . . .  29  27
       3.3.3.   Callback RPC Authentication  . . . . . . . . . . . .  29  27
   4.  Filehandles . . . . . . . . . . . . . . . . . . . . . . . . .  30  28
     4.1.   Obtaining the First Filehandle . . . . . . . . . . . . .  30  28
       4.1.1.   Root Filehandle  . . . . . . . . . . . . . . . . . .  31  29
       4.1.2.   Public Filehandle  . . . . . . . . . . . . . . . . .  31  29
     4.2.   Filehandle Types . . . . . . . . . . . . . . . . . . . .  31  30
       4.2.1.   General Properties of a Filehandle . . . . . . . . .  32  30
       4.2.2.   Persistent Filehandle  . . . . . . . . . . . . . . .  32  31
       4.2.3.   Volatile Filehandle  . . . . . . . . . . . . . . . .  33  31
       4.2.4.   One Method of Constructing a Volatile Filehandle . .  34  32
     4.3.   Client Recovery from Filehandle Expiration . . . . . . .  34  33
   5.  File  Attributes  . . . . . . . . . . . . . . . . . . . . . . .  35 . .  34
     5.1.   REQUIRED Attributes  . . . . . . . . . . . . . . . . . .  36  35
     5.2.   RECOMMENDED Attributes . . . . . . . . . . . . . . . . .  37  35
     5.3.   Named Attributes . . . . . . . . . . . . . . . . . . . .  37  36
     5.4.   Classification of Attributes . . . . . . . . . . . . . .  39  37
     5.5.   Set-Only and Get-Only Attributes . . . . . . . . . . . .  39  38
     5.6.   REQUIRED Attributes - List and Definition References . .  40  38
     5.7.   RECOMMENDED Attributes - List and Definition
            References . . . . . . . . . . . . . . . . . . . . . . .  41  39
     5.8.   Attribute Definitions  . . . . . . . . . . . . . . . . .  42  40
       5.8.1.   Definitions of REQUIRED Attributes . . . . . . . . .  42  40
       5.8.2.   Definitions of Uncategorized RECOMMENDED
                Attributes . . . . . . . . . . . . . . . . . . . . .  44  42
     5.9.   Interpreting owner and owner_group . . . . . . . . . . .  50  48
     5.10.  Character Case Attributes  . . . . . . . . . . . . . . .  53  51
   6.  Access Control Attributes . . . . . . . . . . . . . . . . . .  53  51
     6.1.   Goals  . . . . . . . . . . . . . . . . . . . . . . . . .  53  51
     6.2.   File Attributes Discussion . . . . . . . . . . . . . . .  54  52
       6.2.1.   Attribute 12: acl  . . . . . . . . . . . . . . . . .  54  52
       6.2.2.   Attribute 33: mode . . . . . . . . . . . . . . . . .  68  67
     6.3.   Common Methods . . . . . . . . . . . . . . . . . . . . .  69  67
       6.3.1.   Interpreting an ACL  . . . . . . . . . . . . . . . .  69  67
       6.3.2.   Computing a Mode Attribute from an ACL . . . . . . .  70  68
     6.4.   Requirements . . . . . . . . . . . . . . . . . . . . . .  71  69
       6.4.1.   Setting the mode and/or ACL Attributes . . . . . . .  72  70
       6.4.2.   Retrieving the mode and/or ACL Attributes  . . . . .  73  71
       6.4.3.   Creating New Objects . . . . . . . . . . . . . . . .  73  71
   7.  Multi-Server Namespace  NFS Server Name Space . . . . . . . . . . . . . . . . . . .  75 .  73
     7.1.   Location Attributes   Server Exports . . . . . . . . . . . . . . . . . .  75 . . .  73
     7.2.   File System Presence or Absence   Browsing Exports . . . . . . . . . . . .  76
     7.3.   Getting Attributes for an Absent File System . . . . . .  77
       7.3.1.   GETATTR Within an Absent File System . .  73
     7.3.   Server Pseudo Filesystem . . . . . .  77
       7.3.2.   READDIR and Absent File Systems . . . . . . . . . .  78  74
     7.4.   Uses of Location Information .   Multiple Roots . . . . . . . . . . . . .  78
       7.4.1.   File System Replication . . . . . . . .  74
     7.5.   Filehandle Volatility  . . . . . .  79
       7.4.2.   File System Migration . . . . . . . . . . .  75
     7.6.   Exported Root  . . . .  80
       7.4.3.   Referrals . . . . . . . . . . . . . . . . .  75
     7.7.   Mount Point Crossing . . . .  81
     7.5.   Location Entries and Server Identity . . . . . . . . . .  81
     7.6.   Additional Client-Side Considerations . . . .  75
     7.8.   Security Policy and Name Space Presentation  . . . . .  82
     7.7.   Effecting File System Transitions .  76
   8.  Multi-Server Namespace  . . . . . . . . . .  83
       7.7.1.   File System Transitions and Simultaneous Access . .  84
       7.7.2.   Filehandles and File System Transitions . . . . . .  85
       7.7.3.   Fileids and File System Transitions .  76
     8.1.   Location Attributes  . . . . . . .  85
       7.7.4.   Fsids and File System Transitions . . . . . . . . .  86
       7.7.5.   The Change Attribute and File System Transitions . .  87
       7.7.6.   Lock State and  77
     8.2.   File System Transitions Presence or Absence  . . . . . . .  87
       7.7.7.   Write Verifiers and File System Transitions . . . .  89
       7.7.8.   Readdir Cookies and Verifiers and .  77
     8.3.   Getting Attributes for an Absent File System
                Transitions . . . . . .  78
       8.3.1.   GETATTR Within an Absent File System . . . . . . . .  78
       8.3.2.   READDIR and Absent File Systems  . . . . . .  89
       7.7.9.   File System Data and File System Transitions . . . .  90
     7.8.   Effecting File System Referrals  79
     8.4.   Uses of Location Information . . . . . . . . . . . .  91
       7.8.1.   Referral Example (LOOKUP) . .  80
       8.4.1.   File System Replication  . . . . . . . . . . .  91
       7.8.2.   Referral Example (READDIR) . . .  81
       8.4.2.   File System Migration  . . . . . . . . . .  95
     7.9.   The Attribute fs_locations . . . . .  81
       8.4.3.   Referrals  . . . . . . . . . .  98
       7.9.1.   Inferring Transition Modes . . . . . . . . . . .  82
     8.5.   Location Entries and Server Identity . .  99
   8.  NFS Server Name Space . . . . . . . .  83
     8.6.   Additional Client-Side Considerations  . . . . . . . . .  83
     8.7.   Effecting File System Referrals  . . . 101
     8.1.   Server Exports . . . . . . . . .  84
       8.7.1.   Referral Example (LOOKUP)  . . . . . . . . . . . . 101
     8.2.   Browsing Exports .  85
       8.7.2.   Referral Example (READDIR) . . . . . . . . . . . . .  89
     8.8.   The Attribute fs_locations . . . . . . 101
     8.3.   Server Pseudo Filesystem . . . . . . . . .  91
       8.8.1.   Inferring Transition Modes . . . . . . . 101
     8.4.   Multiple Roots . . . . . .  93
   9.  File Locking and Share Reservations . . . . . . . . . . . . .  94
     9.1.   Opens and Byte-Range Locks . . 102
     8.5.   Filehandle Volatility . . . . . . . . . . . . .  95
       9.1.1.   Client ID  . . . . 102
     8.6.   Exported Root . . . . . . . . . . . . . . . . .  96
       9.1.2.   Server Release of Client ID  . . . . 102
     8.7.   Mount Point Crossing . . . . . . . .  99
       9.1.3.   Stateid Definition . . . . . . . . . . 103
     8.8.   Security Policy and Name Space Presentation . . . . . . 103
   9.  File Locking and Share Reservations .  99
       9.1.4.   lock-owner . . . . . . . . . . . . 104
     9.1.   Opens and Byte-Range Locks . . . . . . . . . 105
       9.1.5.   Use of the Stateid and Locking . . . . . . 105
       9.1.1.   Client ID . . . . . 106
       9.1.6.   Sequencing of Lock Requests  . . . . . . . . . . . . . . . . 105
       9.1.2.   Server Release of Client ID  . . . . . . . . . . . . 108
       9.1.3.   Stateid Definition . 108
       9.1.7.   Recovery from Replayed Requests  . . . . . . . . . . 109
       9.1.8.   Interactions of multiple sequence values . . . . . . 109
       9.1.4.   lock-owner . . . . . . . . . . . . . . . . . . . . . 115
       9.1.5.   Use of the Stateid and Locking . . . . . . . . . . . 116
       9.1.6.   Sequencing of Lock Requests  . . . . . . . . . . . . 118
       9.1.7.   Recovery from Replayed Requests  . . . . . . . . . . 119
       9.1.8.   Interactions of multiple sequence values . . . . . . 119
       9.1.9.   Releasing state-owner State  . . . . . . . . . . . . 120 110
       9.1.10.  Use of Open Confirmation . . . . . . . . . . . . . . 121 111
     9.2.   Lock Ranges  . . . . . . . . . . . . . . . . . . . . . . 122 112
     9.3.   Upgrading and Downgrading Locks  . . . . . . . . . . . . 122 113
     9.4.   Blocking Locks . . . . . . . . . . . . . . . . . . . . . 123 113
     9.5.   Lease Renewal  . . . . . . . . . . . . . . . . . . . . . 124 114
     9.6.   Crash Recovery . . . . . . . . . . . . . . . . . . . . . 125 115
       9.6.1.   Client Failure and Recovery  . . . . . . . . . . . . 125 115
       9.6.2.   Server Failure and Recovery  . . . . . . . . . . . . 125 116
       9.6.3.   Network Partitions and Recovery  . . . . . . . . . . 127 117
     9.7.   Recovery from a Lock Request Timeout or Abort  . . . . . 135 125
     9.8.   Server Revocation of Locks . . . . . . . . . . . . . . . 135 125
     9.9.   Share Reservations . . . . . . . . . . . . . . . . . . . 136 127
     9.10.  OPEN/CLOSE Operations  . . . . . . . . . . . . . . . . . 137 127
       9.10.1.  Close and Retention of State Information . . . . . . 138 128
     9.11.  Open Upgrade and Downgrade . . . . . . . . . . . . . . . 139 129
     9.12.  Short and Long Leases  . . . . . . . . . . . . . . . . . 139 130
     9.13.  Clocks, Propagation Delay, and Calculating Lease
            Expiration . . . . . . . . . . . . . . . . . . . . . . . 140 130
     9.14.  Migration, Replication and State . . . . . . . . . . . . 140 131
       9.14.1.  Migration and State  . . . . . . . . . . . . . . . . 141 131
       9.14.2.  Replication and State  . . . . . . . . . . . . . . . 142 132
       9.14.3.  Notification of Migrated Lease . . . . . . . . . . . 142 132
       9.14.4.  Migration and the Lease_time Attribute . . . . . . . 143 133
   10. Client-Side Caching . . . . . . . . . . . . . . . . . . . . . 143 134
     10.1.  Performance Challenges for Client-Side Caching . . . . . 144 134
     10.2.  Delegation and Callbacks . . . . . . . . . . . . . . . . 145 135
       10.2.1.  Delegation Recovery  . . . . . . . . . . . . . . . . 147 137
     10.3.  Data Caching . . . . . . . . . . . . . . . . . . . . . . 151 141
       10.3.1.  Data Caching and OPENs . . . . . . . . . . . . . . . 151 142
       10.3.2.  Data Caching and File Locking  . . . . . . . . . . . 152 143
       10.3.3.  Data Caching and Mandatory File Locking  . . . . . . 154 144
       10.3.4.  Data Caching and File Identity . . . . . . . . . . . 154 145
     10.4.  Open Delegation  . . . . . . . . . . . . . . . . . . . . 155 146
       10.4.1.  Open Delegation and Data Caching . . . . . . . . . . 158 148
       10.4.2.  Open Delegation and File Locks . . . . . . . . . . . 159 149
       10.4.3.  Handling of CB_GETATTR . . . . . . . . . . . . . . . 159 150
       10.4.4.  Recall of Open Delegation  . . . . . . . . . . . . . 162 153
       10.4.5.  OPEN Delegation Race with CB_RECALL  . . . . . . . . 164 155
       10.4.6.  Clients that Fail to Honor Delegation Recalls  . . . 165 155
       10.4.7.  Delegation Revocation  . . . . . . . . . . . . . . . 166 156
     10.5.  Data Caching and Revocation  . . . . . . . . . . . . . . 166 157
       10.5.1.  Revocation Recovery for Write Open Delegation  . . . 167 157

     10.6.  Attribute Caching  . . . . . . . . . . . . . . . . . . . 168 158
     10.7.  Data and Metadata Caching and Memory Mapped Files  . . . 170 160
     10.8.  Name Caching . . . . . . . . . . . . . . . . . . . . . . 172 162
     10.9.  Directory Caching  . . . . . . . . . . . . . . . . . . . 173 163
   11. Minor Versioning  . . . . . . . . . . . . . . . . . . . . . . 174 164
   12. Internationalization  . . . . . . . . . . . . . . . . . . . . 176 167
     12.1.  Use of UTF-8 . . . . . . . . . . . . . . . . . .  Introduction . . . . 177
       12.1.1.  Relation to Stringprep . . . . . . . . . . . . . . . 177
       12.1.2.  Normalization, Equivalence, and Confusability . . . 178 167
     12.2.  String Type Overview . . . . . . . . . . . . . . . . . . 181
       12.2.1.  Overall String Class Divisions . . . . . Encoding  . . . . . . 181
       12.2.2.  Divisions by Typedef Parent types . . . . . . . . . 182
       12.2.3.  Individual Types and Their Handling  . . . . . . . . 183 167
     12.3.  Errors Related to Strings  . . . . . . . . . . . . . . . 184
     12.4.  Types with Pre-processing to Resolve Mixture Issues  . . 185
       12.4.1.  Processing of Principal Strings  .  Normalization  . . . . . . . . . 185
       12.4.2.  Processing of Server Id Strings . . . . . . . . . . 186
     12.5.  String Types without Internationalization Processing . . 186
     12.6. 168
     12.4.  Types with Processing Defined by Other Internet Areas  . 187
     12.7.  String Types with NFS-specific Processing  . . . . . . . 188
       12.7.1.  Handling of File Name Components . . . . . . . . . . 188
       12.7.2.  Processing of Link Text  . . . . . . 168
     12.5.  UTF-8 Related Errors . . . . . . . . 197
       12.7.3.  Processing of Principal Prefixes . . . . . . . . . . 198 169
   13. Error Values  . . . . . . . . . . . . . . . . . . . . . . . . 199 170
     13.1.  Error Definitions  . . . . . . . . . . . . . . . . . . . 199 170
       13.1.1.  General Errors . . . . . . . . . . . . . . . . . . . 201 172
       13.1.2.  Filehandle Errors  . . . . . . . . . . . . . . . . . 202 173
       13.1.3.  Compound Structure Errors  . . . . . . . . . . . . . 204 174
       13.1.4.  File System Errors . . . . . . . . . . . . . . . . . 204 175
       13.1.5.  State Management Errors  . . . . . . . . . . . . . . 206 177
       13.1.6.  Security Errors  . . . . . . . . . . . . . . . . . . 207 178
       13.1.7.  Name Errors  . . . . . . . . . . . . . . . . . . . . 208 179
       13.1.8.  Locking Errors . . . . . . . . . . . . . . . . . . . 209 179
       13.1.9.  Reclaim Errors . . . . . . . . . . . . . . . . . . . 210 181
       13.1.10. Client Management Errors . . . . . . . . . . . . . . 211 181
       13.1.11. Attribute Handling Errors  . . . . . . . . . . . . . 211 182
     13.2.  Operations and their valid errors  . . . . . . . . . . . 212 182
     13.3.  Callback operations and their valid errors . . . . . . . 219 189
     13.4.  Errors and the operations that use them  . . . . . . . . 219 190
   14. NFSv4 Requests  . . . . . . . . . . . . . . . . . . . . . . . 224 194
     14.1.  Compound Procedure . . . . . . . . . . . . . . . . . . . 224 195
     14.2.  Evaluation of a Compound Request . . . . . . . . . . . . 225 195
     14.3.  Synchronous Modifying Operations . . . . . . . . . . . . 226 196
     14.4.  Operation Values . . . . . . . . . . . . . . . . . . . . 226 196
   15. NFSv4 Procedures  . . . . . . . . . . . . . . . . . . . . . . 226 197
     15.1.  Procedure 0: NULL - No Operation . . . . . . . . . . . . 226 197
     15.2.  Procedure 1: COMPOUND - Compound Operations  . . . . . . 227 197
     15.3.  Operation 3: ACCESS - Check Access Rights  . . . . . . . 230 201
     15.4.  Operation 4: CLOSE - Close File  . . . . . . . . . . . . 233 204
     15.5.  Operation 5: COMMIT - Commit Cached Data . . . . . . . . 234 205
     15.6.  Operation 6: CREATE - Create a Non-Regular File Object . 237 207
     15.7.  Operation 7: DELEGPURGE - Purge Delegations Awaiting
            Recovery . . . . . . . . . . . . . . . . . . . . . . . . 239 210
     15.8.  Operation 8: DELEGRETURN - Return Delegation . . . . . . 241 211
     15.9.  Operation 9: GETATTR - Get Attributes  . . . . . . . . . 241 212
     15.10. Operation 10: GETFH - Get Current Filehandle . . . . . . 243 214
     15.11. Operation 11: LINK - Create Link to a File . . . . . . . 244 215
     15.12. Operation 12: LOCK - Create Lock . . . . . . . . . . . . 246 216
     15.13. Operation 13: LOCKT - Test For Lock  . . . . . . . . . . 250 220
     15.14. Operation 14: LOCKU - Unlock File  . . . . . . . . . . . 252 222
     15.15. Operation 15: LOOKUP - Lookup Filename . . . . . . . . . 253 223
     15.16. Operation 16: LOOKUPP - Lookup Parent Directory  . . . . 255 225
     15.17. Operation 17: NVERIFY - Verify Difference in
            Attributes . . . . . . . . . . . . . . . . . . . . . . . 256 226
     15.18. Operation 18: OPEN - Open a Regular File . . . . . . . . 257 227
     15.19. Operation 19: OPENATTR - Open Named Attribute
            Directory  . . . . . . . . . . . . . . . . . . . . . . . 267 237
     15.20. Operation 20: OPEN_CONFIRM - Confirm Open  . . . . . . . 268 238
     15.21. Operation 21: OPEN_DOWNGRADE - Reduce Open File Access . 270 240
     15.22. Operation 22: PUTFH - Set Current Filehandle . . . . . . 271 241
     15.23. Operation 23: PUTPUBFH - Set Public Filehandle . . . . . 272 242
     15.24. Operation 24: PUTROOTFH - Set Root Filehandle  . . . . . 273 243
     15.25. Operation 25: READ - Read from File  . . . . . . . . . . 274 244
     15.26. Operation 26: READDIR - Read Directory . . . . . . . . . 276 246
     15.27. Operation 27: READLINK - Read Symbolic Link  . . . . . . 280 250
     15.28. Operation 28: REMOVE - Remove Filesystem Object  . . . . 281 251
     15.29. Operation 29: RENAME - Rename Directory Entry  . . . . . 283 253
     15.30. Operation 30: RENEW - Renew a Lease  . . . . . . . . . . 285 255
     15.31. Operation 31: RESTOREFH - Restore Saved Filehandle . . . 286 256
     15.32. Operation 32: SAVEFH - Save Current Filehandle . . . . . 287 257
     15.33. Operation 33: SECINFO - Obtain Available Security  . . . 288 258
     15.34. Operation 34: SETATTR - Set Attributes . . . . . . . . . 292 262
     15.35. Operation 35: SETCLIENTID - Negotiate Client ID  . . . . 294 264
     15.36. Operation 36: SETCLIENTID_CONFIRM - Confirm Client ID  . 298 268
     15.37. Operation 37: VERIFY - Verify Same Attributes  . . . . . 301 271
     15.38. Operation 38: WRITE - Write to File  . . . . . . . . . . 303 273
     15.39. Operation 39: RELEASE_LOCKOWNER - Release Lockowner
            State  . . . . . . . . . . . . . . . . . . . . . . . . . 307 277
     15.40. Operation 10044: ILLEGAL - Illegal operation . . . . . . 308 278
   16. NFSv4 Callback Procedures . . . . . . . . . . . . . . . . . . 309 279
     16.1.  Procedure 0: CB_NULL - No Operation  . . . . . . . . . . 309 279
     16.2.  Procedure 1: CB_COMPOUND - Compound Operations . . . . . 309 279
       16.2.6.  Operation 3: CB_GETATTR - Get Attributes . . . . . . 311 281
       16.2.7.  Operation 4: CB_RECALL - Recall an Open Delegation . 312 282
       16.2.8.  Operation 10044: CB_ILLEGAL - Illegal Callback
                Operation  . . . . . . . . . . . . . . . . . . . . . 313 283
   17. Security Considerations . . . . . . . . . . . . . . . . . . . 314 284
   18. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 316 286
     18.1.  Named Attribute Definitions  . . . . . . . . . . . . . . 316 286
       18.1.1.  Initial Registry . . . . . . . . . . . . . . . . . . 317 287
       18.1.2.  Updating Registrations . . . . . . . . . . . . . . . 317 287
   19. References  . . . . . . . . . . . . . . . . . . . . . . . . . 317 287
     19.1.  Normative References . . . . . . . . . . . . . . . . . . 317 287
     19.2.  Informative References . . . . . . . . . . . . . . . . . 318 288
   Appendix A.  Acknowledgments  . . . . . . . . . . . . . . . . . . 321 291
   Appendix B.  RFC Editor Notes . . . . . . . . . . . . . . . . . . 322 292
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . . 322 292

1.  Introduction

1.1.  NFS Version 4 Goals

   The Network Filesystem version 4 (NFSv4) protocol is a further
   revision of the NFS protocol defined already by versions 2 [RFC1094]
   and 3 [RFC1813].  It retains the essential characteristics of
   previous versions: design for easy recovery, independent of transport
   protocols, operating systems and filesystems, file systems, simplicity, and good
   performance.  The NFSv4 revision has the following goals:

   o  Improved access and good performance on the Internet.

      The protocol is designed to transit firewalls easily, perform well
      where latency is high and bandwidth is low, and scale to very
      large numbers of clients per server.

   o  Strong security with negotiation built into the protocol.

      The protocol builds on the work of the ONCRPC Open Network Computing
      (ONC) Remote Procedure Call (RPC) working group in supporting the
      RPCSEC_GSS protocol. protocol (see both [RFC2203] and [RFC5403]).
      Additionally, the NFS version 4 protocol provides a mechanism to
      allow clients and servers the ability to negotiate security and
      require clients and servers to support a minimal set of security
      schemes.

   o  Good cross-platform interoperability.

      The protocol features a filesystem file system model that provides a useful,
      common set of features that does not unduly favor one filesystem file system
      or operating system over another.

   o  Designed for protocol extensions.

      The protocol is designed to accept standard extensions that do not
      compromise backward compatibility.

   This document, together with the companion XDR description document
   [I-D.ietf-nfsv4-rfc3530bis-dot-x], obsoletes RFC 3530 [RFC3530] as
   the authoritative document describing NFSv4.  It does not introduce
   any over-the-wire protocol changes, in the sense that previously
   valid requests requests remain valid.  However, some requests
   previously defined as invalid, although not generally rejected, are
   now explicitly allowed, in that internationalization handling has
   been generalized and liberalized.

1.2.  Inconsistencies of this Document with  Definitions in the companion document NFS Version 4 Protocol are
      Authoritative

   [I-D.ietf-nfsv4-rfc3530bis-dot-x], NFS Version 4 Protocol, contains
   the definitions in XDR description language of the constructs used by
   the protocol.  Inside this document, several of the constructs are
   reproduced for purposes of explanation.  The reader is warned of the
   possibility of errors in the reproduced constructs outside of
   [I-D.ietf-nfsv4-rfc3530bis-dot-x].  For any part of the document that
   is inconsistent with [I-D.ietf-nfsv4-rfc3530bis-dot-x],
   [I-D.ietf-nfsv4-rfc3530bis-dot-x] is to be considered authoritative.

1.3.  Overview of NFSv4 Features

   To provide a reasonable context for the reader, the major features of
   NFSv4 protocol will be reviewed in brief.  This will be done to
   provide an appropriate context for both the reader who is familiar
   with the previous versions of the NFS protocol and the reader that who is
   new to the NFS protocols.  For the reader new to the NFS protocols,
   some fundamental knowledge is still expected.  The reader should be
   familiar with the XDR and RPC protocols as described in [RFC5531] and
   [RFC4506].  A basic knowledge of filesystems file systems and distributed
   filesystems file
   systems is expected as well.

1.3.1.  RPC and Security

   As with previous versions of NFS, the External Data Representation
   (XDR) and Remote Procedure Call (RPC) RPC mechanisms used for the NFSv4 protocol are those
   defined in [RFC5531] and [RFC4506].  To meet end to end security
   requirements, the RPCSEC_GSS framework (both version 1 in [RFC2203]
   and version 2 in [RFC5403]) will be used to extend the basic RPC
   security.  With the use of RPCSEC_GSS, various mechanisms can be
   provided to offer authentication, integrity, and privacy to the NFS
   version 4 protocol.  Kerberos V5 will be used as described in
   [RFC4121] to provide one security framework.  With the use of
   RPCSEC_GSS, other mechanisms may also be specified and used for NFS
   version 4 security.

   To enable in-band security negotiation, the NFSv4 protocol has added
   a new operation which provides the client with a method of querying
   the server about its policies regarding which security mechanisms
   must be used for access to the server's filesystem file system resources.  With
   this, the client can securely match the security mechanism that meets
   the policies specified at both the client and server.

1.3.2.  Procedure and Operation Structure

   A significant departure from the previous versions of the NFS
   protocol is the introduction of the COMPOUND procedure.  For the
   NFSv4 protocol, there are two RPC procedures, NULL and COMPOUND.  The
   COMPOUND procedure is defined in terms of operations and these
   operations correspond more closely to the traditional NFS procedures.

   With the use of the COMPOUND procedure, the client is able to build
   simple or complex requests.  These COMPOUND requests allow for a
   reduction in the number of RPCs needed for logical filesystem file system
   operations.  For example, without previous contact with a server a
   client will be able to read data from a file in one request by
   combining LOOKUP, OPEN, and READ operations in a single COMPOUND RPC.
   With previous versions of the NFS protocol, this type of single
   request was not possible.

   The model used for COMPOUND is very simple.  There is no logical OR
   or ANDing of operations.  The operations combined within a COMPOUND
   request are evaluated in order by the server.  Once an operation
   returns a failing result, the evaluation ends and the results of all
   evaluated operations are returned to the client.

   The NFSv4 protocol continues to have the client refer to a file or
   directory at the server by a "filehandle".  The COMPOUND procedure
   has a method of passing a filehandle from one operation to another
   within the sequence of operations.  There is a concept of a "current
   filehandle" and "saved filehandle".  Most operations use the "current
   filehandle" as the filesystem file system object to operate upon.  The "saved
   filehandle" is used as temporary filehandle storage within a COMPOUND
   procedure as well as an additional operand for certain operations.

1.3.3.  Filesystem Model

   The general filesystem file system model used for the NFSv4 protocol is the same
   as previous versions.  The server filesystem file system is hierarchical with
   the regular files contained within being treated as opaque byte
   streams.  In a slight departure, file and directory names are encoded
   with UTF-8 to deal with the basics of internationalization.

   The NFSv4 protocol does not require a separate protocol to provide
   for the initial mapping between path name and filehandle.  Instead of
   using the older MOUNT protocol for this mapping, the server provides
   a ROOT filehandle that represents the logical root or top of the
   filesystem file
   system tree provided by the server.  The server provides multiple
   filesystems
   file systems by gluing them together with pseudo filesystems. file systems.  These
   pseudo filesystems file systems provide for potential gaps in the path names
   between real filesystems. file systems.

1.3.3.1.  Filehandle Types

   In previous versions of the NFS protocol, the filehandle provided by
   the server was guaranteed to be valid or persistent for the lifetime
   of the filesystem file system object to which it referred.  For some server
   implementations, this persistence requirement has been difficult to
   meet.  For the NFSv4 protocol, this requirement has been relaxed by
   introducing another type of filehandle, volatile.  With persistent
   and volatile filehandle types, the server implementation can match
   the abilities of the filesystem file system at the server along with the
   operating environment.  The client will have knowledge of the type of
   filehandle being provided by the server and can be prepared to deal
   with the semantics of each.

1.3.3.2.  Attribute Types

   The NFSv4 protocol has a rich and extensible file object attribute
   structure, which is divided into REQUIRED, RECOMMENDED, and named
   attributes (see Section 5).

   Several (but not all) of the REQUIRED attributes are derived from the
   attributes of NFSv3 (see definition of the fattr3 data type in
   [RFC1813]).  An example of a REQUIRED attribute is the file object's
   type (Section 5.8.1.2) so that regular files can be distinguished
   from directories (also known as folders in some operating
   environments) and other types of objects.  REQUIRED attributes are
   discussed in Section 5.1.

   An example of the RECOMMENDED attributes is an acl. acl (Section 6.2.1).
   This attribute defines an Access Control List (ACL) on a file object ((Section 6). object.
   An ACL provides file access control beyond the model used in NFSv3.
   The ACL definition allows for specification of specific sets of
   permissions for individual users and groups.  In addition, ACL
   inheritance allows propagation of access permissions and restriction
   down a directory tree as file system objects are created.
   RECOMMENDED attributes are discussed in Section 5.2.

   A named attribute is an opaque byte stream that is associated with a
   directory or file and referred to by a string name.  Named attributes
   are meant to be used by client applications as a method to associate
   application-specific data with a regular file or directory.  NFSv4.1
   modifies named attributes relative to NFSv4.0 by tightening the
   allowed operations in order to prevent the development of non-
   interoperable implementations.  Named attributes are discussed in
   Section 5.3.

1.3.3.3.  Multi-server Namespace

   NFSv4 contains a number of features to allow implementation of
   namespaces that cross server boundaries and that allow and facilitate
   a non-disruptive transfer of support for individual file systems
   between servers.  They are all based upon attributes that allow one
   file system to specify alternate or new locations for that file
   system.

   These attributes may be used together with the concept of absent file
   systems, which provide specifications for additional locations but no
   actual file system content.  This allows a number of important
   facilities:

   o  Location attributes may be used with absent file systems to
      implement referrals whereby one server may direct the client to a
      file system provided by another server.  This allows extensive
      multi-server namespaces to be constructed.

   o  Location attributes may be provided for present file systems to
      provide the locations of alternate file system instances or
      replicas to be used in the event that the current file system
      instance becomes unavailable.

   o  Location attributes may be provided when a previously present file
      system becomes absent.  This allows non-disruptive migration of
      file systems to alternate servers.

1.3.4.  OPEN and CLOSE

   The NFSv4 protocol introduces OPEN and CLOSE operations.  The OPEN
   operation provides a single point where file lookup, creation, and
   share semantics can be combined.  The CLOSE operation also provides
   for the release of state accumulated by OPEN.

1.3.5.  File Locking

   With the NFSv4 protocol, the support for byte range file locking is
   part of the NFS protocol.  The file locking support is structured so
   that an RPC callback mechanism is not required.  This is a departure
   from the previous versions of the NFS file locking protocol, Network
   Lock Manager (NLM).  The state associated with file locks is
   maintained at the server under a lease-based model.  The server
   defines a single lease period for all state held by a NFS client.  If
   the client does not renew its lease within the defined period, all
   state associated with the client's lease may be released by the
   server.  The client may renew its lease with use of the RENEW
   operation or implicitly by use of other operations (primarily READ).

1.3.6.  Client Caching and Delegation

   The file, attribute, and directory caching for the NFSv4 protocol is
   similar to previous versions.  Attributes and directory information
   are cached for a duration determined by the client.  At the end of a
   predefined timeout, the client will query the server to see if the
   related filesystem file system object has been updated.

   For file data, the client checks its cache validity when the file is
   opened.  A query is sent to the server to determine if the file has
   been changed.  Based on this information, the client determines if
   the data cache for the file should kept or released.  Also, when the
   file is closed, any modified data is written to the server.

   If an application wants to serialize access to file data, file
   locking of the file data ranges in question should be used.

   The major addition to NFSv4 in the area of caching is the ability of
   the server to delegate certain responsibilities to the client.  When
   the server grants a delegation for a file to a client, the client is
   guaranteed certain semantics with respect to the sharing of that file
   with other clients.  At OPEN, the server may provide the client
   either a OPEN_DELEGATE_READ or OPEN_DELEGATE_WRITE delegation for the
   file.  If the client is granted a OPEN_DELEGATE_READ delegation, it
   is assured that no other client has the ability to write to the file
   for the duration of the delegation.  If the client is granted a
   OPEN_DELEGATE_WRITE delegation, the client is assured that no other
   client has read or write access to the file.

   Delegations can be recalled by the server.  If another client
   requests access to the file in such a way that the access conflicts
   with the granted delegation, the server is able to notify the initial
   client and recall the delegation.  This requires that a callback path
   exist between the server and client.  If this callback path does not
   exist, then delegations cannot be granted.  The essence of a
   delegation is that it allows the client to locally service operations
   such as OPEN, CLOSE, LOCK, LOCKU, READ, or WRITE without immediate
   interaction with the server.

1.4.  General Definitions

   The following definitions are provided for the purpose of providing
   an appropriate context for the reader.

   Absent File System:  A file system is "absent" when a namespace
      component does not have a backing file system.

   Byte:  In this document, a byte is an octet, i.e., a datum exactly 8
      bits in length.

   Client:  The client is the entity that accesses the NFS server's
      resources.  The client may be an application that contains the
      logic to access the NFS server directly.  The client may also be
      the traditional operating system client that provides remote
      filesystem file
      system services for a set of applications.

      With reference to byte-range locking, the client is also the
      entity that maintains a set of locks on behalf of one or more
      applications.  This client is responsible for crash or failure
      recovery for those locks it manages.

      Note that multiple clients may share the same transport and
      connection and multiple clients may exist on the same network
      node.

   Client ID:  A 64-bit quantity used as a unique, short-hand reference
      to a client supplied Verifier and ID.  The server is responsible
      for supplying the Client ID.

   File System:  The file system is the collection of objects on a
      server that share the same fsid attribute (see Section 5.8.1.9).

   Lease:  An interval of time defined by the server for which the
      client is irrevocably granted a lock.  At the end of a lease
      period the lock may be revoked if the lease has not been extended.
      The lock must be revoked if a conflicting lock has been granted
      after the lease interval.

      All leases granted by a server have the same fixed interval.  Note
      that the fixed interval was chosen to alleviate the expense a
      server would have in maintaining state about variable length
      leases across server failures.

   Lock:  The term "lock" is used to refer to both record (byte-range)
      locks as well as share reservations unless specifically stated
      otherwise.

   Server:  The "Server" is the entity responsible for coordinating
      client access to a set of filesystems. file systems.

   Stable Storage:  NFSv4 servers must be able to recover without data
      loss from multiple power failures (including cascading power
      failures, that is, several power failures in quick succession),
      operating system failures, and hardware failure of components
      other than the storage medium itself (for example, disk,
      nonvolatile RAM).

      Some examples of stable storage that are allowable for an NFS
      server include:

      (1)  Media commit of data, that is, the modified data has been
           successfully written to the disk media, for example, the disk
           platter.

      (2)  An immediate reply disk drive with battery-backed on-drive
           intermediate storage or uninterruptible power system (UPS).

      (3)  Server commit of data with battery-backed intermediate
           storage and recovery software.

      (4)  Cache commit with uninterruptible power system (UPS) and
           recovery software.

   Stateid:  A stateid is a 128-bit quantity returned by a server that
      uniquely identifies the open and locking states provided by the
      server for a specific open-owner or lock-owner/open-owner pair for
      a specific file and type of lock.

   Verifier:  A 64-bit quantity generated by the client that the server
      can use to determine if the client has restarted and lost all
      previous lock state.

1.5.  Changes since RFC 3530

   The main changes from RFC 3530 [RFC3530] are:

   o  The XDR definition has been moved to a companion document
      [I-D.ietf-nfsv4-rfc3530bis-dot-x]

   o  Updates for the latest IETF intellectual property statements

   o  There is a restructured and more complete explanation of multi-
      server namespace features.  In particular, this explanation
      explicitly describes handling of inter-server referrals, even
      where neither migration nor replication is involved.

   o  More liberal handling of internationalization for file names and
      user and group names, with the elimination of restrictions imposed
      by stringprep, with the recognition that rules for the forms of
      these name are the province of the receiving entity.

   o  Updating handling of domain names to reflect IDNA Internationalized
      Domain Names in Applications (IDNA) [RFC5891].

   o  Restructuring of string types to more appropriately reflect the
      reality of required string processing.

   o  The previously required LIPKEY and SPKM-3 security mechanisms have
      been removed.

   o  Some clarification on a client re-establishing callback
      information to the new server if state has been migrated.

   o  A third edge case was added for Courtesy locks and network
      partitions.

   o  The definition of stateid was strengthened.

1.6.  Changes since RFC 3010

   This definition of the NFSv4 protocol replaces or obsoletes the
   definition present in [RFC3010].  While portions of the two documents
   have remained the same, there have been substantive changes in
   others.  The changes made between [RFC3010] and this document
   represent implementation experience and further review of the
   protocol.  While some modifications were made for ease of
   implementation or clarification, most updates represent errors or
   situations where the [RFC3010] definition were untenable.

   The following list is not all inclusive of all changes but presents
   some of the most notable changes or additions made:

   o  The state model has added an open_owner4 identifier.  This was
      done to accommodate Posix based clients and the model they use for
      file locking.  For Posix clients, an open_owner4 would correspond
      to a file descriptor potentially shared amongst a set of processes
      and the lock_owner4 identifier would correspond to a process that
      is locking a file.

   o  Clarifications and error conditions were added for the handling of
      the owner and group attributes.  Since these attributes are string
      based (as opposed to the numeric uid/gid of previous versions of
      NFS), translations may not be available and hence the changes
      made.

   o  Clarifications for the ACL and mode attributes to address
      evaluation and partial support.

   o  For identifiers that are defined as XDR opaque, limits were set on
      their size.

   o  Added the mounted_on_fileid attribute to allow Posix clients to
      correctly construct local mounts.

   o  Modified the SETCLIENTID/SETCLIENTID_CONFIRM operations to deal
      correctly with confirmation details along with adding the ability
      to specify new client callback information.  Also added
      clarification of the callback information itself.

   o  Added a new operation RELEASE_LOCKOWNER to enable notifying the
      server that a lock_owner4 will no longer be used by the client.

   o  RENEW operation changes to identify the client correctly and allow
      for additional error returns.

   o  Verify error return possibilities for all operations.

   o  Remove use of the pathname4 data type from LOOKUP and OPEN in
      favor of having the client construct a sequence of LOOKUP
      operations to achieve the same effect.

   o  Clarification of the internationalization issues and adoption of
      the new stringprep profile framework.

2.  Protocol Data Types

   The syntax and semantics to describe the data types of the NFS
   version 4 protocol are defined in the XDR [RFC4506] and RPC [RFC5531]
   documents.  The next sections build upon the XDR data types to define
   types and structures specific to this protocol.

2.1.  Basic Data Types

                   These are the base NFSv4 data types.

   +----------------------+--------------------------------------------+

   +-----------------+-------------------------------------------------+
   | Data Type       | Definition                                      |
   +----------------------+--------------------------------------------+
   +-----------------+-------------------------------------------------+
   | int32_t         | typedef int int32_t;                            |
   | uint32_t        | typedef unsigned int uint32_t;                  |
   | int64_t         | typedef hyper int64_t;                          |
   | uint64_t        | typedef unsigned hyper uint64_t;                |
   | attrlist4       | typedef opaque attrlist4<>;                     |
   |                 | Used for file/directory attributes.             |
   | bitmap4         | typedef uint32_t bitmap4<>;                     |
   |                 | Used in attribute array encoding.               |
   | changeid4       | typedef uint64_t changeid4;                     |
   |                 | Used in the definition of change_info4.         |
   | clientid4       | typedef uint64_t clientid4;                     |
   |                 | Shorthand reference to client              |
   |                      | identification.   |
   | count4          | typedef uint32_t count4;                        |
   |                 | Various count parameters (READ, WRITE,     |
   |                      | COMMIT). |
   | length4         | typedef uint64_t length4;                       |
   |                 | Describes LOCK lengths.                         |
   | mode4           | typedef uint32_t mode4;                         |
   |                 | Mode attribute data type.                       |
   | nfs_cookie4     | typedef uint64_t nfs_cookie4;                   |
   |                 | Opaque cookie value for READDIR.                |
   | nfs_fh4         | typedef opaque nfs_fh4<NFS4_FHSIZE>;            |
   |                 | Filehandle definition.                          |
   | nfs_ftype4      | enum nfs_ftype4;                                |
   |                 | Various defined file types.                     |
   | nfsstat4        | enum nfsstat4;                                  |
   |                 | Return value for operations.                    |
   | offset4         | typedef uint64_t offset4;                       |
   |                 | Various offset designations (READ, WRITE, LOCK, |
   |                 | LOCK, COMMIT).                                        |
   | qop4            | typedef uint32_t qop4;                          |
   |                 | Quality of protection designation in       |
   |                      | SECINFO.   |
   | sec_oid4        | typedef opaque sec_oid4<>;                      |
   |                 | Security Object Identifier.  The sec_oid4 data  |
   |                 | data type is not really opaque.  Instead it contains |
   |                 | it contains an ASN.1 OBJECT IDENTIFIER as  |
   |                      | used by GSS-API   |
   |                 | in the mech_type argument to                    |
   |                 | to GSS_Init_sec_context.  See [RFC2743] for        |
   |                 | for details.                                        |
   | seqid4          | typedef uint32_t seqid4;                        |
   |                 | Sequence identifier used for file locking.      |
   | utf8string      | typedef opaque utf8string<>;                    |
   |                 | UTF-8 encoding for strings.                     |
   | utf8_expected utf8str_cis     | typedef utf8string utf8_expected; utf8str_cis;                 |
   |                 | String expected to be Case-insensitive UTF-8 but no         |
   |                      | validation string.                  |
   | utf8val_RECOMMENDED4 utf8str_cs      | typedef utf8string utf8val_RECOMMENDED4; utf8str_cs;                  |
   |                 | String SHOULD be sent Case-sensitive UTF-8 and SHOULD be  |
   |                      | validated string.                    |
   | utf8val_REQUIRED4 utf8str_mixed   | typedef utf8string utf8val_REQUIRED4; utf8str_mixed;               |
   |                 | String MUST be sent UTF-8 strings with a case-sensitive prefix and MUST be  |
   |                 | validated a case-insensitive suffix.                      |
   | ascii_REQUIRED4 component4      | typedef utf8string ascii_REQUIRED4;        |
   |                      | String MUST be sent as ASCII and thus is utf8str_cs component4;                  |
   |                 | automatically UTF-8 Represents pathname components.                 |
   | comptag4 linktext4       | typedef utf8_expected comptag4; utf8str_cs linktext4;                   |
   |                 | Tag should be UTF-8 but Symbolic link contents ("symbolic link" is not checked      |
   | component4                 | typedef utf8val_RECOMMENDED4 component4; defined in an Open Group [openg_symlink]        |
   |                 | Represents path name components. standard).                                      |
   | linktext4 ascii_REQUIRED4 | typedef utf8val_RECOMMENDED4 linktext4; utf8string ascii_REQUIRED4;             |
   |                 | Symbolic link contents. String MUST be sent as ASCII and thus is        |
   |                 | automatically UTF-8.                            |
   | pathname4       | typedef component4 pathname4<>;                 |
   |                 | Represents path name for fs_locations.          |
   | nfs_lockid4     | typedef uint64_t nfs_lockid4;                   |
   | verifier4       | typedef opaque                             |
   |                      | verifier4[NFS4_VERIFIER_SIZE];   |
   |                 | Verifier used for various operations (COMMIT,   |
   |                 | (COMMIT, CREATE, OPEN, READDIR, WRITE)                   |
   |                 | NFS4_VERIFIER_SIZE is defined as 8.             |
   +----------------------+--------------------------------------------+
   +-----------------+-------------------------------------------------+

                          End of Base Data Types

                                  Table 1

2.2.  Structured Data Types

2.2.1.  nfstime4

   struct nfstime4 {
           int64_t         seconds;
           uint32_t        nseconds;
   };

   The nfstime4 structure gives the number of seconds and nanoseconds
   since midnight or 0 hour January 1, 1970 Coordinated Universal Time
   (UTC).  Values greater than zero for the seconds field denote dates
   after the 0 hour January 1, 1970.  Values less than zero for the
   seconds field denote dates before the 0 hour January 1, 1970.  In
   both cases, the nseconds field is to be added to the seconds field
   for the final time representation.  For example, if the time to be
   represented is one-half second before 0 hour January 1, 1970, the
   seconds field would have a value of negative one (-1) and the
   nseconds fields would have a value of one-half second (500000000).
   Values greater than 999,999,999 for nseconds are considered invalid.

   This data type is used to pass time and date information.  A server
   converts to and from its local representation of time when processing
   time values, preserving as much accuracy as possible.  If the
   precision of timestamps stored for a filesystem file system object is less than
   defined, loss of precision can occur.  An adjunct time maintenance
   protocol is recommended to reduce client and server time skew.

2.2.2.  time_how4

   enum time_how4 {
           SET_TO_SERVER_TIME4 = 0,
           SET_TO_CLIENT_TIME4 = 1
   };

2.2.3.  settime4

   union settime4 switch (time_how4 set_it) {
    case SET_TO_CLIENT_TIME4:
            nfstime4       time;
    default:
            void;
   };

   The above definitions are used as the attribute definitions to set
   time values.  If set_it is SET_TO_SERVER_TIME4, then the server uses
   its local representation of time for the time value.

2.2.4.  specdata4

   struct specdata4 {
    uint32_t specdata1; /* major device number */
    uint32_t specdata2; /* minor device number */
   };

   This data type represents additional information for the device file
   types NF4CHR and NF4BLK.

2.2.5.  fsid4

   struct fsid4 {
           uint64_t        major;
           uint64_t        minor;
   };

   This type is the filesystem file system identifier that is used as a mandatory
   attribute.

2.2.6.  fs_location4

   struct fs_location4 {
           utf8val_REQUIRED4
           utf8str_cis             server<>;
           pathname4               rootpath;
   };

2.2.7.  fs_locations4

   struct fs_locations4 {
           pathname4       fs_root;
           fs_location4    locations<>;
   };

   The fs_location4 and fs_locations4 data types are used for the
   fs_locations recommended attribute which is used for migration and
   replication support.

2.2.8.  fattr4

   struct fattr4 {
           bitmap4         attrmask;
           attrlist4       attr_vals;
   };

   The fattr4 structure is used to represent file and directory
   attributes.

   The bitmap is a counted array of 32 bit integers used to contain bit
   values.  The position of the integer in the array that contains bit n
   can be computed from the expression (n / 32) and its bit within that
   integer is (n mod 32).

                       0            1
     +-----------+-----------+-----------+--
     |  count    | 31  ..  0 | 63  .. 32 |
     +-----------+-----------+-----------+--

2.2.9.  change_info4

   struct change_info4 {
           bool            atomic;
           changeid4       before;
           changeid4       after;
   };

   This structure is used with the CREATE, LINK, REMOVE, RENAME
   operations to let the client know the value of the change attribute
   for the directory in which the target filesystem file system object resides.

2.2.10.  clientaddr4

   struct clientaddr4 {
           /* see struct rpcb in RFC 1833 */
           string r_netid<>;    /* network id */
           string r_addr<>;     /* universal address */
   };

   The clientaddr4 structure is used as part of the SETCLIENTID
   operation to either specify the address of the client that is using a
   client ID or as part of the callback registration.  The r_netid and
   r_addr fields respectively contain a netid and uaddr.  The netid and
   uaddr concepts are defined in [RFC5665].  The netid and uaddr formats
   for TCP over IPv4 and TCP over IPv6 are defined in [RFC5665],
   specifically Tables 2 and 3 and Sections 5.2.3.3 and 5.2.3.4.

2.2.11.  cb_client4

   struct cb_client4 {
           unsigned int    cb_program;
           clientaddr4     cb_location;
   };

   This structure is used by the client to inform the server of its call
   back address; includes the program number and client address.

2.2.12.  nfs_client_id4

   struct nfs_client_id4 {
           verifier4       verifier;
           opaque          id<NFS4_OPAQUE_LIMIT>;
   };

   This structure is part of the arguments to the SETCLIENTID operation.
   NFS4_OPAQUE_LIMIT is defined as 1024.

2.2.13.  open_owner4

   struct open_owner4 {
           clientid4       clientid;
           opaque          owner<NFS4_OPAQUE_LIMIT>;
   };

   This structure is used to identify the owner of open state.
   NFS4_OPAQUE_LIMIT is defined as 1024.

2.2.14.  lock_owner4

   struct lock_owner4 {
           clientid4       clientid;
           opaque          owner<NFS4_OPAQUE_LIMIT>;
   };

   This structure is used to identify the owner of file locking state.
   NFS4_OPAQUE_LIMIT is defined as 1024.

2.2.15.  open_to_lock_owner4

   struct open_to_lock_owner4 {
           seqid4          open_seqid;
           stateid4        open_stateid;
           seqid4          lock_seqid;
           lock_owner4     lock_owner;
   };

   This structure is used for the first LOCK operation done for an
   open_owner4.  It provides both the open_stateid and lock_owner such
   that the transition is made from a valid open_stateid sequence to
   that of the new lock_stateid sequence.  Using this mechanism avoids
   the confirmation of the lock_owner/lock_seqid pair since it is tied
   to established state in the form of the open_stateid/open_seqid.

2.2.16.  stateid4

   struct stateid4 {
           uint32_t        seqid;
           opaque          other[12];          other[NFS4_OTHER_SIZE];
   };

   This structure is used for the various state sharing mechanisms
   between the client and server.  For the client, this data structure
   is read-only.  The server is required to increment the seqid field
   monotonically at each transition of the stateid.  This is important
   since the client will inspect the seqid in OPEN stateids to determine
   the order of OPEN processing done by the server.

3.  RPC and Security Flavor

   The NFSv4 protocol is a Remote Procedure Call (RPC) RPC application that uses RPC version 2 and
   the corresponding eXternal Data Representation
   (XDR) XDR as defined in [RFC5531] and [RFC4506].  The RPCSEC_GSS
   security
   flavor flavors as defined in [RFC2203] version 1 ([RFC2203]) and version 2
   ([RFC5403]) MUST be implemented as the mechanism to deliver stronger
   security for the NFSv4 protocol.  However, deployment of RPCSEC_GSS
   is optional.

3.1.  Ports and Transports

   Historically, NFSv2 and NFSv3 servers have resided on port 2049.  The
   registered port 2049 [RFC3232] for the NFS protocol SHOULD be the
   default configuration.  Using the registered port for NFS services
   means the NFS client will not need to use the RPC binding protocols
   as described in [RFC1833]; this will allow NFS to transit firewalls.

   Where an NFSv4 implementation supports operation over the IP network
   protocol, the supported transports transport layer between NFS and IP MUST be among
   the IETF-approved congestion control an
   IETF standardised transport protocols, which protocol that is specified to avoid
   network congestion; such transports include TCP and SCTP.  To enhance
   the possibilities for interoperability, an NFSv4 implementation MUST
   support operation over the TCP transport protocol, at least until
   such time as a standards track RFC revises this requirement to use a
   different IETF-approved
   congestion control IETF standardised transport protocol. protocol with appropriate
   congestion control.

   If TCP is used as the transport, the client and server SHOULD use
   persistent connections.  This will prevent the weakening of TCP's
   congestion control via short lived connections and will improve
   performance for the WAN Wide Area Network (WAN) environment by
   eliminating the need for SYN handshakes.

   To date, all NFSv4 implementations are TCP based, i.e., there are
   none for SCTP nor UDP.  UDP by itself is not sufficient as a
   transport for NFSv4, neither is UDP in combination with some other
   mechanism (e.g., DCCP [RFC4340], NORM [RFC5740]).

   As noted in Section 17, the authentication model for NFSv4 has moved
   from machine-based to principal-based.  However, this modification of
   the authentication model does not imply a technical requirement to
   move the TCP connection management model from whole machine-based to
   one based on a per user model.  In particular, NFS over TCP client
   implementations have traditionally multiplexed traffic for multiple
   users over a common TCP connection between an NFS client and server.
   This has been true, regardless of whether the NFS client is using
   AUTH_SYS, AUTH_DH, RPCSEC_GSS or any other flavor.  Similarly, NFS
   over TCP server implementations have assumed such a model and thus
   scale the implementation of TCP connection management in proportion
   to the number of expected client machines.  It is intended that NFSv4
   will not modify this connection management model.  NFSv4 clients that
   violate this assumption can expect scaling issues on the server and
   hence reduced service.

   Note that for various timers, the client and server should avoid
   inadvertent synchronization of those timers.  For further discussion
   of the general issue refer to [Floyd].

3.1.1.  Client Retransmission Behavior

   When processing a NFSv4 request received over a reliable transport
   such as TCP, the NFSv4 server MUST NOT silently drop the request,
   except if the established transport connection has been broken.
   Given such a contract between NFSv4 clients and servers, clients MUST
   NOT retry a request unless one or both of the following are true:

   o  The transport connection has been broken

   o  The procedure being retried is the NULL procedure

   Since reliable transports, such as TCP, do not always synchronously
   inform a peer when the other peer has broken the connection (for
   example, when an NFS server reboots), the NFSv4 client may want to
   actively "probe" the connection to see if has been broken.  Use of
   the NULL procedure is one recommended way to do so.  So, when a
   client experiences a remote procedure call timeout (of some arbitrary
   implementation specific amount), rather than retrying the remote
   procedure call, it could instead issue a NULL procedure call to the
   server.  If the server has died, the transport connection break will
   eventually be indicated to the NFSv4 client.  The client can then
   reconnect, and then retry the original request.  If the NULL
   procedure call gets a response, the connection has not broken.  The
   client can decide to wait longer for the original request's response,
   or it can break the transport connection and reconnect before re-
   sending the original request.

   For callbacks from the server to the client, the same rules apply,
   but the server doing the callback becomes the client, and the client
   receiving the callback becomes the server.

3.2.  Security Flavors

   Traditional RPC implementations have included AUTH_NONE, AUTH_SYS,
   AUTH_DH, and AUTH_KRB4 as security flavors.  With [RFC2203] an
   additional security flavor of RPCSEC_GSS has been introduced which
   uses the functionality of GSS-API [RFC2743].  This allows for the use
   of various security mechanisms by the RPC layer without the
   additional implementation overhead of adding RPC security flavors.
   For NFSv4, the RPCSEC_GSS security flavor MUST be used to enable the
   mandatory security mechanism.  Other flavors, such as, AUTH_NONE,
   AUTH_SYS, and AUTH_DH MAY be implemented as well.

3.2.1.  Security mechanisms for NFSv4

   RPCSEC_GSS, via GSS-API, normalizes access to mechanisms that provide
   security services.  Therefore, NFSv4 clients and servers MUST support
   the Kerberos V5 security mechanism.

   The use of RPCSEC_GSS requires selection of mechanism, quality of
   protection (QOP), and service (authentication, integrity, privacy).
   For the mandated security mechanisms, NFSv4 specifies that a QOP of
   zero is used, leaving it up to the mechanism or the mechanism's
   configuration to map QOP zero to an appropriate level of protection.
   Each mandated mechanism specifies a minimum set of cryptographic
   algorithms for implementing integrity and privacy.  NFSv4 clients and
   servers MUST be implemented on operating environments that comply
   with the REQUIRED cryptographic algorithms of each REQUIRED
   mechanism.

3.2.1.1.  Kerberos V5 as a security triple Security Triple

   The Kerberos V5 GSS-API mechanism as described in [RFC4121] MUST be
   implemented with the RPCSEC_GSS services as specified in the
   following table:

      column descriptions:
      1 == number of pseudo flavor
      2 == name of pseudo flavor
      3 == mechanism's OID
      4 == RPCSEC_GSS service
      5 == NFSv4 clients MUST support
      6 == NFSv4 servers MUST support

      1      2        3                    4                     5   6
      ------------------------------------------------------------------
      390003 krb5     1.2.840.113554.1.2.2 rpc_gss_svc_none      yes yes
      390004 krb5i    1.2.840.113554.1.2.2 rpc_gss_svc_integrity yes yes
      390005 krb5p    1.2.840.113554.1.2.2 rpc_gss_svc_privacy    no yes

   Note that the pseudo flavor is presented here as a mapping aid to the
   implementor.  Because this NFS protocol includes a method to
   negotiate security and it understands the GSS-API mechanism, the
   pseudo flavor is not needed.  The pseudo flavor is needed for NFSv3
   since the security negotiation is done via the MOUNT protocol as
   described in [RFC2623].

   At the time this document was specified, the Advanced Encryption
   Standard (AES) with HMAC-SHA1 was a REQUIRED algorithm set for
   Kerberos V5.  In contrast, when NFSv4.0 was first specified in
   [RFC3530], weaker algorithm sets were REQUIRED for Kerberos V5, and
   were REQUIRED in the NFSv4.0 specification, because the Kerberos V5
   specification at the time did not specify stronger algorithms.  The
   NFSv4 specification does not specify REQUIRED algorithms for Kerberos
   V5, and instead, the implementor is expected to track the evolution
   of the Kerberos V5 standard if and when stronger algorithms are
   specified.

3.2.1.1.1.  Security Considerations for Cryptographic Algorithms in
            Kerberos V5

   When deploying NFSv4, the strength of the security achieved depends
   on the existing Kerberos V5 infrastructure.  The algorithms of
   Kerberos V5 are not directly exposed to or selectable by the client
   or server, so there is some due diligence required by the user of
   NFSv4 to ensure that security is acceptable where needed.  Guidance
   is provided in [RFC6649] as to why weak algorithms should be disabled
   by default.

3.3.  Security Negotiation

   With the NFSv4 server potentially offering multiple security
   mechanisms, the client needs a method to determine or negotiate which
   mechanism is to be used for its communication with the server.  The
   NFS server may have multiple points within its filesystem file system name space
   that are available for use by NFS clients.  In turn the NFS server
   may be configured such that each of these entry points may have
   different or multiple security mechanisms in use.

   The security negotiation between client and server SHOULD be done
   with a secure channel to eliminate the possibility of a third party
   intercepting the negotiation sequence and forcing the client and
   server to choose a lower level of security than required or desired.
   See Section 17 for further discussion.

3.3.1.  SECINFO

   The new SECINFO operation will allow the client to determine, on a
   per filehandle basis, what security triple (see [RFC2743]) is to be
   used for server access.  In general, the client will not have to use
   the SECINFO operation except during initial communication with the
   server or when the client crosses policy boundaries at the server.
   It is possible that the server's policies change during the client's
   interaction therefore forcing the client to negotiate a new security
   triple.

3.3.2.  Security Error

   Based on the assumption that each NFSv4 client and server MUST
   support a minimum set of security (i.e., Kerberos-V5 under
   RPCSEC_GSS), the NFS client will start its communication with the
   server with one of the minimal security triples.  During
   communication with the server, the client may receive an NFS error of
   NFS4ERR_WRONGSEC.  This error allows the server to notify the client
   that the security triple currently being used is not appropriate for
   access to the server's filesystem file system resources.  The client is then
   responsible for determining what security triples are available at
   the server and choose one which is appropriate for the client.  See
   Section 15.33 for further discussion of how the client will respond
   to the NFS4ERR_WRONGSEC error and use SECINFO.

3.3.3.  Callback RPC Authentication

   Except as noted elsewhere in this section, the callback RPC
   (described later) MUST mutually authenticate the NFS server to the
   principal that acquired the client ID (also described later), using
   the security flavor the original SETCLIENTID operation used.

   For AUTH_NONE, there are no principals, so this is a non-issue.

   AUTH_SYS has no notions of mutual authentication or a server
   principal, so the callback from the server simply uses the AUTH_SYS
   credential that the user used when he set up the delegation.

   For AUTH_DH, one commonly used convention is that the server uses the
   credential corresponding to this AUTH_DH principal:

     unix.host@domain

   where host and domain are variables corresponding to the name of
   server host and directory services domain in which it lives such as a
   Network Information System domain or a DNS domain.

   Regardless of what security mechanism under RPCSEC_GSS is being used,
   the NFS server MUST identify itself in GSS-API via a
   GSS_C_NT_HOSTBASED_SERVICE name type.  GSS_C_NT_HOSTBASED_SERVICE
   names are of the form:

     service@hostname

   For NFS, the "service" element is

     nfs

   Implementations of security mechanisms will convert nfs@hostname to
   various different forms.  For Kerberos V5, the following form is
   RECOMMENDED:

   nfs/hostname

   For Kerberos V5, nfs/hostname would be a server principal in the
   Kerberos Key Distribution Center database.  This is the same
   principal the client acquired a GSS-API context for when it issued
   the SETCLIENTID operation, therefore, the realm name for the server
   principal must be the same for the callback as it was for the
   SETCLIENTID.

4.  Filehandles

   The filehandle in the NFS protocol is a per server unique identifier
   for a filesystem file system object.  The contents of the filehandle are opaque
   to the client.  Therefore, the server is responsible for translating
   the filehandle to an internal representation of the filesystem file system
   object.

4.1.  Obtaining the First Filehandle

   The operations of the NFS protocol are defined in terms of one or
   more filehandles.  Therefore, the client needs a filehandle to
   initiate communication with the server.  With the NFSv2 protocol
   [RFC1094] and the NFSv3 protocol [RFC1813], there exists an ancillary
   protocol to obtain this first filehandle.  The MOUNT protocol, RPC
   program number 100005, provides the mechanism of translating a string
   based filesystem file system path name to a filehandle which can then be used by
   the NFS protocols.

   The MOUNT protocol has deficiencies in the area of security and use
   via firewalls.  This is one reason that the use of the public
   filehandle was introduced in [RFC2054] and [RFC2055].  With the use
   of the public filehandle in combination with the LOOKUP operation in
   the NFSv2 and NFSv3 protocols, it has been demonstrated that the
   MOUNT protocol is unnecessary for viable interaction between NFS
   client and server.

   Therefore, the NFSv4 protocol will not use an ancillary protocol for
   translation from string based path names to a filehandle.  Two
   special filehandles will be used as starting points for the NFS
   client.

4.1.1.  Root Filehandle

   The first of the special filehandles is the ROOT filehandle.  The
   ROOT filehandle is the "conceptual" root of the filesystem file system name
   space at the NFS server.  The client uses or starts with the ROOT
   filehandle by employing the PUTROOTFH operation.  The PUTROOTFH
   operation instructs the server to set the "current" filehandle to the
   ROOT of the server's file tree.  Once this PUTROOTFH operation is
   used, the client can then traverse the entirety of the server's file
   tree with the LOOKUP operation.  A complete discussion of the server
   name space is in Section 8. 7.

4.1.2.  Public Filehandle

   The second special filehandle is the PUBLIC filehandle.  Unlike the
   ROOT filehandle, the PUBLIC filehandle may be bound or represent an
   arbitrary filesystem file system object at the server.  The server is
   responsible for this binding.  It may be that the PUBLIC filehandle
   and the ROOT filehandle refer to the same filesystem file system object.
   However, it is up to the administrative software at the server and
   the policies of the server administrator to define the binding of the
   PUBLIC filehandle and server filesystem file system object.  The client may not
   make any assumptions about this binding.  The client uses the PUBLIC
   filehandle via the PUTPUBFH operation.

4.2.  Filehandle Types

   In the NFSv2 and NFSv3 protocols, there was one type of filehandle
   with a single set of semantics.  This type of filehandle is termed
   "persistent" in NFS Version 4.  The semantics of a persistent
   filehandle remain the same as before.  A new type of filehandle
   introduced in NFS Version 4 is the "volatile" filehandle, which
   attempts to accommodate certain server environments.

   The volatile filehandle type was introduced to address server
   functionality or implementation issues which make correct
   implementation of a persistent filehandle infeasible.  Some server
   environments do not provide a filesystem file system level invariant that can be
   used to construct a persistent filehandle.  The underlying server
   filesystem
   file system may not provide the invariant or the server's filesystem file system
   programming interfaces may not provide access to the needed
   invariant.  Volatile filehandles may ease the implementation of
   server functionality such as hierarchical storage management or
   filesystem file
   system reorganization or migration.  However, the volatile filehandle
   increases the implementation burden for the client.

   Since the client will need to handle persistent and volatile
   filehandles differently, a file attribute is defined which may be
   used by the client to determine the filehandle types being returned
   by the server.

4.2.1.  General Properties of a Filehandle

   The filehandle contains all the information the server needs to
   distinguish an individual file.  To the client, the filehandle is
   opaque.  The client stores filehandles for use in a later request and
   can compare two filehandles from the same server for equality by
   doing a byte-by-byte comparison.  However, the client MUST NOT
   otherwise interpret the contents of filehandles.  If two filehandles
   from the same server are equal, they MUST refer to the same file. file
   system object.  Servers SHOULD try to maintain a one-to-one
   correspondence between filehandles and files file system objects but this
   is not required.  Clients MUST use filehandle comparisons only to
   improve performance, not for correct behavior.  All clients need to
   be prepared for situations in which it cannot be determined whether
   two filehandles denote the same object and in such cases, avoid
   making invalid assumptions which might cause incorrect behavior.
   Further discussion of filehandle and attribute comparison in the
   context of data caching is presented in Section 10.3.4.

   As an example, in the case that two different path names when
   traversed at the server terminate at the same filesystem file system object, the
   server SHOULD return the same filehandle for each path.  This can
   occur if a hard link is used to create two file names which refer to
   the same underlying file object and associated data.  For example, if
   paths /a/b/c and /a/d/c refer to the same file, the server SHOULD
   return the same filehandle for both path names traversals.

4.2.2.  Persistent Filehandle

   A persistent filehandle is defined as having a fixed value for the
   lifetime of the filesystem file system object to which it refers.  Once the
   server creates the filehandle for a filesystem file system object, the server
   MUST accept the same filehandle for the object for the lifetime of
   the object.  If the server restarts or reboots the NFS server must
   honor the same filehandle value as it did in the server's previous
   instantiation.  Similarly, if the filesystem file system is migrated, the new
   NFS server must honor the same filehandle as the old NFS server.

   The persistent filehandle will be become stale or invalid when the
   filesystem
   file system object is removed.  When the server is presented with a
   persistent filehandle that refers to a deleted object, it MUST return
   an error of NFS4ERR_STALE.  A filehandle may become stale when the
   filesystem
   file system containing the object is no longer available.  The file
   system may become unavailable if it exists on removable media and the
   media is no longer available at the server or the filesystem file system in
   whole has been destroyed or the filesystem file system has simply been removed
   from the server's name space (i.e., unmounted in a UNIX environment).

4.2.3.  Volatile Filehandle

   A volatile filehandle does not share the same longevity
   characteristics of a persistent filehandle.  The server may determine
   that a volatile filehandle is no longer valid at many different
   points in time.  If the server can definitively determine that a
   volatile filehandle refers to an object that has been removed, the
   server should return NFS4ERR_STALE to the client (as is the case for
   persistent filehandles).  In all other cases where the server
   determines that a volatile filehandle can no longer be used, it
   should return an error of NFS4ERR_FHEXPIRED.

   The mandatory attribute "fh_expire_type" is used by the client to
   determine what type of filehandle the server is providing for a
   particular filesystem. file system.  This attribute is a bitmask with the
   following values:

   FH4_PERSISTENT:  The value of FH4_PERSISTENT is used to indicate a
      persistent filehandle, which is valid until the object is removed
      from the filesystem. file system.  The server will not return
      NFS4ERR_FHEXPIRED for this filehandle.  FH4_PERSISTENT is defined
      as a value in which none of the bits specified below are set.

   FH4_VOLATILE_ANY:  The filehandle may expire at any time, except as
      specifically excluded (i.e., FH4_NOEXPIRE_WITH_OPEN).

   FH4_NOEXPIRE_WITH_OPEN:  May only be set when FH4_VOLATILE_ANY is
      set.  If this bit is set, then the meaning of FH4_VOLATILE_ANY is
      qualified to exclude any expiration of the filehandle when it is
      open.

   FH4_VOL_MIGRATION:  The filehandle will expire as a result of
      migration.  If FH4_VOLATILE_ANY is set, FH4_VOL_MIGRATION is
      redundant.

   FH4_VOL_RENAME:  The filehandle will expire during rename.  This
      includes a rename by the requesting client or a rename by any
      other client.  If FH4_VOLATILE_ANY is set, FH4_VOL_RENAME is
      redundant.

   Servers which provide volatile filehandles that may expire while open
   (i.e., if FH4_VOL_MIGRATION or FH4_VOL_RENAME is set or if
   FH4_VOLATILE_ANY is set and FH4_NOEXPIRE_WITH_OPEN not set), should
   deny a RENAME or REMOVE that would affect an OPEN file of any of the
   components leading to the OPEN file.  In addition, the server should
   deny all RENAME or REMOVE requests during the grace period upon
   server restart.

   Note that the bits FH4_VOL_MIGRATION and FH4_VOL_RENAME allow the
   client to determine that expiration has occurred whenever a specific
   event occurs, without an explicit filehandle expiration error from
   the server.  FH4_VOLATILE_ANY does not provide this form of
   information.  In situations where the server will expire many, but
   not all filehandles upon migration (e.g., all but those that are
   open), FH4_VOLATILE_ANY (in this case with FH4_NOEXPIRE_WITH_OPEN) is
   a better choice since the client may not assume that all filehandles
   will expire when migration occurs, and it is likely that additional
   expirations will occur (as a result of file CLOSE) that are separated
   in time from the migration event itself.

4.2.4.  One Method of Constructing a Volatile Filehandle

   A volatile filehandle, while opaque to the client could contain:

     [volatile bit = 1 | server boot time | slot | generation number]

   o  slot is an index in the server volatile filehandle table

   o  generation number is the generation number for the table entry/
      slot

   When the client presents a volatile filehandle, the server makes the
   following checks, which assume that the check for the volatile bit
   has passed.  If the server boot time is less than the current server
   boot time, return NFS4ERR_FHEXPIRED.  If slot is out of range, return
   NFS4ERR_BADHANDLE.  If the generation number does not match, return
   NFS4ERR_FHEXPIRED.

   When the server reboots, the table is gone (it is volatile).

   If volatile bit is 0, then it is a persistent filehandle with a
   different structure following it.

4.3.  Client Recovery from Filehandle Expiration

   If possible, the client SHOULD should recover from the receipt of an
   NFS4ERR_FHEXPIRED error.  The client must take on additional
   responsibility so that it may prepare itself to recover from the
   expiration of a volatile filehandle.  If the server returns
   persistent filehandles, the client does not need these additional
   steps.

   For volatile filehandles, most commonly the client will need to store
   the component names leading up to and including the filesystem file system
   object in question.  With these names, the client should be able to
   recover by finding a filehandle in the name space that is still
   available or by starting at the root of the server's filesystem file system name
   space.

   If the expired filehandle refers to an object that has been removed
   from the filesystem, file system, obviously the client will not be able to
   recover from the expired filehandle.

   It is also possible that the expired filehandle refers to a file that
   has been renamed.  If the file was renamed by another client, again
   it is possible that the original client will not be able to recover.
   However, in the case that the client itself is renaming the file and
   the file is open, it is possible that the client may be able to
   recover.  The client can determine the new path name based on the
   processing of the rename request.  The client can then regenerate the
   new filehandle based on the new path name.  The client could also use
   the compound operation mechanism to construct a set of operations
   like:

     RENAME A B
     LOOKUP B
     GETFH

   Note that the COMPOUND procedure does not provide atomicity.  This
   example only reduces the overhead of recovering from an expired
   filehandle.

5.  File  Attributes

   To meet the requirements of extensibility and increased
   interoperability with non-UNIX platforms, attributes need to be
   handled in a flexible manner.  The NFSv3 fattr3 structure contains a
   fixed list of attributes that not all clients and servers are able to
   support or care about.  The fattr3 structure cannot be extended as
   new needs arise and it provides no way to indicate non-support.  With
   the NFSv4.0 protocol, the client is able to query what attributes the
   server supports and construct requests with only those supported
   attributes (or a subset thereof).

   To this end, attributes are divided into three groups: REQUIRED,
   RECOMMENDED, and named.  Both REQUIRED and RECOMMENDED attributes are
   supported in the NFSv4.0 protocol by a specific and well-defined
   encoding and are identified by number.  They are requested by setting
   a bit in the bit vector sent in the GETATTR request; the server
   response includes a bit vector to list what attributes were returned
   in the response.  New REQUIRED or RECOMMENDED attributes may be added
   to the NFSv4 protocol as part of a new minor version by publishing a
   Standards Track RFC which allocates a new attribute number value and
   defines the encoding for the attribute.  See Section 11 for further
   discussion.

   Named attributes are accessed by the new OPENATTR operation, which
   accesses a hidden directory of attributes associated with a file
   system object.  OPENATTR takes a filehandle for the object and
   returns the filehandle for the attribute hierarchy.  The filehandle
   for the named attributes is a directory object accessible by LOOKUP
   or READDIR and contains files whose names represent the named
   attributes and whose data bytes are the value of the attribute.  For
   example:

        +----------+-----------+---------------------------------+
        | LOOKUP   | "foo"     | ; look up file                  |
        | GETATTR  | attrbits  |                                 |
        | OPENATTR |           | ; access foo's named attributes |
        | LOOKUP   | "x11icon" | ; look up specific attribute    |
        | READ     | 0,4096    | ; read stream of bytes          |
        +----------+-----------+---------------------------------+

   Named attributes are intended for data needed by applications rather
   than by an NFS client implementation.  NFS implementors are strongly
   encouraged to define their new attributes as RECOMMENDED attributes
   by bringing them to the IETF Standards Track process.

   The set of attributes that are classified as REQUIRED is deliberately
   small since servers need to do whatever it takes to support them.  A
   server should support as many of the RECOMMENDED attributes as
   possible but, by their definition, the server is not required to
   support all of them.  Attributes are deemed REQUIRED if the data is
   both needed by a large number of clients and is not otherwise
   reasonably computable by the client when support is not provided on
   the server.

   Note that the hidden directory returned by OPENATTR is a convenience
   for protocol processing.  The client should not make any assumptions
   about the server's implementation of named attributes and whether or
   not the underlying file system at the server has a named attribute
   directory.  Therefore, operations such as SETATTR and GETATTR on the
   named attribute directory are undefined.

5.1.  REQUIRED Attributes

   These MUST be supported by every NFSv4.0 client and server in order
   to ensure a minimum level of interoperability.  The server MUST store
   and return these attributes, and the client MUST be able to function
   with an attribute set limited to these attributes.  With just the
   REQUIRED attributes some client functionality may be impaired or
   limited in some ways.  A client may ask for any of these attributes
   to be returned by setting a bit in the GETATTR request, and the
   server must MUST return their value.

5.2.  RECOMMENDED Attributes

   These attributes are understood well enough to warrant support in the
   NFSv4.0 protocol.  However, they may not be supported on all clients
   and servers.  A client MAY ask for any of these attributes to be
   returned by setting a bit in the GETATTR request but must handle the
   case where the server does not return them.  A client MAY ask for the
   set of attributes the server supports and SHOULD NOT request
   attributes the server does not support.  A server should be tolerant
   of requests for unsupported attributes and simply not return them
   rather than considering the request an error.  It is expected that
   servers will support all attributes they comfortably can and only
   fail to support attributes that are difficult to support in their
   operating environments.  A server should provide attributes whenever
   they don't have to "tell lies" to the client.  For example, a file
   modification time should be either an accurate time or should not be
   supported by the server.  At times this will be difficult for
   clients, but a client is better positioned to decide whether and how
   to fabricate or construct an attribute or whether to do without the
   attribute.

5.3.  Named Attributes

   These attributes are not supported by direct encoding in the NFSv4
   protocol but are accessed by string names rather than numbers and
   correspond to an uninterpreted stream of bytes that are stored with
   the file system object.  The name space for these attributes may be
   accessed by using the OPENATTR operation.  The OPENATTR operation
   returns a filehandle for a virtual "named attribute directory", and
   further perusal and modification of the name space may be done using
   operations that work on more typical directories.  In particular,
   READDIR may be used to get a list of such named attributes, and
   LOOKUP and OPEN may select a particular attribute.  Creation of a new
   named attribute may be the result of an OPEN specifying file
   creation.

   Once an OPEN is done, named attributes may be examined and changed by
   normal READ and WRITE operations using the filehandles and stateids
   returned by OPEN.

   Named attributes and the named attribute directory may have their own
   (non-named) attributes.  Each of these objects must have all of the
   REQUIRED attributes and may have additional RECOMMENDED attributes.
   However, the set of attributes for named attributes and the named
   attribute directory need not be, and typically will not be, as large
   as that for other objects in that file system.

   Named attributes might be the target of delegations.  However, since
   granting of delegations is at the server's discretion, a server need
   not support delegations on named attributes.

   It is RECOMMENDED that servers support arbitrary named attributes.  A
   client should not depend on the ability to store any named attributes
   in the server's file system.  If a server does support named
   attributes, a client that is also able to handle them should be able
   to copy a file's data and metadata with complete transparency from
   one location to another; this would imply that names allowed for
   regular directory entries are valid for named attribute names as
   well.

   In NFSv4.0, the structure of named attribute directories is
   restricted in a number of ways, in order to prevent the development
   of non-interoperable implementations in which some servers support a
   fully general hierarchical directory structure for named attributes
   while others support a limited but adequate structure for named
   attributes.  In such an environment, clients or applications might
   come to depend on non-portable extensions.  The restrictions are:

   o  CREATE is not allowed in a named attribute directory.  Thus, such
      objects as symbolic links and special files are not allowed to be
      named attributes.  Further, directories may not be created in a
      named attribute directory, so no hierarchical structure of named
      attributes for a single object is allowed.

   o  If OPENATTR is done on a named attribute directory or on a named
      attribute, the server MUST return an error.

   o  Doing a RENAME of a named attribute to a different named attribute
      directory or to an ordinary (i.e., non-named-attribute) directory
      is not allowed.

   o  Creating hard links between named attribute directories or between
      named attribute directories and ordinary directories is not
      allowed.

   Names of attributes will not be controlled by this document or other
   IETF Standards Track documents.  See Section 18 for further
   discussion.

5.4.  Classification of Attributes

   Each of the REQUIRED and RECOMMENDED attributes can be classified in
   one of three categories: per server (i.e., the value of the attribute
   will be the same for all file objects that share the same server),
   per file system (i.e., the value of the attribute will be the same
   for some or all file objects that share the same fsid attribute
   (Section 5.8.1.9) and server owner), or per file system object.  Note
   that it is possible that some per file system attributes may vary
   within the file system.  Note that it is possible that some per file
   system attributes may vary within the file system, depending on the
   value of the "homogeneous" (Section 5.8.2.16) attribute.  Note that
   the attributes time_access_set and time_modify_set are not listed in
   this section because they are write-only attributes corresponding to
   time_access and time_modify, and are used in a special instance of
   SETATTR.

   o  The per-server attribute is:

         lease_time

   o  The per-file system attributes are:

         supported_attrs, fh_expire_type, link_support, symlink_support,
         unique_handles, aclsupport, cansettime, case_insensitive,
         case_preserving, chown_restricted, files_avail, files_free,
         files_total, fs_locations, homogeneous, maxfilesize, maxname,
         maxread, maxwrite, no_trunc, space_avail, space_free,
         space_total, time_delta,

   o  The per-file system object attributes are:

         type, change, size, named_attr, fsid, rdattr_error, filehandle,
         acl, archive, fileid, hidden, maxlink, mimetype, mode,
         numlinks, owner, owner_group, rawdev, space_used, system,
         time_access, time_backup, time_create, time_metadata,
         time_modify, mounted_on_fileid

   For quota_avail_hard, quota_avail_soft, and quota_used, see their
   definitions below for the appropriate classification.

5.5.  Set-Only and Get-Only Attributes

   Some REQUIRED and RECOMMENDED attributes are set-only; i.e., they can
   be set via SETATTR but not retrieved via GETATTR.  Similarly, some
   REQUIRED and RECOMMENDED attributes are get-only; i.e., they can be
   retrieved via GETATTR but not set via SETATTR.  If a client attempts
   to set a get-only attribute or get a set-only attribute, the server
   MUST return NFS4ERR_INVAL.

5.6.  REQUIRED Attributes - List and Definition References

   The list of REQUIRED attributes appears in Table 2.  The meaning of
   the columns of the table are:

   o  Name: The name of attribute

   o  Id: The number assigned to the attribute.  In the event of
      conflicts between the assigned number and
      [I-D.ietf-nfsv4-rfc3530bis-dot-x], the latter is likely authoritative,
      but in such an event, it should be resolved with Errata to this
      document and/or [I-D.ietf-nfsv4-rfc3530bis-dot-x].  See
      [ISEG_errata] for the Errata process.

   o  Data Type: The XDR data type of the attribute.

   o  Acc: Access allowed to the attribute.  R means read-only (GETATTR
      may retrieve, SETATTR may not set).  W means write-only (SETATTR
      may set, GETATTR may not retrieve).  R W means read/write (GETATTR
      may retrieve, SETATTR may set).

   o  Defined in: The section of this specification that describes the
      attribute.

      +-----------------+----+------------+-----+------------------+
      | Name            | Id | Data Type  | Acc | Defined in:      |
      +-----------------+----+------------+-----+------------------+
      | supported_attrs | 0  | bitmap4    | R   | Section 5.8.1.1  |
      | type            | 1  | nfs_ftype4 | R   | Section 5.8.1.2  |
      | fh_expire_type  | 2  | uint32_t   | R   | Section 5.8.1.3  |
      | change          | 3  | uint64_t   | R   | Section 5.8.1.4  |
      | size            | 4  | uint64_t   | R W | Section 5.8.1.5  |
      | link_support    | 5  | bool       | R   | Section 5.8.1.6  |
      | symlink_support | 6  | bool       | R   | Section 5.8.1.7  |
      | named_attr      | 7  | bool       | R   | Section 5.8.1.8  |
      | fsid            | 8  | fsid4      | R   | Section 5.8.1.9  |
      | unique_handles  | 9  | bool       | R   | Section 5.8.1.10 |
      | lease_time      | 10 | nfs_lease4 | R   | Section 5.8.1.11 |
      | rdattr_error    | 11 | enum nfsstat4   | R   | Section 5.8.1.12 |
      | filehandle      | 19 | nfs_fh4    | R   | Section 5.8.1.13 |
      +-----------------+----+------------+-----+------------------+

                                  Table 2

5.7.  RECOMMENDED Attributes - List and Definition References

   The RECOMMENDED attributes are defined in Table 3.  The meanings of
   the column headers are the same as Table 2; see Section 5.6 for the
   meanings.

    +-------------------+----+--------------+-----+------------------+
    | Name              | Id | Data Type    | Acc | Defined in:      |
    +-------------------+----+--------------+-----+------------------+
    | acl               | 12 | nfsace4<>    | R W | Section 6.2.1    |
    | aclsupport        | 13 | uint32_t     | R   | Section 6.2.1.2  |
    | archive           | 14 | bool         | R W | Section 5.8.2.1  |
    | cansettime        | 15 | bool         | R   | Section 5.8.2.2  |
    | case_insensitive  | 16 | bool         | R   | Section 5.8.2.3  |
    | case_preserving   | 17 | bool         | R   | Section 5.8.2.4  |
    | chown_restricted  | 18 | bool         | R   | Section 5.8.2.5  |
    | fileid            | 20 | uint64_t     | R   | Section 5.8.2.6  |
    | files_avail       | 21 | uint64_t     | R   | Section 5.8.2.7  |
    | files_free        | 22 | uint64_t     | R   | Section 5.8.2.8  |
    | files_total       | 23 | uint64_t     | R   | Section 5.8.2.9  |
    | fs_locations      | 24 | fs_locations | R   | Section 5.8.2.10 |
    | hidden            | 25 | bool         | R W | Section 5.8.2.11 |
    | homogeneous       | 26 | bool         | R   | Section 5.8.2.12 |
    | maxfilesize       | 27 | uint64_t     | R   | Section 5.8.2.13 |
    | maxlink           | 28 | uint32_t     | R   | Section 5.8.2.14 |
    | maxname           | 29 | uint32_t     | R   | Section 5.8.2.15 |
    | maxread           | 30 | uint64_t     | R   | Section 5.8.2.16 |
    | maxwrite          | 31 | uint64_t     | R   | Section 5.8.2.17 |
    | mimetype          | 32 | utf8<>       | R W | Section 5.8.2.18 |
    | mode              | 33 | mode4        | R W | Section 6.2.2    |
    | mounted_on_fileid | 55 | uint64_t     | R   | Section 5.8.2.19 |
    | no_trunc          | 34 | bool         | R   | Section 5.8.2.20 |
    | numlinks          | 35 | uint32_t     | R   | Section 5.8.2.21 |
    | owner             | 36 | utf8<>       | R W | Section 5.8.2.22 |
    | owner_group       | 37 | utf8<>       | R W | Section 5.8.2.23 |
    | quota_avail_hard  | 38 | uint64_t     | R   | Section 5.8.2.24 |
    | quota_avail_soft  | 39 | uint64_t     | R   | Section 5.8.2.25 |
    | quota_used        | 40 | uint64_t     | R   | Section 5.8.2.26 |
    | rawdev            | 41 | specdata4    | R   | Section 5.8.2.27 |
    | space_avail       | 42 | uint64_t     | R   | Section 5.8.2.28 |
    | space_free        | 43 | uint64_t     | R   | Section 5.8.2.29 |
    | space_total       | 44 | uint64_t     | R   | Section 5.8.2.30 |
    | space_used        | 45 | uint64_t     | R   | Section 5.8.2.31 |
    | system            | 46 | bool         | R W | Section 5.8.2.32 |
    | time_access       | 47 | nfstime4     | R   | Section 5.8.2.33 |
    | time_access_set   | 48 | settime4     |   W | Section 5.8.2.34 |
    | time_backup       | 49 | nfstime4     | R W | Section 5.8.2.35 |
    | time_create       | 50 | nfstime4     | R W | Section 5.8.2.36 |
    | time_delta        | 51 | nfstime4     | R   | Section 5.8.2.37 |
    | time_metadata     | 52 | nfstime4     | R   | Section 5.8.2.38 |
    | time_modify       | 53 | nfstime4     | R   | Section 5.8.2.39 |
    | time_modify_set   | 54 | settime4     |   W | Section 5.8.2.40 |
    +-------------------+----+--------------+-----+------------------+

                                  Table 3

5.8.  Attribute Definitions

5.8.1.  Definitions of REQUIRED Attributes

5.8.1.1.  Attribute 0: supported_attrs

   The bit vector that would retrieve all REQUIRED and RECOMMENDED
   attributes that are supported for this object.  The scope of this
   attribute applies to all objects with a matching fsid.

5.8.1.2.  Attribute 1: type

   Designates the type of an object in terms of one of a number of
   special constants:

   o  NF4REG designates a regular file.

   o  NF4DIR designates a directory.

   o  NF4BLK designates a block device special file.

   o  NF4CHR designates a character device special file.

   o  NF4LNK designates a symbolic link.

   o  NF4SOCK designates a named socket special file.

   o  NF4FIFO designates a fifo special file.

   o  NF4ATTRDIR designates a named attribute directory.

   o  NF4NAMEDATTR designates a named attribute.

   Within the explanatory text and operation descriptions, the following
   phrases will be used with the meanings given below:

   o  The phrase "is a directory" means that the object's type attribute
      is NF4DIR or NF4ATTRDIR.

   o  The phrase "is a special file" means that the object's type
      attribute is NF4BLK, NF4CHR, NF4SOCK, or NF4FIFO.

   o  The phrase "is an regular file" means that the object's type
      attribute is NF4REG or NF4NAMEDATTR.

5.8.1.3.  Attribute 2: fh_expire_type

   Server uses this to specify filehandle expiration behavior to the
   client.  See Section 4 for additional description.

5.8.1.4.  Attribute 3: change

   A value created by the server that the client can use to determine if
   file data, directory contents, or attributes of the object have been
   modified.  The server may MAY return the object's time_metadata attribute
   for this attribute's value but only if the file system object cannot
   be updated more frequently than the resolution of time_metadata.

5.8.1.5.  Attribute 4: size

   The size of the object in bytes.

5.8.1.6.  Attribute 5: link_support

   TRUE, if the object's file system supports hard links.

5.8.1.7.  Attribute 6: symlink_support

   TRUE, if the object's file system supports symbolic links.

5.8.1.8.  Attribute 7: named_attr

   TRUE, if this object has named attributes.  In other words, object
   has a non-empty named attribute directory.

5.8.1.9.  Attribute 8: fsid

   Unique file system identifier for the file system holding this
   object.  The fsid attribute has major and minor components, each of
   which are of data type uint64_t.

5.8.1.10.  Attribute 9: unique_handles

   TRUE, if two distinct filehandles are guaranteed to refer to two
   different file system objects.

5.8.1.11.  Attribute 10: lease_time

   Duration of the lease at server in seconds.

5.8.1.12.  Attribute 11: rdattr_error

   Error returned from an attempt to retrieve attributes during a
   READDIR operation.

5.8.1.13.  Attribute 19: filehandle

   The filehandle of this object (primarily for READDIR requests).

5.8.2.  Definitions of Uncategorized RECOMMENDED Attributes

   The definitions of most of the RECOMMENDED attributes follow.
   Collections that share a common category are defined in other
   sections.

5.8.2.1.  Attribute 14: archive

   TRUE, if this file has been archived since the time of last
   modification (deprecated in favor of time_backup).

5.8.2.2.  Attribute 15: cansettime

   TRUE, if the server is able to change the times for a file system
   object as specified in a SETATTR operation.

5.8.2.3.  Attribute 16: case_insensitive

   TRUE, if file name comparisons on this file system are case
   insensitive.

5.8.2.4.  Attribute 17: case_preserving

   TRUE, if file name case on this file system is preserved.

5.8.2.5.  Attribute 18: chown_restricted

   If TRUE, the server will reject any request to change either the
   owner or the group associated with a file if the caller is not a
   privileged user (for example, "root" in UNIX operating environments
   or in Windows 2000, the "Take Ownership" privilege).

5.8.2.6.  Attribute 20: fileid

   A number uniquely identifying the file within the file system.

5.8.2.7.  Attribute 21: files_avail

   File slots available to this user on the file system containing this
   object -- this should be the smallest relevant limit.

5.8.2.8.  Attribute 22: files_free

   Free file slots on the file system containing this object - this
   should be the smallest relevant limit.

5.8.2.9.  Attribute 23: files_total

   Total file slots on the file system containing this object.

5.8.2.10.  Attribute 24: fs_locations

   Locations where this file system may be found.  If the server returns
   NFS4ERR_MOVED as an error, this attribute MUST be supported.

   The server specifies the root path for a given server by returning a
   path consisting of zero path components.

5.8.2.11.  Attribute 25: hidden

   TRUE, if the file is considered hidden with respect to the Windows
   API.

5.8.2.12.  Attribute 26: homogeneous

   TRUE, if this object's file system is homogeneous, i.e., all objects
   in the file system (all objects on the server with the same fsid)
   have common values for all per-file-system attributes.

5.8.2.13.  Attribute 27: maxfilesize

   Maximum supported file size for the file system of this object.

5.8.2.14.  Attribute 28: maxlink

   Maximum number of links for this object.

5.8.2.15.  Attribute 29: maxname

   Maximum file name size supported for this object.

5.8.2.16.  Attribute 30: maxread

   Maximum amount of data the READ operation will return for this
   object.

5.8.2.17.  Attribute 31: maxwrite

   Maximum amount of data the WRITE operation will accept for this
   object.  This attribute SHOULD be supported if the file is writable.
   Lack of this attribute can lead to the client either wasting
   bandwidth or not receiving the best performance.

5.8.2.18.  Attribute 32: mimetype

   MIME body type/subtype of this object.

5.8.2.19.  Attribute 55: mounted_on_fileid

   Like fileid, but if the target filehandle is the root of a file
   system, this attribute represents the fileid of the underlying
   directory.

   UNIX-based operating environments connect a file system into the
   namespace by connecting (mounting) the file system onto the existing
   file object (the mount point, usually a directory) of an existing
   file system.  When the mount point's parent directory is read via an
   API like readdir(), the return results are directory entries, each
   with a component name and a fileid.  The fileid of the mount point's
   directory entry will be different from the fileid that the stat()
   system call returns.  The stat() system call is returning the fileid
   of the root of the mounted file system, whereas readdir() is
   returning the fileid that stat() would have returned before any file
   systems were mounted on the mount point.

   Unlike NFSv3, NFSv4.0 allows a client's LOOKUP request to cross other
   file systems.  The client detects the file system crossing whenever
   the filehandle argument of LOOKUP has an fsid attribute different
   from that of the filehandle returned by LOOKUP.  A UNIX-based client
   will consider this a "mount point crossing".  UNIX has a legacy
   scheme for allowing a process to determine its current working
   directory.  This relies on readdir() of a mount point's parent and
   stat() of the mount point returning fileids as previously described.
   The mounted_on_fileid attribute corresponds to the fileid that
   readdir() would have returned as described previously.

   While the NFSv4.0 client could simply fabricate a fileid
   corresponding to what mounted_on_fileid provides (and if the server
   does not support mounted_on_fileid, the client has no choice), there
   is a risk that the client will generate a fileid that conflicts with
   one that is already assigned to another object in the file system.
   Instead, if the server can provide the mounted_on_fileid, the
   potential for client operational problems in this area is eliminated.

   If the server detects that there is no mounted point at the target
   file object, then the value for mounted_on_fileid that it returns is
   the same as that of the fileid attribute.

   The mounted_on_fileid attribute is RECOMMENDED, so the server SHOULD
   provide it if possible, and for a UNIX-based server, this is
   straightforward.  Usually, mounted_on_fileid will be requested during
   a READDIR operation, in which case it is trivial (at least for UNIX-
   based servers) to return mounted_on_fileid since it is equal to the
   fileid of a directory entry returned by readdir().  If
   mounted_on_fileid is requested in a GETATTR operation, the server
   should obey an invariant that has it returning a value that is equal
   to the file object's entry in the object's parent directory, i.e.,
   what readdir() would have returned.  Some operating environments
   allow a series of two or more file systems to be mounted onto a
   single mount point.  In this case, for the server to obey the
   aforementioned invariant, it will need to find the base mount point,
   and not the intermediate mount points.

5.8.2.20.  Attribute 34: no_trunc

   If this attribute is TRUE, then if the client uses a file name longer
   than name_max, an error will be returned instead of the name being
   truncated.

5.8.2.21.  Attribute 35: numlinks

   Number of hard links to this object.

5.8.2.22.  Attribute 36: owner

   The string name of the owner of this object.

5.8.2.23.  Attribute 37: owner_group

   The string name of the group ownership of this object.

5.8.2.24.  Attribute 38: quota_avail_hard

   The value in bytes that represents the amount of additional disk
   space beyond the current allocation that can be allocated to this
   file or directory before further allocations will be refused.  It is
   understood that this space may be consumed by allocations to other
   files or directories.

5.8.2.25.  Attribute 39: quota_avail_soft

   The value in bytes that represents the amount of additional disk
   space that can be allocated to this file or directory before the user
   may reasonably be warned.  It is understood that this space may be
   consumed by allocations to other files or directories though there is
   a rule
   may exist server side rules as to which other files or directories.

5.8.2.26.  Attribute 40: quota_used

   The value in bytes that represents the amount of disc disk space used by
   this file or directory and possibly a number of other similar files
   or directories, where the set of "similar" meets at least the
   criterion that allocating space to any file or directory in the set
   will reduce the "quota_avail_hard" of every other file or directory
   in the set.

   Note that there may be a number of distinct but overlapping sets of
   files or directories for which a quota_used value is maintained,
   e.g., "all files with a given owner", "all files with a given group
   owner", etc.  The server is at liberty to choose any of those sets
   when providing the content of the quota_used attribute, but should do
   so in a repeatable way.  The rule may be configured per file system
   or may be "choose the set with the smallest quota".

5.8.2.27.  Attribute 41: rawdev

   Raw device number of file of type NF4BLK or NF4CHR.  The device
   number is split into major and minor numbers.  If the file's type
   attribute is not NF4BLK or NF4CHR, the value returned SHOULD NOT be
   considered useful.

5.8.2.28.  Attribute 42: space_avail

   Disk space in bytes available to this user on the file system
   containing this object -- this should be the smallest relevant limit.

5.8.2.29.  Attribute 43: space_free

   Free disk space in bytes on the file system containing this object --
   this should be the smallest relevant limit.

5.8.2.30.  Attribute 44: space_total

   Total disk space in bytes on the file system containing this object.

5.8.2.31.  Attribute 45: space_used

   Number of file system bytes allocated to this object.

5.8.2.32.  Attribute 46: system

   This attribute is TRUE if this file is a "system" file with respect
   to the Windows operating environment.

5.8.2.33.  Attribute 47: time_access

   The time_access attribute represents the time of last access to the
   object by a READ operation sent to the server.  The notion of what is
   an "access" depends on the server's operating environment and/or the
   server's file system semantics.  For example, for servers obeying
   Portable Operating System Interface (POSIX) semantics, time_access
   would be updated only by the READ and READDIR operations and not any
   of the operations that modify the content of the object [16], [17],
   [read_api], [readdir_api], [write_api].  Of course, setting the
   corresponding time_access_set attribute is another way to modify the
   time_access attribute.

   Whenever the file object resides on a writable file system, the
   server should make its best efforts to record time_access into stable
   storage.  However, to mitigate the performance effects of doing so,
   and most especially whenever the server is satisfying the read of the
   object's content from its cache, the server MAY cache access time
   updates and lazily write them to stable storage.  It is also
   acceptable to give administrators of the server the option to disable
   time_access updates.

5.8.2.34.  Attribute 48: time_access_set

   Sets the time of last access to the object.  SETATTR use only.

5.8.2.35.  Attribute 49: time_backup

   The time of last backup of the object.

5.8.2.36.  Attribute 50: time_create

   The time of creation of the object.  This attribute does not have any
   relation to the traditional UNIX file attribute "ctime" or "change
   time".

5.8.2.37.  Attribute 51: time_delta

   Smallest useful server time granularity.

5.8.2.38.  Attribute 52: time_metadata

   The time of last metadata modification of the object.

5.8.2.39.  Attribute 53: time_modify

   The time of last modification to the object.

5.8.2.40.  Attribute 54: time_modify_set

   Sets the time of last modification to the object.  SETATTR use only.

5.9.  Interpreting owner and owner_group

   The RECOMMENDED attributes "owner" and "owner_group" (and also users
   and groups within the "acl" attribute) are represented in terms of a
   UTF-8 string.  To avoid a representation that is tied to a particular
   underlying implementation at the client or server, the use of the
   UTF-8 string has been chosen.  Note that section 6.1 of RFC 2624
   [RFC2624] provides additional rationale.  It is expected that the
   client and server will have their own local representation of owner
   and owner_group that is used for local storage or presentation to the
   end user.  Therefore, it is expected that when these attributes are
   transferred between the client and server, the local representation
   is translated to a syntax of the form "user@dns_domain".  This will
   allow for a client and server that do not use the same local
   representation the ability to translate to a common syntax that can
   be interpreted by both.

   Similarly, security principals may be represented in different ways
   by different security mechanisms.  Servers normally translate these
   representations into a common format, generally that used by local
   storage, to serve as a means of identifying the users corresponding
   to these security principals.  When these local identifiers are
   translated to the form of the owner attribute, associated with files
   created by such principals, they identify, in a common format, the
   users associated with each corresponding set of security principals.

   The translation used to interpret owner and group strings is not
   specified as part of the protocol.  This allows various solutions to
   be employed.  For example, a local translation table may be consulted
   that maps a numeric identifier to the user@dns_domain syntax.  A name
   service may also be used to accomplish the translation.  A server may
   provide a more general service, not limited by any particular
   translation (which would only translate a limited set of possible
   strings) by storing the owner and owner_group attributes in local
   storage without any translation or it may augment a translation
   method by storing the entire string for attributes for which no
   translation is available while using the local representation for
   those cases in which a translation is available.

   Servers that do not provide support for all possible values of the
   owner and owner_group attributes SHOULD return an error
   (NFS4ERR_BADOWNER) when a string is presented that has no
   translation, as the value to be set for a SETATTR of the owner,
   owner_group, or acl attributes.  When a server does accept an owner
   or owner_group value as valid on a SETATTR (and similarly for the
   owner and group strings in an acl), it is promising to return that
   same string (for which see below) when a corresponding GETATTR is
   done.  For some internationalization-related exceptions where this is
   not possible, see below.  Configuration changes (including changes
   from the mapping of the string to the local representation) and ill-
   constructed name translations (those that contain aliasing) may make
   that promise impossible to honor.  Servers should make appropriate
   efforts to avoid a situation in which these attributes have their
   values changed when no real change to ownership has occurred.

   The "dns_domain" portion of the owner string is meant to be a DNS
   domain name.  For example, user@example.org.  Servers should accept
   as valid a set of users for at least one domain.  A server may treat
   other domains as having no valid translations.  A more general
   service is provided when a server is capable of accepting users for
   multiple domains, or for all domains, subject to security
   constraints.

   As an implementation guide, both clients and servers may provide a
   means to configure the "dns_domain" portion of the owner string.  For
   example, the DNS domain name might be "lab.example.org", but the user
   names are defined in "example.org".  In the absence of such a
   configuration, or as a default, the current DNS domain name of the
   server should be the value used for the "dns_domain".

   As mentioned above, it is desirable that a server when accepting a
   string of the form user@domain or group@domain in an attribute,
   return this same string when that corresponding attribute is fetched.
   Internationalization issues (for a general discussion of which see
   Section 12) may make this impossible and the client needs to take
   note of the following situations:

   o  The string representing the domain may be converted to equivalent
      U-label,
      U-label (see [RFC5890]), if presented using a form other than a a
      U-label.  See Section 12.6 12.4 for details.

   o  The user or group may be returned in a different form, due to
      normalization issues, although it will always be a canonically
      equivalent string.  See Section 12.7.3 for details.

   In the case where there is no translation available to the client or
   server, the attribute value will be constructed without the "@".
   Therefore, the absence of the "@" from the owner or owner_group
   attribute signifies that no translation was available at the sender
   and that the receiver of the attribute should not use that string as
   a basis for translation into its own internal format.  Even though
   the attribute value cannot be translated, it may still be useful.  In
   the case of a client, the attribute string may be used for local
   display of ownership.

   To provide a greater degree of compatibility with NFSv3, which
   identified users and groups by 32-bit unsigned user identifiers and
   group identifiers, owner and group strings that consist of ASCII-
   encoded decimal numeric values with no leading zeros can be given a
   special interpretation by clients and servers that choose to provide
   such support.  The receiver may treat such a user or group string as
   representing the same user as would be represented by an NFSv3 uid or
   gid having the corresponding numeric value.

   A server SHOULD reject such a numeric value if the security mechanism
   is kerberized.  I.e., in such a scenario, the client will already
   need to form "user@domain" strings.  For any other security
   mechanism, the server SHOULD accept such numeric values.  As an
   implementation note, the server could make such an acceptance be
   configurable.  If the server does not support numeric values or if it
   is configured off, then it MUST return an NFS4ERR_BADOWNER error.  If
   the security mechanism is kerberized and the client attempts to use
   the special form, then the server SHOULD return an NFS4ERR_BADOWNER
   error when there is a valid translation for the user or owner
   designated in this way.  In that case, the client must use the
   appropriate user@domain string and not the special form for
   compatibility.

   The client MUST always accept numeric values if the security
   mechanism is not RPCSEC_GSS.  A client can determine if a server
   supports numeric identifiers by first attempting to provide a numeric
   identifier.  If this attempt rejected with an NFS4ERR_BADOWNER error,
   the the client should only use named identifiers of the form "user@
   dns_domain".

   The owner string "nobody" may be used to designate an anonymous user,
   which will be associated with a file created by a security principal
   that cannot be mapped through normal means to the owner attribute.

5.10.  Character Case Attributes

   With respect to the case_insensitive and case_preserving attributes,
   each UCS-4 Universal Multiple-octet coded Character Set-4 (UCS-4)
   [ISO.10646-1.1993] character (which UTF-8 encodes) has a "long
   descriptive name" RFC1345 [RFC1345] which may or may not include the
   word "CAPITAL" or "SMALL".  The presence of SMALL or CAPITAL allows
   an NFS server to implement unambiguous and efficient table driven
   mappings for case insensitive comparisons, and non-case-preserving
   storage, although there are variations that occur additional
   characters with a name including "SMALL" or "CAPITAL" are added in a
   subsequent version of Unicode.

   For general character handling and internationalization issues, see
   Section 12.  For details regarding case mapping, see the section
   Case-based Mapping Used for Component4 Strings.

6.  Access Control Attributes

   Access Control Lists (ACLs) are file attributes that specify fine
   grained access control.  This chapter covers the "acl", "aclsupport",
   "mode", file attributes, and their interactions.  Note that file
   attributes may apply to any file system object.

6.1.  Goals

   ACLs and modes represent two well established models for specifying
   permissions.  This chapter specifies requirements that attempt to
   meet the following goals:

   o  If a server supports the mode attribute, it should provide
      reasonable semantics to clients that only set and retrieve the
      mode attribute.

   o  If a server supports ACL attributes, it should provide reasonable
      semantics to clients that only set and retrieve those attributes.

   o  On servers that support the mode attribute, if ACL attributes have
      never been set on an object, via inheritance or explicitly, the
      behavior should be traditional UNIX-like behavior.

   o  On servers that support the mode attribute, if the ACL attributes
      have been previously set on an object, either explicitly or via
      inheritance:

      *  Setting only the mode attribute should effectively control the
         traditional UNIX-like permissions of read, write, and execute
         on owner, owner_group, and other.

      *  Setting only the mode attribute should provide reasonable
         security.  For example, setting a mode of 000 should be enough
         to ensure that future opens for read or write by any principal
         fail, regardless of a previously existing or inherited ACL.

   o  When a mode attribute is set on an object, the ACL attributes may
      need to be modified so as to not conflict with the new mode.  In
      such cases, it is desirable that the ACL keep as much information
      as possible.  This includes information about inheritance, AUDIT
      and ALARM ACEs, and permissions granted and denied that do not
      conflict with the new mode.

6.2.  File Attributes Discussion

6.2.1.  Attribute 12: acl

   The NFSv4.0 ACL attribute contains an array of access control entries
   (ACEs) that are associated with the file system object.  Although the
   client can read and write the acl attribute, the server is
   responsible for using the ACL to perform access control.  The client
   can use the OPEN or ACCESS operations to check access without
   modifying or reading data or metadata.

   The NFS ACE structure is defined as follows:

   typedef uint32_t        acetype4;

   typedef uint32_t aceflag4;

   typedef uint32_t        acemask4;
   struct nfsace4 {
           acetype4                type;
           aceflag4                flag;
           acemask4                access_mask;
           utf8val_REQUIRED4
           utf8str_mixed           who;
   };

   To determine if a request succeeds, the server processes each nfsace4
   entry in order.  Only ACEs which have a "who" that matches the
   requester are considered.  Each ACE is processed until all of the
   bits of the requester's access have been ALLOWED.  Once a bit (see
   below) has been ALLOWED by an ACCESS_ALLOWED_ACE, it is no longer
   considered in the processing of later ACEs.  If an ACCESS_DENIED_ACE
   is encountered where the requester's access still has unALLOWED bits
   in common with the "access_mask" of the ACE, the request is denied.
   When the ACL is fully processed, if there are bits in the requester's
   mask that have not been ALLOWED or DENIED, access is denied.

   Unlike the ALLOW and DENY ACE types, the ALARM and AUDIT ACE types do
   not affect a requester's access, and instead are for triggering
   events as a result of a requester's access attempt.  Therefore, AUDIT
   and ALARM ACEs are processed only after processing ALLOW and DENY
   ACEs.

   The NFSv4.0 ACL model is quite rich.  Some server platforms may
   provide access control functionality that goes beyond the UNIX-style
   mode attribute, but which is not as rich as the NFS ACL model.  So
   that users can take advantage of this more limited functionality, the
   server may support the acl attributes by mapping between its ACL
   model and the NFSv4.0 ACL model.  Servers must ensure that the ACL
   they actually store or enforce is at least as strict as the NFSv4 ACL
   that was set.  It is tempting to accomplish this by rejecting any ACL
   that falls outside the small set that can be represented accurately.
   However, such an approach can render ACLs unusable without special
   client-side knowledge of the server's mapping, which defeats the
   purpose of having a common NFSv4 ACL protocol.  Therefore servers
   should accept every ACL that they can without compromising security.
   To help accomplish this, servers may make a special exception, in the
   case of unsupported permission bits, to the rule that bits not
   ALLOWED or DENIED by an ACL must be denied.  For example, a UNIX-
   style server might choose to silently allow read attribute
   permissions even though an ACL does not explicitly allow those
   permissions.  (An ACL that explicitly denies permission to read
   attributes should still be rejected.)

   The situation is complicated by the fact that a server may have
   multiple modules that enforce ACLs.  For example, the enforcement for
   NFSv4.0 access may be different from, but not weaker than, the
   enforcement for local access, and both may be different from the
   enforcement for access through other protocols such as SMB. Server Message
   Block (SMB).  So it may be useful for a server to accept an ACL even
   if not all of its modules are able to support it.

   The guiding principle with regard to NFSv4 access is that the server
   must not accept ACLs that appear to make access to the file more
   restrictive than it really is.

6.2.1.1.  ACE Type

   The constants used for the type field (acetype4) are as follows:

   const ACE4_ACCESS_ALLOWED_ACE_TYPE      = 0x00000000;
   const ACE4_ACCESS_DENIED_ACE_TYPE       = 0x00000001;
   const ACE4_SYSTEM_AUDIT_ACE_TYPE        = 0x00000002;
   const ACE4_SYSTEM_ALARM_ACE_TYPE        = 0x00000003;

   All four but bit types are permitted in the acl attribute.

   +------------------------------+--------------+---------------------+
   | Value                        | Abbreviation | Description         |
   +------------------------------+--------------+---------------------+
   | ACE4_ACCESS_ALLOWED_ACE_TYPE | ALLOW        | Explicitly grants   |
   |                              |              | the access defined  |
   |                              |              | in acemask4 to the  |
   |                              |              | file or directory.  |
   | ACE4_ACCESS_DENIED_ACE_TYPE  | DENY         | Explicitly denies   |
   |                              |              | the access defined  |
   |                              |              | in acemask4 to the  |
   |                              |              | file or directory.  |
   | ACE4_SYSTEM_AUDIT_ACE_TYPE   | AUDIT        | LOG (in a system    |
   |                              |              | dependent way) any  |
   |                              |              | access attempt to a |
   |                              |              | file or directory   |
   |                              |              | which uses any of   |
   |                              |              | the access methods  |
   |                              |              | specified in        |
   |                              |              | acemask4.           |
   | ACE4_SYSTEM_ALARM_ACE_TYPE   | ALARM        | Generate a system   |
   |                              |              | ALARM (system       |
   |                              |              | dependent) when any |
   |                              |              | access attempt is   |
   |                              |              | made to a file or   |
   |                              |              | directory for the   |
   |                              |              | access methods      |
   |                              |              | specified in        |
   |                              |              | acemask4.           |
   +------------------------------+--------------+---------------------+

    The "Abbreviation" column denotes how the types will be referred to
                   throughout the rest of this chapter.

6.2.1.2.  Attribute 13: aclsupport

   A server need not support all of the above ACE types.  This attribute
   indicates which ACE types are supported for the current file system.
   The bitmask constants used to represent the above definitions within
   the aclsupport attribute are as follows:

   const ACL4_SUPPORT_ALLOW_ACL    = 0x00000001;
   const ACL4_SUPPORT_DENY_ACL     = 0x00000002;
   const ACL4_SUPPORT_AUDIT_ACL    = 0x00000004;
   const ACL4_SUPPORT_ALARM_ACL    = 0x00000008;

   Servers which support either the ALLOW or DENY ACE type SHOULD
   support both ALLOW and DENY ACE types.

   Clients should not attempt to set an ACE unless the server claims
   support for that ACE type.  If the server receives a request to set
   an ACE that it cannot store, it MUST reject the request with
   NFS4ERR_ATTRNOTSUPP.  If the server receives a request to set an ACE
   that it can store but cannot enforce, the server SHOULD reject the
   request with NFS4ERR_ATTRNOTSUPP.

   Support for any of the ACL attributes is optional (albeit,
   RECOMMENDED).

6.2.1.3.  ACE Access Mask

   The bitmask constants used for the access mask field are as follows:

   const ACE4_READ_DATA            = 0x00000001;
   const ACE4_LIST_DIRECTORY       = 0x00000001;
   const ACE4_WRITE_DATA           = 0x00000002;
   const ACE4_ADD_FILE             = 0x00000002;
   const ACE4_APPEND_DATA          = 0x00000004;
   const ACE4_ADD_SUBDIRECTORY     = 0x00000004;
   const ACE4_READ_NAMED_ATTRS     = 0x00000008;
   const ACE4_WRITE_NAMED_ATTRS    = 0x00000010;
   const ACE4_EXECUTE              = 0x00000020;
   const ACE4_DELETE_CHILD         = 0x00000040;
   const ACE4_READ_ATTRIBUTES      = 0x00000080;
   const ACE4_WRITE_ATTRIBUTES     = 0x00000100;

   const ACE4_DELETE               = 0x00010000;
   const ACE4_READ_ACL             = 0x00020000;
   const ACE4_WRITE_ACL            = 0x00040000;
   const ACE4_WRITE_OWNER          = 0x00080000;
   const ACE4_SYNCHRONIZE          = 0x00100000;

   Note that some masks have coincident values, for example,
   ACE4_READ_DATA and ACE4_LIST_DIRECTORY.  The mask entries
   ACE4_LIST_DIRECTORY, ACE4_ADD_FILE, and ACE4_ADD_SUBDIRECTORY are
   intended to be used with directory objects, while ACE4_READ_DATA,
   ACE4_WRITE_DATA, and ACE4_APPEND_DATA are intended to be used with
   non-directory objects.

6.2.1.3.1.  Discussion of Mask Attributes

   ACE4_READ_DATA

      Operation(s) affected:

         READ

         OPEN

      Discussion:

         Permission to read the data of the file.

         Servers SHOULD allow a user the ability to read the data of the
         file when only the ACE4_EXECUTE access mask bit is allowed.

   ACE4_LIST_DIRECTORY
      Operation(s) affected:

         READDIR

      Discussion:

         Permission to list the contents of a directory.

   ACE4_WRITE_DATA

      Operation(s) affected:

         WRITE

         OPEN

         SETATTR of size

      Discussion:

         Permission to modify a file's data.

   ACE4_ADD_FILE

      Operation(s) affected:

         CREATE

         LINK

         OPEN

         RENAME

      Discussion:

         Permission to add a new file in a directory.  The CREATE
         operation is affected when nfs_ftype4 is NF4LNK, NF4BLK,
         NF4CHR, NF4SOCK, or NF4FIFO.  (NF4DIR is not listed because it
         is covered by ACE4_ADD_SUBDIRECTORY.)  OPEN is affected when
         used to create a regular file.  LINK and RENAME are always
         affected.

   ACE4_APPEND_DATA
      Operation(s) affected:

         WRITE

         OPEN

         SETATTR of size

      Discussion:

         The ability to modify a file's data, but only starting at EOF.
         This allows for the notion of append-only files, by allowing
         ACE4_APPEND_DATA and denying ACE4_WRITE_DATA to the same user
         or group.  If a file has an ACL such as the one described above
         and a WRITE request is made for somewhere other than EOF, the
         server SHOULD return NFS4ERR_ACCESS.

   ACE4_ADD_SUBDIRECTORY

      Operation(s) affected:

         CREATE

         RENAME

      Discussion:

         Permission to create a subdirectory in a directory.  The CREATE
         operation is affected when nfs_ftype4 is NF4DIR.  The RENAME
         operation is always affected.

   ACE4_READ_NAMED_ATTRS

      Operation(s) affected:

         OPENATTR

      Discussion:

         Permission to read the named attributes of a file or to lookup
         the named attributes directory.  OPENATTR is affected when it
         is not used to create a named attribute directory.  This is
         when 1.) createdir is TRUE, but a named attribute directory
         already exists, or 2.) createdir is FALSE.

   ACE4_WRITE_NAMED_ATTRS

      Operation(s) affected:

         OPENATTR

      Discussion:

         Permission to write the named attributes of a file or to create
         a named attribute directory.  OPENATTR is affected when it is
         used to create a named attribute directory.  This is when
         createdir is TRUE and no named attribute directory exists.  The
         ability to check whether or not a named attribute directory
         exists depends on the ability to look it up, therefore, users
         also need the ACE4_READ_NAMED_ATTRS permission in order to
         create a named attribute directory.

   ACE4_EXECUTE

      Operation(s) affected:

         READ

      Discussion:

         Permission to execute a file.

         Servers SHOULD allow a user the ability to read the data of the
         file when only the ACE4_EXECUTE access mask bit is allowed.
         This is because there is no way to execute a file without
         reading the contents.  Though a server may treat ACE4_EXECUTE
         and ACE4_READ_DATA bits identically when deciding to permit a
         READ operation, it SHOULD still allow the two bits to be set
         independently in ACLs, and MUST distinguish between them when
         replying to ACCESS operations.  In particular, servers SHOULD
         NOT silently turn on one of the two bits when the other is set,
         as that would make it impossible for the client to correctly
         enforce the distinction between read and execute permissions.

         As an example, following a SETATTR of the following ACL:

         nfsuser:ACE4_EXECUTE:ALLOW

         A subsequent GETATTR of ACL for that file SHOULD return:

         nfsuser:ACE4_EXECUTE:ALLOW
         Rather than:

         nfsuser:ACE4_EXECUTE/ACE4_READ_DATA:ALLOW

   ACE4_EXECUTE

      Operation(s) affected:

         LOOKUP

         OPEN

         REMOVE

         RENAME

         LINK

         CREATE

      Discussion:

         Permission to traverse/search a directory.

   ACE4_DELETE_CHILD

      Operation(s) affected:

         REMOVE

         RENAME

      Discussion:

         Permission to delete a file or directory within a directory.
         See Section 6.2.1.3.2 for information on how ACE4_DELETE and
         ACE4_DELETE_CHILD interact.

   ACE4_READ_ATTRIBUTES

      Operation(s) affected:

         GETATTR of file system object attributes

         VERIFY
         NVERIFY

         READDIR

      Discussion:

         The ability to read basic attributes (non-ACLs) of a file.  On
         a UNIX system, basic attributes can be thought of as the stat
         level attributes.  Allowing this access mask bit would mean the
         entity can execute "ls -l" and stat.  If a READDIR operation
         requests attributes, this mask must be allowed for the READDIR
         to succeed.

   ACE4_WRITE_ATTRIBUTES

      Operation(s) affected:

         SETATTR of time_access_set, time_backup,

         time_create, time_modify_set, mimetype, hidden, system

      Discussion:

         Permission to change the times associated with a file or
         directory to an arbitrary value.  Also permission to change the
         mimetype, hidden and system attributes.  A user having
         ACE4_WRITE_DATA or ACE4_WRITE_ATTRIBUTES will be allowed to set
         the times associated with a file to the current server time.

   ACE4_DELETE

      Operation(s) affected:

         REMOVE

      Discussion:

         Permission to delete the file or directory.  See
         Section 6.2.1.3.2 for information on ACE4_DELETE and
         ACE4_DELETE_CHILD interact.

   ACE4_READ_ACL

      Operation(s) affected:

         GETATTR of acl

         NVERIFY

         VERIFY

      Discussion:

         Permission to read the ACL.

   ACE4_WRITE_ACL

      Operation(s) affected:

         SETATTR of acl and mode

      Discussion:

         Permission to write the acl and mode attributes.

   ACE4_WRITE_OWNER

      Operation(s) affected:

         SETATTR of owner and owner_group

      Discussion:

         Permission to write the owner and owner_group attributes.  On
         UNIX systems, this is the ability to execute chown() and
         chgrp().

   ACE4_SYNCHRONIZE

      Operation(s) affected:

         NONE

      Discussion:

         Permission to use the file object as a synchronization
         primitive for interprocess communication.  This permission is
         not enforced or interpreted by the NFSv4.0 server on behalf of
         the client.

         Typically, the ACE4_SYNCHRONIZE permission is only meaningful
         on local file systems, i.e., file systems not accessed via
         NFSv4.0.  The reason that the permission bit exists is that
         some operating environments, such as Windows, use
         ACE4_SYNCHRONIZE.

         For example, if a client copies a file that has
         ACE4_SYNCHRONIZE set from a local file system to an NFSv4.0
         server, and then later copies the file from the NFSv4.0 server
         to a local file system, it is likely that if ACE4_SYNCHRONIZE
         was set in the original file, the client will want it set in
         the second copy.  The first copy will not have the permission
         set unless the NFSv4.0 server has the means to set the
         ACE4_SYNCHRONIZE bit.  The second copy will not have the
         permission set unless the NFSv4.0 server has the means to
         retrieve the ACE4_SYNCHRONIZE bit.

   Server implementations need not provide the granularity of control
   that is implied by this list of masks.  For example, POSIX-based
   systems might not distinguish ACE4_APPEND_DATA (the ability to append
   to a file) from ACE4_WRITE_DATA (the ability to modify existing
   contents); both masks would be tied to a single "write" permission.
   When such a server returns attributes to the client, it would show
   both ACE4_APPEND_DATA and ACE4_WRITE_DATA if and only if the write
   permission is enabled.

   If a server receives a SETATTR request that it cannot accurately
   implement, it should err in the direction of more restricted access,
   except in the previously discussed cases of execute and read.  For
   example, suppose a server cannot distinguish overwriting data from
   appending new data, as described in the previous paragraph.  If a
   client submits an ALLOW ACE where ACE4_APPEND_DATA is set but
   ACE4_WRITE_DATA is not (or vice versa), the server should either turn
   off ACE4_APPEND_DATA or reject the request with NFS4ERR_ATTRNOTSUPP.

6.2.1.3.2.  ACE4_DELETE vs. ACE4_DELETE_CHILD

   Two access mask bits govern the ability to delete a directory entry:
   ACE4_DELETE on the object itself (the "target"), and
   ACE4_DELETE_CHILD on the containing directory (the "parent").

   Many systems also take the "sticky bit" (MODE4_SVTX) on a directory
   to allow unlink only to a user that owns either the target or the
   parent; on some such systems the decision also depends on whether the
   target is writable.

   Servers SHOULD allow unlink if either ACE4_DELETE is permitted on the
   target, or ACE4_DELETE_CHILD is permitted on the parent.  (Note that
   this is true even if the parent or target explicitly denies one of
   these permissions.)
   If the ACLs in question neither explicitly ALLOW nor DENY either of
   the above, and if MODE4_SVTX is not set on the parent, then the
   server SHOULD allow the removal if and only if ACE4_ADD_FILE is
   permitted.  In the case where MODE4_SVTX is set, the server may also
   require the remover to own either the parent or the target, or may
   require the target to be writable.

   This allows servers to support something close to traditional UNIX-
   like semantics, with ACE4_ADD_FILE taking the place of the write bit.

6.2.1.4.  ACE flag

   The bitmask constants used for the flag field are as follows:

   const ACE4_FILE_INHERIT_ACE             = 0x00000001;
   const ACE4_DIRECTORY_INHERIT_ACE        = 0x00000002;
   const ACE4_NO_PROPAGATE_INHERIT_ACE     = 0x00000004;
   const ACE4_INHERIT_ONLY_ACE             = 0x00000008;
   const ACE4_SUCCESSFUL_ACCESS_ACE_FLAG   = 0x00000010;
   const ACE4_FAILED_ACCESS_ACE_FLAG       = 0x00000020;
   const ACE4_IDENTIFIER_GROUP             = 0x00000040;

   A server need not support any of these flags.  If the server supports
   flags that are similar to, but not exactly the same as, these flags,
   the implementation may define a mapping between the protocol-defined
   flags and the implementation-defined flags.

   For example, suppose a client tries to set an ACE with
   ACE4_FILE_INHERIT_ACE set but not ACE4_DIRECTORY_INHERIT_ACE.  If the
   server does not support any form of ACL inheritance, the server
   should reject the request with NFS4ERR_ATTRNOTSUPP.  If the server
   supports a single "inherit ACE" flag that applies to both files and
   directories, the server may reject the request (i.e., requiring the
   client to set both the file and directory inheritance flags).  The
   server may also accept the request and silently turn on the
   ACE4_DIRECTORY_INHERIT_ACE flag.

6.2.1.4.1.  Discussion of Flag Bits

   ACE4_FILE_INHERIT_ACE
      Any non-directory file in any sub-directory will get this ACE
      inherited.

   ACE4_DIRECTORY_INHERIT_ACE
      Can be placed on a directory and indicates that this ACE should be
      added to each new directory created.
      If this flag is set in an ACE in an ACL attribute to be set on a
      non-directory file system object, the operation attempting to set
      the ACL SHOULD fail with NFS4ERR_ATTRNOTSUPP.

   ACE4_INHERIT_ONLY_ACE
      Can be placed on a directory but does not apply to the directory;
      ALLOW and DENY ACEs with this bit set do not affect access to the
      directory, and AUDIT and ALARM ACEs with this bit set do not
      trigger log or alarm events.  Such ACEs only take effect once they
      are applied (with this bit cleared) to newly created files and
      directories as specified by the above two flags.
      If this flag is present on an ACE, but neither
      ACE4_DIRECTORY_INHERIT_ACE nor ACE4_FILE_INHERIT_ACE is present,
      then an operation attempting to set such an attribute SHOULD fail
      with NFS4ERR_ATTRNOTSUPP.

   ACE4_NO_PROPAGATE_INHERIT_ACE
      Can be placed on a directory.  This flag tells the server that
      inheritance of this ACE should stop at newly created child
      directories.

   ACE4_SUCCESSFUL_ACCESS_ACE_FLAG

   ACE4_FAILED_ACCESS_ACE_FLAG
      The ACE4_SUCCESSFUL_ACCESS_ACE_FLAG (SUCCESS) and
      ACE4_FAILED_ACCESS_ACE_FLAG (FAILED) flag bits may be set only on
      ACE4_SYSTEM_AUDIT_ACE_TYPE (AUDIT) and ACE4_SYSTEM_ALARM_ACE_TYPE
      (ALARM) ACE types.  If during the processing of the file's ACL,
      the server encounters an AUDIT or ALARM ACE that matches the
      principal attempting the OPEN, the server notes that fact, and the
      presence, if any, of the SUCCESS and FAILED flags encountered in
      the AUDIT or ALARM ACE.  Once the server completes the ACL
      processing, it then notes if the operation succeeded or failed.
      If the operation succeeded, and if the SUCCESS flag was set for a
      matching AUDIT or ALARM ACE, then the appropriate AUDIT or ALARM
      event occurs.  If the operation failed, and if the FAILED flag was
      set for the matching AUDIT or ALARM ACE, then the appropriate
      AUDIT or ALARM event occurs.  Either or both of the SUCCESS or
      FAILED can be set, but if neither is set, the AUDIT or ALARM ACE
      is not useful.

      The previously described processing applies to ACCESS operations
      even when they return NFS4_OK.  For the purposes of AUDIT and
      ALARM, we consider an ACCESS operation to be a "failure" if it
      fails to return a bit that was requested and supported.

   ACE4_IDENTIFIER_GROUP
      Indicates that the "who" refers to a GROUP as defined under UNIX
      or a GROUP ACCOUNT as defined under Windows.  Clients and servers
      MUST ignore the ACE4_IDENTIFIER_GROUP flag on ACEs with a who
      value equal to one of the special identifiers outlined in
      Section 6.2.1.5.

6.2.1.5.  ACE Who

   The "who" field of an ACE is an identifier that specifies the
   principal or principals to whom the ACE applies.  It may refer to a
   user or a group, with the flag bit ACE4_IDENTIFIER_GROUP specifying
   which.

   There are several special identifiers which need to be understood
   universally, rather than in the context of a particular DNS domain.
   Some of these identifiers cannot be understood when an NFS client
   accesses the server, but have meaning when a local process accesses
   the file.  The ability to display and modify these permissions is
   permitted over NFS, even if none of the access methods on the server
   understands the identifiers.

   +---------------+--------------------------------------------------+
   | Who           | Description                                      |
   +---------------+--------------------------------------------------+
   | OWNER         | The owner of the file                            |
   | GROUP         | The group associated with the file.              |
   | EVERYONE      | The world, including the owner and owning group. |
   | INTERACTIVE   | Accessed from an interactive terminal.           |
   | NETWORK       | Accessed via the network.                        |
   | DIALUP        | Accessed as a dialup user to the server.         |
   | BATCH         | Accessed from a batch job.                       |
   | ANONYMOUS     | Accessed without any authentication.             |
   | AUTHENTICATED | Any authenticated user (opposite of ANONYMOUS)   |
   | SERVICE       | Access from a system service.                    |
   +---------------+--------------------------------------------------+

                                  Table 4

   To avoid conflict, these special identifiers are distinguished by an
   appended "@" and should appear in the form "xxxx@" (with no domain
   name after the "@").  For example: ANONYMOUS@.

   The ACE4_IDENTIFIER_GROUP flag MUST be ignored on entries with these
   special identifiers.  When encoding entries with these special
   identifiers, the ACE4_IDENTIFIER_GROUP flag SHOULD be set to zero.

6.2.1.5.1.  Discussion of EVERYONE@

   It is important to note that "EVERYONE@" is not equivalent to the
   UNIX "other" entity.  This is because, by definition, UNIX "other"
   does not include the owner or owning group of a file.  "EVERYONE@"
   means literally everyone, including the owner or owning group.

6.2.2.  Attribute 33: mode

   The NFSv4.0 mode attribute is based on the UNIX mode bits.  The
   following bits are defined:

   const MODE4_SUID = 0x800;  /* set user id on execution */
   const MODE4_SGID = 0x400;  /* set group id on execution */
   const MODE4_SVTX = 0x200;  /* save text even after use */
   const MODE4_RUSR = 0x100;  /* read permission: owner */
   const MODE4_WUSR = 0x080;  /* write permission: owner */
   const MODE4_XUSR = 0x040;  /* execute permission: owner */
   const MODE4_RGRP = 0x020;  /* read permission: group */
   const MODE4_WGRP = 0x010;  /* write permission: group */
   const MODE4_XGRP = 0x008;  /* execute permission: group */
   const MODE4_ROTH = 0x004;  /* read permission: other */
   const MODE4_WOTH = 0x002;  /* write permission: other */
   const MODE4_XOTH = 0x001;  /* execute permission: other */

   Bits MODE4_RUSR, MODE4_WUSR, and MODE4_XUSR apply to the principal
   identified in the owner attribute.  Bits MODE4_RGRP, MODE4_WGRP, and
   MODE4_XGRP apply to principals identified in the owner_group
   attribute but who are not identified in the owner attribute.  Bits
   MODE4_ROTH, MODE4_WOTH, MODE4_XOTH apply to any principal that does
   not match that in the owner attribute, and does not have a group
   matching that of the owner_group attribute.

   Bits within the mode other than those specified above are not defined
   by this protocol.  A server MUST NOT return bits other than those
   defined above in a GETATTR or READDIR operation, and it MUST return
   NFS4ERR_INVAL if bits other than those defined above are set in a
   SETATTR, CREATE, OPEN, VERIFY or NVERIFY operation.

6.3.  Common Methods

   The requirements in this section will be referred to in future
   sections, especially Section 6.4.

6.3.1.  Interpreting an ACL

6.3.1.1.  Server Considerations

   The server uses the algorithm described in Section 6.2.1 to determine
   whether an ACL allows access to an object.  However, the ACL may not
   be the sole determiner of access.  For example:

   o  In the case of a file system exported as read-only, the server may
      deny write permissions even though an object's ACL grants it.

   o  Server implementations MAY grant ACE4_WRITE_ACL and ACE4_READ_ACL
      permissions to prevent a situation from arising in which there is
      no valid way to ever modify the ACL.

   o  All servers will allow a user the ability to read the data of the
      file when only the execute permission is granted (i.e., If the ACL
      denies the user the ACE4_READ_DATA access and allows the user
      ACE4_EXECUTE, the server will allow the user to read the data of
      the file).

   o  Many servers have the notion of owner-override in which the owner
      of the object is allowed to override accesses that are denied by
      the ACL.  This may be helpful, for example, to allow users
      continued access to open files on which the permissions have
      changed.

   o  Many servers have the notion of a "superuser" that has privileges
      beyond an ordinary user.  The superuser may be able to read or
      write data or metadata in ways that would not be permitted by the
      ACL.

6.3.1.2.  Client Considerations

   Clients SHOULD NOT do their own access checks based on their
   interpretation the ACL, but rather use the OPEN and ACCESS operations
   to do access checks.  This allows the client to act on the results of
   having the server determine whether or not access should be granted
   based on its interpretation of the ACL.

   Clients must be aware of situations in which an object's ACL will
   define a certain access even though the server will not enforce it.
   In general, but especially in these situations, the client needs to
   do its part in the enforcement of access as defined by the ACL.  To
   do this, the client MAY send the appropriate ACCESS operation prior
   to servicing the request of the user or application in order to
   determine whether the user or application should be granted the
   access requested.  For examples in which the ACL may define accesses
   that the server doesn't enforce see Section 6.3.1.1.

6.3.2.  Computing a Mode Attribute from an ACL

   The following method can be used to calculate the MODE4_R*, MODE4_W*
   and MODE4_X* bits of a mode attribute, based upon an ACL.

   First, for each of the special identifiers OWNER@, GROUP@, and
   EVERYONE@, evaluate the ACL in order, considering only ALLOW and DENY
   ACEs for the identifier EVERYONE@ and for the identifier under
   consideration.  The result of the evaluation will be an NFSv4 ACL
   mask showing exactly which bits are permitted to that identifier.

   Then translate the calculated mask for OWNER@, GROUP@, and EVERYONE@
   into mode bits for, respectively, the user, group, and other, as
   follows:

   1.  Set the read bit (MODE4_RUSR, MODE4_RGRP, or MODE4_ROTH) if and
       only if ACE4_READ_DATA is set in the corresponding mask.

   2.  Set the write bit (MODE4_WUSR, MODE4_WGRP, or MODE4_WOTH) if and
       only if ACE4_WRITE_DATA and ACE4_APPEND_DATA are both set in the
       corresponding mask.

   3.  Set the execute bit (MODE4_XUSR, MODE4_XGRP, or MODE4_XOTH), if
       and only if ACE4_EXECUTE is set in the corresponding mask.

6.3.2.1.  Discussion

   Some server implementations also add bits permitted to named users
   and groups to the group bits (MODE4_RGRP, MODE4_WGRP, and
   MODE4_XGRP).

   Implementations are discouraged from doing this, because it has been
   found to cause confusion for users who see members of a file's group
   denied access that the mode bits appear to allow.  (The presence of
   DENY ACEs may also lead to such behavior, but DENY ACEs are expected
   to be more rarely used.)

   The same user confusion seen when fetching the mode also results if
   setting the mode does not effectively control permissions for the
   owner, group, and other users; this motivates some of the
   requirements that follow.

6.4.  Requirements

   The server that supports both mode and ACL must take care to
   synchronize the MODE4_*USR, MODE4_*GRP, and MODE4_*OTH bits with the
   ACEs which have respective who fields of "OWNER@", "GROUP@", and
   "EVERYONE@" so that the client can see semantically equivalent access
   permissions exist whether the client asks for owner, owner_group and
   mode attributes, or for just the ACL.

   In this section, much is made of the methods in Section 6.3.2.

   Many requirements refer to this section.  But Section 6.3.2, but note that the methods
   have behaviors specified with "SHOULD".  This is intentional, to
   avoid invalidating existing implementations that compute the mode
   according to the withdrawn POSIX ACL draft (1003.1e draft 17), ([P1003.1e]), rather than
   by actual permissions on owner, group, and other.

6.4.1.  Setting the mode and/or ACL Attributes

6.4.1.1.  Setting mode and not ACL

   When any of the nine low-order mode bits are changed because the mode
   attribute was set, and no ACL attribute is explicitly set, the acl
   attribute must be modified in accordance with the updated value of
   those bits.  This must happen even if the value of the low-order bits
   is the same after the mode is set as before.

   Note that any AUDIT or ALARM ACEs are unaffected by changes to the
   mode.

   In cases in which the permissions bits are subject to change, the acl
   attribute MUST be modified such that the mode computed via the method
   in Section 6.3.2 yields the low-order nine bits (MODE4_R*, MODE4_W*,
   MODE4_X*) of the mode attribute as modified by the attribute change.
   The ACL attributes SHOULD also be modified such that:

   1.  If MODE4_RGRP is not set, entities explicitly listed in the ACL
       other than OWNER@ and EVERYONE@ SHOULD NOT be granted
       ACE4_READ_DATA.

   2.  If MODE4_WGRP is not set, entities explicitly listed in the ACL
       other than OWNER@ and EVERYONE@ SHOULD NOT be granted
       ACE4_WRITE_DATA or ACE4_APPEND_DATA.

   3.  If MODE4_XGRP is not set, entities explicitly listed in the ACL
       other than OWNER@ and EVERYONE@ SHOULD NOT be granted
       ACE4_EXECUTE.

   Access mask bits other than those listed above, appearing in ALLOW
   ACEs, MAY also be disabled.

   Note that ACEs with the flag ACE4_INHERIT_ONLY_ACE set do not affect
   the permissions of the ACL itself, nor do ACEs of the type AUDIT and
   ALARM.  As such, it is desirable to leave these ACEs unmodified when
   modifying the ACL attributes.

   Also note that the requirement may be met by discarding the acl in
   favor of an ACL that represents the mode and only the mode.  This is
   permitted, but it is preferable for a server to preserve as much of
   the ACL as possible without violating the above requirements.
   Discarding the ACL makes it effectively impossible for a file created
   with a mode attribute to inherit an ACL (see Section 6.4.3).

6.4.1.2.  Setting ACL and not mode

   When setting the acl and not setting the mode attribute, the
   permission bits of the mode need to be derived from the ACL.  In this
   case, the ACL attribute SHOULD be set as given.  The nine low-order
   bits of the mode attribute (MODE4_R*, MODE4_W*, MODE4_X*) MUST be
   modified to match the result of the method Section 6.3.2.  The three
   high-order bits of the mode (MODE4_SUID, MODE4_SGID, MODE4_SVTX)
   SHOULD remain unchanged.

6.4.1.3.  Setting both ACL and mode

   When setting both the mode and the acl attribute in the same
   operation, the attributes MUST be applied in this order: mode, then
   ACL.  The mode-related attribute is set as given, then the ACL
   attribute is set as given, possibly changing the final mode, as
   described above in Section 6.4.1.2.

6.4.2.  Retrieving the mode and/or ACL Attributes

   This section applies only to servers that support both the mode and
   ACL attributes.

   Some server implementations may have a concept of "objects without
   ACLs", meaning that all permissions are granted and denied according
   to the mode attribute, and that no ACL attribute is stored for that
   object.  If an ACL attribute is requested of such a server, the
   server SHOULD return an ACL that does not conflict with the mode;
   that is to say, the ACL returned SHOULD represent the nine low-order
   bits of the mode attribute (MODE4_R*, MODE4_W*, MODE4_X*) as
   described in Section 6.3.2.

   For other server implementations, the ACL attribute is always present
   for every object.  Such servers SHOULD store at least the three high-
   order bits of the mode attribute (MODE4_SUID, MODE4_SGID,
   MODE4_SVTX).  The server SHOULD return a mode attribute if one is
   requested, and the low-order nine bits of the mode (MODE4_R*,
   MODE4_W*, MODE4_X*) MUST match the result of applying the method in
   Section 6.3.2 to the ACL attribute.

6.4.3.  Creating New Objects

   If a server supports any ACL attributes, it may use the ACL
   attributes on the parent directory to compute an initial ACL
   attribute for a newly created object.  This will be referred to as
   the inherited ACL within this section.  The act of adding one or more
   ACEs to the inherited ACL that are based upon ACEs in the parent
   directory's ACL will be referred to as inheriting an ACE within this
   section.

   Implementors should standardize on what the behavior of CREATE and
   OPEN must be depending on

   In the presence or absence of the mode and ACL
   attributes. attributes, the
   behavior of CREATE and OPEN SHOULD be:

   1.  If just the mode is given in the call:

       In this case, inheritance SHOULD take place, but the mode MUST be
       applied to the inherited ACL as described in Section 6.4.1.1,
       thereby modifying the ACL.

   2.  If just the ACL is given in the call:

       In this case, inheritance SHOULD NOT take place, and the ACL as
       defined in the CREATE or OPEN will be set without modification,
       and the mode modified as in Section 6.4.1.2

   3.  If both mode and ACL are given in the call:

       In this case, inheritance SHOULD NOT take place, and both
       attributes will be set as described in Section 6.4.1.3.

   4.  If neither mode nor ACL are given in the call:

       In the case where an object is being created without any initial
       attributes at all, e.g., an OPEN operation with an opentype4 of
       OPEN4_CREATE and a createmode4 of EXCLUSIVE4, inheritance SHOULD
       NOT take place.  Instead, the server SHOULD set permissions to
       deny all access to the newly created object.  It is expected that
       the appropriate client will set the desired attributes in a
       subsequent SETATTR operation, and the server SHOULD allow that
       operation to succeed, regardless of what permissions the object
       is created with.  For example, an empty ACL denies all
       permissions, but the server should allow the owner's SETATTR to
       succeed even though WRITE_ACL is implicitly denied.

       In other cases, inheritance SHOULD take place, and no
       modifications to the ACL will happen.  The mode attribute, if
       supported, MUST be as computed in Section 6.3.2, with the
       MODE4_SUID, MODE4_SGID and MODE4_SVTX bits clear.  If no
       inheritable ACEs exist on the parent directory, the rules for
       creating acl attributes are implementation defined.

6.4.3.1.  The Inherited ACL

   If the object being created is not a directory, the inherited ACL
   SHOULD NOT inherit ACEs from the parent directory ACL unless the
   ACE4_FILE_INHERIT_FLAG is set.

   If the object being created is a directory, the inherited ACL should
   inherit all inheritable ACEs from the parent directory, those that
   have ACE4_FILE_INHERIT_ACE or ACE4_DIRECTORY_INHERIT_ACE flag set.
   If the inheritable ACE has ACE4_FILE_INHERIT_ACE set, but
   ACE4_DIRECTORY_INHERIT_ACE is clear, the inherited ACE on the newly
   created directory MUST have the ACE4_INHERIT_ONLY_ACE flag set to
   prevent the directory from being affected by ACEs meant for non-
   directories.

   When a new directory is created, the server MAY split any inherited
   ACE which is both inheritable and effective (in other words, which
   has neither ACE4_INHERIT_ONLY_ACE nor ACE4_NO_PROPAGATE_INHERIT_ACE
   set), into two ACEs, one with no inheritance flags, and one with
   ACE4_INHERIT_ONLY_ACE set.  This makes it simpler to modify the
   effective permissions on the directory without modifying the ACE
   which is to be inherited to the new directory's children.

7.  Multi-Server Namespace

   NFSv4 supports attributes that allow  NFS Server Name Space

7.1.  Server Exports

   On a namespace to extend beyond UNIX server the
   boundaries of a single server.  It is RECOMMENDED that clients and
   servers support construction of such multi-server namespaces.  Use of
   such multi-server namespaces is OPTIONAL, however, and for many
   purposes, single-server namespaces are perfectly acceptable.  Use of
   multi-server namespaces can provide many advantages, however, name space describes all the files reachable by
   separating
   pathnames under the root directory or "/".  On a Windows NT server
   the name space constitutes all the files on disks named by mapped
   disk letters.  NFS server administrators rarely make the entire
   server's file system's logical position in system name space available to NFS clients.  More often
   portions of the name space are made available via an "export"
   feature.  In previous versions of the NFS protocol, the root
   filehandle for each export is obtained through the MOUNT protocol;
   the client sends a namespace from string that identifies the
   (possibly changing) logistical export of name space
   and administrative considerations the server returns the root filehandle for it.  The MOUNT
   protocol supports an EXPORTS procedure that
   result in particular file systems being located on particular
   servers.

7.1.  Location Attributes will enumerate the
   server's exports.

7.2.  Browsing Exports

   The NFSv4 contains RECOMMENDED attributes protocol provides a root filehandle that allow clients can use to
   obtain filehandles for these exports via a multi-component LOOKUP.  A
   common user experience is to use a graphical user interface (perhaps
   a file systems on one
   server "Open" dialog window) to find a file via progressive browsing
   through a directory tree.  The client must be associated with able to move from one or more instances
   export to another export via single-component, progressive LOOKUP
   operations.

   This style of that file
   system on other servers.  These attributes specify such file system
   instances browsing is not well supported by specifying the NFSv2 and NFSv3
   protocols.  The client expects all LOOKUP operations to remain within
   a single server address target (either as file system.  For example, the device attribute will
   not change.  This prevents a DNS client from taking name
   representing one or more IP addresses or as space paths that
   span exports.

   An automounter on the client can obtain a literal IP address)
   together with snapshot of the path server's
   name space using the EXPORTS procedure of that file system within the associated
   single-server namespace. MOUNT protocol.  If it
   understands the server's pathname syntax, it can create an image of
   the server's name space on the client.  The fs_locations RECOMMENDED attribute allows specification parts of the
   file system locations where name space
   that are not exported by the data corresponding to a given file
   system may be found.

7.2.  File System Presence or Absence

   A given location server are filled in an NFSv4 namespace (typically but not necessarily
   a multi-server namespace) can have a number of file system instance
   locations associated with it via the fs_locations attribute.  There
   may also be an actual current a "pseudo file system at
   system" that location,
   accessible via normal namespace operations (e.g., LOOKUP).  In this
   case, allows the user to browse from one mounted file system
   to another.  There is said a drawback to be "present" at that position in the
   namespace, and clients will typically use it, reserving use this representation of
   additional locations specified via the location-related attributes to
   situations in which
   server's name space on the principal location is no longer available.

   When there client: it is no actual file system at static.  If the namespace location in
   question, server
   administrator adds a new export the file system is said to client will be "absent".  An absent file
   system contains no files or directories other than unaware of it.

7.3.  Server Pseudo Filesystem

   NFSv4 servers avoid this name space inconsistency by presenting all
   the root.  Any
   reference exports within the framework of a single server name space.  An
   NFSv4 client uses LOOKUP and READDIR operations to it, except browse seamlessly
   from one export to access a small set another.  Portions of attributes useful in
   determining alternate locations, will result in an error,
   NFS4ERR_MOVED.  Note that if the server ever returns the error
   NFS4ERR_MOVED, it MUST support the fs_locations attribute.

   While the error name suggests space that we have
   are not exported are bridged via a case of "pseudo file system" that provides
   a view of exported directories only.  A pseudo file system
   that once was present, and has a
   unique fsid and behaves like a normal, read only become absent later, this is only
   one possibility.  A position in the namespace may be permanently
   absent with the set of file system(s) designated by system.

   Based on the location
   attributes being construction of the only realization.  The server's name NFS4ERR_MOVED
   reflects an earlier, more limited conception space, it is possible
   that multiple pseudo file systems may exist.  For example,

     /a         pseudo file system
     /a/b       real file system
     /a/b/c     pseudo file system
     /a/b/c/d   real file system

   Each of its function, but
   this error will be returned whenever the referenced pseudo file systems are considered separate entities and
   therefore will have a unique fsid.

7.4.  Multiple Roots

   The DOS and Windows operating environments are sometimes described as
   having "multiple roots".  Filesystems are commonly represented as
   disk letters.  MacOS represents file systems as top level names.
   NFSv4 servers for these platforms can construct a pseudo file system is
   absent, whether it has moved
   above these root names so that disk letters or not.

   Except volume names are
   simply directory names in the case pseudo root.

7.5.  Filehandle Volatility

   The nature of GETATTR-type operations (to be discussed
   later), when the current filehandle at the start of an operation is
   within an absent server's pseudo file system, system is that operation it is not performed and a logical
   representation of file system(s) available from the
   error NFS4ERR_MOVED is returned, to indicate that server.
   Therefore, the pseudo file system is
   absent on the current server.

   Because a GETFH cannot succeed if most likely constructed
   dynamically when the current filehandle server is within an
   absent file system, filehandles within an absent first instantiated.  It is expected
   that the pseudo file system cannot
   be transferred to the client.  When a client does may not have filehandles
   within an absent file system, on disk counterpart from
   which persistent filehandles could be constructed.  Even though it is
   preferable that the result of obtaining them when server provide persistent filehandles for the
   pseudo file system was present, and having system, the NFS client should expect that pseudo file
   system become absent
   subsequently.

   It should filehandles are volatile.  This can be noted that because confirmed by checking
   the check associated "fh_expire_type" attribute for those filehandles in
   question.  If the current filehandles are volatile, the NFS client must be
   prepared to recover a filehandle
   being within value (e.g., with a multi-component
   LOOKUP) when receiving an absent error of NFS4ERR_FHEXPIRED.

7.6.  Exported Root

   If the server's root file system happens at the start of every
   operation, operations that change the current filehandle so that it is within an absent file exported, one might conclude that
   a pseudo-file system will is not result in an error. needed.  This
   allows such combinations as PUTFH-GETATTR and LOOKUP-GETATTR to would be
   used to get attribute information, particularly location attribute
   information, as discussed below.

7.3.  Getting Attributes for an Absent File System

   When a wrong.  Assume the
   following file system systems on a server:

     /       disk1  (exported)
     /a      disk2  (not exported)
     /a/b    disk3  (exported)

   Because disk2 is absent, most attributes are not available, but
   it is necessary to allow the client access to the small set of
   attributes that are available, and most particularly that which gives
   information about exported, disk3 cannot be reached with simple
   LOOKUPs.  The server must bridge the correct current locations for this gap with a pseudo-file system.

7.7.  Mount Point Crossing

   The server file system,
   fs_locations.

7.3.1.  GETATTR Within an Absent File System

   As mentioned above, an exception is made for GETATTR in that
   attributes system environment may be obtained for constructed in such a filehandle within an absent way
   that one file
   system.  This exception only applies if the attribute mask system contains
   at least the fs_locations attribute bit, a directory which indicates the client is interested in 'covered' or
   mounted upon by a result regarding an absent second file system.  If it  For example:

     /a/b            (file system 1)
     /a/b/c/d        (file system 2)

   The pseudo file system for this server may be constructed to look
   like:

     /               (place holder/not exported)
     /a/b            (file system 1)
     /a/b/c/d        (file system 2)

   It is
   not requested, GETATTR will result in an NFS4ERR_MOVED error.

   When the server's responsibility to present the pseudo file system
   that is complete to the client.  If the client sends a GETATTR lookup request
   for the path "/a/b/c/d", the server's response is done on an absent file system, the set filehandle of supported
   attributes is very limited.  Many attributes, including those that
   are normally REQUIRED, will not be available on an absent
   the file
   system. system "/a/b/c/d".  In addition to previous versions of the fs_locations attribute, NFS
   protocol, the following
   attributes SHOULD be available on absent file systems.  In server would respond with the case filehandle of RECOMMENDED attributes, they should be available at least to directory
   "/a/b/c/d" within the
   same degree that they are available on present file systems.

   fsid:  This attribute should be provided so that the system "/a/b".

   The NFS client can will be able to determine file system boundaries, including, if it crosses a server mount
   point by a change in particular, the
      boundary between present and absent file systems.  This value must
      be different from any other fsid on of the current server "fsid" attribute.

7.8.  Security Policy and need
      have no particular relationship to fsids on any particular
      destination to which the client might be directed.

   mounted_on_fileid:  For objects at the top Name Space Presentation

   The application of an absent file system,
      this attribute the server's security policy needs to be available.  Since the fileid is within carefully
   considered by the present parent file system, there should be no need implementor.  One may choose to
      reference limit the absent
   viewability of portions of the pseudo file system based on the
   server's perception of the client's ability to provide this information.

   Other attributes SHOULD NOT be made available for absent file
   systems, even when it is possible to provide them.  The server should
   not assume that more information is always better and should avoid
   gratuitously providing additional information.

   When a GETATTR operation includes a bit mask for the attribute
   fs_locations, but where the bit mask includes attributes that are not
   supported, GETATTR will not return an error, but will return authenticate itself
   properly.  However, with the mask support of multiple security mechanisms
   and the actual attributes supported with ability to negotiate the results.

   Handling appropriate use of VERIFY/NVERIFY these mechanisms,
   the server is similar unable to GETATTR in that properly determine if the
   attribute mask does not include fs_locations the error NFS4ERR_MOVED a client will result.  It differs in that any appearance in be able
   to authenticate itself.  If, based on its policies, the attribute mask server
   chooses to limit the contents of an attribute not supported for an absent file system (and note
   that this will include some normally REQUIRED attributes) will also
   cause an NFS4ERR_MOVED result.

7.3.2.  READDIR and Absent File Systems

   A READDIR performed when the current filehandle is within an absent pseudo file system will result in an NFS4ERR_MOVED error, since, unlike system, the
   case of GETATTR, no such exception is made for READDIR.

   Attributes for an absent file system server
   may be fetched via a READDIR for
   a directory in a present effectively hide file system, when systems from a client that directory contains may otherwise
   have legitimate access.

   As suggested practice, the root directories server should apply the security policy of one or more absent file systems.  In this
   case,
   a shared resource in the handling is as follows:

   o  If server's namespace to the attribute set requested includes fs_locations, then
      fetching components of attributes proceeds normally the
   resource's ancestors.  For example:

     /
     /a/b
     /a/b/c

   The /a/b/c directory is a real file system and no NFS4ERR_MOVED
      indication is returned, even when the rdattr_error attribute shared
   resource.  The security policy for /a/b/c is
      requested.

   o  If the attribute set requested does not include fs_locations, then
      if Kerberos with integrity.
   The server should apply the rdattr_error attribute is requested, each directory entry same security policy to /, /a, and /a/b.
   This allows for the root extension of an absent file system will report NFS4ERR_MOVED as the value protection of the rdattr_error attribute.

   o  If server's
   namespace to the attribute set requested does not include either ancestors of the
      attributes fs_locations or rdattr_error then real shared resource.

   For the occurrence case of the
      root use of an absent file system within the directory will result multiple, disjoint security mechanisms in
   the READDIR failing with an NFS4ERR_MOVED error.

   o  The unavailability of an attribute because of server's resources, the security for a file system's
      absence, even one that is ordinarily REQUIRED, does not result particular object in
      any error indication.  The set of attributes returned for the root
      directory of
   server's namespace should be the absent file system in that case is simply
      restricted to those actually available.

7.4.  Uses of Location Information

   The location-bearing attribute union of fs_locations provides, together
   with the possibility all security mechanisms of absent file systems,
   all direct descendants.

8.  Multi-Server Namespace

   NFSv4 supports attributes that allow a number namespace to extend beyond the
   boundaries of important
   facilities in providing reliable, manageable, and scalable data
   access.

   When a file system single server.  It is present, these attributes can provide
   alternative locations, to be used to access the same data, RECOMMENDED that clients and
   servers support construction of such multi-server namespaces.  Use of
   such multi-server namespaces is OPTIONAL, however, and for many
   purposes, single-server namespaces are perfectly acceptable.  Use of
   multi-server namespaces can provide many advantages, however, by
   separating a file system's logical position in a namespace from the
   event of
   (possibly changing) logistical and administrative considerations that
   result in particular file systems being located on particular
   servers.

8.1.  Location Attributes

   NFSv4 contains RECOMMENDED attributes that allow file systems on one
   server failures, communications problems, to be associated with one or other
   difficulties more instances of that make continued access to the current file
   system
   impossible on other servers.  These attributes specify such file system
   instances by specifying a server address target (either as a DNS name
   representing one or otherwise impractical.  Under some circumstances,
   multiple alternative locations may be used simultaneously to provide
   higher-performance access to more IP addresses or as a literal IP address)
   together with the path of that file system in question.  Provision within the associated
   single-server namespace.

   The fs_locations RECOMMENDED attribute allows specification of such alternate the
   file system locations is referred to as "replication" although
   there are cases in which replicated sets of data are not in fact
   present, and where the replicas are instead different paths data corresponding to the same
   data.

   When a given file
   system is present and becomes absent, clients can may be found.

8.2.  File System Presence or Absence

   A given the opportunity to location in an NFSv4 namespace (typically but not necessarily
   a multi-server namespace) can have continued access to their data, at a number of file system instance
   locations associated with it via the fs_locations attribute.  There
   may also be an
   alternate location. actual current file system at that location,
   accessible via normal namespace operations (e.g., LOOKUP).  In this
   case, a continued attempt to use the
   data in the now-absent file system will result in an NFS4ERR_MOVED
   error and, is said to be "present" at that point, position in the successor locations (typically only one
   although multiple choices are possible) can be fetched
   namespace, and used to
   continue access.  Transfer clients will typically use it, reserving use of
   additional locations specified via the file system contents location-related attributes to
   situations in which the new principal location is referred to as "migration", but it should be kept in mind
   that there are cases in which this term can be used, like
   "replication", when no longer available.

   When there is no actual data migration per se.

   Where a file system was not previously present, specification of file
   system location provides a means by which file systems located on one
   server can be associated with a namespace defined by another server,
   thus allowing a general multi-server at the namespace facility.  A
   designation of such a location, location in place of an absent
   question, the file system, is
   called a "referral".

   Because client support for location-related attributes is OPTIONAL, a
   server may (but system is not required to) take action to hide migration and
   referral events from such clients, by acting as a proxy, for example.

7.4.1.  File System Replication

   The fs_locations attribute provides alternative locations, said to be used "absent".  An absent file
   system contains no files or directories other than the root.  Any
   reference to it, except to access data in place a small set of or in addition to the current file system
   instance.  On first access to a file system, the client should obtain attributes useful in
   determining alternate locations, will result in an error,
   NFS4ERR_MOVED.  Note that if the value of server ever returns the set of alternate locations by interrogating error
   NFS4ERR_MOVED, it MUST support the fs_locations attribute.

   In

   While the event error name suggests that server failures, communications problems, or other
   difficulties make continued access to the current we have a case of a file system
   impossible or otherwise impractical, the client can use
   that once was present, and has only become absent later, this is only
   one possibility.  A position in the alternate
   locations as a way to get continued access to its data.  Multiple
   locations namespace may be used simultaneously, to provide higher performance
   through permanently
   absent with the exploitation set of multiple paths between client and target file system. system(s) designated by the location
   attributes being the only realization.  The alternate locations may be physical replicas name NFS4ERR_MOVED
   reflects an earlier, more limited conception of its function, but
   this error will be returned whenever the (typically
   read-only) referenced file system data, or they may reflect alternate paths to
   the same server is
   absent, whether it has moved or provide for the use of various forms of server
   clustering not.

   Except in which multiple servers provide alternate ways the case of
   accessing GETATTR-type operations (to be discussed
   later), when the same physical file system.  How these different modes current filehandle at the start of file system transition are represented an operation is
   within the fs_locations
   attribute an absent file system, that operation is not performed and how the client deals with
   error NFS4ERR_MOVED is returned, to indicate that the file system transition issues
   will be discussed in detail below.

   Multiple server addresses, whether they are derived from a single
   entry with a DNS name representing a set of IP addresses or from
   multiple entries each with its own server address, may correspond to is
   absent on the same actual current server.

7.4.2.  File System Migration

   When

   Because a GETFH cannot succeed if the current filehandle is within an
   absent file system, filehandles within an absent file system is present and becomes absent, clients can cannot
   be
   given the opportunity to have continued access transferred to their data, at an
   alternate location, as specified by the fs_locations attribute.
   Typically, client.  When a client will be accessing the file system in question,
   get does have filehandles
   within an NFS4ERR_MOVED error, and then use the fs_locations attribute
   to determine absent file system, it is the new location result of obtaining them when
   the data.

   Such migration can be helpful in providing load balancing or general
   resource reallocation.  The protocol does not specify how file system was present, and having the file system will be moved between servers. become absent
   subsequently.

   It is anticipated that a
   number of different server-to-server transfer mechanisms might should be
   used with the choice left to the server implementor.  The NFSv4
   protocol specifies noted that because the method used to communicate check for the migration event
   between client and server.

   The new location may be current filehandle
   being within an alternate communication path to the same
   server or, in absent file system happens at the case of various forms start of server clustering, another
   server providing access to every
   operation, operations that change the same physical file system.  The
   client's responsibilities current filehandle so that it
   is within an absent file system will not result in dealing with this transition depend on
   the specific nature of the new access path as well an error.  This
   allows such combinations as how PUTFH-GETATTR and whether
   data was in fact migrated.  These issues will LOOKUP-GETATTR to be discussed in detail
   below.

   When an alternate
   used to get attribute information, particularly location is designated attribute
   information, as the target discussed below.

8.3.  Getting Attributes for migration,
   it must designate the same data.  Where file systems are writable, a
   change made on the original file system must be visible on all
   migration targets.  Where an Absent File System

   When a file system is absent, most attributes are not writable available, but
   represents a read-only copy (possibly periodically updated) of a
   writable file system, similar requirements apply
   it is necessary to allow the propagation client access to the small set of updates.  Any change visible in
   attributes that are available, and most particularly that which gives
   information about the original correct current locations for this file system must
   already be effected on all migration targets, to avoid any
   possibility that a client, system,
   fs_locations.

8.3.1.  GETATTR Within an Absent File System

   As mentioned above, an exception is made for GETATTR in effecting that
   attributes may be obtained for a transition to filehandle within an absent file
   system.  This exception only applies if the migration
   target, will see any reversion attribute mask contains
   at least the fs_locations attribute bit, which indicates the client
   is interested in file system state.

7.4.3.  Referrals

   Referrals provide a way of placing a result regarding an absent file system system.  If it is
   not requested, GETATTR will result in an NFS4ERR_MOVED error.

   When a location within
   the namespace essentially without respect to its physical location GETATTR is done on
   a given server.  This allows a single server or a an absent file system, the set of servers to
   present a multi-server namespace supported
   attributes is very limited.  Many attributes, including those that encompasses
   are normally REQUIRED, will not be available on an absent file systems
   located
   system.  In addition to the fs_locations attribute, the following
   attributes SHOULD be available on multiple servers.  Some likely uses of this include
   establishment absent file systems.  In the case
   of site-wide or organization-wide namespaces, or even
   knitting such together into a truly global namespace.

   Referrals occur when a client determines, upon first referencing a
   position in RECOMMENDED attributes, they should be available at least to the current namespace,
   same degree that it is part of a new they are available on present file
   system and systems.

   fsid:  This attribute should be provided so that the client can
      determine file system is absent.  When this occurs,
   typically by receiving the error NFS4ERR_MOVED, the actual location
   or locations of boundaries, including, in particular, the
      boundary between present and absent file system can systems.  This value must
      be determined by fetching different from any other fsid on the
   fs_locations attribute.

   The locations-related attribute may designate a single file system
   location or multiple file system locations, current server and need
      have no particular relationship to be selected based fsids on any particular
      destination to which the needs of client might be directed.

   mounted_on_fileid:  For objects at the client.

   Use top of multi-server namespaces is enabled by NFSv4 but is not
   required.  The use of multi-server namespaces and their scope will
   depend on the applications used and system administration
   preferences.

   Multi-server namespaces can be established by a single server
   providing a large set of referrals an absent file system,
      this attribute needs to all of be available.  Since the included fileid is within
      the present parent file
   systems.  Alternatively, a single multi-server namespace may system, there should be
   administratively segmented with separate referral file systems (on
   separate servers) for each separately administered portion of no need to
      reference the
   namespace.  The top-level referral absent file system or any segment may use
   replicated referral file systems for higher availability.

   Generally, multi-server namespaces are for the most part uniform, in
   that the same data made available to one client at a given location
   in the namespace is provide this information.

   Other attributes SHOULD NOT be made available for absent file
   systems, even when it is possible to all clients at provide them.  The server should
   not assume that location.

7.5.  Location Entries more information is always better and Server Identity

   As mentioned above, should avoid
   gratuitously providing additional information.

   When a single location entry may have GETATTR operation includes a server address
   target in bit mask for the form of a DNS name that may represent multiple IP
   addresses, while multiple location entries may have their own server
   address targets attribute
   fs_locations, but where the bit mask includes attributes that reference are not
   supported, GETATTR will not return an error, but will return the same server.

   When multiple addresses for mask
   of the same server exist, actual attributes supported with the client may
   assume results.

   Handling of VERIFY/NVERIFY is similar to GETATTR in that if the
   attribute mask does not include fs_locations the error NFS4ERR_MOVED
   will result.  It differs in that any appearance in the attribute mask
   of an attribute not supported for each an absent file system (and note
   that this will include some normally REQUIRED attributes) will also
   cause an NFS4ERR_MOVED result.

8.3.2.  READDIR and Absent File Systems

   A READDIR performed when the current filehandle is within an absent
   file system will result in an NFS4ERR_MOVED error, since, unlike the namespace
   case of a given server
   network address, there exist GETATTR, no such exception is made for READDIR.

   Attributes for an absent file systems at corresponding namespace
   locations system may be fetched via a READDIR for each of
   a directory in a present file system, when that directory contains
   the other server network addresses.  It may do root directories of one or more absent file systems.  In this
   case, the handling is as follows:

   o  If the attribute set requested includes fs_locations, then
      fetching of attributes proceeds normally and no NFS4ERR_MOVED
      indication is returned, even in when the absence rdattr_error attribute is
      requested.

   o  If the attribute set requested does not include fs_locations, then
      if the rdattr_error attribute is requested, each directory entry
      for the root of explicit listing in fs_locations.  Such
   corresponding an absent file system locations can be used as alternate
   locations, just will report NFS4ERR_MOVED as those explicitly specified via
      the fs_locations value of the rdattr_error attribute.

   o  If a single location entry designates multiple server IP addresses, the client cannot assume that these addresses are multiple paths to attribute set requested does not include either of the same server.  In most cases, they will be, but
      attributes fs_locations or rdattr_error then the client MUST
   verify that before acting on that assumption.  When two server
   addresses are designated by a single location entry and they
   correspond to different servers, this normally indicates some sort occurrence of
   misconfiguration, and so the client should avoid using such location
   entries when alternatives are available.  When they are not, clients
   should pick one
      root of IP addresses and use it, without using others an absent file system within the directory will result in
      the READDIR failing with an NFS4ERR_MOVED error.

   o  The unavailability of an attribute because of a file system's
      absence, even one that
   are is ordinarily REQUIRED, does not directed to result in
      any error indication.  The set of attributes returned for the same server.

7.6.  Additional Client-Side Considerations

   When clients make use root
      directory of servers that implement referrals,
   replication, and migration, care should be taken that a user who
   mounts a given the absent file system in that includes case is simply
      restricted to those actually available.

8.4.  Uses of Location Information

   The location-bearing attribute of fs_locations provides, together
   with the possibility of absent file systems, a referral or number of important
   facilities in providing reliable, manageable, and scalable data
   access.

   When a relocated file system continues is present, these attributes can provide
   alternative locations, to see a coherent picture be used to access the same data, in the
   event of server failures, communications problems, or other
   difficulties that user-side
   file system despite make continued access to the fact that it contains a number of server-side current file systems that system
   impossible or otherwise impractical.  Under some circumstances,
   multiple alternative locations may be on different servers.

   One important issue is upward navigation from used simultaneously to provide
   higher-performance access to the root of a server-
   side file system in question.  Provision
   of such alternate locations is referred to its parent (specified as ".." "replication" although
   there are cases in UNIX), which replicated sets of data are not in fact
   present, and the
   case in which it transitions replicas are instead different paths to that the same
   data.

   When a file system as a result of
   referral, migration, or a transition as a result of replication.
   When the client is at such a point, present and it needs to ascend to becomes absent, clients can be
   given the
   parent, it must go back opportunity to the parent as seen within the multi-server
   namespace rather than sending have continued access to their data, at an
   alternate location.  In this case, a LOOKUPP operation continued attempt to use the server,
   which would result
   data in the parent within now-absent file system will result in an NFS4ERR_MOVED
   error and, at that server's single-server
   namespace.  In order to do this, point, the client needs successor locations (typically only one
   although multiple choices are possible) can be fetched and used to remember
   continue access.  Transfer of the
   filehandles that represent such file system roots and use these
   instead of issuing a LOOKUPP operation contents to the current server.  This
   will allow the client to present new
   location is referred to applications a consistent
   namespace, where upward navigation and downward navigation are
   consistent.

   Another issue concerns refresh of referral locations.  When referrals
   are used extensively, they may change as server configurations
   change.  It is expected "migration", but it should be kept in mind
   that clients will cache information related
   to traversing referrals so that future client-side requests there are
   resolved locally without server communication.  This is usually
   rooted cases in client-side name look up caching.  Clients should
   periodically purge which this term can be used, like
   "replication", when there is no actual data for referral points in order to detect
   changes in migration per se.

   Where a file system was not previously present, specification of file
   system location information.

   A potential problem exists if provides a client were to allow an open owner to
   have state on multiple filesystems means by which file systems located on server, in that it is unclear
   how the sequence numbers one
   server can be associated with open owners are to be dealt
   with, in the event of transparent state migration. a namespace defined by another server,
   thus allowing a general multi-server namespace facility.  A client can
   avoid
   designation of such a situation, if it ensures that any use location, in place of an open owner absent file system, is confined to
   called a "referral".

   Because client support for location-related attributes is OPTIONAL, a single filesystem.

   A server MAY decline to migrate state associated with open owners
   that span multiple filesystems.  In cases in which the
   server chooses may (but is not required to) take action to migrate hide migration and
   referral events from such state, clients, by acting as a proxy, for example.

8.4.1.  File System Replication

   The fs_locations attribute provides alternative locations, to be used
   to access data in place of or in addition to the server MUST return NFS4ERR_BAD_STATEID
   when current file system
   instance.  On first access to a file system, the client uses those stateids on the new server.

   The server MUST return NFS4ERR_STALE_STATEID when should obtain
   the client uses
   those stateids on value of the old server, regardless set of whether migration has
   occurred alternate locations by interrogating the
   fs_locations attribute.

   In the event that server failures, communications problems, or not.

7.7.  Effecting File System Transitions

   Transitions between other
   difficulties make continued access to the current file system instances, whether due
   impossible or otherwise impractical, the client can use the alternate
   locations as a way to get continued access to its data.  Multiple
   locations may be used simultaneously, to switching provide higher performance
   through the exploitation of multiple paths between client and target
   file system.

   The alternate locations may be physical replicas upon server unavailability of the (typically
   read-only) file system data, or they may reflect alternate paths to server-initiated
   migration events, are best dealt with together.  This is so even
   though, for
   the server, pragmatic considerations will normally force
   different implementation strategies same server or provide for planned and unplanned
   transitions.  Even though the prototypical use cases of replication
   and migration contain distinctive sets various forms of features, when all
   possibilities for these operations are considered, there is an
   underlying unity server
   clustering in which multiple servers provide alternate ways of these operations, from
   accessing the client's point of
   view, that makes treating them together desirable.

   A number same physical file system.  How these different modes
   of methods file system transition are possible for servers to replicate data represented within the fs_locations
   attribute and to
   track how the client state in order to allow clients to transition between deals with file system instances transition issues
   will be discussed in detail below.

   Multiple server addresses, whether they are derived from a single
   entry with a minimum DNS name representing a set of disruption.  Such methods
   vary between those that use inter-server clustering techniques IP addresses or from
   multiple entries each with its own server address, may correspond to
   limit
   the changes seen by same actual server.

8.4.2.  File System Migration

   When a file system is present and becomes absent, clients can be
   given the client, opportunity to those that are less
   aggressive, use more standard methods of replicating have continued access to their data, and impose
   a greater burden on at an
   alternate location, as specified by the fs_locations attribute.
   Typically, a client to adapt to will be accessing the transition.

   The NFSv4 protocol does not impose choices on clients file system in question,
   get an NFS4ERR_MOVED error, and servers
   with regard then use the fs_locations attribute
   to that spectrum determine the new location of transition methods.  In fact, there
   are many valid choices, depending on client and application
   requirements and their interaction with server implementation
   choices. the data.

   Such migration can be helpful in providing load balancing or general
   resource reallocation.  The NFSv4.0 protocol does not provide specify how the servers file
   system will be moved between servers.  It is anticipated that a means
   number of communicating different server-to-server transfer mechanisms might be
   used with the transition methods.  In choice left to the NFSv4.1 server implementor.  The NFSv4
   protocol
   [RFC5661], an additional attribute "fs_locations_info" is presented,
   which will define specifies the specific choices that can be made, how these
   choices are communicated method used to communicate the client, and how the migration event
   between client is to deal
   with any discontinuities.

   In the sections below, references will and server.

   The new location may be made an alternate communication path to various possible
   server implementation choices as a way of illustrating the transition
   scenarios that clients may deal with.  The intent here is not to
   define or limit same
   server implementations but rather to illustrate or, in the
   range case of issues that clients may face.  Again, as the NFSv4.0
   protocol does not have an explicit means various forms of communicating these
   issues to the client, the intent is server clustering, another
   server providing access to document the problems that can
   be faced same physical file system.  The
   client's responsibilities in a multi-server name space and allow dealing with this transition depend on
   the client to use specific nature of the
   inferred transitions available via fs_locations new access path as well as how and other attributes
   (see Section 7.9.1).

   In the discussion below, references whether
   data was in fact migrated.  These issues will be made to discussed in detail
   below.

   When an alternate location is designated as the target for migration,
   it must designate the same data.  Where file systems are writable, a
   change made on the original file system
   having must be visible on all
   migration targets.  Where a particular property or to two file systems (typically the
   source and destination) belonging to system is not writable but
   represents a common class of any read-only copy (possibly periodically updated) of several
   types.  Two a
   writable file systems that belong system, similar requirements apply to such a class share some
   important aspects the propagation
   of updates.  Any change visible in the original file system behavior that clients may depend
   upon when present, must
   already be effected on all migration targets, to easily effect avoid any
   possibility that a client, in effecting a seamless transition between to the migration
   target, will see any reversion in file system instances.  Conversely, where the state.

8.4.3.  Referrals

   Referrals provide a way of placing a file systems do not
   belong to such system in a common class, location within
   the client has to deal with various
   sorts of implementation discontinuities that may cause performance or
   other issues in effecting a transition.

   While fs_locations is available, default assumptions with regard to
   such classifications have namespace essentially without respect to be inferred (see Section 7.9.1 for
   details).

   In cases in which one its physical location on
   a given server.  This allows a single server is expected or a set of servers to accept opaque values from
   the client
   present a multi-server namespace that originated from another server, the servers SHOULD
   encode the "opaque" values in big-endian byte order.  If this is
   done, servers acting as replicas or immigrating encompasses file systems will be
   able to parse values like stateids, directory cookies, filehandles,
   etc., even if their native byte order is different from that
   located on multiple servers.  Some likely uses of other
   servers cooperating this include
   establishment of site-wide or organization-wide namespaces, or even
   knitting such together into a truly global namespace.

   Referrals occur when a client determines, upon first referencing a
   position in the replication and migration current namespace, that it is part of the a new file
   system.

7.7.1.  File System Transitions
   system and Simultaneous Access

   When a single that the file system may be accessed at multiple locations,
   either because of an indication is absent.  When this occurs,
   typically by receiving the error NFS4ERR_MOVED, the actual location
   or locations of the file system identity as reported can be determined by fetching the
   fs_locations attribute, the client will, depending on specific
   circumstances as discussed below, either:

   o  Access attribute.

   The locations-related attribute may designate a single file system
   location or multiple instances simultaneously, each of which represents
      an alternate path file system locations, to be selected based on
   the same data and metadata.

   o  Accesses one instance (or set needs of instances) and then transition to
      an alternative instance (or set the client.

   Use of instances) as a result multi-server namespaces is enabled by NFSv4 but is not
   required.  The use of
      network issues, server unresponsiveness, or server-directed
      migration.

7.7.2.  Filehandles multi-server namespaces and their scope will
   depend on the applications used and File System Transitions

   There are a number of ways in which filehandles can be handled across
   a file system transition.  These administration
   preferences.

   Multi-server namespaces can be divided into two broad
   classes depending upon whether the two file systems across which the
   transition happens share sufficient state to effect some sort established by a single server
   providing a large set of
   continuity referrals to all of file system handling.

   When there is no such cooperation in filehandle assignment, the two included file
   systems.  Alternatively, a single multi-server namespace may be
   administratively segmented with separate referral file systems are reported as being in different handle classes.  In
   this case, all filehandles are assumed to expire as part (on
   separate servers) for each separately administered portion of the
   namespace.  The top-level referral file system transition.  Note that this behavior does not depend on
   fh_expire_type attribute and depends on the specification of the
   FH4_VOL_MIGRATION bit.

   When there is co-operation in filehandle assignment, the two or any segment may use
   replicated referral file systems for higher availability.

   Generally, multi-server namespaces are reported as being for the most part uniform, in
   that the same handle classes.  In this
   case, persistent filehandles remain valid after the file system
   transition, while volatile filehandles (excluding those that are only
   volatile due data made available to one client at a given location
   in the FH4_VOL_MIGRATION bit) are subject namespace is made available to expiration
   on the target server.

7.7.3.  Fileids all clients at that location.

8.5.  Location Entries and File System Transitions

   The issue of continuity of fileids Server Identity

   As mentioned above, a single location entry may have a server address
   target in the event form of a file system
   transition needs to be addressed.  The general expectation is DNS name that in
   situations in which may represent multiple IP
   addresses, while multiple location entries may have their own server
   address targets that reference the two same server.

   When multiple addresses for the same server exist, the client may
   assume that for each file system instances are created by in the namespace of a
   single vendor using some sort given server
   network address, there exist file systems at corresponding namespace
   locations for each of the other server network addresses.  It may do
   this even in the absence of explicit listing in fs_locations.  Such
   corresponding file system image copy, fileids will locations can be consistent across used as alternate
   locations, just as those explicitly specified via the transition, while in fs_locations
   attribute.

   If a single location entry designates multiple server IP addresses,
   the analogous multi-
   vendor transitions client cannot assume that these addresses are multiple paths to
   the same server.  In most cases, they will not.  This poses difficulties,
   especially for be, but the client without special knowledge of the transition
   mechanisms adopted by the server.  Note MUST
   verify that although fileid is not before acting on that assumption.  When two server
   addresses are designated by a
   REQUIRED attribute, many servers support fileids single location entry and many they
   correspond to different servers, this normally indicates some sort of
   misconfiguration, and so the client should avoid using such location
   entries when alternatives are available.  When they are not, clients
   provide APIs
   should pick one of IP addresses and use it, without using others that depend on fileids.

   It is important
   are not directed to note that while the same server.

8.6.  Additional Client-Side Considerations

   When clients themselves may have no
   trouble with make use of servers that implement referrals,
   replication, and migration, care should be taken that a fileid changing as user who
   mounts a result of given file system that includes a referral or a relocated
   file system
   transition event, applications do typically have access continues to see a coherent picture of that user-side
   file system despite the fileid
   (e.g., via stat).  The result is fact that it contains a number of server-side
   file systems that an application may work
   perfectly well if there be on different servers.

   One important issue is no upward navigation from the root of a server-
   side file system instance transition or if
   any such transition is among instances created by a single vendor,
   yet be unable to deal with its parent (specified as ".." in UNIX), in the situation
   case in which it transitions to that file system as a result of
   referral, migration, or a multi-vendor transition occurs at the wrong time.

   Providing as a result of replication.
   When the same fileids in client is at such a multi-vendor (multiple server
   vendors) environment has generally been held point, and it needs to be quite difficult.
   While there is work ascend to be done, the
   parent, it needs must go back to be pointed out that this
   difficulty is partly self-imposed.  Servers have typically identified
   fileid with inode number, i.e., with the parent as seen within the multi-server
   namespace rather than sending a quantity used LOOKUPP operation to find the file server,
   which would result in question.  This identification poses special difficulties for
   migration of a the parent within that server's single-server
   namespace.  In order to do this, the client needs to remember the
   filehandles that represent such file system between vendors where assigning roots and use these
   instead of issuing a LOOKUPP operation to the same
   index current server.  This
   will allow the client to present to applications a given file consistent
   namespace, where upward navigation and downward navigation are
   consistent.

   Another issue concerns refresh of referral locations.  When referrals
   are used extensively, they may not be possible.  Note here change as server configurations
   change.  It is expected that a fileid clients will cache information related
   to traversing referrals so that future client-side requests are
   resolved locally without server communication.  This is not required usually
   rooted in client-side name look up caching.  Clients should
   periodically purge this data for referral points in order to be useful detect
   changes in location information.

   A potential problem exists if a client were to find the allow an open owner to
   have state on multiple file systems on server, in question, only that it is unique within unclear
   how the given file system.  Servers prepared sequence numbers associated with open owners are to
   accept be dealt
   with, in the event of transparent state migration.  A client can
   avoid such a fileid as situation, if it ensures that any use of an open owner
   is confined to a single piece of metadata and store it apart from
   the value used file system.

   A server MAY decline to index the migrate state associated with open owners
   that span multiple file information can relatively easily
   maintain a fileid value across a migration event, allowing a truly
   transparent migration event. systems.  In any case, where servers can provide continuity of fileids, they
   should, and cases in which the client should be able server
   chooses not to find out that migrate such
   continuity is available and take appropriate action.  Information
   about state, the continuity (or lack thereof) of fileids across a file
   system transition is represented by specifying whether server MUST return
   NFS4ERR_BAD_STATEID when the file
   systems in question are of client uses those stateids on the same fileid class.

   Note that new
   server.

   The server MUST return NFS4ERR_STALE_STATEID when consistent fileids do not exist across a transition
   (either because there is no continuity of fileids or because fileid
   is not a supported attribute the client uses
   those stateids on one the old server, regardless of instances involved), and there whether migration has
   occurred or not.

8.7.  Effecting File System Referrals

   Referrals are no reliable filehandles across a transition event (either because
   there effected when an absent file system is no filehandle continuity encountered, and
   one or because the filehandles more alternate locations are
   volatile), made available by the
   fs_locations attribute.  The client is in will typically get an
   NFS4ERR_MOVED error, fetch the appropriate location information, and
   proceed to access the file system on a position where it cannot verify that
   files different server, even though
   it was accessing before the transition are retains its logical position within the same objects.
   It is forced to assume original namespace.
   Referrals differ from migration events in that no object they happen only when
   the client has been renamed, and, unless
   there are guarantees that provide this (e.g., not previously referenced the file system in question
   (so there is
   read-only), problems for applications may occur.  Therefore, use of
   such configurations should be limited nothing to situations where the
   problems that this may cause transition).  Referrals can be tolerated.

7.7.4.  Fsids and File System Transitions

   Since fsids are generally only unique within a per-server basis, it come into
   effect when an absent file system is likely encountered at its root.

   The examples given in the sections below are somewhat artificial in
   that they an actual client will change during a file system transition.
   Clients should not typically do a multi-component look
   up, but will have cached information regarding the upper levels of
   the name hierarchy.  However, these example are chosen to make the fsids received from
   required behavior clear and easy to put within the server visible scope of a small
   number of requests, without getting unduly into details of how
   specific clients might choose to
   applications since they cache things.

8.7.1.  Referral Example (LOOKUP)

   Let us suppose that the following COMPOUND is sent in an environment
   in which /this/is/the/path is absent from the target server.  This
   may not be globally unique, and because they
   may change during for a number of reasons.  It may be the case that the file
   system transition event.  Applications are
   best served if they are isolated from such transitions to has moved, or it may be the extent
   possible.

7.7.5.  The Change Attribute and File System Transitions

   Since case that the change attribute is defined as a server-specific one,
   change attributes fetched from one target server are normally presumed is
   functioning mainly, or solely, to be
   invalid refer clients to the servers on another server.  Such a presumption
   which various file systems are located.

   o  PUTROOTFH

   o  LOOKUP "this"

   o  LOOKUP "is"

   o  LOOKUP "the"

   o  LOOKUP "path"

   o  GETFH

   o  GETATTR(fsid,fileid,size,time_modify)

   Under the given circumstances, the following will be the result.

   o  PUTROOTFH --> NFS_OK.  The current fh is troublesome since
   it would invalidate all cached change attributes, requiring
   refetching.  Even more disruptive, now the absence root of any assured
   continuity the
      pseudo-fs.

   o  LOOKUP "this" --> NFS_OK.  The current fh is for /this and is
      within the change attribute means that even if pseudo-fs.

   o  LOOKUP "is" --> NFS_OK.  The current fh is for /this/is and is
      within the same value pseudo-fs.

   o  LOOKUP "the" --> NFS_OK.  The current fh is retrieved on refetch, no conclusions can be drawn as to whether for /this/is/the and
      is within the object in question has changed. pseudo-fs.

   o  LOOKUP "path" --> NFS_OK.  The identical change attribute
   could be merely an artifact of current fh is for /this/is/the/path
      and is within a modified new, absent file with a different
   change attribute construction algorithm, with system, but ... the client will
      never see the value of that new algorithm just
   happening to result fh.

   o  GETFH --> NFS4ERR_MOVED.  Fails because current fh is in an identical change value.

   When the two absent
      file systems have consistent change attribute formats, system at the start of the operation, and we say that they are in the same change class, specification
      makes no exception for GETFH.

   o  GETATTR(fsid,fileid,size,time_modify) Not executed because the
      failure of the GETFH stops processing of the COMPOUND.

   Given the failure of the GETFH, the client may
   assume a continuity has the job of change attribute construction and handle this
   situation just as it would be handled without any determining
   the root of the absent file system
   transition.

7.7.6.  Lock State and File System Transitions

   In a where to find that file system transition,
   system, i.e., the client needs server and path relative to handle cases in
   which the two servers have cooperated in state management and in
   which they have not.  Cooperation by two servers that server's root fh.
   Note here that in state management
   requires coordination of client IDs.  Before this example, the client attempts to
   use a client ID associated with one server in a request to did not obtain filehandles
   and attribute information (e.g., fsid) for the server
   of intermediate
   directories, so that it would not be sure where the other absent file system, it must eliminate
   system starts.  It could be the possibility case, for example, that two
   non-cooperating servers have assigned the same client ID by accident.

   In /this/is/the
   is the case root of migration, the servers involved in the migration of a moved file system SHOULD transfer all server state from the original to and that the
   new server.  When this is done, it must be done in a way reason that is
   transparent to the client.  With replication, such a degree
   look up of common
   state "path" succeeded is typically not the case.

   This state transfer will reduce disruption to that the client when a file system transition occurs.  If was not absent on
   that operation but was moved between the servers are successful in
   transferring all state, then last LOOKUP and the client may use GETFH
   (since COMPOUND is not atomic).  Even if we had the existing stateids
   associated with that client ID fsids for all of
   the old file system instance in
   connection with intermediate directories, we could have no way of knowing that same client ID in connection with
   /this/is/the/path was the
   transitioned root of a new file system instance.

   File systems cooperating in state management may actually share state
   or simply divide the identifier space so as system, since we don't
   yet have its fsid.

   In order to recognize (and reject
   as stale) each other's stateids get the necessary information, let us re-send the chain
   of LOOKUPs with GETFHs and client IDs.  Servers that do
   share state may not do so under all conditions or GETATTRs to at all times.  If least get the server cannot fsids so we
   can be sure when accepting a client ID that it reflects
   the locks the client was given, the server must treat all associated
   state as stale and report it as such to where the client. appropriate file system boundaries are.  The
   client must establish a new client ID on could choose to get fs_locations at the destination, if it
   does not have one already, and reclaim locks if allowed by same time but in most
   cases the
   server.  In this case, old stateids and client IDs should not be
   presented to the new server since there is no assurance that they will not conflict with IDs valid on that server.

   When actual locks are not known to be maintained, the destination
   server may establish have a grace period specific good guess as to the given file
   system, with non-reclaim locks being rejected for that where file system,
   even though normal locks system
   boundaries are being granted (because of where NFS4ERR_MOVED was, and was not,
   received) making fetching of fs_locations unnecessary.

   OP01:  PUTROOTFH --> NFS_OK

   -  Current fh is root of pseudo-fs.

   OP02:  GETATTR(fsid) --> NFS_OK
   -  Just for other file systems.
   Clients should not infer completeness.  Normally, clients will know the absence fsid of
      the pseudo-fs as soon as they establish communication with a grace period for file
   systems being transitioned to a server from responses
      server.

   OP03:  LOOKUP "this" --> NFS_OK

   OP04:  GETATTR(fsid) --> NFS_OK

   -  Get current fsid to requests for
   other see where file systems.

   In system boundaries are.  The
      fsid will be that for the case of lock reclamation pseudo-fs in this example, so no
      boundary.

   OP05:  GETFH --> NFS_OK

   -  Current fh is for a given /this and is within pseudo-fs.

   OP06:  LOOKUP "is" --> NFS_OK

   -  Current fh is for /this/is and is within pseudo-fs.

   OP07:  GETATTR(fsid) --> NFS_OK

   -  Get current fsid to see where file system after a boundaries are.  The
      fsid will be that for the pseudo-fs in this example, so no
      boundary.

   OP08:  GETFH --> NFS_OK

   -  Current fh is for /this/is and is within pseudo-fs.

   OP09:  LOOKUP "the" --> NFS_OK

   -  Current fh is for /this/is/the and is within pseudo-fs.

   OP10:  GETATTR(fsid) --> NFS_OK

   -  Get current fsid to see where file system transition, edge conditions can arise similar to those boundaries are.  The
      fsid will be that for
   reclaim after server restart (although the pseudo-fs in this example, so no
      boundary.

   OP11:  GETFH --> NFS_OK

   -  Current fh is for /this/is/the and is within pseudo-fs.

   OP12:  LOOKUP "path" --> NFS_OK
   -  Current fh is for /this/is/the/path and is within a new, absent
      file system, but ...

   -  The client will never see the case value of that fh.

   OP13:  GETATTR(fsid, fs_locations) --> NFS_OK

   -  We are getting the planned
   state transfer associated with migration, these can fsid to know where the file system boundaries
      are.  In this operation, the fsid will be avoided by
   securely recording lock state as part different than that of state migration).  Unless
      the destination server can guarantee parent directory (which in turn was retrieved in OP10).  Note
      that locks the fsid we are given will not necessarily be
   incorrectly granted, preserved at
      the destination server should not allow lock
   reclaims new location.  That fsid might be different, and should avoid establishing a grace period.  (See
   Section 9.14 in fact the
      fsid we have for further details.)

   Servers are encouraged to provide facilities to allow locks to this file system might be
   reclaimed on the new server after a valid fsid of a
      different file system transition.  Often
   such facilities may not be available on that new server.

   -  In this particular case, we are pretty sure anyway that what has
      moved is /this/is/the/path rather than /this/is/the since we have
      the fsid of the latter and client should be prepared to
   re-obtain locks, even though it is possible that of the client may pseudo-fs, which
      presumably cannot move.  However, in other examples, we might not
      have
   its LOCK or OPEN request denied due to a conflicting lock.

   The consequences this kind of having no facilities available information to reclaim locks
   on the new server will depend rely on the type of environment.  In some
   environments, such as the transition between read-only (e.g., /this/is/the might
      be a non-pseudo file systems,
   such denial of locks should not pose large difficulties in practice.
   When an attempt system separate from /this/is/the/path), so
      we need to re-establish a lock have other reliable source information on a new server is denied, the
   client should treat boundary
      of the situation as if its original lock had been
   revoked.  Note file system that when the lock is granted, moved.  If, for example, the client cannot
   assume that no conflicting lock could file
      system /this/is had moved, we would have been granted in the
   interim.  Where change attribute continuity is present, a case of migration
      rather than referral, and once the client
   may check boundaries of the change attribute to check for unwanted migrated file
   modifications.  Where even
      system was clear we could fetch fs_locations.

   -  We are fetching fs_locations because the fact that we got an
      NFS4ERR_MOVED at this point means that it is not available, and the file system most likely that this
      is not read-only, a client may reasonably treat all pending locks as
   having been revoked.

7.7.6.1.  Transitions referral and we need the Lease_time Attribute

   In order that the client may appropriately manage its lease in destination.  Even if it is the case of
      that /this/is/the is a file system transition, that has migrated, we will
      still need the destination server must
   establish proper values location information for the lease_time attribute.

   When state is transferred transparently, that state should include file system.

   OP14:  GETFH --> NFS4ERR_MOVED

   -  Fails because current fh is in an absent file system at the correct value start
      of the lease_time attribute.  The lease_time
   attribute on operation, and the specification makes no exception for
      GETFH.  Note that this means the destination server must will never be less than that on send the source, since this would result in premature expiration of client
      a
   lease granted by filehandle from within an absent file system.

   Given the source server.  Upon transitions in which state
   is transferred transparently, above, the client knows where the root of the absent file
   system is under no obligation to
   refetch (/this/is/the/path) by noting where the lease_time attribute change of fsid
   occurred (between "the" and may continue to use "path").  The fs_locations attribute also
   gives the value
   previously fetched (on client the source server).

   If state has not been transferred transparently because actual location of the client ID
   is rejected when presented to absent file system, so
   that the new server, referral can proceed.  The server gives the client should fetch the value bare
   minimum of lease_time on information about the new (i.e., destination) server, and
   use it absent file system so that there
   will be very little scope for subsequent locking requests.  However, problems of conflict between
   information sent by the referring server must
   respect a grace period and information of at least as long as the lease_time file
   system's home.  No filehandles and very few attributes are present on
   the
   source referring server, in order to ensure that clients have ample time to
   reclaim their lock before potentially conflicting non-reclaimed locks
   are granted.

7.7.7.  Write Verifiers and File System Transitions

   In a file system transition, the two file systems client can treat those it receives as
   transient information with the function of enabling the referral.

8.7.2.  Referral Example (READDIR)

   Another context in which a client may be clustered encounter referrals is when it
   does a READDIR on a directory in which some of the handling sub-directories
   are the roots of unstably written data.  When this absent file systems.

   Suppose such a directory is the read as follows:

   o  PUTROOTFH

   o  LOOKUP "this"

   o  LOOKUP "is"

   o  LOOKUP "the"

   o  READDIR (fsid, size, time_modify, mounted_on_fileid)

   In this case, because rdattr_error is not requested, fs_locations is
   not requested, and some of the two file systems belong to attributes cannot be provided, the same write-verifier class, write
   verifiers returned from one system may
   result will be compared to those returned
   by an NFS4ERR_MOVED error on the other and superfluous writes avoided.

   When two file systems belong to different write-verifier classes, any
   verifier generated by one must not be compared to one provided by READDIR, with the
   other.  Instead, it should be treated
   detailed results as not equal even when the
   values are identical.

7.7.8.  Readdir Cookies and Verifiers and File System Transitions

   In a file system transition, follows:

   o  PUTROOTFH --> NFS_OK.  The current fh is at the two file systems may be consistent
   in their handling root of READDIR cookies the
      pseudo-fs.

   o  LOOKUP "this" --> NFS_OK.  The current fh is for /this and verifiers.  When this is
      within the
   case, pseudo-fs.

   o  LOOKUP "is" --> NFS_OK.  The current fh is for /this/is and is
      within the two file systems belong to the same readdir class,
   READDIR cookies pseudo-fs.

   o  LOOKUP "the" --> NFS_OK.  The current fh is for /this/is/the and verifiers from one system may be recognized by
      is within the other and pseudo-fs.

   o  READDIR operations started on one server may be validly
   continued on the other, simply by presenting (fsid, size, time_modify, mounted_on_fileid) -->
      NFS4ERR_MOVED.  Note that the cookie and verifier same error would have been returned by a READDIR operation done on
      if /this/is/the had migrated, but it is returned because the first file system to
      directory contains the
   second.

   When two root of an absent file systems belong to different readdir classes, any system.

   So now suppose that we re-send with rdattr_error:

   o  PUTROOTFH

   o  LOOKUP "this"

   o  LOOKUP "is"

   o  LOOKUP "the"

   o  READDIR cookie and verifier generated by one (rdattr_error, fsid, size, time_modify, mounted_on_fileid)

   The results will be:

   o  PUTROOTFH --> NFS_OK.  The current fh is not valid on at the
   second, and must not be presented to that server by root of the client.
      pseudo-fs.

   o  LOOKUP "this" --> NFS_OK.  The
   client should act as if the verifier was rejected.

7.7.9.  File System Data and File System Transitions

   When multiple replicas exist current fh is for /this and are used simultaneously or in
   succession by a client, applications using them will normally expect
   that they contain either is
      within the same data or data that pseudo-fs.

   o  LOOKUP "is" --> NFS_OK.  The current fh is consistent
   with for /this/is and is
      within the normal sorts of changes that are made by other clients
   updating pseudo-fs.

   o  LOOKUP "the" --> NFS_OK.  The current fh is for /this/is/the and
      is within the data of pseudo-fs.

   o  READDIR (rdattr_error, fsid, size, time_modify, mounted_on_fileid)
      --> NFS_OK.  The attributes for directory entry with the file system (with metadata being component
      named "path" will only contain rdattr_error with the same value
      NFS4ERR_MOVED, together with an fsid value and a value for
      mounted_on_fileid.

   So suppose we do another READDIR to
   the degree inferred by the get fs_locations attribute).  However, when
   multiple file systems are presented (although we
   could have used a GETATTR directly, as replicas of one another, the
   precise relationship between in Section 8.7.1).

   o  PUTROOTFH

   o  LOOKUP "this"

   o  LOOKUP "is"

   o  LOOKUP "the"

   o  READDIR (rdattr_error, fs_locations, mounted_on_fileid, fsid,
      size, time_modify)

   The results would be:

   o  PUTROOTFH --> NFS_OK.  The current fh is at the data root of one and the data of another
      pseudo-fs.

   o  LOOKUP "this" --> NFS_OK.  The current fh is not, as a general matter, specified by the NFSv4 protocol.  It for /this and is
   quite possible to present as replicas file systems where
      within the data of
   those file systems is sufficiently different that some applications
   have problems dealing with pseudo-fs.

   o  LOOKUP "is" --> NFS_OK.  The current fh is for /this/is and is
      within the transition between replicas. pseudo-fs.

   o  LOOKUP "the" --> NFS_OK.  The
   namespace current fh is for /this/is/the and
      is within the pseudo-fs.

   o  READDIR (rdattr_error, fs_locations, mounted_on_fileid, fsid,
      size, time_modify) --> NFS_OK.  The attributes will typically be constructed so that applications can
   choose an appropriate level of support, so that in one position in as shown
      below.

   The attributes for the namespace a varied set of replicas directory entry with the component named
   "path" will be listed, while in
   another only those that contain:

   o  rdattr_error (value: NFS_OK)

   o  fs_locations

   o  mounted_on_fileid (value: unique fileid within referring file
      system)

   o  fsid (value: unique value within referring server)

   The attributes for entry "path" will not contain size or time_modify
   because these attributes are up-to-date may be considered replicas. not available within an absent file
   system.

8.8.  The protocol does define four special cases of Attribute fs_locations

   The fs_locations attribute is structured in the relationship among
   replicas following way:

   struct fs_location4 {
           utf8str_cis             server<>;
           pathname4               rootpath;
   };

   struct fs_locations4 {
           pathname4       fs_root;
           fs_location4    locations<>;
   };
   The fs_location4 data type is used to be specified by represent the server and relied upon location of a
   file system by clients:

   o  When multiple providing a server addresses correspond to name and the same actual
      server, path to the client may depend on root of
   the fact that changes to data,
      metadata, or locks made on one file system are immediately
      reflected on others.

   o within that server's namespace.  When multiple replicas exist and are used simultaneously by a
      client, they must designate the same data.  Where set of
   servers have corresponding file systems are
      writable, a change made on one instance must be visible on all
      instances, immediately upon at the earlier same path within their
   namespaces, an array of server names may be provided.  An entry in
   the return server array is a UTF-8 string and represents one of the
      modifying requester a
   traditional DNS host name, IPv4 address, IPv6 address, or an zero-
   length string.  A zero-length string SHOULD be used to indicate the visibility of that change on any of
   current address being used for the
      associated replicas.  This allows RPC call.  It is not a client to use these replicas
      simultaneously without any special adaptation to the fact requirement
   that
      there are multiple replicas.  In this case, locks (whether share
      reservations or byte-range locks), and delegations obtained on one
      replica are immediately reflected on all replicas, even though
      these locks will servers that share the same rootpath be managed under a set of client IDs.

   o  When listed in one replica
   fs_location4 instance.  The array of server names is designated as provided for
   convenience.  Servers that share the successor instance to
      another existing instance after return NFS4ERR_MOVED (i.e., same rootpath may also be listed
   in separate fs_location4 entries in the
      case fs_locations attribute.

   The fs_locations4 data type and fs_locations attribute contain an
   array of migration), such locations.  Since the client namespace of each server may depend on be
   constructed differently, the fact that all
      changes written to stable storage on "fs_root" field is provided.  The path
   represented by fs_root represents the original instance are
      written to stable storage location of the successor (uncommitted writes are
      dealt with in Section 7.7.7).

   o  Where a file system is not writable but represents a read-only
      copy (possibly periodically updated) of a writable file system,
      clients have similar requirements with regard to in
   the propagation
      of updates.  They may need a guarantee current server's namespace, i.e., that any change visible on of the original file system instance must be immediately visible on
      any replica before server from which
   the client transitions access to that replica,
      in order to avoid any possibility that a client, in effecting a
      transition fs_locations attribute was obtained.  The fs_root path is meant
   to a replica, will see any reversion in aid the client by clearly referencing the root of the file system
      state.  Since these file systems
   whose locations are presumed to be unsuitable for
      simultaneous use, there is being reported, no specification of how locking is
      handled; in general, locks obtained on one matter what object within the
   current file system will be
      separate from those the current filehandle designates.  The fs_root
   is simply the pathname the client used to reach the object on others.  Since these are going the
   current server (i.e., the object to be read-
      only file systems, this which the fs_locations attribute
   applies).

   When the fs_locations attribute is not expected to pose an issue for
      clients or applications.

7.8.  Effecting File System Referrals

   Referrals interrogated and there are effected when an absent no
   alternate file system is encountered, and
   one or more alternate locations are made available by locations, the
   fs_locations attribute.  The client will typically get server SHOULD return a zero-
   length array of fs_location4 structures, together with a valid
   fs_root.

   As an
   NFS4ERR_MOVED error, fetch the appropriate location information, and
   proceed to access the example, suppose there is a replicated file system on a different server, even though
   it retains its logical position within the original namespace.
   Referrals differ from migration events in that they happen only when
   the client has not previously referenced located at
   two servers (servA and servB).  At servA, the file system in question
   (so there is nothing to transition).  Referrals can only come into
   effect when an absent located
   at path /a/b/c.  At, servB the file system is encountered located at its root.

   The examples given in path /x/y/z.
   If the sections below are somewhat artificial in
   that an actual client will not typically do a multi-component look
   up, but will have cached information regarding the upper levels of
   the name hierarchy.  However, these example are chosen were to make obtain the
   required behavior clear and easy to put within fs_locations value for the scope of a small
   number of requests, without getting unduly into details of how
   specific clients directory
   at /a/b/c/d, it might choose to cache things.

7.8.1.  Referral Example (LOOKUP)

   Let us suppose not necessarily know that the following COMPOUND file system's
   root is sent located in servA's namespace at /a/b/c.  When the client
   switches to servB, it will need to determine that the directory it
   first referenced at servA is now represented by the path /x/y/z/d on
   servB.  To facilitate this, the fs_locations attribute provided by
   servA would have an environment fs_root value of /a/b/c and two entries in which /this/is/the/path is absent from
   fs_locations.  One entry in fs_locations will be for itself (servA)
   and the target server.  This
   may other will be for servB with a number path of reasons.  It may be /x/y/z.  With this
   information, the case that client is able to substitute /x/y/z for the file
   system has moved, or it may be the case that /a/b/c
   at the target server is
   functioning mainly, or solely, to refer clients beginning of its access path and construct /x/y/z/d to use for
   the servers on
   which various file systems are located.

   o  PUTROOTFH

   o  LOOKUP "this"

   o  LOOKUP "is"

   o  LOOKUP "the"

   o  LOOKUP "path"

   o  GETFH

   o  GETATTR(fsid,fileid,size,time_modify)

   Under the given circumstances, new server.

   Note that: there is no requirement that the following will number of components in
   each rootpath be the result.

   o  PUTROOTFH --> NFS_OK.  The current fh same; there is now no relation between the root number of
   components in rootpath or fs_root, and none of the
      pseudo-fs.

   o  LOOKUP "this" --> NFS_OK.  The current fh is for /this components in each
   rootpath and is
      within fs_root have to be the pseudo-fs.

   o  LOOKUP "is" --> NFS_OK. same.  In the above example, we
   could have had a third element in the locations array, with server
   equal to "servC", and rootpath equal to "/I/II", and a fourth element
   in locations with server equal to "servD" and rootpath equal to
   "/aleph/beth/gimel/daleth/he".

   The current fh relationship between fs_root to a rootpath is that the client
   replaces the pathname indicated in fs_root for /this/is and is
      within the pseudo-fs.

   o  LOOKUP "the" --> NFS_OK.  The current fh is server for /this/is/the and
      is within
   the pseudo-fs.

   o  LOOKUP "path" --> NFS_OK.  The current fh is substitute indicated in rootpath for /this/is/the/path
      and is within the new server.

   For an example of a new, absent referred or migrated file system, but ... the client will
      never see the value of that fh.

   o  GETFH --> NFS4ERR_MOVED.  Fails because current fh suppose there
   is in an absent a file system located at serv1.  At serv1, the start file system is
   located at /az/buky/vedi/glagoli.  The client finds that object at
   glagoli has migrated (or is a referral).  The client gets the
   fs_locations attribute, which contains an fs_root of /az/buky/vedi/
   glagoli, and one element in the operation, locations array, with server equal to
   serv2, and rootpath equal to /izhitsa/fita.  The client replaces /az/
   buky/vedi/glagoli with /izhitsa/fita, and uses the specification
      makes no exception for GETFH.

   o  GETATTR(fsid,fileid,size,time_modify) Not executed because latter pathname on
   serv2.

   Thus, the
      failure of server MUST return an fs_root that is equal to the GETFH stops processing of path the COMPOUND.

   Given
   client used to reach the failure of object to which the GETFH, fs_locations attribute
   applies.  Otherwise, the client has cannot determine the job of determining new path to use
   on the root new server.

8.8.1.  Inferring Transition Modes

   When fs_locations is used, information about the specific locations
   should be assumed based on the following rules.

   The following rules are general and apply irrespective of the absent
   context.

   o  All listed file system and where to find that file
   system, i.e., instances should be considered as of the server
      same handle class if and path relative to that server's root fh.
   Note here that in this example, only if the client did not obtain filehandles
   and current fh_expire_type
      attribute information (e.g., fsid) for does not include the intermediate
   directories, so FH4_VOL_MIGRATION bit.  Note that it would
      in the case of referral, filehandle issues do not apply since
      there can be sure where no filehandles known within the absent current file system starts.  It could be the case, for example, that /this/is/the
      nor is there any access to the root of the moved file system and that the reason that the
   look up of "path" succeeded is that fh_expire_type attribute on the
      referring (absent) file system.

   o  All listed file system was not absent on
   that operation but was moved between instances should be considered as of the last LOOKUP
      same fileid class if and the GETFH
   (since COMPOUND is not atomic).  Even only if we had the fsids for all of fh_expire_type attribute
      indicates persistent filehandles and does not include the intermediate directories, we could have no way of knowing
      FH4_VOL_MIGRATION bit.  Note that
   /this/is/the/path was in the root case of a new file system, referral, fileid
      issues do not apply since we don't
   yet have its fsid.

   In order to get the necessary information, let us re-send the chain
   of LOOKUPs with GETFHs and GETATTRs to at least get the fsids so we there can be sure where no fileids known within the appropriate
      referring (absent) file system boundaries are.  The
   client could choose nor is there any access to get fs_locations at the same time but in most
   cases the client will have a good guess as to where
      fh_expire_type attribute.

   o  All file system
   boundaries are (because instances servers should be considered as of where NFS4ERR_MOVED was, and was not,
   received) making fetching
      different change classes.

   o  All file system instances servers should be considered as of fs_locations unnecessary.

   OP01:  PUTROOTFH --> NFS_OK

   -  Current fh is root
      different readdir classes.

   For other class assignments, handling of pseudo-fs.

   OP02:  GETATTR(fsid) --> NFS_OK

   -  Just file system transitions
   depends on the reasons for completeness.  Normally, clients will know the fsid of transition:

   o  When the pseudo-fs as soon as they establish communication with a
      server.

   OP03:  LOOKUP "this" --> NFS_OK

   OP04:  GETATTR(fsid) --> NFS_OK

   -  Get current fsid transition is due to see where file system boundaries are.  The
      fsid will be migration, that for is, the pseudo-fs in this example, so no
      boundary.

   OP05:  GETFH --> NFS_OK

   -  Current fh is for /this and is within pseudo-fs.

   OP06:  LOOKUP "is" --> NFS_OK

   -  Current fh is for /this/is and is within pseudo-fs.

   OP07:  GETATTR(fsid) --> NFS_OK

   -  Get current fsid client was
      directed to see where a new file system boundaries are.  The
      fsid will after receiving an NFS4ERR_MOVED
      error, the target should be that for treated as being of the pseudo-fs in this example, so no
      boundary.

   OP08:  GETFH --> NFS_OK

   -  Current fh same write-
      verifier class as the source.

   o  When the transition is for /this/is due to failover to another replica, that
      is, the client selected another replica without receiving and is within pseudo-fs.

   OP09:  LOOKUP "the" --> NFS_OK

   -  Current fh is
      NFS4ERR_MOVED error, the target should be treated as being of a
      different write-verifier class from the source.

   The specific choices reflect typical implementation patterns for /this/is/the
   failover and is within pseudo-fs.

   OP10:  GETATTR(fsid) --> NFS_OK

   -  Get current fsid controlled migration, respectively.

   See Section 17 for a discussion on the recommendations for the
   security flavor to see where file system boundaries are.  The
      fsid will be used by any GETATTR operation that for requests the pseudo-fs in this example, so no
      boundary.

   OP11:  GETFH --> NFS_OK

   -  Current fh is for /this/is/the
   "fs_locations" attribute.

9.  File Locking and is within pseudo-fs.

   OP12:  LOOKUP "path" --> NFS_OK

   -  Current fh is for /this/is/the/path and is within a new, absent
      file system, but ...

   -  The client will never see Share Reservations

   Integrating locking into the value NFS protocol necessarily causes it to be
   stateful.  With the inclusion of that fh.

   OP13:  GETATTR(fsid, fs_locations) --> NFS_OK

   -  We are getting share reservations the fsid to know where protocol
   becomes substantially more dependent on state than the file system boundaries
      are. traditional
   combination of NFS and NLM (Network Lock Manager) [xnfs].  There are
   three components to making this state manageable:

   o  clear division between client and server

   o  ability to reliably detect inconsistency in state between client
      and server

   o  simple and robust recovery mechanisms

   In this operation, model, the fsid will be different than that of server owns the parent directory (which in turn was retrieved state information.  The client
   requests changes in OP10).  Note
      that locks and the fsid we server responds with the changes
   made.  Non-client-initiated changes in locking state are given will not necessarily be preserved at infrequent.
   The client receives prompt notification of such changes and can
   adjust its view of the new location.  That fsid might be different, locking state to reflect the server's changes.

   Individual pieces of state created by the server and in fact passed to the
      fsid we have for this file system might be
   client at its request are represented by 128-bit stateids.  These
   stateids may represent a valid fsid particular open file, a set of byte-range
   locks held by a particular owner, or a recallable delegation of
   privileges to access a
      different file system on that new server.

   -  In this in particular case, we are pretty sure anyway that what has
      moved ways or at a particular
   location.

   In all cases, there is /this/is/the/path rather than /this/is/the since we have a transition from the fsid of most general information
   that represents a client as a whole to the latter eventual lightweight
   stateid used for most client and server locking interactions.  The
   details of this transition will vary with the type of object but it
   always starts with a client ID.

   To support Win32 share reservations it is that of necessary to atomically
   OPEN or CREATE files and apply the pseudo-fs, which
      presumably cannot move.  However, appropriate locks in other examples, we might the same
   operation.  Having a separate share/unshare operation would not
      have this kind allow
   correct implementation of information the Win32 OpenFile API.  In order to rely on (e.g., /this/is/the might
      be
   correctly implement share semantics, the previous NFS protocol
   mechanisms used when a non-pseudo file system separate from /this/is/the/path), so
      we is opened or created (LOOKUP, CREATE,
   ACCESS) need to have other reliable source information on be replaced.  The NFSv4 protocol has an OPEN
   operation that subsumes the boundary NFSv3 methodology of LOOKUP, CREATE, and
   ACCESS.  However, because many operations require a filehandle, the file system that
   traditional LOOKUP is moved.  If, for example, the file
      system /this/is had moved, we would have preserved to map a case of migration
      rather than referral, and once file name to filehandle
   without establishing state on the boundaries server.  The policy of granting
   access or modifying files is managed by the migrated file
      system was clear we could fetch fs_locations.

   -  We are fetching fs_locations because server based on the fact that we got an
      NFS4ERR_MOVED at this point means that it
   client's state.  These mechanisms can implement policy ranging from
   advisory only locking to full mandatory locking.

9.1.  Opens and Byte-Range Locks

   It is most likely assumed that this
      is manipulating a referral byte-range lock is rare when
   compared to READ and we need the destination.  Even if it WRITE operations.  It is the case also assumed that /this/is/the
   server restarts and network partitions are relatively rare.
   Therefore it is a file system important that has migrated, we will
      still need the location READ and WRITE operations have a
   lightweight mechanism to indicate if they possess a held lock.  A
   byte-range lock request contains the heavyweight information for that file system.

   OP14:  GETFH --> NFS4ERR_MOVED

   -  Fails because current fh is in an absent file system at required
   to establish a lock and uniquely define the start owner of the operation, and lock.

   The following sections describe the specification makes no exception for
      GETFH.  Note that this means transition from the heavy weight
   information to the eventual stateid used for most client and server will never send
   locking and lease interactions.

9.1.1.  Client ID

   For each LOCK request, the client must identify itself to the server.
   This is done in such a filehandle from within an absent file system.

   Given way as to allow for correct lock
   identification and crash recovery.  A sequence of a SETCLIENTID
   operation followed by a SETCLIENTID_CONFIRM operation is required to
   establish the above, identification onto the server.  Establishment of
   identification by a new incarnation of the client knows where also has the root effect
   of immediately breaking any leased state that a previous incarnation
   of the absent file
   system is (/this/is/the/path) by noting where client might have had on the change of fsid
   occurred (between "the" and "path").  The fs_locations attribute also
   gives server, as opposed to forcing the
   new client incarnation to wait for the actual location of leases to expire.  Breaking
   the absent file system, so
   that lease state amounts to the referral can proceed.  The server gives removing all lock, share
   reservation, and, where the server is not supporting the
   CLAIM_DELEGATE_PREV claim type, all delegation state associated with
   same client with the bare
   minimum same identity.  For discussion of information about delegation
   state recovery, see Section 10.2.1.

   Owners of opens and owners of byte-range locks are separate entities
   and remain separate even if the absent file system so that there
   will be very little scope for problems same opaque arrays are used to
   designate owners of conflict each.  The protocol distinguishes between
   information sent open-
   owners (represented by the referring server open_owner4 structures) and information lock-owners
   (represented by lock_owner4 structures).

   Both sorts of the file
   system's home.  No filehandles and very few attributes are present on
   the referring server, owners consist of a clientid and an opaque owner
   string.  For each client, the client can treat those it receives as
   transient information set of distinct owner values used with
   that client constitutes the function set of enabling owners of that type, for the referral.

7.8.2.  Referral Example (READDIR)

   Another context in which a client may encounter referrals given
   client.

   Each open is when it
   does associated with a READDIR on specific open-owner while each byte-
   range lock is associated with a directory in which some of lock-owner and an open-owner, the sub-directories
   are
   latter being the roots of absent file systems.

   Suppose such a directory open-owner associated with the open file under which
   the LOCK operation was done.

   Client identification is read as follows:

   o  PUTROOTFH

   o  LOOKUP "this"

   o  LOOKUP "is"

   o  LOOKUP "the"

   o  READDIR (fsid, size, time_modify, mounted_on_fileid)

   In this case, because rdattr_error encapsulated in the following structure:

   struct nfs_client_id4 {
           verifier4       verifier;
           opaque          id<NFS4_OPAQUE_LIMIT>;
   };

   The first field, verifier is not requested, fs_locations a client incarnation verifier that is
   not requested, and some of
   used to detect client reboots.  Only if the attributes cannot be provided, verifier is different
   from that which the
   result will be an NFS4ERR_MOVED error on server has previously recorded for the READDIR, with client (as
   identified by the
   detailed results as follows:

   o  PUTROOTFH --> NFS_OK.  The current fh is at second field of the root structure, id) does the server
   start the process of canceling the
      pseudo-fs.

   o  LOOKUP "this" --> NFS_OK. client's leased state.

   The current fh is for /this and second field, id is
      within a variable length string that uniquely
   defines the pseudo-fs.

   o  LOOKUP "is" --> NFS_OK.  The current fh is client.

   There are several considerations for /this/is and is
      within how the pseudo-fs.

   o  LOOKUP "the" --> NFS_OK.  The current fh is for /this/is/the and
      is within client generates the pseudo-fs. id
   string:

   o  READDIR (fsid, size, time_modify, mounted_on_fileid) -->
      NFS4ERR_MOVED.  Note  The string should be unique so that multiple clients do not
      present the same error would have been returned
      if /this/is/the had migrated, but it is returned because the
      directory contains the root string.  The consequences of two clients
      presenting the same string range from one client getting an absent file system.

   So now suppose that we re-send with rdattr_error:

   o  PUTROOTFH

   o  LOOKUP "this"

   o  LOOKUP "is"

   o  LOOKUP "the" error
      to one client having its leased state abruptly and unexpectedly
      canceled.

   o  READDIR (rdattr_error, fsid, size, time_modify, mounted_on_fileid)  The results will be:

   o  PUTROOTFH --> NFS_OK. string should be selected so the subsequent incarnations
      (e.g., reboots) of the same client cause the client to present the
      same string.  The current fh implementor is at cautioned against an approach
      that requires the root string to be recorded in a local file because
      this precludes the use of the
      pseudo-fs.

   o  LOOKUP "this" --> NFS_OK.  The current fh implementation in an environment
      where there is for /this no local disk and all file access is
      within the pseudo-fs. from an NFSv4
      server.

   o  LOOKUP "is" --> NFS_OK.  The current fh is string should be different for /this/is and is
      within each server network address
      that the pseudo-fs.

   o  LOOKUP "the" --> NFS_OK. client accesses, rather than common to all server network
      addresses.  The current fh reason is that it may not be possible for /this/is/the and the
      client to tell if the same server is within listening on multiple network
      addresses.  If the pseudo-fs.

   o  READDIR (rdattr_error, fsid, size, time_modify, mounted_on_fileid)
      --> NFS_OK.  The attributes for directory entry client issues SETCLIENTID with the component
      named "path" same id
      string to each network address of such a server, the server will only contain rdattr_error with
      think it is the value
      NFS4ERR_MOVED, together with an fsid value same client, and a value for
      mounted_on_fileid.

   So suppose we do another READDIR each successive SETCLIENTID will
      cause the server to get fs_locations (although we
   could have used a GETATTR directly, as in Section 7.8.1).

   o  PUTROOTFH

   o  LOOKUP "this"

   o  LOOKUP "is"

   o  LOOKUP "the"

   o  READDIR (rdattr_error, fs_locations, mounted_on_fileid, fsid,
      size, time_modify)

   The results would be:

   o  PUTROOTFH --> NFS_OK.  The current fh is at begin the root process of removing the
      pseudo-fs. client's
      previous leased state.

   o  LOOKUP "this" --> NFS_OK.  The current fh is algorithm for /this and is
      within generating the pseudo-fs.

   o  LOOKUP "is" --> NFS_OK.  The current fh is for /this/is string should not assume that the
      client's network address won't change.  This includes changes
      between client incarnations and is
      within even changes while the pseudo-fs.

   o  LOOKUP "the" --> NFS_OK.  The current fh client is for /this/is/the
      stilling running in its current incarnation.  This means that if
      the client includes just the client's and server's network address
      in the id string, there is within a real risk, after the pseudo-fs.

   o  READDIR (rdattr_error, fs_locations, mounted_on_fileid, fsid,
      size, time_modify) --> NFS_OK.  The attributes will be as shown
      below.

   The attributes for client gives up
      the directory entry with network address, that another client, using a similar
      algorithm for generating the component named
   "path" id string, will only contain: generate a
      conflicting id string.

   Given the above considerations, an example of a well generated id
   string is one that includes:

   o  rdattr_error (value: NFS_OK)  The server's network address.

   o  fs_locations  The client's network address.

   o  mounted_on_fileid (value:  For a user level NFSv4 client, it should contain additional
      information to distinguish the client from other user level
      clients running on the same host, such as an universally unique fileid within referring file
      system)
      identifier (UUID).

   o  fsid (value: unique value within referring server)

   The attributes for entry "path" will not contain size  Additional information that tends to be unique, such as one or time_modify
   because these attributes are not available within an absent file
   system.

7.9.  The Attribute fs_locations
      more of:

      *  The fs_locations attribute client machine's serial number (for privacy reasons, it is structured in
         best to perform some one way function on the following way:

   struct fs_location4 {
           utf8val_REQUIRED4       server<>;
           pathname4               rootpath;
   };

   struct fs_locations4 {
           pathname4       fs_root;
           fs_location4    locations<>;
   }; serial number).

      *  A MAC address.

      *  The fs_location4 data type timestamp of when the NFSv4 software was first installed on
         the client (though this is used subject to represent the location of a
   file system by providing previously mentioned
         caution about using information that is stored in a server name and file,
         because the path file might only be accessible over NFSv4).

      *  A true random number.  However since this number ought to be
         the root of same between client incarnations, this shares the file system within same
         problem as that server's namespace.  When a set of
   servers have corresponding file systems at the same path within their
   namespaces, an array using the timestamp of server names may be provided.  An entry in the server array is software
         installation.

   As a UTF-8 string and represents one of security measure, the server MUST NOT cancel a
   traditional DNS host name, IPv4 address, IPv6 address, or an zero-
   length string.  A zero-length string SHOULD be used to indicate client's leased
   state if the
   current address being used for principal that established the RPC call.  It state for a given id
   string is not a requirement
   that all servers that share the same rootpath be listed in one
   fs_location4 instance.  The array of server names is provided for
   convenience.  Servers that share as the same rootpath may also be listed
   in separate fs_location4 entries in principal issuing the fs_locations attribute.

   The fs_locations4 data type SETCLIENTID.

   Note that SETCLIENTID and fs_locations attribute contain an
   array SETCLIENTID_CONFIRM has a secondary purpose
   of such locations.  Since establishing the namespace of each server may be
   constructed differently, information the "fs_root" field is provided.  The path
   represented by fs_root represents server needs to make callbacks to
   the location client for purpose of supporting delegations.  It is permitted to
   change this information via SETCLIENTID and SETCLIENTID_CONFIRM
   within the file system in
   the current server's namespace, i.e., that same incarnation of the server from which client without removing the fs_locations attribute was obtained.  The fs_root path is meant
   to aid
   client's leased state.

   Once a SETCLIENTID and SETCLIENTID_CONFIRM sequence has successfully
   completed, the client by clearly referencing uses the root shorthand client identifier, of type
   clientid4, instead of the file system
   whose locations are being reported, no matter what object within the
   current file system the current filehandle designates.  The fs_root longer and less compact nfs_client_id4
   structure.  This shorthand client identifier (a client ID) is simply
   assigned by the pathname server and should be chosen so that it will not
   conflict with a client ID previously assigned by the server.  This
   applies across server restarts or reboots.  When a client used ID is
   presented to reach the object on a server and that client ID is not recognized, as would
   happen after a server reboot, the
   current server (i.e., will reject the object to which request with
   the fs_locations attribute
   applies). error NFS4ERR_STALE_CLIENTID.  When this happens, the fs_locations attribute is interrogated client must
   obtain a new client ID by use of the SETCLIENTID operation and there are no
   alternate file system locations, then
   proceed to any other necessary recovery for the server SHOULD return reboot case
   (See Section 9.6.2).

   The client must also employ the SETCLIENTID operation when it
   receives a zero-
   length array of fs_location4 structures, together with NFS4ERR_STALE_STATEID error using a valid
   fs_root.

   As an example, suppose there is stateid derived from
   its current client ID, since this also indicates a replicated file system located at
   two servers (servA and servB).  At servA, server reboot
   which has invalidated the file system is located
   at path /a/b/c.  At, servB existing client ID (see Section 9.6.2 for
   details).

   See the file system is located at path /x/y/z. detailed descriptions of SETCLIENTID and SETCLIENTID_CONFIRM
   for a complete specification of the operations.

9.1.2.  Server Release of Client ID

   If the server determines that the client were holds no associated state
   for its client ID, the server may choose to obtain release the fs_locations value client ID.
   The server may make this choice for the directory
   at /a/b/c/d, it might not necessarily know an inactive client so that
   resources are not consumed by those intermittently active clients.
   If the file system's
   root is located in servA's namespace at /a/b/c.  When client contacts the server after this release, the server must
   ensure the client
   switches to servB, receives the appropriate error so that it will need use
   the SETCLIENTID/SETCLIENTID_CONFIRM sequence to determine establish a new
   identity.  It should be clear that the directory it
   first referenced at servA is now represented by the path /x/y/z/d on
   servB.  To facilitate this, server must be very hesitant
   to release a client ID since the fs_locations attribute provided by
   servA would have resulting work on the client to
   recover from such an fs_root value of /a/b/c and two entries in
   fs_locations.  One entry in fs_locations event will be for itself (servA)
   and the other will be same burden as if the server
   had failed and restarted.  Typically a server would not release a
   client ID unless there had been no activity from that client for servB with many
   minutes.

   Note that if the id string in a path of /x/y/z.  With this
   information, SETCLIENTID request is properly
   constructed, and if the client is able takes care to substitute /x/y/z use the same principal
   for each successive use of SETCLIENTID, then, barring an active
   denial of service attack, NFS4ERR_CLID_INUSE should never be
   returned.

   However, client bugs, server bugs, or perhaps a deliberate change of
   the /a/b/c
   at principal owner of the beginning id string (such as the case of its access path a client
   that changes security flavors, and construct /x/y/z/d to use for under the new server.

   Note that: flavor, there is no requirement that
   mapping to the number of components previous owner) will in
   each rootpath be rare cases result in
   NFS4ERR_CLID_INUSE.

   In that event, when the same; there is server gets a SETCLIENTID for a client ID
   that currently has no relation between the number of
   components in rootpath state, or fs_root, and none of it has state, but the components in each
   rootpath lease has
   expired, rather than returning NFS4ERR_CLID_INUSE, the server MUST
   allow the SETCLIENTID, and fs_root have to be confirm the same.  In new client ID if followed by
   the above example, we
   could have had a third element in appropriate SETCLIENTID_CONFIRM.

9.1.3.  Stateid Definition

   When the locations array, with server
   equal to "servC", and rootpath equal to "/I/II", and grants a fourth element
   in locations with server equal to "servD" lock of any type (including opens, byte-
   range locks, and rootpath equal to
   "/aleph/beth/gimel/daleth/he".

   The relationship between fs_root to delegations), it responds with a rootpath is unique stateid that the client
   replaces the pathname indicated in fs_root
   represents a set of locks (often a single lock) for the current server for same file, of
   the substitute indicated in rootpath for same type, and sharing the new server.

   For same ownership characteristics.  Thus,
   opens of the same file by different open-owners each have an example
   identifying stateid.  Similarly, each set of byte-range locks on a referred or migrated
   file system, suppose there
   is owned by a file system located at serv1.  At serv1, the file system is
   located at /az/buky/vedi/glagoli.  The client finds that object at
   glagoli specific lock-owner has migrated (or its own identifying stateid.
   Delegations also have associated stateids by which they may be
   referenced.  The stateid is used as a referral).  The client gets the
   fs_locations attribute, which contains an fs_root shorthand reference to a lock
   or set of /az/buky/vedi/
   glagoli, locks, and one element in given a stateid, the locations array, with server equal to
   serv2, and rootpath equal to /izhitsa/fita.  The client replaces /az/
   buky/vedi/glagoli with /izhitsa/fita, and uses can determine the latter pathname on
   serv2.

   Thus,
   associated state-owner or state-owners (in the server MUST return case of an fs_root that is equal to open-owner/
   lock-owner pair) and the path associated filehandle.  When stateids are
   used, the
   client used to reach the object to which the fs_locations attribute
   applies.  Otherwise, current filehandle must be the one associated with that
   stateid.

   All stateids associated with a given client cannot determine the new path to use
   on ID are associated with a
   common lease that represents the new server.

7.9.1.  Inferring Transition Modes

   When fs_locations is used, information about claim of those stateids and the specific locations
   should
   objects they represent to be assumed based on maintained by the following rules.

   The following rules are general and apply irrespective server.  See
   Section 9.5 for a discussion of the
   context.

   o  All listed file system instances should lease.

   Each stateid must be considered as of the
      same handle class if and only if unique to the current fh_expire_type
      attribute does server.  Many operations take a
   stateid as an argument but not include the FH4_VOL_MIGRATION bit.  Note that
      in a clientid, so the case of referral, filehandle issues do not apply since
      there can server must be no filehandles known within the current file system
      nor is there any access able
   to infer the fh_expire_type attribute on the
      referring (absent) file system.

   o  All listed file system instances should be considered as of client from the
      same fileid class if and only if stateid.

9.1.3.1.  Stateid Types

   With the fh_expire_type attribute
      indicates persistent filehandles and does not include exception of special stateids (see Section 9.1.3.3), each
   stateid represents locking objects of one of a set of types defined
   by the
      FH4_VOL_MIGRATION bit. NFSv4 protocol.  Note that in the case all these cases, where we speak
   of referral, fileid
      issues do not apply since there can be no fileids known within the
      referring (absent) file system nor guarantee, it is understood there any access to are situations such as a client
   restart, or lock revocation, that allow the
      fh_expire_type attribute.

   o  All file system instances servers should guarantee to be considered as of
      different change classes. voided.

   o  All file system instances servers should be considered as of
      different readdir classes.

   For other class assignments, handling  Stateids may represent opens of file system transitions
   depends on files.

      Each stateid in this case represents the reasons OPEN state for a given
      client ID/open-owner/filehandle triple.  Such stateids are subject
      to change (with consequent incrementing of the transition:

   o  When the transition is due stateid's seqid) in
      response to migration, OPENs that is, result in upgrade and OPEN_DOWNGRADE
      operations.

   o  Stateids may represent sets of byte-range locks.

      All locks held on a particular file by a particular owner and all
      gotten under the client was
      directed to aegis of a new particular open file system after receiving an NFS4ERR_MOVED
      error, are associated
      with a single stateid with the target should be treated as seqid being incremented whenever
      LOCK and LOCKU operations affect that set of the same write-
      verifier class as the source. locks.

   o  When  Stateids may represent file delegations, which are recallable
      guarantees by the transition is due to failover server to another replica, the client, that
      is, other clients will
      not reference, or will not modify a particular file, until the
      delegation is returned.

      A stateid represents a single delegation held by a client selected another replica without receiving and
      NFS4ERR_MOVED error, the target should be treated as being of for a
      different write-verifier class from the source.

   The specific choices reflect typical implementation patterns for
   failover
      particular filehandle.

9.1.3.2.  Stateid Structure

   Stateids are divided into two fields, a 96-bit "other" field
   identifying the specific set of locks and controlled migration, respectively.

   See a 32-bit "seqid" sequence
   value.  Except in the case of special stateids (see Section 17 for 9.1.3.3),
   a discussion on particular value of the recommendations "other" field denotes a set of locks of the
   same type (for example, byte-range locks, opens, or delegations), for
   a specific file or directory, and sharing the
   security flavor same ownership
   characteristics.  The seqid designates a specific instance of such a
   set of locks, and is incremented to be used indicate changes in such a set of
   locks, either by any GETATTR operation that requests the
   "fs_locations" attribute.

8.  NFS Server Name Space

8.1.  Server Exports

   On a UNIX server addition or deletion of locks from the name space describes all set, a
   change in the files reachable by
   pathnames under byte-range they apply to, or an upgrade or downgrade in
   the root directory type of one or "/".  On more locks.

   When such a Windows NT server
   the name space constitutes all set of locks is first created, the files on disks named by mapped
   disk letters.  NFS server administrators rarely make the entire
   server's filesystem name space available to NFS clients.  More often
   portions SHOULD return a
   stateid with seqid value of one.  On subsequent operations that
   modify the name space are made available via an "export"
   feature.  In previous versions set of locks, the NFS protocol, the root
   filehandle for each export server is obtained through the MOUNT protocol; required to increment the client sends
   "seqid" field by one whenever it returns a string that identifies stateid for the export of name space same
   state-owner/file/type combination and there is some change in the set
   of locks actually designated.  In this case, the server returns will return a
   stateid with an "other" field the root filehandle same as previously used for it.  The MOUNT
   protocol supports an EXPORTS procedure that will enumerate
   state-owner/file/type combination, with an incremented "seqid" field.
   This pattern continues until the
   server's exports.

8.2.  Browsing Exports seqid is incremented past
   NFS4_UINT32_MAX, and one (not zero) SHOULD be the next seqid value.
   The NFSv4 protocol provides a root filehandle that clients can use to
   obtain filehandles for these exports via a multi-component LOOKUP.  A
   common user experience purpose of the incrementing of the seqid is to use a graphical user interface (perhaps
   a file "Open" dialog window) to find a file via progressive browsing
   through a directory tree.  The client must be able allow the server
   to move from one
   export communicate to another export via single-component, progressive LOOKUP
   operations.

   This style of browsing is not well supported by the NFSv2 and NFSv3
   protocols.  The client expects all LOOKUP the order in which operations to remain within that
   modified locking state associated with a single server filesystem.  For example, stateid have been processed.

   In making comparisons between seqids, both by the device attribute will
   not change.  This prevents a client from taking name space paths that
   span exports.

   An automounter on in
   determining the client can obtain a snapshot order of operations and by the server's
   name space using server in determining
   whether the EXPORTS procedure of NFS4ERR_OLD_STATEID is to be returned, the MOUNT protocol.  If it
   understands possibility of
   the server's pathname syntax, it can create an image of
   the server's name space on the client.  The parts of seqid being swapped around past the name space
   that NFS4_UINT32_MAX value needs
   to be taken into account.

9.1.3.3.  Special Stateids

   Stateid values whose "other" field is either all zeros or all ones
   are reserved.  They may not exported be assigned by the server are filled in with a "pseudo
   filesystem" that allows the user to browse from one mounted
   filesystem to another.  There is a drawback to this representation of but have
   special meanings defined by the server's name space protocol.  The particular meaning
   depends on whether the client: it "other" field is static.  If the server
   administrator adds a new export the client will be unaware of it.

8.3.  Server Pseudo Filesystem

   NFSv4 servers avoid this name space inconsistency by presenting all zeros or all ones and the exports within
   specific value of the framework "seqid" field.

   The following combinations of a single server name space.  An
   NFSv4 client uses LOOKUP "other" and READDIR operations to browse seamlessly
   from one export to another.  Portions of the server name space that "seqid" are not exported defined in
   NFSv4:

   o  When "other" and "seqid" are bridged via a "pseudo filesystem" that provides
   a view of exported directories only.  A pseudo filesystem has both zero, the stateid is treated as
      a
   unique fsid special anonymous stateid, which can be used in READ, WRITE, and behaves like a normal, read only filesystem.

   Based on
      SETATTR requests to indicate the construction absence of any open state
      associated with the server's name space, it request.  When an anonymous stateid value is possible
   that multiple pseudo filesystems may exist.  For example,

     /a         pseudo filesystem
     /a/b       real filesystem
     /a/b/c     pseudo filesystem
     /a/b/c/d   real filesystem

   Each of the pseudo filesystems are considered separate entities
      used, and
   therefore an existing open denies the form of access requested,
      then access will have a unique fsid.

8.4.  Multiple Roots

   The DOS be denied to the request.

   o  When "other" and Windows operating environments are sometimes described as
   having "multiple roots".  Filesystems are commonly represented as
   disk letters.  MacOS represents filesystems as top level names.
   NFSv4 servers for these platforms can construct a pseudo file system
   above these root names so that disk letters or volume names "seqid" are
   simply directory names in the pseudo root.

8.5.  Filehandle Volatility

   The nature of both all ones, the server's pseudo filesystem stateid is that a
      special READ bypass stateid.  When this value is used in WRITE or
      SETATTR, it is a logical
   representation of filesystem(s) available from the server.
   Therefore, treated like the pseudo filesystem is most likely constructed
   dynamically when anonymous value.  When used in
      READ, the server MAY grant access, even if access would normally
      be denied to READ requests.

   If a stateid value is first instantiated.  It is expected
   that used which has all zero or all ones in the pseudo filesystem may
   "other" field, but does not have an on disk counterpart from
   which persistent filehandles could be constructed.  Even though it is
   preferable that match one of the server provide persistent filehandles for cases above, the
   pseudo filesystem, server
   MUST return the NFS error NFS4ERR_BAD_STATEID.

   Special stateids, unlike other stateids, are not associated with
   individual client should expect that pseudo file
   system IDs or filehandles are volatile.  This and can be confirmed by checking
   the associated "fh_expire_type" attribute for those filehandles in
   question.  If the filehandles are volatile, the NFS used with all valid
   client IDs and filehandles.

9.1.3.4.  Stateid Lifetime and Validation

   Stateids must be
   prepared to recover remain valid until either a filehandle value (e.g., with client restart or a multi-component
   LOOKUP) when receiving an error of NFS4ERR_FHEXPIRED.

8.6.  Exported Root

   If server
   restart or until the server's root filesystem is exported, one might conclude that
   a pseudo-filesystem is not needed.  This would be wrong.  Assume client returns all of the
   following filesystems on a server:

     /       disk1  (exported)
     /a      disk2  (not exported)
     /a/b    disk3  (exported)

   Because disk2 is not exported, disk3 cannot be reached locks associated with simple
   LOOKUPs.  The server must bridge
   the gap with a pseudo-filesystem.

8.7.  Mount Point Crossing

   The server filesystem environment may be constructed in stateid by means of an operation such a way
   that one filesystem contains a directory which is 'covered' as CLOSE or
   mounted upon by a second filesystem.  For example:

     /a/b            (filesystem 1)
     /a/b/c/d        (filesystem 2)

   The pseudo filesystem for this server may be constructed to look
   like:

     /               (place holder/not exported)
     /a/b            (filesystem 1)
     /a/b/c/d        (filesystem 2)

   It is the server's responsibility to present DELEGRETURN.
   If the pseudo filesystem
   that is complete locks are lost due to the client.  If revocation as long as the client sends a lookup request
   for the path "/a/b/c/d", the server's response ID is
   valid, the filehandle of
   the filesystem "/a/b/c/d".  In previous versions stateid remains a valid designation of that revoked state.
   Stateids associated with byte-range locks are an exception.  They
   remain valid even if a LOCKU frees all remaining locks, so long as
   the NFS protocol,
   the server would respond open file with which they are associated remains open.

   It should be noted that there are situations in which the filehandle of directory "/a/b/c/d"
   within client's
   locks become invalid, without the filesystem "/a/b".

   The NFS client will requesting they be able to determine if it crosses a server mount
   point by returned.
   These include lease expiration and a change in the value number of the "fsid" attribute.

8.8.  Security Policy and Name Space Presentation

   The application forms of lock
   revocation within the server's security policy needs to be carefully
   considered by the implementor.  One may choose lease period.  It is important to limit the
   viewability of portions of the pseudo filesystem based on note that in
   these situations, the
   server's perception of stateid remains valid and the client's ability client can use it
   to authenticate itself
   properly.  However, with determine the support disposition of multiple security mechanisms
   and the ability to negotiate the appropriate use associated lost locks.

   An "other" value must never be reused for a different purpose (i.e.
   different filehandle, owner, or type of these mechanisms, locks) within the server is unable to properly determine if context of
   a single client will be able
   to authenticate itself.  If, based on its policies, the ID.  A server
   chooses to limit may retain the contents of "other" value for the pseudo filesystem,
   same purpose beyond the server point where it may effectively hide filesystems from a client otherwise be freed but if
   it does so, it must maintain "seqid" continuity with previous values.

   One mechanism that may otherwise
   have legitimate access.

   As suggested practice, be used to satisfy the server should apply requirement that the security policy of
   a shared resource in
   server recognize invalid and out-of-date stateids is for the server's namespace server
   to divide the components "other" field of the
   resource's ancestors.  For example:

     /
     /a/b
     /a/b/c

   The /a/b/c directory is stateid into two fields.

   o  An index into a real filesystem and is the shared resource.
   The security policy for /a/b/c is Kerberos with integrity.  The
   server should apply the same security policy to /, /a, and /a/b.
   This allows for the extension of the protection of the server's
   namespace to the ancestors of the real shared resource.

   For the case table of the use locking-state structures.

   o  A generation number which is incremented on each allocation of multiple, disjoint security mechanisms in
   the server's resources, the security a
      table entry for a particular object use.

   And then store in each table entry,

   o  The client ID with which the
   server's namespace should be the union of all security mechanisms of
   all direct descendants.

9.  File Locking and Share Reservations

   Integrating locking into the NFS protocol necessarily causes it to be
   stateful.  With stateid is associated.

   o  The current generation number for the inclusion (at most one) valid stateid
      sharing this index value.

   o  The filehandle of share reservations the protocol
   becomes substantially more dependent file on state than which the traditional
   combination of NFS and NLM (Network Lock Manager) [xnfs].  There locks are
   three components to making this state manageable: taken.

   o  clear division between client and server  An indication of the type of stateid (open, byte-range lock, file
      delegation).

   o  ability  The last "seqid" value returned corresponding to reliably detect inconsistency in state between client
      and server the current
      "other" value.

   o  simple and robust recovery mechanisms

   In this model,  An indication of the server owns current status of the state information.  The client
   requests changes in locks and the server responds associated with the changes
   made.  Non-client-initiated changes in locking state are infrequent.
   The client receives prompt notification of such changes
      this stateid.  In particular, whether these have been revoked and
      if so, for what reason.

   With this information, an incoming stateid can
   adjust its view of the locking state to reflect the server's changes.

   Individual pieces of state created by the server be validated and passed to the
   client at its request are represented by 128-bit stateids.  These
   appropriate error returned when necessary.  Special and non-special
   stateids may represent a particular open file, are handled separately.  (See Section 9.1.3.3 for a set
   discussion of byte-range
   locks held by special stateids.)

   When a particular owner, stateid is being tested, and the "other" field is all zeros or
   all ones, a recallable delegation of
   privileges to access check that the "other" and "seqid" fields match a file in particular ways or at defined
   combination for a particular
   location.

   In all cases, there special stateid is a transition from done and the most general information
   that represents a client results determined
   as a whole to follows:

   o  If the eventual lightweight
   stateid used for most client "other" and server locking interactions.  The
   details of this transition will vary "seqid" fields do not match a defined
      combination associated with a special stateid, the type of object error
      NFS4ERR_BAD_STATEID is returned.

   o  If the combination is valid in general but it
   always starts with a client ID.

   To support Win32 share reservations it is necessary to atomically
   OPEN or CREATE files.  Having a separate share/unshare operation
   would not allow correct implementation of the Win32 OpenFile API.  In
   order appropriate to correctly implement share semantics,
      the previous NFS
   protocol mechanisms context in which the stateid is used (e.g., an all-zero
      stateid is used when an open stateid is required in a file LOCK
      operation), the error NFS4ERR_BAD_STATEID is opened or created (LOOKUP,
   CREATE, ACCESS) need to be replaced.  The NFSv4 protocol has an OPEN
   operation that subsumes also returned.

   o  Otherwise, the NFSv3 methodology of LOOKUP, CREATE, check is completed and
   ACCESS.  However, because many operations require a filehandle, the
   traditional LOOKUP special stateid is preserved to map
      accepted as valid.

   When a file name to filehandle
   without establishing state on the server.  The policy of granting
   access or modifying files stateid is managed by being tested, and the server based on "other" field is neither all
   zeros or all ones, the
   client's state.  These mechanisms can implement policy ranging from
   advisory only locking following procedure could be used to full mandatory locking.

9.1.  Opens validate
   an incoming stateid and Byte-Range Locks

   It is assumed that manipulating a byte-range lock is rare return an appropriate error, when
   compared to READ and WRITE operations.  It is also assumed necessary,
   assuming that
   server restarts the "other" field would be divided into a table index
   and network partitions are relatively rare.
   Therefore it an entry generation.

   o  If the table index field is important that outside the READ and WRITE operations have a
   lightweight mechanism to indicate if they possess a held lock.  A
   byte-range lock request contains range of the heavyweight information required
   to establish a lock and uniquely define associated
      table, return NFS4ERR_BAD_STATEID.

   o  If the owner selected table entry is of a different generation than that
      specified in the lock.

   The following sections describe incoming stateid, return NFS4ERR_BAD_STATEID.

   o  If the transition from selected table entry does not match the heavy weight
   information to current filehandle,
      return NFS4ERR_BAD_STATEID.

   o  If the eventual stateid used for most client and server
   locking and represents revoked state or state lost as a result
      of lease interactions.

9.1.1.  Client ID

   For each LOCK request, the client must identify itself to expiration, then return NFS4ERR_EXPIRED,
      NFS4ERR_BAD_STATEID, or NFS4ERR_ADMIN_REVOKED, as appropriate.

   o  If the server.
   This stateid type is done not valid for the context in such which the
      stateid appears, return NFS4ERR_BAD_STATEID.  Note that a way as to allow stateid
      may be valid in general, but be invalid for correct lock
   identification and crash recovery.  A sequence of a SETCLIENTID
   operation followed by particular
      operation, as, for example, when a SETCLIENTID_CONFIRM operation stateid which doesn't represent
      byte-range locks is required passed to
   establish the identification onto the server.  Establishment of
   identification by a new incarnation of the client also has the effect
   of immediately breaking any leased state that a previous incarnation non-from_open case of the client might have had on the server, as opposed to forcing the
   new client incarnation to wait for the leases LOCK or to expire.  Breaking
   the lease state amounts
      LOCKU, or when a stateid which does not represent an open is
      passed to CLOSE or OPEN_DOWNGRADE.  In such cases, the server removing all lock, share
   reservation, and, where MUST
      return NFS4ERR_BAD_STATEID.

   o  If the server "seqid" field is not supporting zero, and it is greater than the
   CLAIM_DELEGATE_PREV claim type, all delegation state associated with
   same client with
      current sequence value corresponding the same identity.  For discussion of delegation
   state recovery, see Section 10.2.1.

   Owners of opens and owners of byte-range locks are separate entities
   and remain separate even if current "other" field,
      return NFS4ERR_BAD_STATEID.

   o  If the same opaque arrays are used to
   designate owners of each.  The protocol distinguishes between open-
   owners (represented by open_owner4 structures) and lock-owners
   (represented by lock_owner4 structures).

   Both sorts of owners consist of a clientid "seqid" field is less than the current sequence value
      corresponding the current "other" field, return
      NFS4ERR_OLD_STATEID.

   o  Otherwise, the stateid is valid and an opaque owner
   string.  For each client, the set table entry should contain
      any additional information about the type of distinct owner values used stateid and
      information associated with that client constitutes the set of owners particular type of that type, for stateid, such
      as the given
   client.

   Each open is associated with a specific set of locks, such as open-owner while each byte-
   range lock is associated with a lock-owner and an open-owner, the
   latter being the open-owner associated with lock-owner
      information, as well as information on the specific locks, such as
      open file under which
   the LOCK operation was done.

   Client identification is encapsulated in the following structure:

   struct nfs_client_id4 {
           verifier4       verifier;
           opaque          id<NFS4_OPAQUE_LIMIT>;
   };

   The first field, verifier is a client incarnation verifier that is
   used modes and byte ranges.

9.1.3.5.  Stateid Use for I/O Operations

   Clients performing Input/Output (I/O) operations need to detect client reboots.  Only if the verifier is different
   from that which the server has previously recorded select an
   appropriate stateid based on the client (as
   identified locks (including opens and
   delegations) held by the second field of the structure, id) does the server
   start client and the process various types of canceling state-owners
   sending the client's leased state.

   The second field, id is a variable length string I/O requests.  SETATTR operations that uniquely
   defines change the client.

   There file
   size are several considerations for how the client generates the id
   string:

   o treated like I/O operations in this regard.

   The string should be unique so that multiple clients do not
      present following rules, applied in order of decreasing priority, govern
   the same string.  The consequences selection of two clients
      presenting the same string range from one appropriate stateid.  In following these rules,
   the client getting will only consider locks of which it has actually received
   notification by an error
      to one client having its leased state abruptly and unexpectedly
      canceled. appropriate operation response or callback.

   o  The string should be selected so the subsequent incarnations
      (e.g., reboots) of  If the same client cause holds a delegation for the client to present file in question, the
      same string.  The implementor is cautioned against an approach
      that requires
      delegation stateid SHOULD be used.

   o  Otherwise, if the string entity corresponding to be recorded in the lock-owner (e.g., a local file because
      this precludes
      process) sending the use of I/O has a byte-range lock stateid for the implementation in an environment
      where
      associated open file, then the byte-range lock stateid for that
      lock-owner and open file SHOULD be used.

   o  If there is no local disk byte-range lock stateid, then the OPEN stateid for
      the current open-owner, and all that OPEN stateid for the open file access is from an NFSv4
      server.

   o  The string should in
      question SHOULD be different for each server network address
      that used.

   o  Finally, if none of the client accesses, rather than common to all server network
      addresses.  The reason is that it may not above apply, then a special stateid SHOULD
      be possible for used.

   Ignoring these rules may result in situations in which the
      client server
   does not have information necessary to tell if properly process the same server is listening on multiple network
      addresses.  If request.
   For example, when mandatory byte-range locks are in effect, if the client issues SETCLIENTID with
   stateid does not indicate the same id
      string to each network address of such proper lock-owner, via a server, the lock stateid,
   a request might be avoidably rejected.

   The server will
      think it is the same client, however should not try to enforce these ordering rules and each successive SETCLIENTID will
      cause the server
   should use whatever information is available to begin the properly process I/O
   requests.  In particular, when a client has a delegation for a given
   file, it SHOULD take note of removing the client's
      previous leased state.

   o  The algorithm this fact in processing a request, even
   if it is sent with a special stateid.

9.1.3.6.  Stateid Use for generating SETATTR Operations

   In the string should not assume case of SETATTR operations, a stateid is present.  In cases
   other than those that set the
      client's network address won't change.  This includes changes
      between client incarnations and even changes while file size, the client may send either a
   special stateid or, when a delegation is
      stilling running in its current incarnation.  This means that if
      the client includes just held for the client's and server's network address file in the id string, there is
   question, a real risk, after delegation stateid.  While the client gives up server SHOULD validate the network address, that another client, using a similar
      algorithm for generating
   stateid and may use the id string, will generate stateid to optimize the determination as to
   whether a
      conflicting id string.

   Given delegation is held, it SHOULD note the above considerations, an example presence of a well generated id
   string
   delegation even when a special stateid is one that includes:

   o  The server's network address.

   o  The client's network address.

   o  For sent, and MUST accept a user level NFSv4 client, it should contain additional
      information to distinguish
   valid delegation stateid when sent.

9.1.4.  lock-owner

   When requesting a lock, the client from other user level
      clients running on must present to the same host, such as server the
   client ID and an universally unique identifier (UUID).

   o  Additional information that tends to be unique, such as one or
      more of:

      *  The client machine's serial number (for privacy reasons, it is
         best to perform some one way function on for the serial number).

      *  A MAC address.

      *  The timestamp owner of when the NFSv4 software was first installed on
         the client (though this is subject requested lock.
   These two fields are referred to as the previously mentioned
         caution about using information that is stored in a file,
         because lock-owner and the file might only be accessible over NFSv4).

      * definition
   of those fields are:

   o  A true random number.  However since this number ought to be
         the same between client incarnations, this shares ID returned by the same
         problem server as that part of the using the timestamp client's use of
      the software
         installation.

   As a security measure, the server MUST NOT cancel SETCLIENTID operation.

   o  A variable length opaque array used to uniquely define the owner
      of a client's leased
   state if lock managed by the principal that established client.

      This may be a thread id, process id, or other unique value.

   When the state for server grants the lock, it responds with a given id
   string unique stateid.
   The stateid is not the same used as the principal issuing the SETCLIENTID.

   Note that SETCLIENTID and SETCLIENTID_CONFIRM has a secondary purpose
   of establishing shorthand reference to the information lock-owner, since
   the server needs to make callbacks to will be maintaining the client for purpose correspondence between them.

9.1.5.  Use of supporting delegations.  It is permitted to
   change this information via SETCLIENTID the Stateid and SETCLIENTID_CONFIRM
   within Locking

   All READ, WRITE and SETATTR operations contain a stateid.  For the same incarnation
   purposes of this section, SETATTR operations which change the client without removing the
   client's leased state.

   Once size
   attribute of a SETCLIENTID file are treated as if they are writing the area
   between the old and SETCLIENTID_CONFIRM sequence has successfully
   completed, new size (i.e., the client uses range truncated or added to
   the shorthand client identifier, of type
   clientid4, instead file by means of the longer and less compact nfs_client_id4
   structure.  This shorthand client identifier (a client ID) SETATTR), even where SETATTR is
   assigned by not
   explicitly mentioned in the server and should text.  The stateid passed to one of these
   operations must be chosen so one that represents an OPEN (e.g., via the open-
   owner), a set of byte-range locks, or a delegation, or it will not
   conflict with may be a client ID previously assigned by the server.  This
   applies across server restarts
   special stateid representing anonymous access or reboots.  When the special bypass
   stateid.

   If the state-owner performs a client ID is
   presented to READ or WRITE in a server and that client ID is not recognized, as would
   happen after situation in which
   it has established a server reboot, lock or share reservation on the server will reject the request with
   the error NFS4ERR_STALE_CLIENTID.  When this happens, the client must
   obtain (any
   OPEN constitutes a new client ID share reservation) the stateid (previously
   returned by use of the SETCLIENTID operation and then
   proceed server) must be used to any other necessary recovery for indicate what locks,
   including both byte-range locks and share reservations, are held by
   the server reboot case
   (See Section 9.6.2).

   The client must also employ state-owner.  If no state is established by the SETCLIENTID operation when it
   receives a NFS4ERR_STALE_STATEID error using client, either
   byte-range lock or share reservation, a stateid derived from
   its current client ID, since this also indicates a server reboot
   which has invalidated the existing client ID (see Section 9.6.2 for
   details).

   See the detailed descriptions of SETCLIENTID and SETCLIENTID_CONFIRM
   for all bits 0 is
   used.  Regardless whether a complete specification of the operations.

9.1.2.  Server Release stateid of Client ID

   If all bits 0, or a stateid
   returned by the server determines that is used, if there is a conflicting share
   reservation or mandatory byte-range lock held on the client holds no associated state
   for its client ID, file, the server may choose
   MUST refuse to release service the client ID.
   The server may make this choice for an inactive client so that
   resources READ or WRITE operation.

   Share reservations are not consumed established by those intermittently active clients.
   If the client contacts the server after this release, the server must
   ensure the client receives the appropriate error so OPEN operations and by their
   nature are mandatory in that it will use when the SETCLIENTID/SETCLIENTID_CONFIRM sequence to establish a new
   identity.  It should be clear OPEN denies READ or WRITE
   operations, that denial results in such operations being rejected
   with error NFS4ERR_LOCKED.  Byte-range locks may be implemented by
   the server must as either mandatory or advisory, or the choice of
   mandatory or advisory behavior may be very hesitant
   to release a client ID since determined by the resulting work server on the client to
   recover from such an event will be the same burden as if
   basis of the server
   had failed and restarted.  Typically a server would not release file being accessed (for example, some UNIX-based
   servers support a
   client ID unless there had been no activity from that client for many
   minutes.

   Note "mandatory lock bit" on the mode attribute such
   that if set, byte-range locks are required on the id string in a SETCLIENTID request file before I/O is properly
   constructed,
   possible).  When byte-range locks are advisory, they only prevent the
   granting of conflicting lock requests and if have no effect on READs or
   WRITEs.  Mandatory byte-range locks, however, prevent conflicting I/O
   operations.  When they are attempted, they are rejected with
   NFS4ERR_LOCKED.  When the client takes care gets NFS4ERR_LOCKED on a file it
   knows it has the proper share reservation for, it will need to use issue
   a LOCK request on the same principal
   for each successive use of SETCLIENTID, then, barring an active
   denial region of service attack, NFS4ERR_CLID_INUSE should never the file that includes the region the
   I/O was to be
   returned.

   However, client bugs, server bugs, or perhaps performed on, with an appropriate locktype (i.e.,
   READ*_LT for a deliberate change READ operation, WRITE*_LT for a WRITE operation).

   With NFSv3, there was no notion of a stateid so there was no way to
   tell if the principal owner application process of the id string (such as the case of a client
   that changes security flavors, and under sending the new flavor, READ or
   WRITE operation had also acquired the appropriate byte-range lock on
   the file.  Thus there is was no
   mapping way to implement mandatory locking.
   With the previous owner) will in rare cases result in
   NFS4ERR_CLID_INUSE.

   In stateid construct, this barrier has been removed.

   Note that event, when the server gets a SETCLIENTID for a client ID UNIX environments that currently has no state, or it has state, but support mandatory file locking,
   the lease has
   expired, rather than returning NFS4ERR_CLID_INUSE, distinction between advisory and mandatory locking is subtle.  In
   fact, advisory and mandatory byte-range locks are exactly the server MUST
   allow same in
   so far as the SETCLIENTID, APIs and confirm requirements on implementation.  If the new client ID
   mandatory lock attribute is set on the file, the server checks to see
   if followed by the lock-owner has an appropriate SETCLIENTID_CONFIRM.

9.1.3.  Stateid Definition

   When shared (read) or exclusive
   (write) byte-range lock on the region it wishes to read or write to.
   If there is no appropriate lock, the server grants checks if there is a
   conflicting lock (which can be done by attempting to acquire the
   conflicting lock on the behalf of any type (including opens, byte-
   range locks, the lock-owner, and delegations), it responds with a unique stateid that
   represents a set of locks (often a single lock) for if successful,
   release the same file, of lock after the same type, READ or WRITE is done), and sharing if there is,
   the same ownership characteristics. server returns NFS4ERR_LOCKED.

   For Windows environments, there are no advisory byte-range locks, so
   the server always checks for byte-range locks during I/O requests.

   Thus,
   opens the NFSv4 LOCK operation does not need to distinguish between
   advisory and mandatory byte-range locks.  It is the NFS version 4
   server's processing of the same file READ and WRITE operations that introduces
   the distinction.

   Every stateid other than the special stateid values noted in this
   section, whether returned by different open-owners each have an
   identifying stateid.  Similarly, each set of byte-range locks on a
   file owned OPEN-type operation (i.e., OPEN,
   OPEN_DOWNGRADE), or by a specific lock-owner has its own identifying stateid.
   Delegations also have associated stateids LOCK-type operation (i.e., LOCK or LOCKU),
   defines an access mode for the file (i.e., READ, WRITE, or READ-
   WRITE) as established by the original OPEN which they may be
   referenced.  The began the stateid is used
   sequence, and as modified by subsequent OPENs and OPEN_DOWNGRADEs
   within that stateid sequence.  When a shorthand reference to a lock READ, WRITE, or set of locks, and SETATTR which
   specifies the size attribute, is done, the operation is subject to
   checking against the access mode to verify that the operation is
   appropriate given a stateid, the server can determine OPEN with which the
   associated state-owner or state-owners (in operation is associated.

   In the case of an open-owner/
   lock-owner pair) WRITE-type operations (i.e., WRITEs and SETATTRs which
   set size), the associated filehandle.  When stateids are
   used, the current filehandle server must be the one associated with that
   stateid.

   All stateids associated with a given client ID are associated with a
   common lease verify that represents the claim of those stateids access mode allows writing
   and return an NFS4ERR_OPENMODE error if it does not.  In the
   objects they represent case, of
   READ, the server may perform the corresponding check on the access
   mode, or it may choose to be maintained by allow READ on opens for WRITE only, to
   accommodate clients whose write implementation may unavoidably do
   reads (e.g., due to buffer cache constraints).  However, even if
   READs are allowed in these circumstances, the server.  See
   Section 9.5 server MUST still check
   for a discussion locks that conflict with the READ (e.g., another open specifying
   denial of READs).  Note that a server which does enforce the lease.

   Each access
   mode check on READs need not explicitly check for conflicting share
   reservations since the existence of OPEN for read access guarantees
   that no conflicting share reservation can exist.

   A stateid must be unique of all bits 1 (one) MAY allow READ operations to bypass
   locking checks at the server.  Many  However, WRITE operations take with a
   stateid with bits all 1 (one) MUST NOT bypass locking checks and are
   treated exactly the same as an argument but not if a clientid, so the server must stateid of all bits 0 were used.

   A lock may not be able
   to infer the client from the stateid.

9.1.3.1.  Stateid Types

   With the exception granted while a READ or WRITE operation using one
   of the special stateids (see Section 9.1.3.3), each
   stateid represents locking objects of one of a set of types defined
   by is being performed and the NFSv4 protocol.  Note that in all these cases, where we speak range of guarantee, it is understood there are situations such as a client
   restart, or the lock revocation, that allow
   request conflicts with the guarantee to be voided.

   o  Stateids may represent opens range of files.

      Each stateid in this case represents the OPEN state for a given
      client ID/open-owner/filehandle triple.  Such stateids are subject
      to change (with consequent incrementing of READ or WRITE operation.  For
   the stateid's seqid) in
      response to OPENs that result in upgrade and OPEN_DOWNGRADE
      operations.

   o  Stateids may represent sets purposes of byte-range locks.

      All locks held on this paragraph, a particular file by conflict occurs when a particular owner shared lock
   is requested and all
      gotten under the aegis of a particular open file are associated
      with a single stateid with the seqid WRITE operation is being incremented whenever
      LOCK performed, or an
   exclusive lock is requested and LOCKU operations affect that set of locks.

   o  Stateids may represent file delegations, which are recallable
      guarantees by the server to the client, that other clients will
      not reference, either a READ or will not modify a particular file, until the
      delegation WRITE operation is returned.
   being performed.  A stateid represents SETATTR that sets size is treated similarly to a single delegation held
   WRITE as discussed above.

9.1.6.  Sequencing of Lock Requests

   Locking is different than most NFS operations as it requires "at-
   most-one" semantics that are not provided by ONC RPC.  ONC RPC over a client for a
      particular filehandle.

9.1.3.2.  Stateid Structure

   Stateids are divided into two fields,
   reliable transport is not sufficient because a 96-bit "other" field
   identifying sequence of locking
   requests may span multiple TCP connections.  In the specific set face of locks
   retransmission or reordering, lock or unlock requests must have a
   well defined and consistent behavior.  To accomplish this, each lock
   request contains a 32-bit "seqid" sequence
   value.  Except in the case of special stateids (see Section 9.1.3.3), number that is a particular value of consecutively increasing
   integer.  Different state-owners have different sequences.  The
   server maintains the "other" field denotes last sequence number (L) received and the
   response that was returned.  The server SHOULD assign a set of locks seqid value
   of one for the
   same type (for example, byte-range locks, opens, or delegations), first request issued for any given state-owner.

   Note that for requests that contain a specific file or directory, and sharing the same ownership
   characteristics.  The seqid designates sequence number, for each
   state-owner, there should be no more than one outstanding request.

   If a specific instance of such request (r) with a
   set of locks, and previous sequence number (r < L) is incremented to indicate changes in such a set of
   locks, either by received,
   it is rejected with the addition or deletion return of locks from the set, error NFS4ERR_BAD_SEQID.  Given a
   change in
   properly-functioning client, the byte-range they apply to, or an upgrade or downgrade in response to (r) must have been
   received before the type of one or more locks.

   When such last request (L) was sent.  If a set duplicate of locks
   last request (r == L) is first created, received, the server SHOULD return stored response is returned.

   If a
   stateid request beyond the next sequence (r == L + 2) is received, it is
   rejected with seqid value of one.  On subsequent operations that
   modify the set return of locks, the server error NFS4ERR_BAD_SEQID.  Sequence
   history is required to increment the
   "seqid" field by one reinitialized whenever it returns a stateid for the same
   state-owner/file/type combination and there is some change in SETCLIENTID/SETCLIENTID_CONFIRM
   sequence changes the set
   of locks actually designated.  In this case, client verifier.

   Since the server will return a
   stateid sequence number is represented with an "other" field unsigned 32-bit
   integer, the same as previously used for that
   state-owner/file/type combination, arithmetic involved with an incremented "seqid" field.
   This pattern continues until the seqid sequence number is incremented past
   NFS4_UINT32_MAX, mod
   2^32.  Note that when the seqid wraps, it SHOULD bypass zero and use
   one (not zero) SHOULD be as the next seqid value.
   The purpose of the incrementing  For an example of the seqid modulo arithmetic
   involving sequence numbers see [RFC0793].

   It is to allow critical the server
   to communicate maintain the last response sent to the
   client to provide a more reliable cache of duplicate non-idempotent
   requests than that of the order traditional cache described in which operations that
   modified locking state associated with [Chet].  The
   traditional duplicate request cache uses a stateid have been processed.

   In making comparisons between seqids, both by least recently used
   algorithm for removing unneeded requests.  However, the last lock
   request and response on a given state-owner must be cached as long as
   the lock state exists on the server.

   The client in
   determining MUST monotonically increment the order of operations and by sequence number for the server
   CLOSE, LOCK, LOCKU, OPEN, OPEN_CONFIRM, and OPEN_DOWNGRADE
   operations.  This is true even in determining
   whether the NFS4ERR_OLD_STATEID is event that the previous
   operation that used the sequence number received an error.  The only
   exception to be returned, this rule is if the possibility previous operation received one of
   the seqid being swapped around past following errors: NFS4ERR_STALE_CLIENTID, NFS4ERR_STALE_STATEID,
   NFS4ERR_BAD_STATEID, NFS4ERR_BAD_SEQID, NFS4ERR_BADXDR,
   NFS4ERR_RESOURCE, NFS4ERR_NOFILEHANDLE, or NFS4ERR_MOVED.

9.1.7.  Recovery from Replayed Requests

   As described above, the NFS4_UINT32_MAX value needs
   to be taken into account.

9.1.3.3.  Special Stateids

   Stateid values whose "other" field sequence number is either all zeros or all ones
   are reserved.  They may not be assigned by per state-owner.  As long
   as the server but have
   special meanings defined by the protocol.  The particular meaning
   depends on whether maintains the "other" field is all zeros or all ones last sequence number received and follows
   the
   specific value methods described above, there are no risks of the "seqid" field. a Byzantine router
   re-sending old requests.  The following combinations of "other" and "seqid" are defined in
   NFSv4:

   o  When "other" and "seqid" are both zero, server need only maintain the stateid is treated (state-
   owner, sequence number) state as long as there are open files or
   closed files with locks outstanding.

   LOCK, LOCKU, OPEN, OPEN_DOWNGRADE, and CLOSE each contain a special anonymous stateid, which can be used in READ, WRITE, sequence
   number and
      SETATTR requests to indicate therefore the absence risk of any open state
      associated with the request.  When an anonymous stateid value is
      used, and an existing open denies the form replay of access requested,
      then access will be denied to these operations
   resulting in undesired effects is non-existent while the request.

   o  When "other" and "seqid" are both all ones, server
   maintains the stateid is state-owner state.

9.1.8.  Interactions of multiple sequence values

   Some Operations may have multiple sources of data for request
   sequence checking and retransmission determination.  Some Operations
   have multiple sequence values associated with multiple types of
   state-owners.  In addition, such Operations may also have a
      special READ bypass stateid. stateid
   with its own seqid value, that will be checked for validity.

   As noted above, there may be multiple sequence values to check.  The
   following rules should be followed by the server in processing these
   multiple sequence values within a single operation.

   o  When this a sequence value associated with a state-owner is used in WRITE or
      SETATTR, it unavailable
      for checking because the state-owner is treated like unknown to the anonymous value.  When used server, it
      takes no part in
      READ, the server MAY grant access, even if access would normally
      be denied to READ requests.

   If comparison.

   o  When any of the state-owner sequence values are invalid,
      NFS4ERR_BAD_SEQID is returned.  When a stateid value sequence is used which has all zero
      checked, NFS4ERR_BAD_STATEID, or all ones in the
   "other" field, NFS4ERR_OLD_STATEID is returned
      as appropriate, but does not match NFS4ERR_BAD_SEQID has priority.

   o  When any one of the cases above, the server
   MUST return the error NFS4ERR_BAD_STATEID.

   Special stateids, unlike other stateids, are not associated with
   individual client IDs or filehandles and can be used with all valid
   client IDs and filehandles.

9.1.3.4.  Stateid Lifetime and Validation

   Stateids must remain valid until either sequence values matches a client restart or previous request,
      for a server
   restart or until state-owner, it is treated as a retransmission and not re-
      executed.  When the client returns all type of the locks associated with
   the stateid by means of an operation such as CLOSE or DELEGRETURN.
   If does not match that
      originally used, NFS4ERR_BAD_SEQID is returned.  When the locks are lost due to revocation as long as server
      can determine that the client ID is
   valid, request differs from the stateid remains a valid designation original it may
      return NFS4ERR_BAD_SEQID.

   o  When multiple of that revoked state.
   Stateids associated with byte-range locks are an exception.  They
   remain valid even if a LOCKU frees all remaining locks, so long as the open file with which they are associated remains open.

   It should be noted that there are situations in which sequence values match previous operations,
      but the client's
   locks become invalid, without operations are not the client requesting they be same, NFS4ERR_BAD_SEQID is
      returned.
   These include lease expiration

   o  When there are no available sequence values available for
      comparison and a number of forms of lock
   revocation within the lease period.  It operation is important to note that in
   these situations, an OPEN, the stateid remains valid and server indicates to
      the client can use that an OPEN_CONFIRM is required, unless it
   to can
      conclusively determine the disposition of the associated lost locks.

   An "other" value must never be reused that confirmation is not required (e.g., by
      knowing that no open-owner state has ever been released for the
      current clientid).

9.1.9.  Releasing state-owner State

   When a different purpose (i.e.
   different filehandle, owner, particular state-owner no longer holds open or type of locks) within file locking
   state at the server, the context of
   a single client ID.  A server may retain choose to release the "other" value for sequence
   number state associated with the
   same purpose beyond state-owner.  The server may make
   this choice based on lease expiration, for the point where it reclamation of server
   memory, or other implementation specific details.  Note that when
   this is done, a retransmitted request, normally identified by a
   matching state-owner sequence may otherwise not be freed but if
   it does so, correctly recognized, so
   that the client will not receive the original response that it must maintain "seqid" continuity with previous values.

   One mechanism would
   have if the state-owner state was not released.

   If the server were able to be sure that may a given state-owner would
   never again be used to satisfy by a client, such an issue could not arise.  Even
   when the requirement that state-owner state is released and the
   server recognize client subsequently
   uses that state-owner, retransmitted requests will be detected as
   invalid and out-of-date stateids is for the server
   to divide the "other" field of request not executed, although the stateid into two fields.

   o  An index into client may have a table of locking-state structures.

   o  A generation number which
   recovery path that is incremented on each allocation of a
      table entry for a particular use.

   And then store in each table entry,

   o  The client ID with which more complicated than simply getting the stateid
   original response back transparently.

   In any event, the server is associated.

   o  The current generation number for able to safely release state-owner state
   (in the (at most one) valid stateid
      sharing this index value.

   o  The filehandle of sense that retransmitted requests will not be erroneously
   acted upon) when the file on which state-owner no currently being utilized by the locks
   client (i.e., there are taken.

   o  An indication of the type of stateid (open, byte-range lock, file
      delegation).

   o no open files associated with an open-owner
   and no lock stateids associated with a lock-owner).  The last "seqid" value returned corresponding server may
   choose to hold the current
      "other" value.

   o  An indication of state-owner state in order to simplify the current status
   recovery path, in the case in which retransmissions of currently
   active requests are received.  However, the locks associated with period it chooses to hold
   this stateid. state is implementation specific.

   In particular, whether these have been revoked and
      if so, for what reason.

   With this information, an incoming stateid can be validated and the
   appropriate error returned when necessary.  Special and non-special
   stateids are handled separately.  (See Section 9.1.3.3 for a
   discussion of special stateids.)

   When case that a stateid LOCK, LOCKU, OPEN_DOWNGRADE, or CLOSE is being tested, and
   retransmitted after the "other" field is all zeros or
   all ones, a check server has previously released the state-
   owner state, the server will find that the "other" and "seqid" fields match a defined
   combination for a special stateid is done state-owner has no files
   open and an error will be returned to the results determined
   as follows:

   o client.  If the "other" and "seqid" fields do not match a defined
      combination associated with state-owner
   does have a special stateid, file open, the stateid will not match and again an error
      NFS4ERR_BAD_STATEID is returned.

   o  If the combination is valid in general but
   is not appropriate returned to the context in which client.

9.1.10.  Use of Open Confirmation

   In the stateid is used (e.g., case that an all-zero
      stateid OPEN is used when an open stateid is required in a LOCK
      operation), the error NFS4ERR_BAD_STATEID is also returned.

   o  Otherwise, the check is completed retransmitted and the special stateid is
      accepted as valid.

   When a stateid open-owner is being tested, and
   used for the "other" field is neither all
   zeros first time or all ones, the following procedure could be used to validate
   an incoming stateid and return an appropriate error, when necessary,
   assuming that open-owner state has been previously
   released by the "other" field would be divided into a table index
   and an entry generation.

   o  If server, the table index field is outside use of the range OPEN_CONFIRM operation will
   prevent incorrect behavior.  When the server observes the use of the associated
      table, return NFS4ERR_BAD_STATEID.

   o  If
   open-owner for the selected table entry is first time, it will direct the client to perform
   the OPEN_CONFIRM for the corresponding OPEN.  This sequence
   establishes the use of a different generation than that
      specified in open-owner and associated sequence number.
   Since the incoming stateid, return NFS4ERR_BAD_STATEID.

   o  If OPEN_CONFIRM sequence connects a new open-owner on the selected table entry does not match
   server with an existing open-owner on a client, the current filehandle,
      return NFS4ERR_BAD_STATEID.

   o  If sequence number
   may have any value.  The OPEN_CONFIRM step assures the stateid represents revoked state or state lost as a result
      of lease expiration, then return NFS4ERR_EXPIRED,
      NFS4ERR_BAD_STATEID, or NFS4ERR_ADMIN_REVOKED, as appropriate.

   o  If server that
   the stateid type value received is not valid for the context correct one. (see Section 15.20 for further
   details.)

   There are a number of situations in which the
      stateid appears, return NFS4ERR_BAD_STATEID.  Note requirement to confirm
   an OPEN would pose difficulties for the client and server, in that a stateid
      may
   they would be valid prevented from acting in general, but a timely fashion on
   information received, because that information would be invalid provisional,
   subject to deletion upon non-confirmation.  Fortunately, these are
   situations in which the server can avoid the need for confirmation
   when responding to open requests.  The two constraints are:

   o  The server must not bestow a particular
      operation, as, delegation for example, when any open which would
      require confirmation.

   o  The server MUST NOT require confirmation on a stateid reclaim-type open
      (i.e., one specifying claim type CLAIM_PREVIOUS or
      CLAIM_DELEGATE_PREV).

   These constraints are related in that reclaim-type opens are the only
   ones in which doesn't represent
      byte-range locks is passed to the non-from_open case of LOCK or server may be required to
      LOCKU, or when send a stateid which does not represent delegation.  For
   CLAIM_NULL, sending the delegation is optional while for
   CLAIM_DELEGATE_CUR, no delegation is sent.

   Delegations being sent with an open is
      passed requiring confirmation are
   troublesome because recovering from non-confirmation adds undue
   complexity to CLOSE or OPEN_DOWNGRADE.  In such cases, the server MUST
      return NFS4ERR_BAD_STATEID.

   o  If protocol while requiring confirmation on reclaim-
   type opens poses difficulties in that the "seqid" field is not zero, and it is greater than inability to resolve the
      current sequence value corresponding
   status of the current "other" field,
      return NFS4ERR_BAD_STATEID.

   o  If reclaim until lease expiration may make it difficult to
   have timely determination of the "seqid" field set of locks being reclaimed (since
   the grace period may expire).

   Requiring open confirmation on reclaim-type opens is less than avoidable
   because of the current sequence value
      corresponding nature of the current "other" field, return
      NFS4ERR_OLD_STATEID.

   o  Otherwise, the stateid environments in which such opens are
   done.  For CLAIM_PREVIOUS opens, this is valid and the table entry immediately after server
   reboot, so there should contain
      any additional information about the type of stateid be no time for open-owners to be created,
   found to be unused, and
      information associated recycled.  For CLAIM_DELEGATE_PREV opens, we
   are dealing with that particular type either a client reboot situation or a network
   partition resulting in deletion of stateid, such
      as the associated set lease state (and returning
   NFS4ERR_EXPIRED).  A server which supports delegations can be sure
   that no open-owners for that client have been recycled since client
   initialization or deletion of locks, such as open-owner and lock-owner
      information, as well as information on the specific locks, such as
      open modes lease state and byte ranges.

9.1.3.5.  Stateid Use for I/O Operations

   Clients performing I/O operations need thus can ensure that
   confirmation will not be required.

9.2.  Lock Ranges

   The protocol allows a lock owner to select an appropriate
   stateid based on the locks (including opens and delegations) held by
   the client request a lock with a byte range
   and the various types then either upgrade or unlock a sub-range of state-owners sending the I/O
   requests.  SETATTR operations initial lock.
   It is expected that change the file size are treated
   like I/O operations in this regard.

   The following rules, applied in order of decreasing priority, govern
   the selection of the appropriate stateid.  In following these rules,
   the client will only consider locks of which it has actually received
   notification by be an appropriate operation response uncommon type of request.  In any
   case, servers or callback.

   o  If the client holds a delegation for the server file in question, the
      delegation stateid SHOULD systems may not be used.

   o  Otherwise, if the entity corresponding able to support sub-
   range lock semantics.  In the lock-owner (e.g., event that a
      process) sending the I/O has server receives a byte-range lock stateid locking
   request that represents a sub-range of current locking state for the
      associated open file, then the byte-range lock stateid for that
      lock-owner and open file SHOULD be used.

   o  If there is no byte-range
   lock stateid, then owner, the OPEN stateid for server is allowed to return the current open-owner, and error
   NFS4ERR_LOCK_RANGE to signify that OPEN stateid for it does not support sub-range lock
   operations.  Therefore, the open file in
      question SHOULD client should be used.

   o  Finally, prepared to receive this
   error and, if none of appropriate, report the above apply, then a special stateid SHOULD error to the requesting
   application.

   The client is discouraged from combining multiple independent locking
   ranges that happen to be used.

   Ignoring these rules may result in situations in which adjacent into a single request since the
   server
   does may not have information necessary support sub-range requests and for reasons related to properly process
   the request.
   For example, when mandatory byte-range locks are recovery of file locking state in effect, if the
   stateid does not indicate event of server failure.
   As discussed in the Section 9.6.2 below, the proper lock-owner, via a lock stateid,
   a request might be avoidably rejected.

   The server however should not try may employ
   certain optimizations during recovery that work effectively only when
   the client's behavior during lock recovery is similar to enforce these ordering rules and
   should use whatever information is available the client's
   locking behavior prior to properly process I/O
   requests.  In particular, when server failure.

9.3.  Upgrading and Downgrading Locks

   If a client has a delegation for write lock on a given
   file, record, it SHOULD take note can request an atomic
   downgrade of this fact in processing the lock to a read lock via the LOCK request, even
   if it is sent with a special stateid.

9.1.3.6.  Stateid Use for SETATTR Operations

   In by setting
   the case of SETATTR operations, a stateid is present.  In cases
   other than those that set type to READ_LT.  If the file size, server supports atomic downgrade, the
   request will succeed.  If not, it will return NFS4ERR_LOCK_NOTSUPP.
   The client may send either should be prepared to receive this error, and if
   appropriate, report the error to the requesting application.

   If a
   special stateid or, when client has a delegation is held for read lock on a record, it can request an atomic
   upgrade of the file in
   question, lock to a delegation stateid.  While write lock via the LOCK request by setting
   the type to WRITE_LT or WRITEW_LT.  If the server SHOULD validate does not support
   atomic upgrade, it will return NFS4ERR_LOCK_NOTSUPP.  If the
   stateid upgrade
   can be achieved without an existing conflict, the request will
   succeed.  Otherwise, the server will return either NFS4ERR_DENIED or
   NFS4ERR_DEADLOCK.  The error NFS4ERR_DEADLOCK is returned if the
   client issued the LOCK request with the type set to WRITEW_LT and may use the stateid
   server has detected a deadlock.  The client should be prepared to optimize
   receive such errors and if appropriate, report the determination as error to
   whether a delegation is held, it SHOULD note the presence
   requesting application.

9.4.  Blocking Locks

   Some clients require the support of blocking locks.  The NFS version
   4 protocol must not rely on a
   delegation even when a special stateid is sent, callback mechanism and MUST accept therefore is
   unable to notify a
   valid delegation stateid client when sent.

9.1.4.  lock-owner

   When requesting a lock, the client must present previously denied lock has been
   granted.  Clients have no choice but to the server the
   client ID and an identifier continually poll for the owner of the requested
   lock.
   These two fields  This presents a fairness problem.  Two new lock types are referred
   added, READW and WRITEW, and are used to indicate to as the lock-owner and server that
   the definition
   of those fields are:

   o  A client ID returned by the is requesting a blocking lock.  The server as part should maintain
   an ordered list of pending blocking locks.  When the client's use of conflicting lock
   is released, the SETCLIENTID operation.

   o  A variable length opaque array used server may wait the lease period for the first
   waiting client to uniquely define re-request the owner
      of a lock managed by lock.  After the client.

      This may be a thread id, process id, or other unique value.

   When lease period
   expires the server grants next waiting client request is allowed the lock, lock.  Clients
   are required to poll at an interval sufficiently small that it responds with is
   likely to acquire the lock in a unique stateid. timely manner.  The stateid server is used as not
   required to maintain a shorthand reference list of pending blocked locks as it is not
   used to provide correct operation but only to increase fairness.
   Because of the lock-owner, since
   the server will unordered nature of crash recovery, storing of lock
   state to stable storage would be maintaining the correspondence between them.

9.1.5.  Use required to guarantee ordered
   granting of blocking locks.

   Servers may also note the Stateid and Locking

   All READ, WRITE lock types and SETATTR operations contain delay returning denial of
   the request to allow extra time for a stateid.  For conflicting lock to be
   released, allowing a successful return.  In this way, clients can
   avoid the
   purposes burden of this section, SETATTR operations which change needlessly frequent polling for blocking locks.
   The server should take care in the size
   attribute length of a file are treated as if they are writing delay in the area
   between event the old
   client retransmits the request.

   If a server receives a blocking lock request, denies it, and new size (i.e., then
   later receives a nonblocking request for the range truncated or added to same lock, which is also
   denied, then it should remove the file by means lock in question from its list of
   pending blocking locks.  Clients should use such a nonblocking
   request to indicate to the SETATTR), even where SETATTR server that this is not
   explicitly mentioned in the text.  The stateid passed last time they
   intend to one of these
   operations must be one that represents an OPEN (e.g., via poll for the open-
   owner), a set of byte-range locks, or a delegation, or it lock, as may be a
   special stateid representing anonymous access or happen when the special bypass
   stateid.

   If process
   requesting the state-owner performs a READ or WRITE in lock is interrupted.  This is a situation in which courtesy to the
   server, to prevent it has established from unnecessarily waiting a lease period
   before granting other lock or share reservation requests.  However, clients are not
   required to perform this courtesy, and servers must not depend on the server (any
   OPEN constitutes a share reservation) the stateid (previously
   returned by the server)
   them doing so.  Also, clients must be used prepared for the possibility
   that this final locking request will be accepted.

9.5.  Lease Renewal

   The purpose of a lease is to indicate what locks,
   including both byte-range allow a server to remove stale locks and share reservations,
   that are held by a client that has crashed or is otherwise
   unreachable.  It is not a mechanism for cache consistency and lease
   renewals may not be denied if the state-owner.  If no state lease interval has not expired.

   The client can implicitly provide a positive indication that it is established by
   still active and that the associated state held at the server, for
   the client, either
   byte-range lock is still valid.  Any operation made with a valid clientid
   (DELEGPURGE, LOCK, LOCKT, OPEN, RELEASE_LOCKOWNER, or RENEW) or share reservation, a
   valid stateid (CLOSE, DELEGRETURN, LOCK, LOCKU, OPEN, OPEN_CONFIRM,
   OPEN_DOWNGRADE, READ, SETATTR, or WRITE) informs the server to renew
   all of the leases for that client (i.e., all bits 0 is
   used.  Regardless whether those sharing a given
   client ID).  In the latter case, the stateid must not be one of the
   special stateids consisting of all bits 0, 0 or a stateid
   returned by the server is used, all bits 1.

   Note that if there is a conflicting share
   reservation or mandatory byte-range lock held on the file, client had restarted or rebooted, the server
   MUST refuse to service client would
   not be making these requests without issuing the READ SETCLIENTID/
   SETCLIENTID_CONFIRM sequence.  The use of the SETCLIENTID/
   SETCLIENTID_CONFIRM sequence (one that changes the client verifier)
   notifies the server to drop the locking state associated with the
   client.  SETCLIENTID/SETCLIENTID_CONFIRM never renews a lease.

   If the server has rebooted, the stateids (NFS4ERR_STALE_STATEID
   error) or WRITE operation.

   Share reservations the client ID (NFS4ERR_STALE_CLIENTID error) will not be
   valid hence preventing spurious renewals.

   This approach allows for low overhead lease renewal which scales
   well.  In the typical case no extra RPC calls are established by OPEN operations required for lease
   renewal and in the worst case one RPC is required every lease period
   (i.e., a RENEW operation).  The number of locks held by their
   nature are mandatory the client is
   not a factor since all state for the client is involved with the
   lease renewal action.

   Since all operations that create a new lease also renew existing
   leases, the server must maintain a common lease expiration time for
   all valid leases for a given client.  This lease time can then be
   easily updated upon implicit lease renewal actions.

9.6.  Crash Recovery

   The important requirement in crash recovery is that both the client
   and the server know when the OPEN denies READ other has failed.  Additionally, it is
   required that a client sees a consistent view of data across server
   restarts or reboots.  All READ and WRITE
   operations, that denial results in such operations being rejected
   with error NFS4ERR_LOCKED.  Byte-range locks that may be implemented by have
   been queued within the server as either mandatory or advisory, client or network buffers must wait until the choice of
   mandatory or advisory behavior may be determined by
   client has successfully recovered the server on locks protecting the
   basis of READ and
   WRITE operations.

9.6.1.  Client Failure and Recovery

   In the file being accessed (for example, some UNIX-based
   servers support event that a "mandatory lock bit" on client fails, the mode attribute such
   that if set, byte-range server may recover the client's
   locks are required on when the file before I/O is
   possible).  When byte-range associated leases have expired.  Conflicting locks are advisory, they
   from another client may only prevent be granted after this lease expiration.
   If the
   granting client is able to restart or reinitialize within the lease
   period the client may be forced to wait the remainder of conflicting the lease
   period before obtaining new locks.

   To minimize client delay upon restart, open and lock requests and have no effect on READs or
   WRITEs.  Mandatory byte-range locks, however, prevent conflicting I/O
   operations.  When they are attempted, they are rejected
   associated with
   NFS4ERR_LOCKED.  When an instance of the client gets NFS4ERR_LOCKED on a file it
   knows it has the proper share reservation for, it will need to issue by a LOCK request on the region client supplied
   verifier.  This verifier is part of the file that includes the region initial SETCLIENTID call made
   by the
   I/O was to be performed on, with an appropriate locktype (i.e.,
   READ*_LT for client.  The server returns a READ operation, WRITE*_LT for client ID as a WRITE operation).

   With NFSv3, there was no notion result of a stateid so there was no way to
   tell if the application process
   SETCLIENTID operation.  The client then confirms the use of the
   client sending the READ or
   WRITE operation had also acquired the appropriate byte-range lock on ID with SETCLIENTID_CONFIRM.  The client ID in combination
   with an opaque owner field is then used by the file.  Thus there was no way client to implement mandatory locking.

   With identify the stateid construct, this barrier has been removed.

   Note that
   open owner for UNIX environments OPEN.  This chain of associations is then used to
   identify all locks for a particular client.

   Since the verifier will be changed by the client upon each
   initialization, the server can compare a new verifier to the verifier
   associated with currently held locks and determine that support mandatory file locking, they do not
   match.  This signifies the distinction between advisory client's new instantiation and mandatory subsequent
   loss of locking state.  As a result, the server is subtle.  In
   fact, advisory and mandatory byte-range free to release
   all locks held which are exactly associated with the old client ID which was
   derived from the old verifier.

   Note that the verifier must have the same in
   so far as uniqueness properties of
   the APIs verifier for the COMMIT operation.

9.6.2.  Server Failure and requirements on implementation. Recovery

   If the
   mandatory lock attribute is set on server loses locking state (usually as a result of a restart
   or reboot), it must allow clients time to discover this fact and re-
   establish the file, lost locking state.  The client must be able to re-
   establish the locking state without having the server checks to see
   if deny valid
   requests because the lock-owner server has an appropriate shared (read) or exclusive
   (write) byte-range lock on the region it wishes granted conflicting access to read or write to.
   If another
   client.  Likewise, if there is no appropriate lock, the possibility that clients have not
   yet re-established their locking state for a file, the server checks if there must
   disallow READ and WRITE operations for that file.  The duration of
   this recovery period is a
   conflicting lock (which can be done by attempting equal to acquire the
   conflicting lock on duration of the behalf lease period.

   A client can determine that server failure (and thus loss of locking
   state) has occurred, when it receives one of two errors.  The
   NFS4ERR_STALE_STATEID error indicates a stateid invalidated by a
   reboot or restart.  The NFS4ERR_STALE_CLIENTID error indicates a
   client ID invalidated by reboot or restart.  When either of these are
   received, the lock-owner, client must establish a new client ID (see
   Section 9.1.1) and if successful,
   release re-establish the lock after locking state as discussed below.

   The period of special handling of locking and READs and WRITEs, equal
   in duration to the READ or WRITE lease period, is done), and if there is, referred to as the server returns NFS4ERR_LOCKED.

   For Windows environments, there are no advisory byte-range locks, so "grace
   period".  During the server always checks for byte-range grace period, clients recover locks during I/O requests.

   Thus, and the NFSv4
   associated state by reclaim-type locking requests (i.e., LOCK operation does not need
   requests with reclaim set to distinguish between
   advisory true and mandatory byte-range locks.  It is the NFS version 4
   server's processing OPEN operations with a claim
   type of either CLAIM_PREVIOUS or CLAIM_DELEGATE_PREV).  During the
   grace period, the server must reject READ and WRITE operations that introduces
   the distinction.

   Every stateid and
   non-reclaim locking requests (i.e., other than the special stateid values noted in this
   section, whether returned by an OPEN-type operation (i.e., OPEN,
   OPEN_DOWNGRADE), or by a LOCK-type operation (i.e., LOCK or LOCKU),
   defines an access mode for the file (i.e., READ, WRITE, or READ-
   WRITE) as established by the original OPEN which began the stateid
   sequence, and as modified by subsequent OPENs and OPEN_DOWNGRADEs
   within that stateid sequence.  When a READ, WRITE, or SETATTR which
   specifies the size attribute, is done, the operation is subject to
   checking against the access mode to verify that the operation is
   appropriate given the OPEN operations)
   with which the operation is associated.

   In the case an error of WRITE-type operations (i.e., WRITEs and SETATTRs which
   set size), NFS4ERR_GRACE.

   If the server must verify can reliably determine that granting a non-reclaim
   request will not conflict with reclamation of locks by other clients,
   the access mode allows writing
   and return an NFS4ERR_OPENMODE NFS4ERR_GRACE error if it does not.  In not have to be returned and the case, of
   READ, non-
   reclaim client request can be serviced.  For the server may perform the corresponding check on the access
   mode, or it may choose to allow be able to
   service READ on opens for and WRITE only, to
   accommodate clients whose write implementation may unavoidably do
   reads (e.g., due to buffer cache constraints).  However, even if
   READs are allowed in these circumstances, operations during the server MUST still check
   for locks grace period, it must
   again be able to guarantee that no possible conflict with could arise
   between an impending reclaim locking request and the READ (e.g., another open specify
   denial of READs).  Note or WRITE
   operation.  If the server is unable to offer that guarantee, the
   NFS4ERR_GRACE error must be returned to the client.

   For a server which does enforce to provide simple, valid handling during the access
   mode check on READs need not explicitly check for conflicting share
   reservations since grace
   period, the existence of OPEN for read access guarantees
   that no conflicting share reservation can exist.

   A stateid of easiest method is to simply reject all bits 1 (one) MAY allow non-reclaim
   locking requests and READ and WRITE operations to bypass
   locking checks at by returning the server.
   NFS4ERR_GRACE error.  However, WRITE operations with a
   stateid with bits all 1 (one) MUST NOT bypass locking checks and are
   treated exactly server may keep information about
   granted locks in stable storage.  With this information, the same as server
   could determine if a stateid of all bits 0 were used.

   A regular lock may not be granted while a or READ or WRITE operation using one can be
   safely processed.

   For example, if a count of the special stateids locks on a given file is being performed and available in
   stable storage, the range of server can track reclaimed locks for the lock
   request conflicts with file and
   when all reclaims have been processed, non-reclaim locking requests
   may be processed.  This way the range of server can ensure that non-reclaim
   locking requests will not conflict with potential reclaim requests.
   With respect to I/O requests, if the READ server is able to determine that
   there are no outstanding reclaim requests for a file by information
   from stable storage or WRITE operation.  For another similar mechanism, the purposes processing of this paragraph, a conflict occurs when
   I/O requests could proceed normally for the file.

   To reiterate, for a shared server that allows non-reclaim lock
   is requested and a WRITE I/O
   requests to be processed during the grace period, it MUST determine
   that no lock subsequently reclaimed will be rejected and that no lock
   subsequently reclaimed would have prevented any I/O operation is being performed, or an
   exclusive
   processed during the grace period.

   Clients should be prepared for the return of NFS4ERR_GRACE errors for
   non-reclaim lock is requested and either a READ or I/O requests.  In this case the client should
   employ a WRITE operation is
   being performed. retry mechanism for the request.  A SETATTR that sets size is treated similarly delay (on the order of
   several seconds) between retries should be used to a
   WRITE as discussed above.

9.1.6.  Sequencing avoid overwhelming
   the server.  Further discussion of Lock Requests

   Locking the general issue is different than most NFS operations as it requires "at-
   most-one" semantics included in
   [Floyd].  The client must account for the server that are not provided by ONCRPC.  ONCRPC over a
   reliable transport is not sufficient because a sequence of able to
   perform I/O and non-reclaim locking requests may span multiple TCP connections.  In within the face of
   retransmission or reordering, lock or unlock requests must have a grace period
   as well defined and consistent behavior.  To accomplish this, each lock
   request contains a sequence number as those that is a consecutively increasing
   integer.  Different state-owners have different sequences.  The
   server maintains cannot do so.

   A reclaim-type locking request outside the last sequence number (L) received and server's grace period can
   only succeed if the
   response server can guarantee that was returned.  The no conflicting lock or
   I/O request has been granted since reboot or restart.

   A server is free to assign any may, upon restart, establish a new value for the first request issued lease
   period.  Therefore, clients should, once a new client ID is
   established, refetch the lease_time attribute and use it as the basis
   for any given state-owner.

   Note that lease renewal for requests the lease associated with that contain server.
   However, the server must establish, for this restart event, a sequence number, grace
   period at least as long as the lease period for each
   state-owner, there should the previous server
   instantiation.  This allows the client state obtained during the
   previous server instance to be no more than one outstanding request. reliably re-established.

9.6.3.  Network Partitions and Recovery

   If a request (r) with a previous sequence number (r < L) is received,
   it is rejected with the return duration of error NFS4ERR_BAD_SEQID.  Given a
   properly-functioning client, network partition is greater than the response to (r) must lease
   period provided by the server, the server will have been not received before the last request (L) was sent.  If a duplicate of
   last request (r == L) is received,
   lease renewal from the stored response is returned. client.  If a request beyond this occurs, the next sequence (r == L + 2) is received, it is
   rejected with server may cancel
   the return of error NFS4ERR_BAD_SEQID.  Sequence
   history is reinitialized whenever lease and free all locks held for the SETCLIENTID/SETCLIENTID_CONFIRM
   sequence changes client.  As a result, all
   stateids held by the client verifier.

   Since will become invalid or stale.  Once the sequence number
   client is represented able to reach the server after such a network partition,
   all I/O submitted by the client with an unsigned 32-bit
   integer, the arithmetic involved now invalid stateids will
   fail with the sequence number is mod
   2^32.  For an example of modulo arithmetic involving sequence numbers
   see [RFC0793].

   It is critical the server maintain returning the last response sent to error NFS4ERR_EXPIRED.  Once this
   error is received, the client to provide a more reliable cache of duplicate non-idempotent
   requests than will suitably notify the application
   that of held the traditional cache described in [Chet].  The
   traditional duplicate request cache uses lock.

9.6.3.1.  Courtesy Locks

   As a least recently used
   algorithm for removing unneeded requests.  However, courtesy to the last lock
   request and response on a given state-owner must be cached as long client or as an optimization, the lock state exists server may
   continue to hold locks, including delegations, on the server.

   The behalf of a client MUST monotonically increment the sequence number
   for which recent communication has extended beyond the
   CLOSE, LOCK, LOCKU, OPEN, OPEN_CONFIRM, and OPEN_DOWNGRADE
   operations.  This is true even in lease period,
   delaying the event that cancellation of the previous
   operation that used lease.  If the sequence number received an error.  The only
   exception to this rule is server receives a
   lock or I/O request that conflicts with one of these courtesy locks
   or if it runs out of resources, the previous operation received one server MAY cause lease
   cancellation to occur at that time and henceforth return
   NFS4ERR_EXPIRED when any of the following errors: NFS4ERR_STALE_CLIENTID, NFS4ERR_STALE_STATEID,
   NFS4ERR_BAD_STATEID, NFS4ERR_BAD_SEQID, NFS4ERR_BADXDR,
   NFS4ERR_RESOURCE, NFS4ERR_NOFILEHANDLE, or NFS4ERR_MOVED.

9.1.7.  Recovery from Replayed Requests

   As described above, stateids associated with the sequence number freed
   locks is per state-owner.  As long
   as used.  If lease cancellation has not occurred and the server maintains
   receives a lock or I/O request that conflicts with one of the last sequence number received and follows
   courtesy locks, the methods described above, there requirements are no risks as follows:

   o  In the case of a Byzantine router
   re-sending old requests.  The server need only maintain the (state-
   owner, sequence number) state as long as there are open files or
   closed files with locks outstanding.

   LOCK, LOCKU, OPEN, OPEN_DOWNGRADE, and CLOSE each contain courtesy lock which is not a sequence
   number delegation, it MUST
      free the courtesy lock and therefore grant the risk of new request.

   o  In the replay case of these operations
   resulting in undesired effects lock or IO request which conflicts with a
      delegation which is non-existent while being held as courtesy lock, the server
   maintains the state-owner state.

9.1.8.  Interactions of multiple sequence values

   Some Operations may have multiple sources MAY
      delay resolution of data for request
   sequence checking but MUST NOT reject the request and retransmission determination.  Some Operations
   have multiple sequence values associated with multiple types of
   state-owners.
      MUST free the delegation and grant the new request eventually.

   o  In addition, such Operations may also have a stateid
   with its own seqid value, that will be checked for validity.

   As noted above, there may be multiple sequence values to check.  The
   following rules should be followed by the server in processing these
   multiple sequence values within case of a single operation.

   o  When requests for a sequence value associated delegation which conflicts with a state-owner
      delegation which is unavailable
      for checking because being held as courtesy lock, the state-owner is unknown to server MAY
      grant the server, new request or not as it
      takes no part in the comparison.

   o  When any of chooses, but if it grants the state-owner sequence values are invalid,
      NFS4ERR_BAD_SEQID is returned.  When a stateid sequence is
      checked, NFS4ERR_BAD_STATEID, or NFS4ERR_OLD_STATEID is returned
      as appropriate, but NFS4ERR_BAD_SEQID has priority.

   o  When any one of the sequence values matches a previous
      conflicting request,
      for a state-owner, it is treated as a retransmission and not re-
      executed.  When the type of delegation haled as courtesy lock MUST be
      freed.

   If the operation server does not match that
      originally used, NFS4ERR_BAD_SEQID is returned.  When the server
      can determine that the request differs from the original it may
      return NFS4ERR_BAD_SEQID.

   o  When multiple of the sequence values match previous operations,
      but reboot or cancel the operations are not lease before the same, NFS4ERR_BAD_SEQID network
   partition is
      returned.

   o  When there are no available sequence values available for
      comparison and healed, when the operation is an OPEN, original client tries to access a
   courtesy lock which was freed, the server indicates SHOULD send back a
   NFS4ERR_BAD_STATEID to the client that an OPEN_CONFIRM is required, unless it can
      conclusively determine that confirmation is not required (e.g., by
      knowing that no open-owner state has ever been released for client.  If the
      current clientid).

9.1.9.  Releasing state-owner State

   When client tries to access a particular state-owner no longer holds open or file locking
   state at the server,
   courtesy lock which was not freed, then the server may choose to release the sequence
   number state associated with SHOULD mark all of
   the state-owner.  The server may make
   this choice based on courtesy locks as implicitly being renewed.

9.6.3.2.  Lease Cancellation

   As a result of lease expiration, for the reclamation of server
   memory, or other implementation specific details.  Note that when
   this is done, a retransmitted request, normally identified by a
   matching state-owner sequence leases may not be correctly recognized, so
   that the client will not receive cancelled, either
   immediately upon expiration or subsequently, depending on the original response that it would
   have if
   occurrence of a conflicting lock or extension of the state-owner state was not released.

   If period of
   partition beyond what the server were able to be sure that a given state-owner would
   never again be used by will tolerate.

   When a client, such an issue could not arise.  Even
   when the state-owner lease is cancelled, all locking state associated with it is released
   freed and use of any the client subsequently
   uses that state-owner, retransmitted requests associated stateids will be detected as
   invalid and the request not executed, although result in
   NFS4ERR_EXPIRED being returned.  Similarly, use of the associated
   clientid will result in NFS4ERR_EXPIRED being returned.

   The client may have should recover from this situation by using SETCLIENTID
   followed by SETCLIENTID_CONFIRM, in order to establish a
   recovery path that is more complicated than simply getting the
   original response back transparently.

   In any event, the server new
   clientid.  Once a lock is able to safely release state-owner state
   (in the sense that retransmitted requests obtained using this clientid, a lease will not
   be erroneously
   acted upon) when the state-owner established.

9.6.3.3.  Client's Reaction to a Freed Lock

   There is no currently being utilized by the way for a client (i.e., there are no open files associated with an open-owner
   and no lock stateids associated with to predetermine how a lock-owner).  The given server may
   choose is
   going to hold behave during a network partition.  When the state-owner state in order to simplify the
   recovery path, in partition
   heals, either the case in which retransmissions client still has all of currently
   active requests are received.  However, the period its locks, it chooses to hold
   this state is implementation specific.

   In the case that a LOCK, LOCKU, OPEN_DOWNGRADE, or CLOSE is
   retransmitted after the server has previously released the state-
   owner state, the server will find that the state-owner some of
   its locks, or it has no files
   open and an error none of them.  The client will be returned able to
   examine the client.  If the state-owner
   does have a file open, the stateid will not match and again an various error
   is returned return values to the client.

9.1.10.  Use determine its response.

   NFS4ERR_EXPIRED:

      All locks have been freed as a result of Open Confirmation

   In the case that an OPEN is retransmitted and the open-owner is being
   used for a lease cancellation
      which occurred during the first time partition.  The client should use a
      SETCLIENTID to recover.

   NFS4ERR_ADMIN_REVOKED:

      The current lock has been revoked before, during, or after the open-owner state
      partition.  The client SHOULD handle this error as it normally
      would.

   NFS4ERR_BAD_STATEID:

      The current lock has been previously
   released by the server, the use of revoked/released during the OPEN_CONFIRM operation will
   prevent incorrect behavior.  When partition
      and the server observes the use of the
   open-owner for the first time, it will direct the did not reboot.  Other locks MAY still be renewed.
      The client to perform
   the OPEN_CONFIRM for the corresponding OPEN.  This sequence
   establishes the use of need not do a open-owner SETCLIENTID and associated sequence number.
   Since the OPEN_CONFIRM sequence connects instead SHOULD probe via
      a new open-owner on RENEW call.

   NFS4ERR_RECLAIM_BAD:

      The current lock has been revoked during the partition and the
      server with an existing open-owner rebooted.  The server might have no information on a client, the sequence number
      other locks.  They may have any value. still be renewable.

   NFS4ERR_NO_GRACE:

      The OPEN_CONFIRM step assures client's locks have been revoked during the partition and the
      server that rebooted.  None of the value received client's locks will be renewable.

   NFS4ERR_OLD_STATEID:

      The server has not rebooted.  The client SHOULD handle this error
      as it normally would.

9.6.3.4.  Edge Conditions

   When a network partition is the correct one. (see Section 15.20 for further
   details.)

   There are combined with a number of situations in which server reboot, then both
   the requirement server and client have responsibilities to confirm
   an OPEN would pose difficulties for ensure that the client and server, in that
   they would be prevented from acting in
   does not reclaim a timely fashion on
   information received, because that information would be provisional,
   subject to deletion upon non-confirmation.  Fortunately, these are
   situations in lock which the server can avoid the need for confirmation
   when responding it should no longer be able to open requests.  The two constraints access.
   Briefly those are:

   o  The server must not bestow a delegation for  Client's responsibility: A client MUST NOT attempt to reclaim any open
      locks which would
      require confirmation. it did not hold at the end of its most recent
      successfully established client lease.

   o  The  Server's responsibility: A server MUST NOT require confirmation on allow a reclaim-type open
      (i.e., one specifying claim type CLAIM_PREVIOUS or
      CLAIM_DELEGATE_PREV).

   These constraints are related in client to
      reclaim a lock unless it knows that reclaim-type opens are the only
   ones it could not have since
      granted a conflicting lock.  However, in which the deciding whether a
      conflicting lock could have been granted, it is permitted to
      assume its clients are responsible, as above.

   A server may be required to send consider a delegation.  For
   CLAIM_NULL, sending the delegation is optional while for
   CLAIM_DELEGATE_CUR, no delegation is sent.

   Delegations being sent with client's lease "successfully established"
   once it has received an open requiring confirmation are
   troublesome because recovering operation from non-confirmation adds undue
   complexity to the protocol while requiring confirmation on reclaim-
   type opens poses difficulties in that the inability client.

   The above are directed to resolve the
   status of the reclaim until lease expiration may make it difficult CLAIM_PREVIOUS reclaims and not to
   have timely determination of the set of locks being reclaimed (since
   the grace period may expire).

   Requiring open confirmation on reclaim-type opens is avoidable
   because of the nature of the environments in
   CLAIM_DELEGATE_PREV reclaims, which such opens are
   done.  For CLAIM_PREVIOUS opens, this is immediately after generally do not involve a server
   reboot, so there should be no time for open-owners to be created,
   found
   reboot.  However, when a server persistently stores delegation
   information to be unused, and recycled.  For support CLAIM_DELEGATE_PREV opens, we
   are dealing with either a client reboot situation or across a network
   partition resulting period in deletion of lease state (and returning
   NFS4ERR_EXPIRED).  A server which supports delegations can be sure
   that no open-owners for that client have been recycled since
   both client
   initialization or deletion of lease state and thus server are down at the same time, similar strictures
   apply.

   The next sections give examples showing what can ensure go wrong if these
   responsibilities are neglected, and provides examples of server
   implementation strategies that
   confirmation will not be required.

9.2.  Lock Ranges could meet a server's
   responsibilities.

9.6.3.4.1.  First Server Edge Condition

   The protocol allows first edge condition has the following scenario:

   1.  Client A acquires a lock owner lock.

   2.  Client A and server experience mutual network partition, such
       that client A is unable to request renew its lease.

   3.  Client A's lease expires, so server releases lock.

   4.  Client B acquires a lock that would have conflicted with a byte range
   and then either upgrade or unlock a sub-range of the initial lock.
   It is expected that this will be an uncommon type of request.  In any
   case, servers or server filesystems may not be able to support sub-
   range lock semantics.  In
       Client A.

   5.  Client B releases the event that a lock

   6.  Server reboots

   7.  Network partition between client A and server receives heals.

   8.  Client A issues a locking
   request that represents RENEW operation, and gets back a sub-range of current locking state for the
       NFS4ERR_STALE_CLIENTID.

   9.  Client A reclaims its lock owner, within the server is allowed to return server's grace period.

   Thus, at the error
   NFS4ERR_LOCK_RANGE to signify that it does not support sub-range lock
   operations.  Therefore, final step, the server has erroneously granted client should be prepared to receive this
   error and, if appropriate, report
   A's lock reclaim.  If client B modified the error to object the requesting
   application.

   The lock was
   protecting, client is discouraged from combining multiple independent locking
   ranges that happen to be adjacent into a single request since the
   server may not support sub-range requests A will experience object corruption.

9.6.3.4.2.  Second Server Edge Condition

   The second known edge condition follows:

   1.   Client A acquires a lock.

   2.   Server reboots.

   3.   Client A and for reasons related to
   the recovery of file locking state in the event of server failure.
   As discussed in the Section 9.6.2 below, the server may employ
   certain optimizations during recovery experience mutual network partition, such
        that work effectively only when
   the client's behavior during lock recovery client A is similar unable to reclaim its lock within the client's
   locking behavior prior to server failure.

9.3.  Upgrading and Downgrading Locks

   If a client grace
        period.

   4.   Server's reclaim grace period ends.  Client A has a write lock no locks
        recorded on server.

   5.   Client B acquires a record, it can request an atomic
   downgrade lock that would have conflicted with that of
        Client A.

   6.   Client B releases the lock to lock.

   7.   Server reboots a read second time.

   8.   Network partition between client A and server heals.

   9.   Client A issues a RENEW operation, and gets back a
        NFS4ERR_STALE_CLIENTID.

   10.  Client A reclaims its lock via within the LOCK request, by setting server's grace period.

   As with the type to READ_LT.  If first edge condition, the server supports atomic downgrade, final step of the
   request will succeed.  If not, it will return NFS4ERR_LOCK_NOTSUPP.
   The client should be prepared to receive this error, and if
   appropriate, report scenario of
   the error to second edge condition has the requesting application.

   If a server erroneously granting client has a read
   A's lock on a record, it can request an atomic
   upgrade reclaim.

9.6.3.4.3.  Handling Server Edge Conditions

   In both of the lock to above examples, the client attempts reclaim of a write lock via
   that it held at the LOCK request end of its most recent successfully established
   lease; thus, it has fulfilled its responsibility.

   The server, however, has failed, by setting granting a reclaim, despite
   having granted a conflicting lock since the type to WRITE_LT or WRITEW_LT.  If reclaimed lock was last
   held.

   Solving these edge conditions requires that the server does not support
   atomic upgrade, either assume
   after it will reboots that edge condition occurs, and thus return NFS4ERR_LOCK_NOTSUPP.  If
   NFS4ERR_NO_GRACE for all reclaim attempts, or that the upgrade
   can be achieved without an existing conflict, server record
   some information in stable storage.  The amount of information the request will
   succeed.  Otherwise,
   server records in stable storage is in inverse proportion to how
   harsh the server will return either NFS4ERR_DENIED or
   NFS4ERR_DEADLOCK. wants to be whenever the edge conditions occur.  The error NFS4ERR_DEADLOCK
   server that is returned if the
   client issued completely tolerant of all edge conditions will record
   in stable storage every lock that is acquired, removing the LOCK request with lock
   record from stable storage only when the type set to WRITEW_LT and lock is unlocked by the
   server has detected a deadlock.  The
   client should be prepared to
   receive such errors and if appropriate, report the error to lock's owner advances the
   requesting application.

9.4.  Blocking Locks

   Some clients require sequence number such that
   the support of blocking locks.  The NFS version
   4 protocol must lock release is not rely on the last stateful event for the owner's
   sequence.  For the two aforementioned edge conditions, the harshest a callback mechanism and therefore is
   unable to notify
   server can be, and still support a client when grace period for reclaims,
   requires that the server record in stable storage information some
   minimal information.  For example, a previously denied lock has been
   granted.  Clients have no choice but to continually poll server implementation could, for
   each client, save in stable storage a record containing:

   o  the
   lock.  This presents client's id string

   o  a fairness problem.  Two new lock types are
   added, READW and WRITEW, and are used to indicate boolean that indicates if the client's lease expired or if there
      was administrative intervention (see Section 9.8) to revoke a
      byte-range lock, share reservation, or delegation

   o  a timestamp that is updated the first time after a server that boot or
      reboot the client is requesting a blocking lock. acquires byte-range locking, share reservation,
      or delegation state on the server.  The timestamp need not be
      updated on subsequent lock requests until the server reboots.

   The server should maintain
   an ordered list of pending blocking locks.  When implementation would also record in the conflicting lock
   is released, stable storage the
   timestamps from the two most recent server may wait reboots.

   Assuming the lease period above record keeping, for the first
   waiting client to re-request edge condition,
   after the lock.  After server reboots, the record that client A's lease period
   expires expired
   means that another client could have acquired a conflicting record
   lock, share reservation, or delegation.  Hence the next waiting server must reject
   a reclaim from client request is allowed A with the lock.  Clients
   are required to poll at error NFS4ERR_NO_GRACE or
   NFS4ERR_RECLAIM_BAD.

   For the second edge condition, after the server reboots for a second
   time, the record that the client had an interval sufficiently small unexpired record lock, share
   reservation, or delegation established before the server's previous
   incarnation means that it is
   likely to acquire the lock in a timely manner.  The server is not
   required to maintain must reject a list reclaim from client A
   with the error NFS4ERR_NO_GRACE or NFS4ERR_RECLAIM_BAD.

   Regardless of pending blocked locks as it is used to
   increase fairness the level and not correct operation.  Because of approach to record keeping, the
   unordered nature of crash recovery, storing server
   MUST implement one of lock state to stable
   storage would be required the following strategies (which apply to guarantee ordered granting
   reclaims of blocking
   locks.

   Servers may also note the lock types share reservations, byte-range locks, and delay returning denial of delegations):

   1.  Reject all reclaims with NFS4ERR_NO_GRACE.  This is super harsh,
       but necessary if the request server does not want to allow extra time for a conflicting record lock state in
       stable storage.

   2.  Record sufficient state in stable storage to be
   released, allowing a successful return. meet its
       responsibilities.  In this way, clients can
   avoid doubt, the burden of needlessly frequent polling for blocking locks.
   The server should take care in err on the length side of delay in
       being harsh.

       In the event the
   client retransmits the request.

   If that, after a server receives a blocking lock request, denies it, and then
   later receives a nonblocking request for the same lock, which is also
   denied, then it should remove the lock in question from its list of
   pending blocking locks.  Clients should use such a nonblocking
   request to indicate to reboot, the server determines
       that this there is unrecoverable damage or corruption to the last time they
   intend to poll for the lock, as may happen when the process
   requesting
       stable storage, then for all clients and/or locks affected, the lock is interrupted.  This is a courtesy to
       server MUST return NFS4ERR_NO_GRACE.

9.6.3.4.4.  Client Edge Condition

   A third edge condition effects the
   server, to prevent it from unnecessarily waiting a lease period
   before granting other lock requests.  However, clients are not
   required to perform this courtesy, client and servers must not depend on
   them doing so.  Also, clients must be prepared for the possibility
   that this final locking request will be accepted.

9.5.  Lease Renewal

   The purpose of a lease is to allow a server.  If the
   server to remove stale reboots in the middle of the client reclaiming some locks
   that are held by and
   then a client that has crashed or is otherwise
   unreachable.  It network partition is not a mechanism for cache consistency and lease
   renewals may not established, the client might be denied if in the lease interval has
   situation of having reclaimed some, but not expired.

   The client can implicitly provide a positive indication all locks.  In that it is
   still active and case,
   a conservative client would assume that the associated state held at the server, for
   the client, is still valid.  Any operation made with non-reclaimed locks were
   revoked.

   The third known edge condition follows:

   1.   Client A acquires a valid clientid
   (DELEGPURGE, LOCK, LOCKT, OPEN, RELEASE_LOCKOWNER, or RENEW) or lock 1.

   2.   Client A acquires a
   valid stateid (CLOSE, DELEGRETURN, LOCK, LOCKU, OPEN, OPEN_CONFIRM,
   OPEN_DOWNGRADE, READ, SETATTR, or WRITE) informs lock 2.

   3.   Server reboots.

   4.   Client A issues a RENEW operation, and gets back a
        NFS4ERR_STALE_CLIENTID.

   5.   Client A reclaims its lock 1 within the server's grace period.

   6.   Client A and server experience mutual network partition, such
        that client A is unable to renew
   all of reclaim its remaining locks within
        the leases for grace period.

   7.   Server's reclaim grace period ends.

   8.   Client B acquires a lock that client (i.e., all those sharing would have conflicted with Client
        A's lock 2.

   9.   Client B releases the lock.

   10.  Server reboots a given second time.

   11.  Network partition between client ID).  In the latter case, A and server heals.

   12.  Client A issues a RENEW operation, and gets back a
        NFS4ERR_STALE_CLIENTID.

   13.  Client A reclaims both lock 1 and lock 2 within the stateid must not be one of server's
        grace period.

   At the
   special stateids consisting of all bits 0 or all bits 1.

   Note that if last step, the client reclaims lock 2 as if it had restarted or rebooted, held that
   lock continuously, when in fact a conflicting lock was granted to
   client B.

   This occurs because the client would failed its responsibility, by
   attempting to reclaim lock 2 even though it had not be making these requests without issuing held that lock at
   the SETCLIENTID/
   SETCLIENTID_CONFIRM sequence.  The use end of the SETCLIENTID/
   SETCLIENTID_CONFIRM sequence (one lease that changes was established by the client verifier)
   notifies SETCLIENTID after
   the first server to drop the locking state associated with the
   client.  SETCLIENTID/SETCLIENTID_CONFIRM never renews reboot.  (The client did hold lock 2 on a previous
   lease.

   If  But it is only the most recent lease that matters.)

   A server has rebooted, could avoid this situation by rejecting the stateids (NFS4ERR_STALE_STATEID
   error) or the client ID (NFS4ERR_STALE_CLIENTID error) will not be
   valid hence preventing spurious renewals.

   This approach allows for low overhead lease renewal which scales
   well.  In the typical case no extra RPC calls are required for lease
   renewal and in the worst case one RPC is required every lease period
   (i.e., a RENEW operation).  The number reclaim of lock
   2.  However, to do so accurately it would have to ensure that
   additional information about individual locks held by the client is survives reboot.
   Server implementations are not a factor since all state for required to do that, so the client is involved with the
   lease renewal action.

   Since all operations
   must not assume that create a new lease also renew existing
   leases, the server must maintain a common lease expiration time for
   all valid leases for will.

   Instead, a given client.  This lease time can then be
   easily updated upon implicit lease renewal actions.

9.6.  Crash Recovery

   The important requirement in crash recovery is that both the client
   and MUST reclaim only those locks which it successfully
   acquired from the previous server know when the other has failed.  Additionally, it is
   required instance, omitting any that it
   failed to reclaim before a client sees a consistent view of data across server
   restarts or reboots.  All READ and WRITE operations that may have
   been queued within the client or network buffers must wait until new reboot.  Thus, in the last step above,
   client has successfully recovered A should reclaim only lock 1.

9.6.3.4.5.  Client's Handling of Reclaim Errors

   A mandate for the locks protecting client's handling of the READ and
   WRITE operations.

9.6.1.  Client Failure NFS4ERR_NO_GRACE and Recovery

   In
   NFS4ERR_RECLAIM_BAD errors is outside the event that a client fails, scope of this
   specification, since the server may recover strategies for such handling are very
   dependent on the client's
   locks when operating environment.  However, one
   potential approach is described below.

   When the associated leases have expired.  Conflicting locks
   from another client may only be granted after this lease expiration.
   If client's reclaim fails, it could examine the change
   attribute of the objects the client is able trying to restart reclaim state for,
   and use that to determine whether to re-establish the state via
   normal OPEN or reinitialize within LOCK requests.  This is acceptable provided the lease
   period
   client's operating environment allows it.  In other words, the client may be forced
   implementor is advised to wait the remainder of document for his users the lease
   period before obtaining new locks.

   To minimize behavior.  The
   client delay upon restart, open and could also inform the application that its byte-range lock requests are
   associated with an instance or
   share reservations (whether they were delegated or not) have been
   lost, such as via a UNIX signal, a GUI pop-up window, etc.  See
   Section 10.5, for a discussion of what the client by a should do for
   dealing with unreclaimed delegations on client supplied
   verifier.  This verifier is part state.

   For further discussion of revocation of locks see Section 9.8.

9.7.  Recovery from a Lock Request Timeout or Abort

   In the initial SETCLIENTID call made
   by the client.  The server returns event a client ID as lock request times out, a result of client may decide to not
   retry the
   SETCLIENTID operation. request.  The client then confirms may also abort the use of request when the
   client ID with SETCLIENTID_CONFIRM.  The client ID in combination
   with an opaque owner field is then used by the client to identify the
   open owner
   process for OPEN.  This chain of associations which it was issued is then used terminated (e.g., in UNIX due to
   identify all locks for a particular client.

   Since
   signal).  It is possible though that the verifier will be changed by server received the client request
   and acted upon each
   initialization, it.  This would change the state on the server can compare a new verifier to without
   the verifier
   associated with currently held locks and determine client being aware of the change.  It is paramount that they do not
   match.  This signifies the client's new instantiation and subsequent
   loss of locking state.  As
   client re-synchronize state with server before it attempts any other
   operation that takes a result, seqid and/or a stateid with the server same state-
   owner.  This is free straightforward to release
   all locks held which are associated with do without a special re-
   synchronize operation.

   Since the server maintains the last lock request and response
   received on the state-owner, for each state-owner, the old client ID which was
   derived from should
   cache the old verifier.

   Note last lock request it sent such that the verifier must have lock request did
   not receive a response.  From this, the same uniqueness properties of next time the verifier client does a
   lock operation for the COMMIT operation.

9.6.2.  Server Failure state-owner, it can send the cached request,
   if there is one, and Recovery

   If if the server loses locking request was one that established state (usually as
   (e.g., a LOCK or OPEN operation), the server will return the cached
   result of a restart or reboot), it must allow clients time to discover this fact and re-
   establish if never saw the lost locking state. request, perform it.  The client must be able can
   follow up with a request to re-
   establish remove the locking state without having (e.g., a LOCKU or CLOSE
   operation).  With this approach, the server deny valid
   requests because sequencing and stateid
   information on the client and server has granted conflicting access to another
   client.  Likewise, if there is for the possibility that clients have not
   yet re-established their locking given state-owner will
   re-synchronize and in turn the lock state for a file, will re-synchronize.

9.8.  Server Revocation of Locks

   At any point, the server must
   disallow READ can revoke locks held by a client and WRITE operations the
   client must be prepared for that file.  The duration of this recovery period event.  When the client detects that
   its locks have been or may have been revoked, the client is equal to
   responsible for validating the duration of state information between itself and
   the server.  Validating locking state for the lease period.

   A client can determine means that server failure (and thus loss of locking
   state) has occurred, when it receives one of two errors.  The
   NFS4ERR_STALE_STATEID error indicates a stateid invalidated by a
   reboot
   must verify or restart. reclaim state for each lock currently held.

   The NFS4ERR_STALE_CLIENTID error indicates a
   client ID invalidated by first instance of lock revocation is upon server reboot or restart.  When either of these are
   received, re-
   initialization.  In this instance the client must establish a new client ID (see
   Section 9.1.1) will receive an error
   (NFS4ERR_STALE_STATEID or NFS4ERR_STALE_CLIENTID) and re-establish the locking state client will
   proceed with normal crash recovery as discussed below.

   The period of special handling of locking and READs and WRITEs, equal described in duration the previous
   section.

   The second lock revocation event is the inability to renew the lease period,
   before expiration.  While this is referred to as considered a rare or unusual event,
   the "grace
   period".  During client must be prepared to recover.  Both the grace period, clients recover locks server and client
   will be able to detect the
   associated state by reclaim-type locking requests (i.e., LOCK
   requests with reclaim set failure to true renew the lease and OPEN operations with a claim
   type are capable
   of either CLAIM_PREVIOUS or CLAIM_DELEGATE_PREV).  During recovering without data corruption.  For the
   grace period, server, it tracks the server must reject READ and WRITE operations and
   non-reclaim locking requests (i.e., other LOCK
   last renewal event serviced for the client and OPEN operations)
   with an error of NFS4ERR_GRACE.

   If knows when the server can reliably determine that granting a non-reclaim
   request lease
   will not conflict with reclamation of locks by other clients,
   the NFS4ERR_GRACE error does not have to be returned and expire.  Similarly, the non-
   reclaim client request can be serviced.  For the server to be able to
   service READ and WRITE must track operations during which will
   renew the grace period, it must
   again be able to guarantee lease period.  Using the time that no possible conflict could arise
   between an impending reclaim locking each such request was
   sent and the READ or WRITE
   operation.  If the server is unable to offer time that guarantee, the
   NFS4ERR_GRACE error must be returned to the client.

   For a server to provide simple, valid handling during corresponding reply was received, the grace
   period,
   client should bound the easiest method is to simply reject all non-reclaim
   locking requests and READ and WRITE operations by returning time that the
   NFS4ERR_GRACE error.  However, a server may keep information about
   granted locks in stable storage.  With this information, corresponding renewal could
   have occurred on the server
   could and thus determine if it is possible that
   a regular lease period expiration could have occurred.

   The third lock or READ or WRITE operation revocation event can be
   safely processed.

   For example, if occur as a count result of locks on
   administrative intervention within the lease period.  While this is
   considered a given file rare event, it is available in
   stable storage, the server can track reclaimed locks for the file and
   when all reclaims have been processed, non-reclaim locking requests
   may be processed.  This way the server can ensure possible that non-reclaim
   locking requests will not conflict with potential reclaim requests.
   With respect to I/O requests, if the server is able server's
   administrator has decided to determine that
   there are no outstanding reclaim requests for release or revoke a file particular lock held
   by information
   from stable storage or another similar mechanism, the processing of
   I/O requests could proceed normally for the file.

   To reiterate, for client.  As a server that allows non-reclaim lock and I/O
   requests to be processed during result of revocation, the grace period, it MUST determine
   that no lock subsequently reclaimed client will be rejected and that no lock
   subsequently reclaimed would have prevented any I/O operation
   processed during the grace period.

   Clients should be prepared for the return receive an
   error of NFS4ERR_GRACE errors for
   non-reclaim lock and I/O requests. NFS4ERR_ADMIN_REVOKED.  In this case instance the client should
   employ a retry mechanism for the request.  A delay (on the order of
   several seconds) between retries should be used to avoid overwhelming
   the server.  Further discussion of may
   assume that only the general issue is included in
   [Floyd]. state-owner's locks have been lost.  The client must account for the server that is able to
   perform I/O and non-reclaim locking requests within
   notifies the grace period
   as well as those that cannot do so.

   A reclaim-type locking request outside lock holder appropriately.  The client MUST NOT assume
   the server's grace lease period can
   only succeed if the server can guarantee that no conflicting lock or
   I/O request has been granted since reboot or restart.

   A server may, upon restart, establish renewed as a new value for the lease
   period.  Therefore, clients should, once result of a new client ID is
   established, refetch the lease_time attribute and use it as failed operation.

   When the basis
   for lease renewal for client determines the lease associated with that server.
   However, period may have expired, the server
   client must establish, mark all locks held for this restart event, a grace
   period at least as long as the associated lease period for the previous server
   instantiation. as
   "unvalidated".  This allows means the client has been unable to re-establish
   or confirm the appropriate lock state obtained during with the server.  As described
   in Section 9.6, there are scenarios in which the
   previous server instance to be reliably re-established.

9.6.3.  Network Partitions and Recovery

   If may grant
   conflicting locks after the duration of lease period has expired for a network partition client.
   When it is greater than possible that the lease period provided by the server, has expired, the server will have client
   must validate each lock currently held to ensure that a conflicting
   lock has not received been granted.  The client may accomplish this task by
   issuing an I/O request, either a
   lease renewal from pending I/O or a zero-length read,
   specifying the client. stateid associated with the lock in question.  If this occurs, the server may cancel
   response to the lease and free all locks held for request is success, the client.  As a result, client has validated all
   stateids held of
   the locks governed by that stateid and re-established the client will become invalid or stale.  Once appropriate
   state between itself and the
   client is able to reach server.

   If the server after such a network partition,
   all I/O submitted by request is not successful, then one or more of the client locks
   associated with the now invalid stateids will
   fail with stateid was revoked by the server returning the error NFS4ERR_EXPIRED.  Once this
   error is received, and the client will suitably
   must notify the application
   that held the lock.

9.6.3.1.  Courtesy Locks

   As owner.

9.9.  Share Reservations

   A share reservation is a courtesy mechanism to the client or as an optimization, the server may
   continue control access to hold locks, including delegations, on behalf of a file.  It
   is a separate and independent mechanism from byte-range locking.
   When a client
   for which recent communication has extended beyond opens a file, it issues an OPEN operation to the lease period,
   delaying server
   specifying the cancellation type of the lease.  If the server receives a
   lock access required (READ, WRITE, or I/O request that conflicts with one BOTH) and the
   type of these courtesy locks access to deny others (OPEN4_SHARE_DENY_NONE,
   OPEN4_SHARE_DENY_READ, OPEN4_SHARE_DENY_WRITE, or if it runs out of resources,
   OPEN4_SHARE_DENY_BOTH).  If the server MAY cause lease
   cancellation to occur at that time and henceforth return
   NFS4ERR_EXPIRED when any of OPEN fails the stateids associated with client will fail the freed
   locks is used.  If lease cancellation has not occurred and
   application's open request.

   Pseudo-code definition of the server
   receives a lock or I/O request that conflicts with one semantics:

     if (request.access == 0)
             return (NFS4ERR_INVAL)
     else if ((request.access & file_state.deny)) ||
         (request.deny & file_state.access))
             return (NFS4ERR_DENIED)

   This checking of share reservations on OPEN is done with no exception
   for an existing OPEN for the
   courtesy locks, same open-owner.

   The constants used for the requirements OPEN and OPEN_DOWNGRADE operations for the
   access and deny fields are as follows:

   o  In the case of a courtesy lock which is not

   const OPEN4_SHARE_ACCESS_READ   = 0x00000001;
   const OPEN4_SHARE_ACCESS_WRITE  = 0x00000002;
   const OPEN4_SHARE_ACCESS_BOTH   = 0x00000003;

   const OPEN4_SHARE_DENY_NONE     = 0x00000000;
   const OPEN4_SHARE_DENY_READ     = 0x00000001;
   const OPEN4_SHARE_DENY_WRITE    = 0x00000002;
   const OPEN4_SHARE_DENY_BOTH     = 0x00000003;

9.10.  OPEN/CLOSE Operations

   To provide correct share semantics, a delegation, it client MUST
      free use the courtesy lock OPEN
   operation to obtain the initial filehandle and grant indicate the new request.

   o  In desired
   access and what access, if any, to deny.  Even if the case of lock or IO request which conflicts with client intends
   to use a
      delegation which is being held as courtesy lock, the server MAY
      delay resolution stateid of request but MUST NOT reject all 0's or all 1's, it must still obtain the request and
      MUST free
   filehandle for the delegation and grant regular file with the new request eventually.

   o  In OPEN operation so the case of
   appropriate share semantics can be applied.  Clients that do not have
   a requests deny mode built into their programming interfaces for opening a delegation which conflicts with a
      delegation which is being held as courtesy lock, the server MAY
      grant the new
   file should request or not as it chooses, but if it grants a deny mode of OPEN4_SHARE_DENY_NONE.

   The OPEN operation with the
      conflicting request, CREATE flag, also subsumes the delegation haled CREATE
   operation for regular files as courtesy lock MUST be
      freed.

   If the server does not reboot or cancel the lease before the network
   partition is healed, when used in previous versions of the original client tries to access NFS
   protocol.  This allows a
   courtesy lock which was freed, the server SHOULD send back create with a
   NFS4ERR_BAD_STATEID share to be done atomically.

   The CLOSE operation removes all share reservations held by the client. open-
   owner on that file.  If byte-range locks are held, the client tries to access a
   courtesy lock which was not freed, then the server SHOULD mark
   release all of
   the courtesy locks as implicitly being renewed.

9.6.3.2.  Lease Cancellation

   As before issuing a result of lease expiration, leases may be cancelled, either
   immediately upon expiration or subsequently, depending CLOSE.  The server MAY free all
   outstanding locks on CLOSE but some servers may not support the
   occurrence CLOSE
   of a conflicting file that still has byte-range locks held.  The server MUST
   return failure, NFS4ERR_LOCKS_HELD, if any locks would exist after
   the CLOSE.

   The LOOKUP operation will return a filehandle without establishing
   any lock or extension of state on the period of
   partition beyond what server.  Without a valid stateid, the server
   will tolerate.

   When assume the client has the least access.  For example, if one
   client opened a lease is cancelled, all locking state associated file with OPEN4_SHARE_DENY_BOTH and another client
   accesses the file via a filehandle obtained through LOOKUP, the
   second client could only read the file using the special read bypass
   stateid.  The second client could not WRITE the file at all because
   it is
   freed would not have a valid stateid from OPEN and use the special anonymous
   stateid would not be allowed access.

9.10.1.  Close and Retention of State Information

   Since a CLOSE operation requests deallocation of a stateid, dealing
   with retransmission of any the associated stateids will result in
   NFS4ERR_EXPIRED being returned.  Similarly, use CLOSE, may pose special difficulties,
   since the state information, which normally would be used to
   determine the state of the associated
   clientid will result in NFS4ERR_EXPIRED open file being returned.

   The client should recover from designated, might be
   deallocated, resulting in an NFS4ERR_BAD_STATEID error.

   Servers may deal with this situation by using SETCLIENTID
   followed by SETCLIENTID_CONFIRM, problem in order to establish a new
   clientid.  Once a lock number of ways.  To provide
   the greatest degree assurance that the protocol is obtained using being used
   properly, a server should, rather than deallocate the stateid, mark
   it as close-pending, and retain the stateid with this clientid, status, until
   later deallocation.  In this way, a lease will retransmitted CLOSE can be established.

9.6.3.3.  Client's Reaction to a Freed Lock

   There is no way for a client
   recognized since the stateid points to predetermine how state information with this
   distinctive status, so that it can be handled without error.

   When adopting this strategy, a given server is
   going to behave during a network partition.  When should retain the partition
   heals, either state
   information until the client still has all of its locks, it has some of
   its locks, or it has none of them. earliest of:

   o  Another validly sequenced request for the same open-owner, that is
      not a retransmission.

   o  The client will be able to
   examine time that an open-owner is freed by the various error return values server due to determine its response.

   NFS4ERR_EXPIRED: period
      with no activity.

   o  All locks have been for the client are freed as a result of a lease cancellation
      which occurred during SETCLIENTID.

   Servers may avoid this complexity, at the partition.  The client should use cost of less complete
   protocol error checking, by simply responding NFS4_OK in the event of
   a
      SETCLIENTID to recover.

   NFS4ERR_ADMIN_REVOKED:

      The current lock has been revoked before, during, or after CLOSE for a deallocated stateid, on the
      partition.  The client SHOULD handle assumption that this error as case
   must be caused by a retransmitted close.  When adopting this
   approach, it normally
      would.

   NFS4ERR_BAD_STATEID:

      The current lock has been revoked/released during the partition
      and is desirable to at least log an error when returning a
   no-error indication in this situation.  If the server did not reboot.  Other locks MAY still be renewed.
      The client need not do maintains a SETCLIENTID
   reply-cache mechanism, it can verify the CLOSE is indeed a
   retransmission and instead SHOULD probe via avoid error logging in most cases.

9.11.  Open Upgrade and Downgrade

   When an OPEN is done for a RENEW call.

   NFS4ERR_RECLAIM_BAD:

      The current lock has been revoked during the partition file and the
      server rebooted.  The server might have no information on open-owner for which the
      other locks.  They may still be renewable.

   NFS4ERR_NO_GRACE:

      The client's locks have been revoked during open
   is being done already has the partition and file open, the
      server rebooted.  None of result is to upgrade the
   open file status maintained on the client's locks will be renewable.

   NFS4ERR_OLD_STATEID:

      The server has not rebooted. to include the access and
   deny bits specified by the new OPEN as well as those for the existing
   OPEN.  The client SHOULD handle this error result is that there is one open file, as it normally would.

9.6.3.4.  Edge Conditions

   When a network partition far as the
   protocol is combined with a server reboot, then both concerned, and it includes the server union of the access and client have responsibilities to ensure that
   deny bits for all of the client
   does not reclaim OPEN requests completed.  Only a lock which it should no longer single
   CLOSE will be able to access.

   Briefly those are:

   o  Client's responsibility: A client MUST NOT attempt done to reclaim any
      locks which it did not hold at reset the end effects of its most recent
      successfully established client lease.

   o  Server's responsibility: A server MUST NOT allow a client to
      reclaim a lock unless it knows both OPENs.  Note that it could not have since
      granted a conflicting lock.  However, in deciding whether a
      conflicting lock could have been granted, it is permitted to
      assume its clients are responsible, as above.

   A server the
   client, when issuing the OPEN, may consider a client's lease "successfully established"
   once it has received an open operation from not know that client. the same file is in
   fact being opened.  The above are directed to CLAIM_PREVIOUS reclaims and not to
   CLAIM_DELEGATE_PREV reclaims, which generally do not involve a server
   reboot.  However, when a server persistently stores delegation
   information to support CLAIM_DELEGATE_PREV across a period in which only applies if both client and OPENs result in
   the OPENed object being designated by the same filehandle.

   When the server are down at chooses to export multiple filehandles corresponding
   to the same time, similar strictures
   apply.

   The next sections give examples showing what can go wrong if these
   responsibilities are neglected, file object and provides examples returns different filehandles on two
   different OPENs of the same file object, the server
   implementation strategies that could meet a server's
   responsibilities.

9.6.3.4.1.  First Server Edge Condition

   The first edge condition has MUST NOT "OR"
   together the following scenario:

   1.  Client A acquires a lock.

   2.  Client A access and deny bits and coalesce the two open files.
   Instead the server experience mutual network partition, such
       that client A is unable to renew its lease.

   3.  Client A's lease expires, so server releases lock.

   4.  Client B acquires a lock that would have conflicted must maintain separate OPENs with that of
       Client A.

   5.  Client B releases separate
   stateids and will require separate CLOSEs to free them.

   When multiple open files on the lock

   6.  Server reboots

   7.  Network partition between client A and server heals.

   8.  Client A issues a RENEW operation, and gets back are merged into a
       NFS4ERR_STALE_CLIENTID.

   9.  Client A reclaims its lock within single open
   file object on the server's grace period.

   Thus, at server, the final step, close of one of the server has erroneously granted client
   A's lock reclaim.  If client B modified the object the lock was
   protecting, client A will experience object corruption.

9.6.3.4.2.  Second Server Edge Condition

   The second known edge condition follows:

   1.   Client A acquires a lock.

   2.   Server reboots.

   3.   Client A and server experience mutual network partition, such
        that client A is unable to reclaim its lock within open files (on the grace
        period.

   4.   Server's reclaim grace period ends.  Client A has no locks
        recorded on server.

   5.   Client B acquires a lock that would have conflicted with that
   client) may necessitate change of
        Client A.

   6.   Client B releases the lock.

   7.   Server reboots a second time.

   8.   Network partition between client A and server heals.

   9.   Client A issues a RENEW operation, access and gets back a
        NFS4ERR_STALE_CLIENTID.

   10.  Client A reclaims its lock within the server's grace period.

   As with the first edge condition, the final step deny status of the scenario of
   open file on the second edge condition has server.  This is because the server erroneously granting client
   A's lock reclaim.

9.6.3.4.3.  Handling Server Edge Conditions

   In both union of the above examples, access and
   deny bits for the client attempts reclaim of remaining opens may be smaller (i.e., a lock
   that it held at the end of its most recent successfully established
   lease; thus, it has fulfilled its responsibility. proper
   subset) than previously.  The server, however, has failed, by granting a reclaim, despite
   having granted a conflicting lock since OPEN_DOWNGRADE operation is used to
   make the reclaimed lock was last
   held.

   Solving these edge conditions requires that necessary change and the server either assume
   after client should use it reboots that edge condition occurs, and thus return
   NFS4ERR_NO_GRACE for all reclaim attempts, or that to update the
   server record
   some information in stable storage. so that share reservation requests by other clients are
   handled properly.  The amount of information the
   server records in stable storage is in inverse proportion to how
   harsh stateid returned has the server wants same "other" field as
   that passed to be whenever the edge conditions occur. server.  The
   server that is completely tolerant of all edge conditions will record "seqid" value in stable storage every lock that is acquired, removing the lock
   record from stable storage only when the lock returned stateid
   MUST be incremented, even in situations in which there is unlocked by no change
   to the
   client access and deny bits for the lock's owner advances the sequence number such that
   the lock release is not file.

9.12.  Short and Long Leases

   When determining the last stateful event time period for the owner's
   sequence.  For the two aforementioned edge conditions, server lease, the harshest a usual
   lease tradeoffs apply.  Short leases are good for fast server can be, and still support
   recovery at a grace period for reclaims,
   requires that cost of increased RENEW or READ (with zero length)
   requests.  Longer leases are certainly kinder and gentler to servers
   trying to handle very large numbers of clients.  The number of RENEW
   requests drop in proportion to the lease time.  The disadvantages of
   long leases are slower recovery after server record in stable storage information some
   minimal information.  For example, a failure (the server implementation could, must
   wait for
   each client, save in stable storage a record containing:

   o the client's id string

   o  a boolean that indicates if leases to expire and the client's grace period to elapse before
   granting new lock requests) and increased file contention (if client
   fails to transmit an unlock request then server must wait for lease expired or
   expiration before granting new locks).

   Long leases are usable if there
      was administrative intervention (see Section 9.8) to revoke a
      byte-range lock, share reservation, or delegation

   o  a timestamp that the server is updated able to store lease state in
   non-volatile memory.  Upon recovery, the first time after a server boot or
      reboot can reconstruct the client acquires byte-range locking, share reservation,
      or delegation
   lease state on the server.  The timestamp need from its non-volatile memory and continue operation with
   its clients and therefore long leases would not be
      updated on subsequent lock requests until the server reboots.

   The server implementation would also record in an issue.

9.13.  Clocks, Propagation Delay, and Calculating Lease Expiration

   To avoid the stable storage need for synchronized clocks, lease times are granted by
   the
   timestamps from server as a time delta.  However, there is a requirement that the two most recent
   client and server reboots.

   Assuming clocks do not drift excessively over the above record keeping, for duration
   of the first edge condition,
   after lock.  There is also the server reboots, issue of propagation delay across the record
   network which could easily be several hundred milliseconds as well as
   the possibility that requests will be lost and need to be
   retransmitted.

   To take propagation delay into account, the client A's should subtract it
   from lease expired
   means times (e.g., if the client estimates the one-way
   propagation delay as 200 msec, then it can assume that the lease is
   already 200 msec old when it gets it).  In addition, it will take
   another 200 msec to get a response back to the server.  So the client could have acquired
   must send a conflicting record
   lock, share reservation, lock renewal or delegation.  Hence write data back to the server must reject
   a reclaim from client A with 400 msec
   before the error NFS4ERR_NO_GRACE or
   NFS4ERR_RECLAIM_BAD.

   For lease would expire.

   The server's lease period configuration should take into account the second edge condition, after
   network distance of the server reboots for a second
   time, clients that will be accessing the record server's
   resources.  It is expected that the lease period will take into
   account the network propagation delays and other network delay
   factors for the client had population.  Since the protocol does not allow
   for an unexpired record lock, share
   reservation, or delegation established before automatic method to determine an appropriate lease period, the
   server's previous
   incarnation means that administrator may have to tune the server must reject lease period.

9.14.  Migration, Replication and State

   When responsibility for handling a reclaim from client A
   with the error NFS4ERR_NO_GRACE given file system is transferred
   to a new server (migration) or NFS4ERR_RECLAIM_BAD.

   Regardless of the level and approach client chooses to record keeping, the use an alternate
   server
   MUST implement one (e.g., in response to server unresponsiveness) in the context
   of file system replication, the following strategies (which apply to
   reclaims appropriate handling of share reservations, byte-range state shared
   between the client and server (i.e., locks, leases, stateids, and delegations):

   1.  Reject all reclaims with NFS4ERR_NO_GRACE.  This
   client IDs) is super harsh,
       but necessary if the server does not want to record lock state in
       stable storage.

   2.  Record sufficient state in stable storage to meet its
       responsibilities.  In doubt, the as described below.  The handling differs between
   migration and replication.  For related discussion of file server should err on the side
   state and recover of
       being harsh.

       In such see the event that, after sections under Section 9.6.

   If a server reboot, the server determines
       that there is unrecoverable damage replica or corruption to the the
       stable storage, then for all clients and/or locks affected, the a server MUST return NFS4ERR_NO_GRACE.

9.6.3.4.4.  Client Edge Condition

   A third edge condition effects immigrating a file system agrees to,
   or is expected to, accept opaque values from the client and not the server.  If that
   originated from another server, then servers SHOULD encode the
   server reboots
   "opaque" values in network byte order.  This way, servers acting as
   replicas or immigrating file systems will be able to parse values
   like stateids, directory cookies, filehandles, etc. even if their
   native byte order is different from other servers cooperating in the middle
   replication and migration of the client reclaiming some locks file system.

9.14.1.  Migration and
   then a network partition is established, State

   In the client might be case of migration, the servers involved in the
   situation migration of having reclaimed some, but not all locks.  In that case, a conservative client would assume that
   file system SHOULD transfer all server state from the non-reclaimed locks were
   revoked.

   The third known edge condition follows:

   1.   Client A acquires a lock 1.

   2.   Client A acquires a lock 2.

   3.   Server reboots.

   4.   Client A issues a RENEW operation, and gets back a
        NFS4ERR_STALE_CLIENTID.

   5.   Client A reclaims its lock 1 within original to the server's grace period.

   6.   Client A and server experience mutual network partition, such
   new server.  This must be done in a way that client A is unable transparent to reclaim its remaining locks within the grace period.

   7.   Server's reclaim grace period ends.

   8.   Client B acquires a lock that would have conflicted with Client
        A's lock 2.

   9.   Client B releases
   client.  This state transfer will ease the lock.

   10.  Server reboots a second time.

   11.  Network partition between client A and server heals.

   12.  Client A issues a RENEW operation, and gets back client's transition when a
        NFS4ERR_STALE_CLIENTID.

   13.  Client A reclaims both lock 1 and lock 2 within the server's
        grace period.

   At the last step,
   file system migration occurs.  If the client reclaims lock 2 as if it had held that
   lock continuously, when servers are successful in fact a conflicting lock was granted to
   client B.

   This occurs because
   transferring all state, the client failed its responsibility, by
   attempting will continue to reclaim lock 2 even though it had not held that lock at
   the end of the lease that was established use stateids
   assigned by the SETCLIENTID after original server.  Therefore the first new server reboot.  (The must
   recognize these stateids as valid.  This holds true for the client did hold lock 2 on ID
   as well.  Since responsibility for an entire file system is
   transferred with a previous
   lease.  But it migration event, there is only the most recent lease no possibility that matters.)

   A
   conflicts will arise on the new server could avoid this situation by rejecting as a result of the reclaim transfer of
   locks.

   As part of the transfer of lock
   2.  However, to do so accurately it would have to ensure that
   additional information about individual locks held survives reboot.
   Server implementations are not required between servers, leases would
   be transferred as well.  The leases being transferred to do that, so the client
   must not assume that the new
   server will.

   Instead, will typically have a client MUST reclaim only those locks which it successfully
   acquired different expiration time from the previous server instance, omitting any that it
   failed to reclaim before a new reboot.  Thus, in the last step above,
   client A should reclaim only lock 1.

9.6.3.4.5.  Client's Handling of Reclaim Errors

   A mandate those for
   the client's handling of the NFS4ERR_NO_GRACE and
   NFS4ERR_RECLAIM_BAD errors is outside same client, previously on the scope of this
   specification, since old server.  To maintain the strategies for such handling are very
   dependent
   property that all leases on a given server for a given client expire
   at the client's operating environment.  However, one
   potential approach is described below.

   When same time, the client's reclaim fails, it could examine server should advance the change
   attribute expiration time to
   the later of the objects leases being transferred or the leases already
   present.  This allows the client is trying to reclaim state for,
   and use that to determine whether maintain lease renewal of both
   classes without special effort.

   The servers may choose not to re-establish transfer the state via
   normal OPEN or LOCK requests.  This information upon
   migration.  However, this choice is acceptable provided the
   client's operating environment allows it. discouraged.  In other words, the client
   implementor is advised to document for his users this case, when
   the behavior.  The client could also inform presents state information from the application that its byte-range lock or
   share reservations (whether they were delegated or not) have been
   lost, such as via a UNIX signal, original server (e.g.,
   in a GUI pop-up window, etc.  See
   Section 10.5, for RENEW op or a discussion of what the client should do for
   dealing with unreclaimed delegations on client state.

   For further discussion of revocation READ op of locks see Section 9.8.

9.7.  Recovery from a Lock Request Timeout or Abort

   In zero length), the event a lock request times out, a client may decide must be
   prepared to not
   retry receive either NFS4ERR_STALE_CLIENTID or
   NFS4ERR_STALE_STATEID from the request. new server.  The client may also abort the request when the
   process for which should then
   recover its state information as it was issued is terminated (e.g., normally would in UNIX due response to a
   signal).  It is possible though that the
   server received failure.  The new server must take care to allow for the request
   and acted upon it.  This
   recovery of state information as it would change in the state on event of server
   restart.

   A client SHOULD re-establish new callback information with the new
   server without as soon as possible, according to sequences described in
   Section 15.35 and Section 15.36.  This ensures that server operations
   are not blocked by the inability to recall delegations.

9.14.2.  Replication and State

   Since client being aware of switch-over in the change.  It case of replication is paramount that not under
   server control, the
   client re-synchronize handling of state with server before it attempts any other
   operation that takes a seqid and/or a stateid with the same state-
   owner.  This is straightforward to different.  In this case,
   leases, stateids and client IDs do without not have validity across a special re-
   synchronize operation.

   Since the
   transition from one server maintains the last lock request and response
   received on the state-owner, for each state-owner, the to another.  The client should
   cache must re-establish
   its locks on the last lock request it sent such that new server.  This can be compared to the lock request did
   not receive re-
   establishment of locks by means of reclaim-type requests after a response.  From this,
   server reboot.  The difference is that the next time server has no provision to
   distinguish requests reclaiming locks from those obtaining new locks
   or to defer the latter.  Thus, a client does re-establishing a lock operation for the state-owner, it can send the cached request,
   if there is one, and if on the request was one that established state
   (e.g.,
   new server (by means of a LOCK or OPEN operation), the server will return the cached
   result or if never saw request), may have the request, perform it.  The client can
   follow up with
   requests denied due to a request conflicting lock.  Since replication is
   intended for read-only use of file systems, such denial of locks
   should not pose large difficulties in practice.  When an attempt to remove the state (e.g.,
   re-establish a LOCKU or CLOSE
   operation).  With this approach, the sequencing and stateid
   information lock on the client and a new server for is denied, the given state-owner will
   re-synchronize and in turn client should
   treat the situation as if his original lock state will re-synchronize.

9.8.  Server Revocation had been revoked.

9.14.3.  Notification of Locks

   At any point, Migrated Lease

   In the server can revoke locks held by a client and case of lease renewal, the client must may not be prepared submitting
   requests for this event.  When the client detects a file system that
   its locks have been or may have has been revoked, migrated to another server.
   This can occur because of the implicit lease renewal mechanism.  The
   client is
   responsible renews leases for validating the state information between itself and all file systems when submitting a request
   to any one file system at the server.  Validating locking state

   In order for the client means to schedule renewal of leases that it may have
   been relocated to the new server, the client must verify or reclaim state find out about
   lease relocation before those leases expire.  To accomplish this, all
   operations which implicitly renew leases for each lock currently held.

   The first instance of lock revocation is upon server reboot or re-
   initialization.  In this instance the a client will receive an error
   (NFS4ERR_STALE_STATEID or NFS4ERR_STALE_CLIENTID) (such as OPEN,
   CLOSE, READ, WRITE, RENEW, LOCK, and the client others), will
   proceed with normal crash recovery as described in return the previous
   section.

   The second lock revocation event is error
   NFS4ERR_LEASE_MOVED if responsibility for any of the inability leases to be
   renewed has been transferred to renew the lease
   before expiration.  While this is considered a rare or unusual event, new server.  This condition will
   continue until the client must be prepared to recover.  Both receives an NFS4ERR_MOVED error and the
   server and client
   will be able to detect receives the failure subsequent GETATTR(fs_locations) for an access to renew the
   each file system for which a lease and are capable
   of recovering without data corruption.  For the server, it tracks has been moved to a new server.
   By convention, the
   last renewal event serviced for compound including the client and knows when GETATTR(fs_locations)
   SHOULD append a RENEW operation to permit the lease
   will expire.  Similarly, server to identify the
   client must track operations which will
   renew doing the lease period.  Using access.

   Upon receiving the time NFS4ERR_LEASE_MOVED error, a client that each such request was
   sent and the time supports
   file system migration MUST probe all file systems from that the corresponding reply was received, server on
   which it holds open state.  Once the client should bound has successfully probed
   all those file systems which are migrated, the time server MUST resume
   normal handling of stateful requests from that client.

   In order to support legacy clients that do not handle the corresponding renewal could
   have occurred on
   NFS4ERR_LEASE_MOVED error correctly, the server and thus determine if it is possible that
   a lease period expiration could have occurred.

   The third lock revocation event can occur as SHOULD time out after
   a result wait of
   administrative intervention within the at least two lease period.  While this is
   considered a rare event, periods, at which time it is possible that the server's
   administrator has decided to release or revoke will resume
   normal handling of stateful requests from all clients.  If a particular lock held
   by client
   attempts to access the client.  As a result of revocation, migrated files, the server MUST reply
   NFS4ERR_MOVED.

   When the client will receive receives an
   error of NFS4ERR_ADMIN_REVOKED.  In this instance NFS4ERR_MOVED error, the client may
   assume that only can
   follow the state-owner's locks have been lost.  The client
   notifies normal process to obtain the lock holder appropriately.  The client may not assume new server information
   (through the lease period has been renewed as a result fs_locations attribute) and perform renewal of a failed operation.

   When those
   leases on the client determines new server.  If the lease period may have expired, server has not had state
   transferred to it transparently, the client must mark all locks held for will receive either
   NFS4ERR_STALE_CLIENTID or NFS4ERR_STALE_STATEID from the associated lease new server,
   as
   "unvalidated".  This means the described above.  The client has been unable to re-establish
   or confirm the appropriate lock can then recover state with the server.  As described
   in Section 9.6, there are scenarios information as
   it does in which the event of server failure.

9.14.4.  Migration and the Lease_time Attribute

   In order that the client may grant
   conflicting locks after appropriately manage its leases in the lease period has expired
   case of migration, the destination server must establish proper
   values for a client. the lease_time attribute.

   When it state is possible transferred transparently, that state should include
   the lease period has expired, correct value of the client lease_time attribute.  The lease_time
   attribute on the destination server must validate each lock currently held to ensure never be less than that a conflicting
   lock has not been granted.  The client may accomplish on
   the source since this task would result in premature expiration of leases
   granted by
   issuing an I/O request, either a pending I/O or a zero-length read,
   specifying the stateid associated with the lock source server.  Upon migration in question.  If the
   response to the request which state is success,
   transferred transparently, the client has validated all of is under no obligation to re-
   fetch the locks governed by that stateid lease_time attribute and re-established may continue to use the appropriate
   state between itself and value
   previously fetched (on the server. source server).

   If the I/O request is state has not successful, then one been transferred transparently (i.e., the client
   sees a real or more of simulated server reboot), the locks
   associated with client should fetch the stateid was revoked by
   value of lease_time on the server new (i.e., destination) server, and use it
   for subsequent locking requests.  However the client server must notify the owner.

9.9.  Share Reservations

   A share reservation is respect a mechanism to control access
   grace period at least as long as the lease_time on the source server,
   in order to a file.  It
   is a separate and independent mechanism from byte-range locking.
   When a client opens a file, it issues an OPEN operation ensure that clients have ample time to reclaim their
   locks before potentially conflicting non-reclaimed locks are granted.
   The means by which the new server
   specifying obtains the type value of access required (READ, WRITE, or BOTH) and lease_time on
   the
   type of access old server is left to deny others (OPEN4_SHARE_DENY_NONE,
   OPEN4_SHARE_DENY_READ, OPEN4_SHARE_DENY_WRITE, or
   OPEN4_SHARE_DENY_BOTH).  If the OPEN fails the client will fail server implementations.  It is not
   specified by the
   application's open request.

   Pseudo-code definition NFS version 4 protocol.

10.  Client-Side Caching

   Client-side caching of the semantics:

     if (request.access == 0)
             return (NFS4ERR_INVAL)
     else if ((request.access & file_state.deny)) ||
         (request.deny & file_state.access))
             return (NFS4ERR_DENIED)

   This checking data, of share reservations on OPEN file attributes, and of file names is done
   essential to providing good performance with no exception
   for an existing OPEN for the same open-owner.

   The constants used for the OPEN NFS protocol.
   Providing distributed cache coherence is a difficult problem and OPEN_DOWNGRADE operations for
   previous versions of the
   access and deny fields are as follows:

   const OPEN4_SHARE_ACCESS_READ   = 0x00000001;
   const OPEN4_SHARE_ACCESS_WRITE  = 0x00000002;
   const OPEN4_SHARE_ACCESS_BOTH   = 0x00000003;

   const OPEN4_SHARE_DENY_NONE     = 0x00000000;
   const OPEN4_SHARE_DENY_READ     = 0x00000001;
   const OPEN4_SHARE_DENY_WRITE    = 0x00000002;
   const OPEN4_SHARE_DENY_BOTH     = 0x00000003;

9.10.  OPEN/CLOSE Operations

   To provide correct share semantics, a NFS protocol have not attempted it.
   Instead, several NFS client MUST use the OPEN
   operation implementation techniques have been used
   to obtain the initial filehandle and indicate reduce the desired
   access problems that a lack of coherence poses for users.
   These techniques have not been clearly defined by earlier protocol
   specifications and it is often unclear what access, if any, to deny.  Even if the is valid or invalid
   client intends behavior.

   The NFSv4 protocol uses many techniques similar to use a stateid of all 0's or all 1's, it must still obtain the
   filehandle for the regular file with the OPEN operation so the
   appropriate share semantics can be applied.  Clients those that do not have
   a deny mode built into their programming interfaces for opening a
   file should request a deny mode of OPEN4_SHARE_DENY_NONE.

   The OPEN operation with the CREATE flag, also subsumes the CREATE
   operation for regular files as
   been used in previous versions of the NFS
   protocol.  This allows a create with protocol versions.  The NFSv4 protocol does not
   provide distributed cache coherence.  However, it defines a share more
   limited set of caching guarantees to be done atomically.

   The CLOSE operation removes all allow locks and share
   reservations held by the open-
   owner on that file.  If byte-range locks are held, the to be used without destructive interference from client SHOULD
   release all locks before issuing
   side caching.

   In addition, the NFSv4 protocol introduces a CLOSE.  The delegation mechanism
   which allows many decisions normally made by the server MAY free all
   outstanding locks on CLOSE but some servers may not to be made
   locally by clients.  This mechanism provides efficient support of the CLOSE
   common cases where sharing is infrequent or where sharing is read-
   only.

10.1.  Performance Challenges for Client-Side Caching

   Caching techniques used in previous versions of a file that still has byte-range locks held.  The server MUST
   return failure, NFS4ERR_LOCKS_HELD, if any locks would exist after the CLOSE.

   The LOOKUP operation will return a filehandle without establishing
   any lock state on NFS protocol have
   been successful in providing good performance.  However, several
   scalability challenges can arise when those techniques are used with
   very large numbers of clients.  This is particularly true when
   clients are geographically distributed which classically increases
   the server.  Without a valid stateid, latency for cache re-validation requests.

   The previous versions of the server
   will assume NFS protocol repeat their file data
   cache validation requests at the client has time the least access.  For example, if file is opened.  This
   behavior can have serious performance drawbacks.  A common case is
   one
   client opened in which a file with OPEN4_SHARE_DENY_BOTH and another client
   accesses the file via is only accessed by a filehandle obtained through LOOKUP, single client.  Therefore,
   sharing is infrequent.

   In this case, repeated reference to the
   second server to find that no
   conflicts exist is expensive.  A better option with regards to
   performance is to allow a client could only read the that repeatedly opens a file using to do
   so without reference to the special read bypass
   stateid.  The second server.  This is done until potentially
   conflicting operations from another client could not WRITE the actually occur.

   A similar situation arises in connection with file at all because
   it would not have a valid stateid from OPEN locking.  Sending
   file lock and unlock requests to the special anonymous
   stateid would not be allowed access.

9.10.1.  Close server as well as the read and Retention of State Information

   Since a CLOSE operation
   write requests deallocation of a stateid, dealing necessary to make data caching consistent with retransmission of the CLOSE, may pose special difficulties,
   since the state information, which normally would be
   locking semantics (see Section 10.3.2) can severely limit
   performance.  When locking is used to
   determine provide protection against
   infrequent conflicts, a large penalty is incurred.  This penalty may
   discourage the state use of the open file being designated, might be
   deallocated, resulting in an NFS4ERR_BAD_STATEID error.

   Servers may deal locking by applications.

   The NFSv4 protocol provides more aggressive caching strategies with this problem in a number of ways.  To provide
   the greatest degree assurance that
   the protocol is being used
   properly, following design goals:

   o  Compatibility with a large range of server should, rather than deallocate semantics.

   o  Provide the stateid, mark
   it same caching benefits as close-pending, and retain previous versions of the stateid with this status, until
   later deallocation.  In this way, NFS
      protocol when unable to provide the more aggressive model.

   o  Requirements for aggressive caching are organized so that a retransmitted CLOSE large
      portion of the benefit can be
   recognized since the stateid points to state information with this
   distinctive status, so that it obtained even when not all of the
      requirements can be handled without error.

   When adopting this strategy, a server should retain the state
   information until the earliest of:

   o  Another validly sequenced request for the same open-owner, that is
      not a retransmission.

   o met.

   The time that an open-owner is freed by the server due to period
      with no activity.

   o  All locks appropriate requirements for the client server are freed as a result discussed in later
   sections in which specific forms of a SETCLIENTID.

   Servers may avoid this complexity, at the cost caching are covered (see
   Section 10.4).

10.2.  Delegation and Callbacks

   Recallable delegation of less complete
   protocol error checking, server responsibilities for a file to a
   client improves performance by simply responding NFS4_OK avoiding repeated requests to the
   server in the event absence of
   a CLOSE for a deallocated stateid, on inter-client conflict.  With the assumption that this case
   must be caused by use of a retransmitted close.  When adopting this
   approach, it is desirable
   "callback" RPC from server to at least log an error when returning client, a
   no-error indication in this situation.  If the server maintains a
   reply-cache mechanism, it can verify the CLOSE is indeed a
   retransmission and avoid error logging recalls delegated
   responsibilities when another client engages in most cases.

9.11.  Open Upgrade and Downgrade

   When an OPEN is done for sharing of a file and the open-owner for which the open
   is being done already has the file open, the result
   delegated file.

   A delegation is to upgrade the
   open file status maintained on passed from the server to include the access and
   deny bits specified by the new OPEN as well as those for client, specifying the existing
   OPEN.  The result is that there is one open file, as far as
   object of the
   protocol is concerned, delegation and it includes the union type of the access and
   deny bits for all delegation.  There are
   different types of the OPEN requests completed.  Only delegations but each type contains a single
   CLOSE will stateid to be done
   used to reset the effects of both OPENs.  Note that represent the
   client, delegation when issuing the OPEN, may not know performing operations that
   depend on the same file delegation.  This stateid is in
   fact being opened.  The above only applies if both OPENs result in
   the OPENed object being designated by the same filehandle.

   When the server chooses to export multiple filehandles corresponding similar to those
   associated with locks and share reservations but differs in that the same file object
   stateid for a delegation is associated with a client ID and returns different filehandles may be
   used on two
   different OPENs behalf of all the same file object, the server MUST NOT "OR"
   together the access and deny bits and coalesce the two open files.
   Instead open-owners for the server must maintain separate OPENs with separate
   stateids and will require separate CLOSEs given client.  A
   delegation is made to free them.

   When multiple open files on the client are merged into as a single open
   file object on the server, the close of one of the open files (on the
   client) may necessitate change of the access whole and deny status not to any specific
   process or thread of control within it.

   Because callback RPCs may not work in all environments (due to
   firewalls, for example), correct protocol operation does not depend
   on them.  Preliminary testing of callback functionality by means of a
   CB_NULL procedure determines whether callbacks can be supported.  The
   CB_NULL procedure checks the
   open file on the server.  This is because the union continuity of the access callback path.  A
   server makes a preliminary assessment of callback availability to a
   given client and
   deny bits for avoids delegating responsibilities until it has
   determined that callbacks are supported.  Because the remaining opens may be smaller (i.e., granting of a proper
   subset) than previously.  The OPEN_DOWNGRADE operation
   delegation is used to
   make always conditional upon the necessary change absence of conflicting
   access, clients must not assume that a delegation will be granted and
   they must always be prepared for OPENs to be processed without any
   delegations being granted.

   Once granted, a delegation behaves in most ways like a lock.  There
   is an associated lease that is subject to renewal together with all
   of the other leases held by that client.

   Unlike locks, an operation by a second client should use it to update a delegated file
   will cause the server so that share reservation requests by other clients are
   handled properly.  The stateid returned has to recall a delegation through a callback.

   On recall, the same "other" field client holding the delegation must flush modified
   state (such as
   that passed modified data) to the server.  The "seqid" value in server and return the returned stateid
   MUST
   delegation.  The conflicting request will not be incremented, even in situations in which there acted on until the
   recall is no change
   to complete.  The recall is considered complete when the access and deny bits for
   client returns the file.

9.12.  Short and Long Leases

   When determining delegation or the time period server times out its wait for
   the server lease, delegation to be returned and revokes the usual
   lease tradeoffs apply.  Short leases are good for fast server
   recovery at delegation as a cost result
   of increased RENEW or READ (with zero length)
   requests.  Longer leases are certainly kinder and gentler to servers
   trying the timeout.  In the interim, the server will either delay
   responding to handle very large numbers of clients.  The number of RENEW conflicting requests drop in proportion or respond to them with
   NFS4ERR_DELAY.  Following the lease time.  The disadvantages resolution of
   long leases are slower recovery after server failure (the the recall, the server must
   wait for
   has the leases information necessary to expire and grant or deny the second client's
   request.

   At the time the grace period to elapse before
   granting new lock requests) and increased file contention (if client
   fails receives a delegation recall, it may have
   substantial state that needs to transmit an unlock request then be flushed to the server.  Therefore,
   the server must wait should allow sufficient time for lease
   expiration before granting new locks).

   Long leases are usable if the delegation to be
   returned since it may involve numerous RPCs to the server.  If the
   server is able to store lease determine that the client is diligently flushing
   state in
   non-volatile memory.  Upon recovery, to the server can reconstruct as a result of the
   lease state from its non-volatile memory and continue operation with
   its clients and therefore long leases would not be an issue.

9.13.  Clocks, Propagation Delay, and Calculating Lease Expiration

   To avoid the need for synchronized clocks, lease times are granted by recall, the server as a MAY extend
   the usual time delta.  However, there is allowed for a requirement that the
   client and server clocks do not drift excessively over the duration
   of the lock.  There is also the issue of propagation delay across the
   network which could easily be several hundred milliseconds as well as recall.  However, the possibility that requests will be lost and need to time allowed for
   recall completion SHOULD NOT be
   retransmitted.

   To take propagation delay into account, the client should subtract it
   from lease times (e.g., if the client estimates the one-way
   propagation delay as 200 msec, then it can assume that the lease unbounded.

   An example of this is
   already 200 msec old when it gets it).  In addition, it will take
   another 200 msec responsibility to get mediate opens on a response back given
   file is delegated to the server.  So the client
   must send a lock renewal or write data back to the server 400 msec
   before the lease would expire. client (see Section 10.4).  The server's lease period configuration should take into account the
   network distance of the clients that server will be accessing
   not know what opens are in effect on the server's
   resources.  It is expected that client.  Without this
   knowledge the lease period server will take into
   account be unable to determine if the network propagation delays access and other network delay
   factors
   deny state for the client population.  Since file allows any particular open until the protocol does not allow
   delegation for an automatic method to determine an appropriate lease period, the
   server's administrator may have to tune the lease period.

9.14.  Migration, Replication and State

   When responsibility for handling a given file system is transferred has been returned.

   A client failure or a network partition can result in failure to
   respond to a new server (migration) or recall callback.  In this case, the client chooses to use an alternate server (e.g., will revoke
   the delegation which in response turn will render useless any modified state
   still on the client.

   Clients need to be aware that server unresponsiveness) in the context
   of file system replication, implementors may enforce
   practical limitations on the appropriate handling number of state shared
   between delegations issued.  Further,
   as there is no way to determine which delegations to revoke, the client and
   server (i.e., locks, leases, stateids, and
   client IDs) is as described below.  The handling differs between
   migration and replication.  For related discussion of file server
   state and recover of such see the sections under Section 9.6. allowed to revoke any.  If a server replica or a the server immigrating a filesystem agrees to,
   or is expected to, accept opaque values from the client that
   originated from implemented to
   revoke another server, delegation held by that client, then it is a wise implementation
   practice for the servers to encode the "opaque" values in network
   byte order.  This way, servers acting as replicas or immigrating
   filesystems will client may be
   able to parse values like stateids, directory
   cookies, filehandles, etc. even if their native byte order is
   different from other servers cooperating determine that a limit has been reached because each new
   delegation request results in the replication a revoke.  The client could then
   determine which delegations it may not need and
   migration of the filesystem.

9.14.1.  Migration and State preemptively release
   them.

10.2.1.  Delegation Recovery

   There are three situations that delegation recovery must deal with:

   o  Client reboot or restart

   o  Server reboot or restart (see Section 9.6.3.1)

   o  Network partition (full or callback-only)

   In the case of migration, event the servers involved in client reboots or restarts, the migration confirmation of a
   filesystem SHOULD transfer all server state from the original to the
   new server.  This must be
   SETCLIENTID done in with an nfs_client_id4 with a way that is transparent to the
   client.  This state transfer new verifier4 value
   will ease result in the client's transition when release of byte-range locks and share
   reservations.  Delegations, however, may be treated a
   filesystem migration occurs.  If the servers are successful bit
   differently.

   There will be situations in
   transferring all state, which delegations will need to be
   reestablished after a client reboots or restarts.  The reason for
   this is the client may have file data stored locally and this data
   was associated with the previously held delegations.  The client will continue
   need to use stateids
   assigned by reestablish the appropriate file state on the original server.  Therefore

   To allow for this type of client recovery, the new server must
   recognize these stateids as valid. MAY allow
   delegations to be retained after other sort of locks are released.
   This holds true implies that requests from other clients that conflict with
   these delegations will need to wait.  Because the normal recall
   process may require significant time for the client ID
   as well.  Since responsibility to flush changed
   state to the server, other clients need to be prepared for an entire filesystem is
   transferred with a migration event, there is no possibility delays
   that
   conflicts will arise on the new server as a result of the transfer of
   locks.

   As part occur because of a conflicting delegation.  In order to give
   clients a chance to get through the transfer of information between servers, reboot process during which
   leases would will not be transferred as well.  The leases being transferred to renewed, the new server will typically have a different expiration time from those for
   the same client, previously on MAY extend the old server.  To maintain period for
   delegation recovery beyond the
   property typical lease expiration period.  For
   open delegations, such delegations that all leases on are not released are
   reclaimed using OPEN with a given server claim type of CLAIM_DELEGATE_PREV.  (See
   Section 10.5 and Section 15.18 for discussion of open delegation and
   the details of OPEN respectively).

   A server MAY support a given claim type of CLAIM_DELEGATE_PREV, but if it
   does, it MUST NOT remove delegations upon SETCLIENTID_CONFIRM and
   instead MUST make them available for client expire
   at reclaim using
   CLAIM_DELEGATE_PREV.  The server MUST NOT remove the same time, delegations
   until either the server should advance client does a DELEGPURGE, or one lease period has
   elapsed from the expiration time to the later of the leases being transferred SETCLIENTID_CONFIRM or the leases already
   present.  This allows
   last successful CLAIM_DELEGATE_PREV reclaim.

   Note that the client to maintain lease renewal of both
   classes without special effort.

   The servers may choose requirement stated above is not meant to transfer the state information upon
   migration.  However, this choice is discouraged.  In this case, imply that
   when the client presents state information from the original server (e.g.,
   in a RENEW op or a READ op of zero length), the client must be
   prepared to receive either NFS4ERR_STALE_CLIENTID or
   NFS4ERR_STALE_STATEID from the new server.  The client should then
   recover its state information is no longer obliged, as it normally would in response to a
   server failure.  The new server must take care required above, to allow for the
   recovery of state information as retain
   delegation information, that it would in the event should necessarily dispose of server
   restart.

   A client SHOULD re-establish new callback information with it.
   Some specific cases are:

   o  When the new
   server as soon as possible, according to sequences described in
   Section 15.35 and Section 15.36.  This ensures that server operations
   are not blocked period is terminated by the inability to recall delegations.

9.14.2.  Replication and State

   Since client switch-over in the case occurrence of replication is not under
   server control, the handling DELEGPURGE,
      deletion of state unreclaimed delegations is different.  In this case,
   leases, stateids appropriate and client IDs do not have validity across desirable.

   o  When the period is terminated by a
   transition from one server lease period elapsing without a
      successful CLAIM_DELEGATE_PREV reclaim, and that situation appears
      to another.  The client must re-establish
   its locks on the new server.  This can be compared to the re-
   establishment of locks by means result of reclaim-type requests after a
   server reboot.  The difference is that the server network partition (i.e., lease expiration
      has no provision to
   distinguish requests reclaiming locks from those obtaining new occurred), a server's lease expiration approach, possibly
      including the use of courtesy locks
   or to defer would normally provide for the latter.  Thus, a client re-establishing a lock on
      retention of unreclaimed delegations.  Even in the
   new server (by means event that
      lease cancellation occurs, such delegation should be reclaimed
      using CLAIM_DELEGATE_PREV as part of a LOCK or OPEN request), may have network partition recovery.

   o  When the
   requests denied due to a conflicting lock.  Since replication period of non-communicating is
   intended for read-only followed by a client
      reboot, unreclaimed delegations, should also be reclaimable by use
      of filesystems, such denial CLAIM_DELEGATE_PREV as part of locks
   should not pose large difficulties in practice. client reboot recovery.

   o  When an attempt to
   re-establish the period is terminated by a lock on lease period elapsing without a new server
      successful CLAIM_DELEGATE_PREV reclaim, and lease renewal is denied,
      occurring, the client should
   treat server may well conclude that unreclaimed
      delegations have been abandoned, and consider the situation as if his original lock had been revoked.

9.14.3.  Notification of Migrated Lease

   In the case one
      in which an implied DELEGPURGE should be assumed.

   A server that supports a claim type of lease renewal, CLAIM_DELEGATE_PREV MUST
   support the client may not be submitting
   requests for DELEGPURGE operation, and similarly a filesystem server that has been migrated to another server.
   This can occur because of
   supports DELEGPURGE MUST support CLAIM_DELEGATE_PREV.  A server which
   does not support CLAIM_DELEGATE_PREV MUST return NFS4ERR_NOTSUPP if
   the implicit lease renewal mechanism.  The client renews leases attempts to use that feature or performs a DELEGPURGE
   operation.

   Support for all filesystems when submitting a request claim type of CLAIM_DELEGATE_PREV, is often referred to
   any one filesystem at the server.

   In order
   as providing for "client-persistent delegations" in that they allow
   use of client persistent storage on the client to schedule renewal of leases that may have
   been relocated to the new server, store data written
   by the client, even across a client must find out about
   lease relocation before those leases expire.  To accomplish this, all
   operations which implicitly renew leases for a restart.  It should be noted
   that, with the optional exception noted below, this feature requires
   persistent storage to be used on the client (such as OPEN,
   CLOSE, READ, WRITE, RENEW, LOCK, and others), will return does not add to
   persistent storage requirements on the error
   NFS4ERR_LEASE_MOVED if responsibility server.

   One good way to think about client-persistent delegations is that for any of
   the leases most part, they function like "courtesy locks", with special
   semantic adjustments to allow them to be
   renewed has been transferred retained across a client
   restart, which cause all other sorts of locks to be freed.  Such
   locks are generally not retained across a new server.  This condition will
   continue until server restart.  The one
   exception is the case of simultaneous failure of the client receives an NFS4ERR_MOVED error and the
   server receives and is discussed below.

   When the subsequent GETATTR(fs_locations) for an access to
   each filesystem for which a lease has been moved server indicates support of CLAIM_DELEGATE_PREV (implicitly)
   by returning NFS_OK to DELEGPURGE, a new server.  By
   convention, the compound including the GETATTR(fs_locations) SHOULD
   append client with a RENEW operation write delegation,
   can use write-back caching for data to permit the server be written to identify the client
   doing the access.

   Upon receiving server,
   deferring the NFS4ERR_LEASE_MOVED error, a client that supports
   filesystem migration MUST probe all filesystems from that server on
   which it holds open state.  Once write-back, until such time as the delegation is
   recalled, possibly after intervening client has successfully probed
   all those filesystems which are migrated, restarts.  Similarly,
   when the server MUST resume
   normal handling indicates support of stateful requests from that client.

   In order CLAIM_DELEGATE_PREV, a client
   with a read delegation and an open-for-write subordinate to support legacy clients that do not handle the
   NFS4ERR_LEASE_MOVED error correctly,
   delegation, may be sure of the server SHOULD time out after
   a wait integrity of at least two lease periods, at which time it will resume
   normal handling its persistently cached
   copy of stateful requests from all clients.  If the file after a client
   attempts to access the migrated files, restart without specific verification
   of the server MUST reply
   NFS4ERR_MOVED. change attribute.

   When the client receives an NFS4ERR_MOVED error, the client can
   follow the normal process to obtain the new server information
   (through reboots or restarts, delegations are reclaimed (using
   the fs_locations attribute) OPEN operation with CLAIM_PREVIOUS) in a similar fashion to byte-
   range locks and perform renewal of those
   leases on share reservations.  However, there is a slight
   semantic difference.  In the new server.  If normal case, if the server has decides that
   a delegation should not had state
   transferred to be granted, it transparently, the client will receive either
   NFS4ERR_STALE_CLIENTID or NFS4ERR_STALE_STATEID from performs the new server,
   as described above.  The client can then recover state information as
   it does in requested action
   (e.g., OPEN) without granting any delegation.  For reclaim, the event of
   server failure.

9.14.4.  Migration and grants the Lease_time Attribute

   In order delegation but a special designation is applied so
   that the client may appropriately manage its leases in treats the
   case delegation as having been granted but
   recalled by the server.  Because of migration, this, the destination server must establish proper
   values for client has the lease_time attribute.

   When duty to
   write all modified state is transferred transparently, that state should include to the correct value of server and then return the lease_time attribute.  The lease_time
   attribute on
   delegation.  This process of handling delegation reclaim reconciles
   three principles of the destination NFSv4 protocol:

   o  Upon reclaim, a client claiming resources assigned to it by an
      earlier server instance must never be less than that on granted those resources.

   o  The server has unquestionable authority to determine whether
      delegations are to be granted and, once granted, whether they are
      to be continued.

   o  The use of callbacks is not to be depended upon until the source since this would result in premature expiration client
      has proven its ability to receive them.

   When a client has more than a single open associated with a
   delegation, state for those additional opens can be established using
   OPEN operations of leases
   granted by type CLAIM_DELEGATE_CUR.  When these are used to
   establish opens associated with reclaimed delegations, the source server.  Upon migration server
   MUST allow them when made within the grace period.

   Situations in which state there is
   transferred transparently, the a series of client and server restarts
   where there is under no obligation to re-
   fetch restart of both at the lease_time attribute same time, are dealt with
   via a combination of CLAIM_DELEGATE_PREV and may continue to use the value
   previously fetched (on the source server).

   If state has not been transferred transparently (i.e., CLAIM_PREVIOUS reclaim
   cycles.  Persistent storage is needed only on the client.  For each
   server failure, a CLAIM_PREVIOUS reclaim cycle is done, while for
   each client
   sees restart, a real or simulated server reboot), CLAIM_DELEGATE_PREV reclaim cycle is done.

   To deal with the possibility of simultaneous failure of client should fetch and
   server (e.g., a data center power outage), the
   value of lease_time on server MAY
   persistently store delegation information so that it can respond to a
   CLAIM_DELEGATE_PREV reclaim request which it receives from a
   restarting client.  This is the new (i.e., destination) server, and use one case in which persistent
   delegation state can be retained across a server restart.  A server
   is not required to store this information, but if it does do so, it
   should do so for subsequent locking requests.  However write delegations and for read delegations, during
   the pendency of which (across multiple client and/or server must respect a
   grace period at least as long
   instances), some open-for-write was done as part of delegation.  When
   the lease_time on space to persistently record such information is limited, the source server,
   server should recall delegations in order to ensure that clients have ample time this class in preference to reclaim their
   locks before potentially conflicting non-reclaimed locks
   keeping them active without persistent storage recording.

   When a network partition occurs, delegations are granted.
   The means subject to freeing
   by which the new server obtains the value of lease_time on when the old server lease renewal period expires.  This is left similar
   to the server implementations.  It is not
   specified behavior for locks and share reservations, and, as for locks
   and share reservations it may be modified by support for "courtesy
   locks" in which locks are not freed in the NFS version 4 protocol.

10.  Client-Side Caching

   Client-side caching of data, absence of file attributes, a conflicting
   lock request.  Whereas, for locks and share reservations, freeing of file names is
   essential to providing good performance with the NFS protocol.
   Providing distributed cache coherence is a difficult problem and
   previous versions
   locks will occur immediately upon the appearance of a conflicting
   request, for delegations, the NFS protocol have not attempted it.
   Instead, several NFS client implementation techniques have been used
   to reduce server may institute period during
   which conflicting requests are held off.  Eventually the problems that occurrence
   of a lack conflicting request from another client will cause revocation of coherence poses for users.
   These techniques have not been clearly defined
   the delegation.

   A loss of the callback path (e.g., by earlier protocol
   specifications and it is often unclear what is valid or invalid
   client behavior.

   The NFSv4 protocol uses many techniques similar to those that later network configuration
   change) will have
   been used a similar effect in previous protocol versions.  The NFSv4 protocol does not
   provide distributed cache coherence.  However, that it defines a more
   limited set can also result in
   revocation of caching guarantees to allow locks a delegation A recall request will fail and share
   reservations to be used without destructive interference from client
   side caching.

   In addition, revocation
   of the NFSv4 protocol introduces a delegation mechanism
   which allows many decisions will result.

   A client normally made by the server to be made
   locally by clients.  This mechanism provides efficient support finds out about revocation of a delegation when it
   uses a stateid associated with a delegation and receives one of the
   common cases where sharing is infrequent
   errors NFS4ERR_EXPIRED, NFS4ERR_BAD_STATEID, or where sharing is read-
   only.

10.1.  Performance Challenges for Client-Side Caching

   Caching techniques used in previous versions of NFS4ERR_ADMIN_REVOKED
   (NFS4ERR_EXPIRED indicates that all lock state associated with the NFS protocol have
   client has been successful in providing good performance.  However, several
   scalability challenges can arise lost).  It also may find out about delegation
   revocation after a client reboot when those techniques are used with
   very large numbers it attempts to reclaim a
   delegation and receives NFS4ERR_EXPIRED.  Note that in the case of clients.  This is particularly true when
   clients a
   revoked OPEN_DELEGATE_WRITE delegation, there are geographically distributed which classically increases issues because data
   may have been modified by the latency client whose delegation is revoked and
   separately by other clients.  See Section 10.5.1 for cache re-validation requests.

   The previous versions a discussion of
   such issues.  Note also that when delegations are revoked,
   information about the NFS protocol repeat their file data
   cache validation requests at the time revoked delegation will be written by the file is opened.
   server to stable storage (as described in Section 9.6).  This
   behavior can have serious performance drawbacks.  A common case is
   one done
   to deal with the case in which a file is only accessed by server reboots after revoking a single client.  Therefore,
   sharing
   delegation but before the client holding the revoked delegation is infrequent.

   In this case, repeated reference to
   notified about the server to find revocation.

   Note that no
   conflicts exist is expensive.  A better option with regards to
   performance when there is to allow a client that repeatedly opens loss of a file to do
   so without reference delegation, due to the server.  This is done until potentially
   conflicting operations from another client actually occur.

   A similar situation arises a network
   partition in connection which all locks associated with file locking.  Sending
   file lock and unlock requests to the server as well as lease are lost, the read and
   write requests necessary to make data caching consistent with
   client will also receive the
   locking semantics (see Section 10.3.2) error NFS4ERR_EXPIRED.  This case can severely limit
   performance.  When locking be
   distinguished from other situations in which delegations are revoked
   by seeing that the associated clientid becomes invalid so that
   NFS4ERR_STALE_CLIENTID is used to provide protection against
   infrequent conflicts, a large penalty returned when it is incurred.  This penalty may
   discourage the use of file locking by applications.

   The NFSv4 protocol provides more aggressive caching strategies with used.

   When NFS4ERR_EXPIRED is returned, the following design goals:

   o  Compatibility with a large range of server semantics.

   o  Provide the same caching benefits as previous versions of MAY retain information
   about the NFS
      protocol when unable to provide delegations held by the more aggressive model.

   o  Requirements for aggressive caching are organized so client, deleting those that are
   invalidated by a large
      portion of conflicting request.  Retaining such information
   will allow the benefit can be obtained even when not client to recover all of non-invalidated delegations
   using the
      requirements can be met.

   The appropriate requirements for claim type CLAIM_DELEGATE_PREV, once the server are discussed in later
   sections in which specific forms of caching are covered (see
   Section 10.4).

10.2.  Delegation and Callbacks

   Recallable delegation of server responsibilities for a file
   SETCLIENTID_CONFIRM is done to recover.  Attempted recovery of a
   delegation that the client improves performance has no record of, typically because they
   were invalidated by avoiding repeated requests to the
   server in conflicting requests, will get the absence of inter-client conflict.  With error
   NFS4ERR_BAD_RECLAIM.  Once a reclaim is attempted for all delegations
   that the use of client held, it SHOULD do a
   "callback" RPC from server DELEGPURGE to client, a allow any
   remaining server recalls delegated
   responsibilities when another client engages in sharing of a
   delegated file.

   A delegation is passed from the server information to be freed.

10.3.  Data Caching

   When applications share access to the client, specifying the
   object of the delegation and the type of delegation.  There are
   different types of delegations but each type contains a stateid set of files, they need to be
   used
   implemented so as to represent the delegation when performing operations that
   depend on take account of the delegation. possibility of conflicting
   access by another application.  This stateid is similar to those
   associated with locks and share reservations but differs true whether the applications
   in that question execute on different clients or reside on the
   stateid for a delegation is associated with a client ID same
   client.

   Share reservations and may be
   used on behalf of all byte-range locks are the open-owners for facilities the given client.  A
   delegation is made NFS
   version 4 protocol provides to allow applications to coordinate
   access by providing mutual exclusion facilities.  The NFSv4
   protocol's data caching must be implemented such that it does not
   invalidate the client as a whole assumptions that those using these facilities depend
   upon.

10.3.1.  Data Caching and OPENs

   In order to avoid invalidating the sharing assumptions that
   applications rely on, NFSv4 clients should not provide cached data to any specific
   process
   applications or thread modify it on behalf of control within it.

   Because callback RPCs may an application when it would
   not work in all environments (due be valid to
   firewalls, for example), correct protocol operation does not depend
   on them.  Preliminary testing of callback functionality by means of a
   CB_NULL procedure determines whether callbacks can be supported.  The
   CB_NULL procedure checks the continuity of the callback path.  A
   server makes a preliminary assessment of callback availability to a
   given client and avoids delegating responsibilities until it has
   determined obtain or modify that callbacks are supported.  Because the granting of same data via a
   delegation is always conditional upon READ or WRITE
   operation.

   Furthermore, in the absence of conflicting
   access, clients must not assume that a open delegation will be granted (see Section 10.4) two
   additional rules apply.  Note that these rules are obeyed in practice
   by many NFSv2 and
   they NFSv3 clients.

   o  First, cached data present on a client must always be prepared for OPENs to be processed without any
   delegations being granted.

   Once granted, a delegation behaves in most ways like a lock.  There
   is revalidated after
      doing an associated lease OPEN.  Revalidating means that is subject to renewal together with all
   of the other leases held by that client.

   Unlike locks, an operation by a second client to a delegated file
   will cause the server to recall a delegation through a callback.

   On recall, fetches the client holding
      change attribute from the delegation must flush modified
   state (such as modified data) to server, compares it with the server cached
      change attribute, and return if different, declares the
   delegation.  The conflicting request will not be acted on until cached data (as
      well as the
   recall cached attributes) as invalid.  This is complete.  The recall to ensure that
      the data for the OPENed file is considered complete when still correctly reflected in the
   client returns
      client's cache.  This validation must be done at least when the delegation
      client's OPEN operation includes DENY=WRITE or BOTH thus
      terminating a period in which other clients may have had the server times its wait for
      opportunity to open the
   delegation file with WRITE access.  Clients may
      choose to be returned and revokes do the delegation as a result revalidation more often (i.e., at OPENs
      specifying DENY=NONE) to parallel the NFSv3 protocol's practice
      for the benefit of users assuming this degree of cache
      revalidation.  Since the timeout.  In change attribute is updated for data and
      metadata modifications, some client implementors may be tempted to
      use the interim, time_modify attribute and not the server will either delay responding change attribute to conflicting requests or respond
      validate cached data, so that metadata changes do not spuriously
      invalidate clean data.  The implementor is cautioned in this
      approach.  The change attribute is guaranteed to change for each
      update to them with NFS4ERR_DELAY.
   Following the resolution of the recall, file, whereas time_modify is guaranteed to change
      only at the server has granularity of the
   information necessary to grant or deny time_delta attribute.  Use by the second
      client's request.

   At data cache validation logic of time_modify and not the time
      change attribute runs the risk of the client receives a delegation recall, it may have
   substantial state that needs to incorrectly marking
      stale data as valid.

   o  Second, modified data must be flushed to the server.  Therefore,
   the server should allow sufficient time before closing
      a file OPENed for the delegation to be
   returned since it may involve numerous RPCs write.  This is complementary to the server. first rule.
      If the
   server data is able to determine that not flushed at CLOSE, the revalidation done after
      the client OPENs a file is diligently unable to achieve its purpose.  The
      other aspect to flushing
   state the data before close is that the data
      must be committed to stable storage, at the server as a result of server, before the recall,
      CLOSE operation is requested by the server may extend client.  In the usual time allowed for case of a recall.  However, the time allowed for
   recall completion should
      server reboot or restart and a CLOSEd file, it may not be unbounded.

   An example of possible
      to retransmit the data to be written to the file.  Hence, this is when responsibility
      requirement.

10.3.2.  Data Caching and File Locking

   For those applications that choose to mediate opens on a given use file locking instead of
   share reservations to exclude inconsistent file access, there is delegated an
   analogous set of constraints that apply to a client (see Section 10.4).  The server will
   not know what opens side data caching.
   These rules are in effect on effective only if the client.  Without this
   knowledge file locking is used in a way
   that matches in an equivalent way the server will be unable actual READ and WRITE
   operations executed.  This is as opposed to determine if file locking that is
   based on pure convention.  For example, it is possible to manipulate
   a two-megabyte file by dividing the file into two one-megabyte
   regions and protecting access to the two regions by file locks on
   bytes zero and
   deny state one.  A lock for write on byte zero of the file allows any particular open until would
   represent the
   delegation right to do READ and WRITE operations on the first
   region.  A lock for write on byte one of the file has been returned.

   A client failure or a network partition can result in failure to
   respond would represent the
   right to a recall callback.  In this case, do READ and WRITE operations on the server will revoke second region.  As long
   as all applications manipulating the delegation which in turn file obey this convention, they
   will render useless any modified state
   still work on the client.

   Clients need to be aware that server implementors a local file system.  However, they may enforce
   practical limitations on not work with
   the number of delegations issued.  Further,
   as there is no way to determine which delegations to revoke, NFSv4 protocol unless clients refrain from data caching.

   The rules for data caching in the
   server is allowed file locking environment are:

   o  First, when a client obtains a file lock for a particular region,
      the data cache corresponding to revoke any. that region (if any cached data
      exists) must be revalidated.  If the server is implemented to
   revoke another delegation held by change attribute indicates
      that client, then the client file may be
   able to determine that a limit has have been reached because each new
   delegation request results in a revoke.  The updated since the cached data was
      obtained, the client could then
   determine which delegations it may not need and preemptively release
   them.

10.2.1.  Delegation Recovery

   There are three situations that delegation recovery must deal with:

   o  Client reboot or restart

   o  Server reboot or restart

   o  Network partition (full flush or callback-only)

   In invalidate the event cached data for
      the newly locked region.  A client reboots or restarts, the confirmation of a
   SETCLIENTID done with an nfs_client_id4 with a new verifier4 value
   will result in might choose to invalidate all
      of non-modified cached data that it has for the release file but the only
      requirement for correct operation is to invalidate all of byte-range locks and share
   reservations.  Delegations, however, may be treated a bit
   differently.

   There will be situations the data
      in which delegations will need to be
   reestablished after the newly locked region.

   o  Second, before releasing a client reboots or restarts.  The reason write lock for
   this is a region, all modified
      data for that region must be flushed to the client may have file server.  The modified
      data stored locally must also be written to stable storage.

   Note that flushing data to the server and this the invalidation of cached
   data
   was associated with must reflect the previously held delegations.  The actual byte ranges locked or unlocked.
   Rounding these up or down to reflect client cache block boundaries
   will
   need cause problems if not carefully done.  For example, writing a
   modified block when only half of that block is within an area being
   unlocked may cause invalid modification to reestablish the appropriate file state on region outside the server.

   To allow for
   unlocked area.  This, in turn, may be part of a region locked by
   another client.  Clients can avoid this type situation by synchronously
   performing portions of client recovery, write operations that overlap that portion
   (initial or final) that is not a full block.  Similarly, invalidating
   a locked area which is not an integral number of full buffer blocks
   would require the server MAY allow
   delegations client to be retained after other sort of locks are released.
   This implies that requests read one or two partial blocks from other clients the
   server if the revalidation procedure shows that conflict with
   these delegations will need to wait.  Because the normal recall
   process may require significant time for data which the
   client possesses may not be valid.

   The data that is written to flush changed
   state the server as a prerequisite to the
   unlocking of a region must be written, at the server, other clients need to be prepared for delays
   that occur stable
   storage.  The client may accomplish this either with synchronous
   writes or by following asynchronous writes with a COMMIT operation.
   This is required because retransmission of the modified data after a conflicting delegation.  In order to give
   clients
   server reboot might conflict with a chance lock held by another client.

   A client implementation may choose to get through the reboot process during accommodate applications which
   leases will not be renewed,
   use byte-range locking in non-standard ways (e.g., using a byte-range
   lock as a global semaphore) by flushing to the server MAY extend more data upon
   a LOCKU than is covered by the period locked range.  This may include
   modified data within files other than the one for
   delegation recovery beyond which the typical lease expiration period.  For
   open delegations, such delegations that unlocks
   are being done.  In such cases, the client must not released are
   reclaimed using OPEN interfere with a claim type of CLAIM_DELEGATE_PREV.  (See
   Section 10.5
   applications whose READs and Section 15.18 for discussion WRITEs are being done only within the
   bounds of open delegation and record locks which the details application holds.  For example, an
   application locks a single byte of OPEN respectively).

   A server MAY support a claim type of CLAIM_DELEGATE_PREV, but if it
   does, it MUST NOT remove delegations upon SETCLIENTID_CONFIRM file and
   instead MUST make them available for client reclaim using
   CLAIM_DELEGATE_PREV.  The server MUST NOT remove the delegations
   until either the proceeds to write that
   single byte.  A client does that chose to handle a DELEGPURGE, or one lease period has
   elapsed from the time the later of the SETCLIENTID_CONFIRM or LOCKU by flushing all
   modified data to the
   last successful CLAIM_DELEGATE_PREV reclaim.

   Note server could validly write that the requirement stated above is single byte in
   response to an unrelated unlock.  However, it would not meant be valid to imply that
   when
   write the client is no longer obliged, as required above, to retain
   delegation information, entire block in which that single written byte was located
   since it should necessarily dispose of it.

   Some specific cases are:

   o  When the period includes an area that is terminated not locked and might be locked by the occurrence of DELEGPURGE,
      deletion of unreclaimed delegations is
   another client.  Client implementations can avoid this problem by
   dividing files with modified data into those for which all
   modifications are done to areas covered by an appropriate byte-range
   lock and desirable.

   o  When the period is terminated those for which there are modifications not covered by a lease period elapsing without a
      successful CLAIM_DELEGATE_PREV reclaim,
   byte-range lock.  Any writes done for the former class of files must
   not include areas not locked and that situation appears
      to be thus not modified on the result client.

10.3.3.  Data Caching and Mandatory File Locking

   Client side data caching needs to respect mandatory file locking when
   it is in effect.  The presence of mandatory file locking for a network partition (i.e., lease expiration given
   file is indicated when the client gets back NFS4ERR_LOCKED from a
   READ or WRITE on a file it has occurred), an appropriate share reservation for.
   When mandatory locking is in effect for a server's lease expiration approach, possibly
      including file, the use of courtesy locks would normally provide client must check
   for an appropriate file lock for data being read or written.  If a
   lock exists for the
      retention of unreclaimed delegations.  Even in range being read or written, the event that
      lease cancellation occurs, such delegation should be reclaimed client may
   satisfy the request using CLAIM_DELEGATE_PREV as part of network partition recovery.

   o  When the period of non-communicating client's validated cache.  If an
   appropriate file lock is followed by a client
      reboot, unreclaimed delegations, should also be reclaimable by use
      of CLAIM_DELEGATE_PREV as part not held for the range of client reboot recovery.

   o  When the period is terminated read or write,
   the read or write request must not be satisfied by a lease period elapsing without a
      successful CLAIM_DELEGATE_PREV reclaim, and lease renewal is
      occurring, the server may well conclude that unreclaimed
      delegations have been abandoned, client's cache
   and consider the situation as one
      in which an implied DELEGPURGE should request must be assumed.

   A sent to the server that supports for processing.  When a claim type of CLAIM_DELEGATE_PREV MUST
   support
   read or write request partially overlaps a locked region, the DELEGPURGE operation, request
   should be subdivided into multiple pieces with each region (locked or
   not) treated appropriately.

10.3.4.  Data Caching and similarly a server that
   supports DELEGPURGE MUST support CLAIM_DELEGATE_PREV.  A server File Identity

   When clients cache data, the file data needs to be organized
   according to the file system object to which
   does not support CLAIM_DELEGATE_PREV MUST return NFS4ERR_NOTSUPP if the client attempts data belongs.  For
   NFSv3 clients, the typical practice has been to use that feature or performs a DELEGPURGE
   operation.

   Support assume for a claim type the
   purpose of CLAIM_DELEGATE_PREV, is often referred to
   as providing for "client-persistent delegations" in caching that they allow
   use of distinct filehandles represent distinct file
   system objects.  The client persistent storage on then has the client choice to store organize and
   maintain the data written
   by cache on this basis.

   In the client, even across a client restart.  It should be noted
   that, with NFSv4 protocol, there is now the optional exception noted below, this feature requires
   persistent storage possibility to have
   significant deviations from a "one filehandle per object" model
   because a filehandle may be used constructed on the client and does not add to
   persistent storage requirements on basis of the server.

   One good way object's
   pathname.  Therefore, clients need a reliable method to think about client-persistent delegations is that for determine if
   two filehandles designate the most part, they function like "courtesy locks", with a special
   semantic adjustments same file system object.  If clients
   were simply to allow them assume that all distinct filehandles denote distinct
   objects and proceed to be retained across a do data caching on this basis, caching
   inconsistencies would arise between the distinct client
   restart, side objects
   which cause all other sorts of locks mapped to be freed.  Such
   locks are generally not retained across a the same server restart.  The one
   exception is side object.

   By providing a method to differentiate filehandles, the case of simultaneous failure of NFSv4
   protocol alleviates a potential functional regression in comparison
   with the NFSv3 protocol.  Without this method, caching
   inconsistencies within the same client could occur and
   server and is discussed below.

   When the server indicates support this has not
   been present in previous versions of CLAIM_DELEGATE_PREV (implicitly)
   by returning NFS_OK the NFS protocol.  Note that it
   is possible to DELEGPURGE, a client have such inconsistencies with a write delegation,
   can use write-back caching for data to be written to applications executing
   on multiple clients but that is not the server,
   deferring issue being addressed here.

   For the write-back, until such time as purposes of data caching, the delegation is
   recalled, possibly after intervening following steps allow an NFSv4
   client restarts.  Similarly,
   when to determine whether two distinct filehandles denote the same
   server indicates support side object:

   o  If GETATTR directed to two filehandles returns different values of CLAIM_DELEGATE_PREV, a client
      the fsid attribute, then the filehandles represent distinct
      objects.

   o  If GETATTR for any file with a read delegation and an open-for-write subordinate to fsid that
   delegation, may be sure of matches the integrity of its persistently cached
   copy fsid of the file after
      two filehandles in question returns a client restart without specific verification unique_handles attribute
      with a value of TRUE, then the change attribute.

   When the server reboots or restarts, delegations two objects are reclaimed (using
   the OPEN operation with CLAIM_PREVIOUS) in a similar fashion distinct.

   o  If GETATTR directed to byte-
   range locks and share reservations.  However, there is a slight
   semantic difference.  In the normal case, if two filehandles does not return the server decides
      fileid attribute for both of the handles, then it cannot be
      determined whether the two objects are the same.  Therefore,
      operations which depend on that
   a delegation should not knowledge (e.g., client side data
      caching) cannot be granted, done reliably.  Note that if GETATTR does not
      return the fileid attribute for both filehandles, it performs will return
      it for neither of the requested action
   (e.g., OPEN) without granting any delegation.  For reclaim, filehandles, since the fsid for both
      filehandles is the same.

   o  If GETATTR directed to the two filehandles returns different
      values for the fileid attribute, then they are distinct objects.

   o  Otherwise they are the same object.

10.4.  Open Delegation

   When a file is being OPENed, the server grants may delegate further handling
   of opens and closes for that file to the opening client.  Any such
   delegation but a special designation is applied so recallable, since the circumstances that allowed for
   the client treats delegation are subject to change.  In particular, the server may
   receive a conflicting OPEN from another client, the server must
   recall the delegation as having been granted but
   recalled by before deciding whether the server.  Because of this, OPEN from the other
   client has the duty to
   write all modified state may be granted.  Making a delegation is up to the server and then return the
   clients should not assume that any particular OPEN either will or
   will not result in an open delegation.  This process of handling delegation reclaim reconciles
   three principles  The following is a typical
   set of the NFSv4 protocol: conditions that servers might use in deciding whether OPEN
   should be delegated:

   o  Upon reclaim, a  The client reporting resources assigned to it by an
      earlier server instance must be granted those resources.

   o  The server has unquestionable authority to determine whether
      delegations are able to be granted and, once granted, whether they are respond to be continued.

   o the server's callback
      requests.  The server will use of callbacks is not to be depended upon until the client
      has proven its ability to receive them.

   When CB_NULL procedure for a test of
      callback ability.

   o  The client has more than a single must have responded properly to previous recalls.

   o  There must be no current open associated conflicting with a
   delegation, state for those additional opens can the requested
      delegation.

   o  There should be established using
   OPEN operations of type CLAIM_DELEGATE_CUR.  When these are used to
   establish opens associated no current delegation that conflicts with reclaimed delegations, the server
   MUST allow them when made within
      delegation being requested.

   o  The probability of future conflicting open requests should be low
      based on the grace period.

   Situations in which there us a series recent history of client and server restarts
   where there is no restart the file.

   o  The existence of both at any server-specific semantics of OPEN/CLOSE that
      would make the same time, are dealt required handling incompatible with
   via a combination the prescribed
      handling that the delegated client would apply (see below).

   There are two types of CLAIM_DELEGATE_PREV open delegations, OPEN_DELEGATE_READ and CLAIM_PREVIOUS reclaim
   cycles.  Persistent storage is needed only on the client.  For each
   server failure,
   OPEN_DELEGATE_WRITE.  A OPEN_DELEGATE_READ delegation allows a CLAIM_PREVIOUS reclaim cycle is done, while for
   each client restart,
   to handle, on its own, requests to open a CLAIM_DELEGATE_PREV reclaim cycle file for reading that do
   not deny read access to others.  It MUST, however, continue to send
   all requests to open a file for writing to the server.  Multiple
   OPEN_DELEGATE_READ delegations may be outstanding simultaneously and
   do not conflict.  A OPEN_DELEGATE_WRITE delegation allows the client
   to handle, on its own, all opens.  Only one OPEN_DELEGATE_WRITE
   delegation may exist for a given file at a given time and it is done.

   To deal
   inconsistent with any OPEN_DELEGATE_READ delegations.

   When a single client holds a OPEN_DELEGATE_READ delegation, it is
   assured that no other client may modify the possibility of simultaneous failure contents or attributes of
   the file.  If more than one client and
   server (e.g., a data center power outage), holds an OPEN_DELEGATE_READ
   delegation, then the server MAY
   persistently store delegation information so contents and attributes of that it can respond file are not
   allowed to change.  When a
   CLAIM_DELEGATE_PREV reclaim request which client has an OPEN_DELEGATE_WRITE
   delegation, it receives from a
   restarting client.  This is may modify the one case in which persistent
   delegation state can file data since no other client will be retained across
   accessing the file's data.  The client holding a server restart.  A server
   is OPEN_DELEGATE_WRITE
   delegation may only affect file attributes which are intimately
   connected with the file data: size, time_modify, change.

   When a client has an open delegation, it does not required send OPENs or
   CLOSEs to store this information, the server but if it does do so, it
   should do so updates the appropriate status internally.
   For a OPEN_DELEGATE_READ delegation, opens that cannot be handled
   locally (opens for write delegations and for or that deny read delegations, during access) must be sent to
   the pendency of which (across multiple client and/or server
   instances), some open-for-write was done as part of delegation. server.

   When
   the space to persistently record such information an open delegation is limited, made, the
   server should recall delegations in this class in preference to
   keeping them active without persistent storage recording.

   When a network partition occurs, delegations are subject response to freeing
   by the server when OPEN contains an
   open delegation structure which specifies the lease renewal period expires.  This is similar following:

   o  the type of delegation (read or write)

   o  space limitation information to control flushing of data on close
      (OPEN_DELEGATE_WRITE delegation only, see Section 10.4.1)

   o  an nfsace4 specifying read and write permissions

   o  a stateid to represent the behavior delegation for locks READ and share reservations, and, as for locks WRITE

   The delegation stateid is separate and share reservations it may be modified by support distinct from the stateid for "courtesy
   locks" in which locks are not freed in
   the absence of OPEN proper.  The standard stateid, unlike the delegation
   stateid, is associated with a conflicting
   lock request.  Whereas, for locks particular open-owner and share reservations, freeing of
   locks will occur immediately upon the appearance of a conflicting
   request, for delegations, continue
   to be valid after the server may institute period during
   which conflicting requests are held off.  Eventually delegation is recalled and the occurrence
   of file remains
   open.

   When a conflicting request from another internal to the client is made to open a file and open
   delegation is in effect, it will cause revocation of be accepted or rejected solely on
   the delegation.

   A loss basis of the callback path (e.g., by later network configuration
   change) will have a similar effect in that it can also following conditions.  Any requirement for other
   checks to be made by the delegate should result in
   revocation of a open delegation A recall
   being denied so that the checks can be made by the server itself.

   o  The access and deny bits for the request will fail and revocation
   of the delegation will result.

   A client normally finds out about revocation of a delegation when it
   uses a stateid associated file as described
      in Section 9.9.

   o  The read and write permissions as determined below.

   The nfsace4 passed with a delegation and receives one of can be used to avoid frequent
   ACCESS calls.  The permission check should be as follows:

   o  If the
   errors NFS4ERR_EXPIRED, NFS4ERR_BAD_STATEID, or NFS4ERR_ADMIN_REVOKED
   (NFS4ERR_EXPIRED nfsace4 indicates that all lock state associated with the
   client has been lost).  It also open may find out about delegation
   revocation after a client reboot when be done, then it attempts should
      be granted without reference to reclaim a
   delegation and receives NFS4ERR_EXPIRED.  Note the server.

   o  If the nfsace4 indicates that in the case of a
   revoked OPEN_DELEGATE_WRITE delegation, there are issues because data open may have been modified by not be done, then an
      ACCESS request must be sent to the client whose delegation server to obtain the definitive
      answer.

   The server may return an nfsace4 that is revoked and
   separately by other clients.  See Section 10.5.1 for a discussion more restrictive than the
   actual ACL of
   such issues. the file.  This includes an nfsace4 that specifies
   denial of all access.  Note also that when delegations are revoked,
   information about the revoked delegation will be written by some common practices such as
   mapping the
   server traditional user "root" to stable storage (as described in Section 9.6).  This is done the user "nobody" may make it
   incorrect to deal with return the case in which a server reboots after revoking a
   delegation but before actual ACL of the client holding file in the revoked delegation is
   notified about
   response.

   The use of delegation together with various other forms of caching
   creates the revocation.

   Note possibility that when there is a loss of a delegation, due to no server authentication will ever be
   performed for a network
   partition in which given user since all locks associated with of the lease are lost, user's requests might be
   satisfied locally.  Where the client will also receive is depending on the error NFS4ERR_EXPIRED. server for
   authentication, the client should be sure authentication occurs for
   each user by use of the ACCESS operation.  This should be the case can
   even if an ACCESS operation would not be
   distinguished from other situations in which delegations are revoked required otherwise.  As
   mentioned before, the server may enforce frequent authentication by seeing that
   returning an nfsace4 denying all access with every open delegation.

10.4.1.  Open Delegation and Data Caching

   OPEN delegation allows much of the message overhead associated clientid becomes invalid so that
   NFS4ERR_STALE_CLIENTID is returned with
   the opening and closing files to be eliminated.  An open when it an open
   delegation is used.

   When NFS4ERR_EXPIRED Is returned, in effect does not require that a validation message be
   sent to the server MAY retain information
   about unless there exists a potential for conflict with
   the delegations held by requested share mode.  The continued endurance of the client, deleting those
   "OPEN_DELEGATE_READ delegation" provides a guarantee that are
   invalidated by no OPEN for
   write and thus no write has occurred that did not originate from this
   client.  Similarly, when closing a conflicting request.  Retaining such information
   will allow file opened for write and if
   OPEN_DELEGATE_WRITE delegation is in effect, the client data written does
   not have to be flushed to recover all non-invalidated delegations
   using the claim type CLAIM_DELEGATE_PREV, once server until the
   SETCLIENTID_CONFIRM open delegation is done to recover.  Attempted recovery
   recalled.  The continued endurance of a the open delegation provides a
   guarantee that the client has no record of, typically because they
   were invalidated open and thus no read or write has been done by conflicting requests, will get
   another client.

   For the error
   NFS4ERR_BAD_RECLAIM.  Once purposes of open delegation, READs and WRITEs done without an
   OPEN are treated as the functional equivalents of a reclaim is attempted for all delegations corresponding
   type of OPEN.  This refers to the READs and WRITEs that use the client held, it SHOULD do
   special stateids consisting of all zero bits or all one bits.
   Therefore, READs or WRITEs with a DELEGPURGE to allow any
   remaining special stateid done by another
   client will force the server delegation information to be freed.

10.3.  Data Caching

   When applications share access to recall a set OPEN_DELEGATE_WRITE
   delegation.  A WRITE with a special stateid done by another client
   will force a recall of files, they need OPEN_DELEGATE_READ delegations.

   With delegations, a client is able to be
   implemented so as avoid writing data to take account of the possibility
   server when the CLOSE of conflicting
   access by another application.  This a file is serviced.  The file close system
   call is true whether the applications
   in question execute on different clients or reside on the same
   client.

   Share reservations and byte-range locks are usual point at which the facilities client is notified of a lack of
   stable storage for the NFS
   version 4 protocol provides to allow applications to coordinate
   access modified file data generated by providing mutual exclusion facilities.  The NFSv4
   protocol's the
   application.  At the close, file data caching must be implemented such that it does not
   invalidate is written to the assumptions that those using these facilities depend
   upon.

10.3.1.  Data Caching server and OPENs

   In order
   through normal accounting the server is able to avoid invalidating determine if the
   available file system space for the sharing assumptions that
   applications rely on, NFSv4 clients should not provide cached data to
   applications has been exceeded (i.e.,
   server returns NFS4ERR_NOSPC or modify it on behalf NFS4ERR_DQUOT).  This accounting
   includes quotas.  The introduction of an application when it would
   not be valid to obtain or modify delegations requires that same data via a READ or WRITE
   operation.

   Furthermore,
   alternative method be in place for the absence same type of open delegation (see Section 10.4) two
   additional rules apply.  Note that these rules are obeyed in practice
   by many NFSv2 and NFSv3 clients.

   o  First, cached data present on a communication to
   occur between client must be revalidated after
      doing an OPEN.  Revalidating means that and server.

   In the client fetches delegation response, the
      change attribute from server provides either the server, compares it with limit of
   the cached
      change attribute, and if different, declares size of the cached data (as
      well as file or the cached attributes) as invalid.  This is to number of modified blocks and associated
   block size.  The server must ensure that the client will be able to
   flush data for to the OPENed file is still correctly reflected server of a size equal to that provided in the
      client's cache.  This validation
   original delegation.  The server must be done at least when make this assurance for all
   outstanding delegations.  Therefore, the
      client's OPEN operation includes DENY=WRITE server must be careful in
   its management of available space for new or BOTH thus
      terminating modified data taking
   into account available file system space and any applicable quotas.
   The server can recall delegations as a period in which other clients may have had the
      opportunity to open result of managing the
   available file with WRITE access.  Clients may
      choose to do system space.  The client should abide by the revalidation more often (i.e., at OPENs
      specifying DENY=NONE) to parallel server's
   state space limits for delegations.  If the NFSv3 protocol's practice client exceeds the stated
   limits for the benefit of users assuming this degree of cache
      revalidation.  Since delegation, the change attribute server's behavior is updated for data and
      metadata modifications, some client implementors may be tempted to
      use undefined.

   Based on server conditions, quotas or available file system space,
   the time_modify attribute and not change to validate cached
      data, so that metadata changes do not spuriously invalidate clean
      data. server may grant OPEN_DELEGATE_WRITE delegations with very
   restrictive space limitations.  The implementor is cautioned limitations may be defined in this approach.  The change
      attribute is guaranteed a
   way that will always force modified data to change for each update be flushed to the file,
      whereas time_modify is guaranteed server
   on close.

   With respect to change only at the
      granularity of the time_delta attribute.  Use by the client's data
      cache validation logic of time_modify and not change runs the risk
      of the client incorrectly marking stale data as valid.

   o  Second, authentication, flushing modified data must be flushed to the server before closing
   after a file OPENed for write.  This is complementary to CLOSE has occurred may be problematic.  For example, the first rule.
      If user
   of the data is application may have logged off the client and unexpired
   authentication credentials may not flushed at CLOSE, be present.  In this case, the revalidation done after
   client OPENs as file is unable may need to achieve its purpose.  The other
      aspect take special care to flushing the data before close is ensure that the data must local unexpired
   credentials will in fact be
      committed to stable storage, at the server, before the CLOSE
      operation is requested available.  This may be accomplished by
   tracking the client.  In the case expiration time of a server
      reboot credentials and flushing data well in
   advance of their expiration or restart by making private copies of
   credentials to assure their availability when needed.

10.4.2.  Open Delegation and File Locks

   When a CLOSEd file, it client holds a OPEN_DELEGATE_WRITE delegation, lock operations
   may not be possible to
      retransmit the data to performed locally.  This includes those required for mandatory
   file locking.  This can be written to done since the file.  Hence, this
      requirement.

10.3.2.  Data Caching and File Locking

   For those applications delegation implies that choose to use file locking instead of
   share reservations to exclude inconsistent file access,
   there is an
   analogous set can be no conflicting locks.  Similarly, all of constraints the
   revalidations that apply to client side would normally be associated with obtaining locks
   and the flushing of data caching.
   These rules associated with the releasing of locks need
   not be done.

   When a client holds a OPEN_DELEGATE_READ delegation, lock operations
   are effective only if not performed locally.  All lock operations, including those
   requesting non-exclusive locks, are sent to the file locking server for
   resolution.

10.4.3.  Handling of CB_GETATTR

   The server needs to employ special handling for a GETATTR where the
   target is used in a way file that matches has a OPEN_DELEGATE_WRITE delegation in an equivalent way effect.
   The reason for this is that the actual READ client holding the
   OPEN_DELEGATE_WRITE delegation may have modified the data and WRITE
   operations executed.  This is as opposed the
   server needs to file locking reflect this change to the second client that is
   based on pure convention.  For example, it is possible
   submitted the GETATTR.  Therefore, the client holding the
   OPEN_DELEGATE_WRITE delegation needs to manipulate
   a two-megabyte file by dividing be interrogated.  The server
   will use the file into two one-megabyte
   regions CB_GETATTR operation.  The only attributes that the
   server can reliably query via CB_GETATTR are size and protecting access change.

   Since CB_GETATTR is being used to satisfy another client's GETATTR
   request, the server only needs to know if the two regions by file locks on
   bytes zero and one.  A lock for write on byte zero of client holding the file would
   represent
   delegation has a modified version of the right to do READ and WRITE operations on file.  If the first
   region.  A lock for write on byte one client's copy
   of the delegated file would represent is not modified (data or size), the
   right to do READ and WRITE operations on server can
   satisfy the second region.  As long
   as all applications manipulating client's GETATTR request from the file obey this convention, they
   will work on a local filesystem.  However, they may not work with attributes
   stored locally at the
   NFSv4 protocol unless clients refrain from data caching.

   The rules for data caching in server.  If the file locking environment are:

   o  First, when a client obtains a file lock for a particular region, is modified, the data cache corresponding server
   only needs to that region (if any cached data
      exists) must be revalidated. know about this modified state.  If the change attribute indicates server
   determines that the file may have been updated since is currently modified, it will respond to
   the cached data was
      obtained, second client's GETATTR as if the client must flush or invalidate file had been modified locally
   at the cached data for server.

   Since the newly locked region.  A client might choose to invalidate all form of non-modified cached data that it has for the file but change attribute is determined by the only
      requirement for correct operation server
   and is opaque to invalidate all of the data
      in client, the newly locked region.

   o  Second, before releasing a write lock for a region, all modified
      data for that region must be flushed client and server need to agree on a
   method of communicating the server.  The modified
      data must also be written to stable storage.

   Note that flushing data to state of the server and file.  For the invalidation size
   attribute, the client will report its current view of cached
   data must reflect the actual byte ranges locked or unlocked.
   Rounding these up or down to reflect client cache block boundaries
   will cause problems if not carefully done. file size.
   For example, writing a
   modified block when only half of that block the change attribute, the handling is within an area being
   unlocked may cause invalid modification to more involved.

   For the region outside client, the
   unlocked area.  This, in turn, may following steps will be part of taken when receiving a region locked by
   another client.  Clients can avoid
   OPEN_DELEGATE_WRITE delegation:

   o  The value of the change attribute will be obtained from the server
      and cached.  Let this situation value be represented by synchronously
   performing portions of write operations that overlap that portion
   (initial or final) that is not a full block.  Similarly, invalidating c.

   o  The client will create a locked area which value greater than c that will be used
      for communicating modified data is not an integral number of full buffer blocks
   would require held at the client.  Let this
      value be represented by d.

   o  When the client to read one or two partial blocks from is queried via CB_GETATTR for the
   server change
      attribute, it checks to see if it holds modified data.  If the revalidation procedure shows that
      file is modified, the data which value d is returned for the
   client possesses may not be valid.

   The data that change attribute
      value.  If this file is written to not currently modified, the server as a prerequisite to client returns
      the
   unlocking value c for the change attribute.

   For simplicity of a region must be written, at implementation, the server, to stable
   storage.  The client may accomplish this either with synchronous
   writes or by following asynchronous writes with a COMMIT operation. MAY for each CB_GETATTR
   return the same value d.  This is required because retransmission of true even if, between successive
   CB_GETATTR operations, the modified client again modifies in the file's data after a
   server reboot might conflict with a lock held by another client.

   A
   or metadata in its cache.  The client implementation may choose can return the same value
   because the only requirement is that the client be able to accommodate applications which
   use byte-range locking in non-standard ways (e.g., using a byte-range
   lock as a global semaphore) by flushing indicate
   to the server more data upon
   a LOCKU than is covered by that the locked range.  This may include client holds modified data within files other than data.  Therefore, the one for which
   value of d may always be c + 1.

   While the unlocks
   are being done.  In such cases, change attribute is opaque to the client must not interfere with
   applications whose READs and WRITEs are being done only within in the
   bounds sense that
   it has no idea what units of record locks which time, if any, the application holds.  For example, an
   application locks a single byte of a file and proceeds to write server is counting
   change with, it is not opaque in that
   single byte.  A the client that chose to handle a LOCKU by flushing all
   modified data has to treat it as
   an unsigned integer, and the server could validly write that single byte in
   response has to an unrelated unlock.  However, it would not be valid able to
   write see the entire block in which results
   of the client's changes to that single written byte was located
   since integer.  Therefore, the server MUST
   encode the change attribute in network order when sending it includes an area that is not locked and might be locked by
   another client.  Client implementations can avoid this problem by
   dividing files with modified data into those for which all
   modifications are done to areas covered by an appropriate byte-range
   lock and those for which there are modifications not covered by a
   byte-range lock.  Any writes done for the former class of files must
   not include areas not locked and thus not modified on the
   client.

10.3.3.  Data Caching and Mandatory File Locking

   Client side data caching needs  The client MUST decode it from network order to respect mandatory file locking its native
   order when receiving it and the client MUST encode it network order
   when sending it to the server.  For this reason, the change attribute
   is in effect.  The presence defined as an unsigned integer rather than an opaque array of mandatory file locking for a given
   file is indicated when
   bytes.

   For the client gets back NFS4ERR_LOCKED from server, the following steps will be taken when providing a
   READ or WRITE on
   OPEN_DELEGATE_WRITE delegation:

   o  Upon providing a file OPEN_DELEGATE_WRITE delegation, the server will
      cache a copy of the change attribute in the data structure it has an appropriate share reservation for. uses
      to record the delegation.  Let this value be represented by sc.

   o  When mandatory locking is in effect for a file, the second client must check
   for an appropriate file lock for data being read or written.  If sends a
   lock exists for GETATTR operation on the range being read or written, same file to
      the client may
   satisfy server, the request using server obtains the client's validated cache.  If an
   appropriate file lock is not held for change attribute from the range of first
      client.  Let this value be cc.

   o  If the read or write, value cc is equal to sc, the read or write request must file is not be satisfied by the client's cache modified and the request must be sent to the
      server for processing.  When a
   read or write request partially overlaps a locked region, returns the request
   should be subdivided into multiple pieces with each region (locked or
   not) treated appropriately.

10.3.4.  Data Caching current values for change, time_metadata, and File Identity

   When clients cache data, the file data needs to be organized
   according
      time_modify (for example) to the filesystem object second client.

   o  If the value cc is NOT equal to which sc, the data belongs.  For
   NFSv3 clients, file is currently modified
      at the typical practice has been first client and most likely will be modified at the server
      at a future time.  The server then uses its current time to assume
      construct attribute values for the
   purpose time_metadata and time_modify.  A
      new value of caching sc, which we will call nsc, is computed by the
      server, such that distinct filehandles represent distinct
   filesystem objects. nsc >= sc + 1.  The client server then has returns the choice to organize
      constructed time_metadata, time_modify, and
   maintain nsc values to the data cache on this basis.

   In
      requester.  The server replaces sc in the NFSv4 protocol, there is now delegation record with
      nsc.  To prevent the possibility to have
   significant deviations of time_modify, time_metadata,
      and change from a "one filehandle per object" model
   because a filehandle may be constructed on the basis of the object's
   pathname.  Therefore, clients need a reliable method appearing to determine go backward (which would happen if
   two filehandles designate
      the same filesystem object.  If clients
   were simply to assume that all distinct filehandles denote distinct
   objects and proceed client holding the delegation fails to do write its modified data caching on this basis, caching
   inconsistencies would arise between the distinct client side objects
   which mapped
      to the same server side object.

   By providing a method to differentiate filehandles, the NFSv4
   protocol alleviates a potential functional regression in comparison
   with the NFSv3 protocol.  Without this method, caching
   inconsistencies within the same client could occur and this has not
   been present in previous versions of before the NFS protocol.  Note that it
   is possible to have such inconsistencies with applications executing
   on multiple clients but that delegation is not the issue being addressed here.

   For the purposes of data caching, the following steps allow an NFSv4
   client to determine whether two distinct filehandles denote revoked or returned), the same
      server side object:

   o  If GETATTR directed to two filehandles returns different values of
      the fsid attribute, then SHOULD update the filehandles represent distinct
      objects.

   o  If GETATTR for any file file's metadata record with an fsid that matches the fsid of the
      two filehandles in question returns a unique_handles
      constructed attribute
      with a value values.  For reasons of TRUE, then reasonable
      performance, committing the two objects are distinct.

   o  If GETATTR directed constructed attribute values to stable
      storage is OPTIONAL.

   As discussed earlier in this section, the two filehandles does not client MAY return the
      fileid attribute for both of the handles, then it cannot be
      determined whether the two objects are the same.  Therefore,
      operations which depend same
   cc value on that knowledge (e.g., client side data
      caching) cannot be done reliably.  Note that subsequent CB_GETATTR calls, even if GETATTR does not
      return the fileid attribute for both filehandles, it will return
      it for neither of file was
   modified in the filehandles, since client's cache yet again between successive
   CB_GETATTR calls.  Therefore, the fsid for both
      filehandles is server must assume that the same.

   o  If GETATTR directed file
   has been modified yet again, and MUST take care to ensure that the two filehandles
   new nsc it constructs and returns different
      values for the fileid attribute, then they are distinct objects.

   o  Otherwise they are the same object.

10.4.  Open Delegation

   When a file is being OPENed, greater than the server may delegate further handling
   of opens and closes for previous nsc it
   returned.  An example implementation's delegation record would
   satisfy this mandate by including a boolean field (let us call it
   "modified") that file is set to FALSE when the opening client.  Any such delegation is recallable, since the circumstances that allowed for granted, and
   an sc value set at the delegation are subject time of grant to change.  In particular, the server may
   receive a conflicting OPEN from another client, the server must
   recall the delegation before deciding whether the OPEN from the other
   client may change attribute value.
   The modified field would be granted.  Making a delegation is up set to TRUE the server first time cc != sc, and
   clients should not assume that any particular OPEN either will or
   will not result in an open delegation.  The following
   would stay TRUE until the delegation is a typical
   set of conditions that servers might use in deciding whether OPEN
   should be delegated:

   o returned or revoked.  The client must be able to respond
   processing for constructing nsc, time_modify, and time_metadata would
   use this pseudo code:

       if (!modified) {
           do CB_GETATTR for change and size;

           if (cc != sc)
               modified = TRUE;
       } else {
           do CB_GETATTR for size;
       }

       if (modified) {
           sc = sc + 1;
           time_modify = time_metadata = current_time;
           update sc, time_modify, time_metadata into file's metadata;
       }

   This would return to the server's callback
      requests. client (that sent GETATTR) the attributes it
   requested, but make sure size comes from what CB_GETATTR returned.
   The server will use would not update the CB_NULL procedure for file's metadata with the client's
   modified size.

   In the case that the file attribute size is different than the
   server's current value, the server treats this as a test modification
   regardless of
      callback ability.

   o  The client must have responded properly the value of the change attribute retrieved via
   CB_GETATTR and responds to previous recalls.

   o  There must be no current open conflicting with the requested
      delegation.

   o  There second client as in the last step.

   This methodology resolves issues of clock differences between client
   and server and other scenarios where the use of CB_GETATTR break
   down.

   It should be no current delegation noted that conflicts with the server is under no obligation to use
   CB_GETATTR and therefore the server MAY simply recall the delegation being requested.

   o
   to avoid its use.

10.4.4.  Recall of Open Delegation

   The probability following events necessitate recall of future conflicting an open requests should be low
      based on the recent history of delegation:

   o  Potentially conflicting OPEN request (or READ/WRITE done with
      "special" stateid)

   o  SETATTR issued by another client

   o  REMOVE request for the file. file

   o  The existence of any server-specific semantics  RENAME request for the file as either source or target of OPEN/CLOSE that
      would make the required handling incompatible with
      RENAME

   Whether a RENAME of a directory in the prescribed
      handling that path leading to the delegated client would apply (see below).

   There are two types file
   results in recall of an open delegations, OPEN_DELEGATE_READ and
   OPEN_DELEGATE_WRITE.  A OPEN_DELEGATE_READ delegation allows a client
   to handle, depends on its own, requests to open a the semantics of
   the server file for reading system.  If that do
   not deny read access to others.  It MUST, however, continue to send
   all requests to open file system denies such RENAMEs when
   a file for writing to is open, the server.  Multiple
   OPEN_DELEGATE_READ delegations may recall must be outstanding simultaneously and
   do not conflict.  A OPEN_DELEGATE_WRITE delegation allows performed to determine whether the client
   file in question is, in fact, open.

   In addition to handle, on its own, all opens.  Only one OPEN_DELEGATE_WRITE
   delegation the situations above, the server may exist for a given file choose to recall
   open delegations at a given any time and if resource constraints make it is
   inconsistent with any OPEN_DELEGATE_READ delegations.
   advisable to do so.  Clients should always be prepared for the
   possibility of recall.

   When a single client holds receives a OPEN_DELEGATE_READ recall for an open delegation, it is
   assured that no other client may modify needs to
   update state on the contents or attributes of server before returning the file.  If more than one delegation.  These
   same updates must be done whenever a client holds an OPEN_DELEGATE_READ
   delegation, then the contents and attributes of that file are not
   allowed chooses to change.  When return a client has an OPEN_DELEGATE_WRITE
   delegation, it may modify the file data since no other client will
   delegation voluntarily.  The following items of state need to be
   accessing
   dealt with:

   o  If the file's data.  The client holding a OPEN_DELEGATE_WRITE
   delegation may only affect file attributes which are intimately
   connected associated with the file data: size, time_modify, change.

   When a client has an delegation is no longer open delegation, it does not send OPENs or
   CLOSEs and
      no previous CLOSE operation has been sent to the server but updates the appropriate status internally.
   For server, a OPEN_DELEGATE_READ delegation, opens that cannot be handled
   locally (opens for write or that deny read access) CLOSE
      operation must be sent to the server.

   When an

   o  If a file has other open delegation is made, the response to references at the client, then OPEN contains an
   open delegation structure which specifies the following:

   o  the type of delegation (read or write)

   o  space limitation information to control flushing of data on close
      (OPEN_DELEGATE_WRITE delegation only, see Section 10.4.1)

   o  an nfsace4 specifying read and write permissions

   o  a stateid
      operations must be sent to represent the delegation for READ and WRITE

   The delegation stateid is separate and distinct from the stateid for the OPEN proper. server.  The standard stateid, unlike the delegation
   stateid, is associated with a particular lock-owner and appropriate stateids
      will continue
   to be valid after the delegation is recalled and provided by the file remains
   open.

   When a request internal to server for subsequent use by the client is made to open a file and open
      since the delegation is in effect, it stateid will not longer be accepted or rejected solely on valid.  These OPEN
      requests are done with the basis claim type of CLAIM_DELEGATE_CUR.  This
      will allow the following conditions.  Any requirement for other
   checks to be made by presentation of the delegate should result in open delegation
   being denied stateid so that the checks
      client can be made by the server itself.

   o  The access and deny bits for establish the request and appropriate rights to perform the file as described
      in OPEN.
      (see Section 9.9. 15.18 for details.)

   o  The read and write permissions as determined below.

   The nfsace4 passed with delegation can be used  If there are granted file locks, the corresponding LOCK operations
      need to avoid frequent
   ACCESS calls.  The permission check should be as follows: performed.  This applies to the OPEN_DELEGATE_WRITE
      delegation case only.

   o  If  For a OPEN_DELEGATE_WRITE delegation, if at the nfsace4 indicates that time of recall the
      file is not open may be done, then it should for write, all modified data for the file must be granted without reference
      flushed to the server.

   o  If the nfsace4 indicates that the open may delegation had not be done, then an
      ACCESS request must be sent to existed, the server to obtain
      client would have done this data flush before the definitive
      answer.

   The server may return an nfsace4 that CLOSE operation.

   o  For a OPEN_DELEGATE_WRITE delegation when a file is more restrictive than still open at
      the
   actual ACL time of recall, any modified data for the file.  This includes an nfsace4 that specifies
   denial of all access.  Note that some common practices such as
   mapping the traditional user "root" file needs to be
      flushed to the user "nobody" may make server.

   o  With the OPEN_DELEGATE_WRITE delegation in place, it
   incorrect to return is possible
      that the actual ACL file was truncated during the duration of the file in delegation.
      For example, the delegation
   response.

   The use truncation could have occurred as a result of delegation together an
      OPEN UNCHECKED4 with various other forms a size attribute value of caching
   creates the possibility that no server authentication will ever be
   performed for zero.  Therefore,
      if a given user since all truncation of the user's requests might be
   satisfied locally.  Where the client is depending on file has occurred and this operation has
      not been propagated to the server for
   authentication, server, the client should be sure authentication occurs for
   each user by use of truncation must occur
      before any modified data is written to the ACCESS operation.  This should be server.

   In the case
   even if an ACCESS operation would not be required otherwise.  As
   mentioned before, the server may enforce frequent authentication by
   returning an nfsace4 denying all access with every open delegation.

10.4.1.  Open Delegation and Data Caching

   OPEN delegation allows much of OPEN_DELEGATE_WRITE delegation, file locking imposes
   some additional requirements.  To precisely maintain the message overhead associated with
   the opening and closing files to be eliminated.  An open when an open
   delegation
   invariant, it is in effect does not require that a validation message be
   sent required to the server unless there exists a potential flush any modified data in any region
   for conflict with
   the requested share mode.  The continued endurance of the
   "OPEN_DELEGATE_READ delegation" provides which a guarantee that no OPEN for write and thus lock was released while the OPEN_DELEGATE_WRITE
   delegation was in effect.  However, because the OPEN_DELEGATE_WRITE
   delegation implies no write has occurred that did not originate from this
   client.  Similarly, when closing other locking by other clients, a file opened simpler
   implementation is to flush all modified data for write and the file (as
   described just above) if any write lock has been released while the
   OPEN_DELEGATE_WRITE delegation is was in effect, the data written does effect.

   An implementation need not have wait until delegation recall (or deciding
   to be flushed voluntarily return a delegation) to perform any of the server until above
   actions, if implementation considerations (e.g., resource
   availability constraints) make that desirable.  Generally, however,
   the fact that the actual open delegation is
   recalled.  The continued endurance state of the open delegation provides a
   guarantee that no open file may continue to
   change makes it not worthwhile to send information about opens and thus no read or write has been done by
   another client.

   For
   closes to the purposes of open delegation, READs and WRITEs done without an
   OPEN are treated server, except as the functional equivalents part of a corresponding
   type delegation return.  Only in
   the case of OPEN.  This refers to closing the READs and WRITEs open that use resulted in obtaining the
   special stateids consisting of all zero bits or all one bits.
   Therefore, READs or WRITEs with a special stateid
   delegation would clients be likely to do this early, since, in that
   case, the close once done by another
   client will force not be undone.  Regardless of the server to recall a OPEN_DELEGATE_WRITE
   delegation.  A WRITE with a special stateid done by another client
   will force a recall of OPEN_DELEGATE_READ delegations.

   With delegations, a client
   client's choices on scheduling these actions, all must be performed
   before the delegation is able to avoid writing data returned, including (when applicable) the
   close that corresponds to the open that resulted in the delegation.
   These actions can be performed either in previous requests or in
   previous operations in the same COMPOUND request.

10.4.5.  OPEN Delegation Race with CB_RECALL

   The server when informs the CLOSE client of recall via a file CB_RECALL.  A race case
   which may develop is serviced.  The file close system
   call when the delegation is immediately recalled
   before the usual point at COMPOUND which established the client delegation is notified of returned to
   the client.  As the CB_RECALL provides both a lack of
   stable storage stateid and a
   filehandle for which the modified file data generated by client has no mapping, it cannot honor the
   application.
   recall attempt.  At this point, the close, file data is written to client has two choices, either do
   not respond or respond with NFS4ERR_BADHANDLE.  If it does not
   respond, then it runs the server and
   through normal accounting risk of the server is able deciding to determine if not grant it
   further delegations.

   If instead it does reply with NFS4ERR_BADHANDLE, then both the
   available filesystem space for client
   and the data has been exceeded (i.e., server returns NFS4ERR_NOSPC or NFS4ERR_DQUOT).  This accounting
   includes quotas. might be able to detect that a race condition is
   occurring.  The introduction client can keep a list of delegations requires that pending delegations.  When
   it receives a
   alternative method be in place CB_RECALL for an unknown delegation, it can cache the same type of communication to
   occur between client
   stateid and server.

   In the delegation response, the server provides either the limit filehandle on a list of pending recalls.  When it is
   provided with a delegation, it would only use it if it was not on the size of
   pending recall list.  Upon the file or next CB_RECALL, it could immediately
   return the number delegation.

   In turn, the server can keep track of modified blocks when it issues a delegation and associated
   block size.  The server must ensure
   assume that the if a client will be able to
   flush data responds to the server of CB_RECALL with a size equal
   NFS4ERR_BADHANDLE, then the client has yet to that provided in receive the
   original delegation.
   The server must make this assurance for all
   outstanding delegations.  Therefore, SHOULD give the server must be careful in
   its management of available space for new or modified data taking
   into account available filesystem space client a reasonable time both to get this
   delegation and any applicable quotas.
   The server can recall delegations as to return it before revoking the delegation.  Unlike a result of managing
   failed callback path, the
   available filesystem space.  The client server should abide by the server's
   state space limits for delegations.  If periodically probe the client exceeds the stated
   limits for the delegation, the server's behavior is undefined.

   Based on server conditions, quotas or available filesystem space, the
   server may grant OPEN_DELEGATE_WRITE delegations
   with very
   restrictive space limitations.  The limitations may be defined in a
   way that will always force modified data to be flushed CB_RECALL to see if it has received the server
   on close.

   With respect to authentication, flushing modified data delegation and is ready
   to return it.

   When the server
   after a CLOSE finally determines that enough time has occurred may be problematic.  For example, lapsed, it
   SHOULD revoke the user
   of delegation and it SHOULD NOT revoke the application may have logged off lease.
   During this extended recall process, the client and unexpired
   authentication credentials may not server SHOULD be present.  In this case, renewing
   the client may need to take special care to ensure lease.  The intent here is that local unexpired
   credentials will in fact be available.  This may be accomplished by
   tracking the expiration time of credentials and flushing data well in
   advance of their expiration or client not pay too
   onerous a burden for a condition caused by making private copies of
   credentials the server.

10.4.6.  Clients that Fail to assure their availability when needed.

10.4.2.  Open Honor Delegation and File Locks

   When a Recalls

   A client holds a OPEN_DELEGATE_WRITE delegation, lock operations may be performed locally.  This includes those required fail to respond to a recall for mandatory
   file locking.  This can be done since various reasons, such as
   a failure of the delegation implies that
   there can callback path from server to the client.  The client
   may be no conflicting locks.  Similarly, all unaware of a failure in the
   revalidations callback path.  This lack of
   awareness could result in the client finding out long after the
   failure that would normally be associated with obtaining locks its delegation has been revoked, and another client has
   modified the flushing of data associated with for which the releasing of locks need
   not be done.

   When client had a delegation.  This is
   especially a problem for the client holds that held a OPEN_DELEGATE_READ delegation, lock operations
   are not performed locally.  All lock operations, OPEN_DELEGATE_WRITE
   delegation.

   The server also has a dilemma in that the client that fails to
   respond to the recall might also be sending other NFS requests,
   including those
   requesting non-exclusive locks, are sent to that renew the server lease before the lease expires.
   Without returning an error for
   resolution.

10.4.3.  Handling of CB_GETATTR

   The those lease renewing operations, the
   server needs to employ special handling for a GETATTR where leads the
   target is a file client to believe that has a OPEN_DELEGATE_WRITE the delegation it has is in effect.
   The reason for this
   force.

   This difficulty is that solved by the client holding following rules:

   o  When the callback path is down, the server MUST NOT revoke the
   OPEN_DELEGATE_WRITE
      delegation may have modified if one of the data following occurs:

      *  The client has issued a RENEW operation and the server needs to reflect this change to has
         returned an NFS4ERR_CB_PATH_DOWN error.  The server MUST renew
         the lease for any byte-range locks and share reservations the second
         client has that
   submitted the GETATTR.  Therefore, server has known about (as opposed to those
         locks and share reservations the client holding has established but not
         yet sent to the
   OPEN_DELEGATE_WRITE delegation needs server, due to be interrogated.  The server
   will use the CB_GETATTR operation. delegation).  The only attributes that the server can reliably query via CB_GETATTR are size and change.

   Since CB_GETATTR is being used
         SHOULD give the client a reasonable time to return its
         delegations to satisfy another client's GETATTR
   request, the server only needs to know if before revoking the client's
         delegations.

      *  The client holding the
   delegation has not issued a modified version RENEW operation for some period of
         time after the file.  If server attempted to recall the client's copy delegation.  This
         period of time MUST NOT be less than the delegated file is not modified (data or size), the server can
   satisfy the second client's GETATTR request from the attributes
   stored locally at the server.  If value of the file is modified,
         lease_time attribute.

   o  When the server
   only needs client holds a delegation, it cannot rely on operations,
      except for RENEW, that take a stateid, to know about this modified state.  If the server
   determines renew delegation leases
      across callback path failures.  The client that the file is currently modified, it will respond wants to keep
      delegations in force across callback path failures must use RENEW
      to do so.

10.4.7.  Delegation Revocation

   At the second client's GETATTR as if the file had been modified locally
   at the server.

   Since the form of the change attribute is determined by the server
   and point a delegation is opaque to revoked, if there are associated opens
   on the client, the client and server applications holding these opens need to agree on be
   notified.  This notification usually occurs by returning errors for
   READ/WRITE operations or when a
   method of communicating the modified state of close is attempted for the open file.  For the size
   attribute, the client will report its current view of

   If no opens exist for the file size.
   For at the change attribute, point the handling delegation is more involved.

   For the client, the following steps will be taken when receiving a
   OPEN_DELEGATE_WRITE delegation:

   o  The value
   revoked, then notification of the change attribute will be obtained from the server
      and cached.  Let this value be represented by c.

   o  The client will create a value greater than c that will be used
      for communicating revocation is unnecessary.
   However, if there is modified data is held present at the client.  Let this
      value be represented by d.

   o  When the client is queried via CB_GETATTR for the change
      attribute, it checks to see if it holds modified data.  If the
      file is modified,
   file, the value d is returned for user of the change attribute
      value.  If this file is application should be notified.  Unfortunately,
   it may not currently modified, be possible to notify the client returns user since active applications
   may not be present at the value c client.  See Section 10.5.1 for additional
   details.

10.5.  Data Caching and Revocation

   When locks and delegations are revoked, the change attribute. assumptions upon which
   successful caching depend are no longer guaranteed.  For simplicity of implementation, the client MAY for each CB_GETATTR
   return any locks or
   share reservations that have been revoked, the same value d. corresponding owner
   needs to be notified.  This is true even if, between successive
   CB_GETATTR operations, notification includes applications with a
   file open that has a corresponding delegation which has been revoked.
   Cached data associated with the client again modifies in revocation must be removed from the file's
   client.  In the case of modified data
   or metadata existing in its cache.  The client can return the same value
   because the only requirement is client's cache,
   that data must be removed from the client be able to indicate without it being written to
   the server that server.  As mentioned, the assumptions made by the client holds modified data.  Therefore, are no
   longer valid at the
   value of d point when a lock or delegation has been revoked.
   For example, another client may always be c + 1.

   While have been granted a conflicting lock
   after the change attribute is opaque to revocation of the client in lock at the sense that
   it has no idea what units of time, if any, first client.  Therefore, the server is counting
   change with, it is not opaque in that
   data within the lock range may have been modified by the other
   client.  Obviously, the first client has is unable to guarantee to treat it as
   an unsigned integer, and the server
   application what has occurred to be able to see the results file in the case of revocation.

   Notification to a lock owner will in many cases consist of simply
   returning an error on the client's changes next and all subsequent READs/WRITEs to that integer.  Therefore, the server MUST
   encode
   open file or on the change attribute in network order when sending it to close.  Where the
   client.  The client MUST decode it from network order methods available to its native
   order when receiving it and the a client MUST encode it network order
   when sending it to the server.  For
   make such notification impossible because errors for certain
   operations may not be returned, more drastic action such as signals
   or process termination may be appropriate.  The justification for
   this reason, change is defined as that an unsigned integer rather than invariant for which an opaque array of bytes.

   For the server, application depends on may be
   violated.  Depending on how errors are typically treated for the following steps will
   client operating environment, further levels of notification
   including logging, console messages, and GUI pop-ups may be taken when providing a
   OPEN_DELEGATE_WRITE delegation:

   o  Upon providing
   appropriate.

10.5.1.  Revocation Recovery for Write Open Delegation

   Revocation recovery for a OPEN_DELEGATE_WRITE delegation, delegation poses the server will
      cache a copy
   special issue of the change attribute in the modified data structure it uses
      to record in the delegation.  Let this value be represented by sc.

   o  When a second client sends a GETATTR operation on cache while the same file is
   not open.  In this situation, any client which does not flush
   modified data to the server, the server obtains on each close must ensure that the change attribute from user
   receives appropriate notification of the first
      client.  Let this value be cc.

   o  If failure as a result of the value cc is equal
   revocation.  Since such situations may require human action to sc,
   correct problems, notification schemes in which the file appropriate user
   or administrator is not modified notified may be necessary.  Logging and console
   messages are typical examples.

   If there is modified data on the
      server returns client, it must not be flushed
   normally to the current values for change, time_metadata, and
      time_modify (for example) server.  A client may attempt to provide a copy of
   the second client.

   o  If file data as modified during the value cc is NOT equal delegation under a different
   name in the file system name space to sc, ease recovery.  Note that when
   the client can determine that the file is currently has not been modified
      at by any
   other client, or when the first client and most likely will be modified at has a complete cached copy of the server
      at
   file in question, such a future time.  The server then uses its current time to
      construct attribute values for time_metadata and time_modify.  A
      new value saved copy of sc, which we will call nsc, is computed by the
      server, such that nsc >= sc + 1.  The server then returns the
      constructed time_metadata, time_modify, and nsc values to client's view of the
      requester.  The server replaces sc in file
   may be of particular value for recovery.  In other cases, recovery
   using a copy of the delegation record with
      nsc.  To prevent file based partially on the possibility of time_modify, time_metadata, client's cached data
   and change from appearing to go backward (which would happen if partially on the client holding server copy as modified by other clients, will
   be anything but straightforward, so clients may avoid saving file
   contents in these situations or mark the delegation fails results specially to write its warn
   users of possible problems.

   Saving of such modified data
      to the server before the in delegation is revoked revocation situations may
   be limited to files of a certain size or returned), the
      server SHOULD update the file's metadata record with might be used only when
   sufficient disk space is available within the
      constructed attribute values.  For reasons of reasonable
      performance, committing target file system.
   Such saving may also be restricted to situations when the constructed attribute values client has
   sufficient buffering resources to stable
      storage keep the cached copy available
   until it is OPTIONAL.

   As properly stored to the target file system.

10.6.  Attribute Caching

   The attributes discussed earlier in this section, the client MAY return section do not include named
   attributes.  Individual named attributes are analogous to files and
   caching of the same
   cc value data for these needs to be handled just as data
   caching is for regular files.  Similarly, LOOKUP results from an
   OPENATTR directory are to be cached on subsequent CB_GETATTR calls, even if the file was
   modified in the client's same basis as any other
   pathnames and similarly for directory contents.

   Clients may cache yet again between successive
   CB_GETATTR calls.  Therefore, file attributes obtained from the server must assume and use
   them to avoid subsequent GETATTR requests.  Such caching is write
   through in that the modification to file
   has been modified yet again, attributes is always done by
   means of requests to the server and MUST take care should not be done locally and
   cached.  The exception to ensure this are modifications to attributes that
   are intimately connected with data caching.  Therefore, extending a
   file by writing data to the
   new nsc it constructs and returns local data cache is greater than reflected immediately
   in the previous nsc it
   returned.  An example implementation's delegation record would
   satisfy size as seen on the client without this mandate by including a boolean field (let us call it
   "modified") that is set change being
   immediately reflected on the server.  Normally such changes are not
   propagated directly to FALSE the server but when the delegation modified data is granted, and
   an sc value set at the time of grant
   flushed to the change server, analogous attribute value.

   The changes are made on the
   server.  When open delegation is in effect, the modified field would attributes
   may be set returned to TRUE the first time cc != sc, and
   would stay TRUE until server in the delegation is returned or revoked.  The
   processing for constructing nsc, time_modify, and time_metadata would
   use this pseudo code:

       if (!modified) {
           do response to a CB_GETATTR for change and size;

           if (cc != sc)
               modified = TRUE;
       } else {
           do CB_GETATTR for size;
       }

       if (modified) {
           sc = sc + 1;
           time_modify = time_metadata = current_time;
           update sc, time_modify, time_metadata into file's metadata;
       }

   This would return to the client (that sent GETATTR) the attributes it
   requested, but make sure size comes from what CB_GETATTR returned. call.

   The server would not update the file's metadata with the client's
   modified size.

   In the case result of local caching of attributes is that the file attribute size is different than the
   server's current value,
   caches maintained on individual clients will not be coherent.
   Changes made in one order on the server treats this as may be seen in a modification
   regardless of the value of the change attribute retrieved via
   CB_GETATTR and responds to the second different
   order on one client as and in a third order on a different client.

   The typical file system application programming interfaces do not
   provide means to atomically modify or interrogate attributes for
   multiple files at the last step.

   This methodology resolves issues of clock differences between client
   and server and other scenarios same time.  The following rules provide an
   environment where the use of CB_GETATTR break
   down.

   It should potential incoherency mentioned above can be noted that
   reasonably managed.  These rules are derived from the server is under no obligation to use
   CB_GETATTR and therefore practice of
   previous NFS protocols.

   o  All attributes for a given file (per-fsid attributes excepted) are
      cached as a unit at the server MAY simply recall client so that no non-serializability can
      arise within the delegation
   to avoid its use.

10.4.4.  Recall of Open Delegation

   The following events necessitate recall context of an open delegation:

   o  Potentially conflicting OPEN request (or READ/WRITE done with
      "special" stateid) a single file.

   o  SETATTR issued by another  An upper time boundary is maintained on how long a client
   o  REMOVE request for cache
      entry can be kept without being refreshed from the file server.

   o  RENAME request for  When operations are performed that modify attributes at the file
      server, the updated attribute set is requested as either source or target part of the
      RENAME

   Whether a RENAME of a
      containing RPC.  This includes directory in operations that update
      attributes indirectly.  This is accomplished by following the path leading to
      modifying operation with a GETATTR operation and then using the file
      results in recall of an open delegation depends on the semantics of GETATTR to update the server filesystem.  If client's cached attributes.

   Note that filesystem denies such RENAMEs when a
   file is open, the recall must be performed to determine whether if the
   file in question is, in fact, open.

   In addition full set of attributes to be cached is requested by
   READDIR, the situations above, the server may choose to recall
   open delegations at any time if resource constraints make it
   advisable to do so.  Clients should always results can be prepared for cached by the
   possibility of recall.

   When a client receives a recall for an open delegation, it needs to
   update state on the server before returning the delegation.  These same updates must be done whenever a basis as
   attributes obtained via GETATTR.

   A client chooses to return a
   delegation voluntarily.  The following items may validate its cached version of state need to be
   dealt with:

   o  If the attributes for a file associated with by
   fetching just both the delegation is no longer open change and
      no previous CLOSE operation has been sent to the server, a CLOSE
      operation must be sent to time_access attributes and assuming
   that if the server.

   o  If a file change attribute has other open references at the client, then OPEN
      operations must be sent to same value as it did when the server.
   attributes were cached, then no attributes other than time_access
   have changed.  The appropriate stateids
      will be provided by reason why time_access is also fetched is because
   many servers operate in environments where the server for subsequent use operation that updates
   change does not update time_access.  For example, POSIX file
   semantics do not update access time when a file is modified by the client
      since
   write system call.  Therefore, the delegation stateid will not longer be valid.  These OPEN
      requests are done client that wants a current
   time_access value should fetch it with change during the claim type attribute
   cache validation processing and update its cached time_access.

   The client may maintain a cache of CLAIM_DELEGATE_CUR.  This
      will allow the presentation modified attributes for those
   attributes intimately connected with data of the delegation stateid so that modified regular files
   (size, time_modify, and change).  Other than those three attributes,
   the client can establish the appropriate rights MUST NOT maintain a cache of modified attributes.
   Instead, attribute changes are immediately sent to perform the OPEN.
      (see Section 15.18 for details.)

   o  If there are granted file locks, server.

   In some operating environments, the corresponding LOCK operations
      need equivalent to be performed.  This applies time_access is
   expected to be implicitly updated by each read of the OPEN_DELEGATE_WRITE
      delegation case only.

   o  For a OPEN_DELEGATE_WRITE delegation, if at the time content of recall the
   file object.  If an NFS client is not open for write, all modified data for caching the content of a file must be
      flushed to the server.  If the delegation had not existed,
   object, whether it is a regular file, directory, or symbolic link,
   the client would have done this data flush before SHOULD NOT update the CLOSE operation.

   o  For a OPEN_DELEGATE_WRITE delegation when time_access attribute (via SETATTR
   or a file small READ or READDIR request) on the server with each read that
   is still open at satisfied from cache.  The reason is that this can defeat the time
   performance benefits of recall, any modified data for caching content, especially since an explicit
   SETATTR of time_access may alter the file needs to be
      flushed to change attribute on the server.

   o  With
   If the OPEN_DELEGATE_WRITE delegation in place, it is possible change attribute changes, clients that are caching the file was truncated during content
   will think the duration of content has changed, and will re-read unmodified data
   from the delegation.
      For example, server.  Nor is the truncation could have occurred as a result of an
      OPEN UNCHECKED4 with a size attribute value of zero.  Therefore,
      if client encouraged to maintain a truncation modified
   version of the file has occurred and time_access in its cache, since this operation has
      not been propagated to would mean that the server,
   client will either eventually have to write the truncation must occur
      before any modified data is written access time to the server.

   In
   server with bad performance effects, or it would never update the case
   server's time_access, thereby resulting in a situation where an
   application that caches access time between a close and open of OPEN_DELEGATE_WRITE delegation, the
   same file locking imposes
   some additional requirements.  To precisely maintain observes the associated
   invariant, it is required access time oscillating between the past and
   present.  The time_access attribute always means the time of last
   access to flush any modified data in any region
   for which a write lock file by a read that was released while satisfied by the OPEN_DELEGATE_WRITE
   delegation was server.  This
   way clients will tend to see only time_access changes that go forward
   in effect.  However, because time.

10.7.  Data and Metadata Caching and Memory Mapped Files

   Some operating environments include the OPEN_DELEGATE_WRITE
   delegation implies no other locking by other clients, a simpler
   implementation is to flush all modified data capability for an application
   to map a file's content into the file (as
   described just above) if any write lock application's address space.  Each
   time the application accesses a memory location that corresponds to a
   block that has not been released while loaded into the
   OPEN_DELEGATE_WRITE delegation was in effect.

   An implementation need not wait until delegation recall (or deciding
   to voluntarily return address space, a delegation) to perform any of page fault
   occurs and the above
   actions, file is read (or if implementation considerations (e.g., resource
   availability constraints) make that desirable.  Generally, however, the fact that block does not exist in the actual open state of
   file, the file may continue to
   change makes it not worthwhile to send information about opens block is allocated and
   closes to then instantiated in the server, except
   application's address space).

   As long as part of delegation return.  Only in each memory mapped access to the case file requires a page
   fault, the relevant attributes of closing the open file that resulted in obtaining the
   delegation would clients be likely are used to do this early, since, detect
   access and modification (time_access, time_metadata, time_modify, and
   change) will be updated.  However, in that
   case, the close once done many operating environments,
   when page faults are not required these attributes will not be undone.  Regardless of the
   client's choices
   updated on scheduling these actions, all must be performed
   before the delegation is returned, including (when applicable) the
   close that corresponds reads or updates to the open that resulted in file via memory access (regardless
   of whether the delegation.
   These actions can be performed either in previous requests file is a local file or in
   previous operations in the same COMPOUND request.

10.4.5.  OPEN Delegation Race with CB_RECALL

   The server informs the is being access remotely).  A
   client or server MAY fail to update attributes of recall via a CB_RECALL.  A race case
   which may develop is when the delegation file that is immediately recalled
   before the COMPOUND which established the delegation
   being accessed via memory mapped I/O. This has several implications:

   o  If there is returned to
   the client.  As an application on the CB_RECALL provides both server that has memory mapped a stateid and
      file that a
   filehandle for which the client has no mapping, it cannot honor the
   recall attempt.  At this point, is also accessing, the client has two choices, either do
   not respond or respond with NFS4ERR_BADHANDLE.  If it does may not
   respond, then it runs the risk be able
      to get a consistent value of the server deciding change attribute to not grant it
   further delegations.

   If instead it does reply with NFS4ERR_BADHANDLE, then both the client
   and the determine
      whether its cache is stale or not.  A server might be able to detect that a race condition knows that the
      file is
   occurring.  The client can keep a list of pending delegations.  When
   it receives a CB_RECALL memory mapped could always pessimistically return updated
      values for an unknown delegation, it can cache change so as to force the
   stateid application to always get the
      most up to date data and filehandle on a list metadata for the file.  However, due to
      the negative performance implications of pending recalls.  When it this, such behavior is
      OPTIONAL.

   o  If the memory mapped file is
   provided with a delegation, it would only use it if it was not being modified on the
   pending recall list.  Upon server, and
      instead is just being read by an application via the next CB_RECALL, it could immediately
   return memory mapped
      interface, the delegation.

   In turn, client will not see an updated time_access
      attribute.  However, in many operating environments, neither will
      any process running on the server can keep track of when it issues a delegation and
   assume server.  Thus NFS clients are at no
      disadvantage with respect to local processes.

   o  If there is another client that is memory mapping the file, and if a
      that client responds to the CB_RECALL with is holding a
   NFS4ERR_BADHANDLE, then OPEN_DELEGATE_WRITE delegation, the client has yet to receive same
      set of issues as discussed in the delegation.
   The previous two bullet items apply.
      So, when a server SHOULD give the client does a reasonable time both to get this
   delegation and CB_GETATTR to return it before revoking the delegation.  Unlike a
   failed callback path, the server should periodically probe file that the client
   with CB_RECALL to see if it has received
      modified in its cache, the delegation and response from CB_GETATTR will not
      necessarily be accurate.  As discussed earlier, the client's
      obligation is ready to return it.

   When the server finally determines report that enough time the file has lapsed, it
   SHOULD revoke been modified since the
      delegation and was granted, not whether it SHOULD NOT revoke the lease.
   During this extended recall process, has been modified again
      between successive CB_GETATTR calls, and the server SHOULD be renewing
   the client lease.  The intent here is MUST assume
      that any file the client not pay too
   onerous a burden for a condition caused by has modified in cache has been modified
      again between successive CB_GETATTR calls.  Depending on the server.

10.4.6.  Clients that Fail to Honor Delegation Recalls

   A client may fail to respond to a recall for various reasons, such as
   a failure
      nature of the callback path from server to the client.  The client client's memory management system, this weak
      obligation may not be unaware of a failure possible.  A client MAY return stale
      information in CB_GETATTR whenever the callback path.  This lack file is memory mapped.

   o  The mixture of
   awareness could result in the client finding out long after the
   failure that its delegation has been revoked, memory mapping and another client has
   modified the data for which file locking on the client had a delegation.  This same file is
   especially a problem for
      problematic.  Consider the following scenario, where the page size
      on each client that held a OPEN_DELEGATE_WRITE
   delegation.

   The server also has is 8192 bytes.

      *  Client A memory maps first page (8192 bytes) of file X

      *  Client B memory maps first page (8192 bytes) of file X

      *  Client A write locks first 4096 bytes

      *  Client B write locks second 4096 bytes

      *  Client A, via a dilemma in that the client that fails to
   respond STORE instruction modifies part of its locked
         region.

      *  Simultaneous to the recall might also be sending other NFS requests,
   including those that renew the lease before the lease expires.
   Without returning an error for those lease renewing operations, the
   server leads the client to believe that the delegation it has is in
   force.

   This difficulty is solved by the following rules:

   o  When the callback path is down, the server MUST NOT revoke the
      delegation if one A, client B issues a STORE on part of
         its locked region.

   Here the following occurs:

      *  The challenge is for each client has issued to resynchronize to get a RENEW operation and the server has
         returned an NFS4ERR_CB_PATH_DOWN error.  The server MUST renew
   correct view of the lease for any byte-range locks and share reservations first page.  In many operating environments, the
   virtual memory management systems on each client has only know a page is
   modified, not that a subset of the server has known about (as opposed page corresponding to those
         locks and share reservations the client
   respective lock regions has established but been modified.  So it is not
         yet sent to the server, due possible for
   each client to do the delegation).  The server
         SHOULD give the client a reasonable time right thing, which is to return its
         delegations only write to the
   server before revoking the client's
         delegations.

      *  The client has not issued a RENEW operation for some period that portion of
         time after the server attempted to recall page that is locked.  For example, if
   client A simply writes out the delegation.  This
         period of time MUST NOT be less than page, and then client B writes out the value of
   page, client A's data is lost.

   Moreover, if mandatory locking is enabled on the
         lease_time attribute.

   o file, then we have a
   different problem.  When clients A and B issue the client holds STORE
   instructions, the resulting page faults require a delegation, it cannot rely byte-range lock on operations,
      except for RENEW, that take a stateid, to renew delegation leases
      across callback path failures.  The
   the entire page.  Each client that wants then tries to keep
      delegations in force across callback path failures must use RENEW extend their locked range
   to do so.

10.4.7.  Delegation Revocation

   At the point entire page, which results in a delegation is revoked, if there are associated opens
   on the client, deadlock.

   Communicating the applications holding these opens need NFS4ERR_DEADLOCK error to be
   notified.  This notification usually occurs by returning errors for
   READ/WRITE operations or when a close STORE instruction is attempted for the open file.

   If no opens exist for the file
   difficult at the point the delegation best.

   If a client is
   revoked, then notification of locking the revocation is unnecessary.
   However, if entire memory mapped file, there is modified data present no
   problem with advisory or mandatory byte-range locking, at least until
   the client for the
   file, unlocks a region in the user middle of the application should be notified.  Unfortunately,
   it may not be possible to notify file.

   Given the user since active applications
   may not be present at above issues the client.  See Section 10.5.1 for additional
   details.

10.5.  Data Caching and Revocation

   When locks and delegations following are revoked, the assumptions upon which
   successful caching depend permitted:

   o  Clients and servers MAY deny memory mapping a file they know there
      are no longer guaranteed.  For any byte-range locks or
   share reservations that have been revoked, the corresponding owner
   needs to be notified.  This notification includes applications with for.

   o  Clients and servers MAY deny a byte-range lock on a file open that has they know
      is memory mapped.

   o  A client MAY deny memory mapping a corresponding delegation which has been revoked.
   Cached data associated with file that it knows requires
      mandatory locking for I/O. If mandatory locking is enabled after
      the revocation must file is opened and mapped, the client MAY deny the application
      further access to its mapped file.

10.8.  Name Caching

   The results of LOOKUP and READDIR operations may be removed from cached to avoid
   the
   client.  In cost of subsequent LOOKUP operations.  Just as in the case of modified data existing in
   attribute caching, inconsistencies may arise among the client's cache,
   that data must be removed from various client
   caches.  To mitigate the effects of these inconsistencies and given
   the context of typical file system APIs, an upper time boundary is
   maintained on how long a client name cache entry can be kept without it being written to
   the server.  As mentioned,
   verifying that the assumptions entry has not been made invalid by a directory
   change operation performed by another client.

   When a client is not making changes to a directory for which there
   exist name cache entries, the client are needs to periodically fetch
   attributes for that directory to ensure that it is not being
   modified.  After determining that no
   longer valid at modification has occurred, the point when
   expiration time for the associated name cache entries may be updated
   to be the current time plus the name cache staleness bound.

   When a lock or delegation has been revoked.
   For example, another client may is making changes to a given directory, it needs to
   determine whether there have been granted a conflicting lock changes made to the directory by
   other clients.  It does this by using the change attribute as
   reported before and after the revocation of directory operation in the lock at associated
   change_info4 value returned for the first client.  Therefore, operation.  The server is able to
   communicate to the client whether the change_info4 data within is provided
   atomically with respect to the lock range may have been modified by directory operation.  If the other
   client.  Obviously, change
   values are provided atomically, the first client is unable to guarantee then able to compare
   the
   application what has occurred to pre-operation change value with the file change value in the case of revocation.

   Notification to a lock owner will in many cases consist of simply
   returning an error on client's
   name cache.  If the next and all subsequent READs/WRITEs to comparison indicates that the
   open file or on directory was
   updated by another client, the close.  Where name cache associated with the methods available to a client
   make such notification impossible because errors for certain
   operations may not be returned, more drastic action such as signals
   or process termination may be appropriate.  The justification for
   this
   modified directory is that an invariant for which an application depends on may purged from the client.  If the comparison
   indicates no modification, the name cache can be
   violated.  Depending updated on how errors are typically treated for the
   client operating environment, further levels of notification
   including logging, console messages, to reflect the directory operation and GUI pop-ups may the associated timeout
   extended.  The post-operation change value needs to be
   appropriate.

10.5.1.  Revocation Recovery for Write Open Delegation

   Revocation recovery saved as the
   basis for a OPEN_DELEGATE_WRITE delegation poses future change_info4 comparisons.

   As demonstrated by the
   special issue of modified data in scenario above, name caching requires that the
   client revalidate name cache while data by inspecting the file change attribute
   of a directory at the point when the name cache item was cached.
   This requires that the server update the change attribute for
   directories when the contents of the corresponding directory is
   not open.  In this situation, any
   modified.  For a client which does not flush
   modified data to use the change_info4 information
   appropriately and correctly, the server on each close must ensure that report the user
   receives appropriate notification of pre and post
   operation change attribute values atomically.  When the failure as a result of server is
   unable to report the
   revocation.  Since such situations may require human action before and after values atomically with respect
   to
   correct problems, notification schemes the directory operation, the server must indicate that fact in which the appropriate user
   or administrator is notified may be necessary.  Logging and console
   messages are typical examples.

   If there
   change_info4 return value.  When the information is modified data on not atomically
   reported, the client, it must client should not assume that other clients have not
   changed the directory.

10.9.  Directory Caching

   The results of READDIR operations may be flushed
   normally used to avoid subsequent
   READDIR operations.  Just as in the server.  A client cases of attribute and name
   caching, inconsistencies may attempt to provide a copy arise among the various client caches.
   To mitigate the effects of these inconsistencies, and given the
   context of typical file data as modified during system APIs, the delegation under following rules should be
   followed:

   o  Cached READDIR information for a different
   name directory which is not obtained
      in a single READDIR operation must always be a consistent snapshot
      of directory contents.  This is determined by using a GETATTR
      before the filesystem name space to ease recovery.  Note that when first READDIR and after the client can determine last of READDIR that
      contributes to the file has not been modified by any
   other client, or when cache.

   o  An upper time boundary is maintained to indicate the client has length of
      time a complete directory cache entry is considered valid before the client
      must revalidate the cached copy information.

   The revalidation technique parallels that discussed in the case of file
   name caching.  When the client is not changing the directory in
   question, such a saved copy checking the change attribute of the client's view directory with GETATTR
   is adequate.  The lifetime of the file may cache entry can be of particular value for recovery.  In other case, recovery using extended at
   these checkpoints.  When a
   copy of the file based partially on client is modifying the client's cached data and
   partially on the server copy as modified by other clients, will be
   anything but straightforward, so clients may avoid saving file
   contents in these situations or mark the results specially to warn
   users of possible problems.

   Saving of such modified data in delegation revocation situations may
   be limited to files of a certain size or might be used only when
   sufficient disk space is available within the target filesystem.
   Such saving may also be restricted to situations when directory, the
   client has
   sufficient buffering resources needs to keep use the cached copy available
   until it is properly stored change_info4 data to the target filesystem.

10.6.  Attribute Caching

   The attributes discussed in this section do not include named
   attributes.  Individual named attributes determine whether there
   are analogous to files and
   caching of other clients modifying the data for these needs to be handled just as data
   caching directory.  If it is for regular files.  Similarly, LOOKUP results from an
   OPENATTR directory determined that
   no other client modifications are to be cached on occurring, the same basis as any other
   pathnames and similarly for directory contents.

   Clients client may update
   its directory cache file attributes obtained from the server and use
   them to avoid subsequent GETATTR requests.  Such reflect its own changes.

   As demonstrated previously, directory caching is write
   through in requires that modification to file attributes is always done by
   means of requests to the server and should not be done locally and
   cached.  The exception to this are modifications to attributes that
   are intimately connected with
   client revalidate directory cache data caching.  Therefore, extending a
   file by writing data to the local data cache is reflected immediately
   in the size as seen on inspecting the client without this change being
   immediately reflected on
   attribute of a directory at the server.  Normally such changes are not
   propagated directly to point when the directory was cached.
   This requires that the server but update the change attribute for
   directories when the modified data contents of the corresponding directory is
   flushed
   modified.  For a client to use the server, analogous attribute changes are made on change_info4 information
   appropriately and correctly, the
   server. server must report the pre and post
   operation change attribute values atomically.  When open delegation the server is in effect,
   unable to report the modified attributes
   may be returned before and after values atomically with respect
   to the directory operation, the server must indicate that fact in the response to a CB_GETATTR call.

   The result of local caching of attributes is that
   change_info4 return value.  When the attribute
   caches maintained on individual clients will information is not be coherent.
   Changes made in one order on atomically
   reported, the server may be seen in a different
   order on one client and in a third order on a different client.

   The typical filesystem application programming interfaces do should not
   provide means to atomically modify or interrogate attributes for
   multiple files at the same time.  The following rules provide an
   environment where assume that other clients have not
   changed the potential incoherency mentioned above can be
   reasonably managed.  These rules are derived from directory.

11.  Minor Versioning

   To address the practice requirement of
   previous an NFS protocols.

   o  All attributes for a given file (per-fsid attributes excepted) are
      cached as a unit at the client so protocol that no non-serializability can
      arise within evolve as the context of a single file.

   o  An upper time boundary is maintained on how long a client cache
      entry can be kept without being refreshed from the server.

   o  When operations are performed that change attributes at
   need arises, the
      server, NFSv4 protocol contains the updated attribute set rules and framework to
   allow for future minor changes or versioning.

   The base assumption with respect to minor versioning is requested as part that any
   future accepted minor version must follow the IETF process and be
   documented in a standards track RFC.  Therefore, each minor version
   number will correspond to an RFC.  Minor version 0 of the
      containing RPC.  This includes directory operations that update
      attributes indirectly.  This NFS version
   4 protocol is accomplished represented by following the
      modifying operation with a GETATTR operation this RFC.  The COMPOUND and then using CB_COMPOUND
   procedures support the
      results encoding of the GETATTR to update the client's cached attributes.

   Note that if the full set of attributes to be cached is minor version being requested
   by
   READDIR, the results can be cached by client.

   The following items represent the client on basic rules for the same basis as
   attributes obtained via GETATTR.

   A client may validate its cached version development of attributes for
   minor versions.  Note that a file by
   fetching just both future minor version may decide to
   modify or add to the change and time_access attributes and assuming
   that if following rules as part of the change attribute has minor version
   definition.

   1.   Procedures are not added or deleted

        To maintain the same value as it did when general RPC model, NFSv4 minor versions will not
        add to or delete procedures from the
   attributes were cached, then no attributes other than time_access
   have changed. NFS program.

   2.   Minor versions may add operations to the COMPOUND and
        CB_COMPOUND procedures.

        The reason why time_access is also fetched is because
   many servers operate in environments where addition of operations to the operation that updates
   change COMPOUND and CB_COMPOUND
        procedures does not update time_access.  For example, POSIX file
   semantics do not update access time when a file is modified by affect the
   write system call.  Therefore, RPC model.

        1.  Minor versions may append attributes to the client bitmap4 that wants a current
   time_access value should fetch it with change during the attribute
   cache validation processing and update its cached time_access.

   The client may maintain a cache
            represents sets of modified attributes for those attributes intimately connected with data of modified regular files
   (size, time_modify, and change).  Other than those three attributes, to the client MUST NOT maintain a cache fattr4 that
            represents sets of modified attributes.
   Instead, attribute changes are immediately sent to values.

            This allows for the server.

   In some operating environments, expansion of the equivalent to time_access is
   expected attribute model to be implicitly updated by each read
            allow for future growth or adaptation.

        2.  Minor version X must append any new attributes after the
            last documented attribute.

            Since attribute results are specified as an opaque array of
            per-attribute XDR encoded results, the content complexity of adding
            new attributes in the
   file object.  If midst of the current definitions would
            be too burdensome.

   3.   Minor versions must not modify the structure of an NFS client is caching existing
        operation's arguments or results.

        Again, the content complexity of handling multiple structure definitions
        for a file
   object, whether it single operation is a regular file, directory, or symbolic link,
   the client SHOULD NOT update the time_access attribute (via SETATTR
   or a small READ or READDIR request) on the server with each read that
   is satisfied from cache.  The reason is that this can defeat the
   performance benefits of caching content, especially since an explicit
   SETATTR of time_access may alter the change attribute on the server.
   If the change attribute changes, clients that are caching the content
   will think the content has changed, and will re-read unmodified data
   from the server.  Nor is the client encouraged to maintain a modified
   version of time_access in its cache, since this would mean that the
   client will either eventually have to write the access time to the
   server with bad performance effects, or it would never update the
   server's time_access, thereby resulting in a situation where an
   application that caches access time between a close and open of the
   same file observes the access time oscillating between the past and
   present.  The time_access attribute always means the time of last
   access to a file by a read that was satisfied by the server.  This
   way clients will tend to see only time_access changes that go forward
   in time.

10.7.  Data and Metadata Caching and Memory Mapped Files

   Some operating environments include the capability for an application
   to map a file's content into the application's address space.  Each
   time the application accesses a memory location that corresponds to a
   block that has not been loaded into the address space, a page fault
   occurs and the file is read (or if the block does not exist in the
   file, the block is allocated and then instantiated in the
   application's address space).

   As long as each memory mapped access to the file requires a page
   fault, the relevant attributes of the file that are used to detect
   access and modification (time_access, time_metadata, time_modify, and
   change) will be updated.  However, in many operating environments,
   when page faults are not required these attributes will not be
   updated on reads or updates to the file via memory access (regardless
   whether the file is local file or is being access remotely).  A
   client or server MAY fail to update attributes of a file that is
   being accessed via memory mapped I/O. This has several implications:

   o  If there is an application on the server that has memory mapped a
      file that a client is also accessing, the client may not be able
      to get a consistent value of the change attribute to determine
      whether its cache is stale or not.  A server that knows that the
      file is memory mapped could always pessimistically return updated
      values for change so as to force the application to always get the
      most up to date data and metadata for the file.  However, due to
      the negative performance implications of this, such behavior is
      OPTIONAL.

   o  If the memory mapped file is not being modified on the server, and
      instead is just being read by an application via the memory mapped
      interface, the client will not see an updated time_access
      attribute.  However, in many operating environments, neither will
      any process running on the server.  Thus NFS clients are at no
      disadvantage with respect to local processes.

   o  If there is another client that is memory mapping the file, and if
      that client is holding a OPEN_DELEGATE_WRITE delegation, the same
      set of issues as discussed in the previous two bullet items apply.
      So, when a server does a CB_GETATTR to a file that the client has
      modified in its cache, the response from CB_GETATTR will not
      necessarily be accurate.  As discussed earlier, the client's
      obligation is to report that the file has been modified since the
      delegation was granted, not whether it has been modified again
      between successive CB_GETATTR calls, and the server MUST assume
      that any file the client has modified in cache has been modified
      again between successive CB_GETATTR calls.  Depending on the
      nature of the client's memory management system, this weak
      obligation may not be possible.  A client MAY return stale
      information in CB_GETATTR whenever the file is memory mapped.

   o  The mixture of memory mapping and file locking on the same file is
      problematic.  Consider the following scenario, where the page size
      on each client is 8192 bytes.

      *  Client A memory maps first page (8192 bytes) of file X

      *  Client B memory maps first page (8192 bytes) of file X

      *  Client A write locks first 4096 bytes

      *  Client B write locks second 4096 bytes

      *  Client A, via a STORE instruction modifies part of its locked
         region.

      *  Simultaneous to client A, client B issues a STORE on part of
         its locked region.

   Here the challenge is for each client to resynchronize to get a
   correct view of the first page.  In many operating environments, the
   virtual memory management systems on each client only know a page is
   modified, not that a subset of the page corresponding to the
   respective lock regions has been modified.  So it is not possible for
   each client to do the right thing, which is to only write to the
   server that portion of the page that is locked.  For example, if
   client A simply writes out the page, and then client B writes out the
   page, client A's data is lost.

   Moreover, if mandatory locking is enabled on the file, then we have a
   different problem.  When clients A and B issue the STORE
   instructions, the resulting page faults require a byte-range lock on
   the entire page.  Each client then tries to extend their locked range
   to the entire page, which results in a deadlock.

   Communicating the NFS4ERR_DEADLOCK error to a STORE instruction is
   difficult at best.

   If a client is locking the entire memory mapped file, there is no
   problem with advisory or mandatory byte-range locking, at least until
   the client unlocks a region in the middle of the file.

   Given the above issues the following are permitted:

   o  Clients and servers MAY deny memory mapping a file they know there
      are byte-range locks for.

   o  Clients and servers MAY deny a byte-range lock on a file they know
      is memory mapped.

   o  A client MAY deny memory mapping a file that it knows requires
      mandatory locking for I/O. If mandatory locking is enabled after
      the file is opened and mapped, the client MAY deny the application
      further access to its mapped file.

10.8.  Name Caching

   The results of LOOKUP and READDIR operations may be cached to avoid
   the cost of subsequent LOOKUP operations.  Just as in the case of
   attribute caching, inconsistencies may arise among the various client
   caches.  To mitigate the effects of these inconsistencies and given
   the context of typical filesystem APIs, an upper time boundary is
   maintained on how long a client name cache entry can be kept without
   verifying that the entry has not been made invalid by a directory
   change operation performed by another client.

   When a client is not making changes to a directory for which there
   exist name cache entries, the client needs to periodically fetch
   attributes for that directory to ensure that it is not being
   modified.  After determining that no modification has occurred, the
   expiration time for the associated name cache entries may be updated
   to be the current time plus the name cache staleness bound.

   When a client is making changes to a given directory, it needs to
   determine whether there have been changes made to the directory by
   other clients.  It does this by using the change attribute as
   reported before and after the directory operation in the associated
   change_info4 value returned for the operation.  The server is able to
   communicate to the client whether the change_info4 data is provided
   atomically with respect to the directory operation.  If the change
   values are provided atomically, the client is then able to compare
   the pre-operation change value with the change value in the client's
   name cache.  If the comparison indicates that the directory was
   updated by another client, the name cache associated with the
   modified directory is purged from the client.  If the comparison
   indicates no modification, the name cache can be updated on the
   client