--- 1/draft-ietf-nfsv4-rfc3530bis-09.txt 2011-03-14 20:14:24.000000000 +0100
+++ 2/draft-ietf-nfsv4-rfc3530bis-10.txt 2011-03-14 20:14:24.000000000 +0100
@@ -1,19 +1,19 @@
  NFSv4                                                     T. Haynes, Ed.
  Internet-Draft                                                    NetApp
  Intended status: Standards Track                          D. Noveck, Ed.
  Expires: September 14, 2011                                          EMC
                                                            March 13, 2011

  Network File System (NFS) Version 4 Protocol
- draft-ietf-nfsv4-rfc3530bis-09.txt
+ draft-ietf-nfsv4-rfc3530bis-10.txt

  Abstract

  The Network File System (NFS) version 4 is a distributed filesystem
  protocol which owes heritage to NFS protocol version 2, RFC 1094, and
  version 3, RFC 1813. Unlike earlier versions, the NFS version 4
  protocol supports traditional file access while integrating support
  for file locking and the mount protocol. In addition, support for
  strong security (and its negotiation), compound operations, client
  caching, and internationalization have been added. Of course,
@@ -193,104 +193,104 @@
  9.1.8. Releasing lock_owner State . . . . . . . . . . . . . 121
  9.1.9. Use of Open Confirmation . . . . . . . . . . . . . . 122
  9.2. Lock Ranges . . . . . . . . . . . . . . . . . . . . . . 123
  9.3. Upgrading and Downgrading Locks . . . . . . . . . . . . 123
  9.4. Blocking Locks . . . . . . . . . . . . . . . . . . . . . 124
  9.5. Lease Renewal . . . . . . . . . . . . . . . . . . . . . 125
  9.6. Crash Recovery . . . . . . . . . . . . . . . . . . . . . 126
  9.6.1. Client Failure and Recovery . . . . . . . . . . . . 126
  9.6.2. Server Failure and Recovery . . . . . . . . . . . . 127
  9.6.3. Network Partitions and Recovery . . . . . . . . . . 128
- 9.7. Recovery from a Lock Request Timeout or Abort . . . . . 134
+ 9.7. Recovery from a Lock Request Timeout or Abort . . . . . 135
  9.8. Server Revocation of Locks . . . . . . . . . . . . . . . 135
- 9.9. Share Reservations . . . . . . . . . . . . . . . . . . . 136
- 9.10. OPEN/CLOSE Operations . . . . . . . . . . . . . . . . . 136
- 9.10.1. Close and Retention of State Information . . . . . . 137
- 9.11. Open Upgrade and Downgrade . . . . . . . . . . . . . . . 138
+ 9.9. Share Reservations . . . . . . . . . . . . . . . . . . . 137
+ 9.10. OPEN/CLOSE Operations . . . . . . . . . . . . . . . . . 137
+ 9.10.1. Close and Retention of State Information . . . . . . 138
+ 9.11. Open Upgrade and Downgrade . . . . . . . . . . . . . . . 139
  9.12. Short and Long Leases . . . . . . . . . . . . . . . . . 139
  9.13. Clocks, Propagation Delay, and Calculating Lease
- Expiration . . . . . . . . . . . . . . . . . . . . . . . 139
+ Expiration . . . . . . . . . . . . . . . . . . . . . . . 140
  9.14. Migration, Replication and State . . . . . . . . . . . . 140
- 9.14.1. Migration and State . . . . . . . . . . . . . . . . 140
- 9.14.2. Replication and State . . . . . . . . . . . . . . . 141
- 9.14.3. Notification of Migrated Lease . . . . . . . . . . . 141
- 9.14.4. Migration and the Lease_time Attribute . . . . . . . 142
+ 9.14.1. Migration and State . . . . . . . . . . . . . . . . 141
+ 9.14.2. Replication and State . . . . . . . . . . . . . . . 142
+ 9.14.3. Notification of Migrated Lease . . . . . . . . . . . 142
+ 9.14.4. Migration and the Lease_time Attribute . . . . . . . 143
  10. Client-Side Caching . . . . . . . . . . . . . . . . . . . . . 143
- 10.1. Performance Challenges for Client-Side Caching . . . . . 143
- 10.2. Delegation and Callbacks . . . . . . . . . . . . . . . . 144
- 10.2.1. Delegation Recovery . . . . . . . . . . . . . . . . 146
- 10.3. Data Caching . . . . . . . . . . . . . . . . . . . . . . 148
- 10.3.1. Data Caching and OPENs . . . . . . . . . . . . . . . 148
- 10.3.2. Data Caching and File Locking . . . . . . . . . . . 149
+ 10.1. Performance Challenges for Client-Side Caching . . . . . 144
+ 10.2. Delegation and Callbacks . . . . . . . . . . . . . . . . 145
+ 10.2.1. Delegation Recovery . . . . . . . . . . . . . . . . 147
+ 10.3. Data Caching . . . . . . . . . . . . . . . . . . . . . . 149
+ 10.3.1. Data Caching and OPENs . . . . . . . . . . . . . . . 149
+ 10.3.2. Data Caching and File Locking . . . . . . . . . . . 150
  10.3.3. Data Caching and Mandatory File Locking . . . . . . 151
- 10.3.4. Data Caching and File Identity . . . . . . . . . . . 151
+ 10.3.4. Data Caching and File Identity . . . . . . . . . . . 152
- 10.4. Open Delegation . . . . . . . . . . . . . . . . . . . . 152
+ 10.4. Open Delegation . . . . . . . . . . . . . . . . . . . . 153
  10.4.1. Open Delegation and Data Caching . . . . . . . . . . 155
- 10.4.2. Open Delegation and File Locks . . . . . . . . . . . 156
- 10.4.3. Handling of CB_GETATTR . . . . . . . . . . . . . . . 156
- 10.4.4. Recall of Open Delegation . . . . . . . . . . . . . 159
- 10.4.5. OPEN Delegation Race with CB_RECALL . . . . . . . . 161
- 10.4.6. Clients that Fail to Honor Delegation Recalls . . . 162
- 10.4.7. Delegation Revocation . . . . . . . . . . . . . . . 163
- 10.5. Data Caching and Revocation . . . . . . . . . . . . . . 163
- 10.5.1. Revocation Recovery for Write Open Delegation . . . 164
- 10.6. Attribute Caching . . . . . . . . . . . . . . . . . . . 164
- 10.7. Data and Metadata Caching and Memory Mapped Files . . . 166
+ 10.4.2. Open Delegation and File Locks . . . . . . . . . . . 157
+ 10.4.3. Handling of CB_GETATTR . . . . . . . . . . . . . . . 157
+ 10.4.4. Recall of Open Delegation . . . . . . . . . . . . . 160
+ 10.4.5. OPEN Delegation Race with CB_RECALL . . . . . . . . 162
+ 10.4.6. Clients that Fail to Honor Delegation Recalls . . . 163
+ 10.4.7. Delegation Revocation . . . . . . . . . . . . . . . 164
+ 10.5. Data Caching and Revocation . . . . . . . . . . . . . . 164
+ 10.5.1. Revocation Recovery for Write Open Delegation . . . 165
+ 10.6. Attribute Caching . . . . . . . . . . . . . . . . . . . 165
+ 10.7. Data and Metadata Caching and Memory Mapped Files . . . 167
  10.8. Name Caching . . . . . . . . . . . . . . . . . . . . . . 169
  10.9. Directory Caching . . . . . . . . . . . . . . . . . . . 170
  11. Minor Versioning . . . . . . . . . . . . . . . . . . . . . . 171
- 12. Internationalization . . . . . . . . . . . . . . . . . . . . 173
- 12.1. Use of UTF-8 . . . . . . . . . . . . . . . . . . . . . . 174
- 12.1.1. Relation to Stringprep . . . . . . . . . . . . . . . 174
- 12.1.2. Normalization, Equivalence, and Confusability . . . 175
- 12.2. String Type Overview . . . . . . . . . . . . . . . . . . 178
- 12.2.1. Overall String Class Divisions . . . . . . . . . . . 178
- 12.2.2. Divisions by Typedef Parent types . . . . . . . . . 179
- 12.2.3. Individual Types and Their Handling . . . . . . . . 180
- 12.3. Errors Related to Strings . . . . . . . . . . . . . . . 181
- 12.4. Types with Pre-processing to Resolve Mixture Issues . . 182
- 12.4.1. Processing of Principal Strings . . . . . . . . . . 182
+ 12. Internationalization . . . . . . . . . . . . . . . . . . . . 174
+ 12.1. Use of UTF-8 . . . . . . . . . . . . . . . . . . . . . . 175
+ 12.1.1. Relation to Stringprep . . . . . . . . . . . . . . . 175
+ 12.1.2. Normalization, Equivalence, and Confusability . . . 176
+ 12.2. String Type Overview . . . . . . . . . . . . . . . . . . 179
+ 12.2.1. Overall String Class Divisions . . . . . . . . . . . 179
+ 12.2.2. Divisions by Typedef Parent types . . . . . . . . . 180
+ 12.2.3. Individual Types and Their Handling . . . . . . . . 181
+ 12.3. Errors Related to Strings . . . . . . . . . . . . . . . 182
+ 12.4. Types with Pre-processing to Resolve Mixture Issues . . 183
+ 12.4.1. Processing of Principal Strings . . . . . . . . . . 183
  12.4.2. Processing of Server Id Strings . . . . . . . . . . 183
- 12.5. String Types without Internationalization Processing . . 183
+ 12.5. String Types without Internationalization Processing . . 184
  12.6. Types with Processing Defined by Other Internet Areas . 184
  12.7. String Types with NFS-specific Processing . . . . . . . 185
- 12.7.1. Handling of File Name Components . . . . . . . . . . 185
- 12.7.2. Processing of Link Text . . . . . . . . . . . . . . 194
- 12.7.3. Processing of Principal Prefixes . . . . . . . . . . 195
- 13. Error Values . . . . . . . . . . . . . . . . . . . . . . . . 196
- 13.1. Error Definitions . . . . . . . . . . . . . . . . . . . 196
- 13.1.1. General Errors . . . . . . . . . . . . . . . . . . . 198
- 13.1.2. Filehandle Errors . . . . . . . . . . . . . . . . . 199
+ 12.7.1. Handling of File Name Components . . . . . . . . . . 186
+ 12.7.2. Processing of Link Text . . . . . . . . . . . . . . 195
+ 12.7.3. Processing of Principal Prefixes . . . . . . . . . . 196
+ 13. Error Values . . . . . . . . . . . . . . . . . . . . . . . . 197
+ 13.1. Error Definitions . . . . . . . . . . . . . . . . . . . 197
+ 13.1.1. General Errors . . . . . . . . . . . . . . . . . . . 199
+ 13.1.2. Filehandle Errors . . . . . . . . . . . . . . . . . 200
  13.1.3. Compound Structure Errors . . . . . . . . . . . . . 201
- 13.1.4. File System Errors . . . . . . . . . . . . . . . . . 201
- 13.1.5. State Management Errors . . . . . . . . . . . . . . 203
- 13.1.6. Security Errors . . . . . . . . . . . . . . . . . . 204
+ 13.1.4. File System Errors . . . . . . . . . . . . . . . . . 202
+ 13.1.5. State Management Errors . . . . . . . . . . . . . . 204
+ 13.1.6. Security Errors . . . . . . . . . . . . . . . . . . 205
  13.1.7. Name Errors . . . . . . . . . . . . . . . . . . . . 205
- 13.1.8. Locking Errors . . . . . . . . . . . . . . . . . . . 205
+ 13.1.8. Locking Errors . . . . . . . . . . . . . . . . . . . 206
  13.1.9. Reclaim Errors . . . . . . . . . . . . . . . . . . . 207
- 13.1.10. Client Management Errors . . . . . . . . . . . . . . 207
+ 13.1.10. Client Management Errors . . . . . . . . . . . . . . 208
  13.1.11. Attribute Handling Errors . . . . . . . . . . . . . 208
- 13.2. Operations and their valid errors . . . . . . . . . . . 208
+ 13.2. Operations and their valid errors . . . . . . . . . . . 209
  13.3. Callback operations and their valid errors . . . . . . . 216
  13.4. Errors and the operations that use them . . . . . . . . 216
- 14. NFSv4 Requests . . . . . . . . . . . . . . . . . . . . . . . 220
+ 14. NFSv4 Requests . . . . . . . . . . . . . . . . . . . . . . . 221
  14.1. Compound Procedure . . . . . . . . . . . . . . . . . . . 221
- 14.2. Evaluation of a Compound Request . . . . . . . . . . . . 221
- 14.3. Synchronous Modifying Operations . . . . . . . . . . . . 222
+ 14.2. Evaluation of a Compound Request . . . . . . . . . . . . 222
+ 14.3. Synchronous Modifying Operations . . . . . . . . . . . . 223
  14.4. Operation Values . . . . . . . . . . . . . . . . . . . . 223
  15. NFSv4 Procedures . . . . . . . . . . . . . . . . . . . . . . 223
  15.1. Procedure 0: NULL - No Operation . . . . . . . . . . . . 223
- 15.2. Procedure 1: COMPOUND - Compound Operations . . . . . . 223
+ 15.2. Procedure 1: COMPOUND - Compound Operations . . . . . . 224
  15.3. Operation 3: ACCESS - Check Access Rights . . . . . . . 229
- 15.4. Operation 4: CLOSE - Close File . . . . . . . . . . . . 231
- 15.5. Operation 5: COMMIT - Commit Cached Data . . . . . . . . 232
+ 15.4. Operation 4: CLOSE - Close File . . . . . . . . . . . . 232
+ 15.5. Operation 5: COMMIT - Commit Cached Data . . . . . . . . 233
  15.6. Operation 6: CREATE - Create a Non-Regular File Object . 235
  15.7. Operation 7: DELEGPURGE - Purge Delegations Awaiting
  Recovery . . . . . . . . . . . . . . . . . . . . . . . . 238
  15.8. Operation 8: DELEGRETURN - Return Delegation . . . . . . 239
  15.9. Operation 9: GETATTR - Get Attributes . . . . . . . . . 239
  15.10. Operation 10: GETFH - Get Current Filehandle . . . . . . 241
  15.11. Operation 11: LINK - Create Link to a File . . . . . . . 242
  15.12. Operation 12: LOCK - Create Lock . . . . . . . . . . . . 243
  15.13. Operation 13: LOCKT - Test For Lock . . . . . . . . . . 247
  15.14. Operation 14: LOCKU - Unlock File . . . . . . . . . . . . 249
@@ -6055,22 +6055,39 @@
  If the server does not reboot before the network partition is healed,
  when the original client tries to access a courtesy lock which was
  freed, the server SHOULD send back a NFS4ERR_BAD_STATEID to the
  client. If the client tries to access a courtesy lock which was not
  freed, then the server SHOULD mark all of the courtesy locks as
  implicitly being renewed.

  When a network partition is combined with a server reboot, then both
  the server and client have responsibilities to ensure that the client
  does not reclaim a lock which it should no longer be able to access.
- The next sections illustrate examples of these edge conditions and
- the steps necessary to be undertaken to ensure proper lock semantics.
+ Briefly those are:
+
+ o Client's responsibility: A client MUST NOT attempt to reclaim any
+ locks which it did not hold at the end of its most recent
+ successfully established client lease.
+
+ o Server's responsibility: A server MUST NOT allow a client to
+ reclaim a lock unless it knows that it could not have since
+ granted a conflicting lock. However, in deciding whether a
+ conflicting lock could have been granted, it is permitted to
+ assume its clients are responsible, as above.
+
+ A server may consider a client's lease "successfully established"
+ once it has received an open operation from that client.
+
+ The next sections give examples showing what can go wrong if these
+ responsibilites are neglected, and provides examples of server
+ implementation strategies that could meet a server's
+ responsibilities.

  9.6.3.1.1. First Server Edge Condition

  The first edge condition has the following scenario:

  1. Client A acquires a lock.

  2. Client A and server experience mutual network partition, such
  that client A is unable to renew its lease.
@@ -6122,20 +6139,28 @@
  NFS4ERR_STALE_CLIENTID.

  10. Client A reclaims its lock within the server's grace period.
  As with the first edge condition, the final step of the scenario of
  the second edge condition has the server erroneously granting client
  A's lock reclaim.

  9.6.3.1.3. Handling Server Edge Conditions

+ In both of the above examples, the client attempts reclaim of a lock
+ that it held at the end of its most recent successfully established
+ lease; thus, it has fulfilled its responsibility.
+
+ The server, however, has failed, by granting a reclaim, despite
+ having granted a conflicting lock since the reclaimed lock was last
+ held.
+
  Solving these edge conditions requires that the server either assume
  after it reboots that edge condition occurs, and thus return
  NFS4ERR_NO_GRACE for all reclaim attempts, or that the server record
  some information in stable storage. The amount of information the
  server records in stable storage is in inverse proportion to how
  harsh the server wants to be whenever the edge conditions occur. The
  server that is completely tolerant of all edge conditions will record
  in stable storage every lock that is acquired, removing the lock
  record from stable storage only when the lock is unlocked by the
  client and the lock's lockowner advances the sequence number such
@@ -6174,29 +6199,28 @@
  with the error NFS4ERR_NO_GRACE.

  Regardless of the level and approach to record keeping, the server
  MUST implement one of the following strategies (which apply to
  reclaims of share reservations, byte-range locks, and delegations):

  1. Reject all reclaims with NFS4ERR_NO_GRACE. This is super harsh,
  but necessary if the server does not want to record lock state in
  stable storage.

- 2. Record sufficient state in stable storage such that all known
- edge conditions involving server reboot, including the two noted
- in this section, are detected. False positives are acceptable.
+ 2. Record sufficient state in stable storage to meet its
+ responsibilities. In doubt, the server should err on the side of
+ being harsh.
- Note that at this time, it is not known if there are other edge
- conditions. In the event, after a server reboot, the server
- determines that there is unrecoverable damage or corruption to
- the the stable storage, then for all clients and/or locks
- affected, the server MUST return NFS4ERR_NO_GRACE.
+ In the event that, after a server reboot, the server determines
+ that there is unrecoverable damage or corruption to the the
+ stable storage, then for all clients and/or locks affected, the
+ server MUST return NFS4ERR_NO_GRACE.

  9.6.3.1.4. Client Edge Condition

  A third edge condition effects the client and not the server. If the
  server reboots in the middle of the client reclaiming some locks and
  then a network partition is established, the client might be in the
  situation of having reclaimed some, but not all locks. In that case,
  a conservative client would assume that the non-reclaimed locks were
  revoked.
@@ -6231,27 +6255,33 @@
  12. Client A issues a RENEW operation, and gets back a
  NFS4ERR_STALE_CLIENTID.

  13. Client A reclaims both lock 1 and lock 2 within the server's
  grace period.

  At the last step, the client reclaims lock 2 as if it had held that
  lock continuously, when in fact a conflicting lock was granted to
  client B.

+ This occurs because the client failed its responsibility, by
+ attempting to reclaim lock 2 even though it had not held that lock at
+ the end of the lease that was established by the SETCLIENTID after
+ the first server reboot. (The client did hold lock 2 on a previous
+ lease. But it is only the most recent lease that matters.)
+
  A server could avoid this situation by rejecting the reclaim of lock
  2. However, to do so accurately it would have to ensure that
  additional information about individual locks held survives reboot.
  Server implementations are not required to do that, so the client
  must not assume that the server will.
- Instead, a client MUST reclaim only those locks which it succesfully
+ Instead, a client MUST reclaim only those locks which it successfully
  acquired from the previous server instance, omitting any that it
  failed to reclaim before a new reboot. Thus, in the last step above,
  client A should reclaim only lock 1.

  9.6.3.1.5. Client's Handling of NFS4ERR_NO_GRACE

  A mandate for the client's handling of the NFS4ERR_NO_GRACE error is
  outside the scope of this specification, since the strategies for
  such handling are very dependent on the client's operating
  environment. However, one potential approach is described below.