| draft-ietf-nfsv4-rfc3010bis-02.txt | | draft-ietf-nfsv4-rfc3010bis-03.txt | |
| | | | |
| NFS version 4 Working Group S. Shepler | | NFS version 4 Working Group S. Shepler | |
| INTERNET-DRAFT Sun Microsystems, Inc. | | INTERNET-DRAFT Sun Microsystems, Inc. | |
|
| Document: draft-ietf-nfsv4-rfc3010bis-02.txt C. Beame | | Obsoletes: 3010 C. Beame | |
| Hummingbird Ltd. | | Document: draft-ietf-nfsv4-rfc3010bis-03.txt Hummingbird Ltd. | |
| B. Callaghan | | B. Callaghan | |
| Sun Microsystems, Inc. | | Sun Microsystems, Inc. | |
| M. Eisler | | M. Eisler | |
| Network Appliance, Inc. | | Network Appliance, Inc. | |
| D. Noveck | | D. Noveck | |
| Network Appliance, Inc. | | Network Appliance, Inc. | |
| D. Robinson | | D. Robinson | |
| Sun Microsystems, Inc. | | Sun Microsystems, Inc. | |
| R. Thurlow | | R. Thurlow | |
| Sun Microsystems, Inc. | | Sun Microsystems, Inc. | |
|
| August 2002 | | September 2002 | |
| | | | |
| NFS version 4 Protocol | | NFS version 4 Protocol | |
| | | | |
| Status of this Memo | | Status of this Memo | |
| | | | |
| This document is an Internet-Draft and is in full conformance with | | This document is an Internet-Draft and is in full conformance with | |
| all provisions of Section 10 of RFC2026. | | all provisions of Section 10 of RFC2026. | |
| | | | |
| Internet-Drafts are working documents of the Internet Engineering | | Internet-Drafts are working documents of the Internet Engineering | |
| Task Force (IETF), its areas, and its working groups. Note that | | Task Force (IETF), its areas, and its working groups. Note that | |
| | | | |
| skipping to change at page 2, line 5 | | skipping to change at page 2, line 5 | |
| http://www.ietf.org/ietf/1id-abstracts.txt | | http://www.ietf.org/ietf/1id-abstracts.txt | |
| | | | |
| The list of Internet-Draft Shadow Directories can be accessed at | | The list of Internet-Draft Shadow Directories can be accessed at | |
| http://www.ietf.org/shadow.html. | | http://www.ietf.org/shadow.html. | |
| | | | |
| Abstract | | Abstract | |
| | | | |
| NFS version 4 is a distributed filesystem protocol which owes | | NFS version 4 is a distributed filesystem protocol which owes | |
| heritage to NFS protocol versions 2 [RFC1094] and 3 [RFC1813]. | | heritage to NFS protocol versions 2 [RFC1094] and 3 [RFC1813]. | |
| | | | |
|
| Draft Specification NFS version 4 Protocol August 2002 | | Draft Specification NFS version 4 Protocol September 2002 | |
| | | | |
| Unlike earlier versions, the NFS version 4 protocol supports | | Unlike earlier versions, the NFS version 4 protocol supports | |
| traditional file access while integrating support for file locking | | traditional file access while integrating support for file locking | |
| and the mount protocol. In addition, support for strong security | | and the mount protocol. In addition, support for strong security | |
| (and its negotiation), compound operations, client caching, and | | (and its negotiation), compound operations, client caching, and | |
| internationalization have been added. Of course, attention has been | | internationalization have been added. Of course, attention has been | |
| applied to making NFS version 4 operate well in an Internet | | applied to making NFS version 4 operate well in an Internet | |
| environment. | | environment. | |
| | | | |
| Copyright | | Copyright | |
| | | | |
| Copyright (C) The Internet Society (2000-2002). All Rights Reserved. | | Copyright (C) The Internet Society (2000-2002). All Rights Reserved. | |
| | | | |
| Key Words | | Key Words | |
| | | | |
| The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |
| "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | |
| document are to be interpreted as described in [RFC2119]. | | document are to be interpreted as described in [RFC2119]. | |
| | | | |
| Table of Contents | | Table of Contents | |
| | | | |
| 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 7 | | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 7 | |
| 1.1. Inconsistencies of this Document with Section 18 . . . . . 7 | | 1.1. Inconsistencies of this Document with Section 18 . . . . . 7 | |
| 1.2. Overview of NFS version 4 Features . . . . . . . . . . . . 8 | | 1.2. Overview of NFS version 4 Features . . . . . . . . . . . . 8 | |
| 1.2.1. RPC and Security . . . . . . . . . . . . . . . . . . . . 8 | | 1.2.1. RPC and Security . . . . . . . . . . . . . . . . . . . . 8 | |
| 1.2.2. Procedure and Operation Structure . . . . . . . . . . . 8 | | 1.2.2. Procedure and Operation Structure . . . . . . . . . . . 8 | |
| 1.2.3. Filesystem Model . . . . . . . . . . . . . . . . . . . . 9 | | 1.2.3. Filesystem Model . . . . . . . . . . . . . . . . . . . . 9 | |
| 1.2.3.1. Filehandle Types . . . . . . . . . . . . . . . . . . . 9 | | 1.2.3.1. Filehandle Types . . . . . . . . . . . . . . . . . . . 9 | |
| | | | |
| skipping to change at page 3, line 34 | | skipping to change at page 3, line 34 | |
| 2.2. Structured Data Types . . . . . . . . . . . . . . . . . 15 | | 2.2. Structured Data Types . . . . . . . . . . . . . . . . . 15 | |
| 3. RPC and Security Flavor . . . . . . . . . . . . . . . . . 21 | | 3. RPC and Security Flavor . . . . . . . . . . . . . . . . . 21 | |
| 3.1. Ports and Transports . . . . . . . . . . . . . . . . . . 21 | | 3.1. Ports and Transports . . . . . . . . . . . . . . . . . . 21 | |
| 3.1.1. Client Retransmission Behavior . . . . . . . . . . . . 21 | | 3.1.1. Client Retransmission Behavior . . . . . . . . . . . . 21 | |
| 3.2. Security Flavors . . . . . . . . . . . . . . . . . . . . 22 | | 3.2. Security Flavors . . . . . . . . . . . . . . . . . . . . 22 | |
| 3.2.1. Security mechanisms for NFS version 4 . . . . . . . . 22 | | 3.2.1. Security mechanisms for NFS version 4 . . . . . . . . 22 | |
| 3.2.1.1. Kerberos V5 as a security triple . . . . . . . . . . 22 | | 3.2.1.1. Kerberos V5 as a security triple . . . . . . . . . . 22 | |
| 3.2.1.2. LIPKEY as a security triple . . . . . . . . . . . . 23 | | 3.2.1.2. LIPKEY as a security triple . . . . . . . . . . . . 23 | |
| 3.2.1.3. SPKM-3 as a security triple . . . . . . . . . . . . 24 | | 3.2.1.3. SPKM-3 as a security triple . . . . . . . . . . . . 24 | |
| 3.3. Security Negotiation . . . . . . . . . . . . . . . . . . 24 | | 3.3. Security Negotiation . . . . . . . . . . . . . . . . . . 24 | |
|
| 3.3.1. SECINFO . . . . . . . . . . . . . . . . . . . . . . . 25 | | 3.3.1. SECINFO . . . . . . . . . . . . . . . . . . . . . . . 24 | |
| 3.3.2. Security Error . . . . . . . . . . . . . . . . . . . . 25 | | 3.3.2. Security Error . . . . . . . . . . . . . . . . . . . . 25 | |
| 3.4. Callback RPC Authentication . . . . . . . . . . . . . . 25 | | 3.4. Callback RPC Authentication . . . . . . . . . . . . . . 25 | |
|
| 4. Filehandles . . . . . . . . . . . . . . . . . . . . . . . 28 | | 4. Filehandles . . . . . . . . . . . . . . . . . . . . . . . 27 | |
| 4.1. Obtaining the First Filehandle . . . . . . . . . . . . . 28 | | 4.1. Obtaining the First Filehandle . . . . . . . . . . . . . 27 | |
| 4.1.1. Root Filehandle . . . . . . . . . . . . . . . . . . . 28 | | 4.1.1. Root Filehandle . . . . . . . . . . . . . . . . . . . 27 | |
| 4.1.2. Public Filehandle . . . . . . . . . . . . . . . . . . 28 | | 4.1.2. Public Filehandle . . . . . . . . . . . . . . . . . . 27 | |
| 4.2. Filehandle Types . . . . . . . . . . . . . . . . . . . . 29 | | 4.2. Filehandle Types . . . . . . . . . . . . . . . . . . . . 28 | |
| 4.2.1. General Properties of a Filehandle . . . . . . . . . . 29 | | 4.2.1. General Properties of a Filehandle . . . . . . . . . . 28 | |
| 4.2.2. Persistent Filehandle . . . . . . . . . . . . . . . . 30 | | 4.2.2. Persistent Filehandle . . . . . . . . . . . . . . . . 29 | |
| 4.2.3. Volatile Filehandle . . . . . . . . . . . . . . . . . 30 | | 4.2.3. Volatile Filehandle . . . . . . . . . . . . . . . . . 29 | |
| 4.2.4. One Method of Constructing a Volatile Filehandle . . . 31 | | 4.2.4. One Method of Constructing a Volatile Filehandle . . . 30 | |
| 4.3. Client Recovery from Filehandle Expiration . . . . . . . 32 | | 4.3. Client Recovery from Filehandle Expiration . . . . . . . 31 | |
| 5. File Attributes . . . . . . . . . . . . . . . . . . . . . 34 | | 5. File Attributes . . . . . . . . . . . . . . . . . . . . . 33 | |
| 5.1. Mandatory Attributes . . . . . . . . . . . . . . . . . . 35 | | 5.1. Mandatory Attributes . . . . . . . . . . . . . . . . . . 34 | |
| 5.2. Recommended Attributes . . . . . . . . . . . . . . . . . 35 | | 5.2. Recommended Attributes . . . . . . . . . . . . . . . . . 34 | |
| 5.3. Named Attributes . . . . . . . . . . . . . . . . . . . . 35 | | 5.3. Named Attributes . . . . . . . . . . . . . . . . . . . . 34 | |
| 5.4. Classification of Attributes . . . . . . . . . . . . . . 36 | | 5.4. Classification of Attributes . . . . . . . . . . . . . . 35 | |
| 5.5. Mandatory Attributes - Definitions . . . . . . . . . . . 38 | | 5.5. Mandatory Attributes - Definitions . . . . . . . . . . . 37 | |
| 5.6. Recommended Attributes - Definitions . . . . . . . . . . 40 | | 5.6. Recommended Attributes - Definitions . . . . . . . . . . 39 | |
| 5.7. Time Access . . . . . . . . . . . . . . . . . . . . . . 45 | | 5.7. Time Access . . . . . . . . . . . . . . . . . . . . . . 44 | |
| 5.8. Interpreting owner and owner_group . . . . . . . . . . . 45 | | 5.8. Interpreting owner and owner_group . . . . . . . . . . . 44 | |
| 5.9. Character Case Attributes . . . . . . . . . . . . . . . 47 | | 5.9. Character Case Attributes . . . . . . . . . . . . . . . 46 | |
| 5.10. Quota Attributes . . . . . . . . . . . . . . . . . . . 47 | | 5.10. Quota Attributes . . . . . . . . . . . . . . . . . . . 46 | |
| 5.11. Access Control Lists . . . . . . . . . . . . . . . . . 48 | | 5.11. Access Control Lists . . . . . . . . . . . . . . . . . 47 | |
| 5.11.1. ACE type . . . . . . . . . . . . . . . . . . . . . . 49 | | 5.11.1. ACE type . . . . . . . . . . . . . . . . . . . . . . 48 | |
| 5.11.2. ACE Access Mask . . . . . . . . . . . . . . . . . . . 50 | | 5.11.2. ACE Access Mask . . . . . . . . . . . . . . . . . . . 49 | |
| 5.11.3. ACE flag . . . . . . . . . . . . . . . . . . . . . . 52 | | 5.11.3. ACE flag . . . . . . . . . . . . . . . . . . . . . . 51 | |
| 5.11.4. ACE who . . . . . . . . . . . . . . . . . . . . . . . 53 | | 5.11.4. ACE who . . . . . . . . . . . . . . . . . . . . . . . 52 | |
| 5.11.5. Mode Attribute . . . . . . . . . . . . . . . . . . . 54 | | 5.11.5. Mode Attribute . . . . . . . . . . . . . . . . . . . 53 | |
| 5.11.6. Mode and ACL Attribute . . . . . . . . . . . . . . . 55 | | 5.11.6. Mode and ACL Attribute . . . . . . . . . . . . . . . 54 | |
| 5.11.7. mounted_on_fileid . . . . . . . . . . . . . . . . . . 55 | | 5.11.7. mounted_on_fileid . . . . . . . . . . . . . . . . . . 54 | |
| 6. Filesystem Migration and Replication . . . . . . . . . . . 57 | | 6. Filesystem Migration and Replication . . . . . . . . . . . 56 | |
| 6.1. Replication . . . . . . . . . . . . . . . . . . . . . . 57 | | 6.1. Replication . . . . . . . . . . . . . . . . . . . . . . 56 | |
| 6.2. Migration . . . . . . . . . . . . . . . . . . . . . . . 57 | | 6.2. Migration . . . . . . . . . . . . . . . . . . . . . . . 56 | |
| 6.3. Interpretation of the fs_locations Attribute . . . . . . 58 | | 6.3. Interpretation of the fs_locations Attribute . . . . . . 57 | |
| 6.4. Filehandle Recovery for Migration or Replication . . . . 59 | | 6.4. Filehandle Recovery for Migration or Replication . . . . 58 | |
| 7. NFS Server Name Space . . . . . . . . . . . . . . . . . . 60 | | 7. NFS Server Name Space . . . . . . . . . . . . . . . . . . 59 | |
| 7.1. Server Exports . . . . . . . . . . . . . . . . . . . . . 60 | | 7.1. Server Exports . . . . . . . . . . . . . . . . . . . . . 59 | |
| 7.2. Browsing Exports . . . . . . . . . . . . . . . . . . . . 60 | | 7.2. Browsing Exports . . . . . . . . . . . . . . . . . . . . 59 | |
| 7.3. Server Pseudo Filesystem . . . . . . . . . . . . . . . . 60 | | 7.3. Server Pseudo Filesystem . . . . . . . . . . . . . . . . 59 | |
| 7.4. Multiple Roots . . . . . . . . . . . . . . . . . . . . . 61 | | 7.4. Multiple Roots . . . . . . . . . . . . . . . . . . . . . 60 | |
| 7.5. Filehandle Volatility . . . . . . . . . . . . . . . . . 61 | | 7.5. Filehandle Volatility . . . . . . . . . . . . . . . . . 60 | |
| 7.6. Exported Root . . . . . . . . . . . . . . . . . . . . . 61 | | 7.6. Exported Root . . . . . . . . . . . . . . . . . . . . . 60 | |
| 7.7. Mount Point Crossing . . . . . . . . . . . . . . . . . . 62 | | 7.7. Mount Point Crossing . . . . . . . . . . . . . . . . . . 61 | |
| 7.8. Security Policy and Name Space Presentation . . . . . . 62 | | 7.8. Security Policy and Name Space Presentation . . . . . . 61 | |
| 8. File Locking and Share Reservations . . . . . . . . . . . 64 | | 8. File Locking and Share Reservations . . . . . . . . . . . 63 | |
| 8.1. Locking . . . . . . . . . . . . . . . . . . . . . . . . 64 | | 8.1. Locking . . . . . . . . . . . . . . . . . . . . . . . . 63 | |
| 8.1.1. Client ID . . . . . . . . . . . . . . . . . . . . . . 64 | | 8.1.1. Client ID . . . . . . . . . . . . . . . . . . . . . . 63 | |
| 8.1.2. Server Release of Clientid . . . . . . . . . . . . . . 67 | | 8.1.2. Server Release of Clientid . . . . . . . . . . . . . . 66 | |
| 8.1.3. lock_owner and stateid Definition . . . . . . . . . . 68 | | 8.1.3. lock_owner and stateid Definition . . . . . . . . . . 67 | |
| 8.1.4. Use of the stateid and Locking . . . . . . . . . . . . 69 | | 8.1.4. Use of the stateid and Locking . . . . . . . . . . . . 68 | |
| 8.1.5. Sequencing of Lock Requests . . . . . . . . . . . . . 71 | | 8.1.5. Sequencing of Lock Requests . . . . . . . . . . . . . 70 | |
| 8.1.6. Recovery from Replayed Requests . . . . . . . . . . . 72 | | 8.1.6. Recovery from Replayed Requests . . . . . . . . . . . 71 | |
| 8.1.7. Releasing lock_owner State . . . . . . . . . . . . . . 72 | | 8.1.7. Releasing lock_owner State . . . . . . . . . . . . . . 72 | |
|
| 8.1.8. Use of Open Confirmation . . . . . . . . . . . . . . . 73 | | 8.1.8. Use of Open Confirmation . . . . . . . . . . . . . . . 72 | |
| 8.2. Lock Ranges . . . . . . . . . . . . . . . . . . . . . . 74 | | 8.2. Lock Ranges . . . . . . . . . . . . . . . . . . . . . . 73 | |
| 8.3. Upgrading and Downgrading Locks . . . . . . . . . . . . 74 | | 8.3. Upgrading and Downgrading Locks . . . . . . . . . . . . 73 | |
| 8.4. Blocking Locks . . . . . . . . . . . . . . . . . . . . . 75 | | 8.4. Blocking Locks . . . . . . . . . . . . . . . . . . . . . 74 | |
| 8.5. Lease Renewal . . . . . . . . . . . . . . . . . . . . . 75 | | 8.5. Lease Renewal . . . . . . . . . . . . . . . . . . . . . 74 | |
| 8.6. Crash Recovery . . . . . . . . . . . . . . . . . . . . . 76 | | 8.6. Crash Recovery . . . . . . . . . . . . . . . . . . . . . 75 | |
| 8.6.1. Client Failure and Recovery . . . . . . . . . . . . . 76 | | 8.6.1. Client Failure and Recovery . . . . . . . . . . . . . 76 | |
|
| 8.6.2. Server Failure and Recovery . . . . . . . . . . . . . 77 | | 8.6.2. Server Failure and Recovery . . . . . . . . . . . . . 76 | |
| 8.6.3. Network Partitions and Recovery . . . . . . . . . . . 79 | | 8.6.3. Network Partitions and Recovery . . . . . . . . . . . 78 | |
| 8.7. Recovery from a Lock Request Timeout or Abort . . . . . 80 | | 8.7. Recovery from a Lock Request Timeout or Abort . . . . . 81 | |
| 8.8. Server Revocation of Locks . . . . . . . . . . . . . . . 80 | | 8.8. Server Revocation of Locks . . . . . . . . . . . . . . . 82 | |
| 8.9. Share Reservations . . . . . . . . . . . . . . . . . . . 81 | | 8.9. Share Reservations . . . . . . . . . . . . . . . . . . . 83 | |
| 8.10. OPEN/CLOSE Operations . . . . . . . . . . . . . . . . . 82 | | 8.10. OPEN/CLOSE Operations . . . . . . . . . . . . . . . . . 83 | |
| 8.10.1. Close and Retention of State Information . . . . . . 83 | | 8.10.1. Close and Retention of State Information . . . . . . 84 | |
| 8.11. Open Upgrade and Downgrade . . . . . . . . . . . . . . 83 | | 8.11. Open Upgrade and Downgrade . . . . . . . . . . . . . . 85 | |
| 8.12. Short and Long Leases . . . . . . . . . . . . . . . . . 84 | | 8.12. Short and Long Leases . . . . . . . . . . . . . . . . . 85 | |
| 8.13. Clocks, Propagation Delay, and Calculating Lease | | 8.13. Clocks, Propagation Delay, and Calculating Lease | |
|
| Expiration . . . . . . . . . . . . . . . . . . . . . . 84 | | Expiration . . . . . . . . . . . . . . . . . . . . . . 86 | |
| 8.14. Migration, Replication and State . . . . . . . . . . . 85 | | 8.14. Migration, Replication and State . . . . . . . . . . . 86 | |
| 8.14.1. Migration and State . . . . . . . . . . . . . . . . . 85 | | 8.14.1. Migration and State . . . . . . . . . . . . . . . . . 87 | |
| 8.14.2. Replication and State . . . . . . . . . . . . . . . . 86 | | 8.14.2. Replication and State . . . . . . . . . . . . . . . . 87 | |
| 8.14.3. Notification of Migrated Lease . . . . . . . . . . . 86 | | 8.14.3. Notification of Migrated Lease . . . . . . . . . . . 88 | |
| 8.14.4. Migration and the Lease_time Attribute . . . . . . . 87 | | 8.14.4. Migration and the Lease_time Attribute . . . . . . . 88 | |
| 9. Client-Side Caching . . . . . . . . . . . . . . . . . . . 88 | | 9. Client-Side Caching . . . . . . . . . . . . . . . . . . . 90 | |
| 9.1. Performance Challenges for Client-Side Caching . . . . . 88 | | 9.1. Performance Challenges for Client-Side Caching . . . . . 90 | |
| 9.2. Delegation and Callbacks . . . . . . . . . . . . . . . . 89 | | 9.2. Delegation and Callbacks . . . . . . . . . . . . . . . . 91 | |
| 9.2.1. Delegation Recovery . . . . . . . . . . . . . . . . . 90 | | 9.2.1. Delegation Recovery . . . . . . . . . . . . . . . . . 92 | |
| 9.3. Data Caching . . . . . . . . . . . . . . . . . . . . . . 92 | | 9.3. Data Caching . . . . . . . . . . . . . . . . . . . . . . 94 | |
| 9.3.1. Data Caching and OPENs . . . . . . . . . . . . . . . . 92 | | 9.3.1. Data Caching and OPENs . . . . . . . . . . . . . . . . 94 | |
| 9.3.2. Data Caching and File Locking . . . . . . . . . . . . 93 | | 9.3.2. Data Caching and File Locking . . . . . . . . . . . . 95 | |
| 9.3.3. Data Caching and Mandatory File Locking . . . . . . . 95 | | 9.3.3. Data Caching and Mandatory File Locking . . . . . . . 97 | |
| 9.3.4. Data Caching and File Identity . . . . . . . . . . . . 95 | | 9.3.4. Data Caching and File Identity . . . . . . . . . . . . 97 | |
| 9.4. Open Delegation . . . . . . . . . . . . . . . . . . . . 96 | | 9.4. Open Delegation . . . . . . . . . . . . . . . . . . . . 98 | |
| 9.4.1. Open Delegation and Data Caching . . . . . . . . . . . 99 | | 9.4.1. Open Delegation and Data Caching . . . . . . . . . . . 101 | |
| 9.4.2. Open Delegation and File Locks . . . . . . . . . . . . 100 | | 9.4.2. Open Delegation and File Locks . . . . . . . . . . . . 102 | |
| 9.4.3. Handling of CB_GETATTR . . . . . . . . . . . . . . . . 100 | | 9.4.3. Handling of CB_GETATTR . . . . . . . . . . . . . . . . 102 | |
| 9.4.4. Recall of Open Delegation . . . . . . . . . . . . . . 102 | | 9.4.4. Recall of Open Delegation . . . . . . . . . . . . . . 105 | |
| 9.4.5. Delegation Revocation . . . . . . . . . . . . . . . . 104 | | 9.4.5. Clients that Fail to Honor Delegation Recalls . . . . 107 | |
| 9.5. Data Caching and Revocation . . . . . . . . . . . . . . 104 | | 9.4.6. Delegation Revocation . . . . . . . . . . . . . . . . 107 | |
| 9.5.1. Revocation Recovery for Write Open Delegation . . . . 104 | | 9.5. Data Caching and Revocation . . . . . . . . . . . . . . 108 | |
| 9.6. Attribute Caching . . . . . . . . . . . . . . . . . . . 105 | | 9.5.1. Revocation Recovery for Write Open Delegation . . . . 108 | |
| 9.7. Name Caching . . . . . . . . . . . . . . . . . . . . . . 107 | | 9.6. Attribute Caching . . . . . . . . . . . . . . . . . . . 109 | |
| 9.8. Directory Caching . . . . . . . . . . . . . . . . . . . 108 | | 9.7. Data and Metadata Caching and Memory Mapped Files . . . 111 | |
| 10. Minor Versioning . . . . . . . . . . . . . . . . . . . . 110 | | 9.8. Name Caching . . . . . . . . . . . . . . . . . . . . . . 113 | |
| 11. Internationalization . . . . . . . . . . . . . . . . . . 113 | | 9.9. Directory Caching . . . . . . . . . . . . . . . . . . . 114 | |
| 11.1. Universal Versus Local Character Sets . . . . . . . . . 113 | | 10. Minor Versioning . . . . . . . . . . . . . . . . . . . . 116 | |
| 11.2. Overview of Universal Character Set Standards . . . . . 114 | | 11. Internationalization . . . . . . . . . . . . . . . . . . 119 | |
| 11.3. Difficulties with UCS-4, UCS-2, Unicode . . . . . . . . 115 | | 11.1. Universal Versus Local Character Sets . . . . . . . . . 119 | |
| 11.4. UTF-8 and its solutions . . . . . . . . . . . . . . . . 115 | | 11.2. Overview of Universal Character Set Standards . . . . . 120 | |
| 11.5. Normalization . . . . . . . . . . . . . . . . . . . . . 116 | | 11.3. Difficulties with UCS-4, UCS-2, Unicode . . . . . . . . 121 | |
| 11.6. UTF-8 Related Errors . . . . . . . . . . . . . . . . . 116 | | 11.4. UTF-8 and its solutions . . . . . . . . . . . . . . . . 121 | |
| 12. Error Definitions . . . . . . . . . . . . . . . . . . . . 118 | | 11.5. Normalization . . . . . . . . . . . . . . . . . . . . . 122 | |
| 13. NFS version 4 Requests . . . . . . . . . . . . . . . . . 124 | | 11.6. UTF-8 Related Errors . . . . . . . . . . . . . . . . . 122 | |
| 13.1. Compound Procedure . . . . . . . . . . . . . . . . . . 124 | | 12. Error Definitions . . . . . . . . . . . . . . . . . . . . 124 | |
| 13.2. Evaluation of a Compound Request . . . . . . . . . . . 125 | | 13. NFS version 4 Requests . . . . . . . . . . . . . . . . . 130 | |
| 13.3. Synchronous Modifying Operations . . . . . . . . . . . 125 | | 13.1. Compound Procedure . . . . . . . . . . . . . . . . . . 130 | |
| 13.4. Operation Values . . . . . . . . . . . . . . . . . . . 126 | | 13.2. Evaluation of a Compound Request . . . . . . . . . . . 131 | |
| 14. NFS version 4 Procedures . . . . . . . . . . . . . . . . 127 | | 13.3. Synchronous Modifying Operations . . . . . . . . . . . 131 | |
| 14.1. Procedure 0: NULL - No Operation . . . . . . . . . . . 127 | | 13.4. Operation Values . . . . . . . . . . . . . . . . . . . 132 | |
| 14.2. Procedure 1: COMPOUND - Compound Operations . . . . . . 128 | | 14. NFS version 4 Procedures . . . . . . . . . . . . . . . . 133 | |
| 14.2.1. Operation 3: ACCESS - Check Access Rights . . . . . . 131 | | 14.1. Procedure 0: NULL - No Operation . . . . . . . . . . . 133 | |
| 14.2.2. Operation 4: CLOSE - Close File . . . . . . . . . . . 134 | | 14.2. Procedure 1: COMPOUND - Compound Operations . . . . . . 134 | |
| 14.2.3. Operation 5: COMMIT - Commit Cached Data . . . . . . 136 | | 14.2.1. Operation 3: ACCESS - Check Access Rights . . . . . . 137 | |
| 14.2.4. Operation 6: CREATE - Create a Non-Regular File Object 139 | | 14.2.2. Operation 4: CLOSE - Close File . . . . . . . . . . . 140 | |
| | | 14.2.3. Operation 5: COMMIT - Commit Cached Data . . . . . . 142 | |
| | | 14.2.4. Operation 6: CREATE - Create a Non-Regular File Object 145 | |
| 14.2.5. Operation 7: DELEGPURGE - Purge Delegations Awaiting | | 14.2.5. Operation 7: DELEGPURGE - Purge Delegations Awaiting | |
|
| Recovery . . . . . . . . . . . . . . . . . . . . . . 142 | | Recovery . . . . . . . . . . . . . . . . . . . . . . 148 | |
| 14.2.6. Operation 8: DELEGRETURN - Return Delegation . . . . 143 | | 14.2.6. Operation 8: DELEGRETURN - Return Delegation . . . . 150 | |
| 14.2.7. Operation 9: GETATTR - Get Attributes . . . . . . . . 144 | | 14.2.7. Operation 9: GETATTR - Get Attributes . . . . . . . . 151 | |
| 14.2.8. Operation 10: GETFH - Get Current Filehandle . . . . 146 | | 14.2.8. Operation 10: GETFH - Get Current Filehandle . . . . 153 | |
| 14.2.9. Operation 11: LINK - Create Link to a File . . . . . 148 | | 14.2.9. Operation 11: LINK - Create Link to a File . . . . . 155 | |
| 14.2.10. Operation 12: LOCK - Create Lock . . . . . . . . . . 150 | | 14.2.10. Operation 12: LOCK - Create Lock . . . . . . . . . . 157 | |
| 14.2.11. Operation 13: LOCKT - Test For Lock . . . . . . . . 154 | | 14.2.11. Operation 13: LOCKT - Test For Lock . . . . . . . . 161 | |
| 14.2.12. Operation 14: LOCKU - Unlock File . . . . . . . . . 156 | | | |
| 14.2.13. Operation 15: LOOKUP - Lookup Filename . . . . . . . 158 | | | |
| 14.2.14. Operation 16: LOOKUPP - Lookup Parent Directory . . 161 | | 14.2.12. Operation 14: LOCKU - Unlock File . . . . . . . . . 163 | |
| | | 14.2.13. Operation 15: LOOKUP - Lookup Filename . . . . . . . 165 | |
| | | 14.2.14. Operation 16: LOOKUPP - Lookup Parent Directory . . 168 | |
| 14.2.15. Operation 17: NVERIFY - Verify Difference in | | 14.2.15. Operation 17: NVERIFY - Verify Difference in | |
|
| Attributes . . . . . . . . . . . . . . . . . . . . . 162 | | Attributes . . . . . . . . . . . . . . . . . . . . . 169 | |
| 14.2.16. Operation 18: OPEN - Open a Regular File . . . . . . 164 | | 14.2.16. Operation 18: OPEN - Open a Regular File . . . . . . 171 | |
| 14.2.17. Operation 19: OPENATTR - Open Named Attribute | | 14.2.17. Operation 19: OPENATTR - Open Named Attribute | |
|
| Directory . . . . . . . . . . . . . . . . . . . . . 174 | | Directory . . . . . . . . . . . . . . . . . . . . . 181 | |
| 14.2.18. Operation 20: OPEN_CONFIRM - Confirm Open . . . . . 176 | | 14.2.18. Operation 20: OPEN_CONFIRM - Confirm Open . . . . . 183 | |
| 14.2.19. Operation 21: OPEN_DOWNGRADE - Reduce Open File Access 179 | | 14.2.19. Operation 21: OPEN_DOWNGRADE - Reduce Open File Access 186 | |
| 14.2.20. Operation 22: PUTFH - Set Current Filehandle . . . . 181 | | 14.2.20. Operation 22: PUTFH - Set Current Filehandle . . . . 188 | |
| 14.2.21. Operation 23: PUTPUBFH - Set Public Filehandle . . . 182 | | 14.2.21. Operation 23: PUTPUBFH - Set Public Filehandle . . . 189 | |
| 14.2.22. Operation 24: PUTROOTFH - Set Root Filehandle . . . 184 | | 14.2.22. Operation 24: PUTROOTFH - Set Root Filehandle . . . 191 | |
| 14.2.23. Operation 25: READ - Read from File . . . . . . . . 185 | | 14.2.23. Operation 25: READ - Read from File . . . . . . . . 192 | |
| 14.2.24. Operation 26: READDIR - Read Directory . . . . . . . 188 | | 14.2.24. Operation 26: READDIR - Read Directory . . . . . . . 195 | |
| 14.2.25. Operation 27: READLINK - Read Symbolic Link . . . . 192 | | 14.2.25. Operation 27: READLINK - Read Symbolic Link . . . . 199 | |
| 14.2.26. Operation 28: REMOVE - Remove Filesystem Object . . 194 | | 14.2.26. Operation 28: REMOVE - Remove Filesystem Object . . 201 | |
| 14.2.27. Operation 29: RENAME - Rename Directory Entry . . . 197 | | 14.2.27. Operation 29: RENAME - Rename Directory Entry . . . 204 | |
| 14.2.28. Operation 30: RENEW - Renew a Lease . . . . . . . . 200 | | 14.2.28. Operation 30: RENEW - Renew a Lease . . . . . . . . 207 | |
| 14.2.29. Operation 31: RESTOREFH - Restore Saved Filehandle . 201 | | 14.2.29. Operation 31: RESTOREFH - Restore Saved Filehandle . 209 | |
| 14.2.30. Operation 32: SAVEFH - Save Current Filehandle . . . 203 | | 14.2.30. Operation 32: SAVEFH - Save Current Filehandle . . . 211 | |
| 14.2.31. Operation 33: SECINFO - Obtain Available Security . 204 | | 14.2.31. Operation 33: SECINFO - Obtain Available Security . 212 | |
| 14.2.32. Operation 34: SETATTR - Set Attributes . . . . . . . 208 | | 14.2.32. Operation 34: SETATTR - Set Attributes . . . . . . . 216 | |
| 14.2.33. Operation 35: SETCLIENTID - Negotiate Clientid . . . 211 | | 14.2.33. Operation 35: SETCLIENTID - Negotiate Clientid . . . 219 | |
| 14.2.34. Operation 36: SETCLIENTID_CONFIRM - Confirm Clientid 215 | | 14.2.34. Operation 36: SETCLIENTID_CONFIRM - Confirm Clientid 223 | |
| 14.2.35. Operation 37: VERIFY - Verify Same Attributes . . . 219 | | 14.2.35. Operation 37: VERIFY - Verify Same Attributes . . . 227 | |
| 14.2.36. Operation 38: WRITE - Write to File . . . . . . . . 221 | | 14.2.36. Operation 38: WRITE - Write to File . . . . . . . . 229 | |
| 14.2.37. Operation 39: RELEASE_LOCKOWNER - Release Lockowner | | 14.2.37. Operation 39: RELEASE_LOCKOWNER - Release Lockowner | |
|
| State . . . . . . . . . . . . . . . . . . . . . . . 226 | | State . . . . . . . . . . . . . . . . . . . . . . . 234 | |
| 14.2.38. Operation 10044: ILLEGAL - Illegal operation . . . . 228 | | 14.2.38. Operation 10044: ILLEGAL - Illegal operation . . . . 236 | |
| 15. NFS version 4 Callback Procedures . . . . . . . . . . . . 229 | | 15. NFS version 4 Callback Procedures . . . . . . . . . . . . 237 | |
| 15.1. Procedure 0: CB_NULL - No Operation . . . . . . . . . . 229 | | 15.1. Procedure 0: CB_NULL - No Operation . . . . . . . . . . 237 | |
| 15.2. Procedure 1: CB_COMPOUND - Compound Operations . . . . 230 | | 15.2. Procedure 1: CB_COMPOUND - Compound Operations . . . . 238 | |
| 15.2.1. Operation 3: CB_GETATTR - Get Attributes . . . . . . 232 | | 15.2.1. Operation 3: CB_GETATTR - Get Attributes . . . . . . 240 | |
| 15.2.2. Operation 4: CB_RECALL - Recall an Open Delegation . 234 | | 15.2.2. Operation 4: CB_RECALL - Recall an Open Delegation . 242 | |
| 15.2.3. Operation 10044: CB_ILLEGAL - Illegal Callback | | 15.2.3. Operation 10044: CB_ILLEGAL - Illegal Callback | |
|
| Operation . . . . . . . . . . . . . . . . . . . . . . 236 | | Operation . . . . . . . . . . . . . . . . . . . . . . 244 | |
| 16. Security Considerations . . . . . . . . . . . . . . . . . 237 | | 16. Security Considerations . . . . . . . . . . . . . . . . . 245 | |
| 17. IANA Considerations . . . . . . . . . . . . . . . . . . . 238 | | 17. IANA Considerations . . . . . . . . . . . . . . . . . . . 246 | |
| 17.1. Named Attribute Definition . . . . . . . . . . . . . . 238 | | 17.1. Named Attribute Definition . . . . . . . . . . . . . . 246 | |
| 17.2. ONC RPC Network Identifiers (netids) . . . . . . . . . 238 | | 17.2. ONC RPC Network Identifiers (netids) . . . . . . . . . 246 | |
| 18. RPC definition file . . . . . . . . . . . . . . . . . . . 239 | | 18. RPC definition file . . . . . . . . . . . . . . . . . . . 247 | |
| 19. Bibliography . . . . . . . . . . . . . . . . . . . . . . 271 | | 19. Bibliography . . . . . . . . . . . . . . . . . . . . . . 279 | |
| 20. Authors . . . . . . . . . . . . . . . . . . . . . . . . . 277 | | 20. Authors . . . . . . . . . . . . . . . . . . . . . . . . . 285 | |
| 20.1. Editor's Address . . . . . . . . . . . . . . . . . . . 277 | | 20.1. Editor's Address . . . . . . . . . . . . . . . . . . . 285 | |
| 20.2. Authors' Addresses . . . . . . . . . . . . . . . . . . 277 | | 20.2. Authors' Addresses . . . . . . . . . . . . . . . . . . 285 | |
| 20.3. Acknowledgements . . . . . . . . . . . . . . . . . . . 278 | | 20.3. Acknowledgements . . . . . . . . . . . . . . . . . . . 286 | |
| 21. Full Copyright Statement . . . . . . . . . . . . . . . . 279 | | 21. Full Copyright Statement . . . . . . . . . . . . . . . . 287 | |
| | | | |
| 1. Introduction | | 1. Introduction | |
| | | | |
| The NFS version 4 protocol is a further revision of the NFS protocol | | The NFS version 4 protocol is a further revision of the NFS protocol | |
| defined already by versions 2 [RFC1094] and 3 [RFC1813]. It retains | | defined already by versions 2 [RFC1094] and 3 [RFC1813]. It retains | |
| the essential characteristics of previous versions: design for easy | | the essential characteristics of previous versions: design for easy | |
| recovery, independent of transport protocols, operating systems and | | recovery, independent of transport protocols, operating systems and | |
| filesystems, simplicity, and good performance. The NFS version 4 | | filesystems, simplicity, and good performance. The NFS version 4 | |
| revision has the following goals: | | revision has the following goals: | |
| | | | |
| | | | |
| skipping to change at page 8, line 5 | | skipping to change at page 8, line 5 | |
| 1.1. Inconsistencies of this Document with Section 18 | | 1.1. Inconsistencies of this Document with Section 18 | |
| | | | |
| Section 18, RPC Definition File, contains the definitions in XDR | | Section 18, RPC Definition File, contains the definitions in XDR | |
| description language of the constructs used by the protocol. Prior | | description language of the constructs used by the protocol. Prior | |
| to Section 18, several of the constructs are reproduced for purposes | | to Section 18, several of the constructs are reproduced for purposes | |
| of explanation. The reader is warned of the possibility of errors in | | of explanation. The reader is warned of the possibility of errors in | |
| the reproduced constructs outside of Section 18. For any part of the | | the reproduced constructs outside of Section 18. For any part of the | |
| document that is inconsistent with Section 18, Section 18 is to be | | document that is inconsistent with Section 18, Section 18 is to be | |
| considered authoritative. | | considered authoritative. | |
| | | | |
| 1.2. Overview of NFS version 4 Features | | 1.2. Overview of NFS version 4 Features | |
| | | | |
| To provide a reasonable context for the reader, the major features of | | To provide a reasonable context for the reader, the major features of | |
| NFS version 4 protocol will be reviewed in brief. This will be done | | NFS version 4 protocol will be reviewed in brief. This will be done | |
| to provide an appropriate context for both the reader who is familiar | | to provide an appropriate context for both the reader who is familiar | |
| with the previous versions of the NFS protocol and the reader who is | | with the previous versions of the NFS protocol and the reader who is | |
| new to the NFS protocols. For the reader new to the NFS protocols, | | new to the NFS protocols. For the reader new to the NFS protocols, | |
| there is still a fundamental knowledge that is expected. The reader | | there is still a fundamental knowledge that is expected. The reader | |
| should be familiar with the XDR and RPC protocols as described in | | should be familiar with the XDR and RPC protocols as described in | |
| | | | |
| skipping to change at page 9, line 5 | | skipping to change at page 9, line 5 | |
| The COMPOUND procedure is defined in terms of operations and these | | The COMPOUND procedure is defined in terms of operations and these | |
| operations correspond more closely to the traditional NFS procedures. | | operations correspond more closely to the traditional NFS procedures. | |
| With the use of the COMPOUND procedure, the client is able to build | | With the use of the COMPOUND procedure, the client is able to build | |
| simple or complex requests. These COMPOUND requests allow for a | | simple or complex requests. These COMPOUND requests allow for a | |
| reduction in the number of RPCs needed for logical filesystem | | reduction in the number of RPCs needed for logical filesystem | |
| operations. For example, without previous contact with a server a | | operations. For example, without previous contact with a server a | |
| client will be able to read data from a file in one request by | | client will be able to read data from a file in one request by | |
| combining LOOKUP, OPEN, and READ operations in a single COMPOUND RPC. | | combining LOOKUP, OPEN, and READ operations in a single COMPOUND RPC. | |
| With previous versions of the NFS protocol, this type of single | | With previous versions of the NFS protocol, this type of single | |
| request was not possible. | | request was not possible. | |
| | | | |
| The model used for COMPOUND is very simple. There is no logical OR | | The model used for COMPOUND is very simple. There is no logical OR | |
| or ANDing of operations. The operations combined within a COMPOUND | | or ANDing of operations. The operations combined within a COMPOUND | |
| request are evaluated in order by the server. Once an operation | | request are evaluated in order by the server. Once an operation | |
| returns a failing result, the evaluation ends and the results of all | | returns a failing result, the evaluation ends and the results of all | |
| evaluated operations are returned to the client. | | evaluated operations are returned to the client. | |
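
The in-order, stop-at-first-failure evaluation described above can be sketched as follows. This is a minimal illustration rather than the protocol's wire behavior: the function name evaluate_compound and the representation of an operation as a callable returning a status code are assumptions made for the example; only the status value 0 (NFS4_OK) is taken from the protocol.

```python
from typing import Callable, List

NFS4_OK = 0  # success status; any other value is treated as a failure here


def evaluate_compound(ops: List[Callable[[], int]]) -> List[int]:
    """Evaluate operations in order and stop after the first failure.

    Each element of `ops` is a hypothetical stand-in for one COMPOUND
    operation; it returns a status code when called. The returned list
    holds the status of every operation that was actually evaluated,
    including the failing one.
    """
    results: List[int] = []
    for op in ops:
        status = op()
        results.append(status)
        if status != NFS4_OK:
            break  # later operations are never evaluated
    return results
```

With three operations where the second one fails, only two results come back; the third operation is not run.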
| | | | |
| The NFS version 4 protocol continues to have the client refer to a | | The NFS version 4 protocol continues to have the client refer to a | |
| | | | |
| skipping to change at page 9, line 38 | | skipping to change at page 9, line 38 | |
| the same as previous versions. The server filesystem is hierarchical | | the same as previous versions. The server filesystem is hierarchical | |
| with the regular files contained within being treated as opaque byte | | with the regular files contained within being treated as opaque byte | |
| streams. In a slight departure, file and directory names are encoded | | streams. In a slight departure, file and directory names are encoded | |
| with UTF-8 to deal with the basics of internationalization. | | with UTF-8 to deal with the basics of internationalization. | |
| | | | |
| The NFS version 4 protocol does not require a separate protocol to | | The NFS version 4 protocol does not require a separate protocol to | |
| provide for the initial mapping between path name and filehandle. | | provide for the initial mapping between path name and filehandle. | |
| Instead of using the older MOUNT protocol for this mapping, the | | Instead of using the older MOUNT protocol for this mapping, the | |
| server provides a ROOT filehandle that represents the logical root or | | server provides a ROOT filehandle that represents the logical root or | |
| top of the filesystem tree provided by the server. The server | | top of the filesystem tree provided by the server. The server | |
|
| provides multiple filesystems by glueing them together with pseudo | | provides multiple filesystems by gluing them together with pseudo | |
| filesystems. These pseudo filesystems provide for potential gaps in | | filesystems. These pseudo filesystems provide for potential gaps in | |
| the path names between real filesystems. | | the path names between real filesystems. | |
| | | | |
| 1.2.3.1. Filehandle Types | | 1.2.3.1. Filehandle Types | |
| | | | |
| In previous versions of the NFS protocol, the filehandle provided by | | In previous versions of the NFS protocol, the filehandle provided by | |
| the server was guaranteed to be valid or persistent for the lifetime | | the server was guaranteed to be valid or persistent for the lifetime | |
| of the filesystem object to which it referred. For some server | | of the filesystem object to which it referred. For some server | |
| implementations, this persistence requirement has been difficult to | | implementations, this persistence requirement has been difficult to | |
| meet. For the NFS version 4 protocol, this requirement has been | | meet. For the NFS version 4 protocol, this requirement has been | |
| relaxed by introducing another type of filehandle, volatile. With | | relaxed by introducing another type of filehandle, volatile. With | |
| persistent and volatile filehandle types, the server implementation | | persistent and volatile filehandle types, the server implementation | |
| can match the abilities of the filesystem at the server along with | | can match the abilities of the filesystem at the server along with | |
| the operating environment. The client will have knowledge of the | | the operating environment. The client will have knowledge of the | |
| type of filehandle being provided by the server and can be prepared | | type of filehandle being provided by the server and can be prepared | |
| to deal with the semantics of each. | | to deal with the semantics of each. | |
| | | | |
| 1.2.3.2. Attribute Types | | 1.2.3.2. Attribute Types | |
| | | | |
| The NFS version 4 protocol introduces three classes of filesystem or | | The NFS version 4 protocol introduces three classes of filesystem or | |
| file attributes. Like the additional filehandle type, the | | file attributes. Like the additional filehandle type, the | |
| classification of file attributes has been done to ease server | | classification of file attributes has been done to ease server | |
| implementations along with extending the overall functionality of the | | implementations along with extending the overall functionality of the | |
| NFS protocol. This attribute model is structured to be extensible | | NFS protocol. This attribute model is structured to be extensible | |
| such that new attributes can be introduced in minor revisions of the | | such that new attributes can be introduced in minor revisions of the | |
| protocol without requiring significant rework. | | protocol without requiring significant rework. | |
| | | | |
| skipping to change at page 11, line 5 | | skipping to change at page 11, line 5 | |
| replicate server filesystems is enabled within the protocol. The | | replicate server filesystems is enabled within the protocol. The | |
| filesystem locations attribute provides a method for the client to | | filesystem locations attribute provides a method for the client to | |
| probe the server about the location of a filesystem. In the event of | | probe the server about the location of a filesystem. In the event of | |
| a migration of a filesystem, the client will receive an error when | | a migration of a filesystem, the client will receive an error when | |
| operating on the filesystem and it can then query as to the new file | | operating on the filesystem and it can then query as to the new file | |
| system location. Similar steps are used for replication, the client | | system location. Similar steps are used for replication, the client | |
| is able to query the server for the multiple available locations of a | | is able to query the server for the multiple available locations of a | |
| particular filesystem. From this information, the client can use its | | particular filesystem. From this information, the client can use its | |
| own policies to access the appropriate filesystem location. | | own policies to access the appropriate filesystem location. | |
| | | | |
| 1.2.4. OPEN and CLOSE | | 1.2.4. OPEN and CLOSE | |
| | | | |
| The NFS version 4 protocol introduces OPEN and CLOSE operations. The | | The NFS version 4 protocol introduces OPEN and CLOSE operations. The | |
| OPEN operation provides a single point where file lookup, creation, | | OPEN operation provides a single point where file lookup, creation, | |
| and share semantics can be combined. The CLOSE operation also | | and share semantics can be combined. The CLOSE operation also | |
| provides for the release of state accumulated by OPEN. | | provides for the release of state accumulated by OPEN. | |
| | | | |
| 1.2.5. File locking | | 1.2.5. File locking | |
| | | | |
| | | | |
| skipping to change at page 12, line 5 | | skipping to change at page 12, line 5 | |
| client. When the server grants a delegation for a file to a client, | | client. When the server grants a delegation for a file to a client, | |
| the client is guaranteed certain semantics with respect to the | | the client is guaranteed certain semantics with respect to the | |
| sharing of that file with other clients. At OPEN, the server may | | sharing of that file with other clients. At OPEN, the server may | |
| provide the client either a read or write delegation for the file. | | provide the client either a read or write delegation for the file. | |
| If the client is granted a read delegation, it is assured that no | | If the client is granted a read delegation, it is assured that no | |
| other client has the ability to write to the file for the duration of | | other client has the ability to write to the file for the duration of | |
| the delegation. If the client is granted a write delegation, the | | the delegation. If the client is granted a write delegation, the | |
| client is assured that no other client has read or write access to | | client is assured that no other client has read or write access to | |
| the file. | | the file. | |
| | | | |
| Delegations can be recalled by the server. If another client | | Delegations can be recalled by the server. If another client | |
| requests access to the file in such a way that the access conflicts | | requests access to the file in such a way that the access conflicts | |
| with the granted delegation, the server is able to notify the initial | | with the granted delegation, the server is able to notify the initial | |
| client and recall the delegation. This requires that a callback path | | client and recall the delegation. This requires that a callback path | |
| exist between the server and client. If this callback path does not | | exist between the server and client. If this callback path does not | |
| exist, then delegations can not be granted. The essence of a | | exist, then delegations can not be granted. The essence of a | |
| delegation is that it allows the client to locally service operations | | delegation is that it allows the client to locally service operations | |
| such as OPEN, CLOSE, LOCK, LOCKU, READ, WRITE without immediate | | such as OPEN, CLOSE, LOCK, LOCKU, READ, WRITE without immediate | |
| interaction with the server. | | interaction with the server. | |
| | | | |
| skipping to change at page 13, line 5 | | skipping to change at page 13, line 5 | |
| alleviate the expense a server would have in maintaining | | alleviate the expense a server would have in maintaining | |
| state about variable length leases across server failures. | | state about variable length leases across server failures. | |
| | | | |
| Lock The term "lock" is used to refer to both record (byte- | | Lock The term "lock" is used to refer to both record (byte- | |
| range) locks as well as share reservations unless | | range) locks as well as share reservations unless | |
| specifically stated otherwise. | | specifically stated otherwise. | |
| | | | |
| Server The "Server" is the entity responsible for coordinating | | Server The "Server" is the entity responsible for coordinating | |
| client access to a set of filesystems. | | client access to a set of filesystems. | |
| | | | |
| Stable Storage | | Stable Storage | |
| NFS version 4 servers must be able to recover without data | | NFS version 4 servers must be able to recover without data | |
| loss from multiple power failures (including cascading | | loss from multiple power failures (including cascading | |
| power failures, that is, several power failures in quick | | power failures, that is, several power failures in quick | |
| succession), operating system failures, and hardware | | succession), operating system failures, and hardware | |
| failure of components other than the storage medium itself | | failure of components other than the storage medium itself | |
| (for example, disk, nonvolatile RAM). | | (for example, disk, nonvolatile RAM). | |
| | | | |
| Some examples of stable storage that are allowable for an | | Some examples of stable storage that are allowable for an | |
| | | | |
| skipping to change at page 14, line 5 | | skipping to change at page 14, line 5 | |
| defines the open and locking state provided by the server | | defines the open and locking state provided by the server | |
| for a specific open or lock owner for a specific file. | | for a specific open or lock owner for a specific file. | |
| | | | |
| Stateids composed of all bits 0 or all bits 1 have special | | Stateids composed of all bits 0 or all bits 1 have special | |
| meaning and are reserved values. | | meaning and are reserved values. | |
| | | | |
| Verifier A 64-bit quantity generated by the client that the server | | Verifier A 64-bit quantity generated by the client that the server | |
| can use to determine if the client has restarted and lost | | can use to determine if the client has restarted and lost | |
| all previous lock state. | | all previous lock state. | |
| | | | |
| 2. Protocol Data Types | | 2. Protocol Data Types | |
| | | | |
| The syntax and semantics to describe the data types of the NFS | | The syntax and semantics to describe the data types of the NFS | |
| version 4 protocol are defined in the XDR [RFC1832] and RPC [RFC1831] | | version 4 protocol are defined in the XDR [RFC1832] and RPC [RFC1831] | |
| documents. The next sections build upon the XDR data types to define | | documents. The next sections build upon the XDR data types to define | |
| types and structures specific to this protocol. | | types and structures specific to this protocol. | |
| | | | |
| 2.1. Basic Data Types | | 2.1. Basic Data Types | |
| | | | |
| | | | |
| skipping to change at page 15, line 5 | | skipping to change at page 15, line 5 | |
| | | | |
| mode4 typedef uint32_t mode4; | | mode4 typedef uint32_t mode4; | |
| Mode attribute data type | | Mode attribute data type | |
| | | | |
| nfs_cookie4 typedef uint64_t nfs_cookie4; | | nfs_cookie4 typedef uint64_t nfs_cookie4; | |
| Opaque cookie value for READDIR | | Opaque cookie value for READDIR | |
| | | | |
| nfs_fh4 typedef opaque nfs_fh4<NFS4_FHSIZE>; | | nfs_fh4 typedef opaque nfs_fh4<NFS4_FHSIZE>; | |
| Filehandle definition; NFS4_FHSIZE is defined as 128 | | Filehandle definition; NFS4_FHSIZE is defined as 128 | |
| | | | |
| nfs_ftype4 enum nfs_ftype4; | | nfs_ftype4 enum nfs_ftype4; | |
| Various defined file types | | Various defined file types | |
| | | | |
| nfsstat4 enum nfsstat4; | | nfsstat4 enum nfsstat4; | |
| Return value for operations | | Return value for operations | |
| | | | |
| offset4 typedef uint64_t offset4; | | offset4 typedef uint64_t offset4; | |
| Various offset designations (READ, WRITE, LOCK, COMMIT) | | Various offset designations (READ, WRITE, LOCK, COMMIT) | |
| | | | |
| | | | |
| skipping to change at page 15, line 38 | | skipping to change at page 15, line 38 | |
| | | | |
| seqid4 typedef uint32_t seqid4; | | seqid4 typedef uint32_t seqid4; | |
| Sequence identifier used for file locking | | Sequence identifier used for file locking | |
| | | | |
| utf8string typedef opaque utf8string<>; | | utf8string typedef opaque utf8string<>; | |
| UTF-8 encoding for strings | | UTF-8 encoding for strings | |
| | | | |
| verifier4 typedef opaque verifier4[NFS4_VERIFIER_SIZE]; | | verifier4 typedef opaque verifier4[NFS4_VERIFIER_SIZE]; | |
| Verifier used for various operations (COMMIT, CREATE, | | Verifier used for various operations (COMMIT, CREATE, | |
| OPEN, READDIR, SETCLIENTID, SETCLIENTID_CONFIRM, WRITE) | | OPEN, READDIR, SETCLIENTID, SETCLIENTID_CONFIRM, WRITE) | |
|
| NFS4_VERIFIER_SIZE is defined as 8 | | NFS4_VERIFIER_SIZE is defined as 8. | |
| | | | |
| 2.2. Structured Data Types | | 2.2. Structured Data Types | |
| | | | |
| nfstime4 | | nfstime4 | |
| struct nfstime4 { | | struct nfstime4 { | |
| int64_t seconds; | | int64_t seconds; | |
| uint32_t nseconds; | | uint32_t nseconds; | |
| } | | } | |
| | | | |
| The nfstime4 structure gives the number of seconds and | | The nfstime4 structure gives the number of seconds and | |
| nanoseconds since midnight or 0 hour January 1, 1970 Coordinated | | nanoseconds since midnight or 0 hour January 1, 1970 Coordinated | |
| Universal Time (UTC). Values greater than zero for the seconds | | Universal Time (UTC). Values greater than zero for the seconds | |
| field denote dates after the 0 hour January 1, 1970. Values | | field denote dates after the 0 hour January 1, 1970. Values | |
| less than zero for the seconds field denote dates before the 0 | | less than zero for the seconds field denote dates before the 0 | |
| hour January 1, 1970. In both cases, the nseconds field is to | | hour January 1, 1970. In both cases, the nseconds field is to | |
| be added to the seconds field for the final time representation. | | be added to the seconds field for the final time representation. | |
| For example, if the time to be represented is one-half second | | For example, if the time to be represented is one-half second | |
| before 0 hour January 1, 1970, the seconds field would have a | | before 0 hour January 1, 1970, the seconds field would have a | |
| value of negative one (-1) and the nseconds field would have a | | value of negative one (-1) and the nseconds field would have a | |
| value of one-half second (500000000). Values greater than | | value of one-half second (500000000). Values greater than | |
| 999,999,999 for nseconds are considered invalid. | | 999,999,999 for nseconds are considered invalid. | |
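
The nfstime4 rule above (nseconds is always added to seconds, even for times before the epoch) can be checked with a short sketch. The helper names to_nfstime4 and from_nfstime4 are hypothetical, and exact rational arithmetic is used only to keep the example free of floating-point rounding.

```python
import math
from fractions import Fraction
from typing import Tuple

NSEC_PER_SEC = 1_000_000_000


def to_nfstime4(t: Fraction) -> Tuple[int, int]:
    """Split a time in seconds since the epoch into (seconds, nseconds).

    seconds is rounded toward negative infinity so that nseconds is
    never negative and is always added to seconds.
    """
    seconds = math.floor(t)
    nseconds = int((t - seconds) * NSEC_PER_SEC)
    return seconds, nseconds


def from_nfstime4(seconds: int, nseconds: int) -> Fraction:
    """Recombine the two fields, rejecting out-of-range nseconds."""
    if not 0 <= nseconds <= 999_999_999:
        raise ValueError("nseconds greater than 999,999,999 is invalid")
    return Fraction(seconds) + Fraction(nseconds, NSEC_PER_SEC)
```

For one-half second before the epoch, to_nfstime4(Fraction(-1, 2)) yields (-1, 500000000), matching the example in the text.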
| | | | |
| This data type is used to pass time and date information. A | | This data type is used to pass time and date information. A | |
| server converts to and from its local representation of time | | server converts to and from its local representation of time | |
| when processing time values, preserving as much accuracy as | | when processing time values, preserving as much accuracy as | |
| possible. If the precision of timestamps stored for a filesystem | | possible. If the precision of timestamps stored for a filesystem | |
| | | | |
| skipping to change at page 17, line 5 | | skipping to change at page 17, line 5 | |
| | | | |
| This data type represents additional information for the device | | This data type represents additional information for the device | |
| file types NF4CHR and NF4BLK. | | file types NF4CHR and NF4BLK. | |
| | | | |
| fsid4 | | fsid4 | |
| | | | |
| struct fsid4 { | | struct fsid4 { | |
| uint64_t major; | | uint64_t major; | |
| uint64_t minor; | | uint64_t minor; | |
| }; | | }; | |
| | | | |
| This type is the filesystem identifier that is used as a | | This type is the filesystem identifier that is used as a | |
| mandatory attribute. | | mandatory attribute. | |
| | | | |
| fs_location4 | | fs_location4 | |
| | | | |
| struct fs_location4 { | | struct fs_location4 { | |
| utf8string server<>; | | utf8string server<>; | |
| | | | |
| skipping to change at page 18, line 5 | | skipping to change at page 18, line 5 | |
| +-----------+-----------+-----------+-- | | +-----------+-----------+-----------+-- | |
| | count | 31 .. 0 | 63 .. 32 | | | | count | 31 .. 0 | 63 .. 32 | | |
| +-----------+-----------+-----------+-- | | +-----------+-----------+-----------+-- | |
| | | | |
| change_info4 | | change_info4 | |
| | | | |
| struct change_info4 { | | struct change_info4 { | |
| bool atomic; | | bool atomic; | |
| changeid4 before; | | changeid4 before; | |
| changeid4 after; | | changeid4 after; | |
| }; | | }; | |
| | | | |
| This structure is used with the CREATE, LINK, REMOVE, RENAME | | This structure is used with the CREATE, LINK, REMOVE, RENAME | |
| operations to let the client know the value of the change | | operations to let the client know the value of the change | |
| attribute for the directory in which the target filesystem | | attribute for the directory in which the target filesystem | |
| object resides. | | object resides. | |
| | | | |
| clientaddr4 | | clientaddr4 | |
| | | | |
| skipping to change at page 18, line 55 | | skipping to change at page 18, line 55 | |
| | | | |
| For TCP over IPv4 the value of r_netid is the string "tcp". For | | For TCP over IPv4 the value of r_netid is the string "tcp". For | |
| UDP over IPv4 the value of r_netid is the string "udp". | | UDP over IPv4 the value of r_netid is the string "udp". | |
| | | | |
| For TCP over IPv6 and for UDP over IPv6, the format of r_addr is | | For TCP over IPv6 and for UDP over IPv6, the format of r_addr is | |
| the US-ASCII string: | | the US-ASCII string: | |
| | | | |
| x1:x2:x3:x4:x5:x6:x7:x8.p1.p2 | | x1:x2:x3:x4:x5:x6:x7:x8.p1.p2 | |
| | | | |
| The suffix "p1.p2" is the service port, and is computed the same | | The suffix "p1.p2" is the service port, and is computed the same | |
|
| way as with univeral addresses for TCP and UDP over IPv4. The | | way as with universal addresses for TCP and UDP over IPv4. The | |
| prefix, "x1:x2:x3:x4:x5:x6:x7:x8", is the standard textual form | | prefix, "x1:x2:x3:x4:x5:x6:x7:x8", is the standard textual form | |
| for representing an IPv6 address as defined in Section 2.2 of | | for representing an IPv6 address as defined in Section 2.2 of | |
| [RFC1884]. Additionally, the two alternative forms specified in | | [RFC1884]. Additionally, the two alternative forms specified in | |
| Section 2.2 of [RFC1884] are also acceptable. | | Section 2.2 of [RFC1884] are also acceptable. | |
| | | | |
| For TCP over IPv6 the value of r_netid is the string "tcp6". | | For TCP over IPv6 the value of r_netid is the string "tcp6". | |
| For UDP over IPv6 the value of r_netid is the string "udp6". | | For UDP over IPv6 the value of r_netid is the string "udp6". | |
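
As an illustration of the r_addr format for the "tcp6" and "udp6" netids, the sketch below splits the "p1.p2" suffix from the IPv6 prefix and recombines the port. It assumes the usual universal address convention that p1 and p2 are the high-order and low-order 8 bits of the port; the function names parse_uaddr6 and format_uaddr6 are hypothetical.

```python
import ipaddress
from typing import Tuple


def parse_uaddr6(r_addr: str) -> Tuple[ipaddress.IPv6Address, int]:
    """Split an IPv6 universal address into (address, port)."""
    # The last two dot-separated fields are p1 and p2; everything
    # before them is the textual IPv6 address.
    host, p1, p2 = r_addr.rsplit(".", 2)
    hi, lo = int(p1), int(p2)
    if not (0 <= hi <= 255 and 0 <= lo <= 255):
        raise ValueError("p1 and p2 must each fit in 8 bits")
    return ipaddress.IPv6Address(host), hi * 256 + lo


def format_uaddr6(addr: ipaddress.IPv6Address, port: int) -> str:
    """Build the universal address string for an address and port."""
    if not 0 <= port <= 65535:
        raise ValueError("port out of range")
    return f"{addr}.{port >> 8}.{port & 0xFF}"
```

For the registered NFS port 2049, the suffix is "8.1", since 8 * 256 + 1 = 2049.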
| | | | |
| cb_client4 | | cb_client4 | |
| | | | |
| struct cb_client4 { | | struct cb_client4 { | |
| | | | |
| skipping to change at page 20, line 5 | | skipping to change at page 20, line 5 | |
| lock_owner4 | | lock_owner4 | |
| | | | |
| struct lock_owner4 { | | struct lock_owner4 { | |
| clientid4 clientid; | | clientid4 clientid; | |
| opaque owner<NFS4_OPAQUE_LIMIT>; | | opaque owner<NFS4_OPAQUE_LIMIT>; | |
| }; | | }; | |
| | | | |
| This structure is used to identify the owner of file locking | | This structure is used to identify the owner of file locking | |
| state. NFS4_OPAQUE_LIMIT is defined as 1024. | | state. NFS4_OPAQUE_LIMIT is defined as 1024. | |
| | | | |
| open_to_lock_owner4 | | open_to_lock_owner4 | |
| | | | |
| struct open_to_lock_owner4 { | | struct open_to_lock_owner4 { | |
| seqid4 open_seqid; | | seqid4 open_seqid; | |
| stateid4 open_stateid; | | stateid4 open_stateid; | |
| seqid4 lock_seqid; | | seqid4 lock_seqid; | |
| lock_owner4 lock_owner; | | lock_owner4 lock_owner; | |
| }; | | }; | |
| | | | |
| | | | |
| skipping to change at page 21, line 5 | | skipping to change at page 21, line 5 | |
| | | | |
| This structure is used for the various state sharing mechanisms | | This structure is used for the various state sharing mechanisms | |
| between the client and server. For the client, this data | | between the client and server. For the client, this data | |
| structure is read-only. The starting value of the seqid field | | structure is read-only. The starting value of the seqid field | |
| is undefined. The server is required to increment the seqid | | is undefined. The server is required to increment the seqid | |
| field monotonically at each transition of the stateid. This is | | field monotonically at each transition of the stateid. This is | |
| important since the client will inspect the seqid in OPEN | | important since the client will inspect the seqid in OPEN | |
| stateids to determine the order of OPEN processing done by the | | stateids to determine the order of OPEN processing done by the | |
| server. | | server. | |
| | | | |
| 3. RPC and Security Flavor | | 3. RPC and Security Flavor | |
| | | | |
| The NFS version 4 protocol is a Remote Procedure Call (RPC) | | The NFS version 4 protocol is a Remote Procedure Call (RPC) | |
| application that uses RPC version 2 and the corresponding eXternal | | application that uses RPC version 2 and the corresponding eXternal | |
| Data Representation (XDR) as defined in [RFC1831] and [RFC1832]. The | | Data Representation (XDR) as defined in [RFC1831] and [RFC1832]. The | |
| RPCSEC_GSS security flavor as defined in [RFC2203] MUST be used as | | RPCSEC_GSS security flavor as defined in [RFC2203] MUST be used as | |
| the mechanism to deliver stronger security for the NFS version 4 | | the mechanism to deliver stronger security for the NFS version 4 | |
| protocol. | | protocol. | |
| | | | |
| 3.1. Ports and Transports | | 3.1. Ports and Transports | |
| | | | |
| Historically, NFS version 2 and version 3 servers have resided on | | Historically, NFS version 2 and version 3 servers have resided on | |
| port 2049. The registered port 2049 [RFC1700] for the NFS protocol | | port 2049. The registered port 2049 [RFC1700] for the NFS protocol | |
| should be the default configuration. Using the registered port for | | should be the default configuration. Using the registered port for | |
| NFS services means the NFS client will not need to use the RPC | | NFS services means the NFS client will not need to use the RPC | |
| binding protocols as described in [RFC1833]; this will allow NFS to | | binding protocols as described in [RFC1833]; this will allow NFS to | |
| transit firewalls. | | transit firewalls. | |
| | | | |
|
| The transport used by the RPC service for the NFS version 4 protocol | | Where an NFS version 4 implementation supports operation over the IP | |
| MUST provide congestion control comparable to that defined for TCP in | | network protocol, the supported transports between NFS and IP must be | |
| [RFC2581]. If the operating environment implements TCP, the NFS | | among the IETF-approved congestion control transport protocols, which | |
| version 4 protocol SHOULD be supported over TCP. The NFS client and | | include TCP and SCTP. To enhance the possibilities for | |
| server MAY use other transports if they support congestion control as | | interoperability, an NFS version 4 implementation SHOULD support | |
| defined above and in those cases a mechanism may be provided to | | operation over the TCP transport protocol. | |
| override TCP usage in favor of another transport. | | | |
| | | | |
| If TCP is used as the transport, the client and server SHOULD use | | If TCP is used as the transport, the client and server SHOULD use | |
| persistent connections. This will prevent the weakening of TCP's | | persistent connections. This will prevent the weakening of TCP's | |
| congestion control via short lived connections and will improve | | congestion control via short lived connections and will improve | |
| performance for the WAN environment by eliminating the need for SYN | | performance for the WAN environment by eliminating the need for SYN | |
| handshakes. | | handshakes. | |
| | | | |
| Note that for various timers, the client and server should avoid | | Note that for various timers, the client and server should avoid | |
| inadvertent synchronization of those timers. For further discussion | | inadvertent synchronization of those timers. For further discussion | |
| of the general issue refer to [Floyd]. | | of the general issue refer to [Floyd]. | |
| | | | |
| skipping to change at page 21, line 55 | | skipping to change at page 21, line 54 | |
| When processing a request received over a reliable transport such as | | When processing a request received over a reliable transport such as | |
| TCP, the NFS version 4 server MUST NOT silently drop the request, | | TCP, the NFS version 4 server MUST NOT silently drop the request, | |
| except if the transport connection has been broken. Given such a | | except if the transport connection has been broken. Given such a | |
| contract between NFS version 4 clients and servers, clients MUST NOT | | contract between NFS version 4 clients and servers, clients MUST NOT | |
| retry a request unless one or both of the following are true: | | retry a request unless one or both of the following are true: | |
| | | | |
| o The transport connection has been broken | | o The transport connection has been broken | |
| | | | |
| o The procedure being retried is the NULL procedure | | o The procedure being retried is the NULL procedure | |
| | | | |
|
| Since transports, including TCP, do not always synchronously inform a | | Since reliable transports, such as TCP, do not always synchronously | |
| peer when the other peer has broken the connection (for example, when | | inform a peer when the other peer has broken the connection (for | |
| | | example, when an NFS server reboots), so the NFS version 4 client may | |
| | | | |
|
| Draft Specification NFS version 4 Protocol August 2002 | | Draft Specification NFS version 4 Protocol September 2002 | |
| | | | |
|
| an NFS server reboots), so the NFS version 4 client may want to | | want to actively "probe" the connection to see if it has been broken. | |
| actively "probe" the connection to see if it has been broken. Use of | | Use of the NULL procedure is one recommended way to do so. So, when | |
| the NULL procedure is one recommended way to do so. So, when a | | a client experiences a remote procedure call timeout (of some | |
| client experiences a remote procedure call timeout (of some arbitrary | | arbitrary implementation specific amount), rather than retrying the | |
| implementation specific amount), rather than retrying the remote | | remote procedure call, it could instead issue a NULL procedure call | |
| procedure call, it could instead issue a NULL procedure call to the | | to the server. If the server has died, the transport connection break | |
| server. If the server has died, the transport connection break will | | will eventually be indicated to the NFS version 4 client. The client | |
| eventually be indicated to the NFS version 4 client. The client can | | can then reconnect, and then retry the original request. If the NULL | |
| then reconnect, and then retry the original request. If the NULL | | | |
| procedure call gets a response, the connection has not broken. The | | procedure call gets a response, the connection has not broken. The | |
| client can decide to wait longer for the original request's response, | | client can decide to wait longer for the original request's response, | |
| or it can break the transport connection and reconnect before re- | | or it can break the transport connection and reconnect before re- | |
| sending the original request. | | sending the original request. | |
| | | | |
| For callbacks from the server to the client, the same rules apply, | | For callbacks from the server to the client, the same rules apply, | |
| but the server doing the callback becomes the client, and the client | | but the server doing the callback becomes the client, and the client | |
| receiving the callback becomes the server. | | receiving the callback becomes the server. | |
| | | | |
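The retry rules above can be read as a small decision procedure. The following is a minimal sketch in Python, not part of the protocol text; the names Action, may_retry, and on_timeout are hypothetical, and the choice to keep waiting after a successful probe is only one of the two options the text allows (the other is to break the connection and reconnect).

```python
from enum import Enum, auto


class Action(Enum):
    """Hypothetical client actions after an RPC timeout."""
    WAIT = auto()                 # keep waiting for the original reply
    PROBE_WITH_NULL = auto()      # issue a NULL procedure call as a probe
    RECONNECT_AND_RETRY = auto()  # reconnect, then re-send the request


def may_retry(connection_broken: bool, is_null_procedure: bool) -> bool:
    """A request may be retried only if the transport connection has
    been broken or the procedure being retried is NULL."""
    return connection_broken or is_null_procedure


def on_timeout(connection_broken: bool, probe_sent: bool) -> Action:
    """Pick the next step when the original request has timed out."""
    if connection_broken:
        # The transport reported the break: reconnect and retry.
        return Action.RECONNECT_AND_RETRY
    if not probe_sent:
        # Probe with NULL instead of retrying the original request.
        return Action.PROBE_WITH_NULL
    # Probe already sent and no break indicated: this sketch waits longer.
    return Action.WAIT
```

Because NULL is the one procedure that may always be retried, using it as the probe never violates the no-retry rule for other procedures.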
| 3.2. Security Flavors | | 3.2. Security Flavors | |
| | | | |
| skipping to change at page 23, line 4 | | skipping to change at page 22, line 57 | |
| | | | |
| column descriptions: | | column descriptions: | |
| | | | |
| 1 == number of pseudo flavor | | 1 == number of pseudo flavor | |
| 2 == name of pseudo flavor | | 2 == name of pseudo flavor | |
| 3 == mechanism's OID | | 3 == mechanism's OID | |
| 4 == mechanism's algorithm(s) | | 4 == mechanism's algorithm(s) | |
| 5 == RPCSEC_GSS service | | 5 == RPCSEC_GSS service | |
| | | | |
| 1 2 3 4 5 | | 1 2 3 4 5 | |
| ----------------------------------------------------------------------- | | ----------------------------------------------------------------------- | |
| 390003 krb5 1.2.840.113554.1.2.2 DES MAC MD5 rpc_gss_svc_none | | 390003 krb5 1.2.840.113554.1.2.2 DES MAC MD5 rpc_gss_svc_none | |
| 390004 krb5i 1.2.840.113554.1.2.2 DES MAC MD5 rpc_gss_svc_integrity | | 390004 krb5i 1.2.840.113554.1.2.2 DES MAC MD5 rpc_gss_svc_integrity | |
| 390005 krb5p 1.2.840.113554.1.2.2 DES MAC MD5 rpc_gss_svc_privacy | | 390005 krb5p 1.2.840.113554.1.2.2 DES MAC MD5 rpc_gss_svc_privacy | |
| for integrity, | | for integrity, | |
| and 56 bit DES | | and 56 bit DES | |
| for privacy. | | for privacy. | |
| | | | |
| Note that the pseudo flavor is presented here as a mapping aid to the | | Note that the pseudo flavor is presented here as a mapping aid to the | |
| implementor. Because this NFS protocol includes a method to | | implementor. Because this NFS protocol includes a method to | |
| negotiate security and it understands the GSS-API mechanism, the | | negotiate security and it understands the GSS-API mechanism, the | |
| | | | |
| skipping to change at page 24, line 4 | | skipping to change at page 23, line 57 | |
| | | | |
| Because SPKM-3 negotiates the algorithms, subsequent calls to | | Because SPKM-3 negotiates the algorithms, subsequent calls to | |
| LIPKEY's GSS_Wrap() and GSS_GetMIC() by RPCSEC_GSS will use a quality | | LIPKEY's GSS_Wrap() and GSS_GetMIC() by RPCSEC_GSS will use a quality | |
| of protection value of 0 (zero). See section 5.2 of [RFC2025] for an | | of protection value of 0 (zero). See section 5.2 of [RFC2025] for an | |
| explanation. | | explanation. | |
| | | | |
| LIPKEY uses SPKM-3 to create a secure channel in which to pass a user | | LIPKEY uses SPKM-3 to create a secure channel in which to pass a user | |
| name and password from the client to the server. Once the user name | | name and password from the client to the server. Once the user name | |
| and password have been accepted by the server, calls to the LIPKEY | | and password have been accepted by the server, calls to the LIPKEY | |
| context are redirected to the SPKM-3 context. See [RFC2847] for more | | context are redirected to the SPKM-3 context. See [RFC2847] for more | |
|
| details. | | details. | |
| | | | |
|
| 3.2.1.3. SPKM-3 as a security triple | | 3.2.1.3. SPKM-3 as a security triple | |
| | | | |
| The SPKM-3 GSS-API mechanism as described in [RFC2847] MUST be | | The SPKM-3 GSS-API mechanism as described in [RFC2847] MUST be | |
| implemented and provide the following security triples. The | | implemented and provide the following security triples. The | |
| definition of the columns matches the previous subsection "Kerberos | | definition of the columns matches the previous subsection "Kerberos | |
| V5 as security triple". | | V5 as security triple". | |
| | | | |
| 1 2 3 4 5 | | 1 2 3 4 5 | |
| ----------------------------------------------------------------------- | | ----------------------------------------------------------------------- | |
| 390009 spkm3 1.3.6.1.5.5.1.3 negotiated rpc_gss_svc_none | | 390009 spkm3 1.3.6.1.5.5.1.3 negotiated rpc_gss_svc_none | |
| | | | |
| skipping to change at page 25, line 5 | | skipping to change at page 24, line 52 | |
| that are available for use by NFS clients. In turn the NFS server | | that are available for use by NFS clients. In turn the NFS server | |
| may be configured such that each of these entry points may have | | may be configured such that each of these entry points may have | |
| different or multiple security mechanisms in use. | | different or multiple security mechanisms in use. | |
| | | | |
| The security negotiation between client and server must be done with | | The security negotiation between client and server must be done with | |
| a secure channel to eliminate the possibility of a third party | | a secure channel to eliminate the possibility of a third party | |
| intercepting the negotiation sequence and forcing the client and | | intercepting the negotiation sequence and forcing the client and | |
| server to choose a lower level of security than required or desired. | | server to choose a lower level of security than required or desired. | |
| See the section "Security Considerations" for further discussion. | | See the section "Security Considerations" for further discussion. | |
| | | | |
|
| 3.3.1. SECINFO | | 3.3.1. SECINFO | |
| | | | |
| The new SECINFO operation will allow the client to determine, on a | | The new SECINFO operation will allow the client to determine, on a | |
| per filehandle basis, what security triple is to be used for server | | per filehandle basis, what security triple is to be used for server | |
| access. In general, the client will not have to use the SECINFO | | access. In general, the client will not have to use the SECINFO | |
|
| operation except during initial communication with the server or when | | operation except during initial communication with the server or when | |
| the client crosses policy boundaries at the server. It is possible | | the client crosses policy boundaries at the server. It is possible | |
| that the server's policies change during the client's interaction | | that the server's policies change during the client's interaction | |
| therefore forcing the client to negotiate a new security triple. | | therefore forcing the client to negotiate a new security triple. | |
| | | | |
| 3.3.2. Security Error | | 3.3.2. Security Error | |
| | | | |
| Based on the assumption that each NFS version 4 client and server | | Based on the assumption that each NFS version 4 client and server | |
| must support a minimum set of security (i.e. LIPKEY, SPKM-3, and | | must support a minimum set of security (i.e. LIPKEY, SPKM-3, and | |
| Kerberos-V5 all under RPCSEC_GSS), the NFS client will start its | | Kerberos-V5 all under RPCSEC_GSS), the NFS client will start its | |
| | | | |
| skipping to change at page 25, line 42 | | skipping to change at page 25, line 37 | |
| | | | |
| 3.4. Callback RPC Authentication | | 3.4. Callback RPC Authentication | |
| | | | |
| Except as noted elsewhere in this section, the callback RPC | | Except as noted elsewhere in this section, the callback RPC | |
| (described later) MUST mutually authenticate the NFS server to the | | (described later) MUST mutually authenticate the NFS server to the | |
| principal that acquired the clientid (also described later), using | | principal that acquired the clientid (also described later), using | |
| the security flavor the original SETCLIENTID operation used. | | the security flavor the original SETCLIENTID operation used. | |
| | | | |
| For AUTH_NONE, there are no principals, so this is a non-issue. | | For AUTH_NONE, there are no principals, so this is a non-issue. | |
| | | | |
|
| AUTH_SYS has no notions of mutual authentation or a server principal, | | AUTH_SYS has no notions of mutual authentication or a server | |
| so the callback from the server simply uses the AUTH_SYS credential | | principal, so the callback from the server simply uses the AUTH_SYS | |
| that the user used when he set up the delegation. | | credential that the user used when he set up the delegation. | |
| | | | |
| For AUTH_DH, one commonly used convention is that the server uses the | | For AUTH_DH, one commonly used convention is that the server uses the | |
| credential corresponding to this AUTH_DH principal: | | credential corresponding to this AUTH_DH principal: | |
| | | | |
| unix.host@domain | | unix.host@domain | |
| | | | |
| where host and domain are variables corresponding to the name of | | where host and domain are variables corresponding to the name of | |
| server host and directory services domain in which it lives such as a | | server host and directory services domain in which it lives such as a | |
| Network Information System domain or a DNS domain. | | Network Information System domain or a DNS domain. | |
| | | | |
| Because LIPKEY is layered over SPKM-3, it is permissible for the | | Because LIPKEY is layered over SPKM-3, it is permissible for the | |
| server to use SPKM-3 and not LIPKEY for the callback even if the | | server to use SPKM-3 and not LIPKEY for the callback even if the | |
|
| client used LIPKEY for SETCLIENTID. | | client used LIPKEY for SETCLIENTID. | |
| | | | |
| Regardless of what security mechanism under RPCSEC_GSS is being used, | | Regardless of what security mechanism under RPCSEC_GSS is being used, | |
| the NFS server MUST identify itself in GSS-API via a | | the NFS server MUST identify itself in GSS-API via a | |
| GSS_C_NT_HOSTBASED_SERVICE name type. GSS_C_NT_HOSTBASED_SERVICE | | GSS_C_NT_HOSTBASED_SERVICE name type. GSS_C_NT_HOSTBASED_SERVICE | |
|
| names are of the form: | | names are of the form: | |
| | | | |
| service@hostname | | service@hostname | |
| | | | |
| For NFS, the "service" element is | | For NFS, the "service" element is | |
| | | | |
| nfs | | nfs | |
| | | | |
| Implementations of security mechanisms will convert nfs@hostname to | | Implementations of security mechanisms will convert nfs@hostname to | |
| various different forms. For Kerberos V5 and LIPKEY, the following | | various different forms. For Kerberos V5 and LIPKEY, the following | |
| | | | |
| skipping to change at page 27, line 4 | | skipping to change at page 26, line 54 | |
| the SETCLIENTID operation. From an administrative perspective, | | the SETCLIENTID operation. From an administrative perspective, | |
| having a user name, password, and certificate for both the | | having a user name, password, and certificate for both the | |
| client and server is redundant. | | client and server is redundant. | |
| | | | |
| o LIPKEY was intended to minimize additional infrastructure | | o LIPKEY was intended to minimize additional infrastructure | |
| requirements beyond a certificate for the target, and the | | requirements beyond a certificate for the target, and the | |
| expectation is that existing password infrastructure can be | | expectation is that existing password infrastructure can be | |
| leveraged for the initiator. In some environments, a per-host | | leveraged for the initiator. In some environments, a per-host | |
| password does not exist yet. If certificates are used for any | | password does not exist yet. If certificates are used for any | |
| per-host principals, then additional password infrastructure is | | per-host principals, then additional password infrastructure is | |
|
| not needed. | | not needed. | |
| | | | |
| o In cases when a host is both an NFS client and server, it can | | o In cases when a host is both an NFS client and server, it can | |
| share the same per-host certificate. | | share the same per-host certificate. | |
| | | | |
|
| 4. Filehandles | | 4. Filehandles | |
| | | | |
| The filehandle in the NFS protocol is a per server unique identifier | | The filehandle in the NFS protocol is a per server unique identifier | |
| for a filesystem object. The contents of the filehandle are opaque | | for a filesystem object. The contents of the filehandle are opaque | |
| to the client. Therefore, the server is responsible for translating | | to the client. Therefore, the server is responsible for translating | |
| the filehandle to an internal representation of the filesystem | | the filehandle to an internal representation of the filesystem | |
| object. | | object. | |
| | | | |
| 4.1. Obtaining the First Filehandle | | 4.1. Obtaining the First Filehandle | |
| | | | |
| skipping to change at page 29, line 5 | | skipping to change at page 28, line 5 | |
| used, the client can then traverse the entirety of the server's file | | used, the client can then traverse the entirety of the server's file | |
| tree with the LOOKUP operation. A complete discussion of the server | | tree with the LOOKUP operation. A complete discussion of the server | |
| name space is in the section "NFS Server Name Space". | | name space is in the section "NFS Server Name Space". | |
| | | | |
| 4.1.2. Public Filehandle | | 4.1.2. Public Filehandle | |
| | | | |
| The second special filehandle is the PUBLIC filehandle. Unlike the | | The second special filehandle is the PUBLIC filehandle. Unlike the | |
| ROOT filehandle, the PUBLIC filehandle may be bound or represent an | | ROOT filehandle, the PUBLIC filehandle may be bound or represent an | |
| arbitrary filesystem object at the server. The server is responsible | | arbitrary filesystem object at the server. The server is responsible | |
| for this binding. It may be that the PUBLIC filehandle and the ROOT | | for this binding. It may be that the PUBLIC filehandle and the ROOT | |
| filehandle refer to the same filesystem object. However, it is up to | | filehandle refer to the same filesystem object. However, it is up to | |
| the administrative software at the server and the policies of the | | the administrative software at the server and the policies of the | |
| server administrator to define the binding of the PUBLIC filehandle | | server administrator to define the binding of the PUBLIC filehandle | |
| and server filesystem object. The client may not make any | | and server filesystem object. The client may not make any | |
| assumptions about this binding. The client uses the PUBLIC filehandle | | assumptions about this binding. The client uses the PUBLIC filehandle | |
| via the PUTPUBFH operation. | | via the PUTPUBFH operation. | |
| | | | |
| 4.2. Filehandle Types | | 4.2. Filehandle Types | |
| | | | |
| skipping to change at page 29, line 55 | | skipping to change at page 28, line 55 | |
| opaque. The client stores filehandles for use in a later request and | | opaque. The client stores filehandles for use in a later request and | |
| can compare two filehandles from the same server for equality by | | can compare two filehandles from the same server for equality by | |
| doing a byte-by-byte comparison. However, the client MUST NOT | | doing a byte-by-byte comparison. However, the client MUST NOT | |
| otherwise interpret the contents of filehandles. If two filehandles | | otherwise interpret the contents of filehandles. If two filehandles | |
| from the same server are equal, they MUST refer to the same file. | | from the same server are equal, they MUST refer to the same file. | |
| Servers SHOULD try to maintain a one-to-one correspondence between | | Servers SHOULD try to maintain a one-to-one correspondence between | |
| filehandles and files but this is not required. Clients MUST use | | filehandles and files but this is not required. Clients MUST use | |
| filehandle comparisons only to improve performance, not for correct | | filehandle comparisons only to improve performance, not for correct | |
| behavior. All clients need to be prepared for situations in which it | | behavior. All clients need to be prepared for situations in which it | |
| cannot be determined whether two filehandles denote the same object | | cannot be determined whether two filehandles denote the same object | |
|
| and in such cases, avoid making invalid assumpions which might cause | | and in such cases, avoid making invalid assumptions which might cause | |
| incorrect behavior. Further discussion of filehandle and attribute | | incorrect behavior. Further discussion of filehandle and attribute | |
| comparison in the context of data caching is presented in the section | | comparison in the context of data caching is presented in the section | |
| "Data Caching and File Identity". | | "Data Caching and File Identity". | |
| | | | |
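The comparison rule above is asymmetric: equal filehandles from the same server identify the same object, but unequal filehandles prove nothing. A minimal sketch in Python follows; the function name same_object and the use of None for "cannot be determined" are illustrative assumptions, and both arguments are assumed to come from the same server.

```python
from typing import Optional


def same_object(fh_a: bytes, fh_b: bytes) -> Optional[bool]:
    """Compare two opaque filehandles obtained from the same server.

    Returns True when the handles are byte-for-byte equal, which means
    they refer to the same file. Returns None when they differ, because
    the server is not required to keep a one-to-one mapping between
    filehandles and files, so different handles may still denote the
    same object. The handle contents are never interpreted.
    """
    if fh_a == fh_b:
        return True
    return None
```

A caller would treat True as permission to share cached state between the two handles and treat None as "assume nothing", which keeps the comparison a performance aid rather than a correctness requirement.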
| As an example, in the case that two different path names when | | As an example, in the case that two different path names when | |
| traversed at the server terminate at the same filesystem object, the | | traversed at the server terminate at the same filesystem object, the | |
| server SHOULD return the same filehandle for each path. This can | | server SHOULD return the same filehandle for each path. This can | |
| occur if a hard link is used to create two file names which refer to | | occur if a hard link is used to create two file names which refer to | |
| the same underlying file object and associated data. For example, if | | the same underlying file object and associated data. For example, if | |
| paths /a/b/c and /a/d/c refer to the same file, the server SHOULD | | paths /a/b/c and /a/d/c refer to the same file, the server SHOULD | |
| | | | |
| skipping to change at page 31, line 5 | | skipping to change at page 30, line 5 | |
| server should return NFS4ERR_STALE to the client (as is the case for | | server should return NFS4ERR_STALE to the client (as is the case for | |
| persistent filehandles). In all other cases where the server | | persistent filehandles). In all other cases where the server | |
| determines that a volatile filehandle can no longer be used, it | | determines that a volatile filehandle can no longer be used, it | |
| should return an error of NFS4ERR_FHEXPIRED. | | should return an error of NFS4ERR_FHEXPIRED. | |
| | | | |
| The mandatory attribute "fh_expire_type" is used by the client to | | The mandatory attribute "fh_expire_type" is used by the client to | |
| determine what type of filehandle the server is providing for a | | determine what type of filehandle the server is providing for a | |
| particular filesystem. This attribute is a bitmask with the | | particular filesystem. This attribute is a bitmask with the | |
| following values: | | following values: | |
| | | | |
| FH4_PERSISTENT | | FH4_PERSISTENT | |
| The value of FH4_PERSISTENT is used to indicate a persistent | | The value of FH4_PERSISTENT is used to indicate a persistent | |
| filehandle, which is valid until the object is removed from the | | filehandle, which is valid until the object is removed from the | |
| filesystem. The server will not return NFS4ERR_FHEXPIRED for | | filesystem. The server will not return NFS4ERR_FHEXPIRED for | |
| this filehandle. FH4_PERSISTENT is defined as a value in which | | this filehandle. FH4_PERSISTENT is defined as a value in which | |
| none of the bits specified below are set. | | none of the bits specified below are set. | |
| | | | |
| FH4_VOLATILE_ANY | | FH4_VOLATILE_ANY | |
| The filehandle may expire at any time, except as specifically | | The filehandle may expire at any time, except as specifically | |
| | | | |
| skipping to change at page 31, line 55 | | skipping to change at page 30, line 55 | |
| but not all filehandles upon migration (e.g. all but those that | | but not all filehandles upon migration (e.g. all but those that | |
| are open), FH4_VOLATILE_ANY (in this case with | | are open), FH4_VOLATILE_ANY (in this case with | |
| FH4_NOEXPIRE_WITH_OPEN) is a better choice since the client may | | FH4_NOEXPIRE_WITH_OPEN) is a better choice since the client may | |
| not assume that all filehandles will expire when migration | | not assume that all filehandles will expire when migration | |
| occurs, and it is likely that additional expirations will occur | | occurs, and it is likely that additional expirations will occur | |
| (as a result of file CLOSE) that are separated in time from the | | (as a result of file CLOSE) that are separated in time from the | |
| migration event itself. | | migration event itself. | |
| | | | |
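Since fh_expire_type is a bitmask, a client typically decodes it by testing individual bits. The sketch below is illustrative only: the helper describe_fh_expire_type is hypothetical, and the numeric bit values (as well as the names FH4_VOL_MIGRATION and FH4_VOL_RENAME, which do not appear in the text shown here) are assumptions that should be checked against the protocol's XDR definition.

```python
from typing import List

# Assumed bit assignments; verify against the protocol's XDR definition.
FH4_PERSISTENT = 0x00000000
FH4_NOEXPIRE_WITH_OPEN = 0x00000001
FH4_VOLATILE_ANY = 0x00000002
FH4_VOL_MIGRATION = 0x00000004
FH4_VOL_RENAME = 0x00000008

_FLAGS = [
    (FH4_NOEXPIRE_WITH_OPEN, "FH4_NOEXPIRE_WITH_OPEN"),
    (FH4_VOLATILE_ANY, "FH4_VOLATILE_ANY"),
    (FH4_VOL_MIGRATION, "FH4_VOL_MIGRATION"),
    (FH4_VOL_RENAME, "FH4_VOL_RENAME"),
]


def describe_fh_expire_type(mask: int) -> List[str]:
    """Return the names of the bits set in an fh_expire_type value.

    FH4_PERSISTENT is the value in which none of the bits are set.
    """
    if mask == FH4_PERSISTENT:
        return ["FH4_PERSISTENT"]
    return [name for bit, name in _FLAGS if mask & bit]
```

For the case discussed above, a server that preserves only open filehandles across migration would advertise FH4_VOLATILE_ANY combined with FH4_NOEXPIRE_WITH_OPEN.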
| 4.2.4. One Method of Constructing a Volatile Filehandle | | 4.2.4. One Method of Constructing a Volatile Filehandle | |
| | | | |
|
| As mentioned, in some instances a filehandle is stale (no longer | | | |
| valid; perhaps because the file was removed from the server) or it is | | | |
| expired (the underlying file is valid but since the filehandle is | | | |
| volatile, it may have expired). Thus the server needs to be able to | | | |
| return NFS4ERR_STALE in the former case and NFS4ERR_FHEXPIRED in the | | | |
| latter case. This can be done by careful construction of the volatile | | | |
| filehandle. One possible implementation follows. | | | |
| | | | |
| A volatile filehandle, while opaque to the client could contain: | | A volatile filehandle, while opaque to the client could contain: | |
| | | | |
| [volatile bit = 1 | server boot time | slot | generation number] | | [volatile bit = 1 | server boot time | slot | generation number] | |
| | | | |
| o slot is an index in the server volatile filehandle table | | o slot is an index in the server volatile filehandle table | |
| | | | |
| o generation number is the generation number for the table | | o generation number is the generation number for the table | |
| entry/slot | | entry/slot | |
| | | | |
|
| If the server boot time is less than the current server boot time, | | When the client presents a volatile filehandle, the server makes the | |
| return NFS4ERR_FHEXPIRED. If slot is out of range, return | | following checks, which assume that the check for the volatile bit | |
| | | has passed. If the server boot time is less than the current server | |
| | | boot time, return NFS4ERR_FHEXPIRED. If slot is out of range, return | |
| NFS4ERR_BADHANDLE. If the generation number does not match, return | | NFS4ERR_BADHANDLE. If the generation number does not match, return | |
| NFS4ERR_FHEXPIRED. | | NFS4ERR_FHEXPIRED. | |
| | | | |
| When the server reboots, the table is gone (it is volatile). | | When the server reboots, the table is gone (it is volatile). | |
| | | | |
| If volatile bit is 0, then it is a persistent filehandle with a | | If volatile bit is 0, then it is a persistent filehandle with a | |
| different structure following it. | | different structure following it. | |
| | | | |
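The server-side checks described above can be sketched as follows. This is one possible rendering in Python, not a definitive implementation: the VolatileFH record, the Status enum, and the representation of the server's table as a list of generation numbers indexed by slot are all assumptions made for illustration, and the volatile bit is assumed to have been checked already.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Status(Enum):
    NFS4_OK = "NFS4_OK"
    NFS4ERR_BADHANDLE = "NFS4ERR_BADHANDLE"
    NFS4ERR_FHEXPIRED = "NFS4ERR_FHEXPIRED"


@dataclass(frozen=True)
class VolatileFH:
    """Decoded fields of a volatile filehandle (volatile bit = 1)."""
    boot_time: int   # server boot time recorded in the handle
    slot: int        # index into the server's volatile filehandle table
    generation: int  # generation number of the table entry/slot


def check_volatile_fh(fh: VolatileFH, current_boot_time: int,
                      table_generations: List[int]) -> Status:
    """Apply the three checks in the order given in the text."""
    if fh.boot_time < current_boot_time:
        # Handle was issued before the last reboot; the table is gone.
        return Status.NFS4ERR_FHEXPIRED
    if not 0 <= fh.slot < len(table_generations):
        return Status.NFS4ERR_BADHANDLE
    if table_generations[fh.slot] != fh.generation:
        # Slot has been reused since this handle was issued.
        return Status.NFS4ERR_FHEXPIRED
    return Status.NFS4_OK
```

Checking the boot time first means a handle from a previous boot is reported as expired without consulting the rebuilt table, so a stale slot index is never used to index it.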
| 4.3. Client Recovery from Filehandle Expiration | | 4.3. Client Recovery from Filehandle Expiration | |
| | | | |
| | | | |
| skipping to change at page 33, line 4 | | skipping to change at page 31, line 50 | |
| from the filesystem, obviously the client will not be able to recover | | from the filesystem, obviously the client will not be able to recover | |
| from the expired filehandle. | | from the expired filehandle. | |
| | | | |
| It is also possible that the expired filehandle refers to a file that | | It is also possible that the expired filehandle refers to a file that | |
| has been renamed. If the file was renamed by another client, again | | has been renamed. If the file was renamed by another client, again | |
| it is possible that the original client will not be able to recover. | | it is possible that the original client will not be able to recover. | |
| However, in the case that the client itself is renaming the file and | | However, in the case that the client itself is renaming the file and | |
| the file is open, it is possible that the client may be able to | | the file is open, it is possible that the client may be able to | |
| recover. The client can determine the new path name based on the | | recover. The client can determine the new path name based on the | |
| processing of the rename request. The client can then regenerate the | | processing of the rename request. The client can then regenerate the | |
|
| new filehandle based on the new path name. The client could also use | | new filehandle based on the new path name. The client could also use | |
| the compound operation mechanism to construct a set of operations | | the compound operation mechanism to construct a set of operations | |
| like: | | like: | |
| RENAME A B | | RENAME A B | |
| LOOKUP B | | LOOKUP B | |
| GETFH | | GETFH | |
| Note that the COMPOUND procedure does not provide atomicity. This | | Note that the COMPOUND procedure does not provide atomicity. This | |
| example only reduces the overhead of recovering from an expired | | example only reduces the overhead of recovering from an expired | |
|
| filehandle. | | filehandle. | |
| | | | |
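The lack of atomicity noted above can be illustrated with a toy model. The sketch below is not the protocol's evaluation procedure: it assumes a flat name-to-filehandle dictionary in place of a real directory tree, the function run_compound is hypothetical, and only the three operations from the example are modelled. It shows operations evaluated in order, with evaluation stopping at the first failure and no rollback of earlier operations.

```python
from typing import Dict, List, Optional, Tuple


def run_compound(namespace: Dict[str, bytes],
                 ops: List[Tuple[str, ...]]) -> List[Tuple[str, ...]]:
    """Evaluate ops in order against a toy flat namespace.

    Stops at the first failing operation. Effects of operations that
    already succeeded are kept, which is the non-atomic behaviour.
    """
    results: List[Tuple[str, ...]] = []
    current: Optional[bytes] = None  # toy "current filehandle"
    for op in ops:
        name = op[0]
        if name == "RENAME":
            old, new = op[1], op[2]
            if old not in namespace:
                results.append((name, "NFS4ERR_NOENT"))
                break
            namespace[new] = namespace.pop(old)
            results.append((name, "NFS4_OK"))
        elif name == "LOOKUP":
            if op[1] not in namespace:
                results.append((name, "NFS4ERR_NOENT"))
                break
            current = namespace[op[1]]
            results.append((name, "NFS4_OK"))
        elif name == "GETFH":
            results.append((name, "NFS4_OK", repr(current)))
        else:
            raise ValueError("operation not modelled: " + name)
    return results
```

In this model, if the LOOKUP step fails, the RENAME has still taken effect, so the client must be prepared to finish recovery with a separate request.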
|
| 5. File Attributes | | 5. File Attributes | |
| | | | |
| To meet the requirements of extensibility and increased | | To meet the requirements of extensibility and increased | |
| interoperability with non-UNIX platforms, attributes must be handled | | interoperability with non-UNIX platforms, attributes must be handled | |
| in a flexible manner. The NFS version 3 fattr3 structure contains a | | in a flexible manner. The NFS version 3 fattr3 structure contains a | |
| fixed list of attributes that not all clients and servers are able to | | fixed list of attributes that not all clients and servers are able to | |
| support or care about. The fattr3 structure can not be extended as | | support or care about. The fattr3 structure can not be extended as | |
| new needs arise and it provides no way to indicate non-support. With | | new needs arise and it provides no way to indicate non-support. With | |
| the NFS version 4 protocol, the client is able to query what attributes | | the NFS version 4 protocol, the client is able to query what attributes | |
| | | | |
| skipping to change at page 35, line 5 | | skipping to change at page 34, line 5 | |
| encouraged to define their new attributes as recommended attributes | | encouraged to define their new attributes as recommended attributes | |
| by bringing them to the IETF standards-track process. | | by bringing them to the IETF standards-track process. | |
| | | | |
| The set of attributes which are classified as mandatory is | | The set of attributes which are classified as mandatory is | |
| deliberately small since servers must do whatever it takes to support | | deliberately small since servers must do whatever it takes to support | |
| them. A server should support as many of the recommended attributes | | them. A server should support as many of the recommended attributes | |
| as possible but by their definition, the server is not required to | | as possible but by their definition, the server is not required to | |
| support all of them. Attributes are deemed mandatory if the data is | | support all of them. Attributes are deemed mandatory if the data is | |
| both needed by a large number of clients and is not otherwise | | both needed by a large number of clients and is not otherwise | |
| reasonably computable by the client when support is not provided on | | reasonably computable by the client when support is not provided on | |
| the server. | | the server. | |
| | | | |
| Note that the hidden directory returned by OPENATTR is a convenience | | Note that the hidden directory returned by OPENATTR is a convenience | |
| for protocol processing. The client should not make any assumptions | | for protocol processing. The client should not make any assumptions | |
| about the server's implementation of named attributes and whether the | | about the server's implementation of named attributes and whether the | |
| underlying filesystem at the server has a named attribute directory | | underlying filesystem at the server has a named attribute directory | |
| or not. Therefore, operations such as SETATTR and GETATTR on the | | or not. Therefore, operations such as SETATTR and GETATTR on the | |
| named attribute directory are undefined. | | named attribute directory are undefined. | |
| | | | |
| skipping to change at page 36, line 5 | | skipping to change at page 35, line 5 | |
| fabricate or construct an attribute or whether to do without the | | fabricate or construct an attribute or whether to do without the | |
| attribute. | | attribute. | |
| | | | |
| 5.3. Named Attributes | | 5.3. Named Attributes | |
| | | | |
| These attributes are not supported by direct encoding in the NFS | | These attributes are not supported by direct encoding in the NFS | |
| Version 4 protocol but are accessed by string names rather than | | Version 4 protocol but are accessed by string names rather than | |
| numbers and correspond to an uninterpreted stream of bytes which are | | numbers and correspond to an uninterpreted stream of bytes which are | |
| stored with the filesystem object. The name space for these | | stored with the filesystem object. The name space for these | |
| attributes may be accessed by using the OPENATTR operation. The | | attributes may be accessed by using the OPENATTR operation. The | |
| OPENATTR operation returns a filehandle for a virtual "attribute | | OPENATTR operation returns a filehandle for a virtual "attribute | |
| directory" and further perusal of the name space may be done using | | directory" and further perusal of the name space may be done using | |
| READDIR and LOOKUP operations on this filehandle. Named attributes | | READDIR and LOOKUP operations on this filehandle. Named attributes | |
| may then be examined or changed by normal READ and WRITE and CREATE | | may then be examined or changed by normal READ and WRITE and CREATE | |
| operations on the filehandles returned from READDIR and LOOKUP. | | operations on the filehandles returned from READDIR and LOOKUP. | |
| Named attributes may have attributes. | | Named attributes may have attributes. | |
| | | | |
| It is recommended that servers support arbitrary named attributes. A | | It is recommended that servers support arbitrary named attributes. A | |
| | | | |
| skipping to change at page 36, line 35 | | skipping to change at page 35, line 35 | |
| IETF standards track documents. See the section "IANA | | IETF standards track documents. See the section "IANA | |
| Considerations" for further discussion. | | Considerations" for further discussion. | |
| | | | |
| 5.4. Classification of Attributes | | 5.4. Classification of Attributes | |
| | | | |
| Each of the Mandatory and Recommended attributes can be classified in | | Each of the Mandatory and Recommended attributes can be classified in | |
| one of three categories: per server, per filesystem, or per | | one of three categories: per server, per filesystem, or per | |
| filesystem object. Note that it is possible that some per filesystem | | filesystem object. Note that it is possible that some per filesystem | |
| attributes may vary within the filesystem. See the "homogeneous" | | attributes may vary within the filesystem. See the "homogeneous" | |
| attribute for its definition. Note that the attributes | | attribute for its definition. Note that the attributes | |
|
| time_access_set and time_modify_set are not listed below because they | | time_access_set and time_modify_set are not listed in this section | |
| are write-only attributes used in a special instance of SETATTR. | | because they are write-only attributes corresponding to time_access | |
| | | and time_modify, and are used in a special instance of SETATTR. | |
| | | | |
| o The per server attribute is: | | o The per server attribute is: | |
| | | | |
| lease_time | | lease_time | |
| | | | |
| o The per filesystem attributes are: | | o The per filesystem attributes are: | |
| | | | |
| supp_attr, fh_expire_type, link_support, symlink_support, | | supp_attr, fh_expire_type, link_support, symlink_support, | |
| unique_handles, aclsupport, cansettime, case_insensitive, | | unique_handles, aclsupport, cansettime, case_insensitive, | |
| case_preserving, chown_restricted, files_avail, files_free, | | case_preserving, chown_restricted, files_avail, files_free, | |
| files_total, fs_locations, homogeneous, maxfilesize, maxname, | | files_total, fs_locations, homogeneous, maxfilesize, maxname, | |
| maxread, maxwrite, no_trunc, space_avail, space_free, | | maxread, maxwrite, no_trunc, space_avail, space_free, | |
| space_total, time_delta | | space_total, time_delta | |
| | | | |
| o The per filesystem object attributes are: | | o The per filesystem object attributes are: | |
| | | | |
| type, change, size, named_attr, fsid, rdattr_error, filehandle, | | type, change, size, named_attr, fsid, rdattr_error, filehandle, | |
| ACL, archive, fileid, hidden, maxlink, mimetype, mode, numlinks, | | ACL, archive, fileid, hidden, maxlink, mimetype, mode, numlinks, | |
| owner, owner_group, rawdev, space_used, system, time_access, | | owner, owner_group, rawdev, space_used, system, time_access, | |
| time_backup, time_create, time_metadata, time_modify, | | time_backup, time_create, time_metadata, time_modify, | |
| mounted_on_fileid | | mounted_on_fileid | |
| | | | |
| For quota_avail_hard, quota_avail_soft, and quota_used see their | | For quota_avail_hard, quota_avail_soft, and quota_used see their | |
| definitions below for the appropriate classification. | | definitions below for the appropriate classification. | |
| | | | |
| 5.5. Mandatory Attributes - Definitions | | 5.5. Mandatory Attributes - Definitions | |
| | | | |
| Name # DataType Access Description | | Name # DataType Access Description | |
| ___________________________________________________________________ | | ___________________________________________________________________ | |
| supp_attr 0 bitmap READ The bit vector which | | supp_attr 0 bitmap READ The bit vector which | |
| would retrieve all | | would retrieve all | |
| mandatory and | | mandatory and | |
| recommended attributes | | recommended attributes | |
| that are supported for | | that are supported for | |
| | | | |
| skipping to change at page 38, line 53 | | skipping to change at page 37, line 53 | |
| object's time_metadata | | object's time_metadata | |
| attribute for this | | attribute for this | |
| attribute's value but | | attribute's value but | |
| only if the filesystem | | only if the filesystem | |
| object can not be | | object can not be | |
| updated more | | updated more | |
| frequently than the | | frequently than the | |
| resolution of | | resolution of | |
| time_metadata. | | time_metadata. | |
| | | | |
|
| size 4 uint64 R/W | | size 4 uint64 R/W The size of the object | |
| The size of the object | | | |
| in bytes. | | in bytes. | |
| | | | |
| link_support 5 bool READ True, if the object's | | link_support 5 bool READ True, if the object's | |
| filesystem supports | | filesystem supports | |
| hard links. | | hard links. | |
| | | | |
| symlink_support 6 bool READ True, if the object's | | symlink_support 6 bool READ True, if the object's | |
| filesystem supports | | filesystem supports | |
| symbolic links. | | symbolic links. | |
| | | | |
| named_attr 7 bool READ True, if this object | | named_attr 7 bool READ True, if this object | |
| | | | |
| skipping to change at page 39, line 29 | | skipping to change at page 38, line 29 | |
| attribute directory. | | attribute directory. | |
| | | | |
| fsid 8 fsid4 READ Unique filesystem | | fsid 8 fsid4 READ Unique filesystem | |
| identifier for the | | identifier for the | |
| filesystem holding | | filesystem holding | |
| this object. fsid | | this object. fsid | |
| contains major and | | contains major and | |
| minor components each | | minor components each | |
| of which are uint64. | | of which are uint64. | |
| | | | |
|
| unique_handles 9 bool READ | | unique_handles 9 bool READ True, if two distinct | |
| True, if two distinct | | | |
| filehandles are guaranteed | | filehandles are guaranteed | |
| to refer to two | | to refer to two | |
| different filesystem | | different filesystem | |
| objects. | | objects. | |
| | | | |
| lease_time 10 nfs_lease4 READ Duration of leases at | | lease_time 10 nfs_lease4 READ Duration of leases at | |
| server in seconds. | | server in seconds. | |
| | | | |
| rdattr_error 11 enum READ Error returned from | | rdattr_error 11 enum READ Error returned from | |
| getattr during | | getattr during | |
| readdir. | | readdir. | |
| | | | |
| filehandle 19 nfs_fh4 READ The filehandle of this | | filehandle 19 nfs_fh4 READ The filehandle of this | |
| object (primarily for | | object (primarily for | |
| readdir requests). | | readdir requests). | |
| | | | |
| 5.6. Recommended Attributes - Definitions | | 5.6. Recommended Attributes - Definitions | |
| | | | |
| Name # Data Type Access Description | | Name # Data Type Access Description | |
| ______________________________________________________________________ | | ______________________________________________________________________ | |
| ACL 12 nfsace4<> R/W The access control | | ACL 12 nfsace4<> R/W The access control | |
| list for the object. | | list for the object. | |
| | | | |
| aclsupport 13 uint32 READ Indicates what types | | aclsupport 13 uint32 READ Indicates what types | |
| of ACLs are supported | | of ACLs are supported | |
| | | | |
| skipping to change at page 40, line 35 | | skipping to change at page 39, line 35 | |
| | | | |
| cansettime 15 bool READ True, if the server | | cansettime 15 bool READ True, if the server | |
| is able to change the | | is able to change the | |
| times for a | | times for a | |
| filesystem object as | | filesystem object as | |
| specified in a | | specified in a | |
| SETATTR operation. | | SETATTR operation. | |
| | | | |
| case_insensitive 16 bool READ True, if filename | | case_insensitive 16 bool READ True, if filename | |
| comparisons on this | | comparisons on this | |
|
| filesystem case | | filesystem are case | |
| insensitive. | | insensitive. | |
| | | | |
| case_preserving 17 bool READ True, if filename | | case_preserving 17 bool READ True, if filename | |
| case on this | | case on this | |
|
| filesystem preserved. | | filesystem are | |
| | | preserved. | |
| | | | |
| chown_restricted 18 bool READ If TRUE, the server | | chown_restricted 18 bool READ If TRUE, the server | |
| will reject any | | will reject any | |
| request to change | | request to change | |
| either the owner or | | either the owner or | |
| the group associated | | the group associated | |
| with a file if the | | with a file if the | |
| caller is not a | | caller is not a | |
| privileged user (for | | privileged user (for | |
| example, "root" in | | example, "root" in | |
| UNIX operating | | UNIX operating | |
| environments or in | | environments or in | |
| Windows 2000 the | | Windows 2000 the | |
| "Take Ownership" | | "Take Ownership" | |
| privilege). | | privilege). | |
| | | | |
| fileid 20 uint64 READ A number uniquely | | fileid 20 uint64 READ A number uniquely | |
| identifying the file | | identifying the file | |
| within the | | within the | |
| filesystem. | | filesystem. | |
| | | | |
| files_avail 21 uint64 READ File slots available | | files_avail 21 uint64 READ File slots available | |
| to this user on the | | to this user on the | |
| filesystem containing | | filesystem containing | |
| this object - this | | this object - this | |
| | | | |
| skipping to change at page 42, line 5 | | skipping to change at page 41, line 5 | |
| are per filesystem | | are per filesystem | |
| attributes the same | | attributes the same | |
| for all filesystem's | | for all filesystem's | |
| objects. | | objects. | |
| | | | |
| maxfilesize 27 uint64 READ Maximum supported | | maxfilesize 27 uint64 READ Maximum supported | |
| file size for the | | file size for the | |
| filesystem of this | | filesystem of this | |
| object. | | object. | |
| | | | |
| maxlink 28 uint32 READ Maximum number of | | maxlink 28 uint32 READ Maximum number of | |
| links for this | | links for this | |
| object. | | object. | |
| | | | |
| maxname 29 uint32 READ Maximum filename size | | maxname 29 uint32 READ Maximum filename size | |
| supported for this | | supported for this | |
| object. | | object. | |
| | | | |
| maxread 30 uint64 READ Maximum read size | | maxread 30 uint64 READ Maximum read size | |
| supported for this | | supported for this | |
| object. | | object. | |
| | | | |
|
| maxwrite 31 uint64 READ | | maxwrite 31 uint64 READ Maximum write size | |
| Maximum write size | | | |
| supported for this | | supported for this | |
| object. This | | object. This | |
| attribute SHOULD be | | attribute SHOULD be | |
| supported if the file | | supported if the file | |
| is writable. Lack of | | is writable. Lack of | |
| this attribute can | | this attribute can | |
| lead to the client | | lead to the client | |
| either wasting | | either wasting | |
| bandwidth or not | | bandwidth or not | |
| receiving the best | | receiving the best | |
| | | | |
| skipping to change at page 43, line 5 | | skipping to change at page 42, line 5 | |
| to this object. | | to this object. | |
| | | | |
| owner 36 utf8<> R/W The string name of | | owner 36 utf8<> R/W The string name of | |
| the owner of this | | the owner of this | |
| object. | | object. | |
| | | | |
| owner_group 37 utf8<> R/W The string name of | | owner_group 37 utf8<> R/W The string name of | |
| the group ownership | | the group ownership | |
| of this object. | | of this object. | |
| | | | |
| quota_avail_hard 38 uint64 READ For definition see | | quota_avail_hard 38 uint64 READ For definition see | |
| "Quota Attributes" | | "Quota Attributes" | |
| section below. | | section below. | |
| | | | |
| quota_avail_soft 39 uint64 READ For definition see | | quota_avail_soft 39 uint64 READ For definition see | |
| "Quota Attributes" | | "Quota Attributes" | |
| section below. | | section below. | |
| | | | |
| quota_used 40 uint64 READ For definition see | | quota_used 40 uint64 READ For definition see | |
| | | | |
| skipping to change at page 44, line 5 | | skipping to change at page 43, line 5 | |
| | | | |
| space_total 44 uint64 READ Total disk space in | | space_total 44 uint64 READ Total disk space in | |
| bytes on the | | bytes on the | |
| filesystem containing | | filesystem containing | |
| this object. | | this object. | |
| | | | |
| space_used 45 uint64 READ Number of filesystem | | space_used 45 uint64 READ Number of filesystem | |
| bytes allocated to | | bytes allocated to | |
| this object. | | this object. | |
| | | | |
| system 46 bool R/W True, if this file is | | system 46 bool R/W True, if this file is | |
| a "system" file with | | a "system" file with | |
| respect to the | | respect to the | |
| Windows API? | | Windows API? | |
| | | | |
| time_access 47 nfstime4 READ The time of last | | time_access 47 nfstime4 READ The time of last | |
| access to the object | | access to the object | |
| by a read that was | | by a read that was | |
| satisfied by the | | satisfied by the | |
| server. | | server. | |
| | | | |
| time_access_set 48 settime4 WRITE Set the time of last | | time_access_set 48 settime4 WRITE Set the time of last | |
| access to the object. | | access to the object. | |
| SETATTR use only. | | SETATTR use only. | |
| | | | |
| time_backup 49 nfstime4 R/W The time of last | | time_backup 49 nfstime4 R/W The time of last | |
| backup of the object. | | backup of the object. | |
| | | | |
|
| time_create 50 nfstime4 R/W | | time_create 50 nfstime4 R/W The time of creation | |
| The time of creation | | | |
| of the object. This | | of the object. This | |
| attribute does not | | attribute does not | |
| have any relation to | | have any relation to | |
| the traditional UNIX | | the traditional UNIX | |
| file attribute | | file attribute | |
| "ctime" or "change | | "ctime" or "change | |
| time". | | time". | |
| | | | |
| time_delta 51 nfstime4 READ Smallest useful | | time_delta 51 nfstime4 READ Smallest useful | |
| server time | | server time | |
| granularity. | | granularity. | |
| | | | |
|
| time_metadata 52 nfstime4 R/W The time of last | | time_metadata 52 nfstime4 READ The time of last | |
| meta-data | | meta-data | |
| modification of the | | modification of the | |
| object. | | object. | |
| | | | |
| time_modify 53 nfstime4 READ The time of last | | time_modify 53 nfstime4 READ The time of last | |
| modification to the | | modification to the | |
| object. | | object. | |
| | | | |
| time_modify_set 54 settime4 WRITE Set the time of last | | time_modify_set 54 settime4 WRITE Set the time of last | |
| modification to the | | modification to the | |
| object. SETATTR use | | object. SETATTR use | |
| only. | | only. | |
| | | | |
| mounted_on_fileid 55 uint64 READ Like fileid, but if | | mounted_on_fileid 55 uint64 READ Like fileid, but if | |
| the target filehandle | | the target filehandle | |
| is the root of a | | is the root of a | |
| filesystem return the | | filesystem return the | |
| fileid of the | | fileid of the | |
| underlying directory. | | underlying directory. | |
| | | | |
| 5.7. Time Access | | 5.7. Time Access | |
| | | | |
| As defined above, the time_access attribute represents the time of | | As defined above, the time_access attribute represents the time of | |
| last access to the object by a read that was satisfied by the server. | | last access to the object by a read that was satisfied by the server. | |
| The notion of what is an "access" depends on the server's operating | | The notion of what is an "access" depends on the server's operating | |
| environment and/or the server's filesystem semantics. For example, | | environment and/or the server's filesystem semantics. For example, | |
| for servers obeying POSIX semantics, time_access would be updated | | for servers obeying POSIX semantics, time_access would be updated | |
| only by the READLINK, READ, and READDIR operations and not any of the | | only by the READLINK, READ, and READDIR operations and not any of the | |
| operations that modify the content of the object. Of course, setting | | operations that modify the content of the object. Of course, setting | |
| the corresponding time_access_set attribute is another way to modify | | the corresponding time_access_set attribute is another way to modify | |
| the time_access attribute. | | the time_access attribute. | |
| | | | |
|
| Whenever the file object resides on a writeable filesystem, the | | Whenever the file object resides on a writable filesystem, the server | |
| server should make best efforts to record time_access into stable | | should make best efforts to record time_access into stable storage. | |
| storage. However, to mitigate the performance effects of doing so, | | However, to mitigate the performance effects of doing so, and most | |
| and most especially whenever the server is satisifying the read of | | especially whenever the server is satisfying the read of the object's | |
| the object's content from its cache, the server MAY cache access time | | content from its cache, the server MAY cache access time updates and | |
| updates and lazily write them to stable storage. It is also | | lazily write them to stable storage. It is also acceptable to give | |
| acceptable to give administrators of the server the option to disable | | administrators of the server the option to disable time_access | |
| time_access updates. | | updates. | |
| | | | |
| 5.8. Interpreting owner and owner_group | | 5.8. Interpreting owner and owner_group | |
| | | | |
| The recommended attributes "owner" and "owner_group" (and also users | | The recommended attributes "owner" and "owner_group" (and also users | |
| and groups within the "acl" attribute) are represented in terms of a | | and groups within the "acl" attribute) are represented in terms of a | |
| UTF-8 string. To avoid a representation that is tied to a particular | | UTF-8 string. To avoid a representation that is tied to a particular | |
| underlying implementation at the client or server, the use of the | | underlying implementation at the client or server, the use of the | |
| UTF-8 string has been chosen. Note that section 6.1 of [RFC2624] | | UTF-8 string has been chosen. Note that section 6.1 of [RFC2624] | |
| provides additional rationale. It is expected that the client and | | provides additional rationale. It is expected that the client and | |
| server will have their own local representation of owner and | | server will have their own local representation of owner and | |
| | | | |
| skipping to change at page 46, line 5 | | skipping to change at page 45, line 5 | |
| to these security principals. When these local identifiers are | | to these security principals. When these local identifiers are | |
| translated to the form of the owner attribute, associated with files | | translated to the form of the owner attribute, associated with files | |
| created by such principals they identify, in a common format, the | | created by such principals they identify, in a common format, the | |
| users associated with each corresponding set of security principals. | | users associated with each corresponding set of security principals. | |
| | | | |
| The translation used to interpret owner and group strings is not | | The translation used to interpret owner and group strings is not | |
| specified as part of the protocol. This allows various solutions to | | specified as part of the protocol. This allows various solutions to | |
| be employed. For example, a local translation table may be consulted | | be employed. For example, a local translation table may be consulted | |
| that maps a numeric id to the user@dns_domain syntax. A name | | that maps a numeric id to the user@dns_domain syntax. A name | |
| service may also be used to accomplish the translation. A server may | | service may also be used to accomplish the translation. A server may | |
| provide a more general service, not limited by any particular | | provide a more general service, not limited by any particular | |
| translation (which would only translate a limited set of possible | | translation (which would only translate a limited set of possible | |
| strings) by storing the owner and owner_group attributes in local | | strings) by storing the owner and owner_group attributes in local | |
| storage without any translation or it may augment a translation | | storage without any translation or it may augment a translation | |
| method by storing the entire string for attributes for which no | | method by storing the entire string for attributes for which no | |
| translation is available while using the local representation for | | translation is available while using the local representation for | |
| those cases in which a translation is available. | | those cases in which a translation is available. | |
| | | | |
| | | | |
| skipping to change at page 47, line 5 | | skipping to change at page 46, line 5 | |
| unsigned uid's and gid's, owner and group strings that consist of | | unsigned uid's and gid's, owner and group strings that consist of | |
| decimal numeric values with no leading zeros can be given a special | | decimal numeric values with no leading zeros can be given a special | |
| interpretation by clients and servers which choose to provide such | | interpretation by clients and servers which choose to provide such | |
| support. The receiver may treat such a user or group string as | | support. The receiver may treat such a user or group string as | |
| representing the same user as would be represented by a v2/v3 uid or | | representing the same user as would be represented by a v2/v3 uid or | |
| gid having the corresponding numeric value. A server is not | | gid having the corresponding numeric value. A server is not | |
| obligated to accept such a string, but may return an NFS4ERR_BADOWNER | | obligated to accept such a string, but may return an NFS4ERR_BADOWNER | |
| instead. To avoid this mechanism being used to subvert user and | | instead. To avoid this mechanism being used to subvert user and | |
| group translation, so that a client might pass all of the owners and | | group translation, so that a client might pass all of the owners and | |
| groups in numeric form, a server SHOULD return an NFS4ERR_BADOWNER | | groups in numeric form, a server SHOULD return an NFS4ERR_BADOWNER | |
| error when there is a valid translation for the user or owner | | error when there is a valid translation for the user or owner | |
| designated in this way. In that case, the client must use the | | designated in this way. In that case, the client must use the | |
| appropriate name@domain string and not the special form for | | appropriate name@domain string and not the special form for | |
| compatibility. | | compatibility. | |
| | | | |
| The owner string "nobody" may be used to designate an anonymous user, | | The owner string "nobody" may be used to designate an anonymous user, | |
| which will be associated with a file created by a security principal | | which will be associated with a file created by a security principal | |
| that cannot be mapped through normal means to the owner attribute. | | that cannot be mapped through normal means to the owner attribute. | |
| | | | |
| skipping to change at page 48, line 5 | | skipping to change at page 47, line 5 | |
| allocations to other files or directories. | | allocations to other files or directories. | |
| | | | |
| quota_used | | quota_used | |
| The value in bytes which represents the amount of disc space used | | The value in bytes which represents the amount of disc space used | |
| by this file or directory and possibly a number of other similar | | by this file or directory and possibly a number of other similar | |
| files or directories, where the set of "similar" meets at least | | files or directories, where the set of "similar" meets at least | |
| the criterion that allocating space to any file or directory in | | the criterion that allocating space to any file or directory in | |
| the set will reduce the "quota_avail_hard" of every other file | | the set will reduce the "quota_avail_hard" of every other file | |
| or directory in the set. | | or directory in the set. | |
| | | | |
| Note that there may be a number of distinct but overlapping sets | | Note that there may be a number of distinct but overlapping sets | |
| of files or directories for which a quota_used value is | | of files or directories for which a quota_used value is | |
| maintained. E.g. "all files with a given owner", "all files with | | maintained. E.g. "all files with a given owner", "all files with | |
| a given group owner", etc. | | a given group owner", etc. | |
| | | | |
| The server is at liberty to choose any of those sets but should | | The server is at liberty to choose any of those sets but should | |
| do so in a repeatable way. The rule may be configured per- | | do so in a repeatable way. The rule may be configured per- | |
| filesystem or may be "choose the set with the smallest quota". | | filesystem or may be "choose the set with the smallest quota". | |
| | | | |
| | | | |
| skipping to change at page 48, line 49 | | skipping to change at page 47, line 49 | |
| | | | |
| To determine if a request succeeds, each nfsace4 entry is processed | | To determine if a request succeeds, each nfsace4 entry is processed | |
| in order by the server. Only ACEs which have a "who" that matches | | in order by the server. Only ACEs which have a "who" that matches | |
| the requester are considered. Each ACE is processed until all of the | | the requester are considered. Each ACE is processed until all of the | |
| bits of the requester's access have been ALLOWED. Once a bit (see | | bits of the requester's access have been ALLOWED. Once a bit (see | |
| below) has been ALLOWED by an ACCESS_ALLOWED_ACE, it is no longer | | below) has been ALLOWED by an ACCESS_ALLOWED_ACE, it is no longer | |
| considered in the processing of later ACEs. If an ACCESS_DENIED_ACE | | considered in the processing of later ACEs. If an ACCESS_DENIED_ACE | |
| is encountered where the requester's access still has unALLOWED bits | | is encountered where the requester's access still has unALLOWED bits | |
| in common with the "access_mask" of the ACE, the request is denied. | | in common with the "access_mask" of the ACE, the request is denied. | |
| However, unlike the ALLOWED and DENIED ACE types, the ALARM and AUDIT | | However, unlike the ALLOWED and DENIED ACE types, the ALARM and AUDIT | |
|
| ACE types do not affect a requestor's access, and instead are for | | ACE types do not affect a requester's access, and instead are for | |
| triggering events as a result of a requestor's access attempt. | | triggering events as a result of a requester's access attempt. | |
| Therefore, all AUDIT and ALARM ACEs are processed until end of the | | Therefore, all AUDIT and ALARM ACEs are processed until end of the | |
|
| ACL. | | ACL. When the ACL is fully processed, if there are bits in | |
| | | requester's mask that have not been considered whether the server | |
| | | allows or denies the access is undefined. If there is a mode | |
| | | attribute on the file, then this cannot happen, since the mode's | |
| | | MODE4_*OTH bits will map to EVERYONE@ ACEs that unambiguously specify | |
| | | the requester's access. | |
| | | | |
| The NFS version 4 ACL model is quite rich. Some server platforms may | | The NFS version 4 ACL model is quite rich. Some server platforms may | |
| provide access control functionality that goes beyond the UNIX-style | | provide access control functionality that goes beyond the UNIX-style | |
| mode attribute, but which is not as rich as the NFS ACL model. So | | mode attribute, but which is not as rich as the NFS ACL model. So | |
| that users can take advantage of this more limited functionality, the | | that users can take advantage of this more limited functionality, the | |
| server may indicate that it supports ACLs as long as it follows the | | server may indicate that it supports ACLs as long as it follows the | |
| guidelines for mapping between its ACL model and the NFS version 4 | | guidelines for mapping between its ACL model and the NFS version 4 | |
| ACL model. | | ACL model. | |
| | | | |
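The ordered ACE processing described earlier in this hunk can be sketched as below. It is a minimal illustration, not a reference implementation: the `Ace` class, the `evaluate_acl` helper, the way a requester is matched against "who", and the two access-mask bit values are assumptions of this sketch. The "undefined" result reflects the wording added in the -03 revision.

```python
from dataclasses import dataclass

# ACE type values as listed for acetype4 in this section.
ACE4_ACCESS_ALLOWED_ACE_TYPE = 0x00000000
ACE4_ACCESS_DENIED_ACE_TYPE = 0x00000001
ACE4_SYSTEM_AUDIT_ACE_TYPE = 0x00000002
ACE4_SYSTEM_ALARM_ACE_TYPE = 0x00000003

# Illustrative access-mask bits (assumed values for this sketch).
READ_DATA = 0x1
WRITE_DATA = 0x2


@dataclass
class Ace:  # hypothetical stand-in for an nfsace4 entry
    type: int
    access_mask: int
    who: str


def evaluate_acl(acl, requester_ids, requested_mask):
    """Return (decision, events); decision is 'allowed', 'denied' or 'undefined'."""
    remaining = requested_mask  # bits not yet ALLOWED
    decision = "allowed" if remaining == 0 else None
    events = []  # matching AUDIT/ALARM ACEs, collected to the end of the ACL
    for ace in acl:
        if ace.who not in requester_ids:
            continue  # only ACEs whose "who" matches are considered
        if ace.type in (ACE4_SYSTEM_AUDIT_ACE_TYPE, ACE4_SYSTEM_ALARM_ACE_TYPE):
            if ace.access_mask & requested_mask:
                events.append(ace)
            continue
        if decision is not None:
            continue  # access already decided; keep scanning for AUDIT/ALARM
        if ace.type == ACE4_ACCESS_ALLOWED_ACE_TYPE:
            remaining &= ~ace.access_mask
            if remaining == 0:
                decision = "allowed"
        elif ace.type == ACE4_ACCESS_DENIED_ACE_TYPE:
            if remaining & ace.access_mask:
                decision = "denied"
    return (decision or "undefined"), events


acl = [
    Ace(ACE4_ACCESS_ALLOWED_ACE_TYPE, READ_DATA, "OWNER@"),
    Ace(ACE4_ACCESS_DENIED_ACE_TYPE, WRITE_DATA, "EVERYONE@"),
    Ace(ACE4_ACCESS_ALLOWED_ACE_TYPE, READ_DATA | WRITE_DATA, "EVERYONE@"),
]
owner = {"OWNER@", "EVERYONE@"}
```

With this ACL the owner may read, but a combined read and write request is denied because the DENY entry is reached while WRITE_DATA is still unALLOWED.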
| The situation is complicated by the fact that a server may have | | The situation is complicated by the fact that a server may have | |
| multiple modules that enforce ACLs. For example, the enforcement for | | multiple modules that enforce ACLs. For example, the enforcement for | |
| NFS version 4 access may be different from the enforcement for local | | NFS version 4 access may be different from the enforcement for local | |
| access, and both may be different from the enforcement for access | | access, and both may be different from the enforcement for access | |
| | | | |
| skipping to change at page 49, line 50 | | skipping to change at page 49, line 4 | |
| dependent) when any access attempt is | | dependent) when any access attempt is | |
| made to a file or directory for the | | made to a file or directory for the | |
| access methods specified in acemask4. | | access methods specified in acemask4. | |
| | | | |
| A server need not support all of the above ACE types. The bitmask | | A server need not support all of the above ACE types. The bitmask | |
| constants used to represent the above definitions within the | | constants used to represent the above definitions within the | |
| aclsupport attribute are as follows: | | aclsupport attribute are as follows: | |
| | | | |
| const ACL4_SUPPORT_ALLOW_ACL = 0x00000001; | | const ACL4_SUPPORT_ALLOW_ACL = 0x00000001; | |
| const ACL4_SUPPORT_DENY_ACL = 0x00000002; | | const ACL4_SUPPORT_DENY_ACL = 0x00000002; | |
| const ACL4_SUPPORT_AUDIT_ACL = 0x00000004; | | const ACL4_SUPPORT_AUDIT_ACL = 0x00000004; | |
| const ACL4_SUPPORT_ALARM_ACL = 0x00000008; | | const ACL4_SUPPORT_ALARM_ACL = 0x00000008; | |
| | | | |
| The semantics of the "type" field follow the descriptions provided | | The semantics of the "type" field follow the descriptions provided | |
| above. | | above. | |
| | | | |
| The constants used for the type field (acetype4) are as follows: | | The constants used for the type field (acetype4) are as follows: | |
| | | | |
| const ACE4_ACCESS_ALLOWED_ACE_TYPE = 0x00000000; | | const ACE4_ACCESS_ALLOWED_ACE_TYPE = 0x00000000; | |
| const ACE4_ACCESS_DENIED_ACE_TYPE = 0x00000001; | | const ACE4_ACCESS_DENIED_ACE_TYPE = 0x00000001; | |
| const ACE4_SYSTEM_AUDIT_ACE_TYPE = 0x00000002; | | const ACE4_SYSTEM_AUDIT_ACE_TYPE = 0x00000002; | |
| const ACE4_SYSTEM_ALARM_ACE_TYPE = 0x00000003; | | const ACE4_SYSTEM_ALARM_ACE_TYPE = 0x00000003; | |
| | | | |
| Clients should not attempt to set an ACE unless the server claims | | Clients should not attempt to set an ACE unless the server claims | |
| support for that ACE type. If the server receives a request to set | | support for that ACE type. If the server receives a request to set | |
|
| an ACE that it cannot store, it must reject the request with | | an ACE that it cannot store, it MUST reject the request with | |
| NFS4ERR_ATTRNOTSUPP. | | NFS4ERR_ATTRNOTSUPP. If the server receives a request to set an ACE | |
| | | that it can store but cannot enforce, the server SHOULD reject the | |
| If the server receives a request to set an ACE that it can store but | | request with NFS4ERR_ATTRNOTSUPP. | |
| cannot enforce, the server SHOULD reject the request. | | | |
| | | | |
| Example: suppose a server can enforce NFS ACLs for NFS access but | | Example: suppose a server can enforce NFS ACLs for NFS access but | |
| cannot enforce ACLs for local access. If arbitrary processes can run | | cannot enforce ACLs for local access. If arbitrary processes can run | |
| on the server, then the server SHOULD NOT indicate ACL support. On | | on the server, then the server SHOULD NOT indicate ACL support. On | |
| the other hand, if only trusted administrative programs run locally, | | the other hand, if only trusted administrative programs run locally, | |
| then the server may indicate ACL support. | | then the server may indicate ACL support. | |
| | | | |
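A client-side check of the aclsupport attribute before attempting to set an ACE might look like the sketch below. The constants are the ones listed above; the mapping table and the `server_supports` helper are hypothetical names introduced only for illustration.

```python
# aclsupport bitmask constants from this section.
ACL4_SUPPORT_ALLOW_ACL = 0x00000001
ACL4_SUPPORT_DENY_ACL = 0x00000002
ACL4_SUPPORT_AUDIT_ACL = 0x00000004
ACL4_SUPPORT_ALARM_ACL = 0x00000008

# acetype4 constants from this section.
ACE4_ACCESS_ALLOWED_ACE_TYPE = 0x00000000
ACE4_ACCESS_DENIED_ACE_TYPE = 0x00000001
ACE4_SYSTEM_AUDIT_ACE_TYPE = 0x00000002
ACE4_SYSTEM_ALARM_ACE_TYPE = 0x00000003

# Hypothetical lookup table: which support bit covers which ACE type.
SUPPORT_BIT_FOR_TYPE = {
    ACE4_ACCESS_ALLOWED_ACE_TYPE: ACL4_SUPPORT_ALLOW_ACL,
    ACE4_ACCESS_DENIED_ACE_TYPE: ACL4_SUPPORT_DENY_ACL,
    ACE4_SYSTEM_AUDIT_ACE_TYPE: ACL4_SUPPORT_AUDIT_ACL,
    ACE4_SYSTEM_ALARM_ACE_TYPE: ACL4_SUPPORT_ALARM_ACL,
}


def server_supports(aclsupport, ace_type):
    """True if the server's aclsupport value claims support for ace_type."""
    bit = SUPPORT_BIT_FOR_TYPE.get(ace_type)
    return bit is not None and bool(aclsupport & bit)


# A server that supports only ALLOW and DENY ACEs.
aclsupport = ACL4_SUPPORT_ALLOW_ACL | ACL4_SUPPORT_DENY_ACL
```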
| 5.11.2. ACE Access Mask | | 5.11.2. ACE Access Mask | |
| | | | |
| The access_mask field contains values based on the following: | | The access_mask field contains values based on the following: | |
| | | | |
| skipping to change at page 50, line 50 | | skipping to change at page 50, line 4 | |
| ADD_FILE Permission to add a new file to a | | ADD_FILE Permission to add a new file to a | |
| directory | | directory | |
| APPEND_DATA Permission to append data to a file | | APPEND_DATA Permission to append data to a file | |
| ADD_SUBDIRECTORY Permission to create a subdirectory in a | | ADD_SUBDIRECTORY Permission to create a subdirectory in a | |
| directory | | directory | |
| READ_NAMED_ATTRS Permission to read the named attributes | | READ_NAMED_ATTRS Permission to read the named attributes | |
| of a file | | of a file | |
| WRITE_NAMED_ATTRS Permission to write the named attributes | | WRITE_NAMED_ATTRS Permission to write the named attributes | |
| of a file | | of a file | |
| EXECUTE Permission to execute a file | | EXECUTE Permission to execute a file | |
| DELETE_CHILD Permission to delete a file or directory | | DELETE_CHILD Permission to delete a file or directory | |
| within a directory | | within a directory | |
| READ_ATTRIBUTES The ability to read basic attributes | | READ_ATTRIBUTES The ability to read basic attributes | |
| (non-acls) of a file | | (non-acls) of a file | |
| WRITE_ATTRIBUTES Permission to change basic attributes | | WRITE_ATTRIBUTES Permission to change basic attributes | |
| (non-acls) of a file | | (non-acls) of a file | |
| | | | |
| DELETE Permission to Delete the file | | DELETE Permission to Delete the file | |
| READ_ACL Permission to Read the ACL | | READ_ACL Permission to Read the ACL | |
| WRITE_ACL Permission to Write the ACL | | WRITE_ACL Permission to Write the ACL | |
| WRITE_OWNER Permission to change the owner | | WRITE_OWNER Permission to change the owner | |
| SYNCHRONIZE Permission to access file locally at the | | SYNCHRONIZE Permission to access file locally at the | |
| server with synchronous reads and writes | | server with synchronous reads and writes | |
| | | | |
| | | | |
| skipping to change at page 51, line 54 | | skipping to change at page 51, line 4 | |
| enabled. | | enabled. | |
| | | | |
| If a server receives a SETATTR request that it cannot accurately | | If a server receives a SETATTR request that it cannot accurately | |
| implement, it should err in the direction of more restricted | | implement, it should err in the direction of more restricted | |
| access. For example, suppose a server cannot distinguish overwriting | | access. For example, suppose a server cannot distinguish overwriting | |
| data from appending new data, as described in the previous paragraph. | | data from appending new data, as described in the previous paragraph. | |
| If a client submits an ACE where APPEND_DATA is set but WRITE_DATA is | | If a client submits an ACE where APPEND_DATA is set but WRITE_DATA is | |
| not (or vice versa), the server should reject the request with | | not (or vice versa), the server should reject the request with | |
| NFS4ERR_ATTRNOTSUPP. Nonetheless, if the ACE has type DENY, the | | NFS4ERR_ATTRNOTSUPP. Nonetheless, if the ACE has type DENY, the | |
| server may silently turn on the other bit, so that both APPEND_DATA | | server may silently turn on the other bit, so that both APPEND_DATA | |
| and WRITE_DATA are denied. | | and WRITE_DATA are denied. | |
| | | | |
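For a server that cannot distinguish overwriting from appending, the SETATTR handling described above could be sketched as follows. The mask bit values, the `AttrNotSupp` exception and the `normalize_write_bits` helper are assumptions of this sketch; the exception merely stands in for returning NFS4ERR_ATTRNOTSUPP.

```python
ACE4_ACCESS_ALLOWED_ACE_TYPE = 0x00000000
ACE4_ACCESS_DENIED_ACE_TYPE = 0x00000001

# Illustrative access-mask bits (assumed values for this sketch).
WRITE_DATA = 0x2
APPEND_DATA = 0x4
WRITE_BITS = WRITE_DATA | APPEND_DATA


class AttrNotSupp(Exception):
    """Stands in for rejecting the request with NFS4ERR_ATTRNOTSUPP."""


def normalize_write_bits(ace_type, access_mask):
    """Return the mask the server would store, or raise AttrNotSupp."""
    present = access_mask & WRITE_BITS
    if present in (0, WRITE_BITS):
        return access_mask  # both or neither set: representable as-is
    if ace_type == ACE4_ACCESS_DENIED_ACE_TYPE:
        # Widening a DENY only restricts access further, so it is safe.
        return access_mask | WRITE_BITS
    raise AttrNotSupp("NFS4ERR_ATTRNOTSUPP")
```

Widening is applied only to DENY entries because turning on the extra bit in an ALLOW entry would grant more access than the client asked for.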
| 5.11.3. ACE flag | | 5.11.3. ACE flag | |
| | | | |
| The "flag" field contains values based on the following descriptions. | | The "flag" field contains values based on the following descriptions. | |
| | | | |
| ACE4_FILE_INHERIT_ACE | | ACE4_FILE_INHERIT_ACE | |
| | | | |
| Can be placed on a directory and indicates that this ACE should be | | Can be placed on a directory and indicates that this ACE should be | |
| added to each new non-directory file created. | | added to each new non-directory file created. | |
| | | | |
| | | | |
| skipping to change at page 52, line 46 | | skipping to change at page 51, line 48 | |
| | | | |
| ACE4_SUCCESSFUL_ACCESS_ACE_FLAG | | ACE4_SUCCESSFUL_ACCESS_ACE_FLAG | |
| | | | |
| ACE4_FAILED_ACCESS_ACE_FLAG | | ACE4_FAILED_ACCESS_ACE_FLAG | |
| | | | |
| The ACE4_SUCCESSFUL_ACCESS_ACE_FLAG (SUCCESS) and | | The ACE4_SUCCESSFUL_ACCESS_ACE_FLAG (SUCCESS) and | |
| ACE4_FAILED_ACCESS_ACE_FLAG (FAILED) flag bits relate only to | | ACE4_FAILED_ACCESS_ACE_FLAG (FAILED) flag bits relate only to | |
| ACE4_SYSTEM_AUDIT_ACE_TYPE (AUDIT) and ACE4_SYSTEM_ALARM_ACE_TYPE | | ACE4_SYSTEM_AUDIT_ACE_TYPE (AUDIT) and ACE4_SYSTEM_ALARM_ACE_TYPE | |
| (ALARM) ACE types. If during the processing of the file's ACL, the | | (ALARM) ACE types. If during the processing of the file's ACL, the | |
| server encounters an AUDIT or ALARM ACE that matches the principal | | server encounters an AUDIT or ALARM ACE that matches the principal | |
|
| attempting the OPEN, the server notes that fact, and the prescence, | | attempting the OPEN, the server notes that fact, and the presence, if | |
| if any, of the SUCCESS and FAILED flags encountered in the AUDIT or | | any, of the SUCCESS and FAILED flags encountered in the AUDIT or | |
| ALARM ACE. Once the server completes the ACL processing, and the | | ALARM ACE. Once the server completes the ACL processing, and the | |
| share reservation processing, and the OPEN call, it then notes if the | | share reservation processing, and the OPEN call, it then notes if the | |
| OPEN succeeded or failed. If the OPEN succeeded, and if the SUCCESS | | OPEN succeeded or failed. If the OPEN succeeded, and if the SUCCESS | |
| flag was set for a matching AUDIT or ALARM, then the appropriate | | flag was set for a matching AUDIT or ALARM, then the appropriate | |
| AUDIT or ALARM event occurs. If the OPEN failed, and if the FAILED | | AUDIT or ALARM event occurs. If the OPEN failed, and if the FAILED | |
| flag was set for the matching AUDIT or ALARM, then the appropriate | | flag was set for the matching AUDIT or ALARM, then the appropriate | |
| AUDIT or ALARM event occurs. Clearly either or both of the SUCCESS | | AUDIT or ALARM event occurs. Clearly either or both of the SUCCESS | |
| or FAILED can be set, but if neither is set, the AUDIT or ALARM ACE | | or FAILED can be set, but if neither is set, the AUDIT or ALARM ACE | |
| is not useful. | | is not useful. | |
| | | | |
| The previously described processing applies to that of the ACCESS | | The previously described processing applies to that of the ACCESS | |
| operation as well. The difference is that "success" or "failure" | | operation as well. The difference is that "success" or "failure" | |
| does not mean whether ACCESS returns NFS4_OK or not. Success means | | does not mean whether ACCESS returns NFS4_OK or not. Success means | |
| whether ACCESS returns all requested and supported bits. Failure | | whether ACCESS returns all requested and supported bits. Failure | |
| means whether ACCESS failed to return a bit that was requested and | | means whether ACCESS failed to return a bit that was requested and | |
| supported. | | supported. | |
| | | | |
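The SUCCESS and FAILED flag handling, together with the special meaning of success for the ACCESS operation, can be sketched as below. The two flag values are assumed here (they are defined elsewhere in the specification and do not appear in this excerpt), and both helper names are hypothetical.

```python
# Flag bits (assumed values; defined outside this excerpt).
ACE4_SUCCESSFUL_ACCESS_ACE_FLAG = 0x00000010
ACE4_FAILED_ACCESS_ACE_FLAG = 0x00000020


def event_fires(ace_flags, succeeded):
    """Whether a matching AUDIT or ALARM ACE triggers its event."""
    wanted = (ACE4_SUCCESSFUL_ACCESS_ACE_FLAG if succeeded
              else ACE4_FAILED_ACCESS_ACE_FLAG)
    return bool(ace_flags & wanted)


def access_succeeded(requested, supported, returned):
    """For ACCESS, success means every requested and supported bit came back."""
    expected = requested & supported
    return (returned & expected) == expected
```

An ACE with neither flag set never fires, which matches the remark that such an ACE is not useful.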
| skipping to change at page 53, line 52 | | skipping to change at page 53, line 4 | |
| should reject the request with NFS4ERR_ATTRNOTSUPP. If the server | | should reject the request with NFS4ERR_ATTRNOTSUPP. If the server | |
| supports a single "inherit ACE" flag that applies to both files and | | supports a single "inherit ACE" flag that applies to both files and | |
| directories, the server may reject the request (i.e., requiring the | | directories, the server may reject the request (i.e., requiring the | |
| client to set both the file and directory inheritance flags). The | | client to set both the file and directory inheritance flags). The | |
| server may also accept the request and silently turn on the | | server may also accept the request and silently turn on the | |
| ACE4_DIRECTORY_INHERIT_ACE flag. | | ACE4_DIRECTORY_INHERIT_ACE flag. | |
| | | | |
| 5.11.4. ACE who | | 5.11.4. ACE who | |
| | | | |
| There are several special identifiers ("who") which need to be | | There are several special identifiers ("who") which need to be | |
| understood universally, rather than in the context of a particular | | understood universally, rather than in the context of a particular | |
| DNS domain. Some of these identifiers cannot be understood when an | | DNS domain. Some of these identifiers cannot be understood when an | |
| NFS client accesses the server, but have meaning when a local process | | NFS client accesses the server, but have meaning when a local process | |
| accesses the file. The ability to display and modify these | | accesses the file. The ability to display and modify these | |
| permissions is permitted over NFS, even if none of the access methods | | permissions is permitted over NFS, even if none of the access methods | |
| on the server understands the identifiers. | | on the server understands the identifiers. | |
| | | | |
| Who Description | | Who Description | |
| _______________________________________________________________ | | _______________________________________________________________ | |
| "OWNER" The owner of the file. | | "OWNER" The owner of the file. | |
| "GROUP" The group associated with the file. | | "GROUP" The group associated with the file. | |
| "EVERYONE" The world. | | "EVERYONE" The world. | |
| "INTERACTIVE" Accessed from an interactive terminal. | | "INTERACTIVE" Accessed from an interactive terminal. | |
| | | | |
| skipping to change at page 54, line 52 | | skipping to change at page 54, line 4 | |
| const MODE4_XGRP = 0x008; /* execute permission: group */ | | const MODE4_XGRP = 0x008; /* execute permission: group */ | |
| const MODE4_ROTH = 0x004; /* read permission: other */ | | const MODE4_ROTH = 0x004; /* read permission: other */ | |
| const MODE4_WOTH = 0x002; /* write permission: other */ | | const MODE4_WOTH = 0x002; /* write permission: other */ | |
| const MODE4_XOTH = 0x001; /* execute permission: other */ | | const MODE4_XOTH = 0x001; /* execute permission: other */ | |
| | | | |
| Bits MODE4_RUSR, MODE4_WUSR, and MODE4_XUSR apply to the principal | | Bits MODE4_RUSR, MODE4_WUSR, and MODE4_XUSR apply to the principal | |
| identified in the owner attribute. Bits MODE4_RGRP, MODE4_WGRP, and | | identified in the owner attribute. Bits MODE4_RGRP, MODE4_WGRP, and | |
| MODE4_XGRP apply to the principals identified in the owner_group | | MODE4_XGRP apply to the principals identified in the owner_group | |
| attribute. Bits MODE4_ROTH, MODE4_WOTH, MODE4_XOTH apply to any | | attribute. Bits MODE4_ROTH, MODE4_WOTH, MODE4_XOTH apply to any | |
| principal that does not match that in the owner attribute, and does not | | principal that does not match that in the owner attribute, and does not | |
| have a group matching that of the owner_group attribute. | | have a group matching that of the owner_group attribute. | |
| | | | |
| The remaining bits are not defined by this protocol and MUST NOT be | | The remaining bits are not defined by this protocol and MUST NOT be | |
| used. The minor version mechanism must be used to define further bit | | used. The minor version mechanism must be used to define further bit | |
| usage. | | usage. | |
| | | | |
| Note that in UNIX, if a file has the MODE4_SGID bit set and no | | Note that in UNIX, if a file has the MODE4_SGID bit set and no | |
| MODE4_XGRP bit set, then READ and WRITE must use mandatory file | | MODE4_XGRP bit set, then READ and WRITE must use mandatory file | |
| locking. | | locking. | |
| | | | |
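How the three groups of mode bits select the permissions that apply to a principal can be sketched as follows. The MODE4_XGRP and MODE4_*OTH values are the ones shown above; the remaining constants follow the same octal layout and are assumed here, as is the rule that the owner class is checked before the group class. `applicable_rwx` is a hypothetical helper.

```python
MODE4_RUSR = 0x100  # assumed, same layout as the constants shown above
MODE4_WUSR = 0x080  # assumed
MODE4_XUSR = 0x040  # assumed
MODE4_RGRP = 0x020  # assumed
MODE4_WGRP = 0x010  # assumed
MODE4_XGRP = 0x008
MODE4_ROTH = 0x004
MODE4_WOTH = 0x002
MODE4_XOTH = 0x001


def applicable_rwx(mode, is_owner, in_owner_group):
    """Return (read, write, execute) for the class the principal falls in."""
    if is_owner:
        r, w, x = MODE4_RUSR, MODE4_WUSR, MODE4_XUSR
    elif in_owner_group:
        r, w, x = MODE4_RGRP, MODE4_WGRP, MODE4_XGRP
    else:
        r, w, x = MODE4_ROTH, MODE4_WOTH, MODE4_XOTH
    return bool(mode & r), bool(mode & w), bool(mode & x)


# rw- for the owner, r-- for the group, --- for everyone else.
mode = MODE4_RUSR | MODE4_WUSR | MODE4_RGRP
```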
| 5.11.6. Mode and ACL Attribute | | 5.11.6. Mode and ACL Attribute | |
| | | | |
| The server that supports both mode and ACL must take care to | | The server that supports both mode and ACL must take care to | |
| | | | |
| skipping to change at page 55, line 55 | | skipping to change at page 55, line 4 | |
| mounted on the mount point. | | mounted on the mount point. | |
| | | | |
| Unlike NFS version 3, NFS version 4 allows a client's LOOKUP request | | Unlike NFS version 3, NFS version 4 allows a client's LOOKUP request | |
| to cross other filesystems. The client detects the filesystem | | to cross other filesystems. The client detects the filesystem | |
| crossing whenever the filehandle argument of LOOKUP has an fsid | | crossing whenever the filehandle argument of LOOKUP has an fsid | |
| attribute different from that of the filehandle returned by LOOKUP. A | | attribute different from that of the filehandle returned by LOOKUP. A | |
| UNIX-based client will consider this a "mount point crossing". UNIX | | UNIX-based client will consider this a "mount point crossing". UNIX | |
| has a legacy scheme for allowing a process to determine its current | | has a legacy scheme for allowing a process to determine its current | |
| working directory. This relies on readdir() of a mount point's parent | | working directory. This relies on readdir() of a mount point's parent | |
| and stat() of the mount point returning fileids as previously | | and stat() of the mount point returning fileids as previously | |
| described. The mounted_on_fileid attribute corresponds to the fileid | | described. The mounted_on_fileid attribute corresponds to the fileid | |
| that readdir() would have returned as described previously. | | that readdir() would have returned as described previously. | |
| | | | |
| While the NFS version 4 client could simply fabricate a fileid | | While the NFS version 4 client could simply fabricate a fileid | |
| corresponding to what mounted_on_fileid provides (and if the server | | corresponding to what mounted_on_fileid provides (and if the server | |
| does not support mounted_on_fileid, the client has no choice), there | | does not support mounted_on_fileid, the client has no choice), there | |
| is a risk that the client will generate a fileid that conflicts with | | is a risk that the client will generate a fileid that conflicts with | |
| one that is already assigned to another object in the filesystem. | | one that is already assigned to another object in the filesystem. | |
| Instead, if the server can provide the mounted_on_fileid, the | | Instead, if the server can provide the mounted_on_fileid, the | |
| potential for client operational problems in this area is eliminated. | | potential for client operational problems in this area is eliminated. | |
| | | | |
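A client's crossing check and its choice of the fileid to report for a directory entry can be sketched as below. The attribute dictionaries, the (major, minor) tuple for fsid, and both helper names are assumptions of this sketch.

```python
def crossed_filesystem(dir_attrs, child_attrs):
    """A LOOKUP crossed filesystems if the two fsid values differ."""
    return dir_attrs["fsid"] != child_attrs["fsid"]


def fileid_for_parent_listing(child_attrs):
    """Fileid to report for the entry as seen from its parent directory.

    Returns mounted_on_fileid when the server supplied it; returns None when
    it did not, in which case the client has to fabricate a value itself.
    """
    return child_attrs.get("mounted_on_fileid")


# Hypothetical attributes; fsid is modelled as a (major, minor) tuple.
parent = {"fsid": (1, 0), "fileid": 2}
child = {"fsid": (7, 0), "fileid": 2, "mounted_on_fileid": 4711}
```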
| If the server detects that there is no mount point at the target | | If the server detects that there is no mount point at the target | |
| file object, then the value for mounted_on_fileid that it returns is | | file object, then the value for mounted_on_fileid that it returns is | |
| | | | |
| skipping to change at page 57, line 5 | | skipping to change at page 56, line 5 | |
| fileid of a directory entry returned by readdir(). If | | fileid of a directory entry returned by readdir(). If | |
| mounted_on_fileid is requested in a GETATTR operation, the server | | mounted_on_fileid is requested in a GETATTR operation, the server | |
| should obey an invariant that has it returning a value that is equal | | should obey an invariant that has it returning a value that is equal | |
| to the file object's entry in the object's parent directory, i.e. | | to the file object's entry in the object's parent directory, i.e. | |
| what readdir() would have returned. Some operating environments | | what readdir() would have returned. Some operating environments | |
| allow a series of two or more filesystems to be mounted onto a single | | allow a series of two or more filesystems to be mounted onto a single | |
| mount point. In this case, for the server to obey the aforementioned | | mount point. In this case, for the server to obey the aforementioned | |
| invariant, it will need to find the base mount point, and not the | | invariant, it will need to find the base mount point, and not the | |
| intermediate mount points. | | intermediate mount points. | |
| | | | |
|
| Draft Specification NFS version 4 Protocol August 2002 | | Draft Specification NFS version 4 Protocol September 2002 | |
| | | | |
| 6. Filesystem Migration and Replication | | 6. Filesystem Migration and Replication | |
| | | | |
| With the use of the recommended attribute "fs_locations", the NFS | | With the use of the recommended attribute "fs_locations", the NFS | |
| version 4 server has a method of providing filesystem migration or | | version 4 server has a method of providing filesystem migration or | |
| replication services. For the purposes of migration and replication, | | replication services. For the purposes of migration and replication, | |
| a filesystem will be defined as all files that share a given fsid | | a filesystem will be defined as all files that share a given fsid | |
| (both major and minor values are the same). | | (both major and minor values are the same). | |
| | | | |
| The fs_locations attribute provides a list of filesystem locations. | | The fs_locations attribute provides a list of filesystem locations. | |
| | | | |
| skipping to change at page 58, line 5 | | skipping to change at page 57, line 5 | |
| | | | |
| Once the servers participating in the migration have completed the | | Once the servers participating in the migration have completed the | |
| move of the filesystem, the error NFS4ERR_MOVED will be returned for | | move of the filesystem, the error NFS4ERR_MOVED will be returned for | |
| subsequent requests received by the original server. The | | subsequent requests received by the original server. The | |
| NFS4ERR_MOVED error is returned for all operations except PUTFH and | | NFS4ERR_MOVED error is returned for all operations except PUTFH and | |
| GETATTR. Upon receiving the NFS4ERR_MOVED error, the client will | | GETATTR. Upon receiving the NFS4ERR_MOVED error, the client will | |
| obtain the value of the fs_locations attribute. The client will then | | obtain the value of the fs_locations attribute. The client will then | |
| use the contents of the attribute to redirect its requests to the | | use the contents of the attribute to redirect its requests to the | |
| specified server. To facilitate the use of GETATTR, operations such | | specified server. To facilitate the use of GETATTR, operations such | |
| | | | |
|
| Draft Specification NFS version 4 Protocol August 2002 | | Draft Specification NFS version 4 Protocol September 2002 | |
| | | | |
| as PUTFH must also be accepted by the server for the migrated file | | as PUTFH must also be accepted by the server for the migrated file | |
| system's filehandles. Note that if the server returns NFS4ERR_MOVED, | | system's filehandles. Note that if the server returns NFS4ERR_MOVED, | |
| the server MUST support the fs_locations attribute. | | the server MUST support the fs_locations attribute. | |
| | | | |
| If the client requests more attributes than just fs_locations, the | | If the client requests more attributes than just fs_locations, the | |
| server may return fs_locations only. This is to be expected since | | server may return fs_locations only. This is to be expected since | |
| the server has migrated the filesystem and may not have a method of | | the server has migrated the filesystem and may not have a method of | |
| obtaining additional attribute data. | | obtaining additional attribute data. | |
| | | | |
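The behaviour of the original server after a migration can be sketched as follows. Status codes are represented by their names rather than numeric values, and both helper functions are hypothetical.

```python
# Operations the original server still accepts for a migrated filesystem.
ALLOWED_AFTER_MIGRATION = {"PUTFH", "GETATTR"}


def status_for(op, fs_migrated):
    """Status name the original server returns for one operation."""
    if fs_migrated and op not in ALLOWED_AFTER_MIGRATION:
        return "NFS4ERR_MOVED"
    return "NFS4_OK"


def attrs_after_migration(requested):
    """The server may answer a GETATTR with fs_locations only."""
    return [name for name in requested if name == "fs_locations"]
```

A client that receives NFS4ERR_MOVED would then send PUTFH followed by GETATTR for fs_locations and redirect its requests to the server named there.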
| | | | |
| skipping to change at page 59, line 5 | | skipping to change at page 58, line 5 | |
| | | | |
| The fs_locations struct and attribute then contains an array of | | The fs_locations struct and attribute then contains an array of | |
| locations. Since the name space of each server may be constructed | | locations. Since the name space of each server may be constructed | |
| differently, the "fs_root" field is provided. The path represented | | differently, the "fs_root" field is provided. The path represented | |
| by fs_root represents the location of the filesystem in the server's | | by fs_root represents the location of the filesystem in the server's | |
| name space. Therefore, the fs_root path is only associated with the | | name space. Therefore, the fs_root path is only associated with the | |
| server from which the fs_locations attribute was obtained. The | | server from which the fs_locations attribute was obtained. The | |
| fs_root path is meant to aid the client in locating the filesystem at | | fs_root path is meant to aid the client in locating the filesystem at | |
| the various servers listed. | | the various servers listed. | |
| | | | |
|
| Draft Specification NFS version 4 Protocol August 2002 | | Draft Specification NFS version 4 Protocol September 2002 | |
| | | | |
| As an example, there is a replicated filesystem located at two | | As an example, there is a replicated filesystem located at two | |
| servers (servA and servB). At servA the filesystem is located at | | servers (servA and servB). At servA the filesystem is located at | |
| path "/a/b/c". At servB the filesystem is located at path "/x/y/z". | | path "/a/b/c". At servB the filesystem is located at path "/x/y/z". | |
| In this example the client accesses the filesystem first at servA | | In this example the client accesses the filesystem first at servA | |
| with a multi-component lookup path of "/a/b/c/d". Since the client | | with a multi-component lookup path of "/a/b/c/d". Since the client | |
| used a multi-component lookup to obtain the filehandle at "/a/b/c/d", | | used a multi-component lookup to obtain the filehandle at "/a/b/c/d", | |
| it is unaware that the filesystem's root is located in servA's name | | it is unaware that the filesystem's root is located in servA's name | |
| space at "/a/b/c". When the client switches to servB, it will need | | space at "/a/b/c". When the client switches to servB, it will need | |
| to determine that the directory it first referenced at servA is now | | to determine that the directory it first referenced at servA is now | |
| | | | |
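The path translation implied by the servA/servB example can be sketched as below: the client strips the fs_root reported by the old server from the path it used and prepends the path at which the new server holds the filesystem. `translate_path` and its parameter names are hypothetical.

```python
def translate_path(path, old_fs_root, new_root):
    """Map a path under old_fs_root to the equivalent path under new_root."""
    old = old_fs_root.rstrip("/")
    if path != old and not path.startswith(old + "/"):
        raise ValueError("path is not inside the filesystem rooted at fs_root")
    return new_root.rstrip("/") + path[len(old):]


# The example from the text: "/a/b/c" at servA corresponds to "/x/y/z" at servB.
new_path = translate_path("/a/b/c/d", "/a/b/c", "/x/y/z")
```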
| skipping to change at page 60, line 5 | | skipping to change at page 59, line 5 | |
| of the fh_expire_type attribute, whether volatile filehandles will | | of the fh_expire_type attribute, whether volatile filehandles will | |
| expire at the migration or replication event. If the bit | | expire at the migration or replication event. If the bit | |
| FH4_VOL_MIGRATION is set in the fh_expire_type attribute, the client | | FH4_VOL_MIGRATION is set in the fh_expire_type attribute, the client | |
| must treat the volatile filehandle as if the server had returned the | | must treat the volatile filehandle as if the server had returned the | |
| NFS4ERR_FHEXPIRED error. At the migration or replication event in | | NFS4ERR_FHEXPIRED error. At the migration or replication event in | |
| the presence of the FH4_VOL_MIGRATION bit, the client will not | | the presence of the FH4_VOL_MIGRATION bit, the client will not | |
| present the original or old volatile filehandle to the new server. | | present the original or old volatile filehandle to the new server. | |
| The client will start its communication with the new server by | | The client will start its communication with the new server by | |
| recovering its filehandles using the saved file names. | | recovering its filehandles using the saved file names. | |
| | | | |
|
| Draft Specification NFS version 4 Protocol August 2002 | | Draft Specification NFS version 4 Protocol September 2002 | |
| | | | |
| 7. NFS Server Name Space | | 7. NFS Server Name Space | |
| | | | |
| 7.1. Server Exports | | 7.1. Server Exports | |
| | | | |
| On a UNIX server the name space describes all the files reachable by | | On a UNIX server the name space describes all the files reachable by | |
| pathnames under the root directory or "/". On a Windows NT server | | pathnames under the root directory or "/". On a Windows NT server | |
| the name space constitutes all the files on disks named by mapped | | the name space constitutes all the files on disks named by mapped | |
| disk letters. NFS server administrators rarely make the entire | | disk letters. NFS server administrators rarely make the entire | |
| server's filesystem name space available to NFS clients. More often | | server's filesystem name space available to NFS clients. More often | |
| | | | |
| skipping to change at page 61, line 5 | | skipping to change at page 60, line 5 | |
| the server's name space on the client: it is static. If the server | | the server's name space on the client: it is static. If the server | |
| administrator adds a new export the client will be unaware of it. | | administrator adds a new export the client will be unaware of it. | |
| | | | |
| 7.3. Server Pseudo Filesystem | | 7.3. Server Pseudo Filesystem | |
| | | | |
| NFS version 4 servers avoid this name space inconsistency by | | NFS version 4 servers avoid this name space inconsistency by | |
| presenting all the exports within the framework of a single server | | presenting all the exports within the framework of a single server | |
| name space. An NFS version 4 client uses LOOKUP and READDIR | | name space. An NFS version 4 client uses LOOKUP and READDIR | |
| operations to browse seamlessly from one export to another. Portions | | operations to browse seamlessly from one export to another. Portions | |
| of the server name space that are not exported are bridged via a | | of the server name space that are not exported are bridged via a | |
| "pseudo filesystem" that provides a view of exported directories | | "pseudo filesystem" that provides a view of exported directories | |
| only. A pseudo filesystem has a unique fsid and behaves like a | | only. A pseudo filesystem has a unique fsid and behaves like a | |
| normal, read only filesystem. | | normal, read only filesystem. | |
| | | | |
| Based on the construction of the server's name space, it is possible | | Based on the construction of the server's name space, it is possible | |
| that multiple pseudo filesystems may exist. For example, | | that multiple pseudo filesystems may exist. For example, | |
| | | | |
| /a pseudo filesystem | | /a pseudo filesystem | |
| /a/b real filesystem | | /a/b real filesystem | |
| /a/b/c pseudo filesystem | | /a/b/c pseudo filesystem | |
| /a/b/c/d real filesystem | | /a/b/c/d real filesystem | |
| | | | |
| Each of the pseudo filesystems is considered a separate entity and | | Each of the pseudo filesystems is considered a separate entity and | |
| therefore will have a unique fsid. | | therefore will have a unique fsid. | |
| | | | |
| 7.4. Multiple Roots | | 7.4. Multiple Roots | |
| | | | |
| The DOS and Windows operating environments are sometimes described as | | The DOS and Windows operating environments are sometimes described as | |
|
| having "multiple roots". filesystems are commonly represented as | | having "multiple roots". Filesystems are commonly represented as | |
| disk letters. MacOS represents filesystems as top level names. NFS | | disk letters. MacOS represents filesystems as top level names. NFS | |
| version 4 servers for these platforms can construct a pseudo file | | version 4 servers for these platforms can construct a pseudo file | |
| system above these root names so that disk letters or volume names | | system above these root names so that disk letters or volume names | |
| are simply directory names in the pseudo root. | | are simply directory names in the pseudo root. | |
| | | | |
| 7.5. Filehandle Volatility | | 7.5. Filehandle Volatility | |
| | | | |
| The nature of the server's pseudo filesystem is that it is a logical | | The nature of the server's pseudo filesystem is that it is a logical | |
| representation of filesystem(s) available from the server. | | representation of filesystem(s) available from the server. | |
| Therefore, the pseudo filesystem is most likely constructed | | Therefore, the pseudo filesystem is most likely constructed | |
| | | | |
| skipping to change at page 62, line 5 | | skipping to change at page 61, line 5 | |
| | | | |
| 7.6. Exported Root | | 7.6. Exported Root | |
| | | | |
| If the server's root filesystem is exported, one might conclude that | | If the server's root filesystem is exported, one might conclude that | |
| a pseudo-filesystem is not needed. This would be wrong. Assume the | | a pseudo-filesystem is not needed. This would be wrong. Assume the | |
| following filesystems on a server: | | following filesystems on a server: | |
| | | | |
| / disk1 (exported) | | / disk1 (exported) | |
| /a disk2 (not exported) | | /a disk2 (not exported) | |
| /a/b disk3 (exported) | | /a/b disk3 (exported) | |
| | | | |
| Because disk2 is not exported, disk3 cannot be reached with simple | | Because disk2 is not exported, disk3 cannot be reached with simple | |
| LOOKUPs. The server must bridge the gap with a pseudo-filesystem. | | LOOKUPs. The server must bridge the gap with a pseudo-filesystem. | |
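The bridging requirement in section 7.6 can be illustrated with a small sketch. It is not taken from the draft: the function name pseudo_nodes and the dictionary layout are assumptions made for illustration only. Given the server's mount points and whether each is exported, it lists the directories that would need to be presented through a pseudo filesystem so that every exported filesystem is reachable by LOOKUP from the root.

```python
from pathlib import PurePosixPath


def pseudo_nodes(mounts):
    """Return the paths a pseudo filesystem would have to bridge.

    mounts maps a mount point path to True (exported) or False
    (not exported). This layout is hypothetical, not from the draft.
    """
    exported = {PurePosixPath(p) for p, ok in mounts.items() if ok}
    needed = set()
    for path in exported:
        # Every ancestor of an exported filesystem must be visible.
        # Ancestors that are not themselves exported need a pseudo node.
        for ancestor in path.parents:
            if ancestor not in exported:
                needed.add(ancestor)
    return sorted(str(p) for p in needed)


# The layout from section 7.6: disk2 at /a is not exported.
example = {"/": True, "/a": False, "/a/b": True}
print(pseudo_nodes(example))
```

For the section 7.6 layout the only bridged path is /a, which matches the text: disk3 is unreachable by simple LOOKUPs unless the server fills the gap left by disk2.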
| | | | |
| 7.7. Mount Point Crossing | | 7.7. Mount Point Crossing | |
| | | | |
| The server filesystem environment may be constructed in such a way | | The server filesystem environment may be constructed in such a way | |
| that one filesystem contains a directory which is 'covered' or | | that one filesystem contains a directory which is 'covered' or | |
| | | | |
| skipping to change at page 62, line 53 | | skipping to change at page 61, line 53 | |
| server's perception of the client's ability to authenticate itself | | server's perception of the client's ability to authenticate itself | |
| properly. However, with the support of multiple security mechanisms | | properly. However, with the support of multiple security mechanisms | |
| and the ability to negotiate the appropriate use of these mechanisms, | | and the ability to negotiate the appropriate use of these mechanisms, | |
| the server is unable to properly determine if a client will be able | | the server is unable to properly determine if a client will be able | |
| to authenticate itself. If, based on its policies, the server | | to authenticate itself. If, based on its policies, the server | |
| chooses to limit the contents of the pseudo filesystem, the server | | chooses to limit the contents of the pseudo filesystem, the server | |
| may effectively hide filesystems from a client that may otherwise | | may effectively hide filesystems from a client that may otherwise | |
| have legitimate access. | | have legitimate access. | |
| | | | |
| As suggested practice, the server should apply the security policy of | | As suggested practice, the server should apply the security policy of | |
|
| a shared resource in the server's namespace to the ancestors | | a shared resource in the server's namespace to the components of the | |
| components of the namespace. For example: | | resource's ancestors. For example: | |
| | | | |
| / | | / | |
| /a/b | | /a/b | |
| /a/b/c | | /a/b/c | |
|
| | | | |
| The /a/b/c directory is a real filesystem and is the shared resource. | | The /a/b/c directory is a real filesystem and is the shared resource. | |
| The security policy for /a/b/c is Kerberos with integrity. The | | The security policy for /a/b/c is Kerberos with integrity. The | |
|
| server should should apply the same security policy to /, /a, and | | server should apply the same security policy to /, /a, and /a/b. | |
| /a/b. This allows for the extension of the protection of the | | This allows for the extension of the protection of the server's | |
| server's namespace to the ancestors of the real shared resource. | | namespace to the ancestors of the real shared resource. | |
| | | | |
| For the case of the use of multiple, disjoint security mechanisms in | | For the case of the use of multiple, disjoint security mechanisms in | |
| the server's resources, the security for a particular object in the | | the server's resources, the security for a particular object in the | |
| server's namespace should be the union of all security mechanisms of | | server's namespace should be the union of all security mechanisms of | |
| all direct descendants. | | all direct descendants. | |
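The suggested practice above can be sketched as follows. This is only an illustration under stated assumptions: the function name ancestor_policies, the mapping of paths to sets of mechanism labels, and the labels "krb5i" and "sys" are all hypothetical. The sketch propagates each shared resource's mechanisms to every ancestor, so an ancestor ends up with the union over everything beneath it, which is what repeated application of the "direct descendants" rule would yield.

```python
def ancestor_policies(exports):
    """Compute the security mechanisms to apply to each ancestor.

    exports maps the path of a shared resource to the set of
    security mechanism labels it requires (labels are illustrative).
    """
    policies = {}
    for path, mechanisms in exports.items():
        parts = [p for p in path.split("/") if p]
        # Walk "/", "/a", "/a/b", ... stopping before the resource itself.
        for depth in range(len(parts)):
            ancestor = "/" + "/".join(parts[:depth])
            policies.setdefault(ancestor, set()).update(mechanisms)
    return policies


# The example from the text: /a/b/c requires Kerberos with integrity.
print(ancestor_policies({"/a/b/c": {"krb5i"}}))

# Disjoint mechanisms under a common ancestor are unioned.
print(ancestor_policies({"/a/b": {"krb5i"}, "/a/c": {"sys"}}))
```

In the second call, /a and / each carry both mechanisms, so a client able to use either one can still traverse down to the resource it is allowed to reach.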
| | | | |
| 8. File Locking and Share Reservations | | 8. File Locking and Share Reservations | |
| | | | |
| Integrating locking into the NFS protocol necessarily causes it to be | | Integrating locking into the NFS protocol necessarily causes it to be | |
| stateful. With the inclusion of share reservations the protocol | | stateful. With the inclusion of share reservations the protocol | |
| becomes substantially more dependent on state than the traditional | | becomes substantially more dependent on state than the traditional | |
| combination of NFS and NLM [XNFS]. There are three components to | | combination of NFS and NLM [XNFS]. There are three components to | |
| making this state manageable: | | making this state manageable: | |
| | | | |
| o Clear division between client and server | | o Clear division between client and server | |
| | | | |
| skipping to change at page 65, line 5 | | skipping to change at page 64, line 5 | |
| owner. | | owner. | |
| | | | |
| The following sections describe the transition from the heavy weight | | The following sections describe the transition from the heavy weight | |
| information to the eventual stateid used for most client and server | | information to the eventual stateid used for most client and server | |
| locking and lease interactions. | | locking and lease interactions. | |
| | | | |
| 8.1.1. Client ID | | 8.1.1. Client ID | |
| | | | |
| For each LOCK request, the client must identify itself to the server. | | For each LOCK request, the client must identify itself to the server. | |
| This is done in such a way as to allow for correct lock | | This is done in such a way as to allow for correct lock | |
| identification and crash recovery. A sequence of a SETCLIENTID | | identification and crash recovery. A sequence of a SETCLIENTID | |
| operation followed by a SETCLIENTID_CONFIRM operation is required to | | operation followed by a SETCLIENTID_CONFIRM operation is required to | |
| establish the identification onto the server. Establishment of | | establish the identification onto the server. Establishment of | |
| identification by a new incarnation of the client also has the effect | | identification by a new incarnation of the client also has the effect | |
| of immediately breaking any leased state that a previous incarnation | | of immediately breaking any leased state that a previous incarnation | |
| of the client might have had on the server, as opposed to forcing the | | of the client might have had on the server, as opposed to forcing the | |
| new client incarnation to wait for the leases to expire. Breaking | | new client incarnation to wait for the leases to expire. Breaking | |
| the lease state amounts to the server removing all lock, share | | the lease state amounts to the server removing all lock, share | |
| | | | |
| skipping to change at page 65, line 32 | | skipping to change at page 64, line 32 | |
| | | | |
| struct nfs_client_id4 { | | struct nfs_client_id4 { | |
| verifier4 verifier; | | verifier4 verifier; | |
| opaque id<NFS4_OPAQUE_LIMIT>; | | opaque id<NFS4_OPAQUE_LIMIT>; | |
| }; | | }; | |
| | | | |
| The first field, verifier, is a client incarnation verifier that is | | The first field, verifier, is a client incarnation verifier that is | |
| used to detect client reboots. Only if the verifier is different from | | used to detect client reboots. Only if the verifier is different from | |
| that which the server has previously recorded for the client (as identified | | that which the server has previously recorded for the client (as identified | |
| by the second field of the structure, id) does the server start the | | by the second field of the structure, id) does the server start the | |
|
| process of cancelling the client's leased state. | | process of canceling the client's leased state. | |
| | | | |
| The second field, id, is a variable length string that uniquely | | The second field, id, is a variable length string that uniquely | |
| defines the client. | | defines the client. | |
| | | | |
| There are several considerations for how the client generates the id | | There are several considerations for how the client generates the id | |
| string: | | string: | |
| | | | |
| o The string should be unique so that multiple clients do not | | o The string should be unique so that multiple clients do not | |
| present the same string. The consequences of two clients | | present the same string. The consequences of two clients | |
| presenting the same string range from one client getting an | | presenting the same string range from one client getting an | |
| error to one client having its leased state abruptly and | | error to one client having its leased state abruptly and | |
|
| unexpectedly cancelled. | | unexpectedly canceled. | |
| | | | |
| o The string should be selected so the subsequent incarnations | | o The string should be selected so the subsequent incarnations | |
| (e.g. reboots) of the same client cause the client to present | | (e.g. reboots) of the same client cause the client to present | |
| the same string. The implementor is cautioned against an approach | | the same string. The implementor is cautioned against an approach | |
| that requires the string to be recorded in a local file because | | that requires the string to be recorded in a local file because | |
| this precludes the use of the implementation in an environment | | this precludes the use of the implementation in an environment | |
| where there is no local disk and all file access is from an NFS | | where there is no local disk and all file access is from an NFS | |
| version 4 server. | | version 4 server. | |
| | | | |
| o The string should be different for each server network address | | o The string should be different for each server network address | |
| that the client accesses, rather than common to all server | | that the client accesses, rather than common to all server | |
| network addresses. The reason is that it may not be possible for | | network addresses. The reason is that it may not be possible for | |
| the client to tell if the same server is listening on multiple | | the client to tell if the same server is listening on multiple | |
| network addresses. If the client issues SETCLIENTID with the | | network addresses. If the client issues SETCLIENTID with the | |
| same id string to each network address of such a server, the | | same id string to each network address of such a server, the | |
| server will think it is the same client, and each successive | | server will think it is the same client, and each successive | |
| SETCLIENTID will cause the server to begin the process of | | SETCLIENTID will cause the server to begin the process of | |
| removing the client's previous leased state. | | removing the client's previous leased state. | |
| | | | |
| o The algorithm for generating the string should not assume that | | o The algorithm for generating the string should not assume that | |
| the client's network address won't change. This includes | | the client's network address won't change. This includes | |
| changes between client incarnations and even changes while the | | changes between client incarnations and even changes while the | |
| client is still running in its current incarnation. This | | client is still running in its current incarnation. This | |
| means that if the client includes just the client's and server's | | means that if the client includes just the client's and server's | |
| network address in the id string, there is a real risk, after | | network address in the id string, there is a real risk, after | |
| the client gives up the network address, that another client, | | the client gives up the network address, that another client, | |
|
| using a similar algorithm for generate the id string, will | | using a similar algorithm for generating the id string, will | |
| generating a conflicting id string. | | generate a conflicting id string. | |
| | | | |
| Given the above considerations, an example of a well generated id | | Given the above considerations, an example of a well generated id | |
| string is one that includes: | | string is one that includes: | |
| | | | |
| o The server's network address. | | o The server's network address. | |
| | | | |
| o The client's network address. | | o The client's network address. | |
| | | | |
| o For a user level NFS version 4 client, it should contain | | o For a user level NFS version 4 client, it should contain | |
| additional information to distinguish the client from other user | | additional information to distinguish the client from other user | |
| level clients running on the same host, such as a process id or | | level clients running on the same host, such as a process id or | |
| other unique sequence. | | other unique sequence. | |
| | | | |
| o Additional information that tends to be unique, such as one or | | o Additional information that tends to be unique, such as one or | |
| more of: | | more of: | |
| | | | |
|
| - The client machines serial number (for privacy reasons, it is | | - The client machine's serial number (for privacy reasons, it is | |
| best to perform some one way function on the serial number). | | best to perform some one way function on the serial number). | |
| | | | |
| - A MAC address. | | - A MAC address. | |
| | | | |
| - The timestamp of when the NFS version 4 software was first | | - The timestamp of when the NFS version 4 software was first | |
| installed on the client (though this is subject to the | | installed on the client (though this is subject to the | |
| previously mentioned caution about using information that is | | previously mentioned caution about using information that is | |
| stored in a file, because the file might only be accessible | | stored in a file, because the file might only be accessible | |
| over NFS version 4). | | over NFS version 4). | |
| | | | |
| | | | |
| skipping to change at page 67, line 5 | | skipping to change at page 66, line 5 | |
| the same between client incarnations, this shares the same | | the same between client incarnations, this shares the same | |
| problem as that of using the timestamp of the software | | problem as that of using the timestamp of the software | |
| installation. | | installation. | |
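The considerations above can be combined in a minimal sketch of an id string generator. Nothing here is prescribed by the draft: the function name make_client_id, the "/" separator, the choice of SHA-256 as the one way function, and the 1024-byte limit (assumed to be the value of NFS4_OPAQUE_LIMIT) are all assumptions for illustration.

```python
import hashlib

# Assumed value of NFS4_OPAQUE_LIMIT; check the protocol's XDR definition.
OPAQUE_LIMIT = 1024


def make_client_id(server_addr, client_addr, serial_number, instance=None):
    """Build a hypothetical id string for nfs_client_id4.

    server_addr   -- differs per server address, so each address
                     the client contacts gets a distinct string
    client_addr   -- not sufficient on its own, since it may change
    serial_number -- hashed with a one way function for privacy
    instance      -- e.g. a process id, for user level clients
    """
    digest = hashlib.sha256(serial_number.encode("utf-8")).hexdigest()
    parts = [server_addr, client_addr, digest]
    if instance is not None:
        parts.append(str(instance))
    ident = "/".join(parts).encode("utf-8")
    if len(ident) > OPAQUE_LIMIT:
        raise ValueError("id string exceeds the opaque limit")
    return ident


# The same inputs give the same string across incarnations,
# while a different server address gives a different string.
print(make_client_id("192.0.2.1", "198.51.100.7", "SN-0001"))
print(make_client_id("192.0.2.2", "198.51.100.7", "SN-0001"))
```

Because every input is derived from the machine rather than read from a local file, this sketch also avoids the diskless-client caution raised earlier in the list.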
| | | | |
| As a security measure, the server MUST NOT cancel a client's leased | | As a security measure, the server MUST NOT cancel a client's leased | |
| state if the principal that established the state for a given id string is | | state if the principal that established the state for a given id string is | |
| not the same as the principal issuing the SETCLIENTID. | | not the same as the principal issuing the SETCLIENTID. | |
| | | | |
| Note that SETCLIENTID and SETCLIENTID_CONFIRM have a secondary purpose | | Note that SETCLIENTID and SETCLIENTID_CONFIRM have a secondary purpose | |
| of establishing the information the server needs to make callbacks to | | of establishing the information the server needs to make callbacks to | |
| the client for the purpose of supporting delegations. It is permitted to | | the client for the purpose of supporting delegations. It is permitted to | |
| change this information via SETCLIENTID and SETCLIENTID_CONFIRM | | change this information via SETCLIENTID and SETCLIENTID_CONFIRM | |
| within the same incarnation of the client without removing the | | within the same incarnation of the client without removing the | |
| client's leased state. | | client's leased state. | |
| | | | |
| Once a SETCLIENTID and SETCLIENTID_CONFIRM sequence has successfully | | Once a SETCLIENTID and SETCLIENTID_CONFIRM sequence has successfully | |
| completed, the client uses the short hand client identifier, of type | | completed, the client uses the short hand client identifier, of type | |
| clientid4, instead of the longer and less compact nfs_client_id4 | | clientid4, instead of the longer and less compact nfs_client_id4 | |
|
| structure. This short hand client identfier (a clientid) is assigned | | structure. This short hand client identifier (a clientid) is | |
| by the server and should be chosen so that it will not conflict with | | assigned by the server and should be chosen so that it will not | |
| a clientid previously assigned by the server. This applies across | | conflict with a clientid previously assigned by the server. This | |
| server restarts or reboots. When a clientid is presented to a server | | applies across server restarts or reboots. When a clientid is | |
| and that clientid is not recognized, as would happen after a server | | presented to a server and that clientid is not recognized, as would | |
| reboot, the server will reject the request with the error | | happen after a server reboot, the server will reject the request with | |
| NFS4ERR_STALE_CLIENTID. When this happens, the client must obtain a | | the error NFS4ERR_STALE_CLIENTID. When this happens, the client must | |
| new clientid by use of the SETCLIENTID operation and then proceed to | | obtain a new clientid by use of the SETCLIENTID operation and then | |
| any other necessary recovery for the server reboot case (See the | | proceed to any other necessary recovery for the server reboot case | |
| section "Server Failure and Recovery"). | | (See the section "Server Failure and Recovery"). | |
| | | | |
| The client must also employ the SETCLIENTID operation when it | | The client must also employ the SETCLIENTID operation when it | |
| receives an NFS4ERR_STALE_STATEID error using a stateid derived from | | receives an NFS4ERR_STALE_STATEID error using a stateid derived from | |
| its current clientid, since this also indicates a server reboot which | | its current clientid, since this also indicates a server reboot which | |
| has invalidated the existing clientid (see the next section | | has invalidated the existing clientid (see the next section | |
| "lock_owner and stateid Definition" for details). | | "lock_owner and stateid Definition" for details). | |
| | | | |
| See the detailed descriptions of SETCLIENTID and SETCLIENTID_CONFIRM | | See the detailed descriptions of SETCLIENTID and SETCLIENTID_CONFIRM | |
| for a complete specification of the operations. | | for a complete specification of the operations. | |
| | | | |
| | | | |
| skipping to change at page 68, line 5 | | skipping to change at page 67, line 5 | |
| there had been no activity from that client for many minutes. | | there had been no activity from that client for many minutes. | |
| | | | |
| Note that if the id string in a SETCLIENTID request is properly | | Note that if the id string in a SETCLIENTID request is properly | |
| constructed, and if the client takes care to use the same principal | | constructed, and if the client takes care to use the same principal | |
| for each successive use of SETCLIENTID, then, barring an active | | for each successive use of SETCLIENTID, then, barring an active | |
| denial of service attack, NFS4ERR_CLID_INUSE should never be | | denial of service attack, NFS4ERR_CLID_INUSE should never be | |
| returned. | | returned. | |
| | | | |
| However, client bugs, server bugs, or perhaps a deliberate change of | | However, client bugs, server bugs, or perhaps a deliberate change of | |
| the principal owner of the id string (such as the case of a client | | the principal owner of the id string (such as the case of a client | |
| that changes security flavors, and under the new flavor, there is no | | that changes security flavors, and under the new flavor, there is no | |
| mapping to the previous owner) will in rare cases result in | | mapping to the previous owner) will in rare cases result in | |
| NFS4ERR_CLID_INUSE. | | NFS4ERR_CLID_INUSE. | |
| | | | |
| In that event, when the server gets a SETCLIENTID for a client id | | In that event, when the server gets a SETCLIENTID for a client id | |
| that currently has no state, or it has state, but the lease has | | that currently has no state, or it has state, but the lease has | |
| expired, rather than returning NFS4ERR_CLID_INUSE, the server MUST | | expired, rather than returning NFS4ERR_CLID_INUSE, the server MUST | |
| allow the SETCLIENTID, and confirm the new clientid if followed by | | allow the SETCLIENTID, and confirm the new clientid if followed by | |
| | | | |
| skipping to change at page 69, line 5 | | skipping to change at page 68, line 5 | |
| o The stateid was generated by an earlier server instance (i.e. | | o The stateid was generated by an earlier server instance (i.e. | |
| before a server reboot). The error NFS4ERR_STALE_STATEID should | | before a server reboot). The error NFS4ERR_STALE_STATEID should | |
| be returned. | | be returned. | |
| | | | |
| o The stateid was generated by the current server instance but the | | o The stateid was generated by the current server instance but the | |
| stateid no longer designates the current locking state for the | | stateid no longer designates the current locking state for the | |
| lockowner-file pair in question (i.e. one or more locking | | lockowner-file pair in question (i.e. one or more locking | |
| operations have occurred). The error NFS4ERR_OLD_STATEID should | | operations have occurred). The error NFS4ERR_OLD_STATEID should | |
| be returned. | | be returned. | |
| | | | |
| This error condition will only occur when the client issues a | | This error condition will only occur when the client issues a | |
| locking request which changes a stateid while an I/O request | | locking request which changes a stateid while an I/O request | |
| that uses that stateid is outstanding. | | that uses that stateid is outstanding. | |
| | | | |
| o The stateid was generated by the current server instance but the | | o The stateid was generated by the current server instance but the | |
| stateid does not designate a locking state for any active | | stateid does not designate a locking state for any active | |
| lockowner-file pair. The error NFS4ERR_BAD_STATEID should be | | lockowner-file pair. The error NFS4ERR_BAD_STATEID should be | |
| returned. | | returned. | |
| | | | |
| This error condition will occur when there has been a logic | | This error condition will occur when there has been a logic | |
| error on the part of the client or server. This should not | | error on the part of the client or server. This should not | |
| happen. | | happen. | |
| | | | |
| One mechanism that may be used to satisfy these requirements is for | | One mechanism that may be used to satisfy these requirements is for | |
| the server to, | | the server to, | |
| | | | |
| o divide the "other" field of each stateid into two fields: | | o divide the "other" field of each stateid into two fields: | |
| | | | |
| - A server verifier which uniquely designates a particular | | - A server verifier which uniquely designates a particular | |
|
| server | | server instantiation. | |
| instantiation. | | | |
| | | | |
| - An index into a table of locking-state structures. | | - An index into a table of locking-state structures. | |
| | | | |
| o utilize the "seqid" field of each stateid, such that seqid is | | o utilize the "seqid" field of each stateid, such that seqid is | |
| monotonically incremented for each stateid that is associated | | monotonically incremented for each stateid that is associated | |
| with the same index into the locking-state table. | | with the same index into the locking-state table. | |
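The mechanism just described can be sketched as a small validation routine. This is a rough illustration, not the draft's implementation: the class and field names are hypothetical, errors are represented as plain strings, and the handling of a seqid that is newer than the server's current value is an assumption (the text does not cover that case).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StateId:
    seqid: int     # the "seqid" field
    verifier: int  # first half of "other": server instantiation
    index: int     # second half of "other": locking-state table slot


class LockStateTable:
    def __init__(self, verifier):
        self.verifier = verifier
        self.current_seqid = {}  # table index -> current seqid

    def check(self, sid):
        # Issued by an earlier server instance (before a reboot).
        if sid.verifier != self.verifier:
            return "NFS4ERR_STALE_STATEID"
        current = self.current_seqid.get(sid.index)
        # No active lockowner-file pair at this index.
        if current is None:
            return "NFS4ERR_BAD_STATEID"
        # Superseded by a later locking operation.
        if sid.seqid < current:
            return "NFS4ERR_OLD_STATEID"
        # Assumption: a seqid from the future is treated as bad.
        if sid.seqid > current:
            return "NFS4ERR_BAD_STATEID"
        return "NFS4_OK"


table = LockStateTable(verifier=7)
table.current_seqid[3] = 5
print(table.check(StateId(seqid=5, verifier=7, index=3)))
print(table.check(StateId(seqid=4, verifier=7, index=3)))
print(table.check(StateId(seqid=5, verifier=6, index=3)))
```

Checking the verifier first matters: after a reboot the table is empty, so without that ordering every old stateid would look like a client logic error rather than a signal to begin recovery.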
| | | | |
| By matching the incoming stateid and its field values with the state | | By matching the incoming stateid and its field values with the state | |
| held at the server, the server is able to easily determine if a | | held at the server, the server is able to easily determine if a | |
| stateid is valid for its current instantiation and state. If the | | stateid is valid for its current instantiation and state. If the | |
| | | | |
| skipping to change at page 69, line 56 | | skipping to change at page 69, line 4 | |
| between the old and new size (i.e. the range truncated or added to | | between the old and new size (i.e. the range truncated or added to | |
| the file by means of the SETATTR), even where SETATTR is not | | the file by means of the SETATTR), even where SETATTR is not | |
| explicitly mentioned in the text. | | explicitly mentioned in the text. | |
| | | | |
| If the lock_owner performs a READ or WRITE in a situation in which it | | If the lock_owner performs a READ or WRITE in a situation in which it | |
| has established a lock or share reservation on the server (any OPEN | | has established a lock or share reservation on the server (any OPEN | |
| constitutes a share reservation) the stateid (previously returned by | | constitutes a share reservation) the stateid (previously returned by | |
| the server) must be used to indicate what locks, including both | | the server) must be used to indicate what locks, including both | |
| record locks and share reservations, are held by the lockowner. If | | record locks and share reservations, are held by the lockowner. If | |
| no state is established by the client, either record lock or share | | no state is established by the client, either record lock or share | |
| reservation, a stateid of all bits 0 is used. Regardless of whether a | | reservation, a stateid of all bits 0 is used. Regardless of whether a | |
| stateid of all bits 0, or a stateid returned by the server is used, | | stateid of all bits 0, or a stateid returned by the server is used, | |
| if there is a conflicting share reservation or mandatory record lock | | if there is a conflicting share reservation or mandatory record lock | |
| held on the file, the server MUST refuse to service the READ or WRITE | | held on the file, the server MUST refuse to service the READ or WRITE | |
| operation. | | operation. | |
| | | | |
| Share reservations are established by OPEN operations and by their | | Share reservations are established by OPEN operations and by their | |
| nature are mandatory in that when the OPEN denies READ or WRITE | | nature are mandatory in that when the OPEN denies READ or WRITE | |
| operations, that denial results in such operations being rejected | | operations, that denial results in such operations being rejected | |
| with error NFS4ERR_LOCKED. Record locks may be implemented by the | | with error NFS4ERR_LOCKED. Record locks may be implemented by the | |
| server as either mandatory or advisory, or the choice of mandatory or | | server as either mandatory or advisory, or the choice of mandatory or | |
| advisory behavior may be determined by the server on the basis of the | | advisory behavior may be determined by the server on the basis of the | |
| file being accessed (for example, some UNIX-based servers support a | | file being accessed (for example, some UNIX-based servers support a | |
| "mandatory lock bit" on the mode attribute such that if set, record | | "mandatory lock bit" on the mode attribute such that if set, record | |
| locks are required on the file before I/O is possible). When record | | locks are required on the file before I/O is possible). When record | |
| locks are advisory, they only prevent the granting of conflicting | | locks are advisory, they only prevent the granting of conflicting | |
|
| lock requests and have no effect on READ's or WRITE's. Mandatory | | lock requests and have no effect on READs or WRITEs. Mandatory | |
| record locks, however, prevent conflicting I/O operations. When they | | record locks, however, prevent conflicting I/O operations. When they | |
|
| are attempted, they are rejected with NFS4ERR_LOCKED. Assuming an | | are attempted, they are rejected with NFS4ERR_LOCKED. When the | |
| operating environment like UNIX that requires it, when the client | | client gets NFS4ERR_LOCKED on a file it knows it has the proper share | |
| gets NFS4ERR_LOCKED on a file it knows it has the proper share | | | |
| reservation for, it will need to issue a LOCK request on the region | | reservation for, it will need to issue a LOCK request on the region | |
| of the file that includes the region the I/O was to be performed on, | | of the file that includes the region the I/O was to be performed on, | |
| with an appropriate locktype (i.e. READ*_LT for a READ operation, | | with an appropriate locktype (i.e. READ*_LT for a READ operation, | |
| WRITE*_LT for a WRITE operation). | | WRITE*_LT for a WRITE operation). | |
| | | | |
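The recovery step described above (receive NFS4ERR_LOCKED, then issue a LOCK covering the I/O region with a matching locktype) can be sketched as follows. This is a minimal illustration only: the function name, the `blocking` flag, and the dictionary layout are hypothetical, and only the four locktype names come from the protocol.

```python
def lock_request_for_io(op, offset, count, blocking=False):
    """Describe the LOCK a client might issue after NFS4ERR_LOCKED.

    Hypothetical helper: picks READ*_LT for a READ and WRITE*_LT for a
    WRITE, and covers exactly the region the I/O was to be performed on.
    """
    if op == "READ":
        locktype = "READW_LT" if blocking else "READ_LT"
    elif op == "WRITE":
        locktype = "WRITEW_LT" if blocking else "WRITE_LT"
    else:
        raise ValueError("only READ and WRITE are covered by this sketch")
    # The requested range must include the I/O range; here it is identical.
    return {"locktype": locktype, "offset": offset, "length": count}
```

A client could lock a larger range than the I/O itself; the sketch uses the smallest range that satisfies the text.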
| With NFS version 3, there was no notion of a stateid so there was no | | With NFS version 3, there was no notion of a stateid so there was no | |
| way to tell if the application process of the client sending the READ | | way to tell if the application process of the client sending the READ | |
| or WRITE operation had also acquired the appropriate record lock on | | or WRITE operation had also acquired the appropriate record lock on | |
| the file. Thus there was no way to implement mandatory locking. With | | the file. Thus there was no way to implement mandatory locking. With | |
| the stateid construct, this barrier has been removed. | | the stateid construct, this barrier has been removed. | |
| | | | |
| skipping to change at page 70, line 58 | | skipping to change at page 70, line 5 | |
| NFS4ERR_LOCKED. | | NFS4ERR_LOCKED. | |
| | | | |
| For Windows environments, there are no advisory record locks, so the | | For Windows environments, there are no advisory record locks, so the | |
| server always checks for record locks during I/O requests. | | server always checks for record locks during I/O requests. | |
| | | | |
| Thus, the NFS version 4 LOCK operation does not need to distinguish | | Thus, the NFS version 4 LOCK operation does not need to distinguish | |
| between advisory and mandatory record locks. It is the NFS version 4 | | between advisory and mandatory record locks. It is the NFS version 4 | |
| server's processing of the READ and WRITE operations that introduces | | server's processing of the READ and WRITE operations that introduces | |
| the distinction. | | the distinction. | |
| | | | |
|
| Every stateid other than the special stateid values noted in this | | Draft Specification NFS version 4 Protocol September 2002 | |
| | | | |
| Draft Specification NFS version 4 Protocol August 2002 | | | |
| | | | |
|
| | | Every stateid other than the special stateid values noted in this | |
| section, whether returned by an OPEN-type operation (i.e. OPEN, | | section, whether returned by an OPEN-type operation (i.e. OPEN, | |
| OPEN_DOWNGRADE), or by a LOCK-type operation (i.e. LOCK or LOCKU), | | OPEN_DOWNGRADE), or by a LOCK-type operation (i.e. LOCK or LOCKU), | |
| defines an access mode for the file (i.e. READ, WRITE, or READ-WRITE) | | defines an access mode for the file (i.e. READ, WRITE, or READ-WRITE) | |
| as established by the original OPEN which began the stateid sequence, | | as established by the original OPEN which began the stateid sequence, | |
|
| and as modified by subsequent OPEN's and OPEN_DOWNGRADE's within that | | and as modified by subsequent OPENs and OPEN_DOWNGRADEs within that | |
| stateid sequence. When a READ, WRITE, or SETATTR which specifies the | | stateid sequence. When a READ, WRITE, or SETATTR which specifies the | |
| size attribute, is done, the operation is subject to checking against | | size attribute, is done, the operation is subject to checking against | |
| the access mode to verify that the operation is appropriate given the | | the access mode to verify that the operation is appropriate given the | |
| OPEN with which the operation is associated. | | OPEN with which the operation is associated. | |
| | | | |
|
| In the case of WRITE-type operations (i.e. WRITE's and SETATTR's | | In the case of WRITE-type operations (i.e. WRITEs and SETATTRs which | |
| which set size), the server must verify that the access mode allows | | set size), the server must verify that the access mode allows writing | |
| writing and return an NFS4ERR_OPENMODE error if it does not. In the | | and return an NFS4ERR_OPENMODE error if it does not. In the case, of | |
| case, of READ, the server may perform the corresponding check on the | | READ, the server may perform the corresponding check on the access | |
| access mode, or it may choose to allow READ on opens for WRITE only, | | mode, or it may choose to allow READ on opens for WRITE only, to | |
| to accommodate clients whose write implementation may unavoidably do | | accommodate clients whose write implementation may unavoidably do | |
| reads (e.g. due to buffer cache constraints). However, even if | | reads (e.g. due to buffer cache constraints). However, even if READs | |
| READ's are allowed in these circumstances, the server MUST still | | are allowed in these circumstances, the server MUST still check for | |
| check for locks that conflict with the READ (e.g. another open | | locks that conflict with the READ (e.g. another open specify denial | |
| specify denial of READ's). Note that a server which does enforce the | | of READs). Note that a server which does enforce the access mode | |
| access mode check on READ's need not explicitly check for conflicting | | check on READs need not explicitly check for conflicting share | |
| share reservations since the existence of OPEN for read access | | reservations since the existence of OPEN for read access guarantees | |
| guarantees that no conflicting share reservation can exist. | | that no conflicting share reservation can exist. | |
| | | | |
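The access mode check in the paragraph above might look like the following sketch. The function and flag names are hypothetical, and returning NFS4ERR_OPENMODE for a refused READ is an assumption drawn from "the corresponding check". The sketch covers the access mode test only; a server that lets READ through on a write-only open must still check for conflicting locks separately.

```python
NFS4_OK = "NFS4_OK"
NFS4ERR_OPENMODE = "NFS4ERR_OPENMODE"


def check_access_mode(op, can_read, can_write, allow_read_on_write_only=False):
    """Check an operation against the access mode carried by its stateid.

    op is "READ" or "WRITE"; "WRITE" stands for all WRITE-type operations,
    including a SETATTR that sets the size attribute.
    allow_read_on_write_only models the server's optional leniency for READ.
    """
    if op == "WRITE":
        # Mandatory check: writing requires an open that allows writing.
        return NFS4_OK if can_write else NFS4ERR_OPENMODE
    if op == "READ":
        # Optional check: a server may accept READ on a write-only open.
        if can_read or (can_write and allow_read_on_write_only):
            return NFS4_OK
        return NFS4ERR_OPENMODE
    raise ValueError("unsupported operation for this sketch")
```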
| A stateid of all bits 1 (one) MAY allow READ operations to bypass | | A stateid of all bits 1 (one) MAY allow READ operations to bypass | |
| locking checks at the server. However, WRITE operations with a | | locking checks at the server. However, WRITE operations with a | |
| stateid with bits all 1 (one) MUST NOT bypass locking checks and are | | stateid with bits all 1 (one) MUST NOT bypass locking checks and are | |
| treated exactly the same as if a stateid of all bits 0 were used. | | treated exactly the same as if a stateid of all bits 0 were used. | |
| | | | |
| A lock may not be granted while a READ or WRITE operation using one | | A lock may not be granted while a READ or WRITE operation using one | |
| of the special stateids is being performed and the range of the lock | | of the special stateids is being performed and the range of the lock | |
| request conflicts with the range of the READ or WRITE operation. For | | request conflicts with the range of the READ or WRITE operation. For | |
| the purposes of this paragraph, a conflict occurs when a shared lock | | the purposes of this paragraph, a conflict occurs when a shared lock | |
| | | | |
| skipping to change at page 71, line 57 | | skipping to change at page 71, line 4 | |
| Locking is different than most NFS operations as it requires "at- | | Locking is different than most NFS operations as it requires "at- | |
| most-one" semantics that are not provided by ONCRPC. ONCRPC over a | | most-one" semantics that are not provided by ONCRPC. ONCRPC over a | |
| reliable transport is not sufficient because a sequence of locking | | reliable transport is not sufficient because a sequence of locking | |
| requests may span multiple TCP connections. In the face of | | requests may span multiple TCP connections. In the face of | |
| retransmission or reordering, lock or unlock requests must have a | | retransmission or reordering, lock or unlock requests must have a | |
| well defined and consistent behavior. To accomplish this, each lock | | well defined and consistent behavior. To accomplish this, each lock | |
| request contains a sequence number that is a consecutively increasing | | request contains a sequence number that is a consecutively increasing | |
| integer. Different lock_owners have different sequences. The server | | integer. Different lock_owners have different sequences. The server | |
| maintains the last sequence number (L) received and the response that | | maintains the last sequence number (L) received and the response that | |
| was returned. The first request issued for any given lock_owner is | | was returned. The first request issued for any given lock_owner is | |
|
| issued with a sequence number of zero. | | | |
| | | | |
|
| Draft Specification NFS version 4 Protocol August 2002 | | Draft Specification NFS version 4 Protocol September 2002 | |
| | | | |
| | | issued with a sequence number of zero. | |
| | | | |
| Note that for requests that contain a sequence number, for each | | Note that for requests that contain a sequence number, for each | |
| lock_owner, there should be no more than one outstanding request. | | lock_owner, there should be no more than one outstanding request. | |
| | | | |
| If a request (r) with a previous sequence number (r < L) is received, | | If a request (r) with a previous sequence number (r < L) is received, | |
| it is rejected with the return of error NFS4ERR_BAD_SEQID. Given a | | it is rejected with the return of error NFS4ERR_BAD_SEQID. Given a | |
| properly-functioning client, the response to (r) must have been | | properly-functioning client, the response to (r) must have been | |
| received before the last request (L) was sent. If a duplicate of | | received before the last request (L) was sent. If a duplicate of | |
| last request (r == L) is received, the stored response is returned. | | last request (r == L) is received, the stored response is returned. | |
| If a request beyond the next sequence (r == L + 2) is received, it is | | If a request beyond the next sequence (r == L + 2) is received, it is | |
| | | | |
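The server-side sequence check described in this hunk can be sketched as below. Several points are assumptions rather than text from the draft: the class and method names are hypothetical; the excerpt breaks off before stating the outcome for a request beyond the next sequence number, and the sketch assumes it is rejected with NFS4ERR_BAD_SEQID like a stale one; a first request with a nonzero sequence number is also rejected; and sequence number wraparound is ignored.

```python
NFS4ERR_BAD_SEQID = "NFS4ERR_BAD_SEQID"


class SeqidTracker:
    """Per-lock_owner record of the last sequence number and its response."""

    def __init__(self):
        self._state = {}  # lock_owner -> (last_seqid L, cached response)

    def handle(self, lock_owner, seqid, execute):
        """Run execute() only for the next expected request; replay duplicates."""
        entry = self._state.get(lock_owner)
        if entry is None:
            # The first request for a lock_owner carries sequence number zero.
            if seqid != 0:
                return NFS4ERR_BAD_SEQID  # assumption: reject, see lead-in
        else:
            last, cached = entry
            if seqid == last:
                return cached  # duplicate of last request: stored response
            if seqid != last + 1:
                # r < L, or (by assumption) r beyond L + 1.
                return NFS4ERR_BAD_SEQID
        response = execute()
        self._state[lock_owner] = (seqid, response)
        return response
```

Because rejected requests never reach `execute()`, a retransmitted or reordered lock request cannot be applied twice.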
| skipping to change at page 72, line 38 | | skipping to change at page 71, line 40 | |
| algorithm for removing unneeded requests. However, the last lock | | algorithm for removing unneeded requests. However, the last lock | |
| request and response on a given lock_owner must be cached as long as | | request and response on a given lock_owner must be cached as long as | |
| the lock state exists on the server. | | the lock state exists on the server. | |
| | | | |
| The client MUST monotonically increment the sequence number for the | | The client MUST monotonically increment the sequence number for the | |
| CLOSE, LOCK, LOCKU, OPEN, OPEN_CONFIRM, and OPEN_DOWNGRADE | | CLOSE, LOCK, LOCKU, OPEN, OPEN_CONFIRM, and OPEN_DOWNGRADE | |
| operations. This is true even in the event that the previous | | operations. This is true even in the event that the previous | |
| operation that used the sequence number received an error. The only | | operation that used the sequence number received an error. The only | |
| exception to this rule is if the previous operation received one of | | exception to this rule is if the previous operation received one of | |
| the following errors: NFS4ERR_STALE_CLIENTID, NFS4ERR_STALE_STATEID, | | the following errors: NFS4ERR_STALE_CLIENTID, NFS4ERR_STALE_STATEID, | |
|
| NFS4ERR_BAD_STATEID, NFS4ERR_BAD_SEQID. | | NFS4ERR_BAD_STATEID, NFS4ERR_BAD_SEQID, NFS4ERR_BADXDR, | |
| | | NFS4ERR_RESOURCE, NFS4ERR_NOFILEHANDLE. | |
| | | | |
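On the client side, the increment rule and its exceptions can be sketched as follows, using the longer error list from the -03 column (the -02 column lists only the first four). The function name is hypothetical, and the 32-bit wraparound is an assumption about the width of the sequence number field.

```python
# Errors after which the client does NOT advance the sequence number
# (list taken from the -03 revision shown in the right-hand column).
NO_INCREMENT_ERRORS = frozenset({
    "NFS4ERR_STALE_CLIENTID",
    "NFS4ERR_STALE_STATEID",
    "NFS4ERR_BAD_STATEID",
    "NFS4ERR_BAD_SEQID",
    "NFS4ERR_BADXDR",
    "NFS4ERR_RESOURCE",
    "NFS4ERR_NOFILEHANDLE",
})


def next_seqid(current, previous_status):
    """Sequence number to use for the next CLOSE, LOCK, LOCKU, OPEN,
    OPEN_CONFIRM, or OPEN_DOWNGRADE, given the previous operation's status."""
    if previous_status in NO_INCREMENT_ERRORS:
        return current
    # Any other outcome, success or error, advances the sequence number.
    return (current + 1) & 0xFFFFFFFF  # assumption: unsigned 32-bit field
```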
| 8.1.6. Recovery from Replayed Requests | | 8.1.6. Recovery from Replayed Requests | |
| | | | |
| As described above, the sequence number is per lock_owner. As long | | As described above, the sequence number is per lock_owner. As long | |
| as the server maintains the last sequence number received and follows | | as the server maintains the last sequence number received and follows | |
| the methods described above, there are no risks of a Byzantine router | | the methods described above, there are no risks of a Byzantine router | |
| re-sending old requests. The server need only maintain the | | re-sending old requests. The server need only maintain the | |
| (lock_owner, sequence number) state as long as there are open files | | (lock_owner, sequence number) state as long as there are open files | |
| or closed files with locks outstanding. | | or closed files with locks outstanding. | |
| | | | |
| LOCK, LOCKU, OPEN, OPEN_DOWNGRADE, and CLOSE each contain a sequence | | LOCK, LOCKU, OPEN, OPEN_DOWNGRADE, and CLOSE each contain a sequence | |
| number and therefore the risk of the replay of these operations | | number and therefore the risk of the replay of these operations | |
| resulting in undesired effects is non-existent while the server | | resulting in undesired effects is non-existent while the server | |
| maintains the lock_owner state. | | maintains the lock_owner state. | |
| | | | |
|
| | | Draft Specification NFS version 4 Protocol September 2002 | |
| | | | |
| 8.1.7. Releasing lock_owner State | | 8.1.7. Releasing lock_owner State | |
| | | | |
| When a particular lock_owner no longer holds open or file locking | | When a particular lock_owner no longer holds open or file locking | |
|
| | | | |
| Draft Specification NFS version 4 Protocol August 2002 | | | |
| | | | |
| state at the server, the server may choose to release the sequence | | state at the server, the server may choose to release the sequence | |
| number state associated with the lock_owner. The server may make | | number state associated with the lock_owner. The server may make | |
| this choice based on lease expiration, for the reclamation of server | | this choice based on lease expiration, for the reclamation of server | |
| memory, or other implementation specific details. In any event, the | | memory, or other implementation specific details. In any event, the | |
| server is able to do this safely only when the lock_owner no longer | | server is able to do this safely only when the lock_owner no longer | |
| is being utilized by the client. The server may choose to hold the | | is being utilized by the client. The server may choose to hold the | |
| lock_owner state in the event that retransmitted requests are | | lock_owner state in the event that retransmitted requests are | |
| received. However, the period to hold this state is implementation | | received. However, the period to hold this state is implementation | |
| specific. | | specific. | |
| | | | |
| | | | |
| skipping to change at page 73, line 54 | | skipping to change at page 73, line 5 | |
| situations in which the server can avoid the need for confirmation | | situations in which the server can avoid the need for confirmation | |
| when responding to open requests. The two constraints are: | | when responding to open requests. The two constraints are: | |
| | | | |
| o The server must not bestow a delegation for any open which would | | o The server must not bestow a delegation for any open which would | |
| require confirmation. | | require confirmation. | |
| | | | |
| o The server MUST NOT require confirmation on a reclaim-type open | | o The server MUST NOT require confirmation on a reclaim-type open | |
| (i.e. one specifying claim type CLAIM_PREVIOUS or | | (i.e. one specifying claim type CLAIM_PREVIOUS or | |
| CLAIM_DELEGATE_PREV). | | CLAIM_DELEGATE_PREV). | |
| | | | |
|
| These constraints are related in that reclaim-type opens are the | | Draft Specification NFS version 4 Protocol September 2002 | |
| only ones in which the server may be required to send a | | | |
| delegation. For CLAIM_NULL, sending the delegation is optional | | | |
| while for CLAIM_DELEGATE_CUR, no delegation is sent. | | | |
| | | | |
|
| Draft Specification NFS version 4 Protocol August 2002 | | These constraints are related in that reclaim-type opens are the only | |
| | | ones in which the server may be required to send a delegation. For | |
| | | CLAIM_NULL, sending the delegation is optional while for | |
| | | CLAIM_DELEGATE_CUR, no delegation is sent. | |
| | | | |
| Delegations being sent with an open requiring confirmation are | | Delegations being sent with an open requiring confirmation are | |
| troublesome because recovering from non-confirmation adds undue | | troublesome because recovering from non-confirmation adds undue | |
|
| complexity to the protocol while requiring confirmation on | | complexity to the protocol while requiring confirmation on reclaim- | |
| reclaim-type opens poses difficulties in that the inability to | | type opens poses difficulties in that the inability to resolve the | |
| resolve the status of the reclaim until lease expiration may | | status of the reclaim until lease expiration may make it difficult to | |
| make it difficult to have timely determination of the set of | | have timely determination of the set of locks being reclaimed (since | |
| locks being reclaimed (since the grace period may expire). | | the grace period may expire). | |
| | | | |
| Requiring open confirmation on reclaim-type opens is avoidable | | Requiring open confirmation on reclaim-type opens is avoidable | |
|
| because of the nature of the environments in which such opens | | because of the nature of the environments in which such opens are | |
| are done. For CLAIM_PREVIOUS opens, this is immediately after | | done. For CLAIM_PREVIOUS opens, this is immediately after server | |
| server reboot, so there should be no time for lockowners to be | | reboot, so there should be no time for lockowners to be created, | |
| created, found to be unused, and recycled. For | | found to be unused, and recycled. For CLAIM_DELEGATE_PREV opens, we | |
| CLAIM_DELEGATE_PREV opens, we are dealing with a client reboot | | are dealing with a client reboot situation. A server which supports | |
| situation. A server which supports delegation can be sure that | | delegation can be sure that no lockowners for that client have been | |
| no lockowners for that client have been recycled since client | | recycled since client initialization and thus can ensure that | |
| initialization and thus can ensure that confirmation will not be | | confirmation will not be required. | |
| required. | | | |
| | | | |
| 8.2. Lock Ranges | | 8.2. Lock Ranges | |
| | | | |
| The protocol allows a lock owner to request a lock with a byte range | | The protocol allows a lock owner to request a lock with a byte range | |
| and then either upgrade or unlock a sub-range of the initial lock. | | and then either upgrade or unlock a sub-range of the initial lock. | |
| It is expected that this will be an uncommon type of request. In any | | It is expected that this will be an uncommon type of request. In any | |
| case, servers or server filesystems may not be able to support sub- | | case, servers or server filesystems may not be able to support sub- | |
| range lock semantics. In the event that a server receives a locking | | range lock semantics. In the event that a server receives a locking | |
| request that represents a sub-range of current locking state for the | | request that represents a sub-range of current locking state for the | |
| lock owner, the server is allowed to return the error | | lock owner, the server is allowed to return the error | |
| | | | |
| skipping to change at page 74, line 53 | | skipping to change at page 74, line 4 | |
| the recovery of file locking state in the event of server failure. | | the recovery of file locking state in the event of server failure. | |
| As discussed in the section "Server Failure and Recovery" below, the | | As discussed in the section "Server Failure and Recovery" below, the | |
| server may employ certain optimizations during recovery that work | | server may employ certain optimizations during recovery that work | |
| effectively only when the client's behavior during lock recovery is | | effectively only when the client's behavior during lock recovery is | |
| similar to the client's locking behavior prior to server failure. | | similar to the client's locking behavior prior to server failure. | |
| | | | |
| 8.3. Upgrading and Downgrading Locks | | 8.3. Upgrading and Downgrading Locks | |
| | | | |
| If a client has a write lock on a record, it can request an atomic | | If a client has a write lock on a record, it can request an atomic | |
| downgrade of the lock to a read lock via the LOCK request, by setting | | downgrade of the lock to a read lock via the LOCK request, by setting | |
|
| | | | |
| | | Draft Specification NFS version 4 Protocol September 2002 | |
| | | | |
| the type to READ_LT. If the server supports atomic downgrade, the | | the type to READ_LT. If the server supports atomic downgrade, the | |
| request will succeed. If not, it will return NFS4ERR_LOCK_NOTSUPP. | | request will succeed. If not, it will return NFS4ERR_LOCK_NOTSUPP. | |
| The client should be prepared to receive this error, and if | | The client should be prepared to receive this error, and if | |
| appropriate, report the error to the requesting application. | | appropriate, report the error to the requesting application. | |
| | | | |
|
| Draft Specification NFS version 4 Protocol August 2002 | | | |
| | | | |
| If a client has a read lock on a record, it can request an atomic | | If a client has a read lock on a record, it can request an atomic | |
| upgrade of the lock to a write lock via the LOCK request by setting | | upgrade of the lock to a write lock via the LOCK request by setting | |
| the type to WRITE_LT or WRITEW_LT. If the server does not support | | the type to WRITE_LT or WRITEW_LT. If the server does not support | |
| atomic upgrade, it will return NFS4ERR_LOCK_NOTSUPP. If the upgrade | | atomic upgrade, it will return NFS4ERR_LOCK_NOTSUPP. If the upgrade | |
| can be achieved without an existing conflict, the request will | | can be achieved without an existing conflict, the request will | |
| succeed. Otherwise, the server will return either NFS4ERR_DENIED or | | succeed. Otherwise, the server will return either NFS4ERR_DENIED or | |
| NFS4ERR_DEADLOCK. The error NFS4ERR_DEADLOCK is returned if the | | NFS4ERR_DEADLOCK. The error NFS4ERR_DEADLOCK is returned if the | |
| client issued the LOCK request with the type set to WRITEW_LT and the | | client issued the LOCK request with the type set to WRITEW_LT and the | |
| server has detected a deadlock. The client should be prepared to | | server has detected a deadlock. The client should be prepared to | |
| receive such errors and if appropriate, report the error to the | | receive such errors and if appropriate, report the error to the | |
| | | | |
| skipping to change at page 75, line 52 | | skipping to change at page 75, line 4 | |
| released, allowing a successful return. In this way, clients can | | released, allowing a successful return. In this way, clients can | |
| avoid the burden of needlessly frequent polling for blocking locks. | | avoid the burden of needlessly frequent polling for blocking locks. | |
| The server should take care in the length of delay in the event the | | The server should take care in the length of delay in the event the | |
| client retransmits the request. | | client retransmits the request. | |
| | | | |
| 8.5. Lease Renewal | | 8.5. Lease Renewal | |
| | | | |
| The purpose of a lease is to allow a server to remove stale locks | | The purpose of a lease is to allow a server to remove stale locks | |
| that are held by a client that has crashed or is otherwise | | that are held by a client that has crashed or is otherwise | |
| unreachable. It is not a mechanism for cache consistency and lease | | unreachable. It is not a mechanism for cache consistency and lease | |
|
| | | | |
| | | Draft Specification NFS version 4 Protocol September 2002 | |
| | | | |
| renewals may not be denied if the lease interval has not expired. | | renewals may not be denied if the lease interval has not expired. | |
| | | | |
| The following events cause implicit renewal of all of the leases for | | The following events cause implicit renewal of all of the leases for | |
| a given client (i.e. all those sharing a given clientid). Each of | | a given client (i.e. all those sharing a given clientid). Each of | |
| these is a positive indication that the client is still active and | | these is a positive indication that the client is still active and | |
|
| | | | |
| Draft Specification NFS version 4 Protocol August 2002 | | | |
| | | | |
| that the associated state held at the server, for the client, is | | that the associated state held at the server, for the client, is | |
| still valid. | | still valid. | |
| | | | |
| o An OPEN with a valid clientid. | | o An OPEN with a valid clientid. | |
| | | | |
| o Any operation made with a valid stateid (CLOSE, DELEGPURGE, | | o Any operation made with a valid stateid (CLOSE, DELEGPURGE, | |
| DELEGRETURN, LOCK, LOCKU, OPEN, OPEN_CONFIRM, OPEN_DOWNGRADE, | | DELEGRETURN, LOCK, LOCKU, OPEN, OPEN_CONFIRM, OPEN_DOWNGRADE, | |
| READ, RENEW, SETATTR, WRITE). This does not include the special | | READ, RENEW, SETATTR, WRITE). This does not include the special | |
| stateids of all bits 0 or all bits 1. | | stateids of all bits 0 or all bits 1. | |
| | | | |
| | | | |
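The implicit renewal rule for operations that carry a stateid can be sketched as below. The class and method names are hypothetical; the caller is assumed to have already validated the stateid; and the sketch makes no assumption about stateid length beyond treating it as a byte string. Only the events visible in this hunk are modelled.

```python
import time


def is_special_stateid(stateid: bytes) -> bool:
    """True for the special stateids of all bits 0 or all bits 1."""
    n = len(stateid)
    return n > 0 and (stateid.count(0x00) == n or stateid.count(0xFF) == n)


class LeaseTable:
    """Tracks one lease expiry time per clientid."""

    def __init__(self, lease_seconds, clock=time.monotonic):
        self.lease_seconds = lease_seconds
        self._clock = clock  # injectable so the sketch can be tested
        self._expiry = {}

    def renew(self, clientid):
        """Renew all leases for the client (one expiry per clientid here)."""
        self._expiry[clientid] = self._clock() + self.lease_seconds

    def note_stateid_operation(self, clientid, stateid):
        """Record an operation made with an already-validated stateid.

        Returns True if the lease was renewed; special stateids do not renew.
        """
        if is_special_stateid(stateid):
            return False
        self.renew(clientid)
        return True

    def expired(self, clientid):
        return self._clock() >= self._expiry.get(clientid, float("-inf"))
```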
| skipping to change at page 76, line 52 | | skipping to change at page 76, line 5 | |
| 8.6. Crash Recovery | | 8.6. Crash Recovery | |
| | | | |
| The important requirement in crash recovery is that both the client | | The important requirement in crash recovery is that both the client | |
| and the server know when the other has failed. Additionally, it is | | and the server know when the other has failed. Additionally, it is | |
| required that a client sees a consistent view of data across server | | required that a client sees a consistent view of data across server | |
| restarts or reboots. All READ and WRITE operations that may have | | restarts or reboots. All READ and WRITE operations that may have | |
| been queued within the client or network buffers must wait until the | | been queued within the client or network buffers must wait until the | |
| client has successfully recovered the locks protecting the READ and | | client has successfully recovered the locks protecting the READ and | |
| WRITE operations. | | WRITE operations. | |
| | | | |
|
| | | Draft Specification NFS version 4 Protocol September 2002 | |
| | | | |
| 8.6.1. Client Failure and Recovery | | 8.6.1. Client Failure and Recovery | |
| | | | |
| In the event that a client fails, the server may recover the client's | | In the event that a client fails, the server may recover the client's | |
| locks when the associated leases have expired. Conflicting locks | | locks when the associated leases have expired. Conflicting locks | |
| from another client may only be granted after this lease expiration. | | from another client may only be granted after this lease expiration. | |
|
| | | | |
| Draft Specification NFS version 4 Protocol August 2002 | | | |
| | | | |
| If the client is able to restart or reinitialize within the lease | | If the client is able to restart or reinitialize within the lease | |
| period the client may be forced to wait the remainder of the lease | | period the client may be forced to wait the remainder of the lease | |
| period before obtaining new locks. | | period before obtaining new locks. | |
| | | | |
| To minimize client delay upon restart, lock requests are associated | | To minimize client delay upon restart, lock requests are associated | |
| with an instance of the client by a client supplied verifier. This | | with an instance of the client by a client supplied verifier. This | |
| verifier is part of the initial SETCLIENTID call made by the client. | | verifier is part of the initial SETCLIENTID call made by the client. | |
| The server returns a clientid as a result of the SETCLIENTID | | The server returns a clientid as a result of the SETCLIENTID | |
| operation. The client then confirms the use of the clientid with | | operation. The client then confirms the use of the clientid with | |
| SETCLIENTID_CONFIRM. The clientid in combination with an opaque | | SETCLIENTID_CONFIRM. The clientid in combination with an opaque | |
| | | | |
| skipping to change at page 77, line 53 | | skipping to change at page 77, line 4 | |
| | | | |
| A client can determine that server failure (and thus loss of locking | | A client can determine that server failure (and thus loss of locking | |
| state) has occurred, when it receives one of two errors. The | | state) has occurred, when it receives one of two errors. The | |
| NFS4ERR_STALE_STATEID error indicates a stateid invalidated by a | | NFS4ERR_STALE_STATEID error indicates a stateid invalidated by a | |
| reboot or restart. The NFS4ERR_STALE_CLIENTID error indicates a | | reboot or restart. The NFS4ERR_STALE_CLIENTID error indicates a | |
| clientid invalidated by reboot or restart. When either of these are | | clientid invalidated by reboot or restart. When either of these are | |
| received, the client must establish a new clientid (See the section | | received, the client must establish a new clientid (See the section | |
| "Client ID") and re-establish the locking state as discussed below. | | "Client ID") and re-establish the locking state as discussed below. | |
| | | | |
| The period of special handling of locking and READs and WRITEs, equal | | The period of special handling of locking and READs and WRITEs, equal | |
|
| | | | |
| | | Draft Specification NFS version 4 Protocol September 2002 | |
| | | | |
| in duration to the lease period, is referred to as the "grace | | in duration to the lease period, is referred to as the "grace | |
| period". During the grace period, clients recover locks and the | | period". During the grace period, clients recover locks and the | |
| associated state by reclaim-type locking requests (i.e. LOCK requests | | associated state by reclaim-type locking requests (i.e. LOCK requests | |
| with reclaim set to true and OPEN operations with a claim type of | | with reclaim set to true and OPEN operations with a claim type of | |
| CLAIM_PREVIOUS). During the grace period, the server must reject | | CLAIM_PREVIOUS). During the grace period, the server must reject | |
|
| | | | |
| Draft Specification NFS version 4 Protocol August 2002 | | | |
| | | | |
| READ and WRITE operations and non-reclaim locking requests (i.e. | | READ and WRITE operations and non-reclaim locking requests (i.e. | |
| other LOCK and OPEN operations) with an error of NFS4ERR_GRACE. | | other LOCK and OPEN operations) with an error of NFS4ERR_GRACE. | |
| | | | |
| If the server can reliably determine that granting a non-reclaim | | If the server can reliably determine that granting a non-reclaim | |
| request will not conflict with reclamation of locks by other clients, | | request will not conflict with reclamation of locks by other clients, | |
| the NFS4ERR_GRACE error does not have to be returned and the non- | | the NFS4ERR_GRACE error does not have to be returned and the non- | |
| reclaim client request can be serviced. For the server to be able to | | reclaim client request can be serviced. For the server to be able to | |
| service READ and WRITE operations during the grace period, it must | | service READ and WRITE operations during the grace period, it must | |
| again be able to guarantee that no possible conflict could arise | | again be able to guarantee that no possible conflict could arise | |
| between an impending reclaim locking request and the READ or WRITE | | between an impending reclaim locking request and the READ or WRITE | |
| | | | |
| skipping to change at page 78, line 54 | | skipping to change at page 78, line 4 | |
| Clients should be prepared for the return of NFS4ERR_GRACE errors for | | Clients should be prepared for the return of NFS4ERR_GRACE errors for | |
| non-reclaim lock and I/O requests. In this case the client should | | non-reclaim lock and I/O requests. In this case the client should | |
| employ a retry mechanism for the request. A delay (on the order of | | employ a retry mechanism for the request. A delay (on the order of | |
| several seconds) between retries should be used to avoid overwhelming | | several seconds) between retries should be used to avoid overwhelming | |
| the server. Further discussion of the general issue is included in | | the server. Further discussion of the general issue is included in | |
| [Floyd]. The client must account for the server that is able to | | [Floyd]. The client must account for the server that is able to | |
| perform I/O and non-reclaim locking requests within the grace period | | perform I/O and non-reclaim locking requests within the grace period | |
| as well as those that can not do so. | | as well as those that can not do so. | |
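The retry behaviour described above can be sketched as follows. This is a minimal illustration, not part of the protocol: the function name, the string status values, the five-second default delay, and the attempt limit are all assumptions chosen for the example, and `send_request` stands in for whatever routine the client uses to issue the non-reclaim request.

```python
import time

# Placeholder status values for illustration only; a real client would use
# the numeric nfsstat4 codes defined by the protocol.
NFS4_OK = "NFS4_OK"
NFS4ERR_GRACE = "NFS4ERR_GRACE"


def send_with_grace_retry(send_request, delay_seconds=5.0, max_attempts=20,
                          sleep=time.sleep):
    """Reissue a non-reclaim request while the server answers NFS4ERR_GRACE.

    send_request is a hypothetical callable that sends the request once and
    returns its status. The delay between attempts keeps the client from
    overwhelming a server that is still in its grace period.
    """
    status = NFS4ERR_GRACE
    for attempt in range(max_attempts):
        status = send_request()
        if status != NFS4ERR_GRACE:
            # Either the request was serviced (possibly inside the grace
            # period) or it failed for an unrelated reason.
            return status
        if attempt + 1 < max_attempts:
            sleep(delay_seconds)
    return status
```

Because the loop simply returns any status other than NFS4ERR_GRACE, it handles both kinds of server: one that services the request during the grace period and one that refuses until the period ends.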
| | | | |
| A reclaim-type locking request outside the server's grace period can | | A reclaim-type locking request outside the server's grace period can | |
| only succeed if the server can guarantee that no conflicting lock or | | only succeed if the server can guarantee that no conflicting lock or | |
| I/O request has been granted since reboot or restart. | | I/O request has been granted since reboot or restart. | |
| | | | |
| A server may, upon restart, establish a new value for the lease | | A server may, upon restart, establish a new value for the lease | |
| period. Therefore, clients should, once a new clientid is | | period. Therefore, clients should, once a new clientid is | |
| established, refetch the lease_time attribute and use it as the basis | | established, refetch the lease_time attribute and use it as the basis | |
| for lease renewal for the lease associated with that server. However, | | for lease renewal for the lease associated with that server. However, | |
| the server must establish, for this restart event, a grace period at | | the server must establish, for this restart event, a grace period at | |
| least as long as the lease period for the previous server | | least as long as the lease period for the previous server | |
| instantiation. This allows the client state obtained during the | | instantiation. This allows the client state obtained during the | |
| previous server instance to be reliably re-established. | | previous server instance to be reliably re-established. | |
| | | | |
| 8.6.3. Network Partitions and Recovery | | 8.6.3. Network Partitions and Recovery | |
| | | | |
| If the duration of a network partition is greater than the lease | | If the duration of a network partition is greater than the lease | |
| | | | |
| skipping to change at page 79, line 33 | | skipping to change at page 78, line 38 | |
| returning the error NFS4ERR_EXPIRED. Once this error is received, | | returning the error NFS4ERR_EXPIRED. Once this error is received, | |
| the client will suitably notify the application that held the lock. | | the client will suitably notify the application that held the lock. | |
| | | | |
| As a courtesy to the client or as an optimization, the server may | | As a courtesy to the client or as an optimization, the server may | |
| continue to hold locks on behalf of a client for which recent | | continue to hold locks on behalf of a client for which recent | |
| communication has extended beyond the lease period. If the server | | communication has extended beyond the lease period. If the server | |
| receives a lock or I/O request that conflicts with one of these | | receives a lock or I/O request that conflicts with one of these | |
| courtesy locks, the server must free the courtesy lock and grant the | | courtesy locks, the server must free the courtesy lock and grant the | |
| new request. | | new request. | |
| | | | |
|
| If the server continues to hold locks beyond the expiration of a | | When a network partition is combined with a server reboot, there are | |
| client's lease, the server MUST employ a method of recording this | | edge conditions that place requirements on the server in order to | |
| fact in its stable storage. Conflicting lock requests from another | | avoid silent data corruption following the server reboot. Two of | |
| client may be serviced after the lease expiration. There are various | | these edge conditions are known, and are discussed below. | |
| scenarios involving server failure after such an event that require | | | |
| the storage of these lease expirations or network partitions. One | | | |
| scenario is as follows: | | | |
| | | | |
|
| A client holds a lock at the server and encounters a | | The first edge condition has the following scenario: | |
| network partition and is unable to renew the associated | | | |
| lease. A second client obtains a conflicting lock and then | | | |
| frees the lock. After the unlock request by the second | | | |
| client, the server reboots or reinitializes. Once the | | | |
| server recovers, the network partition heals and the | | | |
| original client attempts to reclaim the original lock. | | | |
| | | | |
|
| In this scenario and without any state information, the server will | | 1. Client A acquires a lock. | |
| allow the reclaim and the client will be in an inconsistent state | | | |
| because the server or the client has no knowledge of the conflicting | | | |
| lock. | | | |
| | | | |
|
| The server may choose to store this lease expiration or network | | 2. Client A and server experience mutual network partition, | |
| partitioning state in a way that will only identify the client as a | | such that client A is unable to renew its lease. | |
| whole. Note that this may potentially lead to lock reclaims being | | | |
| | | | |
|
| Draft Specification NFS version 4 Protocol August 2002 | | 3. Client A's lease expires, so server releases lock. | |
| | | | |
|
| denied unnecessarily because of a mix of conflicting and non- | | 4. Client B acquires a lock that would have conflicted with | |
| conflicting locks. The server may also choose to store information | | that of Client A. | |
| about each lock that has an expired lease with an associated | | | |
| conflicting lock. The choice of the amount and type of state | | 5. Client B releases the lock | |
| information that is stored is left to the implementor. In any case, | | | |
| the server must have enough state information to enable correct | | Draft Specification NFS version 4 Protocol September 2002 | |
| recovery from multiple partitions and multiple server failures. | | | |
| | | 6. Server reboots. | |
| | | | |
| | | 7. Network partition between client A and server heals. | |
| | | | |
| | | 8. Client A issues a RENEW operation, and gets back a | |
| | | NFS4ERR_STALE_CLIENTID. | |
| | | | |
| | | 9. Client A reclaims its lock within the server's grace period. | |
| | | | |
| | | Thus, at the final step, the server has erroneously granted client | |
| | | A's lock reclaim. If client B modified the object the lock was | |
| | | protecting, client A will experience object corruption. | |
| | | | |
| | | The second known edge condition follows: | |
| | | | |
| | | 1. Client A acquires a lock. | |
| | | | |
| | | 2. Server reboots. | |
| | | | |
| | | 3. Client A and server experience mutual network partition, | |
| | | such that client A is unable to reclaim its lock within the | |
| | | grace period. | |
| | | | |
| | | 4. Server's reclaim grace period ends. Client A has no locks | |
| | | recorded on server. | |
| | | | |
| | | 5. Client B acquires a lock that would have conflicted with | |
| | | that of Client A. | |
| | | | |
| | | 6. Client B releases the lock. | |
| | | | |
| | | 7. Server reboots a second time. | |
| | | | |
| | | 8. Network partition between client A and server heals. | |
| | | | |
| | | 9. Client A issues a RENEW operation, and gets back a | |
| | | NFS4ERR_STALE_CLIENTID. | |
| | | | |
| | | 10. Client A reclaims its lock within the server's grace period. | |
| | | | |
| | | As with the first edge condition, the final step of the scenario of | |
| | | the second edge condition has the server erroneously granting client | |
| | | A's lock reclaim. | |
| | | | |
| | | Solving the first and second edge conditions requires that the server | |
| | | either assume after it reboots that an edge condition occurred, and thus | |
| | | return NFS4ERR_NO_GRACE for all reclaim attempts, or that the server | |
| | | record some information in stable storage. The amount of information | |
| | | the server records in stable storage is in inverse proportion to how | |
| | | harsh the server wants to be whenever the edge conditions occur. The | |
| | | server that is completely tolerant of all edge conditions will record | |
| | | in stable storage every lock that is acquired, removing the lock | |
| | | record from stable storage only when the lock is unlocked by the | |
| | | client and the lock's lockowner advances the sequence number such | |
| | | that the lock release is not the last stateful event for the | |
| | | lockowner's sequence. For the two aforementioned edge conditions, the | |
| | | harshest a server can be, and still support a grace period for | |
| | | reclaims, requires that the server record in stable storage | |
| | | some minimal information. For example, a server | |
| | | implementation could, for each client, save in stable storage a | |
| | | record containing: | |
| | | | |
| | | o the client's id string | |
| | | | |
| | | o a boolean that indicates if the client's lease expired or if | |
| | | there was administrative intervention (see the section, | |
| | | Server Revocation of Locks) to revoke a record lock, share | |
| | | reservation, or delegation | |
| | | | |
| | | o a timestamp that is updated the first time after a server | |
| | | boot or reboot the client acquires record locking, share | |
| | | reservation, or delegation state on the server. The | |
| | | timestamp need not be updated on subsequent lock requests | |
| | | until the server reboots. | |
| | | | |
| | | The server implementation would also record in the stable storage the | |
| | | timestamps from the two most recent server reboots. | |
| | | | |
| | | Assuming the above record keeping, for the first edge condition, | |
| | | after the server reboots, the record that client A's lease expired | |
| | | means that another client could have acquired a conflicting record | |
| | | lock, share reservation, or delegation. Hence the server must reject | |
| | | a reclaim from client A with the error NFS4ERR_NO_GRACE. | |
| | | | |
| | | For the second edge condition, after the server reboots for a second | |
| | | time, the record that the client had an unexpired record lock, share | |
| | | reservation, or delegation established before the server's previous | |
| | | incarnation means that the server must reject a reclaim from client A | |
| | | with the error NFS4ERR_NO_GRACE. | |
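One way to read the record keeping above is sketched below. It is an interpretation under stated assumptions, not a mandated algorithm: the names `ClientRecord` and `may_reclaim` are hypothetical, timestamps are plain numbers, and a client with no record at all is refused (a false positive, which is assumed to be acceptable). `previous_boot_time` is the older of the two recorded reboot timestamps, that is, the start of the server incarnation that preceded the current one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClientRecord:
    """Per-client record kept in stable storage (hypothetical layout)."""
    client_id: str            # the client's id string
    expired_or_revoked: bool  # lease expired or state administratively revoked
    first_state_time: float   # first state acquisition after a server boot


def may_reclaim(record: Optional[ClientRecord], previous_boot_time: float) -> bool:
    """Return True if a reclaim may be granted, False for NFS4ERR_NO_GRACE."""
    if record is None:
        # Assumption: an unknown client is treated conservatively.
        return False
    if record.expired_or_revoked:
        # First edge condition: another client may have held a conflicting
        # lock after this client's lease expired.
        return False
    if record.first_state_time < previous_boot_time:
        # Second edge condition: the client's state predates the previous
        # incarnation and was never re-established during it.
        return False
    return True
```

For example, with reboots recorded at times 100 and 200, a client whose timestamp is 150 and whose flag is clear may reclaim, while a client whose timestamp is 50 is refused.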
| | | | |
| | | Regardless of the level and approach to record keeping, the server | |
| | | MUST implement one of the following strategies (which apply to | |
| | | reclaims of share reservations, record locks, and delegations): | |
| | | | |
| | | 1. Reject all reclaims with NFS4ERR_NO_GRACE. This is | |
| | | superharsh, but necessary if the server does not want to | |
| | | record lock state in stable storage. | |
| | | | |
| | | 2. Record sufficient state in stable storage such that all | |
| | | known edge conditions involving server reboot, including the | |
| | | two noted in this section, are detected. False positives are | |
| | | acceptable. Note that at this time, it is not known if there | |
| | | are other edge conditions. | |
| | | | |
| | | In the event, after a server reboot, the server determines | |
| | | that there is unrecoverable damage or corruption to the | |
| | | stable storage, then for all clients and/or locks affected, | |
| | | the server MUST return NFS4ERR_NO_GRACE. | |
| | | | |
| | | A mandate for the client's handling of the NFS4ERR_NO_GRACE error is | |
| | | outside the scope of this specification, since the strategies for | |
| | | such handling are very dependent on the client's operating | |
| | | environment. However, one potential approach is described below. | |
| | | | |
| | | When the client receives NFS4ERR_NO_GRACE, it could examine the | |
| | | change attribute of the objects the client is trying to reclaim state | |
| | | for, and use that to determine whether to re-establish the state via | |
| | | normal OPEN or LOCK requests. This is acceptable provided the | |
| | | client's operating environment allows it. In other words, the client | |
| | | implementor is advised to document for his users the behavior. The | |
| | | client could also inform the application that its record lock or | |
| | | share reservations (whether they were delegated or not) have been | |
| | | lost, such as via a UNIX signal, a GUI pop-up window, etc. See the | |
| | | section, "Data Caching and Revocation" for a discussion of what the | |
| | | client should do for dealing with unreclaimed delegations on client | |
| | | state. | |
| | | | |
| For further discussion of revocation of locks see the section "Server | | For further discussion of revocation of locks see the section "Server | |
| Revocation of Locks". | | Revocation of Locks". | |
| | | | |
| 8.7. Recovery from a Lock Request Timeout or Abort | | 8.7. Recovery from a Lock Request Timeout or Abort | |
| | | | |
| In the event a lock request times out, a client may decide to not | | In the event a lock request times out, a client may decide to not | |
| retry the request. The client may also abort the request when the | | retry the request. The client may also abort the request when the | |
| process for which it was issued is terminated (e.g. in UNIX due to a | | process for which it was issued is terminated (e.g. in UNIX due to a | |
| signal). It is possible though that the server received the request | | signal). It is possible though that the server received the request | |
| | | | |
| skipping to change at page 80, line 44 | | skipping to change at page 82, line 5 | |
| not receive a response. From this, the next time the client does a | | not receive a response. From this, the next time the client does a | |
| lock operation for the lock_owner, it can send the cached request, if | | lock operation for the lock_owner, it can send the cached request, if | |
| there is one, and if the request was one that established state (e.g. | | there is one, and if the request was one that established state (e.g. | |
| a LOCK or OPEN operation), the server will return the cached result | | a LOCK or OPEN operation), the server will return the cached result | |
| or, if it never saw the request, perform it. The client can follow up | | or, if it never saw the request, perform it. The client can follow up | |
| with a request to remove the state (e.g. a LOCKU or CLOSE operation). | | with a request to remove the state (e.g. a LOCKU or CLOSE operation). | |
| With this approach, the sequencing and stateid information on the | | With this approach, the sequencing and stateid information on the | |
| client and server for the given lock_owner will re-synchronize and in | | client and server for the given lock_owner will re-synchronize and in | |
| turn the lock state will re-synchronize. | | turn the lock state will re-synchronize. | |
| | | | |
| 8.8. Server Revocation of Locks | | 8.8. Server Revocation of Locks | |
| | | | |
| At any point, the server can revoke locks held by a client and the | | At any point, the server can revoke locks held by a client and the | |
| client must be prepared for this event. When the client detects that | | client must be prepared for this event. When the client detects that | |
| its locks have been or may have been revoked, the client is | | its locks have been or may have been revoked, the client is | |
| responsible for validating the state information between itself and | | responsible for validating the state information between itself and | |
| the server. Validating locking state for the client means that it | | the server. Validating locking state for the client means that it | |
| must verify or reclaim state for each lock currently held. | | must verify or reclaim state for each lock currently held. | |
| | | | |
| The first instance of lock revocation is upon server reboot or re- | | The first instance of lock revocation is upon server reboot or re- | |
| initialization. In this instance the client will receive an error | | initialization. In this instance the client will receive an error | |
| (NFS4ERR_STALE_STATEID or NFS4ERR_STALE_CLIENTID) and the client will | | (NFS4ERR_STALE_STATEID or NFS4ERR_STALE_CLIENTID) and the client will | |
| proceed with normal crash recovery as described in the previous | | proceed with normal crash recovery as described in the previous | |
| section. | | section. | |
| | | | |
| The second lock revocation event is the inability to renew the lease | | The second lock revocation event is the inability to renew the lease | |
| before expiration. While this is considered a rare or unusual event, | | before expiration. While this is considered a rare or unusual event, | |
| the client must be prepared to recover. Both the server and client | | the client must be prepared to recover. Both the server and client | |
| will be able to detect the failure to renew the lease and are capable | | will be able to detect the failure to renew the lease and are capable | |
| of recovering without data corruption. For the server, it tracks the | | of recovering without data corruption. For the server, it tracks the | |
| last renewal event serviced for the client and knows when the lease | | last renewal event serviced for the client and knows when the lease | |
| will expire. Similarly, the client must track operations which will | | will expire. Similarly, the client must track operations which will | |
| renew the lease period. Using the time that each such request was | | renew the lease period. Using the time that each such request was | |
| sent and the time that the corresponding reply was received, the | | sent and the time that the corresponding reply was received, the | |
| client should bound the time that the corresponding renewal could | | client should bound the time that the corresponding renewal could | |
| have occurred on the server and thus determine if it is possible that | | have occurred on the server and thus determine if it is possible that | |
| a lease period expiration could have occurred. | | a lease period expiration could have occurred. | |
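The client-side bound described above can be illustrated with a small sketch. The function names are hypothetical, times are seconds on the client's own clock, and the example assumes that the tracked request is the most recent one that renewed the lease. The server processed the renewal at some instant between the send time and the reply time, so the earliest possible expiration is the send time plus the lease period, and the latest is the reply time plus the lease period.

```python
def lease_may_have_expired(sent_at: float, now: float, lease_time: float) -> bool:
    """Conservative check: the renewal cannot have happened before sent_at,
    so expiration is possible once sent_at + lease_time has been reached."""
    return now >= sent_at + lease_time


def lease_has_certainly_expired(reply_at: float, now: float, lease_time: float) -> bool:
    """The renewal cannot have happened after reply_at, so without further
    renewals the lease is over once reply_at + lease_time has been reached."""
    return now >= reply_at + lease_time
```

A client that wants to stay safe would act on the first, conservative check and treat its locks as needing validation as soon as it returns true.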
| | | | |
| The third lock revocation event can occur as a result of | | The third lock revocation event can occur as a result of | |
| administrative intervention within the lease period. While this is | | administrative intervention within the lease period. While this is | |
| considered a rare event, it is possible that the server's | | considered a rare event, it is possible that the server's | |
| administrator has decided to release or revoke a particular lock held | | administrator has decided to release or revoke a particular lock held | |
| by the client. As a result of revocation, the client will receive an | | by the client. As a result of revocation, the client will receive an | |
|
| error of NFS4ERR_EXPIRED and the error is received within the lease | | error of NFS4ERR_ADMIN_REVOKED. In this instance the client may | |
| period for the lock. In this instance the client may assume that | | assume that only the lock_owner's locks have been lost. The client | |
| only the lock_owner's locks have been lost. The client notifies the | | notifies the lock holder appropriately. The client may not assume | |
| lock holder appropriately. The client may not assume the lease | | the lease period has been renewed as a result of a failed operation. | |
| period has been renewed as a result of a failed operation. | | | |
| | | | |
| When the client determines the lease period may have expired, the | | When the client determines the lease period may have expired, the | |
| client must mark all locks held for the associated lease as | | client must mark all locks held for the associated lease as | |
| "unvalidated". This means the client has been unable to re-establish | | "unvalidated". This means the client has been unable to re-establish | |
| or confirm the appropriate lock state with the server. As described | | or confirm the appropriate lock state with the server. As described | |
| in the previous section on crash recovery, there are scenarios in | | in the previous section on crash recovery, there are scenarios in | |
| which the server may grant conflicting locks after the lease period | | which the server may grant conflicting locks after the lease period | |
| has expired for a client. When it is possible that the lease period | | has expired for a client. When it is possible that the lease period | |
| has expired, the client must validate each lock currently held to | | has expired, the client must validate each lock currently held to | |
| ensure that a conflicting lock has not been granted. The client may | | ensure that a conflicting lock has not been granted. The client may | |
| accomplish this task by issuing an I/O request, either a pending I/O | | accomplish this task by issuing an I/O request, either a pending I/O | |
| or a zero-length read, specifying the stateid associated with the | | or a zero-length read, specifying the stateid associated with the | |
| lock in question. If the response to the request is success, the | | lock in question. If the response to the request is success, the | |
| client has validated all of the locks governed by that stateid and | | client has validated all of the locks governed by that stateid and | |
| re-established the appropriate state between itself and the server. | | re-established the appropriate state between itself and the server. | |
| If the I/O request is not successful, then one or more of the locks | | If the I/O request is not successful, then one or more of the locks | |
| associated with the stateid was revoked by the server and the client | | associated with the stateid was revoked by the server and the client | |
| must notify the owner. | | must notify the owner. | |
| | | | |
| 8.9. Share Reservations | | 8.9. Share Reservations | |
| | | | |
| A share reservation is a mechanism to control access to a file. It | | A share reservation is a mechanism to control access to a file. It | |
| is a separate and independent mechanism from record locking. When a | | is a separate and independent mechanism from record locking. When a | |
| client opens a file, it issues an OPEN operation to the server | | client opens a file, it issues an OPEN operation to the server | |
| specifying the type of access required (READ, WRITE, or BOTH) and the | | specifying the type of access required (READ, WRITE, or BOTH) and the | |
| type of access to deny others (deny NONE, READ, WRITE, or BOTH). If | | type of access to deny others (deny NONE, READ, WRITE, or BOTH). If | |
| the OPEN fails the client will fail the application's open request. | | the OPEN fails the client will fail the application's open request. | |
| | | | |
| Pseudo-code definition of the semantics: | | Pseudo-code definition of the semantics: | |
| | | | |
|
| | | if (request.access == 0) | |
| | | return (NFS4ERR_INVAL) | |
| | | else | |
| if ((request.access & file_state.deny) || | | if ((request.access & file_state.deny) || | |
| (request.deny & file_state.access)) | | (request.deny & file_state.access)) | |
| return (NFS4ERR_DENIED) | | return (NFS4ERR_DENIED) | |
| | | | |
| This checking of share reservations on OPEN is done with no exception | | This checking of share reservations on OPEN is done with no exception | |
| for an existing OPEN for the same open_owner. | | for an existing OPEN for the same open_owner. | |
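The pseudo-code above can be exercised with a short runnable sketch. The function name and the string status values are illustrative only, and the bit values (READ = 1, WRITE = 2, BOTH = 3, deny NONE = 0) are assumed here for the example. The zero-access check follows the newer of the two revisions shown.

```python
# Assumed bit values for the access and deny fields.
ACCESS_READ, ACCESS_WRITE, ACCESS_BOTH = 1, 2, 3
DENY_NONE, DENY_READ, DENY_WRITE, DENY_BOTH = 0, 1, 2, 3


def check_share(req_access, req_deny, file_access, file_deny):
    """Apply the share reservation test for an incoming OPEN.

    file_access and file_deny are the unions of the access and deny bits
    of the OPENs already in effect on the file.
    """
    if req_access == 0:
        return "NFS4ERR_INVAL"
    # Conflict if the request wants access that is denied, or wants to deny
    # access that is already held.
    if (req_access & file_deny) or (req_deny & file_access):
        return "NFS4ERR_DENIED"
    return "NFS4_OK"
```

For instance, a request for WRITE access with deny NONE fails against a file already opened with deny WRITE, while a request for READ access with deny NONE succeeds against the same file.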
| | | | |
| The constants used for the OPEN and OPEN_DOWNGRADE operations for the | | The constants used for the OPEN and OPEN_DOWNGRADE operations for the | |
| access and deny fields are as follows: | | access and deny fields are as follows: | |
| | | | |
| | | | |
| skipping to change at page 82, line 41 | | skipping to change at page 84, line 5 | |
| | | | |
| To provide correct share semantics, a client MUST use the OPEN | | To provide correct share semantics, a client MUST use the OPEN | |
| operation to obtain the initial filehandle and indicate the desired | | operation to obtain the initial filehandle and indicate the desired | |
| access and what if any access to deny. Even if the client intends to | | access and what if any access to deny. Even if the client intends to | |
| use a stateid of all 0's or all 1's, it must still obtain the | | use a stateid of all 0's or all 1's, it must still obtain the | |
| filehandle for the regular file with the OPEN operation so the | | filehandle for the regular file with the OPEN operation so the | |
| appropriate share semantics can be applied. For clients that do not | | appropriate share semantics can be applied. For clients that do not | |
| have a deny mode built into their open programming interfaces, deny | | have a deny mode built into their open programming interfaces, deny | |
| equal to NONE should be used. | | equal to NONE should be used. | |
| | | | |
| The OPEN operation with the CREATE flag, also subsumes the CREATE | | The OPEN operation with the CREATE flag, also subsumes the CREATE | |
| operation for regular files as used in previous versions of the NFS | | operation for regular files as used in previous versions of the NFS | |
| protocol. This allows a create with a share to be done atomically. | | protocol. This allows a create with a share to be done atomically. | |
| | | | |
| The CLOSE operation removes all share reservations held by the | | The CLOSE operation removes all share reservations held by the | |
| lock_owner on that file. If record locks are held, the client SHOULD | | lock_owner on that file. If record locks are held, the client SHOULD | |
| release all locks before issuing a CLOSE. The server MAY free all | | release all locks before issuing a CLOSE. The server MAY free all | |
| outstanding locks on CLOSE but some servers may not support the CLOSE | | outstanding locks on CLOSE but some servers may not support the CLOSE | |
| of a file that still has record locks held. The server MUST return | | of a file that still has record locks held. The server MUST return | |
| failure, NFS4ERR_LOCKS_HELD, if any locks would exist after the | | failure, NFS4ERR_LOCKS_HELD, if any locks would exist after the | |
| CLOSE. | | CLOSE. | |
| | | | |
| The LOOKUP operation will return a filehandle without establishing | | The LOOKUP operation will return a filehandle without establishing | |
| any lock state on the server. Without a valid stateid, the server | | any lock state on the server. Without a valid stateid, the server | |
| will assume the client has the least access. For example, a file | | will assume the client has the least access. For example, a file | |
| opened with deny READ/WRITE cannot be accessed using a filehandle | | opened with deny READ/WRITE cannot be accessed using a filehandle | |
| obtained through LOOKUP because it would not have a valid stateid | | obtained through LOOKUP because it would not have a valid stateid | |
| (i.e. using a stateid of all bits 0 or all bits 1). | | (i.e. using a stateid of all bits 0 or all bits 1). | |
| | | | |
| 8.10.1. Close and Retention of State Information | | 8.10.1. Close and Retention of State Information | |
| | | | |
| Since a CLOSE operation requests deallocation of a stateid, dealing | | Since a CLOSE operation requests deallocation of a stateid, dealing | |
| with retransmission of the CLOSE, may pose special difficulties, | | with retransmission of the CLOSE, may pose special difficulties, | |
| since the state information, which normally would be used to | | since the state information, which normally would be used to | |
| determine the state of the open file being designated, might be | | determine the state of the open file being designated, might be | |
| deallocated, resulting in an NFS4ERR_BAD_STATEID error. | | deallocated, resulting in an NFS4ERR_BAD_STATEID error. | |
| | | | |
| skipping to change at page 83, line 40 | | skipping to change at page 84, line 56 | |
| is not a retransmission. | | is not a retransmission. | |
| | | | |
| o The time that a lockowner is freed by the server due to a period | | o The time that a lockowner is freed by the server due to a period | |
| with no activity. | | with no activity. | |
| | | | |
| o All locks for the client are freed as a result of a SETCLIENTID. | | o All locks for the client are freed as a result of a SETCLIENTID. | |
| | | | |
| Servers may avoid this complexity, at the cost of less complete | | Servers may avoid this complexity, at the cost of less complete | |
| protocol error checking, by simply responding NFS4_OK in the event of | | protocol error checking, by simply responding NFS4_OK in the event of | |
| a CLOSE for a deallocated stateid, on the assumption that this case | | a CLOSE for a deallocated stateid, on the assumption that this case | |
|
| must be caused by a retranmitted close. When adopting this approach, | | must be caused by a retransmitted close. When adopting this | |
| it is desirable to at least log an error when returning a no-error | | approach, it is desirable to at least log an error when returning a | |
| indication in this situation. If the server maintains a reply-cache | | no-error indication in this situation. If the server maintains a | |
| mechanism, it can verify the CLOSE is indeed a retransmission and | | reply-cache mechanism, it can verify the CLOSE is indeed a | |
| avoid error logging in most cases. | | retransmission and avoid error logging in most cases. | |
| | | | |
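The lenient handling of a CLOSE for a deallocated stateid described
above can be sketched as follows. This is an illustration only, not
part of the protocol: handle_close, open_states, reply_cache, strict
and warn are hypothetical names, and a real reply cache is keyed on
the request rather than on the stateid. Only the two status values
are taken from the protocol's error definitions.

```python
import logging

NFS4_OK = 0
NFS4ERR_BAD_STATEID = 10025


def handle_close(stateid, open_states, reply_cache, strict=False,
                 warn=logging.warning):
    """Process a CLOSE for `stateid` (hypothetical server-side sketch).

    open_states: dict mapping live stateids to open-file state.
    reply_cache: set of stateids whose CLOSE was already answered.
    """
    if stateid in open_states:
        # Normal case: free the state and remember that we replied.
        del open_states[stateid]
        reply_cache.add(stateid)
        return NFS4_OK
    if stateid in reply_cache:
        # Verified retransmission: repeat the success, nothing to log.
        return NFS4_OK
    if strict:
        # Complete error checking: the state is unknown, so reject.
        return NFS4ERR_BAD_STATEID
    # Lenient mode: assume a retransmission, but leave a trace.
    warn("CLOSE for deallocated stateid %r; assuming retransmission",
         stateid)
    return NFS4_OK
```

With strict left at its default, an unknown stateid still yields
NFS4_OK, and the warning is suppressed only when the reply cache
confirms the retransmission.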
| 8.11. Open Upgrade and Downgrade | | 8.11. Open Upgrade and Downgrade | |
| | | | |
| When an OPEN is done for a file and the lockowner for which the open | | When an OPEN is done for a file and the lockowner for which the open | |
| is being done already has the file open, the result is to upgrade the | | is being done already has the file open, the result is to upgrade the | |
| open file status maintained on the server to include the access and | | open file status maintained on the server to include the access and | |
| deny bits specified by the new OPEN as well as those for the existing | | deny bits specified by the new OPEN as well as those for the existing | |
| OPEN. The result is that there is one open file, as far as the | | OPEN. The result is that there is one open file, as far as the | |
| protocol is concerned, and it includes the union of the access and | | protocol is concerned, and it includes the union of the access and | |
| deny bits for all of the OPEN requests completed. Only a single | | deny bits for all of the OPEN requests completed. Only a single | |
|
| CLOSE will be done to reset the effects of both OPEN's. Note that | | CLOSE will be done to reset the effects of both OPENs. Note that the | |
| the client, when issuing the OPEN, may not know that the same file is | | client, when issuing the OPEN, may not know that the same file is in | |
| in fact being opened. The above only applies if both OPEN's result | | fact being opened. The above only applies if both OPENs result in | |
| in the OPEN'ed object being designated by the same filehandle. | | the OPENed object being designated by the same filehandle. | |
| | | | |
| When the server chooses to export multiple filehandles corresponding | | When the server chooses to export multiple filehandles corresponding | |
| to the same file object and returns different filehandles on two | | to the same file object and returns different filehandles on two | |
|
| different OPEN's of the same file object, the server MUST NOT "OR" | | different OPENs of the same file object, the server MUST NOT "OR" | |
| together the access and deny bits and coalesce the two open files. | | together the access and deny bits and coalesce the two open files. | |
|
| Instead the server must maintain separate OPEN's with separate | | Instead the server must maintain separate OPENs with separate | |
| stateid's and will require separate CLOSE's to free them. | | stateids and will require separate CLOSEs to free them. | |
| | | | |
| When multiple open files on the client are merged into a single open | | When multiple open files on the client are merged into a single open | |
| file object on the server, the close of one of the open files (on the | | file object on the server, the close of one of the open files (on the | |
| client) may necessitate change of the access and deny status of the | | client) may necessitate change of the access and deny status of the | |
| open file on the server. This is because the union of the access and | | open file on the server. This is because the union of the access and | |
|
| deny bits for the remaining open's may be smaller (i.e. a proper | | deny bits for the remaining opens may be smaller (i.e. a proper | |
| subset) than previously. The OPEN_DOWNGRADE operation is used to | | subset) than previously. The OPEN_DOWNGRADE operation is used to | |
| make the necessary change and the client should use it to update the | | make the necessary change and the client should use it to update the | |
| server so that share reservation requests by other clients are | | server so that share reservation requests by other clients are | |
| handled properly. | | handled properly. | |
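
The union of access and deny bits, and the point at which a client
would issue OPEN_DOWNGRADE, can be sketched as below. This is a
minimal client-side illustration under assumptions: merged_share and
after_local_close are hypothetical helpers, and each local open is
modeled as an (access, deny) pair. The constant values follow the
protocol's share reservation definitions.

```python
OPEN4_SHARE_ACCESS_READ = 0x1
OPEN4_SHARE_ACCESS_WRITE = 0x2
OPEN4_SHARE_DENY_NONE = 0x0
OPEN4_SHARE_DENY_WRITE = 0x2


def merged_share(opens):
    """Return the (access, deny) union over all local opens."""
    access = deny = 0
    for a, d in opens:
        access |= a
        deny |= d
    return access, deny


def after_local_close(opens, index):
    """Drop one local open and decide what to send to the server.

    Returns (remaining_opens, (operation, new_share)).
    """
    before = merged_share(opens)
    remaining = opens[:index] + opens[index + 1:]
    if not remaining:
        # Last local open is gone: a single CLOSE frees the server state.
        return remaining, ("CLOSE", None)
    after = merged_share(remaining)
    if after != before:
        # The union shrank, so the server's view must be reduced.
        return remaining, ("OPEN_DOWNGRADE", after)
    # The union is unchanged: nothing needs to be sent.
    return remaining, (None, None)
```

For example, closing the write open out of a read open and a
write/deny-write open leaves only read access, so the sketch selects
OPEN_DOWNGRADE with access READ and deny NONE.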
| | | | |
| 8.12. Short and Long Leases | | 8.12. Short and Long Leases | |
| | | | |
| When determining the time period for the server lease, the usual | | When determining the time period for the server lease, the usual | |
| lease tradeoffs apply. Short leases are good for fast server | | lease tradeoffs apply. Short leases are good for fast server | |
| recovery at a cost of increased RENEW or READ (with zero length) | | recovery at a cost of increased RENEW or READ (with zero length) | |
| requests. Longer leases are certainly kinder and gentler to servers | | requests. Longer leases are certainly kinder and gentler to servers | |
| trying to handle very large numbers of clients. The number of RENEW | | trying to handle very large numbers of clients. The number of RENEW | |
| requests drops in proportion to the lease time. The disadvantages of | | requests drops in proportion to the lease time. The disadvantages of | |
|
| long leases are slower recovery after server failure (server must | | long leases are slower recovery after server failure (the server must | |
| wait for leases to expire and grace period before granting new lock | | wait for the leases to expire and the grace period to elapse before | |
| requests) and increased file contention (if client fails to transmit | | granting new lock requests) and increased file contention (if client | |
| an unlock request then server must wait for lease expiration before | | fails to transmit an unlock request then server must wait for lease | |
| granting new locks). | | expiration before granting new locks). | |
| | | | |
| Long leases are usable if the server is able to store lease state in | | Long leases are usable if the server is able to store lease state in | |
| non-volatile memory. Upon recovery, the server can reconstruct the | | non-volatile memory. Upon recovery, the server can reconstruct the | |
| lease state from its non-volatile memory and continue operation with | | lease state from its non-volatile memory and continue operation with | |
| its clients and therefore long leases would not be an issue. | | its clients and therefore long leases would not be an issue. | |
| | | | |
| 8.13. Clocks, Propagation Delay, and Calculating Lease Expiration | | 8.13. Clocks, Propagation Delay, and Calculating Lease Expiration | |
| | | | |
| To avoid the need for synchronized clocks, lease times are granted by | | To avoid the need for synchronized clocks, lease times are granted by | |
| the server as a time delta. However, there is a requirement that the | | the server as a time delta. However, there is a requirement that the | |
| client and server clocks do not drift excessively over the duration | | client and server clocks do not drift excessively over the duration | |
| of the lock. There is also the issue of propagation delay across the | | of the lock. There is also the issue of propagation delay across the | |
| network which could easily be several hundred milliseconds as well as | | network which could easily be several hundred milliseconds as well as | |
| the possibility that requests will be lost and need to be | | the possibility that requests will be lost and need to be | |
| retransmitted. | | retransmitted. | |
| | | | |
| To take propagation delay into account, the client should subtract it | | To take propagation delay into account, the client should subtract it | |
| from lease times (e.g. if the client estimates the one-way | | from lease times (e.g. if the client estimates the one-way | |
| propagation delay as 200 msec, then it can assume that the lease is | | propagation delay as 200 msec, then it can assume that the lease is | |
| already 200 msec old when it gets it). In addition, it will take | | already 200 msec old when it gets it). In addition, it will take | |
| another 200 msec to get a response back to the server. So the client | | another 200 msec to get a response back to the server. So the client | |
| must send a lock renewal or write data back to the server 400 msec | | must send a lock renewal or write data back to the server 400 msec | |
| before the lease would expire. | | before the lease would expire. | |
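
The arithmetic above can be written out as a small helper. This is a
sketch under assumptions: latest_send_time_ms is a hypothetical name,
times are in milliseconds on the client's own clock, and the one-way
delay is the client's estimate. The 90 second lease in the usage note
is an illustrative value, not one taken from the protocol.

```python
def latest_send_time_ms(received_at_ms, lease_ms, one_way_delay_ms):
    """Latest client time at which a renewal can still be sent.

    The lease is already one delay old when it arrives, and the
    renewal needs one more delay to reach the server, so two delays
    are subtracted from the lease period.
    """
    deadline = received_at_ms + lease_ms - 2 * one_way_delay_ms
    # If the lease is shorter than the round trip, renew immediately.
    return max(deadline, received_at_ms)
```

With a lease of 90000 msec received at time 0 and an estimated one-way
delay of 200 msec, the renewal has to leave the client no later than
89600 msec, that is, 400 msec before the nominal expiration.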
| | | | |
| The server's lease period configuration should take into account the | | The server's lease period configuration should take into account the | |
| network distance of the clients that will be accessing the server's | | network distance of the clients that will be accessing the server's | |
| resources. It is expected that the lease period will take into | | resources. It is expected that the lease period will take into | |
|
| account the network propogation delays and other network delay | | account the network propagation delays and other network delay | |
| factors for the client population. Since the protocol does not allow | | factors for the client population. Since the protocol does not allow | |
| for an automatic method to determine an appropriate lease period, the | | for an automatic method to determine an appropriate lease period, the | |
| server's administrator may have to tune the lease period. | | server's administrator may have to tune the lease period. | |
| | | | |
| 8.14. Migration, Replication and State | | 8.14. Migration, Replication and State | |
| | | | |
| When responsibility for handling a given file system is transferred | | When responsibility for handling a given file system is transferred | |
| to a new server (migration) or the client chooses to use an alternate | | to a new server (migration) or the client chooses to use an alternate | |
| server (e.g. in response to server unresponsiveness) in the context | | server (e.g. in response to server unresponsiveness) in the context | |
| of file system replication, the appropriate handling of state shared | | of file system replication, the appropriate handling of state shared | |
|
| between the client and server (i.e. locks, leases, stateid's, and | | between the client and server (i.e. locks, leases, stateids, and | |
| clientid's) is as described below. The handling differs between | | clientids) is as described below. The handling differs between | |
| migration and replication. For related discussion of file server | | migration and replication. For related discussion of file server | |
| state and recovery of such, see the sections under "File Locking and | | state and recovery of such, see the sections under "File Locking and | |
| Share Reservations". | | Share Reservations". | |
| | | | |
| If a server replica or a server immigrating a filesystem agrees to, or | | If a server replica or a server immigrating a filesystem agrees to, or | |
| is expected to, accept opaque values from the client that originated | | is expected to, accept opaque values from the client that originated | |
| from another server, then it is a wise implementation practice for | | from another server, then it is a wise implementation practice for | |
| the servers to encode the "opaque" values in network byte order. This | | the servers to encode the "opaque" values in network byte order. This | |
| way, servers acting as replicas or immigrating filesystems will be | | way, servers acting as replicas or immigrating filesystems will be | |
| able to parse values like stateids, directory cookies, filehandles, | | able to parse values like stateids, directory cookies, filehandles, | |
| etc. even if their native byte order is different from other servers | | etc. even if their native byte order is different from other servers | |
| cooperating in the replication and migration of the filesystem. | | cooperating in the replication and migration of the filesystem. | |
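
As an illustration of encoding "opaque" values in network byte order,
the sketch below packs a 12-byte value such as the non-seqid part of a
stateid. The layout (a 32-bit server boot time followed by a 64-bit
table index) and the names encode_other and decode_other are
assumptions made for this example; the protocol leaves the contents of
such values entirely to the server.

```python
import struct

# "!" selects network (big-endian) byte order with standard sizes:
# a 4-byte unsigned boot time followed by an 8-byte unsigned index.
_OTHER_FORMAT = "!IQ"


def encode_other(boot_time, index):
    """Pack the hypothetical fields into 12 bytes, big-endian."""
    return struct.pack(_OTHER_FORMAT, boot_time, index)


def decode_other(data):
    """Unpack 12 bytes into (boot_time, index) on any host."""
    return struct.unpack(_OTHER_FORMAT, data)
```

Because the byte order is fixed by the format string rather than by
the host, a replica or destination server with a different native
byte order decodes the same field values.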
| | | | |
| 8.14.1. Migration and State | | 8.14.1. Migration and State | |
| | | | |
| In the case of migration, the servers involved in the migration of a | | In the case of migration, the servers involved in the migration of a | |
| filesystem SHOULD transfer all server state from the original to the | | filesystem SHOULD transfer all server state from the original to the | |
| new server. This must be done in a way that is transparent to the | | new server. This must be done in a way that is transparent to the | |
| client. This state transfer will ease the client's transition when a | | client. This state transfer will ease the client's transition when a | |
| filesystem migration occurs. If the servers are successful in | | filesystem migration occurs. If the servers are successful in | |
|
| transferring all state, the client will continue to use stateid's | | transferring all state, the client will continue to use stateids | |
| assigned by the original server. Therefore the new server must | | assigned by the original server. Therefore the new server must | |
|
| recognize these stateid's as valid. This holds true for the clientid | | recognize these stateids as valid. This holds true for the clientid | |
| as well. Since responsibility for an entire filesystem is | | as well. Since responsibility for an entire filesystem is | |
| transferred with a migration event, there is no possibility that | | transferred with a migration event, there is no possibility that | |
| conflicts will arise on the new server as a result of the transfer of | | conflicts will arise on the new server as a result of the transfer of | |
| locks. | | locks. | |
| | | | |
| As part of the transfer of information between servers, leases would | | As part of the transfer of information between servers, leases would | |
| be transferred as well. The leases being transferred to the new | | be transferred as well. The leases being transferred to the new | |
| server will typically have a different expiration time from those for | | server will typically have a different expiration time from those for | |
| the same client, previously on the old server. To maintain the | | the same client, previously on the old server. To maintain the | |
| property that all leases on a given server for a given client expire | | property that all leases on a given server for a given client expire | |
| at the same time, the server should advance the expiration time to | | at the same time, the server should advance the expiration time to | |
| the later of the leases being transferred or the leases already | | the later of the leases being transferred or the leases already | |
| present. This allows the client to maintain lease renewal of both | | present. This allows the client to maintain lease renewal of both | |
| | | | |
| skipping to change at page 86, line 33 | | skipping to change at page 87, line 48 | |
| NFS4ERR_STALE_STATEID from the new server. The client should then | | NFS4ERR_STALE_STATEID from the new server. The client should then | |
| recover its state information as it normally would in response to a | | recover its state information as it normally would in response to a | |
| server failure. The new server must take care to allow for the | | server failure. The new server must take care to allow for the | |
| recovery of state information as it would in the event of server | | recovery of state information as it would in the event of server | |
| restart. | | restart. | |
| | | | |
| 8.14.2. Replication and State | | 8.14.2. Replication and State | |
| | | | |
| Since client switch-over in the case of replication is not under | | Since client switch-over in the case of replication is not under | |
| server control, the handling of state is different. In this case, | | server control, the handling of state is different. In this case, | |
|
| leases, stateid's and clientid's do not have validity across a | | leases, stateids and clientids do not have validity across a | |
| transition from one server to another. The client must re-establish | | transition from one server to another. The client must re-establish | |
| its locks on the new server. This can be compared to the re- | | its locks on the new server. This can be compared to the re- | |
| establishment of locks by means of reclaim-type requests after a | | establishment of locks by means of reclaim-type requests after a | |
| server reboot. The difference is that the server has no provision to | | server reboot. The difference is that the server has no provision to | |
| distinguish requests reclaiming locks from those obtaining new locks | | distinguish requests reclaiming locks from those obtaining new locks | |
| or to defer the latter. Thus, a client re-establishing a lock on the | | or to defer the latter. Thus, a client re-establishing a lock on the | |
| new server (by means of a LOCK or OPEN request), may have the | | new server (by means of a LOCK or OPEN request), may have the | |
| requests denied due to a conflicting lock. Since replication is | | requests denied due to a conflicting lock. Since replication is | |
| intended for read-only use of filesystems, such denial of locks | | intended for read-only use of filesystems, such denial of locks | |
| should not pose large difficulties in practice. When an attempt to | | should not pose large difficulties in practice. When an attempt to | |
| re-establish a lock on a new server is denied, the client should | | re-establish a lock on a new server is denied, the client should | |
| treat the situation as if its original lock had been revoked. | | treat the situation as if its original lock had been revoked. | |
| | | | |
| 8.14.3. Notification of Migrated Lease | | 8.14.3. Notification of Migrated Lease | |
| | | | |
| In the case of lease renewal, the client may not be submitting | | In the case of lease renewal, the client may not be submitting | |
| requests for a filesystem that has been migrated to another server. | | requests for a filesystem that has been migrated to another server. | |
| This can occur because of the implicit lease renewal mechanism. The | | This can occur because of the implicit lease renewal mechanism. The | |
| client renews leases for all filesystems when submitting a request to | | client renews leases for all filesystems when submitting a request to | |
| any one filesystem at the server. | | any one filesystem at the server. | |
| | | | |
| In order for the client to schedule renewal of leases that may have | | In order for the client to schedule renewal of leases that may have | |
| been relocated to the new server, the client must find out about | | been relocated to the new server, the client must find out about | |
| lease relocation before those leases expire. To accomplish this, all | | lease relocation before those leases expire. To accomplish this, all | |
| operations which implicitly renew leases for a client (i.e. OPEN, | | operations which implicitly renew leases for a client (i.e. OPEN, | |
| CLOSE, READ, WRITE, RENEW, LOCK, LOCKT, LOCKU), will return the error | | CLOSE, READ, WRITE, RENEW, LOCK, LOCKT, LOCKU), will return the error | |
| NFS4ERR_LEASE_MOVED if responsibility for any of the leases to be | | NFS4ERR_LEASE_MOVED if responsibility for any of the leases to be | |
| renewed has been transferred to a new server. This condition will | | renewed has been transferred to a new server. This condition will | |
| continue until the client receives an NFS4ERR_MOVED error and the | | continue until the client receives an NFS4ERR_MOVED error and the | |
| server receives the subsequent GETATTR(fs_locations) for an access to | | server receives the subsequent GETATTR(fs_locations) for an access to | |
| each filesystem for which a lease has been moved to a new server. | | each filesystem for which a lease has been moved to a new server. | |
| | | | |
| When a client receives an NFS4ERR_LEASE_MOVED error, it should | | When a client receives an NFS4ERR_LEASE_MOVED error, it should | |
|
| perform some operation, such as a RENEW, on each filesystem | | perform an operation on each filesystem associated with the server in | |
| associated with the server in question. When the client receives an | | question. When the client receives an NFS4ERR_MOVED error, the | |
| NFS4ERR_MOVED error, the client can follow the normal process to | | client can follow the normal process to obtain the new server | |
| obtain the new server information (through the fs_locations | | information (through the fs_locations attribute) and perform renewal | |
| attribute) and perform renewal of those leases on the new server. If | | of those leases on the new server. If the server has not had state | |
| the server has not had state transferred to it transparently, the | | transferred to it transparently, the client will receive either | |
| client will receive either NFS4ERR_STALE_CLIENTID or | | NFS4ERR_STALE_CLIENTID or NFS4ERR_STALE_STATEID from the new server, | |
| NFS4ERR_STALE_STATEID from the new server, as described above, and | | as described above, and the client can then recover state information | |
| the client can then recover state information as it does in the event | | as it does in the event of server failure. | |
| of server failure. | | | |
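
The client's reaction to NFS4ERR_LEASE_MOVED can be sketched as below.
This is an illustration under assumptions: handle_lease_moved is a
hypothetical helper, probe stands for any operation sent on a
filesystem and returning its status, and get_fs_locations stands for a
GETATTR of the fs_locations attribute. Only the status values are
taken from the protocol's error definitions.

```python
NFS4_OK = 0
NFS4ERR_MOVED = 10019
NFS4ERR_LEASE_MOVED = 10031


def handle_lease_moved(filesystems, probe, get_fs_locations):
    """Find which filesystems migrated and where they went.

    filesystems: filesystems the client uses on the old server.
    probe(fs): performs an operation on fs and returns its status.
    get_fs_locations(fs): returns the new location information.
    Returns a dict mapping each migrated filesystem to its locations.
    """
    moved = {}
    for fs in filesystems:
        if probe(fs) == NFS4ERR_MOVED:
            # This filesystem migrated: learn the new server so that
            # its leases can be renewed there.
            moved[fs] = get_fs_locations(fs)
    return moved
```

The caller would then renew leases on each returned location, falling
back to normal state recovery if the new server answers with
NFS4ERR_STALE_CLIENTID or NFS4ERR_STALE_STATEID.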
| | | | |
| 8.14.4. Migration and the Lease_time Attribute | | 8.14.4. Migration and the Lease_time Attribute | |
| | | | |
| In order that the client may appropriately manage its leases in the | | In order that the client may appropriately manage its leases in the | |
| case of migration, the destination server must establish proper | | case of migration, the destination server must establish proper | |
| values for the lease_time attribute. | | values for the lease_time attribute. | |
| | | | |
| When state is transferred transparently, that state should include | | When state is transferred transparently, that state should include | |
| the correct value of the lease_time attribute. The lease_time | | the correct value of the lease_time attribute. The lease_time | |
| attribute on the destination server must never be less than that on | | attribute on the destination server must never be less than that on | |
| the source since this would result in premature expiration of leases | | the source since this would result in premature expiration of leases | |
| granted by the source server. Upon migration in which state is | | granted by the source server. Upon migration in which state is | |
| transferred transparently, the client is under no obligation to re- | | transferred transparently, the client is under no obligation to re- | |
| fetch the lease_time attribute and may continue to use the value | | fetch the lease_time attribute and may continue to use the value | |
| previously fetched (on the source server). | | previously fetched (on the source server). | |
| | | | |
| If state has not been transferred transparently (i.e. the client sees | | If state has not been transferred transparently (i.e. the client sees | |
| a real or simulated server reboot), the client should fetch the value | | a real or simulated server reboot), the client should fetch the value | |
| of lease_time on the new (i.e. destination) server, and use it for | | of lease_time on the new (i.e. destination) server, and use it for | |
| subsequent locking requests. However the server must respect a grace | | subsequent locking requests. However the server must respect a grace | |
| period at least as long as the lease_time on the source server, in | | period at least as long as the lease_time on the source server, in | |
| order to ensure that clients have ample time to reclaim their locks | | order to ensure that clients have ample time to reclaim their locks | |
| before potentially conflicting non-reclaimed locks are granted. The | | before potentially conflicting non-reclaimed locks are granted. The | |
| means by which the new server obtains the value of lease_time on the | | means by which the new server obtains the value of lease_time on the | |
| old server is left to the server implementations. It is not | | old server is left to the server implementations. It is not | |
| specified by the NFS version 4 protocol. | | specified by the NFS version 4 protocol. | |
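
The two constraints on the destination server can be expressed as
simple lower bounds. This is a sketch only: the function and parameter
names are hypothetical, all values are in seconds, and how the
destination learns the source's lease_time is left to the
implementation, as stated above.

```python
def destination_lease_time(source_lease, configured_lease):
    """lease_time to advertise when state was transferred transparently.

    It must never be less than the source's value, otherwise leases
    granted by the source would expire prematurely.
    """
    return max(source_lease, configured_lease)


def destination_grace_period(source_lease, configured_grace):
    """Grace period to honor when state was not transferred.

    It must be at least as long as the source's lease_time so that
    clients have time to reclaim their locks.
    """
    return max(source_lease, configured_grace)
```

For instance, a destination configured with a 60 second lease that
takes over from a source using 90 seconds would advertise 90 seconds
in the transparent case.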
| | | | |
| 9. Client-Side Caching | | 9. Client-Side Caching | |
| | | | |
| Client-side caching of data, of file attributes, and of file names is | | Client-side caching of data, of file attributes, and of file names is | |
| essential to providing good performance with the NFS protocol. | | essential to providing good performance with the NFS protocol. | |
| Providing distributed cache coherence is a difficult problem and | | Providing distributed cache coherence is a difficult problem and | |
| previous versions of the NFS protocol have not attempted it. | | previous versions of the NFS protocol have not attempted it. | |
| Instead, several NFS client implementation techniques have been used | | Instead, several NFS client implementation techniques have been used | |
| to reduce the problems that a lack of coherence poses for users. | | to reduce the problems that a lack of coherence poses for users. | |
| These techniques have not been clearly defined by earlier protocol | | These techniques have not been clearly defined by earlier protocol | |
| | | | |
| skipping to change at page 89, line 5 | | skipping to change at page 91, line 5 | |
| performance is to allow a client that repeatedly opens a file to do | | performance is to allow a client that repeatedly opens a file to do | |
| so without reference to the server. This is done until potentially | | so without reference to the server. This is done until potentially | |
| conflicting operations from another client actually occur. | | conflicting operations from another client actually occur. | |
| | | | |
| A similar situation arises in connection with file locking. Sending | | A similar situation arises in connection with file locking. Sending | |
| file lock and unlock requests to the server as well as the read and | | file lock and unlock requests to the server as well as the read and | |
| write requests necessary to make data caching consistent with the | | write requests necessary to make data caching consistent with the | |
| locking semantics (see the section "Data Caching and File Locking") | | locking semantics (see the section "Data Caching and File Locking") | |
| can severely limit performance. When locking is used to provide | | can severely limit performance. When locking is used to provide | |
| protection against infrequent conflicts, a large penalty is incurred. | | protection against infrequent conflicts, a large penalty is incurred. | |
| This penalty may discourage the use of file locking by applications. | | This penalty may discourage the use of file locking by applications. | |
| | | | |
| The NFS version 4 protocol provides more aggressive caching | | The NFS version 4 protocol provides more aggressive caching | |
| strategies with the following design goals: | | strategies with the following design goals: | |
| | | | |
| o Compatibility with a large range of server semantics. | | o Compatibility with a large range of server semantics. | |
| | | | |
| o Provide the same caching benefits as previous versions of the | | o Provide the same caching benefits as previous versions of the | |
| | | | |
| skipping to change at page 90, line 5 | | skipping to change at page 92, line 5 | |
| on them. Preliminary testing of callback functionality by means of a | | on them. Preliminary testing of callback functionality by means of a | |
| CB_NULL procedure determines whether callbacks can be supported. The | | CB_NULL procedure determines whether callbacks can be supported. The | |
| CB_NULL procedure checks the continuity of the callback path. A | | CB_NULL procedure checks the continuity of the callback path. A | |
| server makes a preliminary assessment of callback availability to a | | server makes a preliminary assessment of callback availability to a | |
| given client and avoids delegating responsibilities until it has | | given client and avoids delegating responsibilities until it has | |
| determined that callbacks are supported. Because the granting of a | | determined that callbacks are supported. Because the granting of a | |
| delegation is always conditional upon the absence of conflicting | | delegation is always conditional upon the absence of conflicting | |
| access, clients must not assume that a delegation will be granted and | | access, clients must not assume that a delegation will be granted and | |
| they must always be prepared for OPENs to be processed without any | | they must always be prepared for OPENs to be processed without any | |
| delegations being granted. | | delegations being granted. | |
| | | | |
| Once granted, a delegation behaves in most ways like a lock. There | | Once granted, a delegation behaves in most ways like a lock. There | |
| is an associated lease that is subject to renewal together with all | | is an associated lease that is subject to renewal together with all | |
| of the other leases held by that client. | | of the other leases held by that client. | |
| | | | |
| Unlike locks, an operation by a second client to a delegated file | | Unlike locks, an operation by a second client to a delegated file | |
| will cause the server to recall a delegation through a callback. | | will cause the server to recall a delegation through a callback. | |
| | | | |
| | | | |
| skipping to change at page 91, line 5 | | skipping to change at page 93, line 5 | |
| There are three situations that delegation recovery must deal with: | | There are three situations that delegation recovery must deal with: | |
| | | | |
| o Client reboot or restart | | o Client reboot or restart | |
| | | | |
| o Server reboot or restart | | o Server reboot or restart | |
| | | | |
| o Network partition (full or callback-only) | | o Network partition (full or callback-only) | |
| | | | |
| In the event the client reboots or restarts, the failure to renew | | In the event the client reboots or restarts, the failure to renew | |
| leases will result in the revocation of record locks and share | | leases will result in the revocation of record locks and share | |
| reservations. Delegations, however, may be treated a bit | | reservations. Delegations, however, may be treated a bit | |
| differently. | | differently. | |
| | | | |
| There will be situations in which delegations will need to be | | There will be situations in which delegations will need to be | |
| reestablished after a client reboots or restarts. The reason for | | reestablished after a client reboots or restarts. The reason for | |
| this is the client may have file data stored locally and this data | | this is the client may have file data stored locally and this data | |
| was associated with the previously held delegations. The client will | | was associated with the previously held delegations. The client will | |
| need to reestablish the appropriate file state on the server. | | need to reestablish the appropriate file state on the server. | |
| | | | |
| skipping to change at page 92, line 5 | | skipping to change at page 94, line 5 | |
| process of handling delegation reclaim reconciles three principles of | | process of handling delegation reclaim reconciles three principles of | |
| the NFS version 4 protocol: | | the NFS version 4 protocol: | |
| | | | |
| o Upon reclaim, a client reporting resources assigned to it by an | | o Upon reclaim, a client reporting resources assigned to it by an | |
| earlier server instance must be granted those resources. | | earlier server instance must be granted those resources. | |
| | | | |
| o The server has unquestionable authority to determine whether | | o The server has unquestionable authority to determine whether | |
| delegations are to be granted and, once granted, whether they | | delegations are to be granted and, once granted, whether they | |
| are to be continued. | | are to be continued. | |
| | | | |
| o The use of callbacks is not to be depended upon until the client | | o The use of callbacks is not to be depended upon until the client | |
| has proven its ability to receive them. | | has proven its ability to receive them. | |
| | | | |
| When a network partition occurs, delegations are subject to freeing | | When a network partition occurs, delegations are subject to freeing | |
| by the server when the lease renewal period expires. This is similar | | by the server when the lease renewal period expires. This is similar | |
| to the behavior for locks and share reservations. For delegations, | | to the behavior for locks and share reservations. For delegations, | |
| however, the server may extend the period in which conflicting | | however, the server may extend the period in which conflicting | |
| requests are held off. Eventually the occurrence of a conflicting | | requests are held off. Eventually the occurrence of a conflicting | |
| request from another client will cause revocation of the delegation. | | request from another client will cause revocation of the delegation. | |
| | | | |
| skipping to change at page 93, line 5 | | skipping to change at page 95, line 5 | |
| invalidate the assumptions that those using these facilities depend | | invalidate the assumptions that those using these facilities depend | |
| upon. | | upon. | |
| | | | |
| 9.3.1. Data Caching and OPENs | | 9.3.1. Data Caching and OPENs | |
| | | | |
| In order to avoid invalidating the sharing assumptions that | | In order to avoid invalidating the sharing assumptions that | |
| applications rely on, NFS version 4 clients should not provide cached | | applications rely on, NFS version 4 clients should not provide cached | |
| data to applications or modify it on behalf of an application when it | | data to applications or modify it on behalf of an application when it | |
| would not be valid to obtain or modify that same data via a READ or | | would not be valid to obtain or modify that same data via a READ or | |
| WRITE operation. | | WRITE operation. | |
| | | | |
| Furthermore, in the absence of open delegation (see the section "Open | | Furthermore, in the absence of open delegation (see the section "Open | |
| Delegation") two additional rules apply. Note that these rules are | | Delegation") two additional rules apply. Note that these rules are | |
| obeyed in practice by many NFS version 2 and version 3 clients. | | obeyed in practice by many NFS version 2 and version 3 clients. | |
| | | | |
| o First, cached data present on a client must be revalidated after | | o First, cached data present on a client must be revalidated after | |
| doing an OPEN. Revalidating means that the client fetches the | | doing an OPEN. Revalidating means that the client fetches the | |
| change attribute from the server, compares it with the cached | | change attribute from the server, compares it with the cached | |
| | | | |
| skipping to change at page 94, line 5 | | skipping to change at page 96, line 5 | |
| written to the file. Hence, this requirement. | | written to the file. Hence, this requirement. | |
| | | | |
| 9.3.2. Data Caching and File Locking | | 9.3.2. Data Caching and File Locking | |
| | | | |
| For those applications that choose to use file locking instead of | | For those applications that choose to use file locking instead of | |
| share reservations to exclude inconsistent file access, there is an | | share reservations to exclude inconsistent file access, there is an | |
| analogous set of constraints that apply to client side data caching. | | analogous set of constraints that apply to client side data caching. | |
| These rules are effective only if the file locking is used in a way | | These rules are effective only if the file locking is used in a way | |
| that matches in an equivalent way the actual READ and WRITE | | that matches in an equivalent way the actual READ and WRITE | |
| operations executed. This is as opposed to file locking that is | | operations executed. This is as opposed to file locking that is | |
| based on pure convention. For example, it is possible to manipulate | | based on pure convention. For example, it is possible to manipulate | |
| a two-megabyte file by dividing the file into two one-megabyte | | a two-megabyte file by dividing the file into two one-megabyte | |
| regions and protecting access to the two regions by file locks on | | regions and protecting access to the two regions by file locks on | |
| bytes zero and one. A lock for write on byte zero of the file would | | bytes zero and one. A lock for write on byte zero of the file would | |
| represent the right to do READ and WRITE operations on the first | | represent the right to do READ and WRITE operations on the first | |
| region. A lock for write on byte one of the file would represent the | | region. A lock for write on byte one of the file would represent the | |
| right to do READ and WRITE operations on the second region. As long | | right to do READ and WRITE operations on the second region. As long | |
| as all applications manipulating the file obey this convention, they | | as all applications manipulating the file obey this convention, they | |
| | | | |
| skipping to change at page 94, line 50 | | skipping to change at page 96, line 50 | |
| unlocked may cause invalid modification to the region outside the | | unlocked may cause invalid modification to the region outside the | |
| unlocked area. This, in turn, may be part of a region locked by | | unlocked area. This, in turn, may be part of a region locked by | |
| another client. Clients can avoid this situation by synchronously | | another client. Clients can avoid this situation by synchronously | |
| performing portions of write operations that overlap that portion | | performing portions of write operations that overlap that portion | |
| (initial or final) that is not a full block. Similarly, invalidating | | (initial or final) that is not a full block. Similarly, invalidating | |
| a locked area which is not an integral number of full buffer blocks | | a locked area which is not an integral number of full buffer blocks | |
| would require the client to read one or two partial blocks from the | | would require the client to read one or two partial blocks from the | |
| server if the revalidation procedure shows that the data which the | | server if the revalidation procedure shows that the data which the | |
| client possesses may not be valid. | | client possesses may not be valid. | |
| | | | |
|
| The data that is written to the server as a pre-requisite to the | | The data that is written to the server as a prerequisite to the | |
| unlocking of a region must be written, at the server, to stable | | unlocking of a region must be written, at the server, to stable | |
| storage. The client may accomplish this either with synchronous | | storage. The client may accomplish this either with synchronous | |
| writes or by following asynchronous writes with a COMMIT operation. | | writes or by following asynchronous writes with a COMMIT operation. | |
| This is required because retransmission of the modified data after a | | This is required because retransmission of the modified data after a | |
| server reboot might conflict with a lock held by another client. | | server reboot might conflict with a lock held by another client. | |
| | | | |
| A client implementation may choose to accommodate applications which | | A client implementation may choose to accommodate applications which | |
| use record locking in non-standard ways (e.g. using a record lock as | | use record locking in non-standard ways (e.g. using a record lock as | |
| a global semaphore) by flushing to the server more data upon a LOCKU | | a global semaphore) by flushing to the server more data upon a LOCKU | |
| than is covered by the locked range. This may include modified data | | than is covered by the locked range. This may include modified data | |
| within files other than the one for which the unlocks are being done. | | within files other than the one for which the unlocks are being done. | |
| In such cases, the client must not interfere with applications whose | | In such cases, the client must not interfere with applications whose | |
| READs and WRITEs are being done only within the bounds of record | | READs and WRITEs are being done only within the bounds of record | |
| locks which the application holds. For example, an application locks | | locks which the application holds. For example, an application locks | |
| a single byte of a file and proceeds to write that single byte. A | | a single byte of a file and proceeds to write that single byte. A | |
| client that chose to handle a LOCKU by flushing all modified data to | | client that chose to handle a LOCKU by flushing all modified data to | |
| the server could validly write that single byte in response to an | | the server could validly write that single byte in response to an | |
| | | | |
| skipping to change at page 95, line 45 | | skipping to change at page 97, line 45 | |
| satisfy the request using the client's validated cache. If an | | satisfy the request using the client's validated cache. If an | |
| appropriate file lock is not held for the range of the read or write, | | appropriate file lock is not held for the range of the read or write, | |
| the read or write request must not be satisfied by the client's cache | | the read or write request must not be satisfied by the client's cache | |
| and the request must be sent to the server for processing. When a | | and the request must be sent to the server for processing. When a | |
| read or write request partially overlaps a locked region, the request | | read or write request partially overlaps a locked region, the request | |
| should be subdivided into multiple pieces with each region (locked or | | should be subdivided into multiple pieces with each region (locked or | |
| not) treated appropriately. | | not) treated appropriately. | |
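
The subdivision of a partially overlapping request can be sketched as follows. This is a minimal illustration, not part of either draft: the half-open (start, end) byte ranges, the function name split_by_locks, and the assumption that the supplied list contains only locks appropriate to the operation (read or write) are all hypothetical.

```python
from typing import List, Tuple

Range = Tuple[int, int]          # half-open byte range [start, end)
Piece = Tuple[int, int, bool]    # (start, end, covered_by_lock)


def split_by_locks(request: Range, locks: List[Range]) -> List[Piece]:
    """Split a read or write request into pieces that are either fully
    covered by a held lock or not covered at all.

    Covered pieces may be satisfied from the validated cache; uncovered
    pieces must be sent to the server.
    """
    start, end = request
    pieces: List[Piece] = []
    pos = start
    for lock_start, lock_end in sorted(locks):
        # Clamp the lock to the part of the request not yet handled.
        lock_start = max(lock_start, pos)
        lock_end = min(lock_end, end)
        if lock_start >= lock_end:
            continue  # lock lies outside the remaining request
        if pos < lock_start:
            pieces.append((pos, lock_start, False))
        pieces.append((lock_start, lock_end, True))
        pos = lock_end
    if pos < end:
        pieces.append((pos, end, False))
    return pieces
```

Adjacent locked pieces are not merged in this sketch; a client could coalesce them before issuing requests.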
| | | | |
| 9.3.4. Data Caching and File Identity | | 9.3.4. Data Caching and File Identity | |
| | | | |
|
| When clients cache data, the file data needs to organized according | | When clients cache data, the file data needs to be organized | |
| to the filesystem object to which the data belongs. For NFS version | | according to the filesystem object to which the data belongs. For | |
| 3 clients, the typical practice has been to assume for the purpose of | | NFS version 3 clients, the typical practice has been to assume for | |
| caching that distinct filehandles represent distinct filesystem | | the purpose of caching that distinct filehandles represent distinct | |
| objects. The client then has the choice to organize and maintain the | | filesystem objects. The client then has the choice to organize and | |
| data cache on this basis. | | maintain the data cache on this basis. | |
| | | | |
| In the NFS version 4 protocol, there is now the possibility to have | | In the NFS version 4 protocol, there is now the possibility to have | |
| significant deviations from a "one filehandle per object" model | | significant deviations from a "one filehandle per object" model | |
| because a filehandle may be constructed on the basis of the object's | | because a filehandle may be constructed on the basis of the object's | |
| pathname. Therefore, clients need a reliable method to determine if | | pathname. Therefore, clients need a reliable method to determine if | |
| two filehandles designate the same filesystem object. If clients | | two filehandles designate the same filesystem object. If clients | |
| were simply to assume that all distinct filehandles denote distinct | | were simply to assume that all distinct filehandles denote distinct | |
| objects and proceed to do data caching on this basis, caching | | objects and proceed to do data caching on this basis, caching | |
| inconsistencies would arise between the distinct client side objects | | inconsistencies would arise between the distinct client side objects | |
| which mapped to the same server side object. | | which mapped to the same server side object. | |
| | | | |
| By providing a method to differentiate filehandles, the NFS version 4 | | By providing a method to differentiate filehandles, the NFS version 4 | |
| protocol alleviates a potential functional regression in comparison | | protocol alleviates a potential functional regression in comparison | |
| with the NFS version 3 protocol. Without this method, caching | | with the NFS version 3 protocol. Without this method, caching | |
| inconsistencies within the same client could occur and this has not | | inconsistencies within the same client could occur and this has not | |
| been present in previous versions of the NFS protocol. Note that it | | been present in previous versions of the NFS protocol. Note that it | |
| is possible to have such inconsistencies with applications executing | | is possible to have such inconsistencies with applications executing | |
| on multiple clients but that is not the issue being addressed here. | | on multiple clients but that is not the issue being addressed here. | |
| | | | |
| For the purposes of data caching, the following steps allow an NFS | | For the purposes of data caching, the following steps allow an NFS | |
| version 4 client to determine whether two distinct filehandles denote | | version 4 client to determine whether two distinct filehandles denote | |
| the same server side object: | | the same server side object: | |
| | | | |
|
| o If GETATTR directed to two filehandles have different values of | | o If GETATTR directed to two filehandles returns different values | |
| the fsid attribute, then the filehandles represent distinct | | of the fsid attribute, then the filehandles represent distinct | |
| objects. | | objects. | |
| | | | |
| o If GETATTR for any file with an fsid that matches the fsid of | | o If GETATTR for any file with an fsid that matches the fsid of | |
| the two filehandles in question returns a unique_handles | | the two filehandles in question returns a unique_handles | |
| attribute with a value of TRUE, then the two objects are | | attribute with a value of TRUE, then the two objects are | |
| distinct. | | distinct. | |
| | | | |
| o If GETATTR directed to the two filehandles does not return the | | o If GETATTR directed to the two filehandles does not return the | |
|
| fileid attribute for one or both of the handles, then it cannot | | fileid attribute for both of the handles, then it cannot be | |
| be determined whether the two objects are the same. Therefore, | | determined whether the two objects are the same. Therefore, | |
| operations which depend on that knowledge (e.g. client side data | | operations which depend on that knowledge (e.g. client side data | |
| caching) cannot be done reliably. | | caching) cannot be done reliably. | |
| | | | |
| o If GETATTR directed to the two filehandles returns different | | o If GETATTR directed to the two filehandles returns different | |
| values for the fileid attribute, then they are distinct objects. | | values for the fileid attribute, then they are distinct objects. | |
| | | | |
| o Otherwise they are the same object. | | o Otherwise they are the same object. | |
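
The decision steps above can be sketched as a single function. This is an illustrative sketch only: the Attrs container, the Identity enumeration, and the function name are hypothetical, and it assumes the client has already issued the GETATTR requests and learned the unique_handles value for the filesystem in question.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Identity(Enum):
    DISTINCT = "distinct"
    SAME = "same"
    UNKNOWN = "unknown"   # caching cannot be done reliably


@dataclass(frozen=True)
class Attrs:
    """Hypothetical holder for the GETATTR results of one filehandle."""
    fsid: Tuple[int, int]          # assumed (major, minor) pair
    fileid: Optional[int] = None   # None if the server did not return it


def compare_filehandles(a: Attrs, b: Attrs, unique_handles: bool) -> Identity:
    """Apply the steps in order for two distinct filehandles."""
    if a.fsid != b.fsid:
        return Identity.DISTINCT      # different filesystems
    if unique_handles:
        return Identity.DISTINCT      # one handle per object on this fsid
    if a.fileid is None or b.fileid is None:
        return Identity.UNKNOWN       # fileid missing for a handle
    if a.fileid != b.fileid:
        return Identity.DISTINCT
    return Identity.SAME
```

A client that receives UNKNOWN would have to avoid operations, such as client side data caching, that depend on knowing whether the two handles name the same object.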
| | | | |
| 9.4. Open Delegation | | 9.4. Open Delegation | |
| | | | |
| | | | |
| skipping to change at page 97, line 5 | | skipping to change at page 99, line 5 | |
| delegation is recallable, since the circumstances that allowed for | | delegation is recallable, since the circumstances that allowed for | |
| the delegation are subject to change. In particular, the server may | | the delegation are subject to change. In particular, the server may | |
| receive a conflicting OPEN from another client; the server must | | receive a conflicting OPEN from another client; the server must | |
| recall the delegation before deciding whether the OPEN from the other | | recall the delegation before deciding whether the OPEN from the other | |
| client may be granted. Making a delegation is up to the server and | | client may be granted. Making a delegation is up to the server and | |
| clients should not assume that any particular OPEN either will or | | clients should not assume that any particular OPEN either will or | |
| will not result in an open delegation. The following is a typical | | will not result in an open delegation. The following is a typical | |
| set of conditions that servers might use in deciding whether OPEN | | set of conditions that servers might use in deciding whether OPEN | |
| should be delegated: | | should be delegated: | |
| | | | |
| o The client must be able to respond to the server's callback | | o The client must be able to respond to the server's callback | |
| requests. The server will use the CB_NULL procedure for a test | | requests. The server will use the CB_NULL procedure for a test | |
| of callback ability. | | of callback ability. | |
| | | | |
| o The client must have responded properly to previous recalls. | | o The client must have responded properly to previous recalls. | |
| | | | |
| o There must be no current open conflicting with the requested | | o There must be no current open conflicting with the requested | |
| delegation. | | delegation. | |
| | | | |
| | | | |
| skipping to change at page 98, line 5 | | skipping to change at page 100, line 5 | |
| | | | |
| When an open delegation is made, the response to the OPEN contains an | | When an open delegation is made, the response to the OPEN contains an | |
| open delegation structure which specifies the following: | | open delegation structure which specifies the following: | |
| | | | |
| o the type of delegation (read or write) | | o the type of delegation (read or write) | |
| | | | |
| o space limitation information to control flushing of data on | | o space limitation information to control flushing of data on | |
| close (write open delegation only, see the section "Open | | close (write open delegation only, see the section "Open | |
| Delegation and Data Caching") | | Delegation and Data Caching") | |
| | | | |
| o an nfsace4 specifying read and write permissions | | o an nfsace4 specifying read and write permissions | |
| | | | |
| o a stateid to represent the delegation for READ and WRITE | | o a stateid to represent the delegation for READ and WRITE | |
| | | | |
| The delegation stateid is separate and distinct from the stateid for | | The delegation stateid is separate and distinct from the stateid for | |
| the OPEN proper. The standard stateid, unlike the delegation | | the OPEN proper. The standard stateid, unlike the delegation | |
| stateid, is associated with a particular lock_owner and will continue | | stateid, is associated with a particular lock_owner and will continue | |
| to be valid after the delegation is recalled and the file remains | | to be valid after the delegation is recalled and the file remains | |
| open. | | open. | |
| | | | |
| skipping to change at page 99, line 5 | | skipping to change at page 101, line 5 | |
| The use of delegation together with various other forms of caching | | The use of delegation together with various other forms of caching | |
| creates the possibility that no server authentication will ever be | | creates the possibility that no server authentication will ever be | |
| performed for a given user since all of the user's requests might be | | performed for a given user since all of the user's requests might be | |
| satisfied locally. Where the client is depending on the server for | | satisfied locally. Where the client is depending on the server for | |
| authentication, the client should be sure authentication occurs for | | authentication, the client should be sure authentication occurs for | |
| each user by use of the ACCESS operation. This should be the case | | each user by use of the ACCESS operation. This should be the case | |
| even if an ACCESS operation would not be required otherwise. As | | even if an ACCESS operation would not be required otherwise. As | |
| mentioned before, the server may enforce frequent authentication by | | mentioned before, the server may enforce frequent authentication by | |
| returning an nfsace4 denying all access with every open delegation. | | returning an nfsace4 denying all access with every open delegation. | |
| | | | |
| 9.4.1. Open Delegation and Data Caching | | 9.4.1. Open Delegation and Data Caching | |
| | | | |
| OPEN delegation allows much of the message overhead associated with | | OPEN delegation allows much of the message overhead associated with | |
| the opening and closing of files to be eliminated. An open when an open | | the opening and closing of files to be eliminated. An open when an open | |
| delegation is in effect does not require that a validation message be | | delegation is in effect does not require that a validation message be | |
| sent to the server. The continued endurance of the "read open | | sent to the server. The continued endurance of the "read open | |
| delegation" provides a guarantee that no OPEN for write and thus no | | delegation" provides a guarantee that no OPEN for write and thus no | |
| write has occurred. Similarly, when closing a file opened for write | | write has occurred. Similarly, when closing a file opened for write | |
| and if write open delegation is in effect, the data written does not | | and if write open delegation is in effect, the data written does not | |
| | | | |
| skipping to change at page 100, line 5 | | skipping to change at page 102, line 5 | |
| The server can recall delegations as a result of managing the | | The server can recall delegations as a result of managing the | |
| available filesystem space. The client should abide by the server's | | available filesystem space. The client should abide by the server's | |
| state space limits for delegations. If the client exceeds the stated | | state space limits for delegations. If the client exceeds the stated | |
| limits for the delegation, the server's behavior is undefined. | | limits for the delegation, the server's behavior is undefined. | |
| | | | |
| Based on server conditions, quotas or available filesystem space, the | | Based on server conditions, quotas or available filesystem space, the | |
| server may grant write open delegations with very restrictive space | | server may grant write open delegations with very restrictive space | |
| limitations. The limitations may be defined in a way that will | | limitations. The limitations may be defined in a way that will | |
| always force modified data to be flushed to the server on close. | | always force modified data to be flushed to the server on close. | |
| | | | |
| With respect to authentication, flushing modified data to the server | | With respect to authentication, flushing modified data to the server | |
| after a CLOSE has occurred may be problematic. For example, the user | | after a CLOSE has occurred may be problematic. For example, the user | |
| of the application may have logged off the client and unexpired | | of the application may have logged off the client and unexpired | |
| authentication credentials may not be present. In this case, the | | authentication credentials may not be present. In this case, the | |
| client may need to take special care to ensure that local unexpired | | client may need to take special care to ensure that local unexpired | |
| credentials will in fact be available. This may be accomplished by | | credentials will in fact be available. This may be accomplished by | |
| tracking the expiration time of credentials and flushing data well in | | tracking the expiration time of credentials and flushing data well in | |
| advance of their expiration or by making private copies of | | advance of their expiration or by making private copies of | |
| credentials to assure their availability when needed. | | credentials to assure their availability when needed. | |
| | | | |
| skipping to change at page 100, line 51 | | skipping to change at page 102, line 51 | |
| | | | |
| Since CB_GETATTR is being used to satisfy another client's GETATTR | | Since CB_GETATTR is being used to satisfy another client's GETATTR | |
| request, the server only needs to know if the client holding the | | request, the server only needs to know if the client holding the | |
| delegation has a modified version of the file. If the client's copy | | delegation has a modified version of the file. If the client's copy | |
| of the delegated file is not modified (data or size), the server can | | of the delegated file is not modified (data or size), the server can | |
| satisfy the second client's GETATTR request from the attributes | | satisfy the second client's GETATTR request from the attributes | |
| stored locally at the server. If the file is modified, the server | | stored locally at the server. If the file is modified, the server | |
| only needs to know about this modified state. If the server | | only needs to know about this modified state. If the server | |
| determines that the file is currently modified, it will respond to | | determines that the file is currently modified, it will respond to | |
| the second client's GETATTR as if the file had been modified locally | | the second client's GETATTR as if the file had been modified locally | |
|
| at the server. This means that the server will take the current time | | at the server. | |
| and apply it to the construction of attributes like change and | | | |
| time_modify. | | | |
| | | | |
| Since the form of the change attribute is determined by the server | | Since the form of the change attribute is determined by the server | |
| and is opaque to the client, the client and server need to agree on a | | and is opaque to the client, the client and server need to agree on a | |
| method of communicating the modified state of the file. For the size | | method of communicating the modified state of the file. For the size | |
| attribute, the client will report its current view of the file size. | | attribute, the client will report its current view of the file size. | |
| For the change attribute, the handling is more involved. | | For the change attribute, the handling is more involved. | |
| | | | |
| For the client, the following steps will be taken when receiving a | | For the client, the following steps will be taken when receiving a | |
| write delegation: | | write delegation: | |
| | | | |
| o The value of the change attribute will be obtained from the | | o The value of the change attribute will be obtained from the | |
| server and cached. Let this value be represented by c. | | server and cached. Let this value be represented by c. | |
| | | | |
| o The client will create a value greater than c that will be used | | o The client will create a value greater than c that will be used | |
| for communicating that modified data is held at the client. Let this | | for communicating that modified data is held at the client. Let this | |
| value be represented by d. | | value be represented by d. | |
| | | | |
| o When the client is queried via CB_GETATTR for the change | | o When the client is queried via CB_GETATTR for the change | |
| attribute, it checks to see if it holds modified data. If the | | attribute, it checks to see if it holds modified data. If the | |
| file is modified, the value d is returned for the change | | file is modified, the value d is returned for the change | |
| attribute value. If this file is not currently modified, the | | attribute value. If this file is not currently modified, the | |
| client returns the value c for the change attribute. | | client returns the value c for the change attribute. | |
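
The client-side steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the class and method names are hypothetical, d is chosen as c + 1 (any value greater than c satisfies the steps), and overflow of the change value is not handled.

```python
class WriteDelegation:
    """Hypothetical client-side record of one write delegation."""

    def __init__(self, server_change: int) -> None:
        self.c = server_change        # change value obtained from the server
        self.d = server_change + 1    # assumed choice; must only exceed c
        self.modified = False         # True once the client holds modified data

    def note_local_modification(self) -> None:
        """Record that cached data or size was modified under the delegation."""
        self.modified = True

    def cb_getattr_change(self) -> int:
        """Value to return for the change attribute in a CB_GETATTR reply."""
        return self.d if self.modified else self.c
```

With this sketch, a delegation created with a server change value of 41 answers CB_GETATTR with 41 until note_local_modification() is called, and with 42 afterwards.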
| | | | |
|
| While the change attribute is opaque to client in the sense that it | | For simplicity of implementation, the client MAY for each CB_GETATTR | |
| has no idea what units of time, if any, the server is counting change | | return the same value d. This is true even if, between successive | |
| with, it is not opaque in that the client has to treat it as an | | CB_GETATTR operations, the client again modifies the file's data | |
| integer, and the server has to be able to see the results of the | | or metadata in its cache. The client can return the same value | |
| client's changes to that integer. Therefore, the server MUST encode | | because the only requirement is that the client be able to indicate | |
| the change attribute in network order when sending it to the client, | | to the server that the client holds modified data. Therefore, the | |
| the client MUST decode it from network order to its native order when | | value of d may always be c + 1. | |
| receiving it, and the client MUST encode it in network order when | | | |
| sending it to the server. For this reason, change is defined as an | | While the change attribute is opaque to the client in the sense that | |
| integer, rather than an opaque array of octets. | | it has no idea what units of time, if any, the server is counting | |
| | | change with, it is not opaque in that the client has to treat it as | |
| | | an unsigned integer, and the server has to be able to see the results | |
| | | of the client's changes to that integer. Therefore, the server MUST | |
| | | encode the change attribute in network order when sending it to the | |
| | | client. The client MUST decode it from network order to its native | |
| | | order when receiving it and the client MUST encode it in network order | |
| | | when sending it to the server. For this reason, change is defined as | |
| | | an unsigned integer rather than an opaque array of octets. | |
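The client-side handling described above can be sketched as follows. This is an illustrative sketch only, not part of the protocol text: the function names are hypothetical, and the 64-bit width assumes that change is carried as an XDR unsigned 64-bit integer. The sketch shows the client reporting c when its cache is clean and d = c + 1 when it holds modified data, and converting the value to and from network (big-endian) order. Wraparound at 2^64 is not handled.

```python
import struct

def encode_change(value: int) -> bytes:
    """Encode a change value as an unsigned 64-bit integer in network order."""
    return struct.pack("!Q", value)

def decode_change(wire: bytes) -> int:
    """Decode a change value from network order into a native integer."""
    return struct.unpack("!Q", wire)[0]

def cb_getattr_change(c: int, modified: bool) -> int:
    """Value the delegation holder reports in CB_GETATTR.

    c is the change value received when the delegation was granted.
    A clean cache reports c; a cache with modified data reports
    d = c + 1, and may keep reporting the same d on later calls.
    """
    return c + 1 if modified else c
```

For example, a client granted a delegation with c = 7 reports 7 while its cache is clean and 8 once it has modified data, however many further modifications follow.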
| | | | |
| For the server, the following steps will be taken when providing a | | For the server, the following steps will be taken when providing a | |
| write delegation: | | write delegation: | |
| | | | |
|
| o On providing a write delegation, the server will cache a copy of | | o Upon providing a write delegation, the server will cache a copy | |
| the change attribute. Let this value be represented by sc. | | of the change attribute in the data structure it uses to record | |
| | | the delegation. Let this value be represented by sc. | |
| | | | |
|
| o The server obtains the change attribute from the client. Let | | o When a second client sends a GETATTR operation on the same file | |
| this value be cc. | | to the server, the server obtains the change attribute from the | |
| | | first client. Let this value be cc. | |
| | | | |
| o If the value cc is equal to sc, the file is not modified and the | | o If the value cc is equal to sc, the file is not modified and the | |
|
| server returns the current values for change and time_modify | | server returns the current values for change, time_metadata, and | |
| (for example) to the client requesting GETATTR. | | time_modify (for example) to the second client. | |
| | | | |
| | | Draft Specification NFS version 4 Protocol September 2002 | |
| | | | |
| o If the value cc is NOT equal to sc, the file is currently | | o If the value cc is NOT equal to sc, the file is currently | |
|
| modified at the client and most likely will be modified at the | | modified at the first client and most likely will be modified at | |
| server at a future time. The server then uses the current time | | the server at a future time. The server then uses its current | |
| to construct attributes values for change and time_modify and | | time to construct attribute values for time_metadata and | |
| returns those values to the requestor. | | time_modify. A new value of sc, which we will call nsc, is | |
| | | computed by the server, such that nsc >= sc + 1. The server | |
| | | then returns the constructed time_metadata, time_modify, and nsc | |
| | | values to the requester. The server replaces sc in the | |
| | | delegation record with nsc. To prevent the possibility of | |
| | | time_modify, time_metadata, and change from appearing to go | |
| | | backward (which would happen if the client holding the | |
| | | delegation fails to write its modified data to the server before | |
| | | the delegation is revoked or returned), the server SHOULD update | |
| | | the file's metadata record with the constructed attribute | |
| | | values. For reasons of performance, committing the | |
| | | constructed attribute values to stable storage is OPTIONAL. | |
| | | | |
| | | As discussed earlier in this section, the client MAY return the | |
| | | same cc value on subsequent CB_GETATTR calls, even if the file | |
| | | was modified in the client's cache yet again between successive | |
| | | CB_GETATTR calls. Therefore, the server must assume that the | |
| | | file has been modified yet again, and MUST take care to ensure | |
| | | that the new nsc it constructs and returns is greater than the | |
| | | previous nsc it returned. An example implementation's | |
| | | delegation record would satisfy this mandate by including a | |
| | | boolean field (let us call it "modified") that is set to false | |
| | | when the delegation is granted, and an sc value set at the time | |
| | | of grant to the change attribute value. The modified field would | |
| | | be set to true the first time cc != sc, and would stay true | |
| | | until the delegation is returned or revoked. The processing for | |
| | | constructing nsc, time_modify, and time_metadata would use this | |
| | | pseudo code: | |
| | | | |
| | | if (!modified) { | |
| | | do CB_GETATTR for change and size; | |
| | | | |
| | | if (cc != sc) | |
| | | modified = TRUE; | |
| | | } else { | |
| | | do CB_GETATTR for size; | |
| | | } | |
| | | | |
| | | if (modified) { | |
| | | sc = sc + 1; | |
| | | time_modify = time_metadata = current_time; | |
| | | update sc, time_modify, time_metadata into file's metadata; | |
| | | } | |
| | | | |
| | | return to client (that sent GETATTR) the attributes | |
| | | it requested, but make sure size comes from what | |
| | | CB_GETATTR returned. Do not update the file's metadata | |
| | | with the client's modified size. | |
| | | | |
| o In the case that the file attribute size is different than the | | o In the case that the file attribute size is different than the | |
| | | | |
|
| Draft Specification NFS version 4 Protocol August 2002 | | Draft Specification NFS version 4 Protocol September 2002 | |
| | | | |
| server's current value, the server treats this as a modification | | server's current value, the server treats this as a modification | |
| regardless of the value of the change attribute retrieved via | | regardless of the value of the change attribute retrieved via | |
| CB_GETATTR and responds to the second client as in the last | | CB_GETATTR and responds to the second client as in the last | |
| step. | | step. | |
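The server-side steps above, including the size check in the final step, can be sketched in Python. This is a sketch under stated assumptions rather than a definitive implementation: DelegationRecord, FileMeta, and the cb_getattr callable are hypothetical stand-ins for the server's delegation record, the file's metadata record, and the CB_GETATTR callback. Setting the modified flag when the sizes differ is one possible way to treat a size difference as a modification.

```python
import time
from dataclasses import dataclass

@dataclass
class DelegationRecord:
    sc: int                  # change value cached at grant, later replaced by nsc
    modified: bool = False   # true once the holder is known to have modified data

@dataclass
class FileMeta:
    change: int
    size: int
    time_modify: float
    time_metadata: float

def getattr_with_write_delegation(rec, meta, cb_getattr, now=time.time):
    """Answer a second client's GETATTR while a first client holds a write
    delegation. cb_getattr(want_change) returns (cc, size); cc is None
    when change was not requested. (Hypothetical callback interface.)"""
    if not rec.modified:
        cc, client_size = cb_getattr(True)       # ask for change and size
        if cc != rec.sc or client_size != meta.size:
            rec.modified = True
    else:
        _, client_size = cb_getattr(False)       # ask for size only

    if rec.modified:
        rec.sc += 1                              # nsc >= sc + 1, always increasing
        meta.change = rec.sc
        meta.time_modify = meta.time_metadata = now()

    # size comes from the delegation holder; it is not stored in meta
    return {"change": meta.change, "size": client_size,
            "time_modify": meta.time_modify,
            "time_metadata": meta.time_metadata}
```

Because sc is incremented on every GETATTR once the flag is set, the change value returned to requesters never repeats, even when the delegation holder keeps returning the same cc.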
| | | | |
| This methodology resolves issues of clock differences between client | | This methodology resolves issues of clock differences between client | |
| and server and other scenarios where the use of CB_GETATTR breaks | | and server and other scenarios where the use of CB_GETATTR breaks | |
| down. | | down. | |
| | | | |
|
| | | It should be noted that the server is under no obligation to use | |
| | | CB_GETATTR and therefore the server MAY simply recall the delegation | |
| | | to avoid its use. | |
| | | | |
| 9.4.4. Recall of Open Delegation | | 9.4.4. Recall of Open Delegation | |
| | | | |
| The following events necessitate recall of an open delegation: | | The following events necessitate recall of an open delegation: | |
| | | | |
| o Potentially conflicting OPEN request (or READ/WRITE done with | | o Potentially conflicting OPEN request (or READ/WRITE done with | |
| "special" stateid) | | "special" stateid) | |
| | | | |
| o SETATTR issued by another client | | o SETATTR issued by another client | |
| | | | |
| o REMOVE request for the file | | o REMOVE request for the file | |
| | | | |
| skipping to change at page 102, line 53 | | skipping to change at page 106, line 4 | |
| same updates must be done whenever a client chooses to return a | | same updates must be done whenever a client chooses to return a | |
| delegation voluntarily. The following items of state need to be | | delegation voluntarily. The following items of state need to be | |
| dealt with: | | dealt with: | |
| | | | |
| o If the file associated with the delegation is no longer open and | | o If the file associated with the delegation is no longer open and | |
| no previous CLOSE operation has been sent to the server, a CLOSE | | no previous CLOSE operation has been sent to the server, a CLOSE | |
| operation must be sent to the server. | | operation must be sent to the server. | |
| | | | |
| o If a file has other open references at the client, then OPEN | | o If a file has other open references at the client, then OPEN | |
| operations must be sent to the server. The appropriate stateids | | operations must be sent to the server. The appropriate stateids | |
| will be provided by the server for subsequent use by the client | | will be provided by the server for subsequent use by the client | |
| since the delegation stateid will no longer be valid. These | | since the delegation stateid will no longer be valid. These | |
| OPEN requests are done with the claim type of | | OPEN requests are done with the claim type of | |
| CLAIM_DELEGATE_CUR. This will allow the presentation of the | | CLAIM_DELEGATE_CUR. This will allow the presentation of the | |
| delegation stateid so that the client can establish the | | delegation stateid so that the client can establish the | |
| appropriate rights to perform the OPEN. (See the section | | appropriate rights to perform the OPEN. (See the section | |
| "Operation 18: OPEN" for details.) | | "Operation 18: OPEN" for details.) | |
| | | | |
| o If there are granted file locks, the corresponding LOCK | | o If there are granted file locks, the corresponding LOCK | |
| operations need to be performed. This applies to the write open | | operations need to be performed. This applies to the write open | |
| delegation case only. | | delegation case only. | |
| | | | |
| o For a write open delegation, if at the time of recall the file | | o For a write open delegation, if at the time of recall the file | |
| is not open for write, all modified data for the file must be | | is not open for write, all modified data for the file must be | |
| | | | |
| skipping to change at page 103, line 55 | | skipping to change at page 107, line 4 | |
| constraints) make that desirable. Generally, however, the fact that | | constraints) make that desirable. Generally, however, the fact that | |
| the actual open state of the file may continue to change makes it not | | the actual open state of the file may continue to change makes it not | |
| worthwhile to send information about opens and closes to the server, | | worthwhile to send information about opens and closes to the server, | |
| except as part of delegation return. Only in the case of closing the | | except as part of delegation return. Only in the case of closing the | |
| open that resulted in obtaining the delegation would clients be | | open that resulted in obtaining the delegation would clients be | |
| likely to do this early, since, in that case, the close once done | | likely to do this early, since, in that case, the close once done | |
| will not be undone. Regardless of the client's choices on scheduling | | will not be undone. Regardless of the client's choices on scheduling | |
| these actions, all must be performed before the delegation is | | these actions, all must be performed before the delegation is | |
| returned, including (when applicable) the close that corresponds to | | returned, including (when applicable) the close that corresponds to | |
| the open that resulted in the delegation. These actions can be | | the open that resulted in the delegation. These actions can be | |
| performed either in previous requests or in previous operations in | | performed either in previous requests or in previous operations in | |
| the same COMPOUND request. | | the same COMPOUND request. | |
| | | | |
|
| | | 9.4.5. Clients that Fail to Honor Delegation Recalls | |
| | | | |
|
| 9.4.5. Delegation Revocation | | A client may fail to respond to a recall for various reasons, such as | |
| | | a failure of the callback path from server to the client. The client | |
| | | may be unaware of a failure in the callback path. This lack of | |
| | | awareness could result in the client finding out long after the | |
| | | failure that its delegation has been revoked, and another client has | |
| | | modified the data for which the client had a delegation. This is | |
| | | especially a problem for the client that held a write delegation. | |
| | | | |
| | | The server also has a dilemma in that the client that fails to | |
| | | respond to the recall might also be sending other NFS requests, | |
| | | including those that renew the lease before the lease expires. | |
| | | Without returning an error for those lease renewing operations, the | |
| | | server leads the client to believe that the delegation it has is in | |
| | | force. | |
| | | | |
| | | This difficulty is solved by the following rules: | |
| | | | |
| | | o When the callback path is down, the server MUST NOT revoke the | |
| | | delegation if one of the following occurs: | |
| | | | |
| | | - The client has issued a RENEW operation and the server has | |
| | | returned an NFS4ERR_CB_PATH_DOWN error. The server MUST renew | |
| | | the lease for any record locks and share reservations the | |
| | | client has that the server has known about (as opposed to those | |
| | | locks and share reservations the client has established but not | |
| | | yet sent to the server, due to the delegation). The server | |
| | | SHOULD give the client a reasonable time to return its | |
| | | delegations to the server before revoking the client's | |
| | | delegations. | |
| | | | |
| | | - The client has not issued a RENEW operation for some period of | |
| | | time after the server attempted to recall the delegation. This | |
| | | period of time MUST NOT be less than the value of the | |
| | | lease_time attribute. | |
| | | | |
| | | o When the client holds a delegation, it cannot rely on operations, | |
| | | except for RENEW, that take a stateid, to renew delegation leases | |
| | | across callback path failures. The client that wants to keep | |
| | | delegations in force across callback path failures must use RENEW | |
| | | to do so. | |
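The client-side rule in the last bullet can be illustrated with a small sketch. Everything here is an assumption made for illustration: the function name, the timestamps it takes, and the margin parameter (renewing at half the lease period) are hypothetical and are not specified by the protocol. The sketch only shows that, while a delegation is held, activity on other stateid-bearing operations is not counted toward lease renewal and only an explicit RENEW resets the timer.

```python
def renew_due(holds_delegation, now, last_renew, last_stateid_op,
              lease_time, margin=0.5):
    """Decide whether the client should send an explicit RENEW now.

    With a delegation held, only the time of the last RENEW counts.
    Without one, the most recent stateid-bearing operation is also
    treated as having renewed the lease. margin is a hypothetical
    safety factor: renew once this fraction of lease_time has elapsed.
    """
    if holds_delegation:
        basis = last_renew
    else:
        basis = max(last_renew, last_stateid_op)
    return (now - basis) >= lease_time * margin
```

With a 90 second lease, a last RENEW at t=40, and a READ at t=95, a client holding a delegation would send RENEW at t=100, while a client without one would not need to yet.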
| | | | |
| | | 9.4.6. Delegation Revocation | |
| | | | |
| At the point a delegation is revoked, if there are associated opens | | At the point a delegation is revoked, if there are associated opens | |
| on the client, the applications holding these opens need to be | | on the client, the applications holding these opens need to be | |
| notified. This notification usually occurs by returning errors for | | notified. This notification usually occurs by returning errors for | |
| READ/WRITE operations or when a close is attempted for the open file. | | READ/WRITE operations or when a close is attempted for the open file. | |
| | | | |
| If no opens exist for the file at the point the delegation is | | If no opens exist for the file at the point the delegation is | |
| revoked, then notification of the revocation is unnecessary. | | revoked, then notification of the revocation is unnecessary. | |
| However, if there is modified data present at the client for the | | However, if there is modified data present at the client for the | |
| file, the user of the application should be notified. Unfortunately, | | file, the user of the application should be notified. Unfortunately, | |
| it may not be possible to notify the user since active applications | | it may not be possible to notify the user since active applications | |
| may not be present at the client. See the section "Revocation | | may not be present at the client. See the section "Revocation | |
| Recovery for Write Open Delegation" for additional details. | | Recovery for Write Open Delegation" for additional details. | |
| | | | |
| skipping to change at page 105, line 4 | | skipping to change at page 108, line 53 | |
| violated. Depending on how errors are typically treated for the | | violated. Depending on how errors are typically treated for the | |
| client operating environment, further levels of notification | | client operating environment, further levels of notification | |
| including logging, console messages, and GUI pop-ups may be | | including logging, console messages, and GUI pop-ups may be | |
| appropriate. | | appropriate. | |
| | | | |
| 9.5.1. Revocation Recovery for Write Open Delegation | | 9.5.1. Revocation Recovery for Write Open Delegation | |
| | | | |
| Revocation recovery for a write open delegation poses the special | | Revocation recovery for a write open delegation poses the special | |
| issue of modified data in the client cache while the file is not | | issue of modified data in the client cache while the file is not | |
| open. In this situation, any client which does not flush modified | | open. In this situation, any client which does not flush modified | |
| data to the server on each close must ensure that the user receives | | data to the server on each close must ensure that the user receives | |
| appropriate notification of the failure as a result of the | | appropriate notification of the failure as a result of the | |
| revocation. Since such situations may require human action to | | revocation. Since such situations may require human action to | |
| correct problems, notification schemes in which the appropriate user | | correct problems, notification schemes in which the appropriate user | |
| or administrator is notified may be necessary. Logging and console | | or administrator is notified may be necessary. Logging and console | |
| messages are typical examples. | | messages are typical examples. | |
| | | | |
| If there is modified data on the client, it must not be flushed | | If there is modified data on the client, it must not be flushed | |
| normally to the server. A client may attempt to provide a copy of | | normally to the server. A client may attempt to provide a copy of | |
| the file data as modified during the delegation under a different | | the file data as modified during the delegation under a different | |
| name in the filesystem name space to ease recovery. Note that when | | name in the filesystem name space to ease recovery. Note that when | |
| the client can determine that the file has not been modified by any | | the client can determine that the file has not been modified by any | |
| other client, or when the client has a complete cached copy of the file | | other client, or when the client has a complete cached copy of the file | |
| in question, such a saved copy of the client's view of the file may | | in question, such a saved copy of the client's view of the file may | |
| | | | |
| skipping to change at page 106, line 5 | | skipping to change at page 109, line 54 | |
| cached. The exceptions to this are modifications to attributes that | | cached. The exceptions to this are modifications to attributes that | |
| are intimately connected with data caching. Therefore, extending a | | are intimately connected with data caching. Therefore, extending a | |
| file by writing data to the local data cache is reflected immediately | | file by writing data to the local data cache is reflected immediately | |
| in the size as seen on the client without this change being | | in the size as seen on the client without this change being | |
| immediately reflected on the server. Normally such changes are not | | immediately reflected on the server. Normally such changes are not | |
| propagated directly to the server but when the modified data is | | propagated directly to the server but when the modified data is | |
| flushed to the server, analogous attribute changes are made on the | | flushed to the server, analogous attribute changes are made on the | |
| server. When open delegation is in effect, the modified attributes | | server. When open delegation is in effect, the modified attributes | |
| may be returned to the server in the response to a CB_RECALL call. | | may be returned to the server in the response to a CB_RECALL call. | |
| | | | |
| The result of local caching of attributes is that the attribute | | The result of local caching of attributes is that the attribute | |
| caches maintained on individual clients will not be coherent. Changes | | caches maintained on individual clients will not be coherent. Changes | |
| made in one order on the server may be seen in a different order on | | made in one order on the server may be seen in a different order on | |
| one client and in a third order on a different client. | | one client and in a third order on a different client. | |
| | | | |
| The typical filesystem application programming interfaces do not | | The typical filesystem application programming interfaces do not | |
| provide means to atomically modify or interrogate attributes for | | provide means to atomically modify or interrogate attributes for | |
| multiple files at the same time. The following rules provide an | | multiple files at the same time. The following rules provide an | |
| environment where the potential incoherences mentioned above can be | | environment where the potential incoherences mentioned above can be | |
| reasonably managed. These rules are derived from the practice of | | reasonably managed. These rules are derived from the practice of | |
| previous NFS protocols. | | previous NFS protocols. | |
| | | | |
| o All attributes for a given file (per-fsid attributes excepted) | | o All attributes for a given file (per-fsid attributes excepted) | |
| | | | |
| skipping to change at page 107, line 4 | | skipping to change at page 110, line 56 | |
| The client may maintain a cache of modified attributes for those | | The client may maintain a cache of modified attributes for those | |
| attributes intimately connected with data of modified regular files | | attributes intimately connected with data of modified regular files | |
| (size, time_modify, and change). Other than those three attributes, | | (size, time_modify, and change). Other than those three attributes, | |
| the client MUST NOT maintain a cache of modified attributes. Instead, | | the client MUST NOT maintain a cache of modified attributes. Instead, | |
| attribute changes are immediately sent to the server. | | attribute changes are immediately sent to the server. | |
| | | | |
| In some operating environments, the equivalent to time_access is | | In some operating environments, the equivalent to time_access is | |
| expected to be implicitly updated by each read of the content of the | | expected to be implicitly updated by each read of the content of the | |
| file object. If an NFS client is caching the content of a file | | file object. If an NFS client is caching the content of a file | |
| object, whether it is a regular file, directory, or symbolic link, | | object, whether it is a regular file, directory, or symbolic link, | |
| the client SHOULD NOT update the time_access attribute (via SETATTR | | the client SHOULD NOT update the time_access attribute (via SETATTR | |
| or a small READ or READDIR request) on the server with each read that | | or a small READ or READDIR request) on the server with each read that | |
| is satisfied from cache. The reason is that this can defeat the | | is satisfied from cache. The reason is that this can defeat the | |
| performance benefits of caching content, especially since an explicit | | performance benefits of caching content, especially since an explicit | |
| SETATTR of time_access may alter the change attribute on the server. | | SETATTR of time_access may alter the change attribute on the server. | |
| If the change attribute changes, clients that are caching the content | | If the change attribute changes, clients that are caching the content | |
| will think the content has changed, and will re-read unmodified data | | will think the content has changed, and will re-read unmodified data | |
| from the server. Nor is the client encouraged to maintain a modified | | from the server. Nor is the client encouraged to maintain a modified | |
| version of time_access in its cache, since this would mean that the | | version of time_access in its cache, since this would mean that the | |
| client will either eventually have to write the access time to the | | client will either eventually have to write the access time to the | |
| server with bad performance effects, or it would never update the | | server with bad performance effects, or it would never update the | |
| server's time_access, thereby resulting in a situation where an | | server's time_access, thereby resulting in a situation where an | |
| application that caches access time between a close and open of the | | application that caches access time between a close and open of the | |
| same file observes the access time oscillating between the past and | | same file observes the access time oscillating between the past and | |
| present. The time_access attribute always means the time of last | | present. The time_access attribute always means the time of last | |
| access to a file by a read that was satisfied by the server. This way | | access to a file by a read that was satisfied by the server. This way | |
| clients will tend to see only time_access changes that go forward in | | clients will tend to see only time_access changes that go forward in | |
| time. | | time. | |
| | | | |
|
| 9.7. Name Caching | | 9.7. Data and Metadata Caching and Memory Mapped Files | |
| | | | |
| | | Some operating environments include the capability for an application | |
| | | to map a file's content into the application's address space. Each | |
| | | time the application accesses a memory location that corresponds to a | |
| | | block that has not been loaded into the address space, a page fault | |
| | | occurs and the file is read (or if the block does not exist in the | |
| | | file, the block is allocated and then instantiated in the | |
| | | application's address space). | |
| | | | |
| | | As long as each memory mapped access to the file requires a page | |
| | | fault, the relevant attributes of the file that are used to detect | |
| | | access and modification (time_access, time_metadata, time_modify, and | |
| | | change) will be updated. However, in many operating environments, | |
| | | when page faults are not required these attributes will not be | |
| | | updated on reads or updates to the file via memory access (regardless | |
| | | of whether the file is a local file or is being accessed remotely). A | |
| | | client or server MAY fail to update attributes of a file that is | |
| | | being accessed via memory mapped I/O. This has several implications: | |
| | | | |
| | | o If there is an application on the server that has memory mapped | |
| | | a file that a client is also accessing, the client may not be | |
| | | able to get a consistent value of the change attribute to | |
| | | determine whether its cache is stale or not. A server that | |
| | | knows that the file is memory mapped could always | |
| | | pessimistically return updated values for change so as to force | |
| | | the application to always get the most up to date data and | |
| | | metadata for the file. However, due to the negative performance | |
| | | implications of this, such behavior is OPTIONAL. | |
| | | | |
| | | o If the memory mapped file is not being modified on the server, | |
| | | and instead is just being read by an application via the memory | |
| | | mapped interface, the client will not see an updated time_access | |
| | | attribute. However, in many operating environments, neither | |
| | | will any process running on the server. Thus NFS clients are at | |
| | | no disadvantage with respect to local processes. | |
| | | | |
| | | o If there is another client that is memory mapping the file, and | |
| | | if that client is holding a write delegation, the same set of | |
| | | issues as discussed in the previous two bullet items apply. So, | |
| | | when a server does a CB_GETATTR to a file that the client has | |
| | | modified in its cache, the response from CB_GETATTR will not | |
| | | necessarily be accurate. As discussed earlier, the client's | |
| | | obligation is to report that the file has been modified since | |
| | | the delegation was granted, not whether it has been modified | |
| | | again between successive CB_GETATTR calls, and the server MUST | |
| | | assume that any file the client has modified in cache has been | |
| | | modified again between successive CB_GETATTR calls. Depending | |
| | | on the nature of the client's memory management system, even this | |
| | | weak obligation may not be possible to meet. A client MAY return stale | |
| | | information in CB_GETATTR whenever the file is memory mapped. | |
| | | | |
| | | o The mixture of memory mapping and file locking on the same file | |
| | | is problematic. Consider the following scenario, where a page | |
| | | size on each client is 8192 bytes. | |
| | | | |
| | | - Client A memory maps first page (8192 bytes) of file X | |
| | | | |
| | | - Client B memory maps first page (8192 bytes) of file X | |
| | | | |
| | | - Client A write locks first 4096 bytes | |
| | | | |
| | | - Client B write locks second 4096 bytes | |
| | | | |
| | | - Client A, via a STORE instruction, modifies part of its | |
| | | locked region. | |
| | | | |
| | | - Simultaneous to client A, client B issues a STORE on part | |
| | | of its locked region. | |
| | | | |
| | | Here the challenge is for each client to resynchronize to get a | |
| | | correct view of the first page. In many operating environments, | |
| | | the virtual memory management systems on each client only know a | |
| | | page is modified, not that a subset of the page corresponding to | |
| | | the respective lock regions has been modified. So it is not | |
| | | possible for each client to do the right thing, which is to only | |
| | | write to the server that portion of the page that is locked. | |
| | | For example, if client A simply writes out the page, and then | |
| | | client B writes out the page, client A's data is lost. | |
| | | | |
| | | Moreover, if mandatory locking is enabled on the file, then we | |
| | | have a different problem. When clients A and B issue the STORE | |
| | | instructions, the resulting page faults require a record lock on | |
| | | the entire page. Each client then tries to extend its locked | |
| | | range to the entire page, which results in a deadlock. | |
| | | Communicating the NFS4ERR_DEADLOCK error to a STORE instruction | |
| | | is difficult at best. | |
| | | | |
| | | If a client is locking the entire memory mapped file, there is | |
| | | no problem with advisory or mandatory record locking, at least | |
| | | until the client unlocks a region in the middle of the file. | |
| | | | |
| | | Given the above issues the following are permitted: | |
| | | | |
| | | - Clients and servers MAY deny memory mapping a file they | |
| | | know there are record locks for. | |
| | | | |
| | | - Clients and servers MAY deny a record lock on a file they | |
| | | know is memory mapped. | |
| | | | |
| | | - A client MAY deny memory mapping a file that it knows | |
| | | requires mandatory locking for I/O. If mandatory locking | |
| | | is enabled after the file is opened and mapped, the client | |
| | | MAY deny the application further access to its mapped file. | |
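The lost-update scenario in the memory mapping and locking discussion above can be reproduced with a small simulation. This is a toy model, not client code: the server file and each client's mapped page are plain byte arrays, and the helper names are hypothetical. It contrasts flushing whole pages (what a virtual memory system that only tracks dirty pages would do) with flushing only the locked byte ranges.

```python
PAGE = 8192   # page size on each client, as in the scenario
HALF = 4096   # each client locks one half of the first page

def flush_whole_page(server, page):
    """Write the client's entire cached page back to the server."""
    server[0:PAGE] = page

def flush_range(server, page, start, end):
    """Write back only the byte range the client holds a lock on."""
    server[start:end] = page[start:end]

def run(flush_locked_only):
    server = bytearray(PAGE)
    page_a = bytearray(server)   # client A maps the first page
    page_b = bytearray(server)   # client B maps the first page
    page_a[0] = 0xAA             # A stores into its locked range [0, 4096)
    page_b[HALF] = 0xBB          # B stores into its locked range [4096, 8192)
    if flush_locked_only:
        flush_range(server, page_a, 0, HALF)
        flush_range(server, page_b, HALF, PAGE)
    else:
        flush_whole_page(server, page_a)
        flush_whole_page(server, page_b)   # overwrites A's half with stale bytes
    return server[0], server[HALF]
```

Calling run(False) leaves the first byte at zero, so client A's store is lost, while run(True) preserves both stores. The second mode requires the client to know which part of the page was modified, which is exactly the information many virtual memory systems do not keep.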
| | | | |
| | | 9.8. Name Caching | |
| | | | |
| The results of LOOKUP and READDIR operations may be cached to avoid | | The results of LOOKUP and READDIR operations may be cached to avoid | |
| the cost of subsequent LOOKUP operations. Just as in the case of | | the cost of subsequent LOOKUP operations. Just as in the case of | |
| attribute caching, inconsistencies may arise among the various client | | attribute caching, inconsistencies may arise among the various client | |
| caches. To mitigate the effects of these inconsistencies and given | | caches. To mitigate the effects of these inconsistencies and given | |
| the context of typical filesystem APIs, an upper time boundary is | | the context of typical filesystem APIs, an upper time boundary is | |
| maintained on how long a client name cache entry can be kept without | | maintained on how long a client name cache entry can be kept without | |
| verifying that the entry has not been made invalid by a directory | | verifying that the entry has not been made invalid by a directory | |
| change operation performed by another client. | | change operation performed by another client. | |
| | | | |
| | | | |
| skipping to change at page 107, line 55 | | skipping to change at page 114, line 4 | |
| determine whether there have been changes made to the directory by | | determine whether there have been changes made to the directory by | |
| other clients. It does this by using the change attribute as | | other clients. It does this by using the change attribute as | |
| reported before and after the directory operation in the associated | | reported before and after the directory operation in the associated | |
| change_info4 value returned for the operation. The server is able to | | change_info4 value returned for the operation. The server is able to | |
| communicate to the client whether the change_info4 data is provided | | communicate to the client whether the change_info4 data is provided | |
| atomically with respect to the directory operation. If the change | | atomically with respect to the directory operation. If the change | |
| values are provided atomically, the client is then able to compare | | values are provided atomically, the client is then able to compare | |
| the pre-operation change value with the change value in the client's | | the pre-operation change value with the change value in the client's | |
| name cache. If the comparison indicates that the directory was | | name cache. If the comparison indicates that the directory was | |
| updated by another client, the name cache associated with the | | updated by another client, the name cache associated with the | |
| modified directory is purged from the client. If the comparison | | modified directory is purged from the client. If the comparison | |
| indicates no modification, the name cache can be updated on the | | indicates no modification, the name cache can be updated on the | |
| client to reflect the directory operation and the associated timeout | | client to reflect the directory operation and the associated timeout | |
| extended. The post-operation change value needs to be saved as the | | extended. The post-operation change value needs to be saved as the | |
| basis for future change_info4 comparisons. | | basis for future change_info4 comparisons. | |
| | | | |
| As demonstrated by the scenario above, name caching requires that the | | As demonstrated by the scenario above, name caching requires that the | |
| client revalidate name cache data by inspecting the change attribute | | client revalidate name cache data by inspecting the change attribute | |
| of a directory at the point when the name cache item was cached. | | of a directory at the point when the name cache item was cached. | |
| This requires that the server update the change attribute for | | This requires that the server update the change attribute for | |
| directories when the contents of the corresponding directory are | | directories when the contents of the corresponding directory are | |
| modified. For a client to use the change_info4 information | | modified. For a client to use the change_info4 information | |
| appropriately and correctly, the server must report the pre and post | | appropriately and correctly, the server must report the pre and post | |
| operation change attribute values atomically. When the server is | | operation change attribute values atomically. When the server is | |
| unable to report the before and after values atomically with respect | | unable to report the before and after values atomically with respect | |
| to the directory operation, the server must indicate that fact in the | | to the directory operation, the server must indicate that fact in the | |
| change_info4 return value. When the information is not atomically | | change_info4 return value. When the information is not atomically | |
| reported, the client should not assume that other clients have not | | reported, the client should not assume that other clients have not | |
| changed the directory. | | changed the directory. | |
| | | | |
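The name cache handling described above can be sketched as follows. This is a minimal illustration, not text from either draft: the names DirNameCache and apply_dir_op are hypothetical, and only the atomic, before, and after fields are taken from the change_info4 value. Purging the cache when the values are not reported atomically is a conservative assumption made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeInfo4:
    atomic: bool   # True if before/after were obtained atomically with the operation
    before: int    # directory change attribute before the operation
    after: int     # directory change attribute after the operation


@dataclass
class DirNameCache:            # hypothetical client-side structure
    change: int                # change value saved when the cache was last known good
    entries: dict = field(default_factory=dict)
    valid: bool = True


def apply_dir_op(cache, cinfo, name, filehandle):
    """Fold the client's own directory operation (for example CREATE)
    into its name cache, or purge the cache if another client may have
    modified the directory. Returns True if the cache was kept."""
    if cinfo.atomic and cache.valid and cinfo.before == cache.change:
        # No other client changed the directory: update in place and
        # save the post-operation value for future comparisons.
        cache.entries[name] = filehandle
        cache.change = cinfo.after
        return True
    # Either the values are not atomic or the pre-operation value does
    # not match what was cached: do not trust the cached names.
    cache.entries.clear()
    cache.valid = False
    return False
```

A real client would also extend the cache timeout on the success path; that bookkeeping is omitted here.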
| 9.8. Directory Caching | | 9.9. Directory Caching | |
| | | | |
| The results of READDIR operations may be used to avoid subsequent | | The results of READDIR operations may be used to avoid subsequent | |
| READDIR operations. Just as in the cases of attribute and name | | READDIR operations. Just as in the cases of attribute and name | |
| caching, inconsistencies may arise among the various client caches. | | caching, inconsistencies may arise among the various client caches. | |
| To mitigate the effects of these inconsistencies, and given the | | To mitigate the effects of these inconsistencies, and given the | |
| context of typical filesystem APIs, the following rules should be | | context of typical filesystem APIs, the following rules should be | |
| followed: | | followed: | |
| | | | |
| o Cached READDIR information for a directory which is not obtained | | o Cached READDIR information for a directory which is not obtained | |
| in a single READDIR operation must always be a consistent | | in a single READDIR operation must always be a consistent | |
| | | | |
| skipping to change at page 108, line 55 | | skipping to change at page 115, line 4 | |
| question, checking the change attribute of the directory with GETATTR | | question, checking the change attribute of the directory with GETATTR | |
| is adequate. The lifetime of the cache entry can be extended at | | is adequate. The lifetime of the cache entry can be extended at | |
| these checkpoints. When a client is modifying the directory, the | | these checkpoints. When a client is modifying the directory, the | |
| client needs to use the change_info4 data to determine whether there | | client needs to use the change_info4 data to determine whether there | |
| are other clients modifying the directory. If it is determined that | | are other clients modifying the directory. If it is determined that | |
| no other client modifications are occurring, the client may update | | no other client modifications are occurring, the client may update | |
| its directory cache to reflect its own changes. | | its directory cache to reflect its own changes. | |
| | | | |
| As demonstrated previously, directory caching requires that the | | As demonstrated previously, directory caching requires that the | |
| client revalidate directory cache data by inspecting the change | | client revalidate directory cache data by inspecting the change | |
| attribute of a directory at the point when the directory was cached. | | attribute of a directory at the point when the directory was cached. | |
| This requires that the server update the change attribute for | | This requires that the server update the change attribute for | |
| directories when the contents of the corresponding directory are | | directories when the contents of the corresponding directory are | |
| modified. For a client to use the change_info4 information | | modified. For a client to use the change_info4 information | |
| appropriately and correctly, the server must report the pre and post | | appropriately and correctly, the server must report the pre and post | |
| operation change attribute values atomically. When the server is | | operation change attribute values atomically. When the server is | |
| unable to report the before and after values atomically with respect | | unable to report the before and after values atomically with respect | |
| to the directory operation, the server must indicate that fact in the | | to the directory operation, the server must indicate that fact in the | |
| change_info4 return value. When the information is not atomically | | change_info4 return value. When the information is not atomically | |
| reported, the client should not assume that other clients have not | | reported, the client should not assume that other clients have not | |
| changed the directory. | | changed the directory. | |
| | | | |
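The GETATTR-based revalidation of a directory cache can be sketched in the same spirit. The DirCache structure, the revalidate function, and the timeout parameter are hypothetical names chosen for illustration; the drafts only require that the cached change attribute be compared with the current one.

```python
from dataclasses import dataclass, field


@dataclass
class DirCache:                # hypothetical client-side structure
    change: int                # change attribute seen when READDIR results were cached
    expires: float             # time after which the entries must be revalidated
    entries: list = field(default_factory=list)


def revalidate(cache, server_change, now, timeout):
    """Compare the cached change attribute with the value just returned
    by GETATTR. If they match, extend the cache lifetime; otherwise
    discard the cached READDIR results. Returns True if still valid."""
    if server_change == cache.change:
        cache.expires = now + timeout
        return True
    cache.entries.clear()
    return False
```

When the client itself modifies the directory, it would rely on the change_info4 comparison rather than on a separate GETATTR.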
| | | | |
| 10. Minor Versioning | | 10. Minor Versioning | |
| | | | |
| To address the requirement of an NFS protocol that can evolve as the | | To address the requirement of an NFS protocol that can evolve as the | |
| need arises, the NFS version 4 protocol contains the rules and | | need arises, the NFS version 4 protocol contains the rules and | |
| framework to allow for future minor changes or versioning. | | framework to allow for future minor changes or versioning. | |
| | | | |
| The base assumption with respect to minor versioning is that any | | The base assumption with respect to minor versioning is that any | |
| future accepted minor version must follow the IETF process and be | | future accepted minor version must follow the IETF process and be | |
| documented in a standards track RFC. Therefore, each minor version | | documented in a standards track RFC. Therefore, each minor version | |
| | | | |
| skipping to change at page 111, line 5 | | skipping to change at page 117, line 5 | |
| documented attribute. | | documented attribute. | |
| | | | |
| Since attribute results are specified as an opaque array of | | Since attribute results are specified as an opaque array of | |
| per-attribute XDR encoded results, the complexity of adding new | | per-attribute XDR encoded results, the complexity of adding new | |
| attributes in the midst of the current definitions will be too | | attributes in the midst of the current definitions will be too | |
| burdensome. | | burdensome. | |
| | | | |
| 3 Minor versions must not modify the structure of an existing | | 3 Minor versions must not modify the structure of an existing | |
| operation's arguments or results. | | operation's arguments or results. | |
| | | | |
| Again the complexity of handling multiple structure definitions | | Again the complexity of handling multiple structure definitions | |
| for a single operation is too burdensome. New operations should | | for a single operation is too burdensome. New operations should | |
| be added instead of modifying existing structures for a minor | | be added instead of modifying existing structures for a minor | |
| version. | | version. | |
| | | | |
| This rule does not preclude the following adaptations in a minor | | This rule does not preclude the following adaptations in a minor | |
| version. | | version. | |
| | | | |
| o adding bits to flag fields such as new attributes to | | o adding bits to flag fields such as new attributes to | |
| | | | |
| skipping to change at page 112, line 5 | | skipping to change at page 118, line 5 | |
| the request as an XDR decode error. This approach allows for | | the request as an XDR decode error. This approach allows for | |
| the obsolescence of an operation while maintaining its structure | | the obsolescence of an operation while maintaining its structure | |
| so that a future minor version can reintroduce the operation. | | so that a future minor version can reintroduce the operation. | |
| | | | |
| 8.1 Minor versions may declare attributes mandatory to NOT | | 8.1 Minor versions may declare attributes mandatory to NOT | |
| implement. | | implement. | |
| | | | |
| 8.2 Minor versions may declare flag bits or enumeration values as | | 8.2 Minor versions may declare flag bits or enumeration values as | |
| mandatory to NOT implement. | | mandatory to NOT implement. | |
| | | | |
| 9 Minor versions may downgrade features from mandatory to | | 9 Minor versions may downgrade features from mandatory to | |
| recommended, or recommended to optional. | | recommended, or recommended to optional. | |
| | | | |
| 10 Minor versions may upgrade features from optional to recommended | | 10 Minor versions may upgrade features from optional to recommended | |
| or recommended to mandatory. | | or recommended to mandatory. | |
| | | | |
| 11 A client and server that support minor version X must support | | 11 A client and server that support minor version X must support | |
| minor versions 0 (zero) through X-1 as well. | | minor versions 0 (zero) through X-1 as well. | |
| | | | |
| | | | |
| skipping to change at page 113, line 5 | | skipping to change at page 119, line 5 | |
| | | | |
| This rule allows for the introduction of new functionality and | | This rule allows for the introduction of new functionality and | |
| forces the use of implementation experience before designating a | | forces the use of implementation experience before designating a | |
| feature as mandatory. | | feature as mandatory. | |
| | | | |
| 13 A client MUST NOT attempt to use a stateid, filehandle, or | | 13 A client MUST NOT attempt to use a stateid, filehandle, or | |
| similar returned object from the COMPOUND procedure with minor | | similar returned object from the COMPOUND procedure with minor | |
| version X for another COMPOUND procedure with minor version Y, | | version X for another COMPOUND procedure with minor version Y, | |
| where X != Y. | | where X != Y. | |
| | | | |
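Rules 11 and 13 lend themselves to a small client-side guard. The sketch below is an assumption about how an implementation might enforce them, not something either draft prescribes: TaggedObject, usable_in, and required_minor_versions are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedObject:
    """A stateid, filehandle, or similar object, tagged with the minor
    version of the COMPOUND procedure that returned it."""
    minor_version: int
    value: bytes


def usable_in(obj, compound_minor_version):
    # Rule 13: an object obtained under minor version X must not be
    # used in a COMPOUND with minor version Y where X != Y.
    return obj.minor_version == compound_minor_version


def required_minor_versions(x):
    # Rule 11: supporting minor version X implies supporting
    # minor versions 0 through X-1 as well.
    return list(range(0, x + 1))
```

Tagging each object at the point where it is received keeps the check local to the code that builds a COMPOUND request.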
| | | | |
| 11. Internationalization | | 11. Internationalization | |
| | | | |
| The primary area in which NFS needs to deal with | | The primary area in which NFS needs to deal with | |
| internationalization, or I18N, is with respect to file names and | | internationalization, or I18N, is with respect to file names and | |
| other strings as used within the protocol. The choice of string | | other strings as used within the protocol. The choice of string | |
| representation must allow reasonable name/string access to clients | | representation must allow reasonable name/string access to clients | |
| which use various languages. The UTF-8 encoding of the UCS as | | which use various languages. The UTF-8 encoding of the UCS as | |
| defined by [ISO10646] allows for this type of access and follows the | | defined by [ISO10646] allows for this type of access and follows the | |
| policy described in "IETF Policy on Character Sets and Languages", | | policy described in "IETF Policy on Character Sets and Languages", | |
| | | | |
| skipping to change at page 114, line 5 | | skipping to change at page 120, line 5 | |
| could be understood by all clients and servers, and maintaining them | | could be understood by all clients and servers, and maintaining them | |
| in the face of changes would be considerable. A better solution is | | in the face of changes would be considerable. A better solution is | |
| desirable. | | desirable. | |
| | | | |
| If the NFS version 4 protocol used a universal 16 bit or 32 bit | | If the NFS version 4 protocol used a universal 16 bit or 32 bit | |
| character set (or an encoding of a 16 bit or 32 bit character set | | character set (or an encoding of a 16 bit or 32 bit character set | |
| into octets), then the server and client need not care if the locale | | into octets), then the server and client need not care if the locale | |
| of the user accessing the file is different than the locale of the | | of the user accessing the file is different than the locale of the | |
| user who created the file. The unique 16 bit or 32 bit encoding of | | user who created the file. The unique 16 bit or 32 bit encoding of | |
| the character allows for determination of what language the character | | the character allows for determination of what language the character | |
| is from and also how to display that character on the client. The | | is from and also how to display that character on the client. The | |
| server need not know what locales are used. | | server need not know what locales are used. | |
| | | | |
| 11.2. Overview of Universal Character Set Standards | | 11.2. Overview of Universal Character Set Standards | |
| | | | |
| The previous section makes a case for using a universal character | | The previous section makes a case for using a universal character | |
| set. This section makes the case for using UTF-8 as the specific | | set. This section makes the case for using UTF-8 as the specific | |
| universal character set for the NFS version 4 protocol. | | universal character set for the NFS version 4 protocol. | |
| | | | |
| skipping to change at page 115, line 5 | | skipping to change at page 121, line 5 | |
| encoding of UCS characters as described below. | | encoding of UCS characters as described below. | |
| | | | |
| UTF-1 Only historical interest; it has been removed from 10646-1 | | UTF-1 Only historical interest; it has been removed from 10646-1 | |
| | | | |
| UTF-7 Encodes the entire "repertoire" of UCS "characters using | | UTF-7 Encodes the entire "repertoire" of UCS "characters using | |
| only octets with the higher order bit clear". [RFC2152] | | only octets with the higher order bit clear". [RFC2152] | |
| describes UTF-7. UTF-7 accomplishes this by reserving one | | describes UTF-7. UTF-7 accomplishes this by reserving one | |
| of the 7bit US-ASCII characters as a "shift" character to | | of the 7bit US-ASCII characters as a "shift" character to | |
| indicate non-US-ASCII characters. | | indicate non-US-ASCII characters. | |
| | | | |
| UTF-8 Unlike UTF-7, uses all 8 bits of the octets. US-ASCII | | UTF-8 Unlike UTF-7, uses all 8 bits of the octets. US-ASCII | |
| characters are encoded as before unchanged. Any octet with | | characters are encoded as before unchanged. Any octet with | |
| the high bit cleared can only mean a US-ASCII character. | | the high bit cleared can only mean a US-ASCII character. | |
| The high bit set means that a UCS character is being | | The high bit set means that a UCS character is being | |
| encoded. | | encoded. | |
| | | | |
| UTF-16 Encodes UCS-4 characters into UCS-2 characters using a | | UTF-16 Encodes UCS-4 characters into UCS-2 characters using a | |
| reserved range in UCS-2. | | reserved range in UCS-2. | |
| | | | |
| | | | |
| skipping to change at page 116, line 5 | | skipping to change at page 122, line 5 | |
| 0000 0080-0000 07FF 110xxxxx 10xxxxxx | | 0000 0080-0000 07FF 110xxxxx 10xxxxxx | |
| 0000 0800-0000 FFFF 1110xxxx 10xxxxxx 10xxxxxx | | 0000 0800-0000 FFFF 1110xxxx 10xxxxxx 10xxxxxx | |
| | | | |
| 0001 0000-001F FFFF 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx | | 0001 0000-001F FFFF 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx | |
| 0020 0000-03FF FFFF 111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx | | 0020 0000-03FF FFFF 111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx | |
| 0400 0000-7FFF FFFF 1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx | | 0400 0000-7FFF FFFF 1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx | |
| 10xxxxxx | | 10xxxxxx | |
| | | | |
| See [RFC2279] for precise encoding and decoding rules. Note because | | See [RFC2279] for precise encoding and decoding rules. Note because | |
| of UTF-16, the algorithm from Unicode/UCS-2 to UTF-8 needs to account | | of UTF-16, the algorithm from Unicode/UCS-2 to UTF-8 needs to account | |
| for the reserved range between D800 and DFFF. | | for the reserved range between D800 and DFFF. | |
| | | | |
| Note that the 16 bit UCS or Unicode characters require no more than 3 | | Note that the 16 bit UCS or Unicode characters require no more than 3 | |
| octets to encode into UTF-8. | | octets to encode into UTF-8. | |
| | | | |
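The octet counts implied by the encoding table can be checked with a short function. This sketch follows the one-to-six octet ranges of [RFC2279]; the function name utf8_octet_count is hypothetical. Note that Python's built-in encoder is stricter than this table: it stops at U+10FFFF and rejects the D800-DFFF range.

```python
def utf8_octet_count(code_point):
    """Number of octets needed to encode a UCS-4 value using the
    range table of RFC 2279 (up to 31 bits)."""
    if code_point < 0 or code_point > 0x7FFFFFFF:
        raise ValueError("value does not fit in 31 bits")
    if code_point <= 0x7F:
        return 1          # 0xxxxxxx
    if code_point <= 0x7FF:
        return 2          # 110xxxxx 10xxxxxx
    if code_point <= 0xFFFF:
        return 3          # 1110xxxx 10xxxxxx 10xxxxxx
    if code_point <= 0x1FFFFF:
        return 4
    if code_point <= 0x3FFFFFF:
        return 5
    return 6
```

Every 16 bit value falls at or below 0xFFFF, which is why no such character needs more than 3 octets.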
| Interestingly, UTF-8 has room to handle characters larger than 31 | | Interestingly, UTF-8 has room to handle characters larger than 31 | |
| bits, because the leading octet of form: | | bits, because the leading octet of form: | |
| | | | |
| | | | |
| skipping to change at page 117, line 5 | | skipping to change at page 123, line 5 | |
| | | | |
| 11.6. UTF-8 Related Errors | | 11.6. UTF-8 Related Errors | |
| | | | |
| Where the client sends an invalid UTF-8 string, the server should | | Where the client sends an invalid UTF-8 string, the server should | |
| return an NFS4ERR_INVAL error. This includes cases in which | | return an NFS4ERR_INVAL error. This includes cases in which | |
| inappropriate prefixes are detected and where the count includes | | inappropriate prefixes are detected and where the count includes | |
| trailing bytes that do not constitute a full UCS character. | | trailing bytes that do not constitute a full UCS character. | |
| | | | |
| Where the client supplied string is valid UTF-8 but contains | | Where the client supplied string is valid UTF-8 but contains | |
| characters that are not supported by the server as a value for that | | characters that are not supported by the server as a value for that | |
| string (e.g. names containing characters that have more than two | | string (e.g. names containing characters that have more than two | |
| octets on a filesystem that supports Unicode characters only), the | | octets on a filesystem that supports Unicode characters only), the | |
| server should return an NFS4ERR_BADCHAR error. | | server should return an NFS4ERR_BADCHAR error. | |
| | | | |
| Where a UTF-8 string is used as a file name, and the filesystem, | | Where a UTF-8 string is used as a file name, and the filesystem, | |
| while supporting all of the characters within the name, does not | | while supporting all of the characters within the name, does not | |
| allow that particular name to be used, the server should return the | | allow that particular name to be used, the server should return the | |
| error NFS4ERR_BADNAME. This includes situations in which the server | | error NFS4ERR_BADNAME. This includes situations in which the server | |
| filesystem imposes a normalization constraint on name strings, but | | filesystem imposes a normalization constraint on name strings, but | |
| will also include such situations as filesystem prohibitions of "." | | will also include such situations as filesystem prohibitions of "." | |
| and ".." as file names for certain operations, and other such | | and ".." as file names for certain operations, and other such | |
| constraints. | | constraints. | |
| | | | |
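The three error cases above can be told apart with a check like the following. This is only a sketch under stated assumptions: check_name is a hypothetical function, the 16 bit limit stands in for a filesystem that supports only Unicode characters, and the forbidden-name set is one example of a filesystem constraint. Python's strict decoder is used as a stand-in for UTF-8 validation and is somewhat stricter than [RFC2279].

```python
def check_name(raw, max_code_point=0xFFFF, forbidden=(".", "..")):
    """Classify a client-supplied name as a server might.
    Returns the name of the status code as a string."""
    try:
        name = raw.decode("utf-8")      # rejects bad prefixes and truncated sequences
    except UnicodeDecodeError:
        return "NFS4ERR_INVAL"
    if any(ord(ch) > max_code_point for ch in name):
        # Valid UTF-8, but a character this filesystem cannot store.
        return "NFS4ERR_BADCHAR"
    if name in forbidden:
        # All characters are supported, but the name itself is not allowed.
        return "NFS4ERR_BADNAME"
    return "NFS4_OK"
```

The order of the checks matters: a string must be valid UTF-8 before its characters can be examined, and its characters must be supported before the name as a whole is judged.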
| | | | |
| 12. Error Definitions | | 12. Error Definitions | |
| | | | |
| NFS error numbers are assigned to failed operations within a compound | | NFS error numbers are assigned to failed operations within a compound | |
| request. A compound request contains a number of NFS operations that | | request. A compound request contains a number of NFS operations that | |
| have their results encoded in sequence in a compound reply. The | | have their results encoded in sequence in a compound reply. The | |
| results of successful operations will consist of an NFS4_OK status | | |