| draft-ietf-nfsv4-rfc3010bis-05.txt | | rfc3530.txt | |
| | | | |
|
| NFS version 4 Working Group S. Shepler | | Network Working Group S. Shepler | |
| INTERNET-DRAFT Sun Microsystems, Inc. | | Request for Comments: 3530 B. Callaghan | |
| Obsoletes: 3010 C. Beame | | Obsoletes: 3010 D. Robinson | |
| Document: draft-ietf-nfsv4-rfc3010bis-05.txt Hummingbird Ltd. | | Category: Standards Track R. Thurlow | |
| B. Callaghan | | | |
| Sun Microsystems, Inc. | | Sun Microsystems, Inc. | |
|
| | | C. Beame | |
| | | Hummingbird Ltd. | |
| M. Eisler | | M. Eisler | |
|
| Network Appliance, Inc. | | | |
| D. Noveck | | D. Noveck | |
| Network Appliance, Inc. | | Network Appliance, Inc. | |
|
| D. Robinson | | April 2003 | |
| Sun Microsystems, Inc. | | | |
| R. Thurlow | | | |
| Sun Microsystems, Inc. | | | |
| November 2002 | | | |
| | | | |
|
| NFS version 4 Protocol | | Network File System (NFS) version 4 Protocol | |
| | | | |
| Status of this Memo | | Status of this Memo | |
| | | | |
|
| This document is an Internet-Draft and is in full conformance with | | This document specifies an Internet standards track protocol for the | |
| all provisions of Section 10 of RFC2026. | | Internet community, and requests discussion and suggestions for | |
| | | improvements. Please refer to the current edition of the "Internet | |
| Internet-Drafts are working documents of the Internet Engineering | | Official Protocol Standards" (STD 1) for the standardization state | |
| Task Force (IETF), its areas, and its working groups. Note that | | and status of this protocol. Distribution of this memo is unlimited. | |
| other groups may also distribute working documents as Internet- | | | |
| Drafts. | | | |
| | | | |
| Internet-Drafts are draft documents valid for a maximum of six months | | | |
| and may be updated, replaced, or obsoleted by other documents at any | | | |
| time. It is inappropriate to use Internet-Drafts as reference | | | |
| material or to cite them other than as "work in progress." | | | |
| | | | |
|
| The list of current Internet-Drafts can be accessed at | | Copyright Notice | |
| http://www.ietf.org/ietf/1id-abstracts.txt | | | |
| | | | |
|
| The list of Internet-Draft Shadow Directories can be accessed at | | Copyright (C) The Internet Society (2003). All Rights Reserved. | |
| http://www.ietf.org/shadow.html. | | | |
| | | | |
| Abstract | | Abstract | |
| | | | |
|
| This document replaces [RFC3010] as the definition of the NFS version | | The Network File System (NFS) version 4 is a distributed filesystem | |
| 4 protocol. | | protocol which owes heritage to NFS protocol version 2, RFC 1094, and | |
| | | version 3, RFC 1813. Unlike earlier versions, the NFS version 4 | |
| Draft Specification NFS version 4 Protocol November 2002 | | protocol supports traditional file access while integrating support | |
| | | for file locking and the mount protocol. In addition, support for | |
| NFS version 4 is a distributed filesystem protocol which owes | | strong security (and its negotiation), compound operations, client | |
| heritage to NFS protocol versions 2 [RFC1094] and 3 [RFC1813]. | | caching, and internationalization have been added. Of course, | |
| Unlike earlier versions, the NFS version 4 protocol supports | | attention has been applied to making NFS version 4 operate well in an | |
| traditional file access while integrating support for file locking | | Internet environment. | |
| and the mount protocol. In addition, support for strong security | | | |
| (and its negotiation), compound operations, client caching, and | | | |
| internationalization have been added. Of course, attention has been | | | |
| applied to making NFS version 4 operate well in an Internet | | | |
| environment. | | | |
| | | | |
| Copyright | | | |
| | | | |
|
| Copyright (C) The Internet Society (2000-2002). All Rights Reserved. | | This document replaces RFC 3010 as the definition of the NFS version | |
| | | 4 protocol. | |
| | | | |
| Key Words | | Key Words | |
| | | | |
| The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |
| "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | |
| document are to be interpreted as described in [RFC2119]. | | document are to be interpreted as described in [RFC2119]. | |
| | | | |
|
| | | | |
| Table of Contents | | Table of Contents | |
| | | | |
|
| 1. Changes since RFC3010 . . . . . . . . . . . . . . . . . . . . 8 | | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . 8 | |
| 1.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . 9 | | 1.1. Changes since RFC 3010 . . . . . . . . . . . . . . . 8 | |
| 1.2. Inconsistencies of this Document with Section 18 . . . . . 9 | | 1.2. NFS version 4 Goals. . . . . . . . . . . . . . . . . 9 | |
| 1.3. Overview of NFS version 4 Features . . . . . . . . . . . 10 | | 1.3. Inconsistencies of this Document with Section 18 . . 9 | |
| 1.3.1. RPC and Security . . . . . . . . . . . . . . . . . . . 10 | | 1.4. Overview of NFS version 4 Features . . . . . . . . . 10 | |
| 1.3.2. Procedure and Operation Structure . . . . . . . . . . . 10 | | 1.4.1. RPC and Security . . . . . . . . . . . . . . 10 | |
| 1.3.3. Filesystem Model . . . . . . . . . . . . . . . . . . . 11 | | 1.4.2. Procedure and Operation Structure. . . . . . 10 | |
| 1.3.3.1. Filehandle Types . . . . . . . . . . . . . . . . . . 11 | | 1.4.3. Filesystem Mode. . . . . . . . . . . . . . . 11 | |
| 1.3.3.2. Attribute Types . . . . . . . . . . . . . . . . . . . 12 | | 1.4.3.1. Filehandle Types . . . . . . . . . 11 | |
| 1.3.3.3. Filesystem Replication and Migration . . . . . . . . 12 | | 1.4.3.2. Attribute Types. . . . . . . . . . 12 | |
| 1.3.4. OPEN and CLOSE . . . . . . . . . . . . . . . . . . . . 13 | | 1.4.3.3. Filesystem Replication and | |
| 1.3.5. File locking . . . . . . . . . . . . . . . . . . . . . 13 | | Migration. . . . . . . . . . . . . 13 | |
| 1.3.6. Client Caching and Delegation . . . . . . . . . . . . . 13 | | 1.4.4. OPEN and CLOSE . . . . . . . . . . . . . . . 13 | |
| 1.4. General Definitions . . . . . . . . . . . . . . . . . . . 14 | | 1.4.5. File locking . . . . . . . . . . . . . . . . 13 | |
| | | 1.4.6. Client Caching and Delegation. . . . . . . . 13 | |
| | | 1.5. General Definitions. . . . . . . . . . . . . . . . . 14 | |
| 2. Protocol Data Types . . . . . . . . . . . . . . . . . . . . 16 | | 2. Protocol Data Types . . . . . . . . . . . . . . . . . . . . 16 | |
|
| 2.1. Basic Data Types . . . . . . . . . . . . . . . . . . . . 16 | | 2.1. Basic Data Types . . . . . . . . . . . . . . . . . . 16 | |
| 2.2. Structured Data Types . . . . . . . . . . . . . . . . . . 17 | | 2.2. Structured Data Types. . . . . . . . . . . . . . . . 18 | |
| 3. RPC and Security Flavor . . . . . . . . . . . . . . . . . . 23 | | 3. RPC and Security Flavor . . . . . . . . . . . . . . . . . . 23 | |
|
| 3.1. Ports and Transports . . . . . . . . . . . . . . . . . . 23 | | 3.1. Ports and Transports . . . . . . . . . . . . . . . . 23 | |
| 3.1.1. Client Retransmission Behavior . . . . . . . . . . . . 24 | | 3.1.1. Client Retransmission Behavior . . . . . . . 24 | |
| 3.2. Security Flavors . . . . . . . . . . . . . . . . . . . . 24 | | 3.2. Security Flavors . . . . . . . . . . . . . . . . . . 25 | |
| 3.2.1. Security mechanisms for NFS version 4 . . . . . . . . . 24 | | 3.2.1. Security mechanisms for NFS version 4. . . . 25 | |
| 3.2.1.1. Kerberos V5 as a security triple . . . . . . . . . . 25 | | 3.2.1.1. Kerberos V5 as a security triple . 25 | |
| 3.2.1.2. LIPKEY as a security triple . . . . . . . . . . . . . 25 | | 3.2.1.2. LIPKEY as a security triple. . . . 26 | |
| 3.2.1.3. SPKM-3 as a security triple . . . . . . . . . . . . . 26 | | 3.2.1.3. SPKM-3 as a security triple. . . . 27 | |
| 3.3. Security Negotiation . . . . . . . . . . . . . . . . . . 27 | | 3.3. Security Negotiation . . . . . . . . . . . . . . . . 27 | |
| 3.3.1. SECINFO . . . . . . . . . . . . . . . . . . . . . . . . 27 | | 3.3.1. SECINFO. . . . . . . . . . . . . . . . . . . 28 | |
| 3.3.2. Security Error . . . . . . . . . . . . . . . . . . . . 27 | | 3.3.2. Security Error . . . . . . . . . . . . . . . 28 | |
| 3.4. Callback RPC Authentication . . . . . . . . . . . . . . . 28 | | 3.4. Callback RPC Authentication. . . . . . . . . . . . . 28 | |
| 4. Filehandles . . . . . . . . . . . . . . . . . . . . . . . . 30 | | 4. Filehandles . . . . . . . . . . . . . . . . . . . . . . . . 30 | |
|
| 4.1. Obtaining the First Filehandle . . . . . . . . . . . . . 30 | | 4.1. Obtaining the First Filehandle . . . . . . . . . . . 30 | |
| 4.1.1. Root Filehandle . . . . . . . . . . . . . . . . . . . . 30 | | 4.1.1. Root Filehandle. . . . . . . . . . . . . . . 31 | |
| 4.1.2. Public Filehandle . . . . . . . . . . . . . . . . . . . 30 | | 4.1.2. Public Filehandle. . . . . . . . . . . . . . 31 | |
| 4.2. Filehandle Types . . . . . . . . . . . . . . . . . . . . 31 | | 4.2. Filehandle Types . . . . . . . . . . . . . . . . . . 31 | |
| 4.2.1. General Properties of a Filehandle . . . . . . . . . . 31 | | 4.2.1. General Properties of a Filehandle . . . . . 32 | |
| 4.2.2. Persistent Filehandle . . . . . . . . . . . . . . . . . 32 | | 4.2.2. Persistent Filehandle. . . . . . . . . . . . 32 | |
| 4.2.3. Volatile Filehandle . . . . . . . . . . . . . . . . . . 32 | | 4.2.3. Volatile Filehandle. . . . . . . . . . . . . 33 | |
| 4.2.4. One Method of Constructing a Volatile Filehandle . . . 33 | | 4.2.4. One Method of Constructing a | |
| 4.3. Client Recovery from Filehandle Expiration . . . . . . . 34 | | Volatile Filehandle. . . . . . . . . . . . . 34 | |
| 5. File Attributes . . . . . . . . . . . . . . . . . . . . . . 36 | | 4.3. Client Recovery from Filehandle Expiration . . . . . 35 | |
| 5.1. Mandatory Attributes . . . . . . . . . . . . . . . . . . 37 | | 5. File Attributes. . . . . . . . . . . . . . . . . . . . . . 35 | |
| 5.2. Recommended Attributes . . . . . . . . . . . . . . . . . 37 | | 5.1. Mandatory Attributes . . . . . . . . . . . . . . . . 37 | |
| 5.3. Named Attributes . . . . . . . . . . . . . . . . . . . . 37 | | 5.2. Recommended Attributes . . . . . . . . . . . . . . . 37 | |
| 5.4. Classification of Attributes . . . . . . . . . . . . . . 38 | | 5.3. Named Attributes . . . . . . . . . . . . . . . . . . 37 | |
| 5.5. Mandatory Attributes - Definitions . . . . . . . . . . . 40 | | 5.4. Classification of Attributes . . . . . . . . . . . . 38 | |
| 5.6. Recommended Attributes - Definitions . . . . . . . . . . 42 | | 5.5. Mandatory Attributes - Definitions . . . . . . . . . 39 | |
| 5.7. Time Access . . . . . . . . . . . . . . . . . . . . . . . 47 | | 5.6. Recommended Attributes - Definitions . . . . . . . . 41 | |
| 5.8. Interpreting owner and owner_group . . . . . . . . . . . 47 | | 5.7. Time Access. . . . . . . . . . . . . . . . . . . . . 46 | |
| 5.9. Character Case Attributes . . . . . . . . . . . . . . . . 49 | | 5.8. Interpreting owner and owner_group . . . . . . . . . 47 | |
| 5.10. Quota Attributes . . . . . . . . . . . . . . . . . . . . 49 | | 5.9. Character Case Attributes. . . . . . . . . . . . . . 49 | |
| | | 5.10. Quota Attributes . . . . . . . . . . . . . . . . . . 49 | |
| | | 5.11. Access Control Lists . . . . . . . . . . . . . . . . 50 | |
| | | 5.11.1. ACE type . . . . . . . . . . . . . . . . . 51 | |
| 5.11. Access Control Lists . . . . . . . . . . . . . . . . . . 50 | | 5.11.2. ACE Access Mask. . . . . . . . . . . . . . 52 | |
| 5.11.1. ACE type . . . . . . . . . . . . . . . . . . . . . . . 51 | | 5.11.3. ACE flag . . . . . . . . . . . . . . . . . 54 | |
| 5.11.2. ACE Access Mask . . . . . . . . . . . . . . . . . . . 52 | | 5.11.4. ACE who . . . . . . . . . . . . . . . . . 55 | |
| 5.11.3. ACE flag . . . . . . . . . . . . . . . . . . . . . . . 54 | | 5.11.5. Mode Attribute . . . . . . . . . . . . . . 56 | |
| 5.11.4. ACE who . . . . . . . . . . . . . . . . . . . . . . . 56 | | 5.11.6. Mode and ACL Attribute . . . . . . . . . . 57 | |
| 5.11.5. Mode Attribute . . . . . . . . . . . . . . . . . . . . 56 | | 5.11.7. mounted_on_fileid. . . . . . . . . . . . . 57 | |
| 5.11.6. Mode and ACL Attribute . . . . . . . . . . . . . . . . 57 | | 6. Filesystem Migration and Replication . . . . . . . . . . . 58 | |
| 5.11.7. mounted_on_fileid . . . . . . . . . . . . . . . . . . 57 | | 6.1. Replication. . . . . . . . . . . . . . . . . . . . . 58 | |
| 6. Filesystem Migration and Replication . . . . . . . . . . . 59 | | 6.2. Migration. . . . . . . . . . . . . . . . . . . . . . 59 | |
| 6.1. Replication . . . . . . . . . . . . . . . . . . . . . . . 59 | | 6.3. Interpretation of the fs_locations Attribute . . . . 60 | |
| 6.2. Migration . . . . . . . . . . . . . . . . . . . . . . . . 59 | | 6.4. Filehandle Recovery for Migration or Replication . . 61 | |
| 6.3. Interpretation of the fs_locations Attribute . . . . . . 60 | | 7. NFS Server Name Space . . . . . . . . . . . . . . . . . . . 61 | |
| 6.4. Filehandle Recovery for Migration or Replication . . . . 61 | | 7.1. Server Exports . . . . . . . . . . . . . . . . . . . 61 | |
| 7. NFS Server Name Space . . . . . . . . . . . . . . . . . . . 62 | | 7.2. Browsing Exports . . . . . . . . . . . . . . . . . . 62 | |
| 7.1. Server Exports . . . . . . . . . . . . . . . . . . . . . 62 | | 7.3. Server Pseudo Filesystem . . . . . . . . . . . . . . 62 | |
| 7.2. Browsing Exports . . . . . . . . . . . . . . . . . . . . 62 | | 7.4. Multiple Roots . . . . . . . . . . . . . . . . . . . 63 | |
| 7.3. Server Pseudo Filesystem . . . . . . . . . . . . . . . . 62 | | 7.5. Filehandle Volatility. . . . . . . . . . . . . . . . 63 | |
| 7.4. Multiple Roots . . . . . . . . . . . . . . . . . . . . . 63 | | 7.6. Exported Root. . . . . . . . . . . . . . . . . . . . 63 | |
| 7.5. Filehandle Volatility . . . . . . . . . . . . . . . . . . 63 | | 7.7. Mount Point Crossing . . . . . . . . . . . . . . . . 63 | |
| 7.6. Exported Root . . . . . . . . . . . . . . . . . . . . . . 63 | | 7.8. Security Policy and Name Space Presentation. . . . . 64 | |
| 7.7. Mount Point Crossing . . . . . . . . . . . . . . . . . . 64 | | 8. File Locking and Share Reservations. . . . . . . . . . . . 65 | |
| 7.8. Security Policy and Name Space Presentation . . . . . . . 64 | | 8.1. Locking. . . . . . . . . . . . . . . . . . . . . . . 65 | |
| 8. File Locking and Share Reservations . . . . . . . . . . . . 66 | | 8.1.1. Client ID. . . . . . . . . . . . . . . . . 66 | |
| 8.1. Locking . . . . . . . . . . . . . . . . . . . . . . . . . 66 | | 8.1.2. Server Release of Clientid . . . . . . . . 69 | |
| 8.1.1. Client ID . . . . . . . . . . . . . . . . . . . . . . . 66 | | 8.1.3. lock_owner and stateid Definition. . . . . 69 | |
| 8.1.2. Server Release of Clientid . . . . . . . . . . . . . . 69 | | 8.1.4. Use of the stateid and Locking . . . . . . 71 | |
| 8.1.3. lock_owner and stateid Definition . . . . . . . . . . . 70 | | 8.1.5. Sequencing of Lock Requests. . . . . . . . 73 | |
| 8.1.4. Use of the stateid and Locking . . . . . . . . . . . . 71 | | 8.1.6. Recovery from Replayed Requests. . . . . . 74 | |
| 8.1.5. Sequencing of Lock Requests . . . . . . . . . . . . . . 73 | | 8.1.7. Releasing lock_owner State . . . . . . . . 74 | |
| 8.1.6. Recovery from Replayed Requests . . . . . . . . . . . . 74 | | 8.1.8. Use of Open Confirmation . . . . . . . . . 75 | |
| 8.1.7. Releasing lock_owner State . . . . . . . . . . . . . . 75 | | 8.2. Lock Ranges. . . . . . . . . . . . . . . . . . . . . 76 | |
| 8.1.8. Use of Open Confirmation . . . . . . . . . . . . . . . 75 | | 8.3. Upgrading and Downgrading Locks. . . . . . . . . . . 76 | |
| 8.2. Lock Ranges . . . . . . . . . . . . . . . . . . . . . . . 76 | | 8.4. Blocking Locks . . . . . . . . . . . . . . . . . . . 77 | |
| 8.3. Upgrading and Downgrading Locks . . . . . . . . . . . . . 76 | | 8.5. Lease Renewal. . . . . . . . . . . . . . . . . . . . 77 | |
| 8.4. Blocking Locks . . . . . . . . . . . . . . . . . . . . . 77 | | 8.6. Crash Recovery . . . . . . . . . . . . . . . . . . . 78 | |
| 8.5. Lease Renewal . . . . . . . . . . . . . . . . . . . . . . 77 | | 8.6.1. Client Failure and Recovery. . . . . . . . 79 | |
| 8.6. Crash Recovery . . . . . . . . . . . . . . . . . . . . . 78 | | 8.6.2. Server Failure and Recovery. . . . . . . . 79 | |
| 8.6.1. Client Failure and Recovery . . . . . . . . . . . . . . 79 | | 8.6.3. Network Partitions and Recovery. . . . . . 81 | |
| 8.6.2. Server Failure and Recovery . . . . . . . . . . . . . . 79 | | 8.7. Recovery from a Lock Request Timeout or Abort . . . 85 | |
| 8.6.3. Network Partitions and Recovery . . . . . . . . . . . . 81 | | 8.8. Server Revocation of Locks. . . . . . . . . . . . . 85 | |
| 8.7. Recovery from a Lock Request Timeout or Abort . . . . . . 84 | | 8.9. Share Reservations. . . . . . . . . . . . . . . . . 86 | |
| 8.8. Server Revocation of Locks . . . . . . . . . . . . . . . 85 | | 8.10. OPEN/CLOSE Operations . . . . . . . . . . . . . . . 87 | |
| 8.9. Share Reservations . . . . . . . . . . . . . . . . . . . 86 | | 8.10.1. Close and Retention of State | |
| 8.10. OPEN/CLOSE Operations . . . . . . . . . . . . . . . . . 86 | | Information. . . . . . . . . . . . . . . . 88 | |
| 8.10.1. Close and Retention of State Information . . . . . . . 87 | | 8.11. Open Upgrade and Downgrade. . . . . . . . . . . . . 88 | |
| 8.11. Open Upgrade and Downgrade . . . . . . . . . . . . . . . 88 | | 8.12. Short and Long Leases . . . . . . . . . . . . . . . 89 | |
| 8.12. Short and Long Leases . . . . . . . . . . . . . . . . . 88 | | | |
| 8.13. Clocks, Propagation Delay, and Calculating Lease | | 8.13. Clocks, Propagation Delay, and Calculating Lease | |
|
| Expiration . . . . . . . . . . . . . . . . . . . . . . . 89 | | Expiration. . . . . . . . . . . . . . . . . . . . . 89 | |
| 8.14. Migration, Replication and State . . . . . . . . . . . . 89 | | 8.14. Migration, Replication and State. . . . . . . . . . 90 | |
| 8.14.1. Migration and State . . . . . . . . . . . . . . . . . 90 | | 8.14.1. Migration and State. . . . . . . . . . . . 90 | |
| 8.14.2. Replication and State . . . . . . . . . . . . . . . . 90 | | 8.14.2. Replication and State. . . . . . . . . . . 91 | |
| | | 8.14.3. Notification of Migrated Lease . . . . . . 92 | |
| | | 8.14.4. Migration and the Lease_time Attribute . . 92 | |
| | | | |
| 8.14.3. Notification of Migrated Lease . . . . . . . . . . . . 91 | | | |
| 8.14.4. Migration and the Lease_time Attribute . . . . . . . . 91 | | | |
| 9. Client-Side Caching . . . . . . . . . . . . . . . . . . . . 93 | | 9. Client-Side Caching . . . . . . . . . . . . . . . . . . . . 93 | |
|
| 9.1. Performance Challenges for Client-Side Caching . . . . . 93 | | 9.1. Performance Challenges for Client-Side Caching. . . 93 | |
| 9.2. Delegation and Callbacks . . . . . . . . . . . . . . . . 94 | | 9.2. Delegation and Callbacks. . . . . . . . . . . . . . 94 | |
| 9.2.1. Delegation Recovery . . . . . . . . . . . . . . . . . . 95 | | 9.2.1. Delegation Recovery . . . . . . . . . . . . 96 | |
| 9.3. Data Caching . . . . . . . . . . . . . . . . . . . . . . 97 | | 9.3. Data Caching. . . . . . . . . . . . . . . . . . . . 98 | |
| 9.3.1. Data Caching and OPENs . . . . . . . . . . . . . . . . 97 | | 9.3.1. Data Caching and OPENs . . . . . . . . . . 98 | |
| 9.3.2. Data Caching and File Locking . . . . . . . . . . . . . 98 | | 9.3.2. Data Caching and File Locking. . . . . . . 99 | |
| 9.3.3. Data Caching and Mandatory File Locking . . . . . . . . 100 | | 9.3.3. Data Caching and Mandatory File Locking. . 101 | |
| 9.3.4. Data Caching and File Identity . . . . . . . . . . . . 100 | | 9.3.4. Data Caching and File Identity . . . . . . 101 | |
| 9.4. Open Delegation . . . . . . . . . . . . . . . . . . . . . 101 | | 9.4. Open Delegation . . . . . . . . . . . . . . . . . . 102 | |
| 9.4.1. Open Delegation and Data Caching . . . . . . . . . . . 104 | | 9.4.1. Open Delegation and Data Caching . . . . . 104 | |
| 9.4.2. Open Delegation and File Locks . . . . . . . . . . . . 105 | | 9.4.2. Open Delegation and File Locks . . . . . . 106 | |
| 9.4.3. Handling of CB_GETATTR . . . . . . . . . . . . . . . . 105 | | 9.4.3. Handling of CB_GETATTR . . . . . . . . . . 106 | |
| 9.4.4. Recall of Open Delegation . . . . . . . . . . . . . . . 108 | | 9.4.4. Recall of Open Delegation. . . . . . . . . 109 | |
| 9.4.5. Clients that Fail to Honor Delegation Recalls . . . . . 110 | | 9.4.5. Clients that Fail to Honor | |
| 9.4.6. Delegation Revocation . . . . . . . . . . . . . . . . . 110 | | Delegation Recalls . . . . . . . . . . . . 111 | |
| 9.5. Data Caching and Revocation . . . . . . . . . . . . . . . 111 | | 9.4.6. Delegation Revocation. . . . . . . . . . . 112 | |
| 9.5.1. Revocation Recovery for Write Open Delegation . . . . . 111 | | 9.5. Data Caching and Revocation . . . . . . . . . . . . 112 | |
| 9.6. Attribute Caching . . . . . . . . . . . . . . . . . . . . 112 | | 9.5.1. Revocation Recovery for Write Open | |
| 9.7. Data and Metadata Caching and Memory Mapped Files . . . . 114 | | Delegation . . . . . . . . . . . . . . . . 113 | |
| 9.8. Name Caching . . . . . . . . . . . . . . . . . . . . . . 116 | | 9.6. Attribute Caching . . . . . . . . . . . . . . . . . 113 | |
| 9.9. Directory Caching . . . . . . . . . . . . . . . . . . . . 117 | | 9.7. Data and Metadata Caching and Memory Mapped Files . 115 | |
| 10. Minor Versioning . . . . . . . . . . . . . . . . . . . . . 119 | | 9.8. Name Caching . . . . . . . . . . . . . . . . . . . 118 | |
| | | 9.9. Directory Caching . . . . . . . . . . . . . . . . . 119 | |
| | | 10. Minor Versioning . . . . . . . . . . . . . . . . . . . . . 120 | |
| 11. Internationalization . . . . . . . . . . . . . . . . . . . 122 | | 11. Internationalization . . . . . . . . . . . . . . . . . . . 122 | |
|
| 11.1. Stringprep profile for the utf8str_cs type . . . . . . . 123 | | 11.1. Stringprep profile for the utf8str_cs type. . . . . 123 | |
| 11.1.1. Intended applicability of the nfs4_cs_prep profile . . 123 | | 11.1.1. Intended applicability of the | |
| 11.1.2. Character repertoire of nfs4_cs_prep . . . . . . . . . 123 | | nfs4_cs_prep profile . . . . . . . . . . . 123 | |
| 11.1.3. Mapping used by nfs4_cs_prep . . . . . . . . . . . . . 123 | | 11.1.2. Character repertoire of nfs4_cs_prep . . . 124 | |
| 11.1.4. Normalization used by nfs4_cs_prep . . . . . . . . . . 124 | | 11.1.3. Mapping used by nfs4_cs_prep . . . . . . . 124 | |
| 11.1.5. Prohibited output for nfs4_cs_prep . . . . . . . . . . 124 | | 11.1.4. Normalization used by nfs4_cs_prep . . . . 124 | |
| 11.1.6. Bidirectional output for nfs4_cs_prep . . . . . . . . 124 | | 11.1.5. Prohibited output for nfs4_cs_prep . . . . 125 | |
| 11.2. Stringprep profile for the utf8str_cis type . . . . . . 124 | | 11.1.6. Bidirectional output for nfs4_cs_prep. . . 125 | |
| 11.2.1. Intended applicability of the nfs4_cis_prep profile . 124 | | 11.2. Stringprep profile for the utf8str_cis type . . . . 125 | |
| 11.2.2. Character repertoire of nfs4_cis_prep . . . . . . . . 124 | | 11.2.1. Intended applicability of the | |
| 11.2.3. Mapping used by nfs4_cis_prep . . . . . . . . . . . . 124 | | nfs4_cis_prep profile. . . . . . . . . . . 125 | |
| 11.2.4. Normalization used by nfs4_cis_prep . . . . . . . . . 125 | | 11.2.2. Character repertoire of nfs4_cis_prep . . 125 | |
| 11.2.5. Prohibited output for nfs4_cis_prep . . . . . . . . . 125 | | 11.2.3. Mapping used by nfs4_cis_prep . . . . . . 125 | |
| 11.2.6. Bidirectional output for nfs4_cis_prep . . . . . . . . 125 | | 11.2.4. Normalization used by nfs4_cis_prep . . . 125 | |
| 11.3. Stringprep profile for the utf8str_mixed type . . . . . 125 | | 11.2.5. Prohibited output for nfs4_cis_prep . . . 126 | |
| 11.3.1. Intended applicability of the nfs4_mixed_prep profile 125 | | 11.2.6. Bidirectional output for nfs4_cis_prep . . 126 | |
| 11.3.2. Character repertoire of nfs4_mixed_prep . . . . . . . 125 | | 11.3. Stringprep profile for the utf8str_mixed type . . . 126 | |
| 11.3.3. Mapping used by nfs4_cis_prep . . . . . . . . . . . . 125 | | 11.3.1. Intended applicability of the | |
| 11.3.4. Normalization used by nfs4_mixed_prep . . . . . . . . 126 | | nfs4_mixed_prep profile. . . . . . . . . . 126 | |
| 11.3.5. Prohibited output for nfs4_mixed_prep . . . . . . . . 126 | | 11.3.2. Character repertoire of nfs4_mixed_prep . 126 | |
| 11.3.6. Bidirectional output for nfs4_mixed_prep . . . . . . . 126 | | 11.3.3. Mapping used by nfs4_cis_prep . . . . . . 126 | |
| 11.4. UTF-8 Related Errors . . . . . . . . . . . . . . . . . . 126 | | 11.3.4. Normalization used by nfs4_mixed_prep . . 127 | |
| 12. Error Definitions . . . . . . . . . . . . . . . . . . . . 127 | | 11.3.5. Prohibited output for nfs4_mixed_prep . . 127 | |
| 13. NFS version 4 Requests . . . . . . . . . . . . . . . . . . 133 | | 11.3.6. Bidirectional output for nfs4_mixed_prep . 127 | |
| 13.1. Compound Procedure . . . . . . . . . . . . . . . . . . . 133 | | 11.4. UTF-8 Related Errors. . . . . . . . . . . . . . . . 127 | |
| 13.2. Evaluation of a Compound Request . . . . . . . . . . . . 134 | | 12. Error Definitions . . . . . . . . . . . . . . . . . . . . 128 | |
| | | 13. NFS version 4 Requests . . . . . . . . . . . . . . . . . . 134 | |
| | | 13.1. Compound Procedure. . . . . . . . . . . . . . . . . 134 | |
| | | 13.2. Evaluation of a Compound Request. . . . . . . . . . 135 | |
| 13.3. Synchronous Modifying Operations . . . . . . . . . . . . 134 | | 13.3. Synchronous Modifying Operations. . . . . . . . . . 136 | |
| 13.4. Operation Values . . . . . . . . . . . . . . . . . . . . 135 | | 13.4. Operation Values. . . . . . . . . . . . . . . . . . 136 | |
| 14. NFS version 4 Procedures . . . . . . . . . . . . . . . . . 136 | | 14. NFS version 4 Procedures . . . . . . . . . . . . . . . . . 136 | |
|
| 14.1. Procedure 0: NULL - No Operation . . . . . . . . . . . . 136 | | 14.1. Procedure 0: NULL - No Operation. . . . . . . . . . 136 | |
| 14.2. Procedure 1: COMPOUND - Compound Operations . . . . . . 137 | | 14.2. Procedure 1: COMPOUND - Compound Operations . . . . 137 | |
| 14.2.1. Operation 3: ACCESS - Check Access Rights . . . . . . 140 | | 14.2.1. Operation 3: ACCESS - Check Access | |
| 14.2.2. Operation 4: CLOSE - Close File . . . . . . . . . . . 143 | | Rights. . . . . . . . . . . . . . . . . . 140 | |
| 14.2.3. Operation 5: COMMIT - Commit Cached Data . . . . . . . 145 | | 14.2.2. Operation 4: CLOSE - Close File . . . . . 142 | |
| 14.2.4. Operation 6: CREATE - Create a Non-Regular File Object 148 | | 14.2.3. Operation 5: COMMIT - Commit | |
| 14.2.5. Operation 7: DELEGPURGE - Purge Delegations Awaiting | | Cached Data . . . . . . . . . . . . . . . 144 | |
| Recovery . . . . . . . . . . . . . . . . . . . . . . . 151 | | 14.2.4. Operation 6: CREATE - Create a | |
| 14.2.6. Operation 8: DELEGRETURN - Return Delegation . . . . . 153 | | Non-Regular File Object . . . . . . . . . 147 | |
| 14.2.7. Operation 9: GETATTR - Get Attributes . . . . . . . . 154 | | 14.2.5. Operation 7: DELEGPURGE - | |
| 14.2.8. Operation 10: GETFH - Get Current Filehandle . . . . . 156 | | Purge Delegations Awaiting Recovery . . . 150 | |
| 14.2.9. Operation 11: LINK - Create Link to a File . . . . . . 158 | | 14.2.6. Operation 8: DELEGRETURN - Return | |
| 14.2.10. Operation 12: LOCK - Create Lock . . . . . . . . . . 160 | | Delegation. . . . . . . . . . . . . . . . 151 | |
| 14.2.11. Operation 13: LOCKT - Test For Lock . . . . . . . . . 164 | | 14.2.7. Operation 9: GETATTR - Get Attributes . . 152 | |
| 14.2.12. Operation 14: LOCKU - Unlock File . . . . . . . . . . 166 | | 14.2.8. Operation 10: GETFH - Get Current | |
| 14.2.13. Operation 15: LOOKUP - Lookup Filename . . . . . . . 168 | | Filehandle. . . . . . . . . . . . . . . . 153 | |
| 14.2.14. Operation 16: LOOKUPP - Lookup Parent Directory . . . 171 | | 14.2.9. Operation 11: LINK - Create Link to a | |
| 14.2.15. Operation 17: NVERIFY - Verify Difference in | | File. . . . . . . . . . . . . . . . . . . 154 | |
| Attributes . . . . . . . . . . . . . . . . . . . . . 172 | | 14.2.10. Operation 12: LOCK - Create Lock . . . . 156 | |
| 14.2.16. Operation 18: OPEN - Open a Regular File . . . . . . 174 | | 14.2.11. Operation 13: LOCKT - Test For Lock . . . 160 | |
| 14.2.17. Operation 19: OPENATTR - Open Named Attribute | | 14.2.12. Operation 14: LOCKU - Unlock File . . . . 162 | |
| Directory . . . . . . . . . . . . . . . . . . . . . . 184 | | 14.2.13. Operation 15: LOOKUP - Lookup Filename. . 163 | |
| 14.2.18. Operation 20: OPEN_CONFIRM - Confirm Open . . . . . . 186 | | 14.2.14. Operation 16: LOOKUPP - Lookup | |
| 14.2.19. Operation 21: OPEN_DOWNGRADE - Reduce Open File Access 189 | | Parent Directory. . . . . . . . . . . . . 165 | |
| 14.2.20. Operation 22: PUTFH - Set Current Filehandle . . . . 191 | | 14.2.15. Operation 17: NVERIFY - Verify | |
| 14.2.21. Operation 23: PUTPUBFH - Set Public Filehandle . . . 192 | | Difference in Attributes . . . . . . . . 166 | |
| 14.2.22. Operation 24: PUTROOTFH - Set Root Filehandle . . . . 194 | | 14.2.16. Operation 18: OPEN - Open a Regular | |
| 14.2.23. Operation 25: READ - Read from File . . . . . . . . . 195 | | File. . . . . . . . . . . . . . . . . . . 168 | |
| 14.2.24. Operation 26: READDIR - Read Directory . . . . . . . 198 | | 14.2.17. Operation 19: OPENATTR - Open Named | |
| 14.2.25. Operation 27: READLINK - Read Symbolic Link . . . . . 202 | | Attribute Directory . . . . . . . . . . . 178 | |
| 14.2.26. Operation 28: REMOVE - Remove Filesystem Object . . . 204 | | 14.2.18. Operation 20: OPEN_CONFIRM - | |
| 14.2.27. Operation 29: RENAME - Rename Directory Entry . . . . 207 | | Confirm Open . . . . . . . . . . . . . . 180 | |
| 14.2.28. Operation 30: RENEW - Renew a Lease . . . . . . . . . 210 | | 14.2.19. Operation 21: OPEN_DOWNGRADE - | |
| 14.2.29. Operation 31: RESTOREFH - Restore Saved Filehandle . 212 | | Reduce Open File Access . . . . . . . . . 182 | |
| 14.2.30. Operation 32: SAVEFH - Save Current Filehandle . . . 214 | | 14.2.20. Operation 22: PUTFH - Set | |
| 14.2.31. Operation 33: SECINFO - Obtain Available Security . . 215 | | Current Filehandle. . . . . . . . . . . . 184 | |
| 14.2.32. Operation 34: SETATTR - Set Attributes . . . . . . . 219 | | 14.2.21. Operation 23: PUTPUBFH - | |
| 14.2.33. Operation 35: SETCLIENTID - Negotiate Clientid . . . 222 | | Set Public Filehandle . . . . . . . . . . 185 | |
| 14.2.34. Operation 36: SETCLIENTID_CONFIRM - Confirm Clientid 226 | | 14.2.22. Operation 24: PUTROOTFH - | |
| 14.2.35. Operation 37: VERIFY - Verify Same Attributes . . . . 230 | | Set Root Filehandle . . . . . . . . . . . 186 | |
| 14.2.36. Operation 38: WRITE - Write to File . . . . . . . . . 232 | | 14.2.23. Operation 25: READ - Read from File . . . 187 | |
| 14.2.37. Operation 39: RELEASE_LOCKOWNER - Release Lockowner | | 14.2.24. Operation 26: READDIR - | |
| State . . . . . . . . . . . . . . . . . . . . . . . . 237 | | Read Directory. . . . . . . . . . . . . . 190 | |
| 14.2.38. Operation 10044: ILLEGAL - Illegal operation . . . . 239 | | 14.2.25. Operation 27: READLINK - | |
| 15. NFS version 4 Callback Procedures . . . . . . . . . . . . 240 | | Read Symbolic Link. . . . . . . . . . . . 193 | |
| 15.1. Procedure 0: CB_NULL - No Operation . . . . . . . . . . 240 | | 14.2.26. Operation 28: REMOVE - | |
| 15.2. Procedure 1: CB_COMPOUND - Compound Operations . . . . . 241 | | Remove Filesystem Object. . . . . . . . . 195 | |
| 15.2.1. Operation 3: CB_GETATTR - Get Attributes . . . . . . . 243 | | 14.2.27. Operation 29: RENAME - | |
| 15.2.2. Operation 4: CB_RECALL - Recall an Open Delegation . . 245 | | Rename Directory Entry. . . . . . . . . . 197 | |
| | | 14.2.28. Operation 30: RENEW - Renew a Lease . . . 200 | |
| Draft Specification NFS version 4 Protocol November 2002 | | 14.2.29. Operation 31: RESTOREFH - | |
| | | Restore Saved Filehandle. . . . . . . . . 201 | |
| 15.2.3. Operation 10044: CB_ILLEGAL - Illegal Callback | | 14.2.30. Operation 32: SAVEFH - Save | |
| Operation . . . . . . . . . . . . . . . . . . . . . . 247 | | Current Filehandle. . . . . . . . . . . . 202 | |
| 16. Security Considerations . . . . . . . . . . . . . . . . . 248 | | 14.2.31. Operation 33: SECINFO - Obtain | |
| 17. IANA Considerations . . . . . . . . . . . . . . . . . . . 250 | | Available Security. . . . . . . . . . . . 203 | |
| 17.1. Named Attribute Definition . . . . . . . . . . . . . . . 250 | | 14.2.32. Operation 34: SETATTR - Set Attributes. . 206 | |
| 17.2. ONC RPC Network Identifiers (netids) . . . . . . . . . . 250 | | 14.2.33. Operation 35: SETCLIENTID - | |
| 18. RPC definition file . . . . . . . . . . . . . . . . . . . 252 | | Negotiate Clientid. . . . . . . . . . . . 209 | |
| 19. Normative References . . . . . . . . . . . . . . . . . . . 284 | | 14.2.34. Operation 36: SETCLIENTID_CONFIRM - | |
| 20. Informative References . . . . . . . . . . . . . . . . . . 285 | | Confirm Clientid. . . . . . . . . . . . . 213 | |
| 21. Authors . . . . . . . . . . . . . . . . . . . . . . . . . 289 | | 14.2.35. Operation 37: VERIFY - | |
| 21.1. Editor's Address . . . . . . . . . . . . . . . . . . . . 289 | | Verify Same Attributes. . . . . . . . . . 217 | |
| 21.2. Authors' Addresses . . . . . . . . . . . . . . . . . . . 289 | | 14.2.36. Operation 38: WRITE - Write to File . . . 218 | |
| 21.3. Acknowledgements . . . . . . . . . . . . . . . . . . . . 290 | | 14.2.37. Operation 39: RELEASE_LOCKOWNER - | |
| 22. Full Copyright Statement . . . . . . . . . . . . . . . . . 291 | | Release Lockowner State . . . . . . . . . 223 | |
| | | 14.2.38. Operation 10044: ILLEGAL - | |
| | | Illegal operation . . . . . . . . . . . . 224 | |
| | | 15. NFS version 4 Callback Procedures . . . . . . . . . . . . 225 | |
| | | 15.1. Procedure 0: CB_NULL - No Operation . . . . . . . . 225 | |
| | | 15.2. Procedure 1: CB_COMPOUND - Compound | |
| | | Operations. . . . . . . . . . . . . . . . . . . . . 226 | |
| | | 15.2.1. Operation 3: CB_GETATTR - Get | |
| | | Attributes . . . . . . . . . . . . . . . . 228 | |
| | | 15.2.2. Operation 4: CB_RECALL - | |
| | | Recall an Open Delegation. . . . . . . . . 229 | |
| | | 15.2.3. Operation 10044: CB_ILLEGAL - | |
| | | Illegal Callback Operation . . . . . . . . 230 | |
| | | 16. Security Considerations . . . . . . . . . . . . . . . . . 231 | |
| | | 17. IANA Considerations . . . . . . . . . . . . . . . . . . . 232 | |
| | | 17.1. Named Attribute Definition. . . . . . . . . . . . . 232 | |
| | | 17.2. ONC RPC Network Identifiers (netids). . . . . . . . 232 | |
| | | 18. RPC definition file . . . . . . . . . . . . . . . . . . . 234 | |
| | | 19. Acknowledgements . . . . . . . . . . . . . . . . . . . . . 268 | |
| | | 20. Normative References . . . . . . . . . . . . . . . . . . . 268 | |
| | | 21. Informative References . . . . . . . . . . . . . . . . . . 270 | |
| | | 22. Authors' Information . . . . . . . . . . . . . . . . . . . 273 | |
| | | 22.1. Editor's Address. . . . . . . . . . . . . . . . . . 273 | |
| | | 22.2. Authors' Addresses. . . . . . . . . . . . . . . . . 274 | |
| | | 23. Full Copyright Statement . . . . . . . . . . . . . . . . . 275 | |
| | | | |
|
| Draft Specification NFS version 4 Protocol November 2002 | | 1. Introduction | |
| | | | |
|
| 1. Changes since RFC3010 | | 1.1. Changes since RFC 3010 | |
| | | | |
| This definition of the NFS version 4 protocol replaces or obsoletes | | This definition of the NFS version 4 protocol replaces or obsoletes | |
| the definition present in [RFC3010]. While portions of the two | | the definition present in [RFC3010]. While portions of the two | |
| documents have remained the same, there have been substantive changes | | documents have remained the same, there have been substantive changes | |
| in others. The changes made between [RFC3010] and this document | | in others. The changes made between [RFC3010] and this document | |
| represent implementation experience and further review of the | | represent implementation experience and further review of the | |
| protocol. While some modifications were made for ease of | | protocol. While some modifications were made for ease of | |
| implementation or clarification, most updates represent errors or | | implementation or clarification, most updates represent errors or | |
| situations where the [RFC3010] definition was untenable. | | situations where the [RFC3010] definition was untenable. | |
| | | | |
| The following list is not all inclusive of all changes but presents | | The following list is not all inclusive of all changes but presents | |
| some of the most notable changes or additions made: | | some of the most notable changes or additions made: | |
| | | | |
| o The state model has added an open_owner4 identifier. This was | | o The state model has added an open_owner4 identifier. This was | |
|
| done to accommodate Posix based clients and the model they use | | done to accommodate Posix based clients and the model they use for | |
| for file locking. For Posix clients, an open_owner4 would | | file locking. For Posix clients, an open_owner4 would correspond | |
| correspond to a file descriptor potentially shared amongst a set | | to a file descriptor potentially shared amongst a set of processes | |
| of processes and the lock_owner4 identifier would correspond to | | and the lock_owner4 identifier would correspond to a process that | |
| a process that is locking a file. | | is locking a file. | |
| | | | |
|
| o Clarifications and error conditions were added for the handling | | o Clarifications and error conditions were added for the handling of | |
| of the owner and group attributes. Since these attributes are | | the owner and group attributes. Since these attributes are string | |
| string based (as opposed to the numeric uid/gid of previous | | based (as opposed to the numeric uid/gid of previous versions of | |
| versions of NFS), translations may not be available and hence | | NFS), translations may not be available and hence the changes | |
| the changes made. | | made. | |
| | | | |
| o Clarifications for the ACL and mode attributes to address | | o Clarifications for the ACL and mode attributes to address | |
| evaluation and partial support. | | evaluation and partial support. | |
| | | | |
|
| o For identifiers that are defined as XDR opaque, limits were set | | o For identifiers that are defined as XDR opaque, limits were set on | |
| on their size. | | their size. | |
| | | | |
| o Added the mounted_on_fileid attribute to allow Posix clients to | | o Added the mounted_on_fileid attribute to allow Posix clients to | |
| correctly construct local mounts. | | correctly construct local mounts. | |
| | | | |
| o Modified the SETCLIENTID/SETCLIENTID_CONFIRM operations to deal | | o Modified the SETCLIENTID/SETCLIENTID_CONFIRM operations to deal | |
|
| correctly with confirmation details along with adding the | | correctly with confirmation details along with adding the ability | |
| ability to specify new client callback information. Also added | | to specify new client callback information. Also added | |
| clarification of the callback information itself. | | clarification of the callback information itself. | |
| | | | |
| o Added a new operation RELEASE_LOCKOWNER to enable notifying the | | o Added a new operation RELEASE_LOCKOWNER to enable notifying the | |
| server that a lock_owner4 will no longer be used by the client. | | server that a lock_owner4 will no longer be used by the client. | |
| | | | |
|
| o RENEW operation changes to identify the client correctly and | | o RENEW operation changes to identify the client correctly and allow | |
| allow for additional error returns. | | for additional error returns. | |
| | | | |
| o Verify error return possibilities for all operations. | | o Verify error return possibilities for all operations. | |
| | | | |
| o Remove use of the pathname4 data type from LOOKUP and OPEN in | | o Remove use of the pathname4 data type from LOOKUP and OPEN in | |
| favor of having the client construct a sequence of LOOKUP | | favor of having the client construct a sequence of LOOKUP | |
|
| operations to achieve the same effect. | | operations to achieve the same effect. | |
| | | | |
| o Clarification of the internationalization issues and adoption of | | o Clarification of the internationalization issues and adoption of | |
| the new stringprep profile framework. | | the new stringprep profile framework. | |
| | | | |
|
| 1.1. Introduction | | 1.2. NFS Version 4 Goals | |
| | | | |
| The NFS version 4 protocol is a further revision of the NFS protocol | | The NFS version 4 protocol is a further revision of the NFS protocol | |
| defined already by versions 2 [RFC1094] and 3 [RFC1813]. It retains | | defined already by versions 2 [RFC1094] and 3 [RFC1813]. It retains | |
| the essential characteristics of previous versions: design for easy | | the essential characteristics of previous versions: design for easy | |
| recovery, independent of transport protocols, operating systems and | | recovery, independent of transport protocols, operating systems and | |
| filesystems, simplicity, and good performance. The NFS version 4 | | filesystems, simplicity, and good performance. The NFS version 4 | |
| revision has the following goals: | | revision has the following goals: | |
| | | | |
| o Improved access and good performance on the Internet. | | o Improved access and good performance on the Internet. | |
| | | | |
|
| The protocol is designed to transit firewalls easily, perform | | The protocol is designed to transit firewalls easily, perform well | |
| well where latency is high and bandwidth is low, and scale to | | where latency is high and bandwidth is low, and scale to very | |
| very large numbers of clients per server. | | large numbers of clients per server. | |
| | | | |
| o Strong security with negotiation built into the protocol. | | o Strong security with negotiation built into the protocol. | |
| | | | |
| The protocol builds on the work of the ONCRPC working group in | | The protocol builds on the work of the ONCRPC working group in | |
|
| supporting the RPCSEC_GSS protocol. Additionally, the NFS | | supporting the RPCSEC_GSS protocol. Additionally, the NFS version | |
| version 4 protocol provides a mechanism to allow clients and | | 4 protocol provides a mechanism to allow clients and servers the | |
| servers the ability to negotiate security and require clients | | ability to negotiate security and require clients and servers to | |
| and servers to support a minimal set of security schemes. | | support a minimal set of security schemes. | |
| | | | |
| o Good cross-platform interoperability. | | o Good cross-platform interoperability. | |
| | | | |
| The protocol features a filesystem model that provides a useful, | | The protocol features a filesystem model that provides a useful, | |
| common set of features that does not unduly favor one filesystem | | common set of features that does not unduly favor one filesystem | |
| or operating system over another. | | or operating system over another. | |
| | | | |
| o Designed for protocol extensions. | | o Designed for protocol extensions. | |
| | | | |
|
| The protocol is designed to accept standard extensions that do | | The protocol is designed to accept standard extensions that do not | |
| not compromise backward compatibility. | | compromise backward compatibility. | |
| | | | |
|
| 1.2. Inconsistencies of this Document with Section 18 | | 1.3. Inconsistencies of this Document with Section 18 | |
| | | | |
| Section 18, RPC Definition File, contains the definitions in XDR | | Section 18, RPC Definition File, contains the definitions in XDR | |
| description language of the constructs used by the protocol. Prior | | description language of the constructs used by the protocol. Prior | |
| to Section 18, several of the constructs are reproduced for purposes | | to Section 18, several of the constructs are reproduced for purposes | |
| of explanation. The reader is warned of the possibility of errors in | | of explanation. The reader is warned of the possibility of errors in | |
| the reproduced constructs outside of Section 18. For any part of the | | the reproduced constructs outside of Section 18. For any part of the | |
| document that is inconsistent with Section 18, Section 18 is to be | | document that is inconsistent with Section 18, Section 18 is to be | |
| considered authoritative. | | considered authoritative. | |
| | | | |
|
| 1.3. Overview of NFS version 4 Features | | 1.4. Overview of NFS version 4 Features | |
| | | | |
| To provide a reasonable context for the reader, the major features of | | To provide a reasonable context for the reader, the major features of | |
| NFS version 4 protocol will be reviewed in brief. This will be done | | NFS version 4 protocol will be reviewed in brief. This will be done | |
| to provide an appropriate context for both the reader who is familiar | | to provide an appropriate context for both the reader who is familiar | |
| with the previous versions of the NFS protocol and the reader that is | | with the previous versions of the NFS protocol and the reader that is | |
| new to the NFS protocols. For the reader new to the NFS protocols, | | new to the NFS protocols. For the reader new to the NFS protocols, | |
| there is still a fundamental knowledge that is expected. The reader | | there is still a fundamental knowledge that is expected. The reader | |
| should be familiar with the XDR and RPC protocols as described in | | should be familiar with the XDR and RPC protocols as described in | |
| [RFC1831] and [RFC1832]. A basic knowledge of filesystems and | | [RFC1831] and [RFC1832]. A basic knowledge of filesystems and | |
| distributed filesystems is expected as well. | | distributed filesystems is expected as well. | |
| | | | |
|
| 1.3.1. RPC and Security | | 1.4.1. RPC and Security | |
| | | | |
| As with previous versions of NFS, the External Data Representation | | As with previous versions of NFS, the External Data Representation | |
| (XDR) and Remote Procedure Call (RPC) mechanisms used for the NFS | | (XDR) and Remote Procedure Call (RPC) mechanisms used for the NFS | |
| version 4 protocol are those defined in [RFC1831] and [RFC1832]. To | | version 4 protocol are those defined in [RFC1831] and [RFC1832]. To | |
| meet end to end security requirements, the RPCSEC_GSS framework | | meet end to end security requirements, the RPCSEC_GSS framework | |
| [RFC2203] will be used to extend the basic RPC security. With the | | [RFC2203] will be used to extend the basic RPC security. With the | |
| use of RPCSEC_GSS, various mechanisms can be provided to offer | | use of RPCSEC_GSS, various mechanisms can be provided to offer | |
| authentication, integrity, and privacy to the NFS version 4 protocol. | | authentication, integrity, and privacy to the NFS version 4 protocol. | |
| Kerberos V5 will be used as described in [RFC1964] to provide one | | Kerberos V5 will be used as described in [RFC1964] to provide one | |
| security framework. The LIPKEY GSS-API mechanism described in | | security framework. The LIPKEY GSS-API mechanism described in | |
| | | | |
| skipping to change at page 10, line 46 | | skipping to change at page 10, line 45 | |
| version 4 security. | | version 4 security. | |
| | | | |
| To enable in-band security negotiation, the NFS version 4 protocol | | To enable in-band security negotiation, the NFS version 4 protocol | |
| has added a new operation which provides the client a method of | | has added a new operation which provides the client a method of | |
| querying the server about its policies regarding which security | | querying the server about its policies regarding which security | |
| mechanisms must be used for access to the server's filesystem | | mechanisms must be used for access to the server's filesystem | |
| resources. With this, the client can securely match the security | | resources. With this, the client can securely match the security | |
| mechanism that meets the policies specified at both the client and | | mechanism that meets the policies specified at both the client and | |
| server. | | server. | |
| | | | |
|
| 1.3.2. Procedure and Operation Structure | | 1.4.2. Procedure and Operation Structure | |
| | | | |
| A significant departure from the previous versions of the NFS | | A significant departure from the previous versions of the NFS | |
| protocol is the introduction of the COMPOUND procedure. For the NFS | | protocol is the introduction of the COMPOUND procedure. For the NFS | |
| version 4 protocol, there are two RPC procedures, NULL and COMPOUND. | | version 4 protocol, there are two RPC procedures, NULL and COMPOUND. | |
| The COMPOUND procedure is defined in terms of operations and these | | The COMPOUND procedure is defined in terms of operations and these | |
| operations correspond more closely to the traditional NFS procedures. | | operations correspond more closely to the traditional NFS procedures. | |
|
| | | | |
| With the use of the COMPOUND procedure, the client is able to build | | With the use of the COMPOUND procedure, the client is able to build | |
| simple or complex requests. These COMPOUND requests allow for a | | simple or complex requests. These COMPOUND requests allow for a | |
| reduction in the number of RPCs needed for logical filesystem | | reduction in the number of RPCs needed for logical filesystem | |
| operations. For example, without previous contact with a server a | | operations. For example, without previous contact with a server a | |
| client will be able to read data from a file in one request by | | client will be able to read data from a file in one request by | |
| combining LOOKUP, OPEN, and READ operations in a single COMPOUND RPC. | | combining LOOKUP, OPEN, and READ operations in a single COMPOUND RPC. | |
| With previous versions of the NFS protocol, this type of single | | With previous versions of the NFS protocol, this type of single | |
| request was not possible. | | request was not possible. | |
| | | | |
| The model used for COMPOUND is very simple. There is no logical OR | | The model used for COMPOUND is very simple. There is no logical OR | |
| or ANDing of operations. The operations combined within a COMPOUND | | or ANDing of operations. The operations combined within a COMPOUND | |
| request are evaluated in order by the server. Once an operation | | request are evaluated in order by the server. Once an operation | |
| returns a failing result, the evaluation ends and the results of all | | returns a failing result, the evaluation ends and the results of all | |
| | | | |
| skipping to change at page 11, line 29 | | skipping to change at page 11, line 30 | |
| The NFS version 4 protocol continues to have the client refer to a | | The NFS version 4 protocol continues to have the client refer to a | |
| file or directory at the server by a "filehandle". The COMPOUND | | file or directory at the server by a "filehandle". The COMPOUND | |
| procedure has a method of passing a filehandle from one operation to | | procedure has a method of passing a filehandle from one operation to | |
| another within the sequence of operations. There is a concept of a | | another within the sequence of operations. There is a concept of a | |
| "current filehandle" and "saved filehandle". Most operations use the | | "current filehandle" and "saved filehandle". Most operations use the | |
| "current filehandle" as the filesystem object to operate upon. The | | "current filehandle" as the filesystem object to operate upon. The | |
| "saved filehandle" is used as temporary filehandle storage within a | | "saved filehandle" is used as temporary filehandle storage within a | |
| COMPOUND procedure as well as an additional operand for certain | | COMPOUND procedure as well as an additional operand for certain | |
| operations. | | operations. | |
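The in-order evaluation of a COMPOUND request and the roles of the "current filehandle" and "saved filehandle" can be sketched as follows. This is a minimal illustration, not the wire protocol: the filehandle strings, the TREE namespace, and the run_compound helper are assumptions made up for the example, and only the operation names (PUTROOTFH, LOOKUP, SAVEFH, RESTOREFH) and the NFS4_OK and NFS4ERR_NOENT status names come from the protocol.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

OK = "NFS4_OK"
NOENT = "NFS4ERR_NOENT"

# Hypothetical namespace: (directory filehandle, name) -> child filehandle.
TREE: Dict[Tuple[Optional[str], str], str] = {
    ("fh:root", "export"): "fh:export",
    ("fh:export", "data.txt"): "fh:data",
}


@dataclass
class CompoundState:
    """Per-request state shared by the operations of one COMPOUND."""
    current_fh: Optional[str] = None
    saved_fh: Optional[str] = None


Op = Callable[[CompoundState], str]


def putrootfh(state: CompoundState) -> str:
    # Set the current filehandle to the server's root.
    state.current_fh = "fh:root"
    return OK


def lookup(name: str) -> Op:
    def op(state: CompoundState) -> str:
        # Resolve one name relative to the current filehandle.
        child = TREE.get((state.current_fh, name))
        if child is None:
            return NOENT
        state.current_fh = child
        return OK
    return op


def savefh(state: CompoundState) -> str:
    # Copy the current filehandle into the saved slot.
    state.saved_fh = state.current_fh
    return OK


def restorefh(state: CompoundState) -> str:
    # Make the saved filehandle current again.
    state.current_fh = state.saved_fh
    return OK


def run_compound(ops: List[Op]) -> Tuple[List[str], CompoundState]:
    """Evaluate operations in order; stop at the first failing result."""
    state = CompoundState()
    results: List[str] = []
    for op in ops:
        status = op(state)
        results.append(status)
        if status != OK:
            break  # later operations are never evaluated
    return results, state
```

In this sketch, a request of PUTROOTFH, LOOKUP "export", SAVEFH, LOOKUP "data.txt" leaves the file as the current filehandle and its parent directory as the saved filehandle. A failing LOOKUP ends the request, and only the results up to and including the failure are returned.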
| | | | |
|
| 1.3.3. Filesystem Model | | 1.4.3. Filesystem Model | |
| | | | |
| The general filesystem model used for the NFS version 4 protocol is | | The general filesystem model used for the NFS version 4 protocol is | |
| the same as previous versions. The server filesystem is hierarchical | | the same as previous versions. The server filesystem is hierarchical | |
| with the regular files contained within being treated as opaque byte | | with the regular files contained within being treated as opaque byte | |
| streams. In a slight departure, file and directory names are encoded | | streams. In a slight departure, file and directory names are encoded | |
| with UTF-8 to deal with the basics of internationalization. | | with UTF-8 to deal with the basics of internationalization. | |
| | | | |
| The NFS version 4 protocol does not require a separate protocol to | | The NFS version 4 protocol does not require a separate protocol to | |
| provide for the initial mapping between path name and filehandle. | | provide for the initial mapping between path name and filehandle. | |
| Instead of using the older MOUNT protocol for this mapping, the | | Instead of using the older MOUNT protocol for this mapping, the | |
| server provides a ROOT filehandle that represents the logical root or | | server provides a ROOT filehandle that represents the logical root or | |
| top of the filesystem tree provided by the server. The server | | top of the filesystem tree provided by the server. The server | |
| provides multiple filesystems by gluing them together with pseudo | | provides multiple filesystems by gluing them together with pseudo | |
| filesystems. These pseudo filesystems provide for potential gaps in | | filesystems. These pseudo filesystems provide for potential gaps in | |
| the path names between real filesystems. | | the path names between real filesystems. | |
| | | | |
|
| 1.3.3.1. Filehandle Types | | 1.4.3.1. Filehandle Types | |
| | | | |
| In previous versions of the NFS protocol, the filehandle provided by | | In previous versions of the NFS protocol, the filehandle provided by | |
| the server was guaranteed to be valid or persistent for the lifetime | | the server was guaranteed to be valid or persistent for the lifetime | |
| of the filesystem object to which it referred. For some server | | of the filesystem object to which it referred. For some server | |
| implementations, this persistence requirement has been difficult to | | implementations, this persistence requirement has been difficult to | |
| meet. For the NFS version 4 protocol, this requirement has been | | meet. For the NFS version 4 protocol, this requirement has been | |
| relaxed by introducing another type of filehandle, volatile. With | | relaxed by introducing another type of filehandle, volatile. With | |
| persistent and volatile filehandle types, the server implementation | | persistent and volatile filehandle types, the server implementation | |
| can match the abilities of the filesystem at the server along with | | can match the abilities of the filesystem at the server along with | |
| the operating environment. The client will have knowledge of the | | the operating environment. The client will have knowledge of the | |
| type of filehandle being provided by the server and can be prepared | | type of filehandle being provided by the server and can be prepared | |
| to deal with the semantics of each. | | to deal with the semantics of each. | |
| | | | |
|
| 1.3.3.2. Attribute Types | | 1.4.3.2. Attribute Types | |
| | | | |
| The NFS version 4 protocol introduces three classes of filesystem or | | The NFS version 4 protocol introduces three classes of filesystem or | |
| file attributes. Like the additional filehandle type, the | | file attributes. Like the additional filehandle type, the | |
| classification of file attributes has been done to ease server | | classification of file attributes has been done to ease server | |
| implementations along with extending the overall functionality of the | | implementations along with extending the overall functionality of the | |
| NFS protocol. This attribute model is structured to be extensible | | NFS protocol. This attribute model is structured to be extensible | |
| such that new attributes can be introduced in minor revisions of the | | such that new attributes can be introduced in minor revisions of the | |
| protocol without requiring significant rework. | | protocol without requiring significant rework. | |
| | | | |
| The three classifications are: mandatory, recommended and named | | The three classifications are: mandatory, recommended and named | |
| | | | |
| skipping to change at page 12, line 46 | | skipping to change at page 13, line 5 | |
| directory or file and referred to by a string name. Named attributes | | directory or file and referred to by a string name. Named attributes | |
| are meant to be used by client applications as a method to associate | | are meant to be used by client applications as a method to associate | |
| application specific data with a regular file or directory. | | application specific data with a regular file or directory. | |
| | | | |
| One significant addition to the recommended set of file attributes is | | One significant addition to the recommended set of file attributes is | |
| the Access Control List (ACL) attribute. This attribute provides for | | the Access Control List (ACL) attribute. This attribute provides for | |
| directory and file access control beyond the model used in previous | | directory and file access control beyond the model used in previous | |
| versions of the NFS protocol. The ACL definition allows for | | versions of the NFS protocol. The ACL definition allows for | |
| specification of user and group level access control. | | specification of user and group level access control. | |
| | | | |
|
| 1.3.3.3. Filesystem Replication and Migration | | 1.4.3.3. Filesystem Replication and Migration | |
| | | | |
| With the use of a special file attribute, the ability to migrate or | | With the use of a special file attribute, the ability to migrate or | |
| replicate server filesystems is enabled within the protocol. The | | replicate server filesystems is enabled within the protocol. The | |
| filesystem locations attribute provides a method for the client to | | filesystem locations attribute provides a method for the client to | |
| probe the server about the location of a filesystem. In the event of | | probe the server about the location of a filesystem. In the event of | |
| a migration of a filesystem, the client will receive an error when | | a migration of a filesystem, the client will receive an error when | |
| operating on the filesystem and it can then query as to the new file | | operating on the filesystem and it can then query as to the new file | |
| system location. Similar steps are used for replication, the client | | system location. Similar steps are used for replication, the client | |
| is able to query the server for the multiple available locations of a | | is able to query the server for the multiple available locations of a | |
| particular filesystem. From this information, the client can use its | | particular filesystem. From this information, the client can use its | |
| own policies to access the appropriate filesystem location. | | own policies to access the appropriate filesystem location. | |
| | | | |
|
| 1.3.4. OPEN and CLOSE | | 1.4.4. OPEN and CLOSE | |
| | | | |
| The NFS version 4 protocol introduces OPEN and CLOSE operations. The | | The NFS version 4 protocol introduces OPEN and CLOSE operations. The | |
| OPEN operation provides a single point where file lookup, creation, | | OPEN operation provides a single point where file lookup, creation, | |
| and share semantics can be combined. The CLOSE operation also | | and share semantics can be combined. The CLOSE operation also | |
| provides for the release of state accumulated by OPEN. | | provides for the release of state accumulated by OPEN. | |
| | | | |
|
| 1.3.5. File locking | | 1.4.5. File locking | |
| | | | |
| With the NFS version 4 protocol, the support for byte range file | | With the NFS version 4 protocol, the support for byte range file | |
| locking is part of the NFS protocol. The file locking support is | | locking is part of the NFS protocol. The file locking support is | |
| structured so that an RPC callback mechanism is not required. This | | structured so that an RPC callback mechanism is not required. This | |
| is a departure from the previous versions of the NFS file locking | | is a departure from the previous versions of the NFS file locking | |
| protocol, Network Lock Manager (NLM). The state associated with file | | protocol, Network Lock Manager (NLM). The state associated with file | |
| locks is maintained at the server under a lease-based model. The | | locks is maintained at the server under a lease-based model. The | |
| server defines a single lease period for all state held by an NFS | | server defines a single lease period for all state held by an NFS | |
| client. If the client does not renew its lease within the defined | | client. If the client does not renew its lease within the defined | |
| period, all state associated with the client's lease may be released | | period, all state associated with the client's lease may be released | |
| by the server. The client may renew its lease with use of the RENEW | | by the server. The client may renew its lease with use of the RENEW | |
| operation or implicitly by use of other operations (primarily READ). | | operation or implicitly by use of other operations (primarily READ). | |
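The lease-based model described above can be sketched on the server side as follows. This is a simplified illustration under stated assumptions: the LeaseTable class, its method names, and the injectable clock are hypothetical and do not appear in the protocol, and a real server may choose to keep expired state longer, since the text only says it "may be released".

```python
import time
from typing import Callable, Dict, List


class LeaseTable:
    """Tracks one lease per client; all of a client's state shares it."""

    def __init__(self, lease_period: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.lease_period = lease_period
        self._clock = clock
        self._last_renewal: Dict[str, float] = {}
        self._locks: Dict[str, List[str]] = {}

    def renew(self, client_id: str) -> None:
        # Explicit renewal, as a RENEW operation would trigger.
        self._last_renewal[client_id] = self._clock()
        self._locks.setdefault(client_id, [])

    def add_lock(self, client_id: str, byte_range: str) -> None:
        # Assumption: an operation that uses state also renews the lease.
        self.renew(client_id)
        self._locks[client_id].append(byte_range)

    def expired(self, client_id: str) -> bool:
        return self._clock() - self._last_renewal[client_id] > self.lease_period

    def release_expired(self) -> List[str]:
        # Drop all state for every client whose lease has lapsed.
        gone = [c for c in self._last_renewal if self.expired(c)]
        for client_id in gone:
            del self._last_renewal[client_id]
            del self._locks[client_id]
        return gone

    def locks(self, client_id: str) -> List[str]:
        return list(self._locks.get(client_id, []))
```

Passing the clock as a parameter keeps the sketch deterministic to exercise: a caller can advance a fake clock past the lease period and observe that only clients that failed to renew lose their locks.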
| | | | |
|
| 1.3.6. Client Caching and Delegation | | 1.4.6. Client Caching and Delegation | |
| | | | |
| The file, attribute, and directory caching for the NFS version 4 | | The file, attribute, and directory caching for the NFS version 4 | |
| protocol is similar to previous versions. Attributes and directory | | protocol is similar to previous versions. Attributes and directory | |
| information are cached for a duration determined by the client. At | | information are cached for a duration determined by the client. At | |
| the end of a predefined timeout, the client will query the server to | | the end of a predefined timeout, the client will query the server to | |
| see if the related filesystem object has been updated. | | see if the related filesystem object has been updated. | |
| | | | |
| For file data, the client checks its cache validity when the file is | | For file data, the client checks its cache validity when the file is | |
| opened. A query is sent to the server to determine if the file has | | opened. A query is sent to the server to determine if the file has | |
| been changed. Based on this information, the client determines if | | been changed. Based on this information, the client determines if | |
| | | | |
| skipping to change at page 14, line 4 | | skipping to change at page 14, line 17 | |
| | | | |
| The major addition to NFS version 4 in the area of caching is the | | The major addition to NFS version 4 in the area of caching is the | |
| ability of the server to delegate certain responsibilities to the | | ability of the server to delegate certain responsibilities to the | |
| client. When the server grants a delegation for a file to a client, | | client. When the server grants a delegation for a file to a client, | |
| the client is guaranteed certain semantics with respect to the | | the client is guaranteed certain semantics with respect to the | |
| sharing of that file with other clients. At OPEN, the server may | | sharing of that file with other clients. At OPEN, the server may | |
| provide the client either a read or write delegation for the file. | | provide the client either a read or write delegation for the file. | |
| If the client is granted a read delegation, it is assured that no | | If the client is granted a read delegation, it is assured that no | |
| other client has the ability to write to the file for the duration of | | other client has the ability to write to the file for the duration of | |
| the delegation. If the client is granted a write delegation, the | | the delegation. If the client is granted a write delegation, the | |
|
| | | | |
| Draft Specification NFS version 4 Protocol November 2002 | | | |
| | | | |
| client is assured that no other client has read or write access to | | client is assured that no other client has read or write access to | |
| the file. | | the file. | |
| | | | |
| Delegations can be recalled by the server. If another client | | Delegations can be recalled by the server. If another client | |
| requests access to the file in such a way that the access conflicts | | requests access to the file in such a way that the access conflicts | |
| with the granted delegation, the server is able to notify the initial | | with the granted delegation, the server is able to notify the initial | |
| client and recall the delegation. This requires that a callback path | | client and recall the delegation. This requires that a callback path | |
| exist between the server and client. If this callback path does not | | exist between the server and client. If this callback path does not | |
| exist, then delegations cannot be granted. The essence of a | | exist, then delegations cannot be granted. The essence of a | |
| delegation is that it allows the client to locally service operations | | delegation is that it allows the client to locally service operations | |
| such as OPEN, CLOSE, LOCK, LOCKU, READ, WRITE without immediate | | such as OPEN, CLOSE, LOCK, LOCKU, READ, WRITE without immediate | |
| interaction with the server. | | interaction with the server. | |
| | | | |
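The delegation guarantees described above imply a simple conflict rule for when a server has to recall a delegation. The following is a minimal sketch of that rule only; the names `Delegation`, `Access`, and `needs_recall` are hypothetical and do not come from the protocol, and a real server would also have to account for share reservations and the state of the callback path.

```python
from enum import Enum


class Delegation(Enum):
    """Kind of delegation currently held by some client (illustrative)."""
    READ = "read"
    WRITE = "write"


class Access(Enum):
    """Access requested by a different client (illustrative)."""
    READ = "read"
    WRITE = "write"


def needs_recall(held: Delegation, requested: Access) -> bool:
    """Return True if a request from another client conflicts with `held`.

    A write delegation promises that no other client has read or write
    access, so any access by another client conflicts with it. A read
    delegation only promises that no other client can write.
    """
    if held is Delegation.WRITE:
        return True
    return requested is Access.WRITE
```

Under this sketch, several clients may read a file covered by a read delegation without any recall, while the first conflicting write triggers one.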
|
| 1.4. General Definitions | | 1.5. General Definitions | |
| | | | |
| The following definitions are provided for the purpose of providing | | The following definitions are provided for the purpose of providing | |
| an appropriate context for the reader. | | an appropriate context for the reader. | |
| | | | |
| Client The "client" is the entity that accesses the NFS server's | | Client The "client" is the entity that accesses the NFS server's | |
| resources. The client may be an application which contains | | resources. The client may be an application which contains | |
| the logic to access the NFS server directly. The client | | the logic to access the NFS server directly. The client | |
| may also be the traditional operating system client remote | | may also be the traditional operating system client remote | |
| filesystem services for a set of applications. | | filesystem services for a set of applications. | |
| | | | |
| | | | |
| skipping to change at page 15, line 5 | | skipping to change at page 15, line 20 | |
| | | | |
| All leases granted by a server have the same fixed | | All leases granted by a server have the same fixed | |
| interval. Note that the fixed interval was chosen to | | interval. Note that the fixed interval was chosen to | |
| alleviate the expense a server would have in maintaining | | alleviate the expense a server would have in maintaining | |
| state about variable length leases across server failures. | | state about variable length leases across server failures. | |
| | | | |
| Lock The term "lock" is used to refer to both record (byte- | | Lock The term "lock" is used to refer to both record (byte- | |
| range) locks as well as share reservations unless | | range) locks as well as share reservations unless | |
| specifically stated otherwise. | | specifically stated otherwise. | |
| | | | |
|
| | | | |
| Server The "Server" is the entity responsible for coordinating | | Server The "Server" is the entity responsible for coordinating | |
| client access to a set of filesystems. | | client access to a set of filesystems. | |
| | | | |
| Stable Storage | | Stable Storage | |
| NFS version 4 servers must be able to recover without data | | NFS version 4 servers must be able to recover without data | |
| loss from multiple power failures (including cascading | | loss from multiple power failures (including cascading | |
| power failures, that is, several power failures in quick | | power failures, that is, several power failures in quick | |
| succession), operating system failures, and hardware | | succession), operating system failures, and hardware | |
| failure of components other than the storage medium itself | | failure of components other than the storage medium itself | |
| (for example, disk, nonvolatile RAM). | | (for example, disk, nonvolatile RAM). | |
| | | | |
| Some examples of stable storage that are allowable for an | | Some examples of stable storage that are allowable for an | |
| NFS server include: | | NFS server include: | |
| | | | |
| 1. Media commit of data, that is, the modified data has | | 1. Media commit of data, that is, the modified data has | |
|
| been successfully written to the disk media, | | been successfully written to the disk media, for | |
| for example, the disk platter. | | example, the disk platter. | |
| | | | |
|
| 2. An immediate reply disk drive with battery-backed | | 2. An immediate reply disk drive with battery-backed on- | |
| on-drive intermediate storage or uninterruptible power | | drive intermediate storage or uninterruptible power | |
| system (UPS). | | system (UPS). | |
| | | | |
| 3. Server commit of data with battery-backed intermediate | | 3. Server commit of data with battery-backed intermediate | |
| storage and recovery software. | | storage and recovery software. | |
| | | | |
|
| 4. Cache commit with uninterruptible power system (UPS) | | 4. Cache commit with uninterruptible power system (UPS) and | |
| and recovery software. | | recovery software. | |
| | | | |
| Stateid A 128-bit quantity returned by a server that uniquely | | Stateid A 128-bit quantity returned by a server that uniquely | |
| defines the open and locking state provided by the server | | defines the open and locking state provided by the server | |
| for a specific open or lock owner for a specific file. | | for a specific open or lock owner for a specific file. | |
| | | | |
| Stateids composed of all bits 0 or all bits 1 have special | | Stateids composed of all bits 0 or all bits 1 have special | |
| meaning and are reserved values. | | meaning and are reserved values. | |
| | | | |
| Verifier A 64-bit quantity generated by the client that the server | | Verifier A 64-bit quantity generated by the client that the server | |
| can use to determine if the client has restarted and lost | | can use to determine if the client has restarted and lost | |
| all previous lock state. | | all previous lock state. | |
| | | | |
|
| | | | |
| 2. Protocol Data Types | | 2. Protocol Data Types | |
| | | | |
| The syntax and semantics to describe the data types of the NFS | | The syntax and semantics to describe the data types of the NFS | |
| version 4 protocol are defined in the XDR [RFC1832] and RPC [RFC1831] | | version 4 protocol are defined in the XDR [RFC1832] and RPC [RFC1831] | |
| documents. The next sections build upon the XDR data types to define | | documents. The next sections build upon the XDR data types to define | |
| types and structures specific to this protocol. | | types and structures specific to this protocol. | |
| | | | |
| 2.1. Basic Data Types | | 2.1. Basic Data Types | |
| | | | |
| Data Type Definition | | Data Type Definition | |
| | | | |
| skipping to change at page 17, line 5 | | skipping to change at page 17, line 16 | |
| | | | |
| mode4 typedef uint32_t mode4; | | mode4 typedef uint32_t mode4; | |
| Mode attribute data type | | Mode attribute data type | |
| | | | |
| nfs_cookie4 typedef uint64_t nfs_cookie4; | | nfs_cookie4 typedef uint64_t nfs_cookie4; | |
| Opaque cookie value for READDIR | | Opaque cookie value for READDIR | |
| | | | |
| nfs_fh4 typedef opaque nfs_fh4<NFS4_FHSIZE>; | | nfs_fh4 typedef opaque nfs_fh4<NFS4_FHSIZE>; | |
| Filehandle definition; NFS4_FHSIZE is defined as 128 | | Filehandle definition; NFS4_FHSIZE is defined as 128 | |
| | | | |
|
| | | | |
| nfs_ftype4 enum nfs_ftype4; | | nfs_ftype4 enum nfs_ftype4; | |
| Various defined file types | | Various defined file types | |
| | | | |
| nfsstat4 enum nfsstat4; | | nfsstat4 enum nfsstat4; | |
| Return value for operations | | Return value for operations | |
| | | | |
| offset4 typedef uint64_t offset4; | | offset4 typedef uint64_t offset4; | |
| Various offset designations (READ, WRITE, | | Various offset designations (READ, WRITE, | |
| LOCK, COMMIT) | | LOCK, COMMIT) | |
| | | | |
| | | | |
| skipping to change at page 17, line 36 | | skipping to change at page 17, line 45 | |
| Instead contains an ASN.1 OBJECT IDENTIFIER as used | | Instead contains an ASN.1 OBJECT IDENTIFIER as used | |
| by GSS-API in the mech_type argument to | | by GSS-API in the mech_type argument to | |
| GSS_Init_sec_context. See [RFC2743] for details. | | GSS_Init_sec_context. See [RFC2743] for details. | |
| | | | |
| seqid4 typedef uint32_t seqid4; | | seqid4 typedef uint32_t seqid4; | |
| Sequence identifier used for file locking | | Sequence identifier used for file locking | |
| | | | |
| utf8string typedef opaque utf8string<>; | | utf8string typedef opaque utf8string<>; | |
| UTF-8 encoding for strings | | UTF-8 encoding for strings | |
| | | | |
|
| utf8str_cis typedef opaque utf8str_cis<>; | | utf8str_cis typedef opaque utf8str_cis; | |
| Case-insensitive UTF-8 string | | Case-insensitive UTF-8 string | |
| | | | |
|
| utf8str_cs typedef opaque utf8str_cs<>; | | utf8str_cs typedef opaque utf8str_cs; | |
| Case-sensitive UTF-8 string | | Case-sensitive UTF-8 string | |
|
| | | utf8str_mixed typedef opaque utf8str_mixed; | |
| utf8str_mixed typedef opaque utf8str_mixed<>; | | | |
| UTF-8 strings with a case sensitive prefix and | | UTF-8 strings with a case sensitive prefix and | |
| a case insensitive suffix. | | a case insensitive suffix. | |
| | | | |
| verifier4 typedef opaque verifier4[NFS4_VERIFIER_SIZE]; | | verifier4 typedef opaque verifier4[NFS4_VERIFIER_SIZE]; | |
| Verifier used for various operations (COMMIT, | | Verifier used for various operations (COMMIT, | |
| CREATE, OPEN, READDIR, SETCLIENTID, | | CREATE, OPEN, READDIR, SETCLIENTID, | |
| SETCLIENTID_CONFIRM, WRITE) NFS4_VERIFIER_SIZE is | | SETCLIENTID_CONFIRM, WRITE) NFS4_VERIFIER_SIZE is | |
| defined as 8. | | defined as 8. | |
| | | | |
| 2.2. Structured Data Types | | 2.2. Structured Data Types | |
| | | | |
| nfstime4 | | nfstime4 | |
| struct nfstime4 { | | struct nfstime4 { | |
|
| | | | |
| | | | |
| int64_t seconds; | | int64_t seconds; | |
| uint32_t nseconds; | | uint32_t nseconds; | |
| } | | } | |
| | | | |
|
| The nfstime4 structure gives the number of seconds and | | The nfstime4 structure gives the number of seconds and nanoseconds | |
| nanoseconds since midnight or 0 hour January 1, 1970 Coordinated | | since midnight or 0 hour January 1, 1970 Coordinated Universal Time | |
| Universal Time (UTC). Values greater than zero for the seconds | | (UTC). Values greater than zero for the seconds field denote dates | |
| field denote dates after the 0 hour January 1, 1970. Values | | after the 0 hour January 1, 1970. Values less than zero for the | |
| less than zero for the seconds field denote dates before the 0 | | seconds field denote dates before the 0 hour January 1, 1970. In | |
| hour January 1, 1970. In both cases, the nseconds field is to | | both cases, the nseconds field is to be added to the seconds field | |
| be added to the seconds field for the final time representation. | | for the final time representation. For example, if the time to be | |
| For example, if the time to be represented is one-half second | | represented is one-half second before 0 hour January 1, 1970, the | |
| before 0 hour January 1, 1970, the seconds field would have a | | seconds field would have a value of negative one (-1) and the | |
| value of negative one (-1) and the nseconds field would have a | | nseconds field would have a value of one-half second (500000000). | |
| value of one-half second (500000000). Values greater than | | Values greater than 999,999,999 for nseconds are considered invalid. | |
| 999,999,999 for nseconds are considered invalid. | | | |
| | | | |
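The rule that nseconds is always added to seconds, even for dates before 1970, amounts to a floored division of the total nanosecond count. The sketch below illustrates this; the helper names `to_nfstime4` and `is_valid_nseconds` are hypothetical and chosen only for this example.

```python
from typing import Tuple

NSEC_PER_SEC = 1_000_000_000


def to_nfstime4(total_nanoseconds: int) -> Tuple[int, int]:
    """Split a signed nanosecond offset from the 1970 epoch into
    (seconds, nseconds) such that seconds + nseconds / 1e9 is the time.

    Python's divmod floors toward negative infinity, so the remainder
    is never negative, which matches the nfstime4 convention.
    """
    return divmod(total_nanoseconds, NSEC_PER_SEC)


def is_valid_nseconds(nseconds: int) -> bool:
    """nseconds values greater than 999,999,999 are considered invalid."""
    return 0 <= nseconds <= 999_999_999


# One-half second before 0 hour January 1, 1970:
half_second_before_epoch = to_nfstime4(-500_000_000)
```

Here `half_second_before_epoch` is `(-1, 500000000)`, the example given in the text.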
|
| This data type is used to pass time and date information. A | | This data type is used to pass time and date information. A server | |
| server converts to and from its local representation of time | | converts to and from its local representation of time when processing | |
| when processing time values, preserving as much accuracy as | | time values, preserving as much accuracy as possible. If the | |
| possible. If the precision of timestamps stored for a filesystem | | precision of timestamps stored for a filesystem object is less than | |
| object is less than defined, loss of precision can occur. An | | defined, loss of precision can occur. An adjunct time maintenance | |
| adjunct time maintenance protocol is recommended to reduce | | protocol is recommended to reduce client and server time skew. | |
| client and server time skew. | | | |
| | | | |
| time_how4 | | time_how4 | |
| | | | |
| enum time_how4 { | | enum time_how4 { | |
| SET_TO_SERVER_TIME4 = 0, | | SET_TO_SERVER_TIME4 = 0, | |
| SET_TO_CLIENT_TIME4 = 1 | | SET_TO_CLIENT_TIME4 = 1 | |
| }; | | }; | |
|
| | | | |
| settime4 | | settime4 | |
| | | | |
| union settime4 switch (time_how4 set_it) { | | union settime4 switch (time_how4 set_it) { | |
| case SET_TO_CLIENT_TIME4: | | case SET_TO_CLIENT_TIME4: | |
| nfstime4 time; | | nfstime4 time; | |
| default: | | default: | |
| void; | | void; | |
| }; | | }; | |
| | | | |
|
| The above definitions are used as the attribute definitions to | | The above definitions are used as the attribute definitions to set | |
| set time values. If set_it is SET_TO_SERVER_TIME4, then the | | time values. If set_it is SET_TO_SERVER_TIME4, then the server uses | |
| server uses its local representation of time for the time value. | | its local representation of time for the time value. | |
| | | | |
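As an illustration of how the settime4 union might look on the wire, the sketch below encodes it following the general XDR rules for discriminated unions: a 4-byte big-endian discriminant, followed by the arm it selects. The function name `encode_settime4` is an assumption made for this example, and the layout is inferred from the XDR definitions above rather than quoted from the document.

```python
import struct

SET_TO_SERVER_TIME4 = 0
SET_TO_CLIENT_TIME4 = 1


def encode_settime4(set_it: int, seconds: int = 0, nseconds: int = 0) -> bytes:
    """XDR-encode a settime4 value (illustrative sketch).

    For SET_TO_CLIENT_TIME4 the discriminant is followed by an nfstime4:
    a signed 64-bit seconds field and an unsigned 32-bit nseconds field.
    Every other discriminant selects the void arm, so nothing follows it.
    """
    if set_it == SET_TO_CLIENT_TIME4:
        if not 0 <= nseconds <= 999_999_999:
            raise ValueError("nseconds must not exceed 999,999,999")
        return struct.pack(">iqI", set_it, seconds, nseconds)
    return struct.pack(">i", set_it)
```

With SET_TO_SERVER_TIME4 only the 4-byte discriminant is sent, and the server supplies the time value itself.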
| specdata4 | | specdata4 | |
| | | | |
| struct specdata4 { | | struct specdata4 { | |
| uint32_t specdata1; /* major device number */ | | uint32_t specdata1; /* major device number */ | |
|
| | | | |
| | | | |
| uint32_t specdata2; /* minor device number */ | | uint32_t specdata2; /* minor device number */ | |
| }; | | }; | |
| | | | |
|
| This data type represents additional information for the device | | This data type represents additional information for the device file | |
| file types NF4CHR and NF4BLK. | | types NF4CHR and NF4BLK. | |
| | | | |
| fsid4 | | fsid4 | |
| | | | |
| struct fsid4 { | | struct fsid4 { | |
| uint64_t major; | | uint64_t major; | |
| uint64_t minor; | | uint64_t minor; | |
| }; | | }; | |
| | | | |
|
| This type is the filesystem identifier that is used as a | | This type is the filesystem identifier that is used as a mandatory | |
| mandatory attribute. | | attribute. | |
| | | | |
| fs_location4 | | fs_location4 | |
| | | | |
| struct fs_location4 { | | struct fs_location4 { | |
| utf8str_cis server<>; | | utf8str_cis server<>; | |
| pathname4 rootpath; | | pathname4 rootpath; | |
| }; | | }; | |
| | | | |
| fs_locations4 | | fs_locations4 | |
| | | | |
| | | | |
| skipping to change at page 19, line 36 | | skipping to change at page 20, line 4 | |
| utf8str_cis server<>; | | utf8str_cis server<>; | |
| pathname4 rootpath; | | pathname4 rootpath; | |
| }; | | }; | |
| | | | |
| fs_locations4 | | fs_locations4 | |
| | | | |
| struct fs_locations4 { | | struct fs_locations4 { | |
| pathname4 fs_root; | | pathname4 fs_root; | |
| fs_location4 locations<>; | | fs_location4 locations<>; | |
| }; | | }; | |
|
| | | | |
| The fs_location4 and fs_locations4 data types are used for the | | The fs_location4 and fs_locations4 data types are used for the | |
|
| fs_locations recommended attribute which is used for migration | | fs_locations recommended attribute which is used for migration and | |
| and replication support. | | replication support. | |
| | | | |
| fattr4 | | fattr4 | |
| | | | |
| struct fattr4 { | | struct fattr4 { | |
| bitmap4 attrmask; | | bitmap4 attrmask; | |
| attrlist4 attr_vals; | | attrlist4 attr_vals; | |
| }; | | }; | |
| | | | |
| The fattr4 structure is used to represent file and directory | | The fattr4 structure is used to represent file and directory | |
| attributes. | | attributes. | |
| | | | |
|
| The bitmap is a counted array of 32 bit integers used to contain | | The bitmap is a counted array of 32 bit integers used to contain bit | |
| bit values. The position of the integer in the array that | | values. The position of the integer in the array that contains bit n | |
| contains bit n can be computed from the expression (n / 32) and | | can be computed from the expression (n / 32) and its bit within that | |
| its bit within that integer is (n mod 32). | | integer is (n mod 32). | |
| | | | |
| | | | |
| 0 1 | | 0 1 | |
| +-----------+-----------+-----------+-- | | +-----------+-----------+-----------+-- | |
| | count | 31 .. 0 | 63 .. 32 | | | | count | 31 .. 0 | 63 .. 32 | | |
| +-----------+-----------+-----------+-- | | +-----------+-----------+-----------+-- | |
| | | | |
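The bitmap indexing rule can be sketched as follows. The helpers `set_bit` and `has_bit` are hypothetical names, and the sketch assumes that bit 0 of each 32-bit integer is its least-significant bit, which is consistent with the "31 .. 0" labelling in the diagram above.

```python
from typing import List


def set_bit(bitmap: List[int], n: int) -> None:
    """Set attribute bit n, growing the word array if needed."""
    word, bit = divmod(n, 32)      # word index is n / 32, bit is n mod 32
    while len(bitmap) <= word:
        bitmap.append(0)
    bitmap[word] |= 1 << bit


def has_bit(bitmap: List[int], n: int) -> bool:
    """Return True if attribute bit n is set in the bitmap."""
    word, bit = divmod(n, 32)
    return word < len(bitmap) and bool(bitmap[word] & (1 << bit))


mask: List[int] = []
set_bit(mask, 0)    # lands in word 0, bit 0
set_bit(mask, 33)   # lands in word 1, bit 1
```

After these two calls `mask` holds two words, `[1, 2]`, and its length is what the count field of the encoded array would carry.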
| change_info4 | | change_info4 | |
| | | | |
| struct change_info4 { | | struct change_info4 { | |
| bool atomic; | | bool atomic; | |
| changeid4 before; | | changeid4 before; | |
| changeid4 after; | | changeid4 after; | |
| }; | | }; | |
| | | | |
| This structure is used with the CREATE, LINK, REMOVE, RENAME | | This structure is used with the CREATE, LINK, REMOVE, RENAME | |
|
| operations to let the client know the value of the change | | operations to let the client know the value of the change attribute | |
| attribute for the directory in which the target filesystem | | for the directory in which the target filesystem object resides. | |
| object resides. | | | |
| | | | |
| clientaddr4 | | clientaddr4 | |
| | | | |
| struct clientaddr4 { | | struct clientaddr4 { | |
| /* see struct rpcb in RFC1833 */ | | /* see struct rpcb in RFC1833 */ | |
| string r_netid<>; /* network id */ | | string r_netid<>; /* network id */ | |
| string r_addr<>; /* universal address */ | | string r_addr<>; /* universal address */ | |
| }; | | }; | |
| | | | |
| The clientaddr4 structure is used as part of the SETCLIENTID | | The clientaddr4 structure is used as part of the SETCLIENTID | |
|
| operation to either specify the address of the client that is | | operation to either specify the address of the client that is using a | |
| using a clientid or as part of the callback registration. The | | clientid or as part of the callback registration. The | |
| r_netid and r_addr fields are specified in [RFC1833], but they | | r_netid and r_addr fields are specified in [RFC1833], but they are | |
| are underspecified in [RFC1833] as far as what they should look | | underspecified in [RFC1833] as far as what they should look like for | |
| like for specific protocols. | | specific protocols. | |
| | | | |
|
| For TCP over IPv4 and for UDP over IPv4, the format of r_addr is | | For TCP over IPv4 and for UDP over IPv4, the format of r_addr is the | |
| the US-ASCII string: | | US-ASCII string: | |
| | | | |
| h1.h2.h3.h4.p1.p2 | | h1.h2.h3.h4.p1.p2 | |
| | | | |
| The prefix, "h1.h2.h3.h4", is the standard textual form for | | The prefix, "h1.h2.h3.h4", is the standard textual form for | |
| representing an IPv4 address, which is always four octets long. | | representing an IPv4 address, which is always four octets long. | |
|
| Assuming big-endian ordering, h1, h2, h3, and h4, are | | Assuming big-endian ordering, h1, h2, h3, and h4, are respectively, | |
| respectively, the first through fourth octets each converted to | | the first through fourth octets each converted to ASCII-decimal. | |
| ASCII-decimal. Assuming big-endian ordering, p1 and p2 are, | | Assuming big-endian ordering, p1 and p2 are, respectively, the first | |
| respectively, the first and second octets each converted to | | and second octets each converted to ASCII-decimal. For example, if a | |
| ASCII-decimal. For example, if a host, in big-endian order, has | | host, in big-endian order, has an address of 0x0A010307 and there is | |
| an address of 0x0A010307 and there is a service listening on, in | | a service listening on, in big endian order, port 0x020F (decimal | |
| big endian order, port 0x020F (decimal 527), then complete | | 527), then the complete universal address is "10.1.3.7.2.15". | |
| universal address is "10.1.3.7.2.15". | | | |
| | | | |
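The IPv4 universal address format can be reproduced with a few lines of code. This is a minimal sketch; the function name `universal_addr_v4` is hypothetical, and the address and port are taken as plain integers for simplicity.

```python
def universal_addr_v4(address: int, port: int) -> str:
    """Build the "h1.h2.h3.h4.p1.p2" form of an IPv4 universal address.

    The 32-bit address and the 16-bit port are both split into octets in
    big-endian order, and each octet is written in ASCII decimal.
    """
    octets = address.to_bytes(4, "big")
    p1, p2 = port.to_bytes(2, "big")
    return ".".join(str(value) for value in (*octets, p1, p2))


example = universal_addr_v4(0x0A010307, 0x020F)
```

For the address and port used in the text, `example` is `"10.1.3.7.2.15"`.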
| For TCP over IPv4 the value of r_netid is the string "tcp". For | | | |
| | | | |
| | | | |
|
| UDP over IPv4 the value of r_netid is the string "udp". | | For TCP over IPv4 the value of r_netid is the string "tcp". For UDP | |
| | | over IPv4 the value of r_netid is the string "udp". | |
| | | | |
|
| For TCP over IPv6 and for UDP over IPv6, the format of r_addr is | | For TCP over IPv6 and for UDP over IPv6, the format of r_addr is the | |
| the US-ASCII string: | | US-ASCII string: | |
| | | | |
| x1:x2:x3:x4:x5:x6:x7:x8.p1.p2 | | x1:x2:x3:x4:x5:x6:x7:x8.p1.p2 | |
| | | | |
|
| The suffix "p1.p2" is the service port, and is computed the same | | The suffix "p1.p2" is the service port, and is computed the same way | |
| way as with universal addresses for TCP and UDP over IPv4. The | | as with universal addresses for TCP and UDP over IPv4. The prefix, | |
| prefix, "x1:x2:x3:x4:x5:x6:x7:x8", is the standard textual form | | "x1:x2:x3:x4:x5:x6:x7:x8", is the standard textual form for | |
| for representing an IPv6 address as defined in Section 2.2 of | | representing an IPv6 address as defined in Section 2.2 of [RFC2373]. | |
| [RFC1884]. Additionally, the two alternative forms specified in | | Additionally, the two alternative forms specified in Section 2.2 of | |
| Section 2.2 of [RFC1884] are also acceptable. | | [RFC2373] are also acceptable. | |
| | | | |
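Because the port always occupies the last two dot-separated fields, a receiver can separate the host part from the port in the same way for IPv4 and IPv6 universal addresses. The sketch below shows one way to do this; `split_universal_addr` is a hypothetical helper, the sample addresses are purely illustrative, and the host part is returned as text without further validation.

```python
from typing import Tuple


def split_universal_addr(r_addr: str) -> Tuple[str, int]:
    """Split a universal address into (host text, port number).

    The last two dot-separated fields are p1 and p2, the high and low
    octets of the port. Everything before them is the host part, which
    may itself contain dots (IPv4, or an IPv6 form with an embedded
    IPv4 address) or colons (IPv6).
    """
    host, p1_text, p2_text = r_addr.rsplit(".", 2)
    p1, p2 = int(p1_text), int(p2_text)
    if not (0 <= p1 <= 255 and 0 <= p2 <= 255):
        raise ValueError("port octets must be in the range 0..255")
    return host, (p1 << 8) | p2
```

For instance, `split_universal_addr("1080:0:0:0:8:800:200C:417A.2.15")` yields the host text `"1080:0:0:0:8:800:200C:417A"` and port 527.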
|
| For TCP over IPv6 the value of r_netid is the string "tcp6". | | For TCP over IPv6 the value of r_netid is the string "tcp6". For UDP | |
| For UDP over IPv6 the value of r_netid is the string "udp6". | | over IPv6 the value of r_netid is the string "udp6". | |
| | | | |
| cb_client4 | | cb_client4 | |
| | | | |
| struct cb_client4 { | | struct cb_client4 { | |
| unsigned int cb_program; | | unsigned int cb_program; | |
| clientaddr4 cb_location; | | clientaddr4 cb_location; | |
| }; | | }; | |
| | | | |
|
| This structure is used by the client to inform the server of its | | This structure is used by the client to inform the server of its call | |
| call back address; includes the program number and client | | back address; includes the program number and client address. | |
| address. | | | |
| | | | |
| nfs_client_id4 | | nfs_client_id4 | |
| | | | |
| struct nfs_client_id4 { | | struct nfs_client_id4 { | |
| verifier4 verifier; | | verifier4 verifier; | |
| opaque id<NFS4_OPAQUE_LIMIT>; | | opaque id<NFS4_OPAQUE_LIMIT>; | |
| }; | | }; | |
| | | | |
|
| This structure is part of the arguments to the SETCLIENTID | | This structure is part of the arguments to the SETCLIENTID operation. | |
| operation. NFS4_OPAQUE_LIMIT is defined as 1024. | | NFS4_OPAQUE_LIMIT is defined as 1024. | |
| | | | |
| open_owner4 | | open_owner4 | |
| | | | |
| struct open_owner4 { | | struct open_owner4 { | |
| clientid4 clientid; | | clientid4 clientid; | |
| opaque owner<NFS4_OPAQUE_LIMIT>; | | opaque owner<NFS4_OPAQUE_LIMIT>; | |
| }; | | }; | |
| | | | |
| This structure is used to identify the owner of open state. | | This structure is used to identify the owner of open state. | |
| NFS4_OPAQUE_LIMIT is defined as 1024. | | NFS4_OPAQUE_LIMIT is defined as 1024. | |
| | | | |
|
| | | | |
| lock_owner4 | | lock_owner4 | |
| | | | |
| struct lock_owner4 { | | struct lock_owner4 { | |
| clientid4 clientid; | | clientid4 clientid; | |
| opaque owner<NFS4_OPAQUE_LIMIT>; | | opaque owner<NFS4_OPAQUE_LIMIT>; | |
| }; | | }; | |
| | | | |
|
| This structure is used to identify the owner of file locking | | This structure is used to identify the owner of file locking state. | |
| state. NFS4_OPAQUE_LIMIT is defined as 1024. | | NFS4_OPAQUE_LIMIT is defined as 1024. | |
| | | | |
| open_to_lock_owner4 | | open_to_lock_owner4 | |
| | | | |
| struct open_to_lock_owner4 { | | struct open_to_lock_owner4 { | |
| seqid4 open_seqid; | | seqid4 open_seqid; | |
| stateid4 open_stateid; | | stateid4 open_stateid; | |
| seqid4 lock_seqid; | | seqid4 lock_seqid; | |
| lock_owner4 lock_owner; | | lock_owner4 lock_owner; | |
| }; | | }; | |
| | | | |
| This structure is used for the first LOCK operation done for an | | This structure is used for the first LOCK operation done for an | |
|
| open_owner4. It provides both the open_stateid and lock_owner | | open_owner4. It provides both the open_stateid and lock_owner such | |
| such that the transition is made from a valid open_stateid | | that the transition is made from a valid open_stateid sequence to | |
| sequence to that of the new lock_stateid sequence. Using this | | that of the new lock_stateid sequence. Using this mechanism avoids | |
| mechanism avoids the confirmation of the lock_owner/lock_seqid | | the confirmation of the lock_owner/lock_seqid pair since it is tied | |
| pair since it is tied to established state in the form of the | | to established state in the form of the open_stateid/open_seqid. | |
| open_stateid/open_seqid. | | | |
| | | | |
| stateid4 | | stateid4 | |
| | | | |
| struct stateid4 { | | struct stateid4 { | |
| uint32_t seqid; | | uint32_t seqid; | |
| opaque other[12]; | | opaque other[12]; | |
| }; | | }; | |
| | | | |
| This structure is used for the various state sharing mechanisms | | This structure is used for the various state sharing mechanisms | |
|
| between the client and server. For the client, this data | | between the client and server. For the client, this data structure | |
| structure is read-only. The starting value of the seqid field | | is read-only. The starting value of the seqid field is undefined. | |
| is undefined. The server is required to increment the seqid | | The server is required to increment the seqid field monotonically at | |
| field monotonically at each transition of the stateid. This is | | each transition of the stateid. This is important since the client | |
| important since the client will inspect the seqid in OPEN | | will inspect the seqid in OPEN stateids to determine the order of | |
| stateids to determine the order of OPEN processing done by the | | OPEN processing done by the server. | |
| server. | | | |
| | | | |
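The stateid4 layout (a 4-byte seqid followed by 12 opaque bytes) adds up to the 128-bit quantity described in the general definitions, where the all-zeros and all-ones values are reserved. The following sketch packs a stateid under standard XDR encoding rules and checks for the reserved values; the function names are assumptions made for this example only.

```python
import struct


def encode_stateid4(seqid: int, other: bytes) -> bytes:
    """XDR-encode a stateid4: big-endian uint32 seqid, then opaque other[12].

    A fixed-length opaque of 12 bytes needs no XDR padding, so the result
    is always 16 bytes (128 bits).
    """
    if len(other) != 12:
        raise ValueError("other must be exactly 12 bytes")
    return struct.pack(">I12s", seqid, other)


def is_special_stateid(encoded: bytes) -> bool:
    """Return True for the reserved all-bits-0 and all-bits-1 stateids."""
    return encoded in (b"\x00" * 16, b"\xff" * 16)
```

A client would treat the encoded value as read-only and echo it back unchanged; only the server advances the seqid.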
| | | | |
| 3. RPC and Security Flavor | | 3. RPC and Security Flavor | |
| | | | |
| The NFS version 4 protocol is a Remote Procedure Call (RPC) | | The NFS version 4 protocol is a Remote Procedure Call (RPC) | |
| application that uses RPC version 2 and the corresponding eXternal | | application that uses RPC version 2 and the corresponding eXternal | |
| Data Representation (XDR) as defined in [RFC1831] and [RFC1832]. The | | Data Representation (XDR) as defined in [RFC1831] and [RFC1832]. The | |
| RPCSEC_GSS security flavor as defined in [RFC2203] MUST be used as | | RPCSEC_GSS security flavor as defined in [RFC2203] MUST be used as | |
| the mechanism to deliver stronger security for the NFS version 4 | | the mechanism to deliver stronger security for the NFS version 4 | |
| protocol. | | protocol. | |
| | | | |
| 3.1. Ports and Transports | | 3.1. Ports and Transports | |
| | | | |
| Historically, NFS version 2 and version 3 servers have resided on | | Historically, NFS version 2 and version 3 servers have resided on | |
|
| port 2049. The registered port 2049 [RFC1700] for the NFS protocol | | port 2049. The registered port 2049 [RFC3232] for the NFS protocol | |
| should be the default configuration. Using the registered port for | | should be the default configuration. Using the registered port for | |
| NFS services means the NFS client will not need to use the RPC | | NFS services means the NFS client will not need to use the RPC | |
| binding protocols as described in [RFC1833]; this will allow NFS to | | binding protocols as described in [RFC1833]; this will allow NFS to | |
| transit firewalls. | | transit firewalls. | |
| | | | |
| Where an NFS version 4 implementation supports operation over the IP | | Where an NFS version 4 implementation supports operation over the IP | |
| network protocol, the supported transports between NFS and IP MUST be | | network protocol, the supported transports between NFS and IP MUST be | |
| among the IETF-approved congestion control transport protocols, which | | among the IETF-approved congestion control transport protocols, which | |
| include TCP and SCTP. To enhance the possibilities for | | include TCP and SCTP. To enhance the possibilities for | |
| interoperability, an NFS version 4 implementation MUST support | | interoperability, an NFS version 4 implementation MUST support | |
| | | | |
| skipping to change at page 23, line 52 | | skipping to change at page 24, line 17 | |
| based. However, this modification of the authentication model does | | based. However, this modification of the authentication model does | |
| not imply a technical requirement to move the TCP connection | | not imply a technical requirement to move the TCP connection | |
| management model from whole machine-based to one based on a per user | | management model from whole machine-based to one based on a per user | |
| model. In particular, NFS over TCP client implementations have | | model. In particular, NFS over TCP client implementations have | |
| traditionally multiplexed traffic for multiple users over a common | | traditionally multiplexed traffic for multiple users over a common | |
| TCP connection between an NFS client and server. This has been true, | | TCP connection between an NFS client and server. This has been true, | |
| regardless of whether the NFS client is using AUTH_SYS, AUTH_DH, | | regardless of whether the NFS client is using AUTH_SYS, AUTH_DH, | |
| RPCSEC_GSS or any other flavor. Similarly, NFS over TCP server | | RPCSEC_GSS or any other flavor. Similarly, NFS over TCP server | |
| implementations have assumed such a model and thus scale the | | implementations have assumed such a model and thus scale the | |
| implementation of TCP connection management in proportion to the | | implementation of TCP connection management in proportion to the | |
|
| number of expected client machines. It is intended that NFS version 4 | | number of expected client machines. It is intended that NFS version | |
| will not modify this connection management model. NFS version 4 | | 4 will not modify this connection management model. NFS version 4 | |
| clients that violate this assumption can expect scaling issues on the | | clients that violate this assumption can expect scaling issues on the | |
| server and hence reduced service. | | server and hence reduced service. | |
| | | | |
| Note that for various timers, the client and server should avoid | | Note that for various timers, the client and server should avoid | |
| inadvertent synchronization of those timers. For further discussion | | inadvertent synchronization of those timers. For further discussion | |
|
| of the general issue refer to [Floyd]. | | of the general issue refer to [Floyd]. | |
| | | | |
| 3.1.1. Client Retransmission Behavior | | 3.1.1. Client Retransmission Behavior | |
| | | | |
| When processing a request received over a reliable transport such as | | When processing a request received over a reliable transport such as | |
| TCP, the NFS version 4 server MUST NOT silently drop the request, | | TCP, the NFS version 4 server MUST NOT silently drop the request, | |
| except if the transport connection has been broken. Given such a | | except if the transport connection has been broken. Given such a | |
| contract between NFS version 4 clients and servers, clients MUST NOT | | contract between NFS version 4 clients and servers, clients MUST NOT | |
| retry a request unless one or both of the following are true: | | retry a request unless one or both of the following are true: | |
| | | | |
| o The transport connection has been broken | | o The transport connection has been broken | |
| | | | |
| o The procedure being retried is the NULL procedure | | o The procedure being retried is the NULL procedure | |
| | | | |
| Since reliable transports, such as TCP, do not always synchronously | | Since reliable transports, such as TCP, do not always synchronously | |
| inform a peer when the other peer has broken the connection (for | | inform a peer when the other peer has broken the connection (for | |
|
| example, when an NFS server reboots), so the NFS version 4 client may | | example, when an NFS server reboots), the NFS version 4 client may | |
| want to actively "probe" the connection to see if it has been broken. | | want to actively "probe" the connection to see if it has been broken. | |
| Use of the NULL procedure is one recommended way to do so. So, when | | Use of the NULL procedure is one recommended way to do so. So, when | |
| a client experiences a remote procedure call timeout (of some | | a client experiences a remote procedure call timeout (of some | |
| arbitrary implementation specific amount), rather than retrying the | | arbitrary implementation specific amount), rather than retrying the | |
| remote procedure call, it could instead issue a NULL procedure call | | remote procedure call, it could instead issue a NULL procedure call | |
|
| to the server. If the server has died, the transport connection break | | to the server. If the server has died, the transport connection | |
| will eventually be indicated to the NFS version 4 client. The client | | break will eventually be indicated to the NFS version 4 client. The | |
| can then reconnect, and then retry the original request. If the NULL | | client can then reconnect, and then retry the original request. If | |
| procedure call gets a response, the connection has not broken. The | | the NULL procedure call gets a response, the connection has not | |
| client can decide to wait longer for the original request's response, | | broken. The client can decide to wait longer for the original | |
| or it can break the transport connection and reconnect before re- | | request's response, or it can break the transport connection and | |
| sending the original request. | | reconnect before re-sending the original request. | |
| | | | |
| For callbacks from the server to the client, the same rules apply, | | For callbacks from the server to the client, the same rules apply, | |
| but the server doing the callback becomes the client, and the client | | but the server doing the callback becomes the client, and the client | |
| receiving the callback becomes the server. | | receiving the callback becomes the server. | |
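The retry rule and the NULL-probe suggestion in section 3.1.1 can be sketched as a small decision function. This is only an illustration: the names `Step`, `may_retry`, `after_timeout`, and `after_probe` are hypothetical and do not come from the specification, and a real client would drive these decisions from its transport layer.

```python
from enum import Enum


class Step(Enum):
    # Hypothetical labels for the client's possible next actions.
    RETRY_ON_NEW_CONNECTION = "reconnect, then retry the original request"
    SEND_NULL_PROBE = "send a NULL procedure call on the existing connection"
    KEEP_WAITING = "keep waiting for the original reply"


def may_retry(connection_broken: bool, is_null_procedure: bool) -> bool:
    """A request may be retried only if the connection broke or it is NULL."""
    return connection_broken or is_null_procedure


def after_timeout(connection_broken: bool) -> Step:
    """One possible reaction to an RPC timeout on a reliable transport."""
    if connection_broken:
        return Step.RETRY_ON_NEW_CONNECTION
    # The original request must not be retried yet, so probe instead.
    return Step.SEND_NULL_PROBE


def after_probe(probe_answered: bool, connection_broken: bool) -> Step:
    """One possible reaction once the outcome of the NULL probe is known."""
    if connection_broken:
        return Step.RETRY_ON_NEW_CONNECTION
    if probe_answered:
        # The connection is alive. The client could instead choose to break
        # the connection itself and reconnect before re-sending.
        return Step.KEEP_WAITING
    # NULL is the one procedure that may be retried on an intact connection.
    return Step.SEND_NULL_PROBE
```

For callbacks the same sketch applies with the roles reversed: the server issuing the callback plays the client.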
| | | | |
| 3.2. Security Flavors | | 3.2. Security Flavors | |
| | | | |
| Traditional RPC implementations have included AUTH_NONE, AUTH_SYS, | | Traditional RPC implementations have included AUTH_NONE, AUTH_SYS, | |
| AUTH_DH, and AUTH_KRB4 as security flavors. With [RFC2203] an | | AUTH_DH, and AUTH_KRB4 as security flavors. With [RFC2203] an | |
| additional security flavor of RPCSEC_GSS has been introduced which | | additional security flavor of RPCSEC_GSS has been introduced which | |
| uses the functionality of GSS-API [RFC2743]. This allows for the use | | uses the functionality of GSS-API [RFC2743]. This allows for the use | |
| of various security mechanisms by the RPC layer without the | | of various security mechanisms by the RPC layer without the | |
| additional implementation overhead of adding RPC security flavors. | | additional implementation overhead of adding RPC security flavors. | |
| For NFS version 4, the RPCSEC_GSS security flavor MUST be used to | | For NFS version 4, the RPCSEC_GSS security flavor MUST be used to | |
| enable the mandatory security mechanism. Other flavors, such as | | enable the mandatory security mechanism. Other flavors, such as | |
| AUTH_NONE, AUTH_SYS, and AUTH_DH MAY be implemented as well. | | AUTH_NONE, AUTH_SYS, and AUTH_DH MAY be implemented as well. | |
| | | | |
| 3.2.1. Security mechanisms for NFS version 4 | | 3.2.1. Security mechanisms for NFS version 4 | |
| | | | |
| The use of RPCSEC_GSS requires selection of: mechanism, quality of | | The use of RPCSEC_GSS requires selection of: mechanism, quality of | |
|
| protection, and service (authentication, integrity, privacy). The | | protection, and service (authentication, integrity, privacy). The | |
| remainder of this document will refer to these three parameters of | | remainder of this document will refer to these three parameters of | |
| the RPCSEC_GSS security as the security triple. | | the RPCSEC_GSS security as the security triple. | |
| | | | |
| 3.2.1.1. Kerberos V5 as a security triple | | 3.2.1.1. Kerberos V5 as a security triple | |
| | | | |
| The Kerberos V5 GSS-API mechanism as described in [RFC1964] MUST be | | The Kerberos V5 GSS-API mechanism as described in [RFC1964] MUST be | |
| implemented and provide the following security triples. | | implemented and provide the following security triples. | |
| | | | |
| column descriptions: | | column descriptions: | |
| | | | |
| 1 == number of pseudo flavor | | 1 == number of pseudo flavor | |
| 2 == name of pseudo flavor | | 2 == name of pseudo flavor | |
| 3 == mechanism's OID | | 3 == mechanism's OID | |
| 4 == mechanism's algorithm(s) | | 4 == mechanism's algorithm(s) | |
| 5 == RPCSEC_GSS service | | 5 == RPCSEC_GSS service | |
| | | | |
| 1 2 3 4 5 | | 1 2 3 4 5 | |
|
| ----------------------------------------------------------------------- | | -------------------------------------------------------------------- | |
| 390003 krb5 1.2.840.113554.1.2.2 DES MAC MD5 rpc_gss_svc_none | | 390003 krb5 1.2.840.113554.1.2.2 DES MAC MD5 rpc_gss_svc_none | |
| 390004 krb5i 1.2.840.113554.1.2.2 DES MAC MD5 rpc_gss_svc_integrity | | 390004 krb5i 1.2.840.113554.1.2.2 DES MAC MD5 rpc_gss_svc_integrity | |
| 390005 krb5p 1.2.840.113554.1.2.2 DES MAC MD5 rpc_gss_svc_privacy | | 390005 krb5p 1.2.840.113554.1.2.2 DES MAC MD5 rpc_gss_svc_privacy | |
| for integrity, | | for integrity, | |
| and 56 bit DES | | and 56 bit DES | |
| for privacy. | | for privacy. | |
| | | | |
| Note that the pseudo flavor is presented here as a mapping aid to the | | Note that the pseudo flavor is presented here as a mapping aid to the | |
| implementor. Because this NFS protocol includes a method to | | implementor. Because this NFS protocol includes a method to | |
| negotiate security and it understands the GSS-API mechanism, the | | negotiate security and it understands the GSS-API mechanism, the | |
| | | | |
| skipping to change at page 26, line 4 | | skipping to change at page 26, line 26 | |
| migrate to the use of AES. | | migrate to the use of AES. | |
| | | | |
| 3.2.1.2. LIPKEY as a security triple | | 3.2.1.2. LIPKEY as a security triple | |
| | | | |
| The LIPKEY GSS-API mechanism as described in [RFC2847] MUST be | | The LIPKEY GSS-API mechanism as described in [RFC2847] MUST be | |
| implemented and provide the following security triples. The | | implemented and provide the following security triples. The | |
| definition of the columns matches the previous subsection "Kerberos | | definition of the columns matches the previous subsection "Kerberos | |
| V5 as security triple". | | V5 as security triple". | |
| | | | |
| 1 2 3 4 5 | | 1 2 3 4 5 | |
|
| ----------------------------------------------------------------------- | | -------------------------------------------------------------------- | |
| 390006 lipkey 1.3.6.1.5.5.9 negotiated rpc_gss_svc_none | | 390006 lipkey 1.3.6.1.5.5.9 negotiated rpc_gss_svc_none | |
| 390007 lipkey-i 1.3.6.1.5.5.9 negotiated rpc_gss_svc_integrity | | 390007 lipkey-i 1.3.6.1.5.5.9 negotiated rpc_gss_svc_integrity | |
| 390008 lipkey-p 1.3.6.1.5.5.9 negotiated rpc_gss_svc_privacy | | 390008 lipkey-p 1.3.6.1.5.5.9 negotiated rpc_gss_svc_privacy | |
| | | | |
| The mechanism algorithm is listed as "negotiated". This is because | | The mechanism algorithm is listed as "negotiated". This is because | |
| LIPKEY is layered on SPKM-3 and in SPKM-3 [RFC2847] the | | LIPKEY is layered on SPKM-3 and in SPKM-3 [RFC2847] the | |
| confidentiality and integrity algorithms are negotiated. Since | | confidentiality and integrity algorithms are negotiated. Since | |
| SPKM-3 specifies HMAC-MD5 for integrity as MANDATORY, 128 bit | | SPKM-3 specifies HMAC-MD5 for integrity as MANDATORY, 128 bit | |
| cast5CBC for confidentiality for privacy as MANDATORY, and further | | cast5CBC for confidentiality for privacy as MANDATORY, and further | |
| specifies that HMAC-MD5 and cast5CBC MUST be listed first before | | specifies that HMAC-MD5 and cast5CBC MUST be listed first before | |
| | | | |
| skipping to change at page 26, line 42 | | skipping to change at page 27, line 13 | |
| details. | | details. | |
| | | | |
| 3.2.1.3. SPKM-3 as a security triple | | 3.2.1.3. SPKM-3 as a security triple | |
| | | | |
| The SPKM-3 GSS-API mechanism as described in [RFC2847] MUST be | | The SPKM-3 GSS-API mechanism as described in [RFC2847] MUST be | |
| implemented and provide the following security triples. The | | implemented and provide the following security triples. The | |
| definition of the columns matches the previous subsection "Kerberos | | definition of the columns matches the previous subsection "Kerberos | |
| V5 as security triple". | | V5 as security triple". | |
| | | | |
| 1 2 3 4 5 | | 1 2 3 4 5 | |
|
| ----------------------------------------------------------------------- | | -------------------------------------------------------------------- | |
| 390009 spkm3 1.3.6.1.5.5.1.3 negotiated rpc_gss_svc_none | | 390009 spkm3 1.3.6.1.5.5.1.3 negotiated rpc_gss_svc_none | |
| 390010 spkm3i 1.3.6.1.5.5.1.3 negotiated rpc_gss_svc_integrity | | 390010 spkm3i 1.3.6.1.5.5.1.3 negotiated rpc_gss_svc_integrity | |
| 390011 spkm3p 1.3.6.1.5.5.1.3 negotiated rpc_gss_svc_privacy | | 390011 spkm3p 1.3.6.1.5.5.1.3 negotiated rpc_gss_svc_privacy | |
| | | | |
| For a discussion as to why the mechanism algorithm is listed as | | For a discussion as to why the mechanism algorithm is listed as | |
| "negotiated", see the previous section "LIPKEY as a security triple." | | "negotiated", see the previous section "LIPKEY as a security triple." | |
| | | | |
| Because SPKM-3 negotiates the algorithms, subsequent calls to SPKM- | | Because SPKM-3 negotiates the algorithms, subsequent calls to SPKM- | |
| 3's GSS_Wrap() and GSS_GetMIC() by RPCSEC_GSS will use a quality of | | 3's GSS_Wrap() and GSS_GetMIC() by RPCSEC_GSS will use a quality of | |
| protection value of 0 (zero). See section 5.2 of [RFC2025] for an | | protection value of 0 (zero). See section 5.2 of [RFC2025] for an | |
| explanation. | | explanation. | |
| | | | |
| Even though LIPKEY is layered over SPKM-3, SPKM-3 is specified as a | | Even though LIPKEY is layered over SPKM-3, SPKM-3 is specified as a | |
| mandatory set of triples to handle the situations where the initiator | | mandatory set of triples to handle the situations where the initiator | |
| (the client) is anonymous or where the initiator has its own | | (the client) is anonymous or where the initiator has its own | |
|
| certificate. If the initiator is anonymous, there will not be a user | | certificate. If the initiator is anonymous, there will not be a user | |
| name and password to send to the target (the server). If the | | name and password to send to the target (the server). If the | |
| initiator has its own certificate, then using passwords is | | initiator has its own certificate, then using passwords is | |
| superfluous. | | superfluous. | |
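The three pseudo flavor tables in sections 3.2.1.1 through 3.2.1.3 can be gathered into one lookup structure, which may help an implementor map between numbers, names, and security triples. The values below are copied from those tables; the Python names (`SecurityTriple`, `PSEUDO_FLAVORS`, `flavor_number`) are hypothetical and chosen only for this sketch.

```python
from typing import NamedTuple, Optional


class SecurityTriple(NamedTuple):
    name: str           # column 2: name of pseudo flavor
    mechanism_oid: str  # column 3: mechanism's OID
    algorithms: str     # column 4: mechanism's algorithm(s)
    service: str        # column 5: RPCSEC_GSS service


KRB5_OID = "1.2.840.113554.1.2.2"
LIPKEY_OID = "1.3.6.1.5.5.9"
SPKM3_OID = "1.3.6.1.5.5.1.3"
KRB5_ALGS = "DES MAC MD5 for integrity, and 56 bit DES for privacy"

# Column 1 (number of pseudo flavor) is the dictionary key.
PSEUDO_FLAVORS = {
    390003: SecurityTriple("krb5", KRB5_OID, KRB5_ALGS, "rpc_gss_svc_none"),
    390004: SecurityTriple("krb5i", KRB5_OID, KRB5_ALGS, "rpc_gss_svc_integrity"),
    390005: SecurityTriple("krb5p", KRB5_OID, KRB5_ALGS, "rpc_gss_svc_privacy"),
    390006: SecurityTriple("lipkey", LIPKEY_OID, "negotiated", "rpc_gss_svc_none"),
    390007: SecurityTriple("lipkey-i", LIPKEY_OID, "negotiated", "rpc_gss_svc_integrity"),
    390008: SecurityTriple("lipkey-p", LIPKEY_OID, "negotiated", "rpc_gss_svc_privacy"),
    390009: SecurityTriple("spkm3", SPKM3_OID, "negotiated", "rpc_gss_svc_none"),
    390010: SecurityTriple("spkm3i", SPKM3_OID, "negotiated", "rpc_gss_svc_integrity"),
    390011: SecurityTriple("spkm3p", SPKM3_OID, "negotiated", "rpc_gss_svc_privacy"),
}


def flavor_number(name: str) -> Optional[int]:
    """Return the pseudo flavor number for a name, or None if unknown."""
    for number, triple in PSEUDO_FLAVORS.items():
        if triple.name == name:
            return number
    return None
```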
| | | | |
| 3.3. Security Negotiation | | 3.3. Security Negotiation | |
| | | | |
| With the NFS version 4 server potentially offering multiple security | | With the NFS version 4 server potentially offering multiple security | |
| mechanisms, the client needs a method to determine or negotiate which | | mechanisms, the client needs a method to determine or negotiate which | |
| mechanism is to be used for its communication with the server. The | | mechanism is to be used for its communication with the server. The | |
| | | | |
| skipping to change at page 27, line 41 | | skipping to change at page 28, line 18 | |
| per filehandle basis, what security triple is to be used for server | | per filehandle basis, what security triple is to be used for server | |
| access. In general, the client will not have to use the SECINFO | | access. In general, the client will not have to use the SECINFO | |
| operation except during initial communication with the server or when | | operation except during initial communication with the server or when | |
| the client crosses policy boundaries at the server. It is possible | | the client crosses policy boundaries at the server. It is possible | |
| that the server's policies change during the client's interaction, | | that the server's policies change during the client's interaction, | |
| therefore forcing the client to negotiate a new security triple. | | therefore forcing the client to negotiate a new security triple. | |
| | | | |
| 3.3.2. Security Error | | 3.3.2. Security Error | |
| | | | |
| Based on the assumption that each NFS version 4 client and server | | Based on the assumption that each NFS version 4 client and server | |
|
| must support a minimum set of security (i.e. LIPKEY, SPKM-3, and | | must support a minimum set of security (i.e., LIPKEY, SPKM-3, and | |
| Kerberos-V5 all under RPCSEC_GSS), the NFS client will start its | | Kerberos-V5 all under RPCSEC_GSS), the NFS client will start its | |
| communication with the server with one of the minimal security | | communication with the server with one of the minimal security | |
| triples. During communication with the server, the client may | | triples. During communication with the server, the client may | |
| receive an NFS error of NFS4ERR_WRONGSEC. This error allows the | | receive an NFS error of NFS4ERR_WRONGSEC. This error allows the | |
| server to notify the client that the security triple currently being | | server to notify the client that the security triple currently being | |
| used is not appropriate for access to the server's filesystem | | used is not appropriate for access to the server's filesystem | |
| resources. The client is then responsible for determining what | | resources. The client is then responsible for determining what | |
| security triples are available at the server and choosing one which is | | security triples are available at the server and choosing one which is | |
| appropriate for the client. See the section for the "SECINFO" | | appropriate for the client. See the section for the "SECINFO" | |
| operation for further discussion of how the client will respond to | | operation for further discussion of how the client will respond to | |
| the NFS4ERR_WRONGSEC error and use SECINFO. | | the NFS4ERR_WRONGSEC error and use SECINFO. | |
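As a rough illustration of the client's side of this exchange, the sketch below picks a security triple after an NFS4ERR_WRONGSEC error. It assumes the client has already obtained the server's list of acceptable triples (for example through SECINFO) and that the list is ordered by server preference; both the ordering assumption and the names `choose_triple`, `server_offered`, and `client_supported` are hypothetical and not taken from this section.

```python
from typing import Iterable, Optional, Set


def choose_triple(server_offered: Iterable[str],
                  client_supported: Set[str]) -> Optional[str]:
    """Return the first offered triple the client also supports.

    server_offered is assumed to be ordered by server preference.
    None means there is no common triple and access cannot proceed.
    """
    for triple in server_offered:
        if triple in client_supported:
            return triple
    return None
```

A client would then re-establish its security context with the chosen triple and re-issue the operation that failed.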
| | | | |
|
| 3.4. Callback RPC Authentication | | 3.4. Callback RPC Authentication | |
| | | | |
| Except as noted elsewhere in this section, the callback RPC | | Except as noted elsewhere in this section, the callback RPC | |
| (described later) MUST mutually authenticate the NFS server to the | | (described later) MUST mutually authenticate the NFS server to the | |
| principal that acquired the clientid (also described later), using | | principal that acquired the clientid (also described later), using | |
| the security flavor the original SETCLIENTID operation used. | | the security flavor the original SETCLIENTID operation used. | |
| | | | |
| For AUTH_NONE, there are no principals, so this is a non-issue. | | For AUTH_NONE, there are no principals, so this is a non-issue. | |
| | | | |
| AUTH_SYS has no notions of mutual authentication or a server | | AUTH_SYS has no notions of mutual authentication or a server | |
| | | | |
| skipping to change at page 29, line 5 | | skipping to change at page 29, line 36 | |
| For Kerberos V5, nfs/hostname would be a server principal in the | | For Kerberos V5, nfs/hostname would be a server principal in the | |
| Kerberos Key Distribution Center database. This is the same | | Kerberos Key Distribution Center database. This is the same | |
| principal the client acquired a GSS-API context for when it issued | | principal the client acquired a GSS-API context for when it issued | |
| the SETCLIENTID operation; therefore, the realm name for the server | | the SETCLIENTID operation; therefore, the realm name for the server | |
| principal must be the same for the callback as it was for the | | principal must be the same for the callback as it was for the | |
| SETCLIENTID. | | SETCLIENTID. | |
| | | | |
| For LIPKEY, this would be the username passed to the target (the NFS | | For LIPKEY, this would be the username passed to the target (the NFS | |
| version 4 client that receives the callback). | | version 4 client that receives the callback). | |
| | | | |
|
| It should be noted that LIPKEY may not work for callbacks, since the | | It should be noted that LIPKEY may not work for callbacks, since the | |
| LIPKEY client uses a user id/password. If the NFS client receiving | | LIPKEY client uses a user id/password. If the NFS client receiving | |
| the callback can authenticate the NFS server's user name/password | | the callback can authenticate the NFS server's user name/password | |
| pair, and if the user that the NFS server is authenticating to has a | | pair, and if the user that the NFS server is authenticating to has a | |
| public key certificate, then it works. | | public key certificate, then it works. | |
| | | | |
| In situations where the NFS client uses LIPKEY and uses a per-host | | In situations where the NFS client uses LIPKEY and uses a per-host | |
| principal for the SETCLIENTID operation, instead of using LIPKEY for | | principal for the SETCLIENTID operation, instead of using LIPKEY for | |
| SETCLIENTID, it is RECOMMENDED that SPKM-3 with mutual authentication | | SETCLIENTID, it is RECOMMENDED that SPKM-3 with mutual authentication | |
| be used. This effectively means that the client will use a | | be used. This effectively means that the client will use a | |
| certificate to authenticate and identify the initiator to the target | | certificate to authenticate and identify the initiator to the target | |
| on the NFS server. Using SPKM-3 and not LIPKEY has the following | | on the NFS server. Using SPKM-3 and not LIPKEY has the following | |
| advantages: | | advantages: | |
| | | | |
| o When the server does a callback, it must authenticate to the | | o When the server does a callback, it must authenticate to the | |
| principal used in the SETCLIENTID. Even if LIPKEY is used, | | principal used in the SETCLIENTID. Even if LIPKEY is used, | |
|
| because LIPKEY is layered over SPKM-3, the NFS client will need | | because LIPKEY is layered over SPKM-3, the NFS client will need to | |
| to have a certificate that corresponds to the principal used in | | have a certificate that corresponds to the principal used in the | |
| the SETCLIENTID operation. From an administrative perspective, | | SETCLIENTID operation. From an administrative perspective, having | |
| having a user name, password, and certificate for both the | | a user name, password, and certificate for both the client and | |
| client and server is redundant. | | server is redundant. | |
| | | | |
| o LIPKEY was intended to minimize additional infrastructure | | o LIPKEY was intended to minimize additional infrastructure | |
| requirements beyond a certificate for the target, and the | | requirements beyond a certificate for the target, and the | |
| expectation is that existing password infrastructure can be | | expectation is that existing password infrastructure can be | |
| leveraged for the initiator. In some environments, a per-host | | leveraged for the initiator. In some environments, a per-host | |
| password does not exist yet. If certificates are used for any | | password does not exist yet. If certificates are used for any | |
| per-host principals, then additional password infrastructure is | | per-host principals, then additional password infrastructure is | |
| not needed. | | not needed. | |
| | | | |
| o In cases when a host is both an NFS client and server, it can | | o In cases when a host is both an NFS client and server, it can | |
| share the same per-host certificate. | | share the same per-host certificate. | |
| | | | |
|
| 4. Filehandles | | 4. Filehandles | |
| | | | |
| The filehandle in the NFS protocol is a per server unique identifier | | The filehandle in the NFS protocol is a per server unique identifier | |
| for a filesystem object. The contents of the filehandle are opaque | | for a filesystem object. The contents of the filehandle are opaque | |
| to the client. Therefore, the server is responsible for translating | | to the client. Therefore, the server is responsible for translating | |
| the filehandle to an internal representation of the filesystem | | the filehandle to an internal representation of the filesystem | |
| object. | | object. | |
| | | | |
| 4.1. Obtaining the First Filehandle | | 4.1. Obtaining the First Filehandle | |
| | | | |
| | | | |
| skipping to change at page 31, line 4 | | skipping to change at page 31, line 22 | |
| ROOT of the server's file tree. Once this PUTROOTFH operation is | | ROOT of the server's file tree. Once this PUTROOTFH operation is | |
| used, the client can then traverse the entirety of the server's file | | used, the client can then traverse the entirety of the server's file | |
| tree with the LOOKUP operation. A complete discussion of the server | | tree with the LOOKUP operation. A complete discussion of the server | |
| name space is in the section "NFS Server Name Space". | | name space is in the section "NFS Server Name Space". | |
| | | | |
| 4.1.2. Public Filehandle | | 4.1.2. Public Filehandle | |
| | | | |
| The second special filehandle is the PUBLIC filehandle. Unlike the | | The second special filehandle is the PUBLIC filehandle. Unlike the | |
| ROOT filehandle, the PUBLIC filehandle may be bound or represent an | | ROOT filehandle, the PUBLIC filehandle may be bound or represent an | |
| arbitrary filesystem object at the server. The server is responsible | | arbitrary filesystem object at the server. The server is responsible | |
|
| for this binding. It may be that the PUBLIC filehandle and the ROOT | | for this binding. It may be that the PUBLIC filehandle and the ROOT | |
| filehandle refer to the same filesystem object. However, it is up to | | filehandle refer to the same filesystem object. However, it is up to | |
| the administrative software at the server and the policies of the | | the administrative software at the server and the policies of the | |
| server administrator to define the binding of the PUBLIC filehandle | | server administrator to define the binding of the PUBLIC filehandle | |
| and server filesystem object. The client may not make any | | and server filesystem object. The client may not make any | |
|
| assumptions about this binding. The client uses the PUBLIC filehandle | | assumptions about this binding. The client uses the PUBLIC | |
| via the PUTPUBFH operation. | | filehandle via the PUTPUBFH operation. | |
| | | | |
| 4.2. Filehandle Types | | 4.2. Filehandle Types | |
| | | | |
| In the NFS version 2 and 3 protocols, there was one type of | | In the NFS version 2 and 3 protocols, there was one type of | |
| filehandle with a single set of semantics. This type of filehandle | | filehandle with a single set of semantics. This type of filehandle | |
| is termed "persistent" in NFS Version 4. The semantics of a | | is termed "persistent" in NFS Version 4. The semantics of a | |
| persistent filehandle remain the same as before. A new type of | | persistent filehandle remain the same as before. A new type of | |
| filehandle introduced in NFS Version 4 is the "volatile" filehandle, | | filehandle introduced in NFS Version 4 is the "volatile" filehandle, | |
| which attempts to accommodate certain server environments. | | which attempts to accommodate certain server environments. | |
| | | | |
| | | | |
| skipping to change at page 32, line 4 | | skipping to change at page 32, line 26 | |
| doing a byte-by-byte comparison. However, the client MUST NOT | | doing a byte-by-byte comparison. However, the client MUST NOT | |
| otherwise interpret the contents of filehandles. If two filehandles | | otherwise interpret the contents of filehandles. If two filehandles | |
| from the same server are equal, they MUST refer to the same file. | | from the same server are equal, they MUST refer to the same file. | |
| Servers SHOULD try to maintain a one-to-one correspondence between | | Servers SHOULD try to maintain a one-to-one correspondence between | |
| filehandles and files but this is not required. Clients MUST use | | filehandles and files but this is not required. Clients MUST use | |
| filehandle comparisons only to improve performance, not for correct | | filehandle comparisons only to improve performance, not for correct | |
| behavior. All clients need to be prepared for situations in which it | | behavior. All clients need to be prepared for situations in which it | |
| cannot be determined whether two filehandles denote the same object | | cannot be determined whether two filehandles denote the same object | |
| and in such cases, avoid making invalid assumptions which might cause | | and in such cases, avoid making invalid assumptions which might cause | |
| incorrect behavior. Further discussion of filehandle and attribute | | incorrect behavior. Further discussion of filehandle and attribute | |
|
| comparison in the context of data caching is presented in the section | | comparison in the context of data caching is presented in the section | |
| "Data Caching and File Identity". | | "Data Caching and File Identity". | |
| | | | |
| As an example, in the case that two different path names when | | As an example, in the case that two different path names when | |
| traversed at the server terminate at the same filesystem object, the | | traversed at the server terminate at the same filesystem object, the | |
| server SHOULD return the same filehandle for each path. This can | | server SHOULD return the same filehandle for each path. This can | |
| occur if a hard link is used to create two file names which refer to | | occur if a hard link is used to create two file names which refer to | |
| the same underlying file object and associated data. For example, if | | the same underlying file object and associated data. For example, if | |
| paths /a/b/c and /a/d/c refer to the same file, the server SHOULD | | paths /a/b/c and /a/d/c refer to the same file, the server SHOULD | |
| return the same filehandle for both path name traversals. | | return the same filehandle for both path name traversals. | |
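The comparison rule above is asymmetric: equal filehandles from the same server prove identity, while unequal filehandles prove nothing. A minimal sketch of a client-side helper that reflects this, using a hypothetical function name `same_object` and treating filehandles as opaque byte strings:

```python
from typing import Optional


def same_object(fh_a: bytes, fh_b: bytes) -> Optional[bool]:
    """Compare two filehandles obtained from the same server.

    Returns True when the handles are byte-for-byte equal, since they then
    refer to the same file. Returns None (unknown) otherwise, because a
    server is not required to keep a one-to-one correspondence between
    filehandles and files. The function never returns False.
    """
    if fh_a == fh_b:
        return True
    return None
```

Because the result is only ever "same" or "unknown", a caller can use it to skip redundant work but cannot rely on it for correctness.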
| | | | |
| skipping to change at page 32, line 37 | | skipping to change at page 33, line 7 | |
| server must honor the same filehandle as the old NFS server. | | server must honor the same filehandle as the old NFS server. | |
| | | | |
| The persistent filehandle will become stale or invalid when the | | The persistent filehandle will become stale or invalid when the | |
| filesystem object is removed. When the server is presented with a | | filesystem object is removed. When the server is presented with a | |
| persistent filehandle that refers to a deleted object, it MUST return | | persistent filehandle that refers to a deleted object, it MUST return | |
| an error of NFS4ERR_STALE. A filehandle may become stale when the | | an error of NFS4ERR_STALE. A filehandle may become stale when the | |
| filesystem containing the object is no longer available. The file | | filesystem containing the object is no longer available. The file | |
| system may become unavailable if it exists on removable media and the | | system may become unavailable if it exists on removable media and the | |
| media is no longer available at the server or the filesystem in whole | | media is no longer available at the server or the filesystem in whole | |
| has been destroyed or the filesystem has simply been removed from the | | has been destroyed or the filesystem has simply been removed from the | |
|
| server's name space (i.e. unmounted in a UNIX environment). | | server's name space (i.e., unmounted in a UNIX environment). | |
| | | | |
| 4.2.3. Volatile Filehandle | | 4.2.3. Volatile Filehandle | |
| | | | |
| A volatile filehandle does not share the same longevity | | A volatile filehandle does not share the same longevity | |
| characteristics as a persistent filehandle. The server may determine | | characteristics as a persistent filehandle. The server may determine | |
| that a volatile filehandle is no longer valid at many different | | that a volatile filehandle is no longer valid at many different | |
| points in time. If the server can definitively determine that a | | points in time. If the server can definitively determine that a | |
| volatile filehandle refers to an object that has been removed, the | | volatile filehandle refers to an object that has been removed, the | |
| server should return NFS4ERR_STALE to the client (as is the case for | | server should return NFS4ERR_STALE to the client (as is the case for | |
| persistent filehandles). In all other cases where the server | | persistent filehandles). In all other cases where the server | |
| determines that a volatile filehandle can no longer be used, it | | determines that a volatile filehandle can no longer be used, it | |
| should return an error of NFS4ERR_FHEXPIRED. | | should return an error of NFS4ERR_FHEXPIRED. | |
| | | | |
| The mandatory attribute "fh_expire_type" is used by the client to | | The mandatory attribute "fh_expire_type" is used by the client to | |
| determine what type of filehandle the server is providing for a | | determine what type of filehandle the server is providing for a | |
| particular filesystem. This attribute is a bitmask with the | | particular filesystem. This attribute is a bitmask with the | |
| following values: | | following values: | |
| | | | |
|
| Draft Specification NFS version 4 Protocol November 2002 | | | |
| | | | |
| FH4_PERSISTENT | | FH4_PERSISTENT | |
|
| The value of FH4_PERSISTENT is used to indicate a persistent | | The value of FH4_PERSISTENT is used to indicate a | |
| filehandle, which is valid until the object is removed from the | | persistent filehandle, which is valid until the object is | |
| filesystem. The server will not return NFS4ERR_FHEXPIRED for | | removed from the filesystem. The server will not return | |
| this filehandle. FH4_PERSISTENT is defined as a value in which | | NFS4ERR_FHEXPIRED for this filehandle. FH4_PERSISTENT is | |
| none of the bits specified below are set. | | defined as a value in which none of the bits specified | |
| | | below are set. | |
| | | | |
| FH4_VOLATILE_ANY | | FH4_VOLATILE_ANY | |
|
| The filehandle may expire at any time, except as specifically | | The filehandle may expire at any time, except as | |
| excluded (i.e. FH4_NOEXPIRE_WITH_OPEN). | | specifically excluded (i.e., FH4_NOEXPIRE_WITH_OPEN). | |
| | | | |
| FH4_NOEXPIRE_WITH_OPEN | | FH4_NOEXPIRE_WITH_OPEN | |
|
| May only be set when FH4_VOLATILE_ANY is set. If this bit is | | May only be set when FH4_VOLATILE_ANY is set. If this bit | |
| set, then the meaning of FH4_VOLATILE_ANY is qualified to | | is set, then the meaning of FH4_VOLATILE_ANY is qualified | |
| exclude any expiration of the filehandle when it is open. | | to exclude any expiration of the filehandle when it is | |
| | | open. | |
| | | | |
| FH4_VOL_MIGRATION | | FH4_VOL_MIGRATION | |
| The filehandle will expire as a result of migration. If | | The filehandle will expire as a result of migration. If | |
| FH4_VOLATILE_ANY is set, FH4_VOL_MIGRATION is redundant. | | FH4_VOLATILE_ANY is set, FH4_VOL_MIGRATION is redundant. | |
| | | | |
| FH4_VOL_RENAME | | FH4_VOL_RENAME | |
| The filehandle will expire during rename. This includes a | | The filehandle will expire during rename. This includes a | |
|
| rename by the requesting client or a rename by any other client. | | rename by the requesting client or a rename by any other | |
| If FH4_VOLATILE_ANY is set, FH4_VOL_RENAME is redundant. | | client. If FH4_VOLATILE_ANY is set, FH4_VOL_RENAME is | |
| | | redundant. | |
| | | | |
|
| Servers which provide volatile filehandles that may expire while | | Servers which provide volatile filehandles that may expire while open | |
| open (i.e. if FH4_VOL_MIGRATION or FH4_VOL_RENAME is set or if | | (i.e., if FH4_VOL_MIGRATION or FH4_VOL_RENAME is set or if | |
| FH4_VOLATILE_ANY is set and FH4_NOEXPIRE_WITH_OPEN is not set), | | FH4_VOLATILE_ANY is set and FH4_NOEXPIRE_WITH_OPEN is not set), should | |
| should deny a RENAME or REMOVE that would affect an OPEN file or | | deny a RENAME or REMOVE that would affect an OPEN file or any of the | |
| any of the components leading to the OPEN file. In addition, | | components leading to the OPEN file. In addition, the server should | |
| the server should deny all RENAME or REMOVE requests during the | | deny all RENAME or REMOVE requests during the grace period upon | |
| grace period upon server restart. | | server restart. | |
| | | | |
|
| Note that the bits FH4_VOL_MIGRATION and FH4_VOL_RENAME allow | | Note that the bits FH4_VOL_MIGRATION and FH4_VOL_RENAME allow the | |
| the client to determine that expiration has occurred whenever a | | client to determine that expiration has occurred whenever a specific | |
| specific event occurs, without an explicit filehandle expiration | | event occurs, without an explicit filehandle expiration error from | |
| error from the server. FH4_VOLATILE_ANY does not provide this form | | the server. FH4_VOLATILE_ANY does not provide this form of information. | |
| of information. In situations where the server will expire many, | | In situations where the server will expire many, but not all | |
| but not all filehandles upon migration (e.g. all but those that | | filehandles upon migration (e.g., all but those that are open), | |
| are open), FH4_VOLATILE_ANY (in this case with | | FH4_VOLATILE_ANY (in this case with FH4_NOEXPIRE_WITH_OPEN) is a | |
| FH4_NOEXPIRE_WITH_OPEN) is a better choice since the client may | | better choice since the client may not assume that all filehandles | |
| not assume that all filehandles will expire when migration | | will expire when migration occurs, and it is likely that additional | |
| occurs, and it is likely that additional expirations will occur | | expirations will occur (as a result of file CLOSE) that are separated | |
| (as a result of file CLOSE) that are separated in time from the | | in time from the migration event itself. | |
| migration event itself. | | | |
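The bitmask rules above can be sketched in Python as follows. This is an illustration only, not part of the protocol text: the numeric bit values are assumed from the protocol's XDR description and should be checked against it, and the helper names (is_persistent, is_well_formed, may_expire_while_open) are hypothetical.

```python
# Bit values assumed from the protocol's XDR description; verify before relying on them.
FH4_PERSISTENT = 0x00000000
FH4_NOEXPIRE_WITH_OPEN = 0x00000001
FH4_VOLATILE_ANY = 0x00000002
FH4_VOL_MIGRATION = 0x00000004
FH4_VOL_RENAME = 0x00000008

_ALL_BITS = (FH4_NOEXPIRE_WITH_OPEN | FH4_VOLATILE_ANY
             | FH4_VOL_MIGRATION | FH4_VOL_RENAME)


def is_persistent(fh_expire_type):
    """Persistent means none of the volatility bits are set."""
    return (fh_expire_type & _ALL_BITS) == 0


def is_well_formed(fh_expire_type):
    """FH4_NOEXPIRE_WITH_OPEN may only be set together with FH4_VOLATILE_ANY."""
    if fh_expire_type & FH4_NOEXPIRE_WITH_OPEN:
        return bool(fh_expire_type & FH4_VOLATILE_ANY)
    return True


def may_expire_while_open(fh_expire_type):
    """True when the server may expire a filehandle for a file that is open."""
    if fh_expire_type & (FH4_VOL_MIGRATION | FH4_VOL_RENAME):
        return True
    return (bool(fh_expire_type & FH4_VOLATILE_ANY)
            and not fh_expire_type & FH4_NOEXPIRE_WITH_OPEN)
```

A client could evaluate may_expire_while_open once per filesystem, after fetching fh_expire_type, to decide whether it needs a recovery path for open files.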
| | | | |
| 4.2.4. One Method of Constructing a Volatile Filehandle | | 4.2.4. One Method of Constructing a Volatile Filehandle | |
| | | | |
| A volatile filehandle, while opaque to the client, could contain: | | A volatile filehandle, while opaque to the client, could contain: | |
| | | | |
| [volatile bit = 1 | server boot time | slot | generation number] | | [volatile bit = 1 | server boot time | slot | generation number] | |
| | | | |
|
| o slot is an index in the server volatile filehandle table | | o slot is an index in the server volatile filehandle table | |
| | | | |
| o generation number is the generation number for the table | | o generation number is the generation number for the table | |
| entry/slot | | entry/slot | |
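One way to picture the layout above is as a fixed-width byte string. The sketch below is hypothetical: the text does not specify field widths or byte order, so the 1-byte flag, 8-byte boot time, and 4-byte slot and generation fields, as well as the function names, are assumptions made for illustration.

```python
import struct

# Assumed layout (big-endian, no padding):
# 1-byte volatile flag | 8-byte boot time | 4-byte slot | 4-byte generation
_FH_FORMAT = ">BQII"


def pack_volatile_fh(boot_time, slot, generation):
    """Build an opaque filehandle with the volatile bit set to 1."""
    return struct.pack(_FH_FORMAT, 1, boot_time, slot, generation)


def unpack_volatile_fh(fh):
    """Split a filehandle produced by pack_volatile_fh into its fields."""
    volatile, boot_time, slot, generation = struct.unpack(_FH_FORMAT, fh)
    return {
        "volatile": bool(volatile),
        "boot_time": boot_time,
        "slot": slot,
        "generation": generation,
    }
```

Only the server ever unpacks such a handle; the client treats the bytes as opaque and simply returns them unchanged.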
| | | | |
| When the client presents a volatile filehandle, the server makes the | | When the client presents a volatile filehandle, the server makes the | |
| following checks, which assume that the check for the volatile bit | | following checks, which assume that the check for the volatile bit | |
| has passed. If the server boot time is less than the current server | | has passed. If the server boot time is less than the current server | |
| boot time, return NFS4ERR_FHEXPIRED. If slot is out of range, return | | boot time, return NFS4ERR_FHEXPIRED. If slot is out of range, return | |
| NFS4ERR_BADHANDLE. If the generation number does not match, return | | NFS4ERR_BADHANDLE. If the generation number does not match, return | |
| | | | |
| skipping to change at page 34, line 56 | | skipping to change at page 35, line 37 | |
| However, in the case that the client itself is renaming the file and | | However, in the case that the client itself is renaming the file and | |
| the file is open, it is possible that the client may be able to | | the file is open, it is possible that the client may be able to | |
| recover. The client can determine the new path name based on the | | recover. The client can determine the new path name based on the | |
| processing of the rename request. The client can then regenerate the | | processing of the rename request. The client can then regenerate the | |
| new filehandle based on the new path name. The client could also use | | new filehandle based on the new path name. The client could also use | |
| the compound operation mechanism to construct a set of operations | | the compound operation mechanism to construct a set of operations | |
| like: | | like: | |
| RENAME A B | | RENAME A B | |
| LOOKUP B | | LOOKUP B | |
| GETFH | | GETFH | |
|
| | | | |
| Note that the COMPOUND procedure does not provide atomicity. This | | Note that the COMPOUND procedure does not provide atomicity. This | |
| example only reduces the overhead of recovering from an expired | | example only reduces the overhead of recovering from an expired | |
|
| filehandle. | | filehandle. | |
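The lack of atomicity can be illustrated with a toy evaluator. This is a sketch under stated assumptions, not the protocol's implementation: the directory is modeled as a Python dict, filehandles are strings derived from the name (so a rename yields a new handle), and run_compound is a hypothetical helper. It relies on the COMPOUND behavior of evaluating operations in order and stopping at the first failure, without undoing operations that already succeeded.

```python
NFS4_OK = "NFS4_OK"
NFS4ERR_NOENT = "NFS4ERR_NOENT"


def run_compound(namespace, ops):
    """Evaluate ops in order against a toy directory; stop at the first failure.

    namespace maps a name to a toy filehandle string. Effects of operations
    that succeeded before a failure are NOT rolled back.
    """
    results = []
    current_fh = None  # stand-in for the current filehandle
    for op in ops:
        name, args = op[0], op[1:]
        if name == "RENAME":
            old, new = args
            if old not in namespace:
                results.append((name, NFS4ERR_NOENT))
                break
            del namespace[old]
            namespace[new] = "fh:" + new  # toy model: handle changes on rename
            results.append((name, NFS4_OK))
        elif name == "LOOKUP":
            (target,) = args
            if target not in namespace:
                results.append((name, NFS4ERR_NOENT))
                break
            current_fh = namespace[target]
            results.append((name, NFS4_OK))
        elif name == "GETFH":
            results.append((name, NFS4_OK, current_fh))
        else:
            raise ValueError("unsupported operation: " + name)
    return results
```

Running RENAME A B, LOOKUP B, GETFH against {"A": "fh:A"} returns the new handle in a single round trip. If the LOOKUP names a missing entry instead, evaluation stops there and the rename remains in effect.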
| | | | |
|
| 5. File Attributes | | 5. File Attributes | |
| | | | |
| To meet the requirements of extensibility and increased | | To meet the requirements of extensibility and increased | |
| interoperability with non-UNIX platforms, attributes must be handled | | interoperability with non-UNIX platforms, attributes must be handled | |
| in a flexible manner. The NFS version 3 fattr3 structure contains a | | in a flexible manner. The NFS version 3 fattr3 structure contains a | |
| fixed list of attributes that not all clients and servers are able to | | fixed list of attributes that not all clients and servers are able to | |
| support or care about. The fattr3 structure can not be extended as | | support or care about. The fattr3 structure can not be extended as | |
| new needs arise and it provides no way to indicate non-support. With | | new needs arise and it provides no way to indicate non-support. With | |
| the NFS version 4 protocol, the client is able to query what attributes | | the NFS version 4 protocol, the client is able to query what attributes | |
| the server supports and construct requests with only those supported | | the server supports and construct requests with only those supported | |
| | | | |
| skipping to change at page 37, line 4 | | skipping to change at page 36, line 43 | |
| than by an NFS client implementation. NFS implementors are strongly | | than by an NFS client implementation. NFS implementors are strongly | |
| encouraged to define their new attributes as recommended attributes | | encouraged to define their new attributes as recommended attributes | |
| by bringing them to the IETF standards-track process. | | by bringing them to the IETF standards-track process. | |
| | | | |
| The set of attributes which are classified as mandatory is | | The set of attributes which are classified as mandatory is | |
| deliberately small since servers must do whatever it takes to support | | deliberately small since servers must do whatever it takes to support | |
| them. A server should support as many of the recommended attributes | | them. A server should support as many of the recommended attributes | |
| as possible but by their definition, the server is not required to | | as possible but by their definition, the server is not required to | |
| support all of them. Attributes are deemed mandatory if the data is | | support all of them. Attributes are deemed mandatory if the data is | |
| both needed by a large number of clients and is not otherwise | | both needed by a large number of clients and is not otherwise | |
|
| reasonably computable by the client when support is not provided on | | reasonably computable by the client when support is not provided on | |
| the server. | | the server. | |
| | | | |
| Note that the hidden directory returned by OPENATTR is a convenience | | Note that the hidden directory returned by OPENATTR is a convenience | |
| for protocol processing. The client should not make any assumptions | | for protocol processing. The client should not make any assumptions | |
| about the server's implementation of named attributes and whether the | | about the server's implementation of named attributes and whether the | |
| underlying filesystem at the server has a named attribute directory | | underlying filesystem at the server has a named attribute directory | |
| or not. Therefore, operations such as SETATTR and GETATTR on the | | or not. Therefore, operations such as SETATTR and GETATTR on the | |
| named attribute directory are undefined. | | named attribute directory are undefined. | |
| | | | |
| | | | |
| skipping to change at page 38, line 4 | | skipping to change at page 37, line 43 | |
| clients but the client is better positioned to decide whether and how to | | clients but the client is better positioned to decide whether and how to | |
| fabricate or construct an attribute or whether to do without the | | fabricate or construct an attribute or whether to do without the | |
| attribute. | | attribute. | |
| | | | |
| 5.3. Named Attributes | | 5.3. Named Attributes | |
| | | | |
| These attributes are not supported by direct encoding in the NFS | | These attributes are not supported by direct encoding in the NFS | |
| Version 4 protocol but are accessed by string names rather than | | Version 4 protocol but are accessed by string names rather than | |
| numbers and correspond to an uninterpreted stream of bytes which are | | numbers and correspond to an uninterpreted stream of bytes which are | |
| stored with the filesystem object. The name space for these | | stored with the filesystem object. The name space for these | |
|
| attributes may be accessed by using the OPENATTR operation. The | | attributes may be accessed by using the OPENATTR operation. The | |
| OPENATTR operation returns a filehandle for a virtual "attribute | | OPENATTR operation returns a filehandle for a virtual "attribute | |
| directory" and further perusal of the name space may be done using | | directory" and further perusal of the name space may be done using | |
| READDIR and LOOKUP operations on this filehandle. Named attributes | | READDIR and LOOKUP operations on this filehandle. Named attributes | |
| may then be examined or changed by normal READ and WRITE and CREATE | | may then be examined or changed by normal READ and WRITE and CREATE | |
| operations on the filehandles returned from READDIR and LOOKUP. | | operations on the filehandles returned from READDIR and LOOKUP. | |
| Named attributes may have attributes. | | Named attributes may have attributes. | |
| | | | |
| It is recommended that servers support arbitrary named attributes. A | | It is recommended that servers support arbitrary named attributes. A | |
| client should not depend on the ability to store any named attributes | | client should not depend on the ability to store any named attributes | |
| | | | |
| skipping to change at page 38, line 49 | | skipping to change at page 38, line 39 | |
| o The per server attribute is: | | o The per server attribute is: | |
| | | | |
| lease_time | | lease_time | |
| | | | |
| o The per filesystem attributes are: | | o The per filesystem attributes are: | |
| | | | |
| supp_attr, fh_expire_type, link_support, symlink_support, | | supp_attr, fh_expire_type, link_support, symlink_support, | |
| unique_handles, aclsupport, cansettime, case_insensitive, | | unique_handles, aclsupport, cansettime, case_insensitive, | |
| case_preserving, chown_restricted, files_avail, files_free, | | case_preserving, chown_restricted, files_avail, files_free, | |
| files_total, fs_locations, homogeneous, maxfilesize, maxname, | | files_total, fs_locations, homogeneous, maxfilesize, maxname, | |
|
| maxread, maxwrite, no_trunc, space_avail, space_free, | | maxread, maxwrite, no_trunc, space_avail, space_free, space_total, | |
| space_total, time_delta | | time_delta | |
| | | | |
| o The per filesystem object attributes are: | | o The per filesystem object attributes are: | |
| | | | |
| type, change, size, named_attr, fsid, rdattr_error, filehandle, | | type, change, size, named_attr, fsid, rdattr_error, filehandle, | |
| ACL, archive, fileid, hidden, maxlink, mimetype, mode, numlinks, | | ACL, archive, fileid, hidden, maxlink, mimetype, mode, numlinks, | |
| owner, owner_group, rawdev, space_used, system, time_access, | | owner, owner_group, rawdev, space_used, system, time_access, | |
| time_backup, time_create, time_metadata, time_modify, | | time_backup, time_create, time_metadata, time_modify, | |
|
| mounted_on_fileid | | mounted_on_fileid | |
| | | | |
| For quota_avail_hard, quota_avail_soft, and quota_used see their | | For quota_avail_hard, quota_avail_soft, and quota_used see their | |
| definitions below for the appropriate classification. | | definitions below for the appropriate classification. | |
| | | | |
|
| 5.5. Mandatory Attributes - Definitions | | 5.5. Mandatory Attributes - Definitions | |
| | | | |
| Name # DataType Access Description | | Name # DataType Access Description | |
| ___________________________________________________________________ | | ___________________________________________________________________ | |
| supp_attr 0 bitmap READ The bit vector which | | supp_attr 0 bitmap READ The bit vector which | |
| would retrieve all | | would retrieve all | |
| mandatory and | | mandatory and | |
| recommended attributes | | recommended attributes | |
| that are supported for | | that are supported for | |
| this object. The | | this object. The | |
| | | | |
| skipping to change at page 41, line 5 | | skipping to change at page 40, line 5 | |
| only if the filesystem | | only if the filesystem | |
| object can not be | | object can not be | |
| updated more | | updated more | |
| frequently than the | | frequently than the | |
| resolution of | | resolution of | |
| time_metadata. | | time_metadata. | |
| | | | |
| size 4 uint64 R/W The size of the object | | size 4 uint64 R/W The size of the object | |
| in bytes. | | in bytes. | |
| | | | |
|
| link_support 5 bool READ True, if the object's | | link_support 5 bool READ True, if the object's | |
| filesystem supports | | filesystem supports | |
| hard links. | | hard links. | |
| | | | |
| symlink_support 6 bool READ True, if the object's | | symlink_support 6 bool READ True, if the object's | |
| filesystem supports | | filesystem supports | |
| symbolic links. | | symbolic links. | |
| | | | |
| named_attr 7 bool READ True, if this object | | named_attr 7 bool READ True, if this object | |
| has named attributes. | | has named attributes. | |
| | | | |
| skipping to change at page 42, line 5 | | skipping to change at page 41, line 5 | |
| server in seconds. | | server in seconds. | |
| | | | |
| rdattr_error 11 enum READ Error returned from | | rdattr_error 11 enum READ Error returned from | |
| getattr during | | getattr during | |
| readdir. | | readdir. | |
| | | | |
| filehandle 19 nfs_fh4 READ The filehandle of this | | filehandle 19 nfs_fh4 READ The filehandle of this | |
| object (primarily for | | object (primarily for | |
| readdir requests). | | readdir requests). | |
| | | | |
|
| 5.6. Recommended Attributes - Definitions | | 5.6. Recommended Attributes - Definitions | |
| | | | |
| Name # Data Type Access Description | | Name # Data Type Access Description | |
|
| ______________________________________________________________________ | | _____________________________________________________________________ | |
| ACL 12 nfsace4<> R/W The access control | | ACL 12 nfsace4<> R/W The access control | |
| list for the object. | | list for the object. | |
| | | | |
| aclsupport 13 uint32 READ Indicates what types | | aclsupport 13 uint32 READ Indicates what types | |
|
| of ACLs are supported | | of ACLs are | |
| on the current | | supported on the | |
| filesystem. | | current filesystem. | |
| | | | |
| archive 14 bool R/W True, if this file | | archive 14 bool R/W True, if this file | |
| has been archived | | has been archived | |
| since the time of | | since the time of | |
| last modification | | last modification | |
| (deprecated in favor | | (deprecated in favor | |
| of time_backup). | | of time_backup). | |
| | | | |
| cansettime 15 bool READ True, if the server | | cansettime 15 bool READ True, if the server | |
|
| able to change the | | is able to change | |
| times for a | | the times for a | |
| filesystem object as | | filesystem object as | |
| specified in a | | specified in a | |
| SETATTR operation. | | SETATTR operation. | |
| | | | |
| case_insensitive 16 bool READ True, if filename | | case_insensitive 16 bool READ True, if filename | |
| comparisons on this | | comparisons on this | |
| filesystem are case | | filesystem are case | |
| insensitive. | | insensitive. | |
| | | | |
| case_preserving 17 bool READ True, if filename | | case_preserving 17 bool READ True, if filename | |
| | | | |
| skipping to change at page 43, line 5 | | skipping to change at page 42, line 7 | |
| with a file if the | | with a file if the | |
| caller is not a | | caller is not a | |
| privileged user (for | | privileged user (for | |
| example, "root" in | | example, "root" in | |
| UNIX operating | | UNIX operating | |
| environments or in | | environments or in | |
| Windows 2000 the | | Windows 2000 the | |
| "Take Ownership" | | "Take Ownership" | |
| privilege). | | privilege). | |
| | | | |
|
| fileid 20 uint64 READ A number uniquely | | fileid 20 uint64 READ A number uniquely | |
| identifying the file | | identifying the file | |
| within the | | within the | |
| filesystem. | | filesystem. | |
| | | | |
| files_avail 21 uint64 READ File slots available | | files_avail 21 uint64 READ File slots available | |
| to this user on the | | to this user on the | |
|
| filesystem containing | | filesystem | |
| this object - this | | containing this | |
| should be the | | object - this should | |
| smallest relevant | | be the smallest | |
| limit. | | relevant limit. | |
| | | | |
| files_free 22 uint64 READ Free file slots on | | files_free 22 uint64 READ Free file slots on | |
| the filesystem | | the filesystem | |
| containing this | | containing this | |
| object - this should | | object - this should | |
| be the smallest | | be the smallest | |
| relevant limit. | | relevant limit. | |
| | | | |
| files_total 23 uint64 READ Total file slots on | | files_total 23 uint64 READ Total file slots on | |
| the filesystem | | the filesystem | |
| containing this | | containing this | |
| object. | | object. | |
| | | | |
| fs_locations 24 fs_locations READ Locations where this | | fs_locations 24 fs_locations READ Locations where this | |
| filesystem may be | | filesystem may be | |
|
| found. If the server | | found. If the | |
| returns NFS4ERR_MOVED | | server returns | |
| | | NFS4ERR_MOVED | |
| as an error, this | | as an error, this | |
| attribute MUST be | | attribute MUST be | |
| supported. | | supported. | |
| | | | |
| hidden 25 bool R/W True, if the file is | | hidden 25 bool R/W True, if the file is | |
| considered hidden | | considered hidden | |
| with respect to the | | with respect to the | |
|
| Windows API? | | Windows API. | |
| | | | |
| homogeneous 26 bool READ True, if this | | homogeneous 26 bool READ True, if this | |
| object's filesystem | | object's filesystem | |
|
| is homogeneous, i.e. | | is homogeneous, | |
| are per filesystem | | i.e., are per | |
| | | filesystem | |
| attributes the same | | attributes the same | |
| for all filesystem's | | for all filesystem's | |
|
| objects. | | objects? | |
| | | | |
| maxfilesize 27 uint64 READ Maximum supported | | maxfilesize 27 uint64 READ Maximum supported | |
| file size for the | | file size for the | |
| filesystem of this | | filesystem of this | |
| object. | | object. | |
| | | | |
|
| maxlink 28 uint32 READ Maximum number of | | maxlink 28 uint32 READ Maximum number of | |
| links for this | | links for this | |
| object. | | object. | |
| | | | |
|
| maxname 29 uint32 READ Maximum filename size | | maxname 29 uint32 READ Maximum filename | |
| supported for this | | size supported for | |
| object. | | this object. | |
| | | | |
| maxread 30 uint64 READ Maximum read size | | maxread 30 uint64 READ Maximum read size | |
| supported for this | | supported for this | |
| object. | | object. | |
| | | | |
| maxwrite 31 uint64 READ Maximum write size | | maxwrite 31 uint64 READ Maximum write size | |
| supported for this | | supported for this | |
| object. This | | object. This | |
| attribute SHOULD be | | attribute SHOULD be | |
|
| supported if the file | | supported if the | |
| is writable. Lack of | | file is writable. | |
| this attribute can | | Lack of this | |
| | | attribute can | |
| lead to the client | | lead to the client | |
| either wasting | | either wasting | |
| bandwidth or not | | bandwidth or not | |
| receiving the best | | receiving the best | |
| performance. | | performance. | |
| | | | |
| mimetype 32 utf8<> R/W MIME body | | mimetype 32 utf8<> R/W MIME body | |
| type/subtype of this | | type/subtype of this | |
| object. | | object. | |
| | | | |
| | | | |
| skipping to change at page 45, line 5 | | skipping to change at page 44, line 16 | |
| to this object. | | to this object. | |
| | | | |
| owner 36 utf8<> R/W The string name of | | owner 36 utf8<> R/W The string name of | |
| the owner of this | | the owner of this | |
| object. | | object. | |
| | | | |
| owner_group 37 utf8<> R/W The string name of | | owner_group 37 utf8<> R/W The string name of | |
| the group ownership | | the group ownership | |
| of this object. | | of this object. | |
| | | | |
|
| quota_avail_hard 38 uint64 READ For definition see | | quota_avail_hard 38 uint64 READ For definition see | |
| "Quota Attributes" | | "Quota Attributes" | |
| section below. | | section below. | |
| | | | |
| quota_avail_soft 39 uint64 READ For definition see | | quota_avail_soft 39 uint64 READ For definition see | |
| "Quota Attributes" | | "Quota Attributes" | |
| section below. | | section below. | |
| | | | |
| quota_used 40 uint64 READ For definition see | | quota_used 40 uint64 READ For definition see | |
| "Quota Attributes" | | "Quota Attributes" | |
| section below. | | section below. | |
| | | | |
| rawdev 41 specdata4 READ Raw device | | rawdev 41 specdata4 READ Raw device | |
| identifier. UNIX | | identifier. UNIX | |
| device major/minor | | device major/minor | |
|
| node information. If | | node information. | |
| the value of type is | | If the value of | |
| not NF4BLK or NF4CHR, | | type is not | |
| | | NF4BLK or NF4CHR, | |
| the value returned | | the value returned | |
| SHOULD NOT be | | SHOULD NOT be | |
| considered useful. | | considered useful. | |
| | | | |
| space_avail 42 uint64 READ Disk space in bytes | | space_avail 42 uint64 READ Disk space in bytes | |
| available to this | | available to this | |
| user on the | | user on the | |
|
| filesystem containing | | filesystem | |
| this object - this | | containing this | |
| should be the | | object - this should | |
| smallest relevant | | be the smallest | |
| limit. | | relevant limit. | |
| | | | |
| space_free 43 uint64 READ Free disk space in | | space_free 43 uint64 READ Free disk space in | |
| bytes on the | | bytes on the | |
|
| filesystem containing | | filesystem | |
| this object - this | | containing this | |
| should be the | | object - this should | |
| smallest relevant | | be the smallest | |
| limit. | | relevant limit. | |
| | | | |
| space_total 44 uint64 READ Total disk space in | | space_total 44 uint64 READ Total disk space in | |
| bytes on the | | bytes on the | |
|
| filesystem containing | | filesystem | |
| this object. | | containing this | |
| | | object. | |
| | | | |
| space_used 45 uint64 READ Number of filesystem | | space_used 45 uint64 READ Number of filesystem | |
| bytes allocated to | | bytes allocated to | |
| this object. | | this object. | |
| | | | |
|
| system 46 bool R/W True, if this file is | | system 46 bool R/W True, if this file | |
| a "system" file with | | is a "system" file | |
| respect to the | | with respect to the | |
| Windows API? | | Windows API. | |
| | | | |
| time_access 47 nfstime4 READ The time of last | | time_access 47 nfstime4 READ The time of last | |
| access to the object | | access to the object | |
| by a read that was | | by a read that was | |
| satisfied by the | | satisfied by the | |
| server. | | server. | |
| | | | |
| time_access_set 48 settime4 WRITE Set the time of last | | time_access_set 48 settime4 WRITE Set the time of last | |
|
| access to the object. | | access to the | |
| SETATTR use only. | | object. SETATTR | |
| | | use only. | |
| | | | |
| time_backup 49 nfstime4 R/W The time of last | | time_backup 49 nfstime4 R/W The time of last | |
|
| backup of the object. | | backup of the | |
| | | object. | |
| | | | |
| time_create 50 nfstime4 R/W The time of creation | | time_create 50 nfstime4 R/W The time of creation | |
| of the object. This | | of the object. This | |
| attribute does not | | attribute does not | |
| have any relation to | | have any relation to | |
| the traditional UNIX | | the traditional UNIX | |
| file attribute | | file attribute | |
| "ctime" or "change | | "ctime" or "change | |
| time". | | time". | |
| | | | |
| | | | |
| skipping to change at page 46, line 53 | | skipping to change at page 46, line 20 | |
| time_modify 53 nfstime4 READ The time of last | | time_modify 53 nfstime4 READ The time of last | |
| modification to the | | modification to the | |
| object. | | object. | |
| | | | |
| time_modify_set 54 settime4 WRITE Set the time of last | | time_modify_set 54 settime4 WRITE Set the time of last | |
| modification to the | | modification to the | |
| object. SETATTR use | | object. SETATTR use | |
| only. | | only. | |
| | | | |
| mounted_on_fileid 55 uint64 READ Like fileid, but if | | mounted_on_fileid 55 uint64 READ Like fileid, but if | |
|
| the target filehandle | | the target | |
| is the root of a | | filehandle is the | |
| filesystem return the | | root of a filesystem | |
| fileid of the | | return the fileid of | |
| underlying directory. | | the underlying | |
| | | directory. | |
| | | | |
| 5.7. Time Access | | 5.7. Time Access | |
| | | | |
| As defined above, the time_access attribute represents the time of | | As defined above, the time_access attribute represents the time of | |
| last access to the object by a read that was satisfied by the server. | | last access to the object by a read that was satisfied by the server. | |
| The notion of what is an "access" depends on the server's operating | | The notion of what is an "access" depends on the server's operating | |
| environment and/or the server's filesystem semantics. For example, | | environment and/or the server's filesystem semantics. For example, | |
| for servers obeying POSIX semantics, time_access would be updated | | for servers obeying POSIX semantics, time_access would be updated | |
| only by the READLINK, READ, and READDIR operations and not any of the | | only by the READLINK, READ, and READDIR operations and not any of the | |
| operations that modify the content of the object. Of course, setting | | operations that modify the content of the object. Of course, setting | |
| | | | |
| skipping to change at page 48, line 4 | | skipping to change at page 47, line 35 | |
| storage, to serve as a means of identifying the users corresponding | | storage, to serve as a means of identifying the users corresponding | |
| to these security principals. When these local identifiers are | | to these security principals. When these local identifiers are | |
| translated to the form of the owner attribute associated with files | | translated to the form of the owner attribute associated with files | |
| created by such principals, they identify, in a common format, the | | created by such principals, they identify, in a common format, the | |
| users associated with each corresponding set of security principals. | | users associated with each corresponding set of security principals. | |
| | | | |
| The translation used to interpret owner and group strings is not | | The translation used to interpret owner and group strings is not | |
| specified as part of the protocol. This allows various solutions to | | specified as part of the protocol. This allows various solutions to | |
| be employed. For example, a local translation table may be consulted | | be employed. For example, a local translation table may be consulted | |
| that maps between a numeric id and the user@dns_domain syntax. A name | | that maps between a numeric id and the user@dns_domain syntax. A name | |
| service may also be used to accomplish the translation. A server may | | service may also be used to accomplish the translation. A server may | |
| provide a more general service, not limited by any particular | | provide a more general service, not limited by any particular | |
| translation (which would only translate a limited set of possible | | translation (which would only translate a limited set of possible | |
| strings) by storing the owner and owner_group attributes in local | | strings) by storing the owner and owner_group attributes in local | |
| storage without any translation or it may augment a translation | | storage without any translation or it may augment a translation | |
| method by storing the entire string for attributes for which no | | method by storing the entire string for attributes for which no | |
| translation is available while using the local representation for | | translation is available while using the local representation for | |
| those cases in which a translation is available. | | those cases in which a translation is available. | |
| | | | |
| Servers that do not provide support for all possible values of the | | Servers that do not provide support for all possible values of the | |
| | | | |
| skipping to change at page 48, line 49 | | skipping to change at page 48, line 28 | |
| server, the attribute value must be constructed without the "@". | | server, the attribute value must be constructed without the "@". | |
| Therefore, the absence of the @ from the owner or owner_group | | Therefore, the absence of the @ from the owner or owner_group | |
| attribute signifies that no translation was available at the sender | | attribute signifies that no translation was available at the sender | |
| and that the receiver of the attribute should not use that string as | | and that the receiver of the attribute should not use that string as | |
| a basis for translation into its own internal format. Even though | | a basis for translation into its own internal format. Even though | |
| the attribute value can not be translated, it may still be useful. | | the attribute value can not be translated, it may still be useful. | |
| In the case of a client, the attribute string may be used for local | | In the case of a client, the attribute string may be used for local | |
| display of ownership. | | display of ownership. | |
| | | | |
| To provide a greater degree of compatibility with previous versions | | To provide a greater degree of compatibility with previous versions | |
|
| of NFS (i.e. v2 and v3), which identified users and groups by 32-bit | | of NFS (i.e., v2 and v3), which identified users and groups by 32-bit | |
| unsigned uid's and gid's, owner and group strings that consist of | | unsigned uid's and gid's, owner and group strings that consist of | |
| decimal numeric values with no leading zeros can be given a special | | decimal numeric values with no leading zeros can be given a special | |
| interpretation by clients and servers which choose to provide such | | interpretation by clients and servers which choose to provide such | |
| support. The receiver may treat such a user or group string as | | support. The receiver may treat such a user or group string as | |
| representing the same user as would be represented by a v2/v3 uid or | | representing the same user as would be represented by a v2/v3 uid or | |
| gid having the corresponding numeric value. A server is not | | gid having the corresponding numeric value. A server is not | |
| obligated to accept such a string, but may return an NFS4ERR_BADOWNER | | obligated to accept such a string, but may return an NFS4ERR_BADOWNER | |
| instead. To avoid this mechanism being used to subvert user and | | instead. To avoid this mechanism being used to subvert user and | |
| group translation, so that a client might pass all of the owners and | | group translation, so that a client might pass all of the owners and | |
| groups in numeric form, a server SHOULD return an NFS4ERR_BADOWNER | | groups in numeric form, a server SHOULD return an NFS4ERR_BADOWNER | |
| error when there is a valid translation for the user or group | | error when there is a valid translation for the user or group | |
| designated in this way. In that case, the client must use the | | designated in this way. In that case, the client must use the | |
| appropriate name@domain string and not the special form for | | appropriate name@domain string and not the special form for | |
| compatibility. | | compatibility. | |
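
The string forms described above can be told apart mechanically. The following is a minimal sketch, not part of the protocol: the helper name classify_owner and its return tuples are hypothetical, and two points are assumptions, namely that a lone "0" counts as a numeric value with no leading zeros and that the last "@" separates the name from the DNS domain.

```python
import re

# Decimal digits with no leading zeros; a lone "0" is accepted by assumption.
_NUMERIC_ID = re.compile(r"0|[1-9][0-9]*")
UINT32_MAX = 0xFFFFFFFF  # v2/v3 uids and gids are 32-bit unsigned values


def classify_owner(value):
    """Classify an owner or owner_group string (hypothetical helper)."""
    if "@" in value:
        # Assumption: the last "@" separates the name from the DNS domain.
        name, _, domain = value.rpartition("@")
        return ("name", name, domain)
    if _NUMERIC_ID.fullmatch(value) and int(value) <= UINT32_MAX:
        # Special compatibility form: may be treated like a v2/v3 uid or gid.
        return ("numeric", int(value))
    # No "@" and not numeric: not a basis for translation, display only.
    return ("untranslated", value)
```

A receiver that gets ("numeric", n) may still decline the special form; per the text above, a server SHOULD answer NFS4ERR_BADOWNER when a valid name@domain translation exists for that id.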
| | | | |
| The owner string "nobody" may be used to designate an anonymous user, | | The owner string "nobody" may be used to designate an anonymous user, | |
| which will be associated with a file created by a security principal | | which will be associated with a file created by a security principal | |
| that cannot be mapped through normal means to the owner attribute. | | that cannot be mapped through normal means to the owner attribute. | |
| | | | |
| | | | |
| skipping to change at page 49, line 36 | | skipping to change at page 49, line 24 | |
| section "Internationalization". | | section "Internationalization". | |
| | | | |
| 5.10. Quota Attributes | | 5.10. Quota Attributes | |
| | | | |
| For the attributes related to filesystem quotas, the following | | For the attributes related to filesystem quotas, the following | |
| definitions apply: | | definitions apply: | |
| | | | |
| quota_avail_soft | | quota_avail_soft | |
| The value in bytes which represents the amount of additional | | The value in bytes which represents the amount of additional | |
| disk space that can be allocated to this file or directory | | disk space that can be allocated to this file or directory | |
|
| before the user may reasonably be warned. It is understood that | | before the user may reasonably be warned. It is understood | |
| this space may be consumed by allocations to other files or | | that this space may be consumed by allocations to other files | |
| directories though there is a rule as to which other files or | | or directories though there is a rule as to which other files | |
| directories. | | or directories. | |
| | | | |
| quota_avail_hard | | quota_avail_hard | |
|
| The value in bytes which represents the amount of additional disk | | The value in bytes which represents the amount of additional | |
| space beyond the current allocation that can be allocated to | | disk space beyond the current allocation that can be allocated | |
| this file or directory before further allocations will be | | to this file or directory before further allocations will be | |
| refused. It is understood that this space may be consumed by | | refused. It is understood that this space may be consumed by | |
| allocations to other files or directories. | | allocations to other files or directories. | |
| | | | |
| quota_used | | quota_used | |
|
| The value in bytes which represents the amount of disk space used | | The value in bytes which represents the amount of disk space | |
| by this file or directory and possibly a number of other similar | | used by this file or directory and possibly a number of other | |
| files or directories, where the set of "similar" meets at least | | similar files or directories, where the set of "similar" meets | |
| the criterion that allocating space to any file or directory in | | at least the criterion that allocating space to any file or | |
| the set will reduce the "quota_avail_hard" of every other file | | directory in the set will reduce the "quota_avail_hard" of | |
| or directory in the set. | | every other file or directory in the set. | |
| | | | |
|
| Note that there may be a number of distinct but overlapping sets | | Note that there may be a number of distinct but overlapping | |
| of files or directories for which a quota_used value is | | sets of files or directories for which a quota_used value is | |
| maintained. E.g. "all files with a given owner", "all files with | | maintained (e.g., "all files with a given owner", "all files | |
| a given group owner". etc. | | with a given group owner", etc.). | |
| | | | |
| The server is at liberty to choose any of those sets but should | | The server is at liberty to choose any of those sets but should | |
| do so in a repeatable way. The rule may be configured per- | | do so in a repeatable way. The rule may be configured per- | |
| filesystem or may be "choose the set with the smallest quota". | | filesystem or may be "choose the set with the smallest quota". | |
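
As an illustration of the "smallest quota" rule, the sketch below picks one set deterministically and derives the three quota attributes from it. The QuotaSet record, its field names, and the tie-break on the set name are assumptions made for this example; the protocol does not define how a server stores quota state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuotaSet:
    """Hypothetical server-side record for one quota set."""
    name: str
    hard_limit: int  # bytes
    soft_limit: int  # bytes
    used: int        # bytes


def choose_set(sets):
    # "Choose the set with the smallest quota"; ties are broken by name
    # so that the choice is repeatable.
    return min(sets, key=lambda s: (s.hard_limit, s.name))


def quota_attrs(sets):
    """Derive the quota attributes of a file from the sets it belongs to."""
    s = choose_set(sets)
    return {
        "quota_used": s.used,
        "quota_avail_soft": max(s.soft_limit - s.used, 0),
        "quota_avail_hard": max(s.hard_limit - s.used, 0),
    }
```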
| | | | |
| 5.11. Access Control Lists | | 5.11. Access Control Lists | |
| | | | |
| The NFS version 4 ACL attribute is an array of access control entries | | The NFS version 4 ACL attribute is an array of access control entries | |
| (ACE). Although the client can read and write the ACL attribute, | | (ACE). Although the client can read and write the ACL attribute, | |
| the NFSv4 model is that the server does all access control based on the | | the NFSv4 model is that the server does all access control based on the | |
| | | | |
| skipping to change at page 51, line 5 | | skipping to change at page 50, line 46 | |
| the requester are considered. Each ACE is processed until all of the | | the requester are considered. Each ACE is processed until all of the | |
| bits of the requester's access have been ALLOWED. Once a bit (see | | bits of the requester's access have been ALLOWED. Once a bit (see | |
| below) has been ALLOWED by an ACCESS_ALLOWED_ACE, it is no longer | | below) has been ALLOWED by an ACCESS_ALLOWED_ACE, it is no longer | |
| considered in the processing of later ACEs. If an ACCESS_DENIED_ACE | | considered in the processing of later ACEs. If an ACCESS_DENIED_ACE | |
| is encountered where the requester's access still has unALLOWED bits | | is encountered where the requester's access still has unALLOWED bits | |
| in common with the "access_mask" of the ACE, the request is denied. | | in common with the "access_mask" of the ACE, the request is denied. | |
| However, unlike the ALLOWED and DENIED ACE types, the ALARM and AUDIT | | However, unlike the ALLOWED and DENIED ACE types, the ALARM and AUDIT | |
| ACE types do not affect a requester's access, and instead are for | | ACE types do not affect a requester's access, and instead are for | |
| triggering events as a result of a requester's access attempt. | | triggering events as a result of a requester's access attempt. | |
| | | | |
| Therefore, all AUDIT and ALARM ACEs are processed until end of the | | Therefore, all AUDIT and ALARM ACEs are processed until end of the | |
| ACL. When the ACL is fully processed, if there are bits in the | | ACL. When the ACL is fully processed, if there are bits in the | |
| requester's mask that have not been considered, whether the server | | requester's mask that have not been considered, whether the server | |
| allows or denies the access is undefined. If there is a mode | | allows or denies the access is undefined. If there is a mode | |
| attribute on the file, then this cannot happen, since the mode's | | attribute on the file, then this cannot happen, since the mode's | |
| MODE4_*OTH bits will map to EVERYONE@ ACEs that unambiguously specify | | MODE4_*OTH bits will map to EVERYONE@ ACEs that unambiguously specify | |
| the requester's access. | | the requester's access. | |
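
The processing order described above can be sketched as a single pass over the ACL. This is an illustrative reading of the text, not a normative algorithm: the function name evaluate_acl, the tuple layout of an ACE, and the matches callback are assumptions, and an AUDIT or ALARM ACE is recorded here whenever its mask overlaps the requested bits.

```python
ACE4_ACCESS_ALLOWED_ACE_TYPE = 0x00000000
ACE4_ACCESS_DENIED_ACE_TYPE = 0x00000001
ACE4_SYSTEM_AUDIT_ACE_TYPE = 0x00000002
ACE4_SYSTEM_ALARM_ACE_TYPE = 0x00000003


def evaluate_acl(acl, requested, matches):
    """Return (decision, events) for a requested access mask.

    acl       -- ordered list of (ace_type, who, access_mask) tuples
    requested -- bitmask of the access bits the requester asks for
    matches   -- callable(who) -> bool, True if the ACE names the requester
    """
    remaining = requested  # bits that have not yet been ALLOWED
    decision = None
    events = []
    for ace_type, who, mask in acl:
        if not matches(who):
            continue
        if ace_type in (ACE4_SYSTEM_AUDIT_ACE_TYPE, ACE4_SYSTEM_ALARM_ACE_TYPE):
            # AUDIT and ALARM ACEs never change the decision and are
            # examined all the way to the end of the ACL.
            if mask & requested:
                events.append((ace_type, who))
            continue
        if decision is not None:
            continue  # access already decided; keep scanning for events only
        if ace_type == ACE4_ACCESS_ALLOWED_ACE_TYPE:
            remaining &= ~mask
            if remaining == 0:
                decision = "allowed"
        elif ace_type == ACE4_ACCESS_DENIED_ACE_TYPE and remaining & mask:
            decision = "denied"
    # Bits left over at the end of the ACL: the outcome is undefined.
    return (decision or "undefined", events)
```

The "undefined" result corresponds to the case the text says cannot occur when a mode attribute is present, because the EVERYONE@ ACEs derived from MODE4_*OTH settle every remaining bit.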
| | | | |
| The NFS version 4 ACL model is quite rich. Some server platforms may | | The NFS version 4 ACL model is quite rich. Some server platforms may | |
| provide access control functionality that goes beyond the UNIX-style | | provide access control functionality that goes beyond the UNIX-style | |
| | | | |
| skipping to change at page 52, line 5 | | skipping to change at page 51, line 47 | |
| in acemask4. | | in acemask4. | |
| | | | |
| ALARM Generate a system ALARM (system | | ALARM Generate a system ALARM (system | |
| dependent) when any access attempt is | | dependent) when any access attempt is | |
| made to a file or directory for the | | made to a file or directory for the | |
| access methods specified in acemask4. | | access methods specified in acemask4. | |
| | | | |
| A server need not support all of the above ACE types. The bitmask | | A server need not support all of the above ACE types. The bitmask | |
| constants used to represent the above definitions within the | | constants used to represent the above definitions within the | |
| aclsupport attribute are as follows: | | aclsupport attribute are as follows: | |
| | | | |
| const ACL4_SUPPORT_ALLOW_ACL = 0x00000001; | | const ACL4_SUPPORT_ALLOW_ACL = 0x00000001; | |
| const ACL4_SUPPORT_DENY_ACL = 0x00000002; | | const ACL4_SUPPORT_DENY_ACL = 0x00000002; | |
| const ACL4_SUPPORT_AUDIT_ACL = 0x00000004; | | const ACL4_SUPPORT_AUDIT_ACL = 0x00000004; | |
| const ACL4_SUPPORT_ALARM_ACL = 0x00000008; | | const ACL4_SUPPORT_ALARM_ACL = 0x00000008; | |
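
A client could decode an aclsupport value with a simple bit test. The sketch below reuses the four constants listed above; the function name and the short labels it returns are hypothetical.

```python
ACL4_SUPPORT_ALLOW_ACL = 0x00000001
ACL4_SUPPORT_DENY_ACL = 0x00000002
ACL4_SUPPORT_AUDIT_ACL = 0x00000004
ACL4_SUPPORT_ALARM_ACL = 0x00000008

_SUPPORT_BITS = (
    (ACL4_SUPPORT_ALLOW_ACL, "ALLOW"),
    (ACL4_SUPPORT_DENY_ACL, "DENY"),
    (ACL4_SUPPORT_AUDIT_ACL, "AUDIT"),
    (ACL4_SUPPORT_ALARM_ACL, "ALARM"),
)


def supported_ace_types(aclsupport):
    """List the ACE types a server advertises in its aclsupport bitmask."""
    return [name for bit, name in _SUPPORT_BITS if aclsupport & bit]
```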
|
| | | | |
| The semantics of the "type" field follow the descriptions provided | | The semantics of the "type" field follow the descriptions provided | |
| above. | | above. | |
| | | | |
| The constants used for the type field (acetype4) are as follows: | | The constants used for the type field (acetype4) are as follows: | |
| | | | |
| const ACE4_ACCESS_ALLOWED_ACE_TYPE = 0x00000000; | | const ACE4_ACCESS_ALLOWED_ACE_TYPE = 0x00000000; | |
| const ACE4_ACCESS_DENIED_ACE_TYPE = 0x00000001; | | const ACE4_ACCESS_DENIED_ACE_TYPE = 0x00000001; | |
| const ACE4_SYSTEM_AUDIT_ACE_TYPE = 0x00000002; | | const ACE4_SYSTEM_AUDIT_ACE_TYPE = 0x00000002; | |
| const ACE4_SYSTEM_ALARM_ACE_TYPE = 0x00000003; | | const ACE4_SYSTEM_ALARM_ACE_TYPE = 0x00000003; | |
| | | | |
| | | | |
| skipping to change at page 53, line 4 | | skipping to change at page 52, line 42 | |
| _______________________________________________________________ | | _______________________________________________________________ | |
| READ_DATA Permission to read the data of the file | | READ_DATA Permission to read the data of the file | |
| LIST_DIRECTORY Permission to list the contents of a | | LIST_DIRECTORY Permission to list the contents of a | |
| directory | | directory | |
| WRITE_DATA Permission to modify the file's data | | WRITE_DATA Permission to modify the file's data | |
| ADD_FILE Permission to add a new file to a | | ADD_FILE Permission to add a new file to a | |
| directory | | directory | |
| APPEND_DATA Permission to append data to a file | | APPEND_DATA Permission to append data to a file | |
| ADD_SUBDIRECTORY Permission to create a subdirectory in a | | ADD_SUBDIRECTORY Permission to create a subdirectory in a | |
| directory | | directory | |
| READ_NAMED_ATTRS Permission to read the named attributes | | READ_NAMED_ATTRS Permission to read the named attributes | |
| of a file | | of a file | |
| WRITE_NAMED_ATTRS Permission to write the named attributes | | WRITE_NAMED_ATTRS Permission to write the named attributes | |
| of a file | | of a file | |
| EXECUTE Permission to execute a file | | EXECUTE Permission to execute a file | |
| DELETE_CHILD Permission to delete a file or directory | | DELETE_CHILD Permission to delete a file or directory | |
| within a directory | | within a directory | |
| READ_ATTRIBUTES The ability to read basic attributes | | READ_ATTRIBUTES The ability to read basic attributes | |
| (non-acls) of a file | | (non-acls) of a file | |
| WRITE_ATTRIBUTES Permission to change basic attributes | | WRITE_ATTRIBUTES Permission to change basic attributes | |
| | | | |
| skipping to change at page 53, line 49 | | skipping to change at page 53, line 36 | |
| const ACE4_DELETE = 0x00010000; | | const ACE4_DELETE = 0x00010000; | |
| const ACE4_READ_ACL = 0x00020000; | | const ACE4_READ_ACL = 0x00020000; | |
| const ACE4_WRITE_ACL = 0x00040000; | | const ACE4_WRITE_ACL = 0x00040000; | |
| const ACE4_WRITE_OWNER = 0x00080000; | | const ACE4_WRITE_OWNER = 0x00080000; | |
| const ACE4_SYNCHRONIZE = 0x00100000; | | const ACE4_SYNCHRONIZE = 0x00100000; | |
| | | | |
| Server implementations need not provide the granularity of control | | Server implementations need not provide the granularity of control | |
| that is implied by this list of masks. For example, POSIX-based | | that is implied by this list of masks. For example, POSIX-based | |
| systems might not distinguish APPEND_DATA (the ability to append to a | | systems might not distinguish APPEND_DATA (the ability to append to a | |
| file) from WRITE_DATA (the ability to modify existing contents); both | | file) from WRITE_DATA (the ability to modify existing contents); both | |
|
| masks would be tied to a single ``write'' permission. When such a | | masks would be tied to a single "write" permission. When such a | |
| server returns attributes to the client, it would show both | | server returns attributes to the client, it would show both | |
| APPEND_DATA and WRITE_DATA if and only if the write permission is | | APPEND_DATA and WRITE_DATA if and only if the write permission is | |
| enabled. | | enabled. | |
| | | | |
| If a server receives a SETATTR request that it cannot accurately | | If a server receives a SETATTR request that it cannot accurately | |
| implement, it should err in the direction of more restricted | | implement, it should err in the direction of more restricted | |
| access. For example, suppose a server cannot distinguish overwriting | | access. For example, suppose a server cannot distinguish overwriting | |
| data from appending new data, as described in the previous paragraph. | | data from appending new data, as described in the previous paragraph. | |
| If a client submits an ACE where APPEND_DATA is set but WRITE_DATA is | | If a client submits an ACE where APPEND_DATA is set but WRITE_DATA is | |
| not (or vice versa), the server should reject the request with | | not (or vice versa), the server should reject the request with | |
| NFS4ERR_ATTRNOTSUPP. Nonetheless, if the ACE has type DENY, the | | NFS4ERR_ATTRNOTSUPP. Nonetheless, if the ACE has type DENY, the | |
| server may silently turn on the other bit, so that both APPEND_DATA | | server may silently turn on the other bit, so that both APPEND_DATA | |
| and WRITE_DATA are denied. | | and WRITE_DATA are denied. | |
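
For a server that has only a single write permission, the rule above might be applied to each incoming ACE as follows. This is a sketch under stated assumptions: the exception class AttrNotSupp stands in for an NFS4ERR_ATTRNOTSUPP reply, the function name is hypothetical, and the two mask values are taken from the protocol's full acemask4 constant list.

```python
ACE4_WRITE_DATA = 0x00000002
ACE4_APPEND_DATA = 0x00000004
ACE4_ACCESS_DENIED_ACE_TYPE = 0x00000001


class AttrNotSupp(Exception):
    """Stands in for an NFS4ERR_ATTRNOTSUPP reply (hypothetical class)."""


def normalize_write_mask(ace_type, access_mask):
    """Return the mask to store, or raise if it cannot be represented."""
    pair = ACE4_WRITE_DATA | ACE4_APPEND_DATA
    present = access_mask & pair
    if present in (0, pair):
        return access_mask  # both bits agree; nothing to adjust
    if ace_type == ACE4_ACCESS_DENIED_ACE_TYPE:
        # Erring toward more restricted access: deny both bits.
        return access_mask | pair
    raise AttrNotSupp("APPEND_DATA and WRITE_DATA cannot be set separately")
```

Widening a DENY mask only removes access, which is why it is the one case in which the silent adjustment is acceptable.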
| | | | |
| 5.11.3. ACE flag | | 5.11.3. ACE flag | |
| | | | |
| The "flag" field contains values based on the following descriptions. | | The "flag" field contains values based on the following descriptions. | |
| | | | |
| ACE4_FILE_INHERIT_ACE | | ACE4_FILE_INHERIT_ACE | |
|
| | | | |
| Can be placed on a directory and indicates that this ACE should be | | Can be placed on a directory and indicates that this ACE should be | |
| added to each new non-directory file created. | | added to each new non-directory file created. | |
| | | | |
| ACE4_DIRECTORY_INHERIT_ACE | | ACE4_DIRECTORY_INHERIT_ACE | |
|
| | | | |
| Can be placed on a directory and indicates that this ACE should be | | Can be placed on a directory and indicates that this ACE should be | |
| added to each new directory created. | | added to each new directory created. | |
| | | | |
| ACE4_INHERIT_ONLY_ACE | | ACE4_INHERIT_ONLY_ACE | |
|
| | | | |
| Can be placed on a directory but does not apply to the directory, | | Can be placed on a directory but does not apply to the directory, | |
|
| only to newly created files/directories as specified by the above two | | only to newly created files/directories as specified by the above | |
| flags. | | two flags. | |
| | | | |
| ACE4_NO_PROPAGATE_INHERIT_ACE | | ACE4_NO_PROPAGATE_INHERIT_ACE | |
|
| | | | |
| Can be placed on a directory. Normally when a new directory is | | Can be placed on a directory. Normally when a new directory is | |
| created and an ACE exists on the parent directory which is marked | | created and an ACE exists on the parent directory which is marked | |
|
| ACE4_DIRECTORY_INHERIT_ACE, two ACEs are placed on the new directory: | | ACE4_DIRECTORY_INHERIT_ACE, two ACEs are placed on the new | |
| one for the directory itself and one which is an inheritable ACE for | | directory: one for the directory itself and one which is an | |
| newly created directories. This flag tells the server to not place | | inheritable ACE for newly created directories. This flag tells | |
| an ACE on the newly created directory which is inheritable by | | the server to not place an ACE on the newly created directory | |
| subdirectories of the created directory. | | which is inheritable by subdirectories of the created directory. | |
| | | | |
| ACE4_SUCCESSFUL_ACCESS_ACE_FLAG | | ACE4_SUCCESSFUL_ACCESS_ACE_FLAG | |
| | | | |
| ACE4_FAILED_ACCESS_ACE_FLAG | | ACE4_FAILED_ACCESS_ACE_FLAG | |
|
| | | | |
| The ACE4_SUCCESSFUL_ACCESS_ACE_FLAG (SUCCESS) and | | The ACE4_SUCCESSFUL_ACCESS_ACE_FLAG (SUCCESS) and | |
| ACE4_FAILED_ACCESS_ACE_FLAG (FAILED) flag bits relate only to | | ACE4_FAILED_ACCESS_ACE_FLAG (FAILED) flag bits relate only to | |
| ACE4_SYSTEM_AUDIT_ACE_TYPE (AUDIT) and ACE4_SYSTEM_ALARM_ACE_TYPE | | ACE4_SYSTEM_AUDIT_ACE_TYPE (AUDIT) and ACE4_SYSTEM_ALARM_ACE_TYPE | |
|
| (ALARM) ACE types. If during the processing of the file's ACL, the | | (ALARM) ACE types. If during the processing of the file's ACL, | |
| server encounters an AUDIT or ALARM ACE that matches the principal | | the server encounters an AUDIT or ALARM ACE that matches the | |
| | | principal attempting the OPEN, the server notes that fact, and the | |
| | | presence, if any, of the SUCCESS and FAILED flags encountered in | |
| | | the AUDIT or ALARM ACE. Once the server completes the ACL | |
| attempting the OPEN, the server notes that fact, and the presence, if | | processing, and the share reservation processing, and the OPEN | |
| any, of the SUCCESS and FAILED flags encountered in the AUDIT or | | call, it then notes if the OPEN succeeded or failed. If the OPEN | |
| ALARM ACE. Once the server completes the ACL processing, and the | | succeeded, and if the SUCCESS flag was set for a matching AUDIT or | |
| share reservation processing, and the OPEN call, it then notes if the | | ALARM, then the appropriate AUDIT or ALARM event occurs. If the | |
| OPEN succeeded or failed. If the OPEN succeeded, and if the SUCCESS | | OPEN failed, and if the FAILED flag was set for the matching AUDIT | |
| flag was set for a matching AUDIT or ALARM, then the appropriate | | or ALARM, then the appropriate AUDIT or ALARM event occurs. | |
| AUDIT or ALARM event occurs. If the OPEN failed, and if the FAILED | | Clearly either or both of the SUCCESS or FAILED can be set, but if | |
| flag was set for the matching AUDIT or ALARM, then the appropriate | | neither is set, the AUDIT or ALARM ACE is not useful. | |
| AUDIT or ALARM event occurs. Clearly either or both of the SUCCESS | | | |
| or FAILED can be set, but if neither is set, the AUDIT or ALARM ACE | | | |
| is not useful. | | | |
| | | | |
| The previously described processing applies to that of the ACCESS | | The previously described processing applies to that of the ACCESS | |
|
| operation as well. The difference is that "success" or "failure" | | operation as well. The difference is that "success" or | |
| does not mean whether ACCESS returns NFS4_OK or not. Success means | | "failure" does not mean whether ACCESS returns NFS4_OK or not. | |
| whether ACCESS returns all requested and supported bits. Failure | | Success means whether ACCESS returns all requested and supported | |
| means whether ACCESS failed to return a bit that was requested and | | bits. Failure means whether ACCESS failed to return a bit that | |
| supported. | | was requested and supported. | |
| | | | |
| ACE4_IDENTIFIER_GROUP | | ACE4_IDENTIFIER_GROUP | |
|
| | | | |
| Indicates that the "who" refers to a GROUP as defined under UNIX. | | Indicates that the "who" refers to a GROUP as defined under UNIX. | |
| | | | |
| The bitmask constants used for the flag field are as follows: | | The bitmask constants used for the flag field are as follows: | |
| | | | |
| const ACE4_FILE_INHERIT_ACE = 0x00000001; | | const ACE4_FILE_INHERIT_ACE = 0x00000001; | |
| const ACE4_DIRECTORY_INHERIT_ACE = 0x00000002; | | const ACE4_DIRECTORY_INHERIT_ACE = 0x00000002; | |
| const ACE4_NO_PROPAGATE_INHERIT_ACE = 0x00000004; | | const ACE4_NO_PROPAGATE_INHERIT_ACE = 0x00000004; | |
| const ACE4_INHERIT_ONLY_ACE = 0x00000008; | | const ACE4_INHERIT_ONLY_ACE = 0x00000008; | |
| const ACE4_SUCCESSFUL_ACCESS_ACE_FLAG = 0x00000010; | | const ACE4_SUCCESSFUL_ACCESS_ACE_FLAG = 0x00000010; | |
| const ACE4_FAILED_ACCESS_ACE_FLAG = 0x00000020; | | const ACE4_FAILED_ACCESS_ACE_FLAG = 0x00000020; | |
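
The four inheritance flags can be read as a rule for building the initial ACL of a newly created object. The sketch below is one possible reading and simplifies in two labeled ways: an ACE is modeled as a (type, flag, access_mask, who) tuple, and an ACE marked only ACE4_FILE_INHERIT_ACE is not carried into new subdirectories, a case the descriptions above do not address. The function name is hypothetical.

```python
ACE4_FILE_INHERIT_ACE = 0x00000001
ACE4_DIRECTORY_INHERIT_ACE = 0x00000002
ACE4_NO_PROPAGATE_INHERIT_ACE = 0x00000004
ACE4_INHERIT_ONLY_ACE = 0x00000008

_INHERIT_FLAGS = (ACE4_FILE_INHERIT_ACE | ACE4_DIRECTORY_INHERIT_ACE |
                  ACE4_NO_PROPAGATE_INHERIT_ACE | ACE4_INHERIT_ONLY_ACE)


def inherited_aces(parent_acl, child_is_dir):
    """Build the ACEs a new child receives from its parent directory."""
    result = []
    for ace_type, flag, mask, who in parent_acl:
        effective = (ace_type, flag & ~_INHERIT_FLAGS, mask, who)
        if not child_is_dir:
            if flag & ACE4_FILE_INHERIT_ACE:
                result.append(effective)
            continue
        if not flag & ACE4_DIRECTORY_INHERIT_ACE:
            continue  # simplification: file-only inheritance is not propagated
        # First ACE: applies to the new directory itself.
        result.append(effective)
        # Second ACE: inheritable by the new directory's own children,
        # unless the parent ACE forbids further propagation.
        if not flag & ACE4_NO_PROPAGATE_INHERIT_ACE:
            result.append((ace_type, flag | ACE4_INHERIT_ONLY_ACE, mask, who))
    return result
```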
| | | | |
| skipping to change at page 56, line 5 | | skipping to change at page 55, line 42 | |
| For example, suppose a client tries to set an ACE with | | For example, suppose a client tries to set an ACE with | |
| ACE4_FILE_INHERIT_ACE set but not ACE4_DIRECTORY_INHERIT_ACE. If the | | ACE4_FILE_INHERIT_ACE set but not ACE4_DIRECTORY_INHERIT_ACE. If the | |
| server does not support any form of ACL inheritance, the server | | server does not support any form of ACL inheritance, the server | |
| should reject the request with NFS4ERR_ATTRNOTSUPP. If the server | | should reject the request with NFS4ERR_ATTRNOTSUPP. If the server | |
| supports a single "inherit ACE" flag that applies to both files and | | supports a single "inherit ACE" flag that applies to both files and | |
| directories, the server may reject the request (i.e., requiring the | | directories, the server may reject the request (i.e., requiring the | |
| client to set both the file and directory inheritance flags). The | | client to set both the file and directory inheritance flags). The | |
| server may also accept the request and silently turn on the | | server may also accept the request and silently turn on the | |
| ACE4_DIRECTORY_INHERIT_ACE flag. | | ACE4_DIRECTORY_INHERIT_ACE flag. | |
| | | | |
| 5.11.4. ACE who | | 5.11.4. ACE who | |
| | | | |
| There are several special identifiers ("who") which need to be | | There are several special identifiers ("who") which need to be | |
| understood universally, rather than in the context of a particular | | understood universally, rather than in the context of a particular | |
| DNS domain. Some of these identifiers cannot be understood when an | | DNS domain. Some of these identifiers cannot be understood when an | |
| NFS client accesses the server, but have meaning when a local process | | NFS client accesses the server, but have meaning when a local process | |
| accesses the file. The ability to display and modify these | | accesses the file. The ability to display and modify these | |
| permissions is permitted over NFS, even if none of the access methods | | permissions is permitted over NFS, even if none of the access methods | |
| on the server understands the identifiers. | | on the server understands the identifiers. | |
| | | | |
| | | | |
| skipping to change at page 57, line 5 | | skipping to change at page 56, line 44 | |
| const MODE4_RGRP = 0x020; /* read permission: group */ | | const MODE4_RGRP = 0x020; /* read permission: group */ | |
| const MODE4_WGRP = 0x010; /* write permission: group */ | | const MODE4_WGRP = 0x010; /* write permission: group */ | |
| const MODE4_XGRP = 0x008; /* execute permission: group */ | | const MODE4_XGRP = 0x008; /* execute permission: group */ | |
| const MODE4_ROTH = 0x004; /* read permission: other */ | | const MODE4_ROTH = 0x004; /* read permission: other */ | |
| const MODE4_WOTH = 0x002; /* write permission: other */ | | const MODE4_WOTH = 0x002; /* write permission: other */ | |
| const MODE4_XOTH = 0x001; /* execute permission: other */ | | const MODE4_XOTH = 0x001; /* execute permission: other */ | |
| | | | |
| Bits MODE4_RUSR, MODE4_WUSR, and MODE4_XUSR apply to the principal | | Bits MODE4_RUSR, MODE4_WUSR, and MODE4_XUSR apply to the principal | |
| identified in the owner attribute. Bits MODE4_RGRP, MODE4_WGRP, and | | identified in the owner attribute. Bits MODE4_RGRP, MODE4_WGRP, and | |
| MODE4_XGRP apply to the principals identified in the owner_group | | MODE4_XGRP apply to the principals identified in the owner_group | |
| attribute. Bits MODE4_ROTH, MODE4_WOTH, and MODE4_XOTH apply to any | | attribute. Bits MODE4_ROTH, MODE4_WOTH, and MODE4_XOTH apply to any | |
| principal that does not match that in the owner attribute, and does not | | principal that does not match that in the owner attribute, and does not | |
| have a group matching that of the owner_group attribute. | | have a group matching that of the owner_group attribute. | |
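
The selection of the applicable permission triad can be sketched as below. The function and parameter names are hypothetical, principals and groups are modeled as plain strings, and the shifts follow from the constant values (MODE4_RUSR is 0x100, MODE4_RGRP is 0x020, MODE4_ROTH is 0x004).

```python
MODE4_RUSR, MODE4_WUSR, MODE4_XUSR = 0x100, 0x080, 0x040
MODE4_RGRP, MODE4_WGRP, MODE4_XGRP = 0x020, 0x010, 0x008
MODE4_ROTH, MODE4_WOTH, MODE4_XOTH = 0x004, 0x002, 0x001


def mode_bits_for(mode, principal, principal_groups, owner, owner_group):
    """Return the 3-bit rwx value (4=read, 2=write, 1=execute) that applies."""
    if principal == owner:
        shift = 6  # MODE4_*USR
    elif owner_group in principal_groups:
        shift = 3  # MODE4_*GRP
    else:
        shift = 0  # MODE4_*OTH
    return (mode >> shift) & 0x7
```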
| | | | |
| The remaining bits are not defined by this protocol and MUST NOT be | | The remaining bits are not defined by this protocol and MUST NOT be | |
| used. The minor version mechanism must be used to define further bit | | used. The minor version mechanism must be used to define further bit | |
| usage. | | usage. | |
| | | | |
| Note that in UNIX, if a file has the MODE4_SGID bit set and no | | Note that in UNIX, if a file has the MODE4_SGID bit set and no | |
| | | | |
| skipping to change at page 57, line 29 | | skipping to change at page 57, line 18 | |
| | | | |
| 5.11.6. Mode and ACL Attribute | | 5.11.6. Mode and ACL Attribute | |
| | | | |
| The server that supports both mode and ACL must take care to | | The server that supports both mode and ACL must take care to | |
| synchronize the MODE4_*USR, MODE4_*GRP, and MODE4_*OTH bits with the | | synchronize the MODE4_*USR, MODE4_*GRP, and MODE4_*OTH bits with the | |
| ACEs which have respective who fields of "OWNER@", "GROUP@", and | | ACEs which have respective who fields of "OWNER@", "GROUP@", and | |
| "EVERYONE@" so that the client can see semantically equivalent access | | "EVERYONE@" so that the client can see semantically equivalent access | |
| permissions exist whether the client asks for owner, owner_group and | | permissions exist whether the client asks for owner, owner_group and | |
| mode attributes, or for just the ACL. | | mode attributes, or for just the ACL. | |
| | | | |
|
| Because the mode attribute includes bits (e.g. MODE4_SVTX) that have | | Because the mode attribute includes bits (e.g., MODE4_SVTX) that have | |
| nothing to do with ACL semantics, it is permitted for clients to | | nothing to do with ACL semantics, it is permitted for clients to | |
| specify both the ACL attribute and mode in the same SETATTR | | specify both the ACL attribute and mode in the same SETATTR | |
| operation. However, because there is no prescribed order for | | operation. However, because there is no prescribed order for | |
| processing the attributes in a SETATTR, the client must ensure that | | processing the attributes in a SETATTR, the client must ensure that | |
| ACL attribute, if specified without mode, would produce the desired | | ACL attribute, if specified without mode, would produce the desired | |
| mode bits, and conversely, the mode attribute if specified without | | mode bits, and conversely, the mode attribute if specified without | |
| ACL, would produce the desired "OWNER@", "GROUP@", and "EVERYONE@" | | ACL, would produce the desired "OWNER@", "GROUP@", and "EVERYONE@" | |
| ACEs. | | ACEs. | |
| | | | |
| 5.11.7. mounted_on_fileid | | 5.11.7. mounted_on_fileid | |
| | | | |
| skipping to change at page 57, line 56 | | skipping to change at page 57, line 45 | |
| with a component name and a fileid. The fileid of the mount point's | | with a component name and a fileid. The fileid of the mount point's | |
| directory entry will be different from the fileid that the stat() | | directory entry will be different from the fileid that the stat() | |
| system call returns. The stat() system call is returning the fileid | | system call returns. The stat() system call is returning the fileid | |
| of the root of the mounted filesystem, whereas readdir() is returning | | of the root of the mounted filesystem, whereas readdir() is returning | |
| the fileid stat() would have returned before any filesystems were | | the fileid stat() would have returned before any filesystems were | |
| mounted on the mount point. | | mounted on the mount point. | |
| | | | |
| Unlike NFS version 3, NFS version 4 allows a client's LOOKUP request | | Unlike NFS version 3, NFS version 4 allows a client's LOOKUP request | |
| to cross other filesystems. The client detects the filesystem | | to cross other filesystems. The client detects the filesystem | |
| crossing whenever the filehandle argument of LOOKUP has an fsid | | crossing whenever the filehandle argument of LOOKUP has an fsid | |
|
| attribute different from that of the filehandle returned by LOOKUP. A | | attribute different from that of the filehandle returned by LOOKUP. | |
| UNIX-based client will consider this a "mount point crossing". UNIX | | A UNIX-based client will consider this a "mount point crossing". | |
| has a legacy scheme for allowing a process to determine its current | | UNIX has a legacy scheme for allowing a process to determine its | |
| working directory. This relies on readdir() of a mount point's parent | | current working directory. This relies on readdir() of a mount | |
| and stat() of the mount point returning fileids as previously | | point's parent and stat() of the mount point returning fileids as | |
| described. The mounted_on_fileid attribute corresponds to the fileid | | previously described. The mounted_on_fileid attribute corresponds to | |
| that readdir() would have returned as described previously. | | the fileid that readdir() would have returned as described | |
| | | previously. | |
| | | | |
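The filesystem-crossing test described above amounts to comparing two fsid values. A minimal C sketch follows, assuming the client has already fetched the fsid attribute for both the directory filehandle passed to LOOKUP and the filehandle LOOKUP returned. The struct fsid_pair and both function names are hypothetical; the struct mirrors the major/minor pair of the protocol's fsid but renames the fields to avoid clashing with the major() and minor() macros that some C libraries define.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the protocol's fsid (a major/minor pair). */
struct fsid_pair {
    uint64_t major_id;
    uint64_t minor_id;
};

/* Two fsids name the same filesystem only if both halves match. */
static bool fsid_equal(const struct fsid_pair *a, const struct fsid_pair *b)
{
    assert(a != NULL && b != NULL);
    return a->major_id == b->major_id && a->minor_id == b->minor_id;
}

/*
 * Returns true when the object returned by LOOKUP lives on a different
 * filesystem than the directory the LOOKUP started from. A UNIX-based
 * client would treat that as a mount point crossing.
 */
static bool lookup_crossed_filesystem(const struct fsid_pair *dir_fsid,
                                      const struct fsid_pair *child_fsid)
{
    return !fsid_equal(dir_fsid, child_fsid);
}
```

A client that detects a crossing this way is the one that would then want mounted_on_fileid for the directory entry.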
| While the NFS version 4 client could simply fabricate a fileid | | While the NFS version 4 client could simply fabricate a fileid | |
| corresponding to what mounted_on_fileid provides (and if the server | | corresponding to what mounted_on_fileid provides (and if the server | |
| does not support mounted_on_fileid, the client has no choice), there | | does not support mounted_on_fileid, the client has no choice), there | |
| is a risk that the client will generate a fileid that conflicts with | | is a risk that the client will generate a fileid that conflicts with | |
| one that is already assigned to another object in the filesystem. | | one that is already assigned to another object in the filesystem. | |
| Instead, if the server can provide the mounted_on_fileid, the | | Instead, if the server can provide the mounted_on_fileid, the | |
| potential for client operational problems in this area is eliminated. | | potential for client operational problems in this area is eliminated. | |
| | | | |
| If the server detects that there is no mount point at the target | | If the server detects that there is no mount point at the target | |
| | | | |
| skipping to change at page 58, line 33 | | skipping to change at page 58, line 25 | |
| the same as that of the fileid attribute. | | the same as that of the fileid attribute. | |
| | | | |
| The mounted_on_fileid attribute is RECOMMENDED, so the server SHOULD | | The mounted_on_fileid attribute is RECOMMENDED, so the server SHOULD | |
| provide it if possible, and for a UNIX-based server, this is | | provide it if possible, and for a UNIX-based server, this is | |
| straightforward. Usually, mounted_on_fileid will be requested during | | straightforward. Usually, mounted_on_fileid will be requested during | |
| a READDIR operation, in which case it is trivial (at least for UNIX- | | a READDIR operation, in which case it is trivial (at least for UNIX- | |
| based servers) to return mounted_on_fileid since it is equal to the | | based servers) to return mounted_on_fileid since it is equal to the | |
| fileid of a directory entry returned by readdir(). If | | fileid of a directory entry returned by readdir(). If | |
| mounted_on_fileid is requested in a GETATTR operation, the server | | mounted_on_fileid is requested in a GETATTR operation, the server | |
| should obey an invariant that has it returning a value that is equal | | should obey an invariant that has it returning a value that is equal | |
|
| to the file object's entry in the object's parent directory, i.e. | | to the file object's entry in the object's parent directory, i.e., | |
| what readdir() would have returned. Some operating environments | | what readdir() would have returned. Some operating environments | |
| allow a series of two or more filesystems to be mounted onto a single | | allow a series of two or more filesystems to be mounted onto a single | |
| mount point. In this case, for the server to obey the aforementioned | | mount point. In this case, for the server to obey the aforementioned | |
| invariant, it will need to find the base mount point, and not the | | invariant, it will need to find the base mount point, and not the | |
| intermediate mount points. | | intermediate mount points. | |
| | | | |
| 6. Filesystem Migration and Replication | | 6. Filesystem Migration and Replication | |
| | | | |
| With the use of the recommended attribute "fs_locations", the NFS | | With the use of the recommended attribute "fs_locations", the NFS | |
| version 4 server has a method of providing filesystem migration or | | version 4 server has a method of providing filesystem migration or | |
| replication services. For the purposes of migration and replication, | | replication services. For the purposes of migration and replication, | |
| a filesystem will be defined as all files that share a given fsid | | a filesystem will be defined as all files that share a given fsid | |
| (both major and minor values are the same). | | (both major and minor values are the same). | |
| | | | |
| The fs_locations attribute provides a list of filesystem locations. | | The fs_locations attribute provides a list of filesystem locations. | |
| These locations are specified by providing the server name (either | | These locations are specified by providing the server name (either | |
| | | | |
| skipping to change at page 60, line 4 | | skipping to change at page 59, line 33 | |
| event between client and server is specified here. | | event between client and server is specified here. | |
| | | | |
| Once the servers participating in the migration have completed the | | Once the servers participating in the migration have completed the | |
| move of the filesystem, the error NFS4ERR_MOVED will be returned for | | move of the filesystem, the error NFS4ERR_MOVED will be returned for | |
| subsequent requests received by the original server. The | | subsequent requests received by the original server. The | |
| NFS4ERR_MOVED error is returned for all operations except PUTFH and | | NFS4ERR_MOVED error is returned for all operations except PUTFH and | |
| GETATTR. Upon receiving the NFS4ERR_MOVED error, the client will | | GETATTR. Upon receiving the NFS4ERR_MOVED error, the client will | |
| obtain the value of the fs_locations attribute. The client will then | | obtain the value of the fs_locations attribute. The client will then | |
| use the contents of the attribute to redirect its requests to the | | use the contents of the attribute to redirect its requests to the | |
| specified server. To facilitate the use of GETATTR, operations such | | specified server. To facilitate the use of GETATTR, operations such | |
| as PUTFH must also be accepted by the server for the migrated file | | as PUTFH must also be accepted by the server for the migrated file | |
| system's filehandles. Note that if the server returns NFS4ERR_MOVED, | | system's filehandles. Note that if the server returns NFS4ERR_MOVED, | |
| the server MUST support the fs_locations attribute. | | the server MUST support the fs_locations attribute. | |
| | | | |
| If the client requests more attributes than just fs_locations, the | | If the client requests more attributes than just fs_locations, the | |
| server may return fs_locations only. This is to be expected since | | server may return fs_locations only. This is to be expected since | |
| the server has migrated the filesystem and may not have a method of | | the server has migrated the filesystem and may not have a method of | |
| obtaining additional attribute data. | | obtaining additional attribute data. | |
| | | | |
| The server implementor needs to be careful in developing a migration | | The server implementor needs to be careful in developing a migration | |
| | | | |
| skipping to change at page 61, line 5 | | skipping to change at page 60, line 39 | |
| | | | |
| The fs_locations struct and attribute then contains an array of | | The fs_locations struct and attribute then contains an array of | |
| locations. Since the name space of each server may be constructed | | locations. Since the name space of each server may be constructed | |
| differently, the "fs_root" field is provided. The path represented | | differently, the "fs_root" field is provided. The path represented | |
| by fs_root represents the location of the filesystem in the server's | | by fs_root represents the location of the filesystem in the server's | |
| name space. Therefore, the fs_root path is only associated with the | | name space. Therefore, the fs_root path is only associated with the | |
| server from which the fs_locations attribute was obtained. The | | server from which the fs_locations attribute was obtained. The | |
| fs_root path is meant to aid the client in locating the filesystem at | | fs_root path is meant to aid the client in locating the filesystem at | |
| the various servers listed. | | the various servers listed. | |
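One way a client might apply fs_root when switching servers is sketched below in C. It rewrites a path known on one server into the corresponding path on another by replacing the old filesystem root with the new one; with the roots "/a/b/c" and "/x/y/z" used in this section's example, "/a/b/c/d" becomes "/x/y/z/d". The function name remap_path and the use of slash-separated strings are assumptions for illustration only (the protocol carries paths as arrays of components), and the roots are assumed to have no trailing slash.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Rewrites `path`, which must lie at or below `old_root`, so that it is
 * rooted at `new_root` instead. Returns 0 on success, or -1 if `path`
 * is not under `old_root` or the result does not fit in `out`.
 * Assumption: neither root ends with a trailing '/'.
 */
static int remap_path(const char *path, const char *old_root,
                      const char *new_root, char *out, size_t out_len)
{
    size_t root_len;
    int n;

    assert(path != NULL && old_root != NULL && new_root != NULL && out != NULL);

    root_len = strlen(old_root);

    /* The path must begin with the old root ... */
    if (strncmp(path, old_root, root_len) != 0)
        return -1;
    /* ... and the match must end on a component boundary. */
    if (path[root_len] != '\0' && path[root_len] != '/')
        return -1;

    /* Append the part of the path below the old root to the new root. */
    n = snprintf(out, out_len, "%s%s", new_root, path + root_len);
    if (n < 0 || (size_t)n >= out_len)
        return -1;
    return 0;
}
```

The component-boundary check keeps a sibling such as "/a/b/cd" from being mistaken for something inside "/a/b/c".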
| | | | |
| As an example, there is a replicated filesystem located at two | | As an example, there is a replicated filesystem located at two | |
| servers (servA and servB). At servA the filesystem is located at | | servers (servA and servB). At servA the filesystem is located at | |
| path "/a/b/c". At servB the filesystem is located at path "/x/y/z". | | path "/a/b/c". At servB the filesystem is located at path "/x/y/z". | |
| In this example the client accesses the filesystem first at servA | | In this example the client accesses the filesystem first at servA | |
| with a multi-component lookup path of "/a/b/c/d". Since the client | | with a multi-component lookup path of "/a/b/c/d". Since the client | |
| used a multi-component lookup to obtain the filehandle at "/a/b/c/d", | | used a multi-component lookup to obtain the filehandle at "/a/b/c/d", | |
| it is unaware that the filesystem's root is located in servA's name | | it is unaware that the filesystem's root is located in servA's name | |
| space at "/a/b/c". When the client switches to servB, it will need | | space at "/a/b/c". When the client switches to servB, it will need | |
| to determine that the directory it first referenced at servA is now | | to determine that the directory it first referenced at servA is now | |
| represented by the path "/x/y/z/d" on servB. To facilitate this, the | | represented by the path "/x/y/z/d" on servB. To facilitate this, the | |
| | | | |
| skipping to change at page 62, line 5 | | skipping to change at page 61, line 35 | |
| of the fh_expire_type attribute, whether volatile filehandles will | | of the fh_expire_type attribute, whether volatile filehandles will | |
| expire at the migration or replication event. If the bit | | expire at the migration or replication event. If the bit | |
| FH4_VOL_MIGRATION is set in the fh_expire_type attribute, the client | | FH4_VOL_MIGRATION is set in the fh_expire_type attribute, the client | |
| must treat the volatile filehandle as if the server had returned the | | must treat the volatile filehandle as if the server had returned the | |
| NFS4ERR_FHEXPIRED error. At the migration or replication event in | | NFS4ERR_FHEXPIRED error. At the migration or replication event in | |
| the presence of the FH4_VOL_MIGRATION bit, the client will not | | the presence of the FH4_VOL_MIGRATION bit, the client will not | |
| present the original or old volatile filehandle to the new server. | | present the original or old volatile filehandle to the new server. | |
| The client will start its communication with the new server by | | The client will start its communication with the new server by | |
| recovering its filehandles using the saved file names. | | recovering its filehandles using the saved file names. | |
| | | | |
| 7. NFS Server Name Space | | 7. NFS Server Name Space | |
| | | | |
| 7.1. Server Exports | | 7.1. Server Exports | |
| | | | |
| On a UNIX server the name space describes all the files reachable by | | On a UNIX server the name space describes all the files reachable by | |
| pathnames under the root directory or "/". On a Windows NT server | | pathnames under the root directory or "/". On a Windows NT server | |
| the name space constitutes all the files on disks named by mapped | | the name space constitutes all the files on disks named by mapped | |
| disk letters. NFS server administrators rarely make the entire | | disk letters. NFS server administrators rarely make the entire | |
| server's filesystem name space available to NFS clients. More often | | server's filesystem name space available to NFS clients. More often | |
| portions of the name space are made available via an "export" | | portions of the name space are made available via an "export" | |
| | | | |
| skipping to change at page 63, line 4 | | skipping to change at page 62, line 37 | |
| filesystem to another. There is a drawback to this representation of | | filesystem to another. There is a drawback to this representation of | |
| the server's name space on the client: it is static. If the server | | the server's name space on the client: it is static. If the server | |
| administrator adds a new export the client will be unaware of it. | | administrator adds a new export the client will be unaware of it. | |
| | | | |
| 7.3. Server Pseudo Filesystem | | 7.3. Server Pseudo Filesystem | |
| | | | |
| NFS version 4 servers avoid this name space inconsistency by | | NFS version 4 servers avoid this name space inconsistency by | |
| presenting all the exports within the framework of a single server | | presenting all the exports within the framework of a single server | |
| name space. An NFS version 4 client uses LOOKUP and READDIR | | name space. An NFS version 4 client uses LOOKUP and READDIR | |
| operations to browse seamlessly from one export to another. Portions | | operations to browse seamlessly from one export to another. Portions | |
| of the server name space that are not exported are bridged via a | | of the server name space that are not exported are bridged via a | |
| "pseudo filesystem" that provides a view of exported directories | | "pseudo filesystem" that provides a view of exported directories | |
| only. A pseudo filesystem has a unique fsid and behaves like a | | only. A pseudo filesystem has a unique fsid and behaves like a | |
| normal, read only filesystem. | | normal, read only filesystem. | |
| | | | |
| Based on the construction of the server's name space, it is possible | | Based on the construction of the server's name space, it is possible | |
| that multiple pseudo filesystems may exist. For example, | | that multiple pseudo filesystems may exist. For example, | |
| | | | |
| /a pseudo filesystem | | /a pseudo filesystem | |
| /a/b real filesystem | | /a/b real filesystem | |
| | | | |
| skipping to change at page 63, line 45 | | skipping to change at page 63, line 27 | |
| representation of filesystem(s) available from the server. | | representation of filesystem(s) available from the server. | |
| Therefore, the pseudo filesystem is most likely constructed | | Therefore, the pseudo filesystem is most likely constructed | |
| dynamically when the server is first instantiated. It is expected | | dynamically when the server is first instantiated. It is expected | |
| that the pseudo filesystem may not have an on disk counterpart from | | that the pseudo filesystem may not have an on disk counterpart from | |
| which persistent filehandles could be constructed. Even though it is | | which persistent filehandles could be constructed. Even though it is | |
| preferable that the server provide persistent filehandles for the | | preferable that the server provide persistent filehandles for the | |
| pseudo filesystem, the NFS client should expect that pseudo file | | pseudo filesystem, the NFS client should expect that pseudo file | |
| system filehandles are volatile. This can be confirmed by checking | | system filehandles are volatile. This can be confirmed by checking | |
| the associated "fh_expire_type" attribute for those filehandles in | | the associated "fh_expire_type" attribute for those filehandles in | |
| question. If the filehandles are volatile, the NFS client must be | | question. If the filehandles are volatile, the NFS client must be | |
|
| prepared to recover a filehandle value (e.g. with a multi-component | | prepared to recover a filehandle value (e.g., with a multi-component | |
| LOOKUP) when receiving an error of NFS4ERR_FHEXPIRED. | | LOOKUP) when receiving an error of NFS4ERR_FHEXPIRED. | |
| | | | |
| 7.6. Exported Root | | 7.6. Exported Root | |
| | | | |
| If the server's root filesystem is exported, one might conclude that | | If the server's root filesystem is exported, one might conclude that | |
| a pseudo-filesystem is not needed. This would be wrong. Assume the | | a pseudo-filesystem is not needed. This would be wrong. Assume the | |
| following filesystems on a server: | | following filesystems on a server: | |
| | | | |
| / disk1 (exported) | | / disk1 (exported) | |
| /a disk2 (not exported) | | /a disk2 (not exported) | |
| /a/b disk3 (exported) | | /a/b disk3 (exported) | |
| | | | |
| Because disk2 is not exported, disk3 cannot be reached with simple | | Because disk2 is not exported, disk3 cannot be reached with simple | |
| LOOKUPs. The server must bridge the gap with a pseudo-filesystem. | | LOOKUPs. The server must bridge the gap with a pseudo-filesystem. | |
| | | | |
| 7.7. Mount Point Crossing | | 7.7. Mount Point Crossing | |
| | | | |
| The server filesystem environment may be constructed in such a way | | The server filesystem environment may be constructed in such a way | |
| that one filesystem contains a directory which is 'covered' or | | that one filesystem contains a directory which is 'covered' or | |
| mounted upon by a second filesystem. For example: | | mounted upon by a second filesystem. For example: | |
| | | | |
| skipping to change at page 65, line 4 | | skipping to change at page 64, line 40 | |
| to authenticate itself. If, based on its policies, the server | | to authenticate itself. If, based on its policies, the server | |
| chooses to limit the contents of the pseudo filesystem, the server | | chooses to limit the contents of the pseudo filesystem, the server | |
| may effectively hide filesystems from a client that may otherwise | | may effectively hide filesystems from a client that may otherwise | |
| have legitimate access. | | have legitimate access. | |
| | | | |
| As suggested practice, the server should apply the security policy of | | As suggested practice, the server should apply the security policy of | |
| a shared resource in the server's namespace to the components of the | | a shared resource in the server's namespace to the components of the | |
| resource's ancestors. For example: | | resource's ancestors. For example: | |
| | | | |
| / | | / | |
| /a/b | | /a/b | |
| /a/b/c | | /a/b/c | |
| | | | |
| The /a/b/c directory is a real filesystem and is the shared resource. | | The /a/b/c directory is a real filesystem and is the shared resource. | |
| The security policy for /a/b/c is Kerberos with integrity. The | | The security policy for /a/b/c is Kerberos with integrity. The | |
| server should apply the same security policy to /, /a, and /a/b. | | server should apply the same security policy to /, /a, and /a/b. | |
| This allows for the extension of the protection of the server's | | This allows for the extension of the protection of the server's | |
| namespace to the ancestors of the real shared resource. | | namespace to the ancestors of the real shared resource. | |
| | | | |
| For the case of the use of multiple, disjoint security mechanisms in | | For the case of the use of multiple, disjoint security mechanisms in | |
| the server's resources, the security for a particular object in the | | the server's resources, the security for a particular object in the | |
| server's namespace should be the union of all security mechanisms of | | server's namespace should be the union of all security mechanisms of | |
| all direct descendants. | | all direct descendants. | |
| | | | |
| 8. File Locking and Share Reservations | | 8. File Locking and Share Reservations | |
| | | | |
| Integrating locking into the NFS protocol necessarily causes it to be | | Integrating locking into the NFS protocol necessarily causes it to be | |
| stateful. With the inclusion of share reservations the protocol | | stateful. With the inclusion of share reservations the protocol | |
| becomes substantially more dependent on state than the traditional | | becomes substantially more dependent on state than the traditional | |
| combination of NFS and NLM [XNFS]. There are three components to | | combination of NFS and NLM [XNFS]. There are three components to | |
| making this state manageable: | | making this state manageable: | |
| | | | |
| o Clear division between client and server | | o Clear division between client and server | |
| | | | |
| | | | |
| skipping to change at page 67, line 5 | | skipping to change at page 66, line 13 | |
| owner. | | owner. | |
| | | | |
| The following sections describe the transition from the heavy weight | | The following sections describe the transition from the heavy weight | |
| information to the eventual stateid used for most client and server | | information to the eventual stateid used for most client and server | |
| locking and lease interactions. | | locking and lease interactions. | |
| | | | |
| 8.1.1. Client ID | | 8.1.1. Client ID | |
| | | | |
| For each LOCK request, the client must identify itself to the server. | | For each LOCK request, the client must identify itself to the server. | |
| This is done in such a way as to allow for correct lock | | This is done in such a way as to allow for correct lock | |
| identification and crash recovery. A sequence of a SETCLIENTID | | identification and crash recovery. A sequence of a SETCLIENTID | |
| operation followed by a SETCLIENTID_CONFIRM operation is required to | | operation followed by a SETCLIENTID_CONFIRM operation is required to | |
| establish the identification onto the server. Establishment of | | establish the identification onto the server. Establishment of | |
| identification by a new incarnation of the client also has the effect | | identification by a new incarnation of the client also has the effect | |
| of immediately breaking any leased state that a previous incarnation | | of immediately breaking any leased state that a previous incarnation | |
| of the client might have had on the server, as opposed to forcing the | | of the client might have had on the server, as opposed to forcing the | |
| new client incarnation to wait for the leases to expire. Breaking | | new client incarnation to wait for the leases to expire. Breaking | |
| the lease state amounts to the server removing all lock, share | | the lease state amounts to the server removing all lock, share | |
| reservation, and, where the server is not supporting the | | reservation, and, where the server is not supporting the | |
| | | | |
| skipping to change at page 67, line 29 | | skipping to change at page 66, line 35 | |
| state recovery, see the section "Delegation Recovery". | | state recovery, see the section "Delegation Recovery". | |
| | | | |
| Client identification is encapsulated in the following structure: | | Client identification is encapsulated in the following structure: | |
| | | | |
| struct nfs_client_id4 { | | struct nfs_client_id4 { | |
| verifier4 verifier; | | verifier4 verifier; | |
| opaque id<NFS4_OPAQUE_LIMIT>; | | opaque id<NFS4_OPAQUE_LIMIT>; | |
| }; | | }; | |
| | | | |
| The first field, verifier, is a client incarnation verifier that is | | The first field, verifier, is a client incarnation verifier that is | |
|
| used to detect client reboots. Only if the verifier is different from | | used to detect client reboots. Only if the verifier is different | |
| that the server has previously recorded the client (as identified by | | from that which the server has previously recorded the client (as | |
| the second field of the structure, id) does the server start the | | identified by the second field of the structure, id) does the server | |
| process of canceling the client's leased state. | | start the process of canceling the client's leased state. | |
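The reboot check described above can be sketched in C as follows. This is an illustration under stated assumptions, not server code: struct client_record and client_rebooted are hypothetical names, and the record simply holds the two fields of nfs_client_id4 in C form. The verifier length of 8 bytes follows the protocol's NFS4_VERIFIER_SIZE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define VERIFIER_SIZE 8 /* matches NFS4_VERIFIER_SIZE */

/* Hypothetical in-memory form of the two nfs_client_id4 fields. */
struct client_record {
    unsigned char verifier[VERIFIER_SIZE]; /* client incarnation verifier */
    const unsigned char *id;               /* opaque client id string */
    size_t id_len;                         /* length of id in bytes */
};

/*
 * Returns true when `presented` names the same client as `recorded`
 * (identical id string) but carries a different verifier, which is the
 * condition under which the server starts canceling the client's
 * leased state. A different id string is a different client, so it is
 * not treated as a reboot.
 */
static bool client_rebooted(const struct client_record *recorded,
                            const struct client_record *presented)
{
    assert(recorded != NULL && presented != NULL);

    if (recorded->id_len != presented->id_len)
        return false;
    if (memcmp(recorded->id, presented->id, recorded->id_len) != 0)
        return false;

    return memcmp(recorded->verifier, presented->verifier,
                  VERIFIER_SIZE) != 0;
}
```

The verifier is compared only for inequality; the sketch attaches no ordering or other meaning to its bytes.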
| | | | |
| The second field, id, is a variable length string that uniquely | | The second field, id, is a variable length string that uniquely | |
| defines the client. | | defines the client. | |
| | | | |
| There are several considerations for how the client generates the id | | There are several considerations for how the client generates the id | |
| string: | | string: | |
| | | | |
| o The string should be unique so that multiple clients do not | | o The string should be unique so that multiple clients do not | |
| present the same string. The consequences of two clients | | present the same string. The consequences of two clients | |
|
| presenting the same string range from one client getting an | | presenting the same string range from one client getting an error | |
| error to one client having its leased state abruptly and | | to one client having its leased state abruptly and unexpectedly | |
| unexpectedly canceled. | | canceled. | |
| | | | |
| o The string should be selected so the subsequent incarnations | | o The string should be selected so the subsequent incarnations | |
|
| (e.g. reboots) of the same client cause the client to present | | (e.g., reboots) of the same client cause the client to present the | |
| the same string. The implementor is cautioned from an approach | | same string. The implementor is cautioned against an approach | |
| that requires the string to be recorded in a local file because | | that requires the string to be recorded in a local file because | |
| this precludes the use of the implementation in an environment | | this precludes the use of the implementation in an environment | |
| where there is no local disk and all file access is from an NFS | | where there is no local disk and all file access is from an NFS | |
| version 4 server. | | version 4 server. | |
| | | | |
| o The string should be different for each server network address | | o The string should be different for each server network address | |
|
| that the client accesses, rather than common to all server | | that the client accesses, rather than common to all server network | |
| network addresses. The reason is that it may not be possible for | | addresses. The reason is that it may not be possible for the | |
| the client to tell if same server is listening on multiple | | client to tell if the same server is listening on multiple network | |
| network addresses. If the client issues SETCLIENTID with the | | addresses. If the client issues SETCLIENTID with the same id | |
| same id string to each network address of such a server, the | | string to each network address of such a server, the server will | |
| server will think it is the same client, and each successive | | think it is the same client, and each successive SETCLIENTID will | |
| SETCLIENTID will cause the server to begin the process of | | cause the server to begin the process of removing the client's | |
| removing the client's previous leased state. | | previous leased state. | |
| | | | |
|
| o The algorithm for generating the string should not assume that | | o The algorithm for generating the string should not assume that the | |
| the client's network address won't change. This includes | | client's network address won't change. This includes changes | |
| changes between client incarnations and even changes while the | | between client incarnations and even changes while the client is | |
| client is still running in its current incarnation. This | | still running in its current incarnation. This means that if | |
| means that if the client includes just the client's and server's | | the client includes just the client's and server's network address | |
| network address in the id string, there is a real risk, after | | in the id string, there is a real risk, after the client gives up | |
| the client gives up the network address, that another client, | | the network address, that another client, using a similar | |
| using a similar algorithm for generating the id string, will | | algorithm for generating the id string, will generate a | |
| generate a conflicting id string. | | conflicting id string. | |
| | | | |
| Given the above considerations, an example of a well generated id | | Given the above considerations, an example of a well generated id | |
| string is one that includes: | | string is one that includes: | |
| | | | |
| o The server's network address. | | o The server's network address. | |
| | | | |
| o The client's network address. | | o The client's network address. | |
| | | | |
| o For a user level NFS version 4 client, it should contain | | o For a user level NFS version 4 client, it should contain | |
| additional information to distinguish the client from other user | | additional information to distinguish the client from other user | |
| | | | |
| skipping to change at page 69, line 4 | | skipping to change at page 68, line 21 | |
| - A true random number. However since this number ought to be | | - A true random number. However since this number ought to be | |
| the same between client incarnations, this shares the same | | the same between client incarnations, this shares the same | |
| problem as that of using the timestamp of the software | | problem as that of using the timestamp of the software | |
| installation. | | installation. | |
| | | | |
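The considerations above can be illustrated with a minimal sketch. It assumes the id string is assembled from the server's network address, the client's network address, and one further distinguishing value that stays the same across client incarnations. The function name, the NUL separator, and the example values are hypothetical and are not defined by the protocol.

```python
def make_client_id_string(server_addr: str, client_addr: str, distinguisher: str) -> bytes:
    """Assemble an opaque id string from the components discussed above.

    All names and the field layout here are illustrative assumptions.
    """
    fields = (server_addr, client_addr, distinguisher)
    for field in fields:
        # NUL is used as the separator, so it must not occur inside a field.
        if "\0" in field:
            raise ValueError("fields must not contain NUL characters")
    # Including the server address yields a different string per server
    # network address; the distinguisher guards against another client
    # that later reuses the same client network address.
    return "\0".join(fields).encode("utf-8")


# Hypothetical usage: one id string per server network address.
id_for_server_a = make_client_id_string("192.0.2.10", "198.51.100.7", "serial-1234")
id_for_server_b = make_client_id_string("192.0.2.11", "198.51.100.7", "serial-1234")
```

Because every input is stable across reboots, the same client presents the same string to the same server address in each incarnation, without recording the string in a local file.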
| As a security measure, the server MUST NOT cancel a client's leased | | As a security measure, the server MUST NOT cancel a client's leased | |
| state if the principal that established the state for a given id string is | | state if the principal that established the state for a given id string is | |
| not the same as the principal issuing the SETCLIENTID. | | not the same as the principal issuing the SETCLIENTID. | |
| | | | |
| Note that SETCLIENTID and SETCLIENTID_CONFIRM have a secondary purpose | | Note that SETCLIENTID and SETCLIENTID_CONFIRM have a secondary purpose | |
| of establishing the information the server needs to make callbacks to | | of establishing the information the server needs to make callbacks to | |
| the client for the purpose of supporting delegations. It is permitted to | | the client for the purpose of supporting delegations. It is permitted to | |
| change this information via SETCLIENTID and SETCLIENTID_CONFIRM | | change this information via SETCLIENTID and SETCLIENTID_CONFIRM | |
| within the same incarnation of the client without removing the | | within the same incarnation of the client without removing the | |
| client's leased state. | | client's leased state. | |
| | | | |
| Once a SETCLIENTID and SETCLIENTID_CONFIRM sequence has successfully | | Once a SETCLIENTID and SETCLIENTID_CONFIRM sequence has successfully | |
| completed, the client uses the shorthand client identifier, of type | | completed, the client uses the shorthand client identifier, of type | |
| clientid4, instead of the longer and less compact nfs_client_id4 | | clientid4, instead of the longer and less compact nfs_client_id4 | |
|
| structure. This short hand client identifier (a clientid) is | | structure. This shorthand client identifier (a clientid) is assigned | |
| assigned by the server and should be chosen so that it will not | | by the server and should be chosen so that it will not conflict with | |
| conflict with a clientid previously assigned by the server. This | | a clientid previously assigned by the server. This applies across | |
| applies across server restarts or reboots. When a clientid is | | server restarts or reboots. When a clientid is presented to a server | |
| presented to a server and that clientid is not recognized, as would | | and that clientid is not recognized, as would happen after a server | |
| happen after a server reboot, the server will reject the request with | | reboot, the server will reject the request with the error | |
| the error NFS4ERR_STALE_CLIENTID. When this happens, the client must | | NFS4ERR_STALE_CLIENTID. When this happens, the client must obtain a | |
| obtain a new clientid by use of the SETCLIENTID operation and then | | new clientid by use of the SETCLIENTID operation and then proceed to | |
| proceed to any other necessary recovery for the server reboot case | | any other necessary recovery for the server reboot case (see the | |
| (see the section "Server Failure and Recovery"). | | section "Server Failure and Recovery"). | |
| | | | |
| The client must also employ the SETCLIENTID operation when it | | The client must also employ the SETCLIENTID operation when it | |
| receives an NFS4ERR_STALE_STATEID error using a stateid derived from | | receives an NFS4ERR_STALE_STATEID error using a stateid derived from | |
| its current clientid, since this also indicates a server reboot which | | its current clientid, since this also indicates a server reboot which | |
| has invalidated the existing clientid (see the next section | | has invalidated the existing clientid (see the next section | |
| "lock_owner and stateid Definition" for details). | | "lock_owner and stateid Definition" for details). | |
| | | | |
| See the detailed descriptions of SETCLIENTID and SETCLIENTID_CONFIRM | | See the detailed descriptions of SETCLIENTID and SETCLIENTID_CONFIRM | |
| for a complete specification of the operations. | | for a complete specification of the operations. | |
| | | | |
| | | | |
| skipping to change at page 70, line 4 | | skipping to change at page 69, line 27 | |
| restarted. Typically a server would not release a clientid unless | | restarted. Typically a server would not release a clientid unless | |
| there had been no activity from that client for many minutes. | | there had been no activity from that client for many minutes. | |
| | | | |
| Note that if the id string in a SETCLIENTID request is properly | | Note that if the id string in a SETCLIENTID request is properly | |
| constructed, and if the client takes care to use the same principal | | constructed, and if the client takes care to use the same principal | |
| for each successive use of SETCLIENTID, then, barring an active | | for each successive use of SETCLIENTID, then, barring an active | |
| denial of service attack, NFS4ERR_CLID_INUSE should never be | | denial of service attack, NFS4ERR_CLID_INUSE should never be | |
| returned. | | returned. | |
| | | | |
| However, client bugs, server bugs, or perhaps a deliberate change of | | However, client bugs, server bugs, or perhaps a deliberate change of | |
| the principal owner of the id string (such as the case of a client | | the principal owner of the id string (such as the case of a client | |
| that changes security flavors, and under the new flavor, there is no | | that changes security flavors, and under the new flavor, there is no | |
| mapping to the previous owner) will in rare cases result in | | mapping to the previous owner) will in rare cases result in | |
| NFS4ERR_CLID_INUSE. | | NFS4ERR_CLID_INUSE. | |
| | | | |
| In that event, when the server gets a SETCLIENTID for a client id | | In that event, when the server gets a SETCLIENTID for a client id | |
| that currently has no state, or it has state, but the lease has | | that currently has no state, or it has state, but the lease has | |
| expired, rather than returning NFS4ERR_CLID_INUSE, the server MUST | | expired, rather than returning NFS4ERR_CLID_INUSE, the server MUST | |
| allow the SETCLIENTID, and confirm the new clientid if followed by | | allow the SETCLIENTID, and confirm the new clientid if followed by | |
| the appropriate SETCLIENTID_CONFIRM. | | the appropriate SETCLIENTID_CONFIRM. | |
| | | | |
| skipping to change at page 70, line 48 | | skipping to change at page 70, line 20 | |
| as long as it is able to recognize invalid and out-of-date stateids. | | as long as it is able to recognize invalid and out-of-date stateids. | |
| This requirement includes those stateids generated by earlier | | This requirement includes those stateids generated by earlier | |
| instances of the server. From this, the client can be properly | | instances of the server. From this, the client can be properly | |
| notified of a server restart. This notification will occur when the | | notified of a server restart. This notification will occur when the | |
| client presents a stateid to the server from a previous | | client presents a stateid to the server from a previous | |
| instantiation. | | instantiation. | |
| | | | |
| The server must be able to distinguish the following situations and | | The server must be able to distinguish the following situations and | |
| return the error as specified: | | return the error as specified: | |
| | | | |
|
| o The stateid was generated by an earlier server instance (i.e. | | o The stateid was generated by an earlier server instance (i.e., | |
| before a server reboot). The error NFS4ERR_STALE_STATEID should | | before a server reboot). The error NFS4ERR_STALE_STATEID should | |
| be returned. | | be returned. | |
| | | | |
| o The stateid was generated by the current server instance but the | | o The stateid was generated by the current server instance but the | |
| stateid no longer designates the current locking state for the | | stateid no longer designates the current locking state for the | |
|
| lockowner-file pair in question (i.e. one or more locking | | lockowner-file pair in question (i.e., one or more locking | |
| operations have occurred). The error NFS4ERR_OLD_STATEID should | | operations have occurred). The error NFS4ERR_OLD_STATEID should be | |
| be returned. | | returned. | |
| | | | |
| This error condition will only occur when the client issues a | | This error condition will only occur when the client issues a | |
|
| locking request which changes a stateid while an I/O request | | locking request which changes a stateid while an I/O request that | |
| that uses that stateid is outstanding. | | uses that stateid is outstanding. | |
| | | | |
| o The stateid was generated by the current server instance but the | | o The stateid was generated by the current server instance but the | |
| stateid does not designate a locking state for any active | | stateid does not designate a locking state for any active | |
| lockowner-file pair. The error NFS4ERR_BAD_STATEID should be | | lockowner-file pair. The error NFS4ERR_BAD_STATEID should be | |
| returned. | | returned. | |
| | | | |
|
| This error condition will occur when there has been a logic | | This error condition will occur when there has been a logic error | |
| error on the part of the client or server. This should not | | on the part of the client or server. This should not happen. | |
| happen. | | | |
| | | | |
| One mechanism that may be used to satisfy these requirements is for | | One mechanism that may be used to satisfy these requirements is for | |
| the server to, | | the server to, | |
| | | | |
| o divide the "other" field of each stateid into two fields: | | o divide the "other" field of each stateid into two fields: | |
| | | | |
|
| - A server verifier which uniquely designates a particular | | - A server verifier which uniquely designates a particular server | |
| server instantiation. | | instantiation. | |
| | | | |
| - An index into a table of locking-state structures. | | - An index into a table of locking-state structures. | |
| | | | |
| o utilize the "seqid" field of each stateid, such that seqid is | | o utilize the "seqid" field of each stateid, such that seqid is | |
|
| monotonically incremented for each stateid that is associated | | monotonically incremented for each stateid that is associated with | |
| with the same index into the locking-state table. | | the same index into the locking-state table. | |
| | | | |
| By matching the incoming stateid and its field values with the state | | By matching the incoming stateid and its field values with the state | |
| held at the server, the server is able to easily determine if a | | held at the server, the server is able to easily determine if a | |
| stateid is valid for its current instantiation and state. If the | | stateid is valid for its current instantiation and state. If the | |
| stateid is not valid, the appropriate error can be supplied to the | | stateid is not valid, the appropriate error can be supplied to the | |
| client. | | client. | |
| | | | |
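The mechanism described above can be sketched as follows. This is a minimal illustration, assuming the "other" field has already been split into a server verifier and a table index, and that the locking-state table is a mapping from index to the current seqid for that entry. The class, function, and table layout are hypothetical, and seqid wraparound is ignored for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stateid:
    seqid: int      # incremented for each stateid issued for the same index
    verifier: int   # part of "other": designates one server instantiation
    index: int      # part of "other": index into the locking-state table


def check_stateid(sid: Stateid, current_verifier: int, table: dict) -> str:
    """Classify an incoming stateid; `table` maps index -> current seqid."""
    # Generated by an earlier server instance (before a reboot).
    if sid.verifier != current_verifier:
        return "NFS4ERR_STALE_STATEID"
    current_seqid = table.get(sid.index)
    # No active locking state at that index.
    if current_seqid is None:
        return "NFS4ERR_BAD_STATEID"
    # Superseded by one or more later locking operations.
    if sid.seqid < current_seqid:
        return "NFS4ERR_OLD_STATEID"
    # A seqid this instance never issued indicates a logic error.
    if sid.seqid > current_seqid:
        return "NFS4ERR_BAD_STATEID"
    return "NFS4_OK"
```

Checking the verifier first means a stateid from a previous instantiation is reported as stale even if its index happens to match a live table entry.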
| 8.1.4. Use of the stateid and Locking | | 8.1.4. Use of the stateid and Locking | |
| | | | |
| All READ, WRITE and SETATTR operations contain a stateid. For the | | All READ, WRITE and SETATTR operations contain a stateid. For the | |
| purposes of this section, SETATTR operations which change the size | | purposes of this section, SETATTR operations which change the size | |
| attribute of a file are treated as if they are writing the area | | attribute of a file are treated as if they are writing the area | |
|
| between the old and new size (i.e. the range truncated or added to | | between the old and new size (i.e., the range truncated or added to | |
| the file by means of the SETATTR), even where SETATTR is not | | the file by means of the SETATTR), even where SETATTR is not | |
| explicitly mentioned in the text. | | explicitly mentioned in the text. | |
| | | | |
| If the lock_owner performs a READ or WRITE in a situation in which it | | If the lock_owner performs a READ or WRITE in a situation in which it | |
| has established a lock or share reservation on the server (any OPEN | | has established a lock or share reservation on the server (any OPEN | |
| constitutes a share reservation) the stateid (previously returned by | | constitutes a share reservation) the stateid (previously returned by | |
| the server) must be used to indicate what locks, including both | | the server) must be used to indicate what locks, including both | |
| record locks and share reservations, are held by the lockowner. If | | record locks and share reservations, are held by the lockowner. If | |
| no state is established by the client, either record lock or share | | no state is established by the client, either record lock or share | |
| reservation, a stateid of all bits 0 is used. Regardless of whether a | | reservation, a stateid of all bits 0 is used. Regardless of whether a | |
| stateid of all bits 0, or a stateid returned by the server is used, | | stateid of all bits 0, or a stateid returned by the server is used, | |
| if there is a conflicting share reservation or mandatory record lock | | if there is a conflicting share reservation or mandatory record lock | |
| held on the file, the server MUST refuse to service the READ or WRITE | | held on the file, the server MUST refuse to service the READ or WRITE | |
| operation. | | operation. | |
| | | | |
| Share reservations are established by OPEN operations and by their | | Share reservations are established by OPEN operations and by their | |
| nature are mandatory in that when the OPEN denies READ or WRITE | | nature are mandatory in that when the OPEN denies READ or WRITE | |
| operations, that denial results in such operations being rejected | | operations, that denial results in such operations being rejected | |
| with error NFS4ERR_LOCKED. Record locks may be implemented by the | | with error NFS4ERR_LOCKED. Record locks may be implemented by the | |
| | | | |
| skipping to change at page 72, line 29 | | skipping to change at page 72, line 5 | |
| file being accessed (for example, some UNIX-based servers support a | | file being accessed (for example, some UNIX-based servers support a | |
| "mandatory lock bit" on the mode attribute such that if set, record | | "mandatory lock bit" on the mode attribute such that if set, record | |
| locks are required on the file before I/O is possible). When record | | locks are required on the file before I/O is possible). When record | |
| locks are advisory, they only prevent the granting of conflicting | | locks are advisory, they only prevent the granting of conflicting | |
| lock requests and have no effect on READs or WRITEs. Mandatory | | lock requests and have no effect on READs or WRITEs. Mandatory | |
| record locks, however, prevent conflicting I/O operations. When they | | record locks, however, prevent conflicting I/O operations. When they | |
| are attempted, they are rejected with NFS4ERR_LOCKED. When the | | are attempted, they are rejected with NFS4ERR_LOCKED. When the | |
| client gets NFS4ERR_LOCKED on a file it knows it has the proper share | | client gets NFS4ERR_LOCKED on a file it knows it has the proper share | |
| reservation for, it will need to issue a LOCK request on the region | | reservation for, it will need to issue a LOCK request on the region | |
| of the file that includes the region the I/O was to be performed on, | | of the file that includes the region the I/O was to be performed on, | |
|
| with an appropriate locktype (i.e. READ*_LT for a READ operation, | | with an appropriate locktype (i.e., READ*_LT for a READ operation, | |
| WRITE*_LT for a WRITE operation). | | WRITE*_LT for a WRITE operation). | |
| | | | |
| With NFS version 3, there was no notion of a stateid so there was no | | With NFS version 3, there was no notion of a stateid so there was no | |
| way to tell if the application process of the client sending the READ | | way to tell if the application process of the client sending the READ | |
| or WRITE operation had also acquired the appropriate record lock on | | or WRITE operation had also acquired the appropriate record lock on | |
|
| the file. Thus there was no way to implement mandatory locking. With | | the file. Thus there was no way to implement mandatory locking. | |
| the stateid construct, this barrier has been removed. | | With the stateid construct, this barrier has been removed. | |
| | | | |
| Note that for UNIX environments that support mandatory file locking, | | Note that for UNIX environments that support mandatory file locking, | |
| the distinction between advisory and mandatory locking is subtle. In | | the distinction between advisory and mandatory locking is subtle. In | |
| fact, advisory and mandatory record locks are exactly the same in so | | fact, advisory and mandatory record locks are exactly the same in so | |
| far as the APIs and requirements on implementation. If the mandatory | | far as the APIs and requirements on implementation. If the mandatory | |
| lock attribute is set on the file, the server checks to see if the | | lock attribute is set on the file, the server checks to see if the | |
| lockowner has an appropriate shared (read) or exclusive (write) | | lockowner has an appropriate shared (read) or exclusive (write) | |
| record lock on the region it wishes to read or write to. If there is | | record lock on the region it wishes to read or write to. If there is | |
| no appropriate lock, the server checks if there is a conflicting lock | | no appropriate lock, the server checks if there is a conflicting lock | |
| (which can be done by attempting to acquire the conflicting lock on | | (which can be done by attempting to acquire the conflicting lock on | |
| | | | |
| skipping to change at page 73, line 5 | | skipping to change at page 72, line 35 | |
| NFS4ERR_LOCKED. | | NFS4ERR_LOCKED. | |
| | | | |
| For Windows environments, there are no advisory record locks, so the | | For Windows environments, there are no advisory record locks, so the | |
| server always checks for record locks during I/O requests. | | server always checks for record locks during I/O requests. | |
| | | | |
| Thus, the NFS version 4 LOCK operation does not need to distinguish | | Thus, the NFS version 4 LOCK operation does not need to distinguish | |
| between advisory and mandatory record locks. It is the NFS version 4 | | between advisory and mandatory record locks. It is the NFS version 4 | |
| server's processing of the READ and WRITE operations that introduces | | server's processing of the READ and WRITE operations that introduces | |
| the distinction. | | the distinction. | |
| | | | |
| Every stateid other than the special stateid values noted in this | | Every stateid other than the special stateid values noted in this | |
|
| section, whether returned by an OPEN-type operation (i.e. OPEN, | | section, whether returned by an OPEN-type operation (i.e., OPEN, | |
| OPEN_DOWNGRADE), or by a LOCK-type operation (i.e. LOCK or LOCKU), | | OPEN_DOWNGRADE), or by a LOCK-type operation (i.e., LOCK or LOCKU), | |
| defines an access mode for the file (i.e. READ, WRITE, or READ-WRITE) | | defines an access mode for the file (i.e., READ, WRITE, or READ- | |
| as established by the original OPEN which began the stateid sequence, | | WRITE) as established by the original OPEN which began the stateid | |
| and as modified by subsequent OPENs and OPEN_DOWNGRADEs within that | | sequence, and as modified by subsequent OPENs and OPEN_DOWNGRADEs | |
| stateid sequence. When a READ, WRITE, or SETATTR which specifies the | | within that stateid sequence. When a READ, WRITE, or SETATTR which | |
| size attribute, is done, the operation is subject to checking against | | specifies the size attribute, is done, the operation is subject to | |
| the access mode to verify that the operation is appropriate given the | | checking against the access mode to verify that the operation is | |
| OPEN with which the operation is associated. | | appropriate given the OPEN with which the operation is associated. | |
| | | | |
|
| In the case of WRITE-type operations (i.e. WRITEs and SETATTRs which | | In the case of WRITE-type operations (i.e., WRITEs and SETATTRs which | |
| set size), the server must verify that the access mode allows writing | | set size), the server must verify that the access mode allows writing | |
| and return an NFS4ERR_OPENMODE error if it does not. In the case of | | and return an NFS4ERR_OPENMODE error if it does not. In the case of | |
| READ, the server may perform the corresponding check on the access | | READ, the server may perform the corresponding check on the access | |
| mode, or it may choose to allow READ on opens for WRITE only, to | | mode, or it may choose to allow READ on opens for WRITE only, to | |
| accommodate clients whose write implementation may unavoidably do | | accommodate clients whose write implementation may unavoidably do | |
|
| reads (e.g. due to buffer cache constraints). However, even if READs | | reads (e.g., due to buffer cache constraints). However, even if | |
| are allowed in these circumstances, the server MUST still check for | | READs are allowed in these circumstances, the server MUST still check | |
| locks that conflict with the READ (e.g. another open specifying denial | | for locks that conflict with the READ (e.g., another open specifying | |
| of READs). Note that a server which does enforce the access mode | | denial of READs). Note that a server which does enforce the access | |
| check on READs need not explicitly check for conflicting share | | mode check on READs need not explicitly check for conflicting share | |
| reservations since the existence of OPEN for read access guarantees | | reservations since the existence of OPEN for read access guarantees | |
| that no conflicting share reservation can exist. | | that no conflicting share reservation can exist. | |
| | | | |
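The access mode check described above might be sketched as below. This is an illustration only: the operation names, the representation of the access mode as a set of strings, and the enforce_read_check switch are assumptions made for the example. It also assumes that a server which chooses to enforce the check on READ reports the failure with NFS4ERR_OPENMODE, mirroring the WRITE case.

```python
# WRITE-type operations: WRITE, and SETATTR when it sets the size attribute.
WRITE_TYPE_OPS = {"WRITE", "SETATTR_SIZE"}


def check_access_mode(op: str, access_mode: set, enforce_read_check: bool = True) -> str:
    """Check an operation against the access mode of its associated OPEN.

    `access_mode` is a subset of {"READ", "WRITE"}. A result of NFS4_OK
    here does not replace the separate check for conflicting locks.
    """
    if op in WRITE_TYPE_OPS:
        # The server must verify that the access mode allows writing.
        return "NFS4_OK" if "WRITE" in access_mode else "NFS4ERR_OPENMODE"
    if op == "READ":
        # The server may skip this check to accommodate clients whose
        # write path unavoidably issues reads.
        if enforce_read_check and "READ" not in access_mode:
            return "NFS4ERR_OPENMODE"
        return "NFS4_OK"
    raise ValueError("operation is not subject to the access mode check")
```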
| A stateid of all bits 1 (one) MAY allow READ operations to bypass | | A stateid of all bits 1 (one) MAY allow READ operations to bypass | |
| locking checks at the server. However, WRITE operations with a | | locking checks at the server. However, WRITE operations with a | |
| stateid with bits all 1 (one) MUST NOT bypass locking checks and are | | stateid with bits all 1 (one) MUST NOT bypass locking checks and are | |
| treated exactly the same as if a stateid of all bits 0 were used. | | treated exactly the same as if a stateid of all bits 0 were used. | |
| | | | |
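A short sketch of how a server might recognize the two special stateids follows. It assumes a stateid consisting of a 32-bit seqid and a 12-byte "other" field. The function names and the server_allows_bypass flag are hypothetical; the flag models the server's choice under the MAY above.

```python
OTHER_LEN = 12              # assumed size in bytes of the "other" field
SEQID_ALL_ONES = 0xFFFFFFFF


def is_all_zeros(seqid: int, other: bytes) -> bool:
    """True for the special stateid of all bits 0."""
    return seqid == 0 and other == bytes(OTHER_LEN)


def is_all_ones(seqid: int, other: bytes) -> bool:
    """True for the special stateid of all bits 1."""
    return seqid == SEQID_ALL_ONES and other == b"\xff" * OTHER_LEN


def bypasses_locking_checks(op: str, seqid: int, other: bytes,
                            server_allows_bypass: bool) -> bool:
    """Only a READ carrying the all-ones stateid may bypass locking checks.

    A WRITE with the all-ones stateid is handled exactly as if the
    all-zeros stateid had been used, so it never bypasses the checks.
    """
    return op == "READ" and server_allows_bypass and is_all_ones(seqid, other)
```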
| A lock may not be granted while a READ or WRITE operation using one | | A lock may not be granted while a READ or WRITE operation using one | |
| of the special stateids is being performed and the range of the lock | | of the special stateids is being performed and the range of the lock | |
| | | | |
| skipping to change at page 74, line 4 | | skipping to change at page 73, line 38 | |
| Locking is different than most NFS operations as it requires "at- | | Locking is different than most NFS operations as it requires "at- | |
| most-one" semantics that are not provided by ONCRPC. ONCRPC over a | | most-one" semantics that are not provided by ONCRPC. ONCRPC over a | |
| reliable transport is not sufficient because a sequence of locking | | reliable transport is not sufficient because a sequence of locking | |
| requests may span multiple TCP connections. In the face of | | requests may span multiple TCP connections. In the face of | |
| retransmission or reordering, lock or unlock requests must have a | | retransmission or reordering, lock or unlock requests must have a | |
| well defined and consistent behavior. To accomplish this, each lock | | well defined and consistent behavior. To accomplish this, each lock | |
| request contains a sequence number that is a consecutively increasing | | request contains a sequence number that is a consecutively increasing | |
| integer. Different lock_owners have different sequences. The server | | integer. Different lock_owners have different sequences. The server | |
| maintains the last sequence number (L) received and the response that | | maintains the last sequence number (L) received and the response that | |
| was returned. The first request issued for any given lock_owner is | | was returned. The first request issued for any given lock_owner is | |
| issued with a sequence number of zero. | | issued with a sequence number of zero. | |
| | | | |
| Note that for requests that contain a sequence number, for each | | Note that for requests that contain a sequence number, for each | |
| lock_owner, there should be no more than one outstanding request. | | lock_owner, there should be no more than one outstanding request. | |
| | | | |
| If a request (r) with a previous sequence number (r < L) is received, | | If a request (r) with a previous sequence number (r < L) is received, | |
| it is rejected with the return of error NFS4ERR_BAD_SEQID. Given a | | it is rejected with the return of error NFS4ERR_BAD_SEQID. Given a | |
| properly-functioning client, the response to (r) must have been | | properly-functioning client, the response to (r) must have been | |
| received before the last request (L) was sent. If a duplicate of | | received before the last request (L) was sent. If a duplicate of | |
| the last request (r == L) is received, the stored response is returned. | | the last request (r == L) is received, the stored response is returned. | |
| If a request beyond the next sequence (r == L + 2) is received, it is | | If a request beyond the next sequence (r == L + 2) is received, it is | |
| rejected with the return of error NFS4ERR_BAD_SEQID. Sequence | | rejected with the return of error NFS4ERR_BAD_SEQID. Sequence | |
| history is reinitialized whenever the SETCLIENTID/SETCLIENTID_CONFIRM | | history is reinitialized whenever the SETCLIENTID/SETCLIENTID_CONFIRM | |
| sequence changes the client verifier. | | sequence changes the client verifier. | |
| | | | |
| Since the sequence number is represented with an unsigned 32-bit | | Since the sequence number is represented with an unsigned 32-bit | |
| integer, the arithmetic involved with the sequence number is mod | | integer, the arithmetic involved with the sequence number is mod | |
|
| 2^32. For an example of modulo arithetic involving sequence numbers | | 2^32. For an example of modulo arithmetic involving sequence numbers | |
| see [RFC793]. | | see [RFC793]. | |
| | | | |
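The sequence number rules above, including the mod 2^32 arithmetic, can be sketched as follows. The function name and the returned labels "REPLAY" and "PROCESS" are illustrative only. The sketch makes one simplifying assumption: any request other than a duplicate of the last request (r == L) or the next expected request (r == L + 1) is rejected, which covers both the "previous" and the "beyond the next" cases.

```python
SEQID_MODULUS = 2 ** 32  # sequence numbers are unsigned 32-bit integers


def classify_seqid(r: int, last: int) -> str:
    """Compare a request's sequence number r with the last one received (L)."""
    delta = (r - last) % SEQID_MODULUS
    if delta == 0:
        # Duplicate of the last request: return the stored response.
        return "REPLAY"
    if delta == 1:
        # Next expected request: process it and store the new response.
        return "PROCESS"
    # Earlier than L, or beyond L + 1.
    return "NFS4ERR_BAD_SEQID"
```

Taking the difference modulo 2^32 lets a request with sequence number 0 follow one with sequence number 2^32 - 1 without special handling.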
| It is critical the server maintain the last response sent to the | | It is critical the server maintain the last response sent to the | |
| client to provide a more reliable cache of duplicate non-idempotent | | client to provide a more reliable cache of duplicate non-idempotent | |
| requests than that of the traditional cache described in [Juszczak]. | | requests than that of the traditional cache described in [Juszczak]. | |
| The traditional duplicate request cache uses a least recently used | | The traditional duplicate request cache uses a least recently used | |
| algorithm for removing unneeded requests. However, the last lock | | algorithm for removing unneeded requests. However, the last lock | |
| request and response on a given lock_owner must be cached as long as | | request and response on a given lock_owner must be cached as long as | |
| the lock state exists on the server. | | the lock state exists on the server. | |
| | | | |
| | | | |
| skipping to change at page 75, line 5 | | skipping to change at page 74, line 41 | |
| the methods described above, there are no risks of a Byzantine router | | the methods described above, there are no risks of a Byzantine router | |
| re-sending old requests. The server need only maintain the | | re-sending old requests. The server need only maintain the | |
| (lock_owner, sequence number) state as long as there are open files | | (lock_owner, sequence number) state as long as there are open files | |
| or closed files with locks outstanding. | | or closed files with locks outstanding. | |
| | | | |
| LOCK, LOCKU, OPEN, OPEN_DOWNGRADE, and CLOSE each contain a sequence | | LOCK, LOCKU, OPEN, OPEN_DOWNGRADE, and CLOSE each contain a sequence | |
| number and therefore the risk of the replay of these operations | | number and therefore the risk of the replay of these operations | |
| resulting in undesired effects is non-existent while the server | | resulting in undesired effects is non-existent while the server | |
| maintains the lock_owner state. | | maintains the lock_owner state. | |
| | | | |
| 8.1.7. Releasing lock_owner State | | 8.1.7. Releasing lock_owner State | |
| | | | |
| When a particular lock_owner no longer holds open or file locking | | When a particular lock_owner no longer holds open or file locking | |
| state at the server, the server may choose to release the sequence | | state at the server, the server may choose to release the sequence | |
| number state associated with the lock_owner. The server may make | | number state associated with the lock_owner. The server may make | |
| this choice based on lease expiration, for the reclamation of server | | this choice based on lease expiration, for the reclamation of server | |
| memory, or other implementation specific details. In any event, the | | memory, or other implementation specific details. In any event, the | |
| server is able to do this safely only when the lock_owner no longer | | server is able to do this safely only when the lock_owner no longer | |
| is being utilized by the client. The server may choose to hold the | | is being utilized by the client. The server may choose to hold the | |
| lock_owner state in the event that retransmitted requests are | | lock_owner state in the event that retransmitted requests are | |
| | | | |
| skipping to change at page 75, line 54 | | skipping to change at page 75, line 39 | |
| they would be prevented from acting in a timely fashion on | | they would be prevented from acting in a timely fashion on | |
| information received, because that information would be provisional, | | information received, because that information would be provisional, | |
| subject to deletion upon non-confirmation. Fortunately, these are | | subject to deletion upon non-confirmation. Fortunately, these are | |
| situations in which the server can avoid the need for confirmation | | situations in which the server can avoid the need for confirmation | |
| when responding to open requests. The two constraints are: | | when responding to open requests. The two constraints are: | |
| | | | |
| o The server must not bestow a delegation for any open which would | | o The server must not bestow a delegation for any open which would | |
| require confirmation. | | require confirmation. | |
| | | | |
| o The server MUST NOT require confirmation on a reclaim-type open | | o The server MUST NOT require confirmation on a reclaim-type open | |
|
| (i.e. one specifying claim type CLAIM_PREVIOUS or | | (i.e., one specifying claim type CLAIM_PREVIOUS or | |
| CLAIM_DELEGATE_PREV). | | CLAIM_DELEGATE_PREV). | |
| | | | |
| These constraints are related in that reclaim-type opens are the only | | These constraints are related in that reclaim-type opens are the only | |
| ones in which the server may be required to send a delegation. For | | ones in which the server may be required to send a delegation. For | |
| CLAIM_NULL, sending the delegation is optional while for | | CLAIM_NULL, sending the delegation is optional while for | |
| CLAIM_DELEGATE_CUR, no delegation is sent. | | CLAIM_DELEGATE_CUR, no delegation is sent. | |
| | | | |
| Delegations being sent with an open requiring confirmation are | | Delegations being sent with an open requiring confirmation are | |
| troublesome because recovering from non-confirmation adds undue | | troublesome because recovering from non-confirmation adds undue | |
| complexity to the protocol while requiring confirmation on reclaim- | | complexity to the protocol while requiring confirmation on reclaim- | |
|
| type opens poses difficulties in that the inability to resolve the | | type opens poses difficulties in that the inability to resolve | |
| status of the reclaim until lease expiration may make it difficult to | | the status of the reclaim until lease expiration may make it | |
| have timely determination of the set of locks being reclaimed (since | | difficult to have timely determination of the set of locks being | |
| the grace period may expire). | | reclaimed (since the grace period may expire). | |
| | | | |
| Requiring open confirmation on reclaim-type opens is avoidable | | Requiring open confirmation on reclaim-type opens is avoidable | |
| because of the nature of the environments in which such opens are | | because of the nature of the environments in which such opens are | |
| done. For CLAIM_PREVIOUS opens, this is immediately after server | | done. For CLAIM_PREVIOUS opens, this is immediately after server | |
| reboot, so there should be no time for lockowners to be created, | | reboot, so there should be no time for lockowners to be created, | |
| found to be unused, and recycled. For CLAIM_DELEGATE_PREV opens, we | | found to be unused, and recycled. For CLAIM_DELEGATE_PREV opens, we | |
| are dealing with a client reboot situation. A server which supports | | are dealing with a client reboot situation. A server which supports | |
| delegation can be sure that no lockowners for that client have been | | delegation can be sure that no lockowners for that client have been | |
| recycled since client initialization and thus can ensure that | | recycled since client initialization and thus can ensure that | |
| confirmation will not be required. | | confirmation will not be required. | |
| | | | |
| skipping to change at page 77, line 4 | | skipping to change at page 76, line 45 | |
| the recovery of file locking state in the event of server failure. | | the recovery of file locking state in the event of server failure. | |
| As discussed in the section "Server Failure and Recovery" below, the | | As discussed in the section "Server Failure and Recovery" below, the | |
| server may employ certain optimizations during recovery that work | | server may employ certain optimizations during recovery that work | |
| effectively only when the client's behavior during lock recovery is | | effectively only when the client's behavior during lock recovery is | |
| similar to the client's locking behavior prior to server failure. | | similar to the client's locking behavior prior to server failure. | |
| | | | |
| 8.3. Upgrading and Downgrading Locks | | 8.3. Upgrading and Downgrading Locks | |
| | | | |
| If a client has a write lock on a record, it can request an atomic | | If a client has a write lock on a record, it can request an atomic | |
| downgrade of the lock to a read lock via the LOCK request, by setting | | downgrade of the lock to a read lock via the LOCK request, by setting | |
| the type to READ_LT. If the server supports atomic downgrade, the | | the type to READ_LT. If the server supports atomic downgrade, the | |
| request will succeed. If not, it will return NFS4ERR_LOCK_NOTSUPP. | | request will succeed. If not, it will return NFS4ERR_LOCK_NOTSUPP. | |
| The client should be prepared to receive this error, and if | | The client should be prepared to receive this error, and if | |
| appropriate, report the error to the requesting application. | | appropriate, report the error to the requesting application. | |
| | | | |
| If a client has a read lock on a record, it can request an atomic | | If a client has a read lock on a record, it can request an atomic | |
| upgrade of the lock to a write lock via the LOCK request by setting | | upgrade of the lock to a write lock via the LOCK request by setting | |
| the type to WRITE_LT or WRITEW_LT. If the server does not support | | the type to WRITE_LT or WRITEW_LT. If the server does not support | |
| atomic upgrade, it will return NFS4ERR_LOCK_NOTSUPP. If the upgrade | | atomic upgrade, it will return NFS4ERR_LOCK_NOTSUPP. If the upgrade | |
| can be achieved without an existing conflict, the request will | | can be achieved without an existing conflict, the request will | |
| | | | |
| skipping to change at page 78, line 4 | | skipping to change at page 77, line 50 | |
| released, allowing a successful return. In this way, clients can | | released, allowing a successful return. In this way, clients can | |
| avoid the burden of needlessly frequent polling for blocking locks. | | avoid the burden of needlessly frequent polling for blocking locks. | |
| The server should take care in the length of delay in the event the | | The server should take care in the length of delay in the event the | |
| client retransmits the request. | | client retransmits the request. | |
| | | | |
| 8.5. Lease Renewal | | 8.5. Lease Renewal | |
| | | | |
| The purpose of a lease is to allow a server to remove stale locks | | The purpose of a lease is to allow a server to remove stale locks | |
| that are held by a client that has crashed or is otherwise | | that are held by a client that has crashed or is otherwise | |
| unreachable. It is not a mechanism for cache consistency and lease | | unreachable. It is not a mechanism for cache consistency and lease | |
| renewals may not be denied if the lease interval has not expired. | | renewals may not be denied if the lease interval has not expired. | |
| | | | |
| The following events cause implicit renewal of all of the leases for | | The following events cause implicit renewal of all of the leases for | |
|
| a given client (i.e. all those sharing a given clientid). Each of | | a given client (i.e., all those sharing a given clientid). Each of | |
| these is a positive indication that the client is still active and | | these is a positive indication that the client is still active and | |
| that the associated state held at the server, for the client, is | | that the associated state held at the server, for the client, is | |
| still valid. | | still valid. | |
| | | | |
| o An OPEN with a valid clientid. | | o An OPEN with a valid clientid. | |
| | | | |
| o Any operation made with a valid stateid (CLOSE, DELEGPURGE, | | o Any operation made with a valid stateid (CLOSE, DELEGPURGE, | |
| DELEGRETURN, LOCK, LOCKU, OPEN, OPEN_CONFIRM, OPEN_DOWNGRADE, | | DELEGRETURN, LOCK, LOCKU, OPEN, OPEN_CONFIRM, OPEN_DOWNGRADE, | |
| READ, RENEW, SETATTR, WRITE). This does not include the special | | READ, RENEW, SETATTR, WRITE). This does not include the special | |
| stateids of all bits 0 or all bits 1. | | stateids of all bits 0 or all bits 1. | |
| | | | |
|
| Note that if the client had restarted or rebooted, the | | Note that if the client had restarted or rebooted, the client | |
| client would not be making these requests without issuing | | would not be making these requests without issuing the | |
| the SETCLIENTID/SETCLIENTID_CONFIRM sequence. The use of | | SETCLIENTID/SETCLIENTID_CONFIRM sequence. The use of the | |
| the SETCLIENTID/SETCLIENTID_CONFIRM sequence (one that | | SETCLIENTID/SETCLIENTID_CONFIRM sequence (one that changes the | |
| changes the client verifier) notifies the server to drop | | client verifier) notifies the server to drop the locking state | |
| the locking state associated with the client. | | associated with the client. SETCLIENTID/SETCLIENTID_CONFIRM never | |
| SETCLIENTID/SETCLIENTID_CONFIRM never renews a lease. | | renews a lease. | |
| | | | |
|
| If the server has rebooted, the stateids | | If the server has rebooted, the stateids (NFS4ERR_STALE_STATEID | |
| (NFS4ERR_STALE_STATEID error) or the clientid | | error) or the clientid (NFS4ERR_STALE_CLIENTID error) will not be | |
| (NFS4ERR_STALE_CLIENTID error) will not be valid hence | | valid hence preventing spurious renewals. | |
| preventing spurious renewals. | | | |
| | | | |
| This approach allows for low overhead lease renewal which scales | | This approach allows for low overhead lease renewal which scales | |
| well. In the typical case no extra RPC calls are required for lease | | well. In the typical case no extra RPC calls are required for lease | |
| renewal and in the worst case one RPC is required every lease period | | renewal and in the worst case one RPC is required every lease period | |
|
| (i.e. a RENEW operation). The number of locks held by the client is | | (i.e., a RENEW operation). The number of locks held by the client is | |
| not a factor since all state for the client is involved with the | | not a factor since all state for the client is involved with the | |
| lease renewal action. | | lease renewal action. | |
| | | | |
| Since all operations that create a new lease also renew existing | | Since all operations that create a new lease also renew existing | |
| leases, the server must maintain a common lease expiration time for | | leases, the server must maintain a common lease expiration time for | |
| all valid leases for a given client. This lease time can then be | | all valid leases for a given client. This lease time can then be | |
| easily updated upon implicit lease renewal actions. | | easily updated upon implicit lease renewal actions. | |
| | | | |
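The common lease expiration time per client can be sketched as follows. This is an illustrative sketch only: the class ClientLeases and its methods are hypothetical names, times are plain seconds supplied by the caller, and the choice to refuse renewal once the lease interval has expired is one permitted server policy assumed here, not a requirement.

```python
from typing import Dict


class ClientLeases:
    """Hypothetical table holding one expiration time per clientid."""

    def __init__(self, lease_period: float) -> None:
        self.lease_period = lease_period
        self._expiry: Dict[int, float] = {}

    def establish(self, clientid: int, now: float) -> None:
        # Called when state is first created for a confirmed clientid.
        self._expiry[clientid] = now + self.lease_period

    def expires_at(self, clientid: int) -> float:
        return self._expiry[clientid]

    def note_operation(self, clientid: int, now: float,
                       special_stateid: bool = False) -> bool:
        """Implicitly renew on a qualifying operation; True if renewed."""
        if special_stateid:
            # The all-zeros and all-ones stateids do not renew leases.
            return False
        if clientid not in self._expiry:
            # Unknown clientid: nothing to renew.
            return False
        if now >= self._expiry[clientid]:
            # Assumed policy: an expired lease is not silently revived.
            return False
        # A single update covers every lease held by this client.
        self._expiry[clientid] = now + self.lease_period
        return True
```

Because the table holds a single value per clientid, the cost of a renewal in this sketch does not depend on how many locks the client holds.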
| 8.6. Crash Recovery | | 8.6. Crash Recovery | |
| | | | |
| The important requirement in crash recovery is that both the client | | The important requirement in crash recovery is that both the client | |
| and the server know when the other has failed. Additionally, it is | | and the server know when the other has failed. Additionally, it is | |
| required that a client sees a consistent view of data across server | | required that a client sees a consistent view of data across server | |
| restarts or reboots. All READ and WRITE operations that may have | | restarts or reboots. All READ and WRITE operations that may have | |
| been queued within the client or network buffers must wait until the | | been queued within the client or network buffers must wait until the | |
| client has successfully recovered the locks protecting the READ and | | client has successfully recovered the locks protecting the READ and | |
| WRITE operations. | | WRITE operations. | |
| | | | |
| 8.6.1. Client Failure and Recovery | | 8.6.1. Client Failure and Recovery | |
| | | | |
| In the event that a client fails, the server may recover the client's | | In the event that a client fails, the server may recover the client's | |
| locks when the associated leases have expired. Conflicting locks | | locks when the associated leases have expired. Conflicting locks | |
| from another client may only be granted after this lease expiration. | | from another client may only be granted after this lease expiration. | |
| If the client is able to restart or reinitialize within the lease | | If the client is able to restart or reinitialize within the lease | |
| period the client may be forced to wait the remainder of the lease | | period the client may be forced to wait the remainder of the lease | |
| period before obtaining new locks. | | period before obtaining new locks. | |
| | | | |
| To minimize client delay upon restart, lock requests are associated | | To minimize client delay upon restart, lock requests are associated | |
| | | | |
| skipping to change at page 80, line 4 | | skipping to change at page 80, line 9 | |
| | | | |
| A client can determine that server failure (and thus loss of locking | | A client can determine that server failure (and thus loss of locking | |
| state) has occurred, when it receives one of two errors. The | | state) has occurred, when it receives one of two errors. The | |
| NFS4ERR_STALE_STATEID error indicates a stateid invalidated by a | | NFS4ERR_STALE_STATEID error indicates a stateid invalidated by a | |
| reboot or restart. The NFS4ERR_STALE_CLIENTID error indicates a | | reboot or restart. The NFS4ERR_STALE_CLIENTID error indicates a | |
| clientid invalidated by reboot or restart. When either of these is | | clientid invalidated by reboot or restart. When either of these is | |
| received, the client must establish a new clientid (see the section | | received, the client must establish a new clientid (see the section | |
| "Client ID") and re-establish the locking state as discussed below. | | "Client ID") and re-establish the locking state as discussed below. | |
| | | | |
| The period of special handling of locking and READs and WRITEs, equal | | The period of special handling of locking and READs and WRITEs, equal | |
| in duration to the lease period, is referred to as the "grace | | in duration to the lease period, is referred to as the "grace | |
| period". During the grace period, clients recover locks and the | | period". During the grace period, clients recover locks and the | |
|
| associated state by reclaim-type locking requests (i.e. LOCK requests | | associated state by reclaim-type locking requests (i.e., LOCK | |
| with reclaim set to true and OPEN operations with a claim type of | | requests with reclaim set to true and OPEN operations with a claim | |
| CLAIM_PREVIOUS). During the grace period, the server must reject | | type of CLAIM_PREVIOUS). During the grace period, the server must | |
| READ and WRITE operations and non-reclaim locking requests (i.e. | | reject READ and WRITE operations and non-reclaim locking requests | |
| other LOCK and OPEN operations) with an error of NFS4ERR_GRACE. | | (i.e., other LOCK and OPEN operations) with an error of | |
| | | NFS4ERR_GRACE. | |
| | | | |
| If the server can reliably determine that granting a non-reclaim | | If the server can reliably determine that granting a non-reclaim | |
| request will not conflict with reclamation of locks by other clients, | | request will not conflict with reclamation of locks by other clients, | |
| the NFS4ERR_GRACE error does not have to be returned and the non- | | the NFS4ERR_GRACE error does not have to be returned and the non- | |
| reclaim client request can be serviced. For the server to be able to | | reclaim client request can be serviced. For the server to be able to | |
| service READ and WRITE operations during the grace period, it must | | service READ and WRITE operations during the grace period, it must | |
| again be able to guarantee that no possible conflict could arise | | again be able to guarantee that no possible conflict could arise | |
| between an impending reclaim locking request and the READ or WRITE | | between an impending reclaim locking request and the READ or WRITE | |
| operation. If the server is unable to offer that guarantee, the | | operation. If the server is unable to offer that guarantee, the | |
| NFS4ERR_GRACE error must be returned to the client. | | NFS4ERR_GRACE error must be returned to the client. | |
| | | | |
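The server-side rule for requests arriving during the grace period can be sketched as below. This is a hedged illustration: the function names and the provably_conflict_free flag are hypothetical, and how a server would actually establish that guarantee is left entirely open. A return value of NFS4_OK here only means that the request may proceed to normal processing.

```python
NFS4_OK = "NFS4_OK"
NFS4ERR_GRACE = "NFS4ERR_GRACE"


def in_grace_period(now: float, restart_time: float, grace_period: float) -> bool:
    # The grace period starts at server restart and lasts grace_period seconds.
    return restart_time <= now < restart_time + grace_period


def status_during_grace(is_reclaim: bool,
                        provably_conflict_free: bool = False) -> str:
    """Decide how to treat a request that arrives inside the grace period."""
    if is_reclaim:
        # Reclaim-type LOCK and OPEN requests are what the period is for.
        return NFS4_OK
    if provably_conflict_free:
        # Assumption: the server has some reliable way to show that this
        # request cannot conflict with any impending reclaim.
        return NFS4_OK
    # READ, WRITE, and non-reclaim locking requests are rejected.
    return NFS4ERR_GRACE
```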
| skipping to change at page 81, line 4 | | skipping to change at page 81, line 15 | |
| Clients should be prepared for the return of NFS4ERR_GRACE errors for | | Clients should be prepared for the return of NFS4ERR_GRACE errors for | |
| non-reclaim lock and I/O requests. In this case the client should | | non-reclaim lock and I/O requests. In this case the client should | |
| employ a retry mechanism for the request. A delay (on the order of | | employ a retry mechanism for the request. A delay (on the order of | |
| several seconds) between retries should be used to avoid overwhelming | | several seconds) between retries should be used to avoid overwhelming | |
| the server. Further discussion of the general issue is included in | | the server. Further discussion of the general issue is included in | |
| [Floyd]. The client must account for servers that are able to | | [Floyd]. The client must account for servers that are able to | |
| perform I/O and non-reclaim locking requests within the grace period | | perform I/O and non-reclaim locking requests within the grace period | |
| as well as those that cannot do so. | | as well as those that cannot do so. | |
| | | | |
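A client-side retry loop for NFS4ERR_GRACE might look like the following sketch. The function name, the five-second default delay, and the attempt limit are assumptions chosen for illustration; the text above only asks for a delay on the order of several seconds. The send callable stands in for whatever issues the request and returns its status.

```python
import time
from typing import Callable

NFS4ERR_GRACE = "NFS4ERR_GRACE"


def call_with_grace_retry(send: Callable[[], str],
                          delay_seconds: float = 5.0,
                          max_attempts: int = 20,
                          sleep: Callable[[float], None] = time.sleep) -> str:
    """Retry a non-reclaim request while the server reports NFS4ERR_GRACE."""
    for attempt in range(max_attempts):
        status = send()
        if status != NFS4ERR_GRACE:
            # Success or an unrelated error: hand it back to the caller.
            return status
        if attempt + 1 < max_attempts:
            # Pause between retries so the server is not overwhelmed.
            sleep(delay_seconds)
    # Still in the grace period after all attempts.
    return NFS4ERR_GRACE
```

The sleep parameter is injectable only so the loop can be exercised without real delays. A server that services the request during the grace period simply causes the first attempt to return.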
| A reclaim-type locking request outside the server's grace period can | | A reclaim-type locking request outside the server's grace period can | |
| only succeed if the server can guarantee that no conflicting lock or | | only succeed if the server can guarantee that no conflicting lock or | |
| I/O request has been granted since reboot or restart. | | I/O request has been granted since reboot or restart. | |
| | | | |
| A server may, upon restart, establish a new value for the lease | | A server may, upon restart, establish a new value for the lease | |
| period. Therefore, clients should, once a new clientid is | | period. Therefore, clients should, once a new clientid is | |
| established, refetch the lease_time attribute and use it as the basis | | established, refetch the lease_time attribute and use it as the basis | |
|
| for lease renewal for the lease associated with that server. However, | | for lease renewal for the lease associated with that server. | |
| the server must establish, for this restart event, a grace period at | | However, the server must establish, for this restart event, a grace | |
| least as long as the lease period for the previous server | | period at least as long as the lease period for the previous server | |
| instantiation. This allows the client state obtained during the | | instantiation. This allows the client state obtained during the | |
| previous server instance to be reliably re-established. | | previous server instance to be reliably re-established. | |
| | | | |
| 8.6.3. Network Partitions and Recovery | | 8.6.3. Network Partitions and Recovery | |
| | | | |
| If the duration of a network partition is greater than the lease | | If the duration of a network partition is greater than the lease | |
| period provided by the server, the server will not have received a | | period provided by the server, the server will not have received a | |
| lease renewal from the client. If this occurs, the server may free | | lease renewal from the client. If this occurs, the server may free | |
| all locks held for the client. As a result, all stateids held by the | | all locks held for the client. As a result, all stateids held by the | |
| client will become invalid or stale. Once the client is able to | | client will become invalid or stale. Once the client is able to | |
| | | | |
| skipping to change at page 81, line 47 | | skipping to change at page 82, line 9 | |
| | | | |
| When a network partition is combined with a server reboot, there are | | When a network partition is combined with a server reboot, there are | |
| edge conditions that place requirements on the server in order to | | edge conditions that place requirements on the server in order to | |
| avoid silent data corruption following the server reboot. Two of | | avoid silent data corruption following the server reboot. Two of | |
| these edge conditions are known, and are discussed below. | | these edge conditions are known, and are discussed below. | |
| | | | |
| The first edge condition has the following scenario: | | The first edge condition has the following scenario: | |
| | | | |
| 1. Client A acquires a lock. | | 1. Client A acquires a lock. | |
| | | | |
|
| 2. Client A and server experience mutual network partition, | | 2. Client A and server experience mutual network partition, such | |
| such that client A is unable to renew its lease. | | that client A is unable to renew its lease. | |
| | | | |
| 3. Client A's lease expires, so server releases lock. | | 3. Client A's lease expires, so server releases lock. | |
| | | | |
|
| 4. Client B acquires a lock that would have conflicted with | | 4. Client B acquires a lock that would have conflicted with that | |
| that of Client A. | | of Client A. | |
| | | | |
| 5. Client B releases the lock. | | 5. Client B releases the lock. | |
| | | | |
| 6. Server reboots. | | 6. Server reboots. | |
| | | | |
| 7. Network partition between client A and server heals. | | 7. Network partition between client A and server heals. | |
| | | | |
| 8. Client A issues a RENEW operation, and gets back a | | 8. Client A issues a RENEW operation, and gets back a | |
| NFS4ERR_STALE_CLIENTID. | | NFS4ERR_STALE_CLIENTID. | |
| | | | |
| 9. Client A reclaims its lock within the server's grace period. | | 9. Client A reclaims its lock within the server's grace period. | |
| | | | |
| Thus, at the final step, the server has erroneously granted client | | Thus, at the final step, the server has erroneously granted client | |
| A's lock reclaim. If client B modified the object the lock was | | A's lock reclaim. If client B modified the object the lock was | |
| protecting, client A will experience object corruption. | | protecting, client A will experience object corruption. | |
| | | | |
| The second known edge condition follows: | | The second known edge condition follows: | |
| | | | |
| 1. Client A acquires a lock. | | 1. Client A acquires a lock. | |
| | | | |
| 2. Server reboots. | | 2. Server reboots. | |
| | | | |
|
| 3. Client A and server experience mutual network partition, | | 3. Client A and server experience mutual network partition, such | |
| such that client A is unable to reclaim its lock within the | | that client A is unable to reclaim its lock within the grace | |
| grace period. | | period. | |
| | | | |
| 4. Server's reclaim grace period ends. Client A has no locks | | 4. Server's reclaim grace period ends. Client A has no locks | |
| recorded on server. | | recorded on server. | |
| | | | |
|
| 5. Client B acquires a lock that would have conflicted with | | 5. Client B acquires a lock that would have conflicted with that | |
| that of Client A. | | of Client A. | |
| | | | |
|
| 6. Client B releases the lock | | 6. Client B releases the lock. | |
| | | | |
|
| 7. Server reboots a second time | | 7. Server reboots a second time. | |
| | | | |
| 8. Network partition between client A and server heals. | | 8. Network partition between client A and server heals. | |
| | | | |
| 9. Client A issues a RENEW operation, and gets back a | | 9. Client A issues a RENEW operation, and gets back a | |
| NFS4ERR_STALE_CLIENTID. | | NFS4ERR_STALE_CLIENTID. | |
| | | | |
| 10. Client A reclaims its lock within the server's grace period. | | 10. Client A reclaims its lock within the server's grace period. | |
| | | | |
| As with the first edge condition, the final step of the scenario of | | As with the first edge condition, the final step of the scenario of | |
| the second edge condition has the server erroneously granting client | | the second edge condition has the server erroneously granting client | |
| A's lock reclaim. | | A's lock reclaim. | |
| | | | |
| Solving the first and second edge conditions requires that the server | | Solving the first and second edge conditions requires that the server | |
| either assume after it reboots that an edge condition has occurred, and | | either assume after it reboots that an edge condition has occurred, and | |
| thus return NFS4ERR_NO_GRACE for all reclaim attempts, or that the server | | thus return NFS4ERR_NO_GRACE for all reclaim attempts, or that the server | |
| record some information in stable storage. The amount of information | | record some information in stable storage. The amount of information | |
| the server records in stable storage is in inverse proportion to how | | the server records in stable storage is in inverse proportion to how | |
| harsh the server wants to be whenever the edge conditions occur. The | | harsh the server wants to be whenever the edge conditions occur. The | |
| server that is completely tolerant of all edge conditions will record | | server that is completely tolerant of all edge conditions will record | |
| in stable storage every lock that is acquired, removing the lock | | in stable storage every lock that is acquired, removing the lock | |
| record from stable storage only when the lock is unlocked by the | | record from stable storage only when the lock is unlocked by the | |
| client and the lock's lockowner advances the sequence number such | | client and the lock's lockowner advances the sequence number such | |
| that the lock release is not the last stateful event for the | | that the lock release is not the last stateful event for the | |
|
| lockowner's sequence. For the two aforementioned edge conditions, the | | lockowner's sequence. For the two aforementioned edge conditions, | |
| harshest a server can be, and still support a grace period for | | the harshest a server can be, and still support a grace period for | |
| reclaims, requires that the server record in stable storage some | | reclaims, requires that the server record in stable storage some | |
| minimal information. For example, a server | | minimal information. For example, a server | |
| implementation could, for each client, save in stable storage a | | implementation could, for each client, save in stable storage a | |
| record containing: | | record containing: | |
| | | | |
| o the client's id string | | o the client's id string | |
| | | | |
|
| o a boolean that indicates if the client's lease expired or if | | o a boolean that indicates if the client's lease expired or if there | |
| there was administrative intervention (see the section, | | was administrative intervention (see the section, Server | |
| Server Revocation of Locks) to revoke a record lock, share | | Revocation of Locks) to revoke a record lock, share reservation, | |
| reservation, or delegation | | or delegation | |
| | | | |
|
| o a timestamp that is updated the first time after a server | | o a timestamp that is updated the first time after a server boot or | |
| boot or reboot the client acquires record locking, share | | reboot the client acquires record locking, share reservation, or | |
| reservation, or delegation state on the server. The | | delegation state on the server. The timestamp need not be updated | |
| timestamp need not be updated on subsequent lock requests | | on subsequent lock requests until the server reboots. | |
| until the server reboots. | | | |
| | | | |
| The server implementation would also record in the stable storage the | | The server implementation would also record in the stable storage the | |
| timestamps from the two most recent server reboots. | | timestamps from the two most recent server reboots. | |
| | | | |
| Assuming the above record keeping, for the first edge condition, | | Assuming the above record keeping, for the first edge condition, | |
| after the server reboots, the record that client A's lease expired | | after the server reboots, the record that client A's lease expired | |
| means that another client could have acquired a conflicting record | | means that another client could have acquired a conflicting record | |
| lock, share reservation, or delegation. Hence the server must reject | | lock, share reservation, or delegation. Hence the server must reject | |
| a reclaim from client A with the error NFS4ERR_NO_GRACE. | | a reclaim from client A with the error NFS4ERR_NO_GRACE. | |
| | | | |
| For the second edge condition, after the server reboots for a second | | For the second edge condition, after the server reboots for a second | |
| time, the record that the client had an unexpired record lock, share | | time, the record that the client had an unexpired record lock, share | |
| reservation, or delegation established before the server's previous | | reservation, or delegation established before the server's previous | |
| incarnation means that the server must reject a reclaim from client A | | incarnation means that the server must reject a reclaim from client A | |
| with the error NFS4ERR_NO_GRACE. | | with the error NFS4ERR_NO_GRACE. | |
| | | | |
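The minimal record keeping described above, and the resulting reclaim decision, can be sketched as follows. This is an illustration under stated assumptions, not a prescribed design: the names ClientRecord, StableStorage, and reclaim_status are hypothetical, timestamps are plain numbers, and rejecting a client with no record at all is an added assumption.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

NFS4_OK = "NFS4_OK"
NFS4ERR_NO_GRACE = "NFS4ERR_NO_GRACE"


@dataclass
class ClientRecord:
    id_string: str             # the client's id string
    expired_or_revoked: bool   # lease expired or state administratively revoked
    first_state_time: float    # first state acquisition since a server boot


@dataclass
class StableStorage:
    clients: Dict[str, ClientRecord]
    boot_times: Tuple[float, float]  # (previous boot, current boot)


def reclaim_status(storage: StableStorage, id_string: str) -> str:
    """Decide whether a reclaim from this client may be honored."""
    record = storage.clients.get(id_string)
    if record is None:
        # Assumption: no record means nothing can be vouched for.
        return NFS4ERR_NO_GRACE
    if record.expired_or_revoked:
        # First edge condition: another client may have held a
        # conflicting lock after the lease expired.
        return NFS4ERR_NO_GRACE
    previous_boot, _current_boot = storage.boot_times
    if record.first_state_time < previous_boot:
        # Second edge condition: the state predates the previous server
        # incarnation and was never re-established during it.
        return NFS4ERR_NO_GRACE
    return NFS4_OK
```

In the second scenario, client A's timestamp is older than the previous boot time because A never reclaimed during that incarnation, so the sketch rejects the reclaim after the second reboot.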
| Regardless of the level and approach to record keeping, the server | | Regardless of the level and approach to record keeping, the server | |
| MUST implement one of the following strategies (which apply to | | MUST implement one of the following strategies (which apply to | |
| reclaims of share reservations, record locks, and delegations): | | reclaims of share reservations, record locks, and delegations): | |
| | | | |
|
| 1. Reject all reclaims with NFS4ERR_NO_GRACE. This is | | 1. Reject all reclaims with NFS4ERR_NO_GRACE. This is superharsh, | |
| superharsh, but necessary if the server does not want to | | but necessary if the server does not want to record lock state | |
| record lock state in stable storage. | | in stable storage. | |
| | | | |
|
| 2. Record sufficient state in stable storage such that all | | 2. Record sufficient state in stable storage such that all known | |
| known edge conditions involving server reboot, including the | | edge conditions involving server reboot, including the two | |
| two noted in this section, are detected. False positives are | | noted in this section, are detected. False positives are | |
| acceptable. Note that at this time, it is not known if there | | acceptable. Note that at this time, it is not known if there | |
| are other edge conditions. | | are other edge conditions. | |
| | | | |
|
| Draft Specification NFS version 4 Protocol November 2002 | | In the event, after a server reboot, the server determines that | |
| | | there is unrecoverable damage or corruption to the stable | |
| In the event, after a server reboot, the server determines | | storage, then for all clients and/or locks affected, the server | |
| that there is unrecoverable damage or corruption to the | | MUST return NFS4ERR_NO_GRACE. | |
| stable storage, then for all clients and/or locks affected, | | | |
| the server MUST return NFS4ERR_NO_GRACE. | | | |
| | | | |
| A mandate for the client's handling of the NFS4ERR_NO_GRACE error is | | A mandate for the client's handling of the NFS4ERR_NO_GRACE error is | |
| outside the scope of this specification, since the strategies for | | outside the scope of this specification, since the strategies for | |
| such handling are very dependent on the client's operating | | such handling are very dependent on the client's operating | |
| environment. However, one potential approach is described below. | | environment. However, one potential approach is described below. | |
| | | | |
| When the client receives NFS4ERR_NO_GRACE, it could examine the | | When the client receives NFS4ERR_NO_GRACE, it could examine the | |
| change attribute of the objects the client is trying to reclaim state | | change attribute of the objects the client is trying to reclaim state | |
| for, and use that to determine whether to re-establish the state via | | for, and use that to determine whether to re-establish the state via | |
| normal OPEN or LOCK requests. This is acceptable provided the | | normal OPEN or LOCK requests. This is acceptable provided the | |
| | | | |
| skipping to change at page 84, line 37 | | skipping to change at page 85, line 9 | |
| client should do for dealing with unreclaimed delegations on client | | client should do for dealing with unreclaimed delegations on client | |
| state. | | state. | |
| | | | |
| For further discussion of revocation of locks see the section "Server | | For further discussion of revocation of locks see the section "Server | |
| Revocation of Locks". | | Revocation of Locks". | |
| | | | |
| 8.7. Recovery from a Lock Request Timeout or Abort | | 8.7. Recovery from a Lock Request Timeout or Abort | |
| | | | |
| In the event a lock request times out, a client may decide to not | | In the event a lock request times out, a client may decide to not | |
| retry the request. The client may also abort the request when the | | retry the request. The client may also abort the request when the | |
|
| process for which it was issued is terminated (e.g. in UNIX due to a | | process for which it was issued is terminated (e.g., in UNIX due to a | |
| signal). It is possible though that the server received the request | | signal). It is possible though that the server received the request | |
| and acted upon it. This would change the state on the server without | | and acted upon it. This would change the state on the server without | |
| the client being aware of the change. It is paramount that the | | the client being aware of the change. It is paramount that the | |
| client re-synchronize state with the server before it attempts any other | | client re-synchronize state with the server before it attempts any other | |
| operation that takes a seqid and/or a stateid with the same | | operation that takes a seqid and/or a stateid with the same | |
| lock_owner. This is straightforward to do without a special re- | | lock_owner. This is straightforward to do without a special re- | |
| synchronize operation. | | synchronize operation. | |
| | | | |
| Since the server maintains the last lock request and response | | Since the server maintains the last lock request and response | |
| received on the lock_owner, for each lock_owner, the client should | | received on the lock_owner, for each lock_owner, the client should | |
| cache the last lock request it sent such that the lock request did | | cache the last lock request it sent such that the lock request did | |
| not receive a response. From this, the next time the client does a | | not receive a response. From this, the next time the client does a | |
| lock operation for the lock_owner, it can send the cached request, if | | lock operation for the lock_owner, it can send the cached request, if | |
|
| there is one, and if the request was one that established state (e.g. | | there is one, and if the request was one that established state | |
| a LOCK or OPEN operation), the server will return the cached result | | (e.g., a LOCK or OPEN operation), the server will return the cached | |
| or if it never saw the request, perform it. The client can follow up | | result or if it never saw the request, perform it. The client can | |
| with a request to remove the state (e.g. a LOCKU or CLOSE operation). | | follow up with a request to remove the state (e.g., a LOCKU or CLOSE | |
| With this approach, the sequencing and stateid information on the | | operation). With this approach, the sequencing and stateid | |
| client and server for the given lock_owner will re-synchronize and in | | information on the client and server for the given lock_owner will | |
| turn the lock state will re-synchronize. | | re-synchronize and in turn the lock state will re-synchronize. | |
| | | | |
| Draft Specification NFS version 4 Protocol November 2002 | | | |
| | | | |
| 8.8. Server Revocation of Locks | | 8.8. Server Revocation of Locks | |
| | | | |
| At any point, the server can revoke locks held by a client and the | | At any point, the server can revoke locks held by a client and the | |
| client must be prepared for this event. When the client detects that | | client must be prepared for this event. When the client detects that | |
| its locks have been or may have been revoked, the client is | | its locks have been or may have been revoked, the client is | |
| responsible for validating the state information between itself and | | responsible for validating the state information between itself and | |
| the server. Validating locking state for the client means that it | | the server. Validating locking state for the client means that it | |
| must verify or reclaim state for each lock currently held. | | must verify or reclaim state for each lock currently held. | |
| | | | |
| | | | |
| skipping to change at page 86, line 5 | | skipping to change at page 86, line 35 | |
| which the server may grant conflicting locks after the lease period | | which the server may grant conflicting locks after the lease period | |
| has expired for a client. When it is possible that the lease period | | has expired for a client. When it is possible that the lease period | |
| has expired, the client must validate each lock currently held to | | has expired, the client must validate each lock currently held to | |
| ensure that a conflicting lock has not been granted. The client may | | ensure that a conflicting lock has not been granted. The client may | |
| accomplish this task by issuing an I/O request, either a pending I/O | | accomplish this task by issuing an I/O request, either a pending I/O | |
| or a zero-length read, specifying the stateid associated with the | | or a zero-length read, specifying the stateid associated with the | |
| lock in question. If the response to the request is success, the | | lock in question. If the response to the request is success, the | |
| client has validated all of the locks governed by that stateid and | | client has validated all of the locks governed by that stateid and | |
| re-established the appropriate state between itself and the server. | | re-established the appropriate state between itself and the server. | |
| | | | |
|
| Draft Specification NFS version 4 Protocol November 2002 | | | |
| | | | |
| If the I/O request is not successful, then one or more of the locks | | If the I/O request is not successful, then one or more of the locks | |
| associated with the stateid was revoked by the server and the client | | associated with the stateid was revoked by the server and the client | |
| must notify the owner. | | must notify the owner. | |
| | | | |
| 8.9. Share Reservations | | 8.9. Share Reservations | |
| | | | |
| A share reservation is a mechanism to control access to a file. It | | A share reservation is a mechanism to control access to a file. It | |
| is a separate and independent mechanism from record locking. When a | | is a separate and independent mechanism from record locking. When a | |
| client opens a file, it issues an OPEN operation to the server | | client opens a file, it issues an OPEN operation to the server | |
| specifying the type of access required (READ, WRITE, or BOTH) and the | | specifying the type of access required (READ, WRITE, or BOTH) and the | |
| | | | |
| skipping to change at page 87, line 5 | | skipping to change at page 87, line 35 | |
| | | | |
| To provide correct share semantics, a client MUST use the OPEN | | To provide correct share semantics, a client MUST use the OPEN | |
| operation to obtain the initial filehandle and indicate the desired | | operation to obtain the initial filehandle and indicate the desired | |
| access and what if any access to deny. Even if the client intends to | | access and what if any access to deny. Even if the client intends to | |
| use a stateid of all 0's or all 1's, it must still obtain the | | use a stateid of all 0's or all 1's, it must still obtain the | |
| filehandle for the regular file with the OPEN operation so the | | filehandle for the regular file with the OPEN operation so the | |
| appropriate share semantics can be applied. For clients that do not | | appropriate share semantics can be applied. For clients that do not | |
| have a deny mode built into their open programming interfaces, deny | | have a deny mode built into their open programming interfaces, deny | |
| equal to NONE should be used. | | equal to NONE should be used. | |
| | | | |
|
| Draft Specification NFS version 4 Protocol November 2002 | | | |
| | | | |
| The OPEN operation with the CREATE flag also subsumes the CREATE | | The OPEN operation with the CREATE flag also subsumes the CREATE | |
| operation for regular files as used in previous versions of the NFS | | operation for regular files as used in previous versions of the NFS | |
| protocol. This allows a create with a share to be done atomically. | | protocol. This allows a create with a share to be done atomically. | |
| | | | |
| The CLOSE operation removes all share reservations held by the | | The CLOSE operation removes all share reservations held by the | |
| lock_owner on that file. If record locks are held, the client SHOULD | | lock_owner on that file. If record locks are held, the client SHOULD | |
| release all locks before issuing a CLOSE. The server MAY free all | | release all locks before issuing a CLOSE. The server MAY free all | |
| outstanding locks on CLOSE but some servers may not support the CLOSE | | outstanding locks on CLOSE but some servers may not support the CLOSE | |
| of a file that still has record locks held. The server MUST return | | of a file that still has record locks held. The server MUST return | |
| failure, NFS4ERR_LOCKS_HELD, if any locks would exist after the | | failure, NFS4ERR_LOCKS_HELD, if any locks would exist after the | |
| CLOSE. | | CLOSE. | |
| | | | |
| The LOOKUP operation will return a filehandle without establishing | | The LOOKUP operation will return a filehandle without establishing | |
| any lock state on the server. Without a valid stateid, the server | | any lock state on the server. Without a valid stateid, the server | |
| will assume the client has the least access. For example, a file | | will assume the client has the least access. For example, a file | |
| opened with deny READ/WRITE cannot be accessed using a filehandle | | opened with deny READ/WRITE cannot be accessed using a filehandle | |
| obtained through LOOKUP because it would not have a valid stateid | | obtained through LOOKUP because it would not have a valid stateid | |
|
| (i.e. using a stateid of all bits 0 or all bits 1). | | (i.e., using a stateid of all bits 0 or all bits 1). | |
| | | | |
| 8.10.1. Close and Retention of State Information | | 8.10.1. Close and Retention of State Information | |
| | | | |
| Since a CLOSE operation requests deallocation of a stateid, dealing | | Since a CLOSE operation requests deallocation of a stateid, dealing | |
| with retransmission of the CLOSE may pose special difficulties, | | with retransmission of the CLOSE may pose special difficulties, | |
| since the state information, which normally would be used to | | since the state information, which normally would be used to | |
| determine the state of the open file being designated, might be | | determine the state of the open file being designated, might be | |
| deallocated, resulting in an NFS4ERR_BAD_STATEID error. | | deallocated, resulting in an NFS4ERR_BAD_STATEID error. | |
| | | | |
| Servers may deal with this problem in a number of ways. To provide | | Servers may deal with this problem in a number of ways. To provide | |
| the greatest degree of assurance that the protocol is being used | | the greatest degree of assurance that the protocol is being used | |
| properly, a server should, rather than deallocate the stateid, mark | | properly, a server should, rather than deallocate the stateid, mark | |
| it as close-pending, and retain the stateid with this status, until | | it as close-pending, and retain the stateid with this status, until | |
| later deallocation. In this way, a retransmitted CLOSE can be | | later deallocation. In this way, a retransmitted CLOSE can be | |
| recognized since the stateid points to state information with this | | recognized since the stateid points to state information with this | |
| distinctive status, so that it can be handled without error. | | distinctive status, so that it can be handled without error. | |
| | | | |
| When adopting this strategy, a server should retain the state | | When adopting this strategy, a server should retain the state | |
| information until the earliest of: | | information until the earliest of: | |
| | | | |
|
| o Another validly sequenced request for the same lockowner, that | | o Another validly sequenced request for the same lockowner, that is | |
| is not a retransmission. | | not a retransmission. | |
| | | | |
| o The time that a lockowner is freed by the server due to a period | | o The time that a lockowner is freed by the server due to a period | |
| with no activity. | | with no activity. | |
| | | | |
| o All locks for the client are freed as a result of a SETCLIENTID. | | o All locks for the client are freed as a result of a SETCLIENTID. | |
| | | | |
| Servers may avoid this complexity, at the cost of less complete | | Servers may avoid this complexity, at the cost of less complete | |
| protocol error checking, by simply responding NFS4_OK in the event of | | protocol error checking, by simply responding NFS4_OK in the event of | |
| a CLOSE for a deallocated stateid, on the assumption that this case | | a CLOSE for a deallocated stateid, on the assumption that this case | |
| must be caused by a retransmitted close. When adopting this | | must be caused by a retransmitted close. When adopting this | |
|
| | | | |
| Draft Specification NFS version 4 Protocol November 2002 | | | |
| | | | |
| approach, it is desirable to at least log an error when returning a | | approach, it is desirable to at least log an error when returning a | |
| no-error indication in this situation. If the server maintains a | | no-error indication in this situation. If the server maintains a | |
| reply-cache mechanism, it can verify the CLOSE is indeed a | | reply-cache mechanism, it can verify the CLOSE is indeed a | |
| retransmission and avoid error logging in most cases. | | retransmission and avoid error logging in most cases. | |
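The close-pending strategy described above can be sketched as a small state table. This is only an illustration under stated assumptions: the class `OpenStateTable`, its method names, and the use of strings for stateids and status codes are hypothetical, and seqid checking is omitted entirely.

```python
class OpenStateTable:
    """Hypothetical server-side table mapping a stateid to its status."""

    def __init__(self):
        self.states = {}  # stateid -> "open" or "close-pending"

    def open(self, stateid):
        self.states[stateid] = "open"

    def close(self, stateid):
        # A stateid that was never seen, or was already released,
        # cannot be matched to any state information.
        if stateid not in self.states:
            return "NFS4ERR_BAD_STATEID"
        # Mark rather than deallocate, so a retransmitted CLOSE still
        # finds the state and can be answered without error.
        self.states[stateid] = "close-pending"
        return "NFS4_OK"

    def release(self, stateid):
        # Called at the earliest of the retention conditions listed above
        # (assumed to be detected elsewhere).
        self.states.pop(stateid, None)
```

A first CLOSE and its retransmission both return NFS4_OK; only after `release` does the same stateid produce NFS4ERR_BAD_STATEID.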
| | | | |
| 8.11. Open Upgrade and Downgrade | | 8.11. Open Upgrade and Downgrade | |
| | | | |
| When an OPEN is done for a file and the lockowner for which the open | | When an OPEN is done for a file and the lockowner for which the open | |
| is being done already has the file open, the result is to upgrade the | | is being done already has the file open, the result is to upgrade the | |
| open file status maintained on the server to include the access and | | open file status maintained on the server to include the access and | |
| | | | |
| skipping to change at page 88, line 37 | | skipping to change at page 89, line 21 | |
| to the same file object and returns different filehandles on two | | to the same file object and returns different filehandles on two | |
| different OPENs of the same file object, the server MUST NOT "OR" | | different OPENs of the same file object, the server MUST NOT "OR" | |
| together the access and deny bits and coalesce the two open files. | | together the access and deny bits and coalesce the two open files. | |
| Instead the server must maintain separate OPENs with separate | | Instead the server must maintain separate OPENs with separate | |
| stateids and will require separate CLOSEs to free them. | | stateids and will require separate CLOSEs to free them. | |
| | | | |
| When multiple open files on the client are merged into a single open | | When multiple open files on the client are merged into a single open | |
| file object on the server, the close of one of the open files (on the | | file object on the server, the close of one of the open files (on the | |
| client) may necessitate change of the access and deny status of the | | client) may necessitate change of the access and deny status of the | |
| open file on the server. This is because the union of the access and | | open file on the server. This is because the union of the access and | |
|
| deny bits for the remaining opens may be smaller (i.e. a proper | | deny bits for the remaining opens may be smaller (i.e., a proper | |
| subset) than previously. The OPEN_DOWNGRADE operation is used to | | subset) than previously. The OPEN_DOWNGRADE operation is used to | |
| make the necessary change and the client should use it to update the | | make the necessary change and the client should use it to update the | |
| server so that share reservation requests by other clients are | | server so that share reservation requests by other clients are | |
| handled properly. | | handled properly. | |
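As an illustration of why a client-side close may require OPEN_DOWNGRADE, the following sketch computes the union of access and deny bits before and after one of several merged opens is closed. The function names and the tuple representation are assumptions made for this example; the bit values are illustrative constants.

```python
# Illustrative bit values for share access and deny modes.
ACCESS_READ, ACCESS_WRITE = 0x1, 0x2
DENY_NONE, DENY_READ, DENY_WRITE = 0x0, 0x1, 0x2


def union_bits(opens):
    """OR together the (access, deny) pairs of a list of client opens."""
    access, deny = 0, 0
    for a, d in opens:
        access |= a
        deny |= d
    return access, deny


def downgrade_target(opens, closed_index):
    """Return the (access, deny) pair to send in a downgrade, or None.

    None means either that the union is unchanged or that no opens
    remain (in which case a CLOSE, not a downgrade, would be used).
    """
    remaining = opens[:closed_index] + opens[closed_index + 1:]
    if not remaining:
        return None
    before = union_bits(opens)
    after = union_bits(remaining)
    return after if after != before else None
```

For example, with one open for READ/deny NONE and another for READ+WRITE/deny WRITE, closing the second leaves a smaller union (READ, deny NONE), so a downgrade to that pair is indicated; closing the first leaves the union unchanged.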
| | | | |
| 8.12. Short and Long Leases | | 8.12. Short and Long Leases | |
| | | | |
| When determining the time period for the server lease, the usual | | When determining the time period for the server lease, the usual | |
| lease tradeoffs apply. Short leases are good for fast server | | lease tradeoffs apply. Short leases are good for fast server | |
| recovery at a cost of increased RENEW or READ (with zero length) | | recovery at a cost of increased RENEW or READ (with zero length) | |
| requests. Longer leases are certainly kinder and gentler to servers | | requests. Longer leases are certainly kinder and gentler to servers | |
| trying to handle very large numbers of clients. The number of RENEW | | trying to handle very large numbers of clients. The number of RENEW | |
| requests drops in proportion to the lease time. The disadvantages of | | requests drops in proportion to the lease time. The disadvantages of | |
| long leases are slower recovery after server failure (the server must | | long leases are slower recovery after server failure (the server must | |
| wait for the leases to expire and the grace period to elapse before | | wait for the leases to expire and the grace period to elapse before | |
| granting new lock requests) and increased file contention (if a client | | granting new lock requests) and increased file contention (if a client | |
| fails to transmit an unlock request, then the server must wait for lease | | fails to transmit an unlock request, then the server must wait for lease | |
| expiration before granting new locks). | | expiration before granting new locks). | |
| | | | |
|
| Draft Specification NFS version 4 Protocol November 2002 | | | |
| | | | |
| Long leases are usable if the server is able to store lease state in | | Long leases are usable if the server is able to store lease state in | |
| non-volatile memory. Upon recovery, the server can reconstruct the | | non-volatile memory. Upon recovery, the server can reconstruct the | |
| lease state from its non-volatile memory and continue operation with | | lease state from its non-volatile memory and continue operation with | |
| its clients and therefore long leases would not be an issue. | | its clients and therefore long leases would not be an issue. | |
| | | | |
| 8.13. Clocks, Propagation Delay, and Calculating Lease Expiration | | 8.13. Clocks, Propagation Delay, and Calculating Lease Expiration | |
| | | | |
| To avoid the need for synchronized clocks, lease times are granted by | | To avoid the need for synchronized clocks, lease times are granted by | |
| the server as a time delta. However, there is a requirement that the | | the server as a time delta. However, there is a requirement that the | |
| client and server clocks do not drift excessively over the duration | | client and server clocks do not drift excessively over the duration | |
| of the lock. There is also the issue of propagation delay across the | | of the lock. There is also the issue of propagation delay across the | |
| network which could easily be several hundred milliseconds as well as | | network which could easily be several hundred milliseconds as well as | |
| the possibility that requests will be lost and need to be | | the possibility that requests will be lost and need to be | |
| retransmitted. | | retransmitted. | |
| | | | |
| To take propagation delay into account, the client should subtract it | | To take propagation delay into account, the client should subtract it | |
|
| from lease times (e.g. if the client estimates the one-way | | from lease times (e.g., if the client estimates the one-way | |
| propagation delay as 200 msec, then it can assume that the lease is | | propagation delay as 200 msec, then it can assume that the lease is | |
| already 200 msec old when it gets it). In addition, it will take | | already 200 msec old when it gets it). In addition, it will take | |
| another 200 msec to get a response back to the server. So the client | | another 200 msec to get a response back to the server. So the client | |
| must send a lock renewal or write data back to the server 400 msec | | must send a lock renewal or write data back to the server 400 msec | |
| before the lease would expire. | | before the lease would expire. | |
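The arithmetic in the preceding paragraph can be written out as a short sketch. The function name and parameters are hypothetical, all times are in seconds on the client's local clock, and the 90-second lease in the usage lines is an arbitrary example value rather than one taken from the specification.

```python
def renewal_deadline(lease_received_at, lease_time, one_way_delay):
    """Latest local time at which the client should send a renewal.

    The lease is assumed to be already one_way_delay old when the
    client receives it, and the renewal needs another one_way_delay
    to reach the server, so two delays are subtracted in total.
    """
    return lease_received_at + lease_time - 2 * one_way_delay


# A lease received at local time 0 with a 200 msec one-way delay:
# the renewal must leave the client 400 msec before nominal expiry.
deadline = renewal_deadline(0.0, 90.0, 0.2)
margin = 90.0 - deadline
```

This does not account for clock drift or for retransmission of lost requests, both of which argue for renewing earlier still.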
| | | | |
| The server's lease period configuration should take into account the | | The server's lease period configuration should take into account the | |
| network distance of the clients that will be accessing the server's | | network distance of the clients that will be accessing the server's | |
| resources. It is expected that the lease period will take into | | resources. It is expected that the lease period will take into | |
| account the network propagation delays and other network delay | | account the network propagation delays and other network delay | |
| factors for the client population. Since the protocol does not allow | | factors for the client population. Since the protocol does not allow | |
| for an automatic method to determine an appropriate lease period, the | | for an automatic method to determine an appropriate lease period, the | |
| server's administrator may have to tune the lease period. | | server's administrator may have to tune the lease period. | |
| | | | |
| 8.14. Migration, Replication and State | | 8.14. Migration, Replication and State | |
| | | | |
| When responsibility for handling a given file system is transferred | | When responsibility for handling a given file system is transferred | |
| to a new server (migration) or the client chooses to use an alternate | | to a new server (migration) or the client chooses to use an alternate | |
|
| server (e.g. in response to server unresponsiveness) in the context | | server (e.g., in response to server unresponsiveness) in the context | |
| of file system replication, the appropriate handling of state shared | | of file system replication, the appropriate handling of state shared | |
|
| between the client and server (i.e. locks, leases, stateids, and | | between the client and server (i.e., locks, leases, stateids, and | |
| clientids) is as described below. The handling differs between | | clientids) is as described below. The handling differs between | |
| migration and replication. For related discussion of file server | | migration and replication. For related discussion of file server | |
| state and recovery of such see the sections under "File Locking and | | state and recovery of such see the sections under "File Locking and | |
|
| Share Reservations" | | Share Reservations". | |
| | | | |
| If a server replica or a server immigrating a filesystem agrees to, or | | If a server replica or a server immigrating a filesystem agrees to, or | |
| is expected to, accept opaque values from the client that originated | | is expected to, accept opaque values from the client that originated | |
| from another server, then it is a wise implementation practice for | | from another server, then it is a wise implementation practice for | |
|
| the servers to encode the "opaque" values in network byte order. This | | the servers to encode the "opaque" values in network byte order. | |
| way, servers acting as replicas or immigrating filesystems will be | | This way, servers acting as replicas or immigrating filesystems will | |
| able to parse values like stateids, directory cookies, filehandles, | | be able to parse values like stateids, directory cookies, | |
| etc. even if their native byte order is different from other servers | | filehandles, etc. even if their native byte order is different from | |
| | | other servers cooperating in the replication and migration of the | |
| Draft Specification NFS version 4 Protocol November 2002 | | filesystem. | |
| | | | |
| cooperating in the replication and migration of the filesystem. | | | |
| | | | |
| 8.14.1. Migration and State | | 8.14.1. Migration and State | |
| | | | |
| In the case of migration, the servers involved in the migration of a | | In the case of migration, the servers involved in the migration of a | |
| filesystem SHOULD transfer all server state from the original to the | | filesystem SHOULD transfer all server state from the original to the | |
| new server. This must be done in a way that is transparent to the | | new server. This must be done in a way that is transparent to the | |
| client. This state transfer will ease the client's transition when a | | client. This state transfer will ease the client's transition when a | |
| filesystem migration occurs. If the servers are successful in | | filesystem migration occurs. If the servers are successful in | |
| transferring all state, the client will continue to use stateids | | transferring all state, the client will continue to use stateids | |
| assigned by the original server. Therefore the new server must | | assigned by the original server. Therefore the new server must | |
| | | | |
| skipping to change at page 91, line 4 | | skipping to change at page 91, line 46 | |
| server control, the handling of state is different. In this case, | | server control, the handling of state is different. In this case, | |
| leases, stateids and clientids do not have validity across a | | leases, stateids and clientids do not have validity across a | |
| transition from one server to another. The client must re-establish | | transition from one server to another. The client must re-establish | |
| its locks on the new server. This can be compared to the re- | | its locks on the new server. This can be compared to the re- | |
| establishment of locks by means of reclaim-type requests after a | | establishment of locks by means of reclaim-type requests after a | |
| server reboot. The difference is that the server has no provision to | | server reboot. The difference is that the server has no provision to | |
| distinguish requests reclaiming locks from those obtaining new locks | | distinguish requests reclaiming locks from those obtaining new locks | |
| or to defer the latter. Thus, a client re-establishing a lock on the | | or to defer the latter. Thus, a client re-establishing a lock on the | |
| new server (by means of a LOCK or OPEN request), may have the | | new server (by means of a LOCK or OPEN request), may have the | |
| requests denied due to a conflicting lock. Since replication is | | requests denied due to a conflicting lock. Since replication is | |
|
| | | | |
| Draft Specification NFS version 4 Protocol November 2002 | | | |
| | | | |
| intended for read-only use of filesystems, such denial of locks | | intended for read-only use of filesystems, such denial of locks | |
| should not pose large difficulties in practice. When an attempt to | | should not pose large difficulties in practice. When an attempt to | |
| re-establish a lock on a new server is denied, the client should | | re-establish a lock on a new server is denied, the client should | |
| treat the situation as if its original lock had been revoked. | | treat the situation as if its original lock had been revoked. | |
| | | | |
| 8.14.3. Notification of Migrated Lease | | 8.14.3. Notification of Migrated Lease | |
| | | | |
| In the case of lease renewal, the client may not be submitting | | In the case of lease renewal, the client may not be submitting | |
| requests for a filesystem that has been migrated to another server. | | requests for a filesystem that has been migrated to another server. | |
| This can occur because of the implicit lease renewal mechanism. The | | This can occur because of the implicit lease renewal mechanism. The | |
| client renews leases for all filesystems when submitting a request to | | client renews leases for all filesystems when submitting a request to | |
| any one filesystem at the server. | | any one filesystem at the server. | |
| | | | |
| In order for the client to schedule renewal of leases that may have | | In order for the client to schedule renewal of leases that may have | |
| been relocated to the new server, the client must find out about | | been relocated to the new server, the client must find out about | |
| lease relocation before those leases expire. To accomplish this, all | | lease relocation before those leases expire. To accomplish this, all | |
|
| operations which implicitly renew leases for a client (i.e. OPEN, | | operations which implicitly renew leases for a client (i.e., OPEN, | |
| CLOSE, READ, WRITE, RENEW, LOCK, LOCKT, LOCKU), will return the error | | CLOSE, READ, WRITE, RENEW, LOCK, LOCKT, LOCKU), will return the error | |
| NFS4ERR_LEASE_MOVED if responsibility for any of the leases to be | | NFS4ERR_LEASE_MOVED if responsibility for any of the leases to be | |
| renewed has been transferred to a new server. This condition will | | renewed has been transferred to a new server. This condition will | |
| continue until the client receives an NFS4ERR_MOVED error and the | | continue until the client receives an NFS4ERR_MOVED error and the | |
| server receives the subsequent GETATTR(fs_locations) for an access to | | server receives the subsequent GETATTR(fs_locations) for an access to | |
| each filesystem for which a lease has been moved to a new server. | | each filesystem for which a lease has been moved to a new server. | |
| | | | |
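A minimal sketch of the server-side condition described above, assuming for simplicity that the client held leases on every migrated filesystem. The class, method, and field names are hypothetical and are not defined by the protocol; only the error names come from the text.

```python
# Illustrative only: SourceServer and its methods are assumed names.
NFS4_OK = "NFS4_OK"
NFS4ERR_MOVED = "NFS4ERR_MOVED"
NFS4ERR_LEASE_MOVED = "NFS4ERR_LEASE_MOVED"


class SourceServer:
    def __init__(self, local_fs, migrated_fs):
        self.local_fs = set(local_fs)        # filesystems still served here
        self.migrated_fs = set(migrated_fs)  # filesystems moved elsewhere
        self.saw_moved = {}      # client -> migrated fs that returned MOVED
        self.acknowledged = {}   # client -> migrated fs with fs_locations fetched

    def _pending(self, client):
        # Migrated filesystems the client has not yet fully learned about.
        return self.migrated_fs - self.acknowledged.get(client, set())

    def renewing_op(self, client, fsid):
        """Result of a lease-renewing operation (OPEN, READ, RENEW, ...)."""
        if fsid in self.migrated_fs:
            self.saw_moved.setdefault(client, set()).add(fsid)
            return NFS4ERR_MOVED
        if self._pending(client):
            return NFS4ERR_LEASE_MOVED
        return NFS4_OK

    def getattr_fs_locations(self, client, fsid):
        """GETATTR(fs_locations) counts only after NFS4ERR_MOVED was returned."""
        if fsid in self.saw_moved.get(client, set()):
            self.acknowledged.setdefault(client, set()).add(fsid)
```

In this sketch the NFS4ERR_LEASE_MOVED condition clears per client, and only once both steps (the NFS4ERR_MOVED reply and the subsequent GETATTR) have occurred for each migrated filesystem.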
| When a client receives an NFS4ERR_LEASE_MOVED error, it should | | When a client receives an NFS4ERR_LEASE_MOVED error, it should | |
| perform an operation on each filesystem associated with the server in | | perform an operation on each filesystem associated with the server in | |
| question. When the client receives an NFS4ERR_MOVED error, the | | question. When the client receives an NFS4ERR_MOVED error, the | |
| | | | |
| skipping to change at page 92, line 5 | | skipping to change at page 92, line 50 | |
| | | | |
| When state is transferred transparently, that state should include | | When state is transferred transparently, that state should include | |
| the correct value of the lease_time attribute. The lease_time | | the correct value of the lease_time attribute. The lease_time | |
| attribute on the destination server must never be less than that on | | attribute on the destination server must never be less than that on | |
| the source since this would result in premature expiration of leases | | the source since this would result in premature expiration of leases | |
| granted by the source server. Upon migration in which state is | | granted by the source server. Upon migration in which state is | |
| transferred transparently, the client is under no obligation to re- | | transferred transparently, the client is under no obligation to re- | |
| fetch the lease_time attribute and may continue to use the value | | fetch the lease_time attribute and may continue to use the value | |
| previously fetched (on the source server). | | previously fetched (on the source server). | |
| | | | |
|
| Draft Specification NFS version 4 Protocol November 2002 | | If state has not been transferred transparently (i.e., the client | |
| | | sees a real or simulated server reboot), the client should fetch the | |
| If state has not been transferred transparently (i.e. the client sees | | value of lease_time on the new (i.e., destination) server, and use it | |
| a real or simulated server reboot), the client should fetch the value | | for subsequent locking requests. However, the server must respect a | |
| of lease_time on the new (i.e. destination) server, and use it for | | grace period at least as long as the lease_time on the source server, | |
| subsequent locking requests. However, the server must respect a grace | | in order to ensure that clients have ample time to reclaim their | |
| period at least as long as the lease_time on the source server, in | | locks before potentially conflicting non-reclaimed locks are granted. | |
| order to ensure that clients have ample time to reclaim their locks | | The means by which the new server obtains the value of lease_time on | |
| before potentially conflicting non-reclaimed locks are granted. The | | the old server is left to the server implementations. It is not | |
| means by which the new server obtains the value of lease_time on the | | | |
| old server is left to the server implementations. It is not | | | |
| specified by the NFS version 4 protocol. | | specified by the NFS version 4 protocol. | |
| | | | |
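The lease_time rules above can be summarized in a short sketch. The function names and the use of plain integer seconds are assumptions made for illustration; how a destination server learns the source's lease_time is, as the text says, left to implementations.

```python
# Illustrative only: these helpers are assumed names, not protocol elements.

def transparent_transfer_allowed(source_lease_time, dest_lease_time):
    """The destination's lease_time must never be less than the source's."""
    return dest_lease_time >= source_lease_time


def client_lease_time(cached_lease_time, dest_lease_time, transparent):
    """Lease time the client uses for locking requests after migration."""
    if transparent:
        # No obligation to re-fetch; the value from the source stays usable.
        return cached_lease_time
    # Real or simulated reboot: fetch and use the destination's value.
    return dest_lease_time


def grace_period_ok(grace_period, source_lease_time):
    """The grace period must be at least the source server's lease_time."""
    return grace_period >= source_lease_time
```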
| | | | |
| 9. Client-Side Caching | | 9. Client-Side Caching | |
| | | | |
| Client-side caching of data, of file attributes, and of file names is | | Client-side caching of data, of file attributes, and of file names is | |
| essential to providing good performance with the NFS protocol. | | essential to providing good performance with the NFS protocol. | |
| Providing distributed cache coherence is a difficult problem and | | Providing distributed cache coherence is a difficult problem and | |
| previous versions of the NFS protocol have not attempted it. | | previous versions of the NFS protocol have not attempted it. | |
| Instead, several NFS client implementation techniques have been used | | Instead, several NFS client implementation techniques have been used | |
| to reduce the problems that a lack of coherence poses for users. | | to reduce the problems that a lack of coherence poses for users. | |
| These techniques have not been clearly defined by earlier protocol | | These techniques have not been clearly defined by earlier protocol | |
| specifications and it is often unclear what is valid or invalid | | specifications and it is often unclear what is valid or invalid | |
| | | | |
| skipping to change at page 94, line 4 | | skipping to change at page 94, line 16 | |
| conflicts exist is expensive. A better option with regard to | | conflicts exist is expensive. A better option with regard to | |
| performance is to allow a client that repeatedly opens a file to do | | performance is to allow a client that repeatedly opens a file to do | |
| so without reference to the server. This is done until potentially | | so without reference to the server. This is done until potentially | |
| conflicting operations from another client actually occur. | | conflicting operations from another client actually occur. | |
| | | | |
| A similar situation arises in connection with file locking. Sending | | A similar situation arises in connection with file locking. Sending | |
| file lock and unlock requests to the server as well as the read and | | file lock and unlock requests to the server as well as the read and | |
| write requests necessary to make data caching consistent with the | | write requests necessary to make data caching consistent with the | |
| locking semantics (see the section "Data Caching and File Locking") | | locking semantics (see the section "Data Caching and File Locking") | |
| can severely limit performance. When locking is used to provide | | can severely limit performance. When locking is used to provide | |
| protection against infrequent conflicts, a large penalty is incurred. | | protection against infrequent conflicts, a large penalty is incurred. | |
| This penalty may discourage the use of file locking by applications. | | This penalty may discourage the use of file locking by applications. | |
| | | | |
| The NFS version 4 protocol provides more aggressive caching | | The NFS version 4 protocol provides more aggressive caching | |
| strategies with the following design goals: | | strategies with the following design goals: | |
| | | | |
| o Compatibility with a large range of server semantics. | | o Compatibility with a large range of server semantics. | |
| | | | |
|
| o Provide the same caching benefits as previous versions of the | | o Provide the same caching benefits as previous versions of the NFS | |
| NFS protocol when unable to provide the more aggressive model. | | protocol when unable to provide the more aggressive model. | |
| | | | |
|
| o Requirements for aggressive caching are organized so that a | | o Requirements for aggressive caching are organized so that a large | |
| large portion of the benefit can be obtained even when not all | | portion of the benefit can be obtained even when not all of the | |
| of the requirements can be met. | | requirements can be met. | |
| | | | |
| The appropriate requirements for the server are discussed in later | | The appropriate requirements for the server are discussed in later | |
| sections in which specific forms of caching are covered (see the | | sections in which specific forms of caching are covered (see the | |
| section "Open Delegation"). | | section "Open Delegation"). | |
| | | | |
| 9.2. Delegation and Callbacks | | 9.2. Delegation and Callbacks | |
| | | | |
| Recallable delegation of server responsibilities for a file to a | | Recallable delegation of server responsibilities for a file to a | |
| client improves performance by avoiding repeated requests to the | | client improves performance by avoiding repeated requests to the | |
| server in the absence of inter-client conflict. With the use of a | | server in the absence of inter-client conflict. With the use of a | |
| | | | |
| skipping to change at page 95, line 4 | | skipping to change at page 95, line 19 | |
| firewalls, for example), correct protocol operation does not depend | | firewalls, for example), correct protocol operation does not depend | |
| on them. Preliminary testing of callback functionality by means of a | | on them. Preliminary testing of callback functionality by means of a | |
| CB_NULL procedure determines whether callbacks can be supported. The | | CB_NULL procedure determines whether callbacks can be supported. The | |
| CB_NULL procedure checks the continuity of the callback path. A | | CB_NULL procedure checks the continuity of the callback path. A | |
| server makes a preliminary assessment of callback availability to a | | server makes a preliminary assessment of callback availability to a | |
| given client and avoids delegating responsibilities until it has | | given client and avoids delegating responsibilities until it has | |
| determined that callbacks are supported. Because the granting of a | | determined that callbacks are supported. Because the granting of a | |
| delegation is always conditional upon the absence of conflicting | | delegation is always conditional upon the absence of conflicting | |
| access, clients must not assume that a delegation will be granted and | | access, clients must not assume that a delegation will be granted and | |
| they must always be prepared for OPENs to be processed without any | | they must always be prepared for OPENs to be processed without any | |
| delegations being granted. | | delegations being granted. | |
| | | | |
| Once granted, a delegation behaves in most ways like a lock. There | | Once granted, a delegation behaves in most ways like a lock. There | |
| is an associated lease that is subject to renewal together with all | | is an associated lease that is subject to renewal together with all | |
| of the other leases held by that client. | | of the other leases held by that client. | |
| | | | |
| Unlike locks, an operation by a second client to a delegated file | | Unlike locks, an operation by a second client to a delegated file | |
| will cause the server to recall a delegation through a callback. | | will cause the server to recall a delegation through a callback. | |
| | | | |
| On recall, the client holding the delegation must flush modified | | On recall, the client holding the delegation must flush modified | |
| | | | |
| skipping to change at page 96, line 4 | | skipping to change at page 96, line 21 | |
| | | | |
| There are three situations that delegation recovery must deal with: | | There are three situations that delegation recovery must deal with: | |
| | | | |
| o Client reboot or restart | | o Client reboot or restart | |
| | | | |
| o Server reboot or restart | | o Server reboot or restart | |
| | | | |
| o Network partition (full or callback-only) | | o Network partition (full or callback-only) | |
| | | | |
| In the event the client reboots or restarts, the failure to renew | | In the event the client reboots or restarts, the failure to renew | |
| leases will result in the revocation of record locks and share | | leases will result in the revocation of record locks and share | |
| reservations. Delegations, however, may be treated a bit | | reservations. Delegations, however, may be treated a bit | |
| differently. | | differently. | |
| | | | |
| There will be situations in which delegations will need to be | | There will be situations in which delegations will need to be | |
| reestablished after a client reboots or restarts. The reason for | | reestablished after a client reboots or restarts. The reason for | |
| this is the client may have file data stored locally and this data | | this is the client may have file data stored locally and this data | |
| was associated with the previously held delegations. The client will | | was associated with the previously held delegations. The client will | |
| need to reestablish the appropriate file state on the server. | | need to reestablish the appropriate file state on the server. | |
| | | | |
| | | | |
| skipping to change at page 96, line 35 | | skipping to change at page 96, line 49 | |
| storage so that the delegations can be reclaimed. For open | | storage so that the delegations can be reclaimed. For open | |
| delegations, such delegations are reclaimed using OPEN with a claim | | delegations, such delegations are reclaimed using OPEN with a claim | |
| type of CLAIM_DELEGATE_PREV. (See the sections on "Data Caching and | | type of CLAIM_DELEGATE_PREV. (See the sections on "Data Caching and | |
| Revocation" and "Operation 18: OPEN" for discussion of open | | Revocation" and "Operation 18: OPEN" for discussion of open | |
| delegation and the details of OPEN respectively). | | delegation and the details of OPEN respectively). | |
| | | | |
| A server MAY support a claim type of CLAIM_DELEGATE_PREV, but if it | | A server MAY support a claim type of CLAIM_DELEGATE_PREV, but if it | |
| does, it MUST NOT remove delegations upon SETCLIENTID_CONFIRM, and | | does, it MUST NOT remove delegations upon SETCLIENTID_CONFIRM, and | |
| instead MUST, for a period of time no less than that of the value of | | instead MUST, for a period of time no less than that of the value of | |
| the lease_time attribute, maintain the client's delegations to allow | | the lease_time attribute, maintain the client's delegations to allow | |
|
| time for the client to issue CLAIM_DELEGATE_PREV requests. The server | | time for the client to issue CLAIM_DELEGATE_PREV requests. The | |
| that supports CLAIM_DELEGATE_PREV MUST support the DELEGPURGE | | server that supports CLAIM_DELEGATE_PREV MUST support the DELEGPURGE | |
| operation. | | operation. | |
| | | | |
| When the server reboots or restarts, delegations are reclaimed (using | | When the server reboots or restarts, delegations are reclaimed (using | |
| the OPEN operation with CLAIM_PREVIOUS) in a similar fashion to | | the OPEN operation with CLAIM_PREVIOUS) in a similar fashion to | |
| record locks and share reservations. However, there is a slight | | record locks and share reservations. However, there is a slight | |
| semantic difference. In the normal case if the server decides that a | | semantic difference. In the normal case if the server decides that a | |
| delegation should not be granted, it performs the requested action | | delegation should not be granted, it performs the requested action | |
|
| (e.g. OPEN) without granting any delegation. For reclaim, the server | | (e.g., OPEN) without granting any delegation. For reclaim, the | |
| grants the delegation but a special designation is applied so that | | server grants the delegation but a special designation is applied so | |
| the client treats the delegation as having been granted but recalled | | that the client treats the delegation as having been granted but | |
| by the server. Because of this, the client has the duty to write all | | recalled by the server. Because of this, the client has the duty to | |
| modified state to the server and then return the delegation. This | | write all modified state to the server and then return the | |
| process of handling delegation reclaim reconciles three principles of | | delegation. This process of handling delegation reclaim reconciles | |
| the NFS version 4 protocol: | | three principles of the NFS version 4 protocol: | |
| | | | |
| o Upon reclaim, a client reporting resources assigned to it by an | | o Upon reclaim, a client reporting resources assigned to it by an | |
| earlier server instance must be granted those resources. | | earlier server instance must be granted those resources. | |
| | | | |
| o The server has unquestionable authority to determine whether | | o The server has unquestionable authority to determine whether | |
|
| delegations are to be granted and, once granted, whether they | | delegations are to be granted and, once granted, whether they are | |
| are to be continued. | | to be continued. | |
| | | | |
| o The use of callbacks is not to be depended upon until the client | | o The use of callbacks is not to be depended upon until the client | |
| has proven its ability to receive them. | | has proven its ability to receive them. | |
| | | | |
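A minimal client-side sketch of the reclaim behavior described above, assuming the "special designation" reaches the client as a boolean. The parameter name `recalled`, the FakeServer class, and the return strings are hypothetical; DELEGRETURN is used here as the operation that returns a delegation.

```python
# Illustrative only: FakeServer merely records the order of operations.

class FakeServer:
    def __init__(self):
        self.calls = []

    def write(self, block):
        self.calls.append("WRITE")

    def delegreturn(self):
        self.calls.append("DELEGRETURN")


def handle_reclaimed_delegation(recalled, dirty_blocks, server):
    """Handle a delegation obtained by reclaim after a server restart."""
    if not recalled:
        return "kept"
    # Granted but treated as recalled: write all modified state first,
    # then return the delegation.
    for block in dirty_blocks:
        server.write(block)
    server.delegreturn()
    return "returned"
```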
| When a network partition occurs, delegations are subject to freeing | | When a network partition occurs, delegations are subject to freeing | |
| by the server when the lease renewal period expires. This is similar | | by the server when the lease renewal period expires. This is similar | |
| to the behavior for locks and share reservations. For delegations, | | to the behavior for locks and share reservations. For delegations, | |
| however, the server may extend the period in which conflicting | | however, the server may extend the period in which conflicting | |
| requests are held off. Eventually the occurrence of a conflicting | | requests are held off. Eventually the occurrence of a conflicting | |
| request from another client will cause revocation of the delegation. | | request from another client will cause revocation of the delegation. | |
|
| A loss of the callback path (e.g. by later network configuration | | A loss of the callback path (e.g., by later network configuration | |
| change) will have the same effect. A recall request will fail and | | change) will have the same effect. A recall request will fail and | |
| revocation of the delegation will result. | | revocation of the delegation will result. | |
| | | | |
| A client normally finds out about revocation of a delegation when it | | A client normally finds out about revocation of a delegation when it | |
| uses a stateid associated with a delegation and receives the error | | uses a stateid associated with a delegation and receives the error | |
| NFS4ERR_EXPIRED. It also may find out about delegation revocation | | NFS4ERR_EXPIRED. It also may find out about delegation revocation | |
| after a client reboot when it attempts to reclaim a delegation and | | after a client reboot when it attempts to reclaim a delegation and | |
| receives that same error. Note that in the case of a revoked write | | receives that same error. Note that in the case of a revoked write | |
| open delegation, there are issues because data may have been modified | | open delegation, there are issues because data may have been modified | |
| by the client whose delegation is revoked and separately by other | | by the client whose delegation is revoked and separately by other | |
| | | | |
| skipping to change at page 98, line 4 | | skipping to change at page 98, line 26 | |
| protocol's data caching must be implemented such that it does not | | protocol's data caching must be implemented such that it does not | |
| invalidate the assumptions that those using these facilities depend | | invalidate the assumptions that those using these facilities depend | |
| upon. | | upon. | |
| | | | |
| 9.3.1. Data Caching and OPENs | | 9.3.1. Data Caching and OPENs | |
| | | | |
| In order to avoid invalidating the sharing assumptions that | | In order to avoid invalidating the sharing assumptions that | |
| applications rely on, NFS version 4 clients should not provide cached | | applications rely on, NFS version 4 clients should not provide cached | |
| data to applications or modify it on behalf of an application when it | | data to applications or modify it on behalf of an application when it | |
| would not be valid to obtain or modify that same data via a READ or | | would not be valid to obtain or modify that same data via a READ or | |
| WRITE operation. | | WRITE operation. | |
| | | | |
| Furthermore, in the absence of open delegation (see the section "Open | | Furthermore, in the absence of open delegation (see the section "Open | |
| Delegation") two additional rules apply. Note that these rules are | | Delegation") two additional rules apply. Note that these rules are | |
| obeyed in practice by many NFS version 2 and version 3 clients. | | obeyed in practice by many NFS version 2 and version 3 clients. | |
| | | | |
| o First, cached data present on a client must be revalidated after | | o First, cached data present on a client must be revalidated after | |
| doing an OPEN. Revalidating means that the client fetches the | | doing an OPEN. Revalidating means that the client fetches the | |
| change attribute from the server, compares it with the cached | | change attribute from the server, compares it with the cached | |
| change attribute, and if different, declares the cached data (as | | change attribute, and if different, declares the cached data (as | |
|
| well as the cached attributes) as invalid. This is to ensure | | well as the cached attributes) as invalid. This is to ensure that | |
| that the data for the OPENed file is still correctly reflected | | the data for the OPENed file is still correctly reflected in the | |
| in the client's cache. This validation must be done at least | | client's cache. This validation must be done at least when the | |
| when the client's OPEN operation includes DENY=WRITE or BOTH | | client's OPEN operation includes DENY=WRITE or BOTH thus | |
| thus terminating a period in which other clients may have had | | terminating a period in which other clients may have had the | |
| the opportunity to open the file with WRITE access. Clients may | | opportunity to open the file with WRITE access. Clients may | |
| choose to do the revalidation more often (i.e. at OPENs | | choose to do the revalidation more often (i.e., at OPENs | |
| specifying DENY=NONE) to parallel the NFS version 3 protocol's | | specifying DENY=NONE) to parallel the NFS version 3 protocol's | |
| practice for the benefit of users assuming this degree of cache | | practice for the benefit of users assuming this degree of cache | |
| revalidation. | | revalidation. | |
| | | | |
| Since the change attribute is updated for data and metadata | | Since the change attribute is updated for data and metadata | |
|
| modifications, some client implementors may be tempted to use | | modifications, some client implementors may be tempted to use the | |
| the time_modify attribute and not change to validate cached | | time_modify attribute and not change to validate cached data, so | |
| data, so that metadata changes do not spuriously invalidate | | that metadata changes do not spuriously invalidate clean data. | |
| clean data. The implementor is cautioned in this approach. The | | The implementor is cautioned in this approach. The change | |
| change attribute is guaranteed to change for each update to the | | attribute is guaranteed to change for each update to the file, | |
| file, whereas time_modify is guaranteed to change only at the | | whereas time_modify is guaranteed to change only at the | |
| granularity of the time_delta attribute. Use by the client's | | granularity of the time_delta attribute. Use by the client's data | |
| data cache validation logic of time_modify and not change runs | | cache validation logic of time_modify and not change runs the risk | |
| the risk of the client incorrectly marking stale data as valid. | | of the client incorrectly marking stale data as valid. | |
| | | | |
|
| o Second, modified data must be flushed to the server before | | o Second, modified data must be flushed to the server before closing | |
| closing a file OPENed for write. This is complementary to the | | a file OPENed for write. This is complementary to the first rule. | |
| first rule. If the data is not flushed at CLOSE, the | | If the data is not flushed at CLOSE, the revalidation done after | |
| revalidation done after client OPENs a file is unable to | | client OPENs a file is unable to achieve its purpose. The other | |
| achieve its purpose. The other aspect to flushing the data | | aspect to flushing the data before close is that the data must be | |
| before close is that the data must be committed to stable | | committed to stable storage, at the server, before the CLOSE | |
| storage, at the server, before the CLOSE operation is requested | | operation is requested by the client. In the case of a server | |
| by the client. In the case of a server reboot or restart and a | | reboot or restart and a CLOSEd file, it may not be possible to | |
| CLOSEd file, it may not be possible to retransmit the data to be | | retransmit the data to be written to the file. Hence, this | |
| written to the file. Hence, this requirement. | | requirement. | |
| | | | |
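The two rules above can be sketched as follows. This is a simplified illustration under stated assumptions: the CachedFile fields, the FakeServer class, and the function names are hypothetical, and a single whole-file cache entry stands in for a real block cache.

```python
# Illustrative only: a whole-file cache entry and a call-recording server.

class CachedFile:
    def __init__(self, change=None, data=None, attrs=None):
        self.change = change   # cached value of the change attribute
        self.data = data
        self.attrs = attrs
        self.dirty = False


class FakeServer:
    def __init__(self):
        self.calls = []

    def write(self, data):
        self.calls.append("WRITE")

    def commit(self):
        self.calls.append("COMMIT")

    def close(self):
        self.calls.append("CLOSE")


def revalidate_after_open(cache, server_change):
    """Rule 1: compare change attributes; invalidate data and attrs on mismatch."""
    if cache.change is not None and cache.change == server_change:
        return True   # cached data is still valid
    cache.data = None
    cache.attrs = None
    cache.change = server_change
    return False


def close_file(cache, server):
    """Rule 2: flush modified data to stable storage before CLOSE."""
    if cache.dirty:
        server.write(cache.data)
        server.commit()   # data must be on stable storage before CLOSE
        cache.dirty = False
    server.close()
```

The comparison uses the change attribute rather than time_modify, following the caution in the text that time_modify changes only at the granularity of time_delta.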
| 9.3.2. Data Caching and File Locking | | 9.3.2. Data Caching and File Locking | |
| | | | |
| For those applications that choose to use file locking instead of | | For those applications that choose to use file locking instead of | |
| share reservations to exclude inconsistent file access, there is an | | share reservations to exclude inconsistent file access, there is an | |
| analogous set of constraints that apply to client side data caching. | | analogous set of constraints that apply to client side data caching. | |
| These rules are effective only if the file locking is used in a way | | These rules are effective only if the file locking is used in a way | |
| that matches in an equivalent way the actual READ and WRITE | | that matches in an equivalent way the actual READ and WRITE | |
| operations executed. This is as opposed to file locking that is | | operations executed. This is as opposed to file locking that is | |
| based on pure convention. For example, it is possible to manipulate | | based on pure convention. For example, it is possible to manipulate | |
| a two-megabyte file by dividing the file into two one-megabyte | | a two-megabyte file by dividing the file into two one-megabyte | |
| regions and protecting access to the two regions by file locks on | | regions and protecting access to the two regions by file locks on | |
| bytes zero and one. A lock for write on byte zero of the file would | | bytes zero and one. A lock for write on byte zero of the file would | |
| represent the right to do READ and WRITE operations on the first | | represent the right to do READ and WRITE operations on the first | |
| region. A lock for write on byte one of the file would represent the | | region. A lock for write on byte one of the file would represent the | |
| right to do READ and WRITE operations on the second region. As long | | right to do READ and WRITE operations on the second region. As long | |
| as all applications manipulating the file obey this convention, they | | as all applications manipulating the file obey this convention, they | |
| will work on a local filesystem. However, they may not work with the | | will work on a local filesystem. However, they may not work with the | |
| NFS version 4 protocol unless clients refrain from data caching. | | NFS version 4 protocol unless clients refrain from data caching. | |
| | | | |
| The rules for data caching in the file locking environment are: | | The rules for data caching in the file locking environment are: | |
| | | | |
|
| o First, when a client obtains a file lock for a particular | | o First, when a client obtains a file lock for a particular region, | |
| region, the data cache corresponding to that region (if any | | the data cache corresponding to that region (if any cached data | |
| cache data exists) must be revalidated. If the change attribute | | exists) must be revalidated. If the change attribute indicates | |
| indicates that the file may have been updated since the cached | | that the file may have been updated since the cached data was | |
| data was obtained, the client must flush or invalidate the | | obtained, the client must flush or invalidate the cached data for | |
| cached data for the newly locked region. A client might choose | | the newly locked region. A client might choose to invalidate all | |
| to invalidate all of the non-modified cached data that it has for | | of the non-modified cached data that it has for the file but the only | |
| the file but the only requirement for correct operation is to | | requirement for correct operation is to invalidate all of the data | |
| invalidate all of the data in the newly locked region. | | in the newly locked region. | |
| | | | |
| o Second, before releasing a write lock for a region, all modified | | o Second, before releasing a write lock for a region, all modified | |
|
| data for that region must be flushed to the server. The | | data for that region must be flushed to the server. The modified | |
| modified data must also be written to stable storage. | | data must also be written to stable storage. | |
| | | | |
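As a sketch of the second rule, the helper below selects exactly the modified bytes that fall inside the range being unlocked, without rounding to cache block boundaries. The function name and the half-open (start, end) byte-range representation are assumptions made for illustration.

```python
# Illustrative only: dirty ranges are half-open (start, end) byte intervals.

def ranges_to_flush(dirty_ranges, lock_offset, lock_length):
    """Clip each dirty range to the locked range [offset, offset + length)."""
    lock_end = lock_offset + lock_length
    clipped = []
    for start, end in dirty_ranges:
        lo = max(start, lock_offset)
        hi = min(end, lock_end)
        if lo < hi:   # keep only non-empty overlaps
            clipped.append((lo, hi))
    return clipped
```

For example, with one dirty 4096-byte cache block and a lock covering bytes 1024 through 2047, only that 1024-byte span is selected for writing before the unlock, and the rest of the block stays in the cache.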
| Note that flushing data to the server and the invalidation of cached | | Note that flushing data to the server and the invalidation of cached | |
| data must reflect the actual byte ranges locked or unlocked. | | data must reflect the actual byte ranges locked or unlocked. | |
| Rounding these up or down to reflect client cache block boundaries | | Rounding these up or down to reflect client cache block boundaries | |
| will cause problems if not carefully done. For example, writing a | | will cause problems if not carefully done. For example, writing a | |
| modified block when only half of that block is within an area being | | modified block when only half of that block is within an area being | |
| unlocked may cause invalid modification to the region outside the | | unlocked may cause invalid modification to the region outside the | |
| unlocked area. This, in turn, may be part of a region locked by | | unlocked area. This, in turn, may be part of a region locked by | |
| another client. Clients can avoid this situation by synchronously | | another client. Clients can avoid this situation by synchronously | |
| performing portions of write operations that overlap that portion | | performing portions of write operations that overlap that portion | |
| | | | |
| skipping to change at page 99, line 58 | | skipping to change at page 100, line 32 | |
| client possesses may not be valid. | | client possesses may not be valid. | |
| | | | |
| The data that is written to the server as a prerequisite to the | | The data that is written to the server as a prerequisite to the | |
| unlocking of a region must be written, at the server, to stable | | unlocking of a region must be written, at the server, to stable | |
| storage. The client may accomplish this either with synchronous | | storage. The client may accomplish this either with synchronous | |
| writes or by following asynchronous writes with a COMMIT operation. | | writes or by following asynchronous writes with a COMMIT operation. | |
| This is required because retransmission of the modified data after a | | This is required because retransmission of the modified data after a | |
| server reboot might conflict with a lock held by another client. | | server reboot might conflict with a lock held by another client. | |
| | | | |
| A client implementation may choose to accommodate applications which | | A client implementation may choose to accommodate applications which | |
|
| use record locking in non-standard ways (e.g. using a record lock as | | use record locking in non-standard ways (e.g., using a record lock as | |
| a global semaphore) by flushing to the server more data upon a LOCKU | | a global semaphore) by flushing to the server more data upon a LOCKU | |
| than is covered by the locked range. This may include modified data | | than is covered by the locked range. This may include modified data | |
| within files other than the one for which the unlocks are being done. | | within files other than the one for which the unlocks are being done. | |
| In such cases, the client must not interfere with applications whose | | In such cases, the client must not interfere with applications whose | |
| READs and WRITEs are being done only within the bounds of record | | READs and WRITEs are being done only within the bounds of record | |
| locks which the application holds. For example, an application locks | | locks which the application holds. For example, an application locks | |
| a single byte of a file and proceeds to write that single byte. A | | a single byte of a file and proceeds to write that single byte. A | |
| client that chose to handle a LOCKU by flushing all modified data to | | client that chose to handle a LOCKU by flushing all modified data to | |
| the server could validly write that single byte in response to an | | the server could validly write that single byte in response to an | |
| unrelated unlock. However, it would not be valid to write the entire | | unrelated unlock. However, it would not be valid to write the entire | |
| | | | |
| skipping to change at page 101, line 4 | | skipping to change at page 101, line 36 | |
| NFS version 3 clients, the typical practice has been to assume for | | NFS version 3 clients, the typical practice has been to assume for | |
| the purpose of caching that distinct filehandles represent distinct | | the purpose of caching that distinct filehandles represent distinct | |
| filesystem objects. The client then has the choice to organize and | | filesystem objects. The client then has the choice to organize and | |
| maintain the data cache on this basis. | | maintain the data cache on this basis. | |
| | | | |
| In the NFS version 4 protocol, there is now the possibility to have | | In the NFS version 4 protocol, there is now the possibility to have | |
| significant deviations from a "one filehandle per object" model | | significant deviations from a "one filehandle per object" model | |
| because a filehandle may be constructed on the basis of the object's | | because a filehandle may be constructed on the basis of the object's | |
| pathname. Therefore, clients need a reliable method to determine if | | pathname. Therefore, clients need a reliable method to determine if | |
| two filehandles designate the same filesystem object. If clients | | two filehandles designate the same filesystem object. If clients | |
| were simply to assume that all distinct filehandles denote distinct | | were simply to assume that all distinct filehandles denote distinct | |
| objects and proceed to do data caching on this basis, caching | | objects and proceed to do data caching on this basis, caching | |
| inconsistencies would arise between the distinct client side objects | | inconsistencies would arise between the distinct client side objects | |
| which mapped to the same server side object. | | which mapped to the same server side object. | |
| | | | |
| By providing a method to differentiate filehandles, the NFS version 4 | | By providing a method to differentiate filehandles, the NFS version 4 | |
| protocol alleviates a potential functional regression in comparison | | protocol alleviates a potential functional regression in comparison | |
| with the NFS version 3 protocol. Without this method, caching | | with the NFS version 3 protocol. Without this method, caching | |
| inconsistencies within the same client could occur and this has not | | inconsistencies within the same client could occur and this has not | |
| been present in previous versions of the NFS protocol. Note that it | | been present in previous versions of the NFS protocol. Note that it | |
| is possible to have such inconsistencies with applications executing | | is possible to have such inconsistencies with applications executing | |
| on multiple clients but that is not the issue being addressed here. | | on multiple clients but that is not the issue being addressed here. | |
| | | | |
| For the purposes of data caching, the following steps allow an NFS | | For the purposes of data caching, the following steps allow an NFS | |
| version 4 client to determine whether two distinct filehandles denote | | version 4 client to determine whether two distinct filehandles denote | |
| the same server side object: | | the same server side object: | |
| | | | |
|
| o If GETATTR directed to two filehandles returns different values | | o If GETATTR directed to two filehandles returns different values of | |
| of the fsid attribute, then the filehandles represent distinct | | the fsid attribute, then the filehandles represent distinct | |
| objects. | | objects. | |
| | | | |
|
| o If GETATTR for any file with an fsid that matches the fsid of | | o If GETATTR for any file with an fsid that matches the fsid of the | |
| the two filehandles in question returns a unique_handles | | two filehandles in question returns a unique_handles attribute | |
| attribute with a value of TRUE, then the two objects are | | with a value of TRUE, then the two objects are distinct. | |
| distinct. | | | |
| | | | |
| o If GETATTR directed to the two filehandles does not return the | | o If GETATTR directed to the two filehandles does not return the | |
| fileid attribute for both of the handles, then it cannot be | | fileid attribute for both of the handles, then it cannot be | |
| determined whether the two objects are the same. Therefore, | | determined whether the two objects are the same. Therefore, | |
|
| operations which depend on that knowledge (e.g. client side data | | operations which depend on that knowledge (e.g., client side data | |
| caching) cannot be done reliably. | | caching) cannot be done reliably. | |
| | | | |
| o If GETATTR directed to the two filehandles returns different | | o If GETATTR directed to the two filehandles returns different | |
| values for the fileid attribute, then they are distinct objects. | | values for the fileid attribute, then they are distinct objects. | |
| | | | |
| o Otherwise they are the same object. | | o Otherwise they are the same object. | |
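
The decision steps above can be sketched as a small function. This is only an illustrative sketch, not part of the protocol: the names Attrs, Verdict, and compare_filehandles are hypothetical, the two filehandles are assumed to be distinct already, and the attribute values are assumed to come from GETATTR calls the client has already made.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Verdict(Enum):
    DISTINCT = "distinct"
    SAME = "same"
    UNKNOWN = "unknown"   # data caching cannot be done reliably


@dataclass(frozen=True)
class Attrs:
    """Hypothetical holder for the GETATTR results of one filehandle."""
    fsid: Tuple[int, int]          # (major, minor)
    fileid: Optional[int] = None   # None if the server did not return fileid


def compare_filehandles(a: Attrs, b: Attrs, unique_handles: bool) -> Verdict:
    """Apply the steps in order; unique_handles is the value reported
    for any file on the filesystem that both handles share."""
    # Different fsid values: the handles represent distinct objects.
    if a.fsid != b.fsid:
        return Verdict.DISTINCT
    # unique_handles is TRUE for this fsid: distinct handles, distinct objects.
    if unique_handles:
        return Verdict.DISTINCT
    # fileid missing for either handle: sameness cannot be determined.
    if a.fileid is None or b.fileid is None:
        return Verdict.UNKNOWN
    # Different fileid values: distinct objects.
    if a.fileid != b.fileid:
        return Verdict.DISTINCT
    # Otherwise they are the same object.
    return Verdict.SAME
```

A client that receives UNKNOWN would have to avoid operations, such as client side data caching, that depend on knowing whether the two objects are the same.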
| | | | |
| 9.4. Open Delegation | | 9.4. Open Delegation | |
| | | | |
| When a file is being OPENed, the server may delegate further handling | | When a file is being OPENed, the server may delegate further handling | |
| | | | |
| skipping to change at page 102, line 5 | | skipping to change at page 102, line 38 | |
| delegation is recallable, since the circumstances that allowed for | | delegation is recallable, since the circumstances that allowed for | |
| the delegation are subject to change. In particular, if the server | | the delegation are subject to change. In particular, if the server | |
| receives a conflicting OPEN from another client, the server must | | receives a conflicting OPEN from another client, the server must | |
| recall the delegation before deciding whether the OPEN from the other | | recall the delegation before deciding whether the OPEN from the other | |
| client may be granted. Making a delegation is up to the server and | | client may be granted. Making a delegation is up to the server and | |
| clients should not assume that any particular OPEN either will or | | clients should not assume that any particular OPEN either will or | |
| will not result in an open delegation. The following is a typical | | will not result in an open delegation. The following is a typical | |
| set of conditions that servers might use in deciding whether OPEN | | set of conditions that servers might use in deciding whether OPEN | |
| should be delegated: | | should be delegated: | |
| | | | |
| o The client must be able to respond to the server's callback | | o The client must be able to respond to the server's callback | |
|
| requests. The server will use the CB_NULL procedure for a test | | requests. The server will use the CB_NULL procedure for a test of | |
| of callback ability. | | callback ability. | |
| | | | |
| o The client must have responded properly to previous recalls. | | o The client must have responded properly to previous recalls. | |
| | | | |
| o There must be no current open conflicting with the requested | | o There must be no current open conflicting with the requested | |
| delegation. | | delegation. | |
| | | | |
| o There should be no current delegation that conflicts with the | | o There should be no current delegation that conflicts with the | |
| delegation being requested. | | delegation being requested. | |
| | | | |
|
| o The probability of future conflicting open requests should be | | o The probability of future conflicting open requests should be low | |
| low based on the recent history of the file. | | based on the recent history of the file. | |
| | | | |
|
| o The existence of any server-specific semantics of OPEN/CLOSE | | o The existence of any server-specific semantics of OPEN/CLOSE that | |
| that would make the required handling incompatible with the | | would make the required handling incompatible with the prescribed | |
| prescribed handling that the delegated client would apply (see | | handling that the delegated client would apply (see below). | |
| below). | | | |
| | | | |
| There are two types of open delegations, read and write. A read open | | There are two types of open delegations, read and write. A read open | |
| delegation allows a client to handle, on its own, requests to open a | | delegation allows a client to handle, on its own, requests to open a | |
| file for reading that do not deny read access to others. Multiple | | file for reading that do not deny read access to others. Multiple | |
| read open delegations may be outstanding simultaneously and do not | | read open delegations may be outstanding simultaneously and do not | |
| conflict. A write open delegation allows the client to handle, on | | conflict. A write open delegation allows the client to handle, on | |
| its own, all opens. Only one write open delegation may exist for a | | its own, all opens. Only one write open delegation may exist for a | |
| given file at a given time and it is inconsistent with any read open | | given file at a given time and it is inconsistent with any read open | |
| delegations. | | delegations. | |
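
The compatibility rule between the two delegation types can be sketched as follows. This is a minimal illustration under stated assumptions: DelegType and delegation_conflicts are hypothetical names, and the outstanding list is assumed to contain only the delegations currently held on the same file.

```python
from enum import Enum
from typing import Iterable


class DelegType(Enum):
    READ = "read"
    WRITE = "write"


def delegation_conflicts(requested: DelegType,
                         outstanding: Iterable[DelegType]) -> bool:
    """Return True if the requested delegation conflicts with any
    delegation already outstanding on the same file."""
    held = list(outstanding)
    if requested is DelegType.WRITE:
        # A write delegation is inconsistent with any other delegation.
        return len(held) > 0
    # Read delegations coexist with each other, but not with a write one.
    return DelegType.WRITE in held
```

Under this sketch, any number of read delegations can coexist, while a write delegation is only possible when nothing else is outstanding.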
| | | | |
| | | | |
| skipping to change at page 102, line 55 | | skipping to change at page 103, line 37 | |
| CLOSEs to the server but updates the appropriate status internally. | | CLOSEs to the server but updates the appropriate status internally. | |
| For a read open delegation, opens that cannot be handled locally | | For a read open delegation, opens that cannot be handled locally | |
| (opens for write or that deny read access) must be sent to the | | (opens for write or that deny read access) must be sent to the | |
| server. | | server. | |
| | | | |
| When an open delegation is made, the response to the OPEN contains an | | When an open delegation is made, the response to the OPEN contains an | |
| open delegation structure which specifies the following: | | open delegation structure which specifies the following: | |
| | | | |
| o the type of delegation (read or write) | | o the type of delegation (read or write) | |
| | | | |
|
| o space limitation information to control flushing of data on | | o space limitation information to control flushing of data on close | |
| close (write open delegation only, see the section "Open | | (write open delegation only, see the section "Open Delegation and | |
| Delegation and Data Caching") | | Data Caching") | |
| | | | |
| o an nfsace4 specifying read and write permissions | | o an nfsace4 specifying read and write permissions | |
| | | | |
| o a stateid to represent the delegation for READ and WRITE | | o a stateid to represent the delegation for READ and WRITE | |
| | | | |
| The delegation stateid is separate and distinct from the stateid for | | The delegation stateid is separate and distinct from the stateid for | |
| the OPEN proper. The standard stateid, unlike the delegation | | the OPEN proper. The standard stateid, unlike the delegation | |
| stateid, is associated with a particular lock_owner and will continue | | stateid, is associated with a particular lock_owner and will continue | |
| to be valid after the delegation is recalled and the file remains | | to be valid after the delegation is recalled and the file remains | |
| open. | | open. | |
| | | | |
| When a request internal to the client is made to open a file and open | | When a request internal to the client is made to open a file and open | |
| delegation is in effect, it will be accepted or rejected solely on | | delegation is in effect, it will be accepted or rejected solely on | |
| the basis of the following conditions. Any requirement for other | | the basis of the following conditions. Any requirement for other | |
| checks to be made by the delegate should result in open delegation | | checks to be made by the delegate should result in open delegation | |
| being denied so that the checks can be made by the server itself. | | being denied so that the checks can be made by the server itself. | |
| | | | |
|
| o The access and deny bits for the request and the file as | | o The access and deny bits for the request and the file as described | |
| described in the section "Share Reservations". | | in the section "Share Reservations". | |
| | | | |
| o The read and write permissions as determined below. | | o The read and write permissions as determined below. | |
| | | | |
| The nfsace4 passed with delegation can be used to avoid frequent | | The nfsace4 passed with delegation can be used to avoid frequent | |
| ACCESS calls. The permission check should be as follows: | | ACCESS calls. The permission check should be as follows: | |
| | | | |
|
| o If the nfsace4 indicates that the open may be done, then it | | o If the nfsace4 indicates that the open may be done, then it should | |
| should be granted without reference to the server. | | be granted without reference to the server. | |
| | | | |
| o If the nfsace4 indicates that the open may not be done, then an | | o If the nfsace4 indicates that the open may not be done, then an | |
|
| ACCESS request must be sent to the server to obtain the | | ACCESS request must be sent to the server to obtain the definitive | |
| definitive answer. | | answer. | |
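
The permission check above is asymmetric: a grant from the nfsace4 is final, while a denial only means the server has to be asked. A minimal sketch, assuming a hypothetical callable ask_server that issues the ACCESS request and returns the server's answer:

```python
from typing import Callable


def open_permitted(ace_allows: bool, ask_server: Callable[[], bool]) -> bool:
    """Decide a local open under an open delegation.

    ace_allows -- whether the delegation's nfsace4 permits the open
    ask_server -- hypothetical hook that sends ACCESS to the server
    """
    if ace_allows:
        # Granted without reference to the server.
        return True
    # The nfsace4 may be more restrictive than the real ACL, so a local
    # denial is not definitive; the server's ACCESS reply is.
    return ask_server()
```

The hook is only invoked on the denial path, which is what lets a permissive nfsace4 save ACCESS round trips.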
| | | | |
| The server may return an nfsace4 that is more restrictive than the | | The server may return an nfsace4 that is more restrictive than the | |
| actual ACL of the file. This includes an nfsace4 that specifies | | actual ACL of the file. This includes an nfsace4 that specifies | |
| denial of all access. Note that some common practices such as | | denial of all access. Note that some common practices such as | |
| mapping the traditional user "root" to the user "nobody" may make it | | mapping the traditional user "root" to the user "nobody" may make it | |
| incorrect to return the actual ACL of the file in the delegation | | incorrect to return the actual ACL of the file in the delegation | |
| response. | | response. | |
| | | | |
| The use of delegation together with various other forms of caching | | The use of delegation together with various other forms of caching | |
| creates the possibility that no server authentication will ever be | | creates the possibility that no server authentication will ever be | |
| performed for a given user since all of the user's requests might be | | performed for a given user since all of the user's requests might be | |
| satisfied locally. Where the client is depending on the server for | | satisfied locally. Where the client is depending on the server for | |
| authentication, the client should be sure authentication occurs for | | authentication, the client should be sure authentication occurs for | |
| each user by use of the ACCESS operation. This should be the case | | each user by use of the ACCESS operation. This should be the case | |
| even if an ACCESS operation would not be required otherwise. As | | even if an ACCESS operation would not be required otherwise. As | |
| mentioned before, the server may enforce frequent authentication by | | mentioned before, the server may enforce frequent authentication by | |
| returning an nfsace4 denying all access with every open delegation. | | returning an nfsace4 denying all access with every open delegation. | |
| | | | |
| 9.4.1. Open Delegation and Data Caching | | 9.4.1. Open Delegation and Data Caching | |
| | | | |
| OPEN delegation allows much of the message overhead associated with | | OPEN delegation allows much of the message overhead associated with | |
| the opening and closing of files to be eliminated. An open when an open | | the opening and closing of files to be eliminated. An open when an open | |
| delegation is in effect does not require that a validation message be | | delegation is in effect does not require that a validation message be | |
| sent to the server. The continued endurance of the "read open | | sent to the server. The continued endurance of the "read open | |
| delegation" provides a guarantee that no OPEN for write and thus no | | delegation" provides a guarantee that no OPEN for write and thus no | |
| write has occurred. Similarly, when closing a file opened for write | | write has occurred. Similarly, when closing a file opened for write | |
| and if write open delegation is in effect, the data written does not | | and if write open delegation is in effect, the data written does not | |
| have to be flushed to the server until the open delegation is | | have to be flushed to the server until the open delegation is | |
| | | | |
| skipping to change at page 104, line 36 | | skipping to change at page 105, line 23 | |
| client will force the server to recall a write open delegation. A | | client will force the server to recall a write open delegation. A | |
| WRITE with a special stateid done by another client will force a | | WRITE with a special stateid done by another client will force a | |
| recall of read open delegations. | | recall of read open delegations. | |
| | | | |
| With delegations, a client is able to avoid writing data to the | | With delegations, a client is able to avoid writing data to the | |
| server when the CLOSE of a file is serviced. The file close system | | server when the CLOSE of a file is serviced. The file close system | |
| call is the usual point at which the client is notified of a lack of | | call is the usual point at which the client is notified of a lack of | |
| stable storage for the modified file data generated by the | | stable storage for the modified file data generated by the | |
| application. At the close, file data is written to the server and | | application. At the close, file data is written to the server and | |
| through normal accounting the server is able to determine if the | | through normal accounting the server is able to determine if the | |
|
| available filesystem space for the data has been exceeded (i.e. | | available filesystem space for the data has been exceeded (i.e., | |
| server returns NFS4ERR_NOSPC or NFS4ERR_DQUOT). This accounting | | server returns NFS4ERR_NOSPC or NFS4ERR_DQUOT). This accounting | |
| includes quotas. The introduction of delegations requires that an | | includes quotas. The introduction of delegations requires that an | |
| alternative method be in place for the same type of communication to | | alternative method be in place for the same type of communication to | |
| occur between client and server. | | occur between client and server. | |
| | | | |
| In the delegation response, the server provides either the limit of | | In the delegation response, the server provides either the limit of | |
| the size of the file or the number of modified blocks and associated | | the size of the file or the number of modified blocks and associated | |
| block size. The server must ensure that the client will be able to | | block size. The server must ensure that the client will be able to | |
| flush data to the server of a size equal to that provided in the | | flush data to the server of a size equal to that provided in the | |
| original delegation. The server must make this assurance for all | | original delegation. The server must make this assurance for all | |
| | | | |
| skipping to change at page 105, line 5 | | skipping to change at page 105, line 47 | |
| The server can recall delegations as a result of managing the | | The server can recall delegations as a result of managing the | |
| available filesystem space. The client should abide by the server's | | available filesystem space. The client should abide by the server's | |
| state space limits for delegations. If the client exceeds the stated | | state space limits for delegations. If the client exceeds the stated | |
| limits for the delegation, the server's behavior is undefined. | | limits for the delegation, the server's behavior is undefined. | |
| | | | |
| Based on server conditions, quotas or available filesystem space, the | | Based on server conditions, quotas or available filesystem space, the | |
| server may grant write open delegations with very restrictive space | | server may grant write open delegations with very restrictive space | |
| limitations. The limitations may be defined in a way that will | | limitations. The limitations may be defined in a way that will | |
| always force modified data to be flushed to the server on close. | | always force modified data to be flushed to the server on close. | |
| | | | |
| With respect to authentication, flushing modified data to the server | | With respect to authentication, flushing modified data to the server | |
| after a CLOSE has occurred may be problematic. For example, the user | | after a CLOSE has occurred may be problematic. For example, the user | |
| of the application may have logged off the client and unexpired | | of the application may have logged off the client and unexpired | |
| authentication credentials may not be present. In this case, the | | authentication credentials may not be present. In this case, the | |
| client may need to take special care to ensure that local unexpired | | client may need to take special care to ensure that local unexpired | |
| credentials will in fact be available. This may be accomplished by | | credentials will in fact be available. This may be accomplished by | |
| tracking the expiration time of credentials and flushing data well in | | tracking the expiration time of credentials and flushing data well in | |
| advance of their expiration or by making private copies of | | advance of their expiration or by making private copies of | |
| credentials to assure their availability when needed. | | credentials to assure their availability when needed. | |
| | | | |
| 9.4.2. Open Delegation and File Locks | | 9.4.2. Open Delegation and File Locks | |
| | | | |
|
| When a client holds a write open delegation, lock operations are | | When a client holds a write open delegation, lock operations may be | |
| performed locally. This includes those required for mandatory file | | performed locally. This includes those required for mandatory file | |
| locking. This can be done since the delegation implies that there | | locking. This can be done since the delegation implies that there | |
| can be no conflicting locks. Similarly, all of the revalidations | | can be no conflicting locks. Similarly, all of the revalidations | |
| that would normally be associated with obtaining locks and the | | that would normally be associated with obtaining locks and the | |
| flushing of data associated with the releasing of locks need not be | | flushing of data associated with the releasing of locks need not be | |
| done. | | done. | |
| | | | |
| When a client holds a read open delegation, lock operations are not | | When a client holds a read open delegation, lock operations are not | |
| performed locally. All lock operations, including those requesting | | performed locally. All lock operations, including those requesting | |
| non-exclusive locks, are sent to the server for resolution. | | non-exclusive locks, are sent to the server for resolution. | |
| | | | |
| skipping to change at page 106, line 5 | | skipping to change at page 106, line 50 | |
| only needs to know about this modified state. If the server | | only needs to know about this modified state. If the server | |
| determines that the file is currently modified, it will respond to | | determines that the file is currently modified, it will respond to | |
| the second client's GETATTR as if the file had been modified locally | | the second client's GETATTR as if the file had been modified locally | |
| at the server. | | at the server. | |
| | | | |
| Since the form of the change attribute is determined by the server | | Since the form of the change attribute is determined by the server | |
| and is opaque to the client, the client and server need to agree on a | | and is opaque to the client, the client and server need to agree on a | |
| method of communicating the modified state of the file. For the size | | method of communicating the modified state of the file. For the size | |
| attribute, the client will report its current view of the file size. | | attribute, the client will report its current view of the file size. | |
| | | | |
| For the change attribute, the handling is more involved. | | For the change attribute, the handling is more involved. | |
| | | | |
| For the client, the following steps will be taken when receiving a | | For the client, the following steps will be taken when receiving a | |
| write delegation: | | write delegation: | |
| | | | |
|
| o The value of the change attribute will be obtained from the | | o The value of the change attribute will be obtained from the server | |
| server and cached. Let this value be represented by c. | | and cached. Let this value be represented by c. | |
| | | | |
| o The client will create a value greater than c that will be used | | o The client will create a value greater than c that will be used | |
| for communicating that modified data is held at the client. Let this | | for communicating that modified data is held at the client. Let this | |
| value be represented by d. | | value be represented by d. | |
| | | | |
| o When the client is queried via CB_GETATTR for the change | | o When the client is queried via CB_GETATTR for the change | |
| attribute, it checks to see if it holds modified data. If the | | attribute, it checks to see if it holds modified data. If the | |
|
| file is modified, the value d is returned for the change | | file is modified, the value d is returned for the change attribute | |
| attribute value. If this file is not currently modified, the | | value. If this file is not currently modified, the client returns | |
| client returns the value c for the change attribute. | | the value c for the change attribute. | |
| | | | |
| For simplicity of implementation, the client MAY for each CB_GETATTR | | For simplicity of implementation, the client MAY for each CB_GETATTR | |
| return the same value d. This is true even if, between successive | | return the same value d. This is true even if, between successive | |
| CB_GETATTR operations, the client again modifies the file's data | | CB_GETATTR operations, the client again modifies the file's data | |
| or metadata in its cache. The client can return the same value | | or metadata in its cache. The client can return the same value | |
| because the only requirement is that the client be able to indicate | | because the only requirement is that the client be able to indicate | |
| to the server that the client holds modified data. Therefore, the | | to the server that the client holds modified data. Therefore, the | |
| value of d may always be c + 1. | | value of d may always be c + 1. | |
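
The client side handling of c and d can be sketched as a small state holder. The class and method names here are hypothetical, the simple choice d = c + 1 is used, and wraparound of the unsigned value is deliberately not handled.

```python
class DelegatedChange:
    """Tracks the change attribute values a client reports while it
    holds a write delegation (illustrative sketch only)."""

    def __init__(self, server_change: int) -> None:
        self.c = server_change      # value obtained from the server and cached
        self.d = server_change + 1  # any value greater than c would do
        self.modified = False       # does the client hold modified data?

    def mark_modified(self) -> None:
        self.modified = True

    def cb_getattr_change(self) -> int:
        # d signals "modified data is held here"; c signals "unchanged".
        return self.d if self.modified else self.c
```

Repeated calls to cb_getattr_change after further local modifications keep returning the same d, which matches the simplification permitted above.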
| | | | |
| While the change attribute is opaque to the client in the sense that | | While the change attribute is opaque to the client in the sense that | |
| | | | |
| skipping to change at page 106, line 47 | | skipping to change at page 107, line 43 | |
| of the client's changes to that integer. Therefore, the server MUST | | of the client's changes to that integer. Therefore, the server MUST | |
| encode the change attribute in network order when sending it to the | | encode the change attribute in network order when sending it to the | |
| client. The client MUST decode it from network order to its native | | client. The client MUST decode it from network order to its native | |
| order when receiving it and the client MUST encode it in network order | | order when receiving it and the client MUST encode it in network order | |
| when sending it to the server. For this reason, change is defined as | | when sending it to the server. For this reason, change is defined as | |
| an unsigned integer rather than an opaque array of octets. | | an unsigned integer rather than an opaque array of octets. | |
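
The encode and decode requirement can be illustrated with Python's struct module. This sketch assumes the change attribute is carried as a 64-bit unsigned integer; the function names are hypothetical.

```python
import struct


def encode_change(value: int) -> bytes:
    """Encode a change value in network (big-endian) order.
    The 64-bit width is an assumption of this sketch."""
    return struct.pack("!Q", value)


def decode_change(wire: bytes) -> int:
    """Decode a change value from network order to a native integer."""
    (value,) = struct.unpack("!Q", wire)
    return value
```

Decoding before doing arithmetic is what makes a computation such as d = c + 1 meaningful regardless of the byte order of the client or the server.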
| | | | |
| For the server, the following steps will be taken when providing a | | For the server, the following steps will be taken when providing a | |
| write delegation: | | write delegation: | |
| | | | |
|
| o Upon providing a write delegation, the server will cache a copy | | o Upon providing a write delegation, the server will cache a copy of | |
| of the change attribute in the data structure it uses to record | | the change attribute in the data structure it uses to record the | |
| the delegation. Let this value be represented by sc. | | delegation. Let this value be represented by sc. | |
| | | | |
|
| o When a second client sends a GETATTR operation on the same file | | o When a second client sends a GETATTR operation on the same file to | |
| to the server, the server obtains the change attribute from the | | the server, the server obtains the change attribute from the first | |
| first client. Let this value be cc. | | client. Let this value be cc. | |
| | | | |
|