draft-ietf-nfsv4-minorversion1-26.txt   draft-ietf-nfsv4-minorversion1-27.txt 
NFSv4 S. Shepler NFSv4 S. Shepler
Internet-Draft M. Eisler Internet-Draft M. Eisler
Intended status: Standards Track D. Noveck Intended status: Standards Track D. Noveck
Expires: March 9, 2009 Editors Expires: June 6, 2009 Editors
September 05, 2008 December 03, 2008
NFS Version 4 Minor Version 1 NFS Version 4 Minor Version 1
draft-ietf-nfsv4-minorversion1-26.txt draft-ietf-nfsv4-minorversion1-27.txt
Status of this Memo Status of this Memo
By submitting this Internet-Draft, each author represents that any By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79. aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
skipping to change at page 1, line 35 skipping to change at page 1, line 35
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt. http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
This Internet-Draft will expire on March 9, 2009. This Internet-Draft will expire on June 6, 2009.
Abstract Abstract
draft-ietf-nfsv4-minorversion1-26.txt:
This Internet-Draft describes NFS version 4 minor version one,
including features retained from the base protocol and protocol
extensions made subsequently. Major extensions introduced in NFS
version 4 minor version one include: Sessions, Directory Delegations,
and parallel NFS (pNFS).
draft-ietf-nfsv4-minorversion1-27.txt:
This document describes NFS version 4 minor version one, including
features retained from the base protocol (NFS version 4 minor version
zero which is specified in RFC3530) and protocol extensions made
subsequently. Major extensions introduced in NFS version 4 minor
version one include: Sessions, Directory Delegations, and parallel
NFS (pNFS). NFS version 4 minor version one has no dependencies on
NFS version 4 minor version zero, and is considered a separate
protocol. Thus this document neither updates nor obsoletes RFC3530.
NFS minor version one is deemed superior to NFS minor version zero
with no loss of functionality, and its use is preferred over version
zero. Both NFS minor version zero and one can be used simultaneously
on the same network, between the same client and server.
Requirements Language Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [1]. document are to be interpreted as described in RFC 2119 [1].
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 11 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 11
1.1. The NFS Version 4 Minor Version 1 Protocol . . . . . . . 11 1.1. The NFS Version 4 Minor Version 1 Protocol . . . . . . . 11
1.2. Scope of this Document . . . . . . . . . . . . . . . . . 11 1.2. Scope of this Document . . . . . . . . . . . . . . . . . 11
1.3. NFSv4 Goals . . . . . . . . . . . . . . . . . . . . . . 11 1.3. NFSv4 Goals . . . . . . . . . . . . . . . . . . . . . . 11
1.4. NFSv4.1 Goals . . . . . . . . . . . . . . . . . . . . . 12 1.4. NFSv4.1 Goals . . . . . . . . . . . . . . . . . . . . . 12
1.5. General Definitions . . . . . . . . . . . . . . . . . . 12 1.5. General Definitions . . . . . . . . . . . . . . . . . . 12
1.6. Overview of NFSv4.1 Features . . . . . . . . . . . . . . 15 1.6. Overview of NFSv4.1 Features . . . . . . . . . . . . . . 15
1.6.1. RPC and Security . . . . . . . . . . . . . . . . . . 15 1.6.1. RPC and Security . . . . . . . . . . . . . . . . . . 15
1.6.2. Protocol Structure . . . . . . . . . . . . . . . . . 15 1.6.2. Protocol Structure . . . . . . . . . . . . . . . . . 16
1.6.3. File System Model . . . . . . . . . . . . . . . . . 16 1.6.3. File System Model . . . . . . . . . . . . . . . . . 16
1.6.4. Locking Facilities . . . . . . . . . . . . . . . . . 18 1.6.4. Locking Facilities . . . . . . . . . . . . . . . . . 18
1.7. Differences from NFSv4.0 . . . . . . . . . . . . . . . . 19 1.7. Differences from NFSv4.0 . . . . . . . . . . . . . . . . 19
2. Core Infrastructure . . . . . . . . . . . . . . . . . . . . . 20 2. Core Infrastructure . . . . . . . . . . . . . . . . . . . . . 20
2.1. Introduction . . . . . . . . . . . . . . . . . . . . . . 20 2.1. Introduction . . . . . . . . . . . . . . . . . . . . . . 20
2.2. RPC and XDR . . . . . . . . . . . . . . . . . . . . . . 20 2.2. RPC and XDR . . . . . . . . . . . . . . . . . . . . . . 20
2.2.1. RPC-based Security . . . . . . . . . . . . . . . . . 20 2.2.1. RPC-based Security . . . . . . . . . . . . . . . . . 20
2.3. COMPOUND and CB_COMPOUND . . . . . . . . . . . . . . . . 23 2.3. COMPOUND and CB_COMPOUND . . . . . . . . . . . . . . . . 23
2.4. Client Identifiers and Client Owners . . . . . . . . . . 24 2.4. Client Identifiers and Client Owners . . . . . . . . . . 24
2.4.1. Upgrade from NFSv4.0 to NFSv4.1 . . . . . . . . . . 27 2.4.1. Upgrade from NFSv4.0 to NFSv4.1 . . . . . . . . . . 28
2.4.2. Server Release of Client ID . . . . . . . . . . . . 28 2.4.2. Server Release of Client ID . . . . . . . . . . . . 28
2.4.3. Resolving Client Owner Conflicts . . . . . . . . . . 28 2.4.3. Resolving Client Owner Conflicts . . . . . . . . . . 28
2.5. Server Owners . . . . . . . . . . . . . . . . . . . . . 29 2.5. Server Owners . . . . . . . . . . . . . . . . . . . . . 30
2.6. Security Service Negotiation . . . . . . . . . . . . . . 30 2.6. Security Service Negotiation . . . . . . . . . . . . . . 30
2.6.1. NFSv4.1 Security Tuples . . . . . . . . . . . . . . 30 2.6.1. NFSv4.1 Security Tuples . . . . . . . . . . . . . . 31
2.6.2. SECINFO and SECINFO_NO_NAME . . . . . . . . . . . . 31 2.6.2. SECINFO and SECINFO_NO_NAME . . . . . . . . . . . . 31
2.6.3. Security Error . . . . . . . . . . . . . . . . . . . 31 2.6.3. Security Error . . . . . . . . . . . . . . . . . . . 31
2.7. Minor Versioning . . . . . . . . . . . . . . . . . . . . 35 2.7. Minor Versioning . . . . . . . . . . . . . . . . . . . . 36
2.8. Non-RPC-based Security Services . . . . . . . . . . . . 38 2.8. Non-RPC-based Security Services . . . . . . . . . . . . 38
2.8.1. Authorization . . . . . . . . . . . . . . . . . . . 38 2.8.1. Authorization . . . . . . . . . . . . . . . . . . . 38
2.8.2. Auditing . . . . . . . . . . . . . . . . . . . . . . 38 2.8.2. Auditing . . . . . . . . . . . . . . . . . . . . . . 38
2.8.3. Intrusion Detection . . . . . . . . . . . . . . . . 38 2.8.3. Intrusion Detection . . . . . . . . . . . . . . . . 39
2.9. Transport Layers . . . . . . . . . . . . . . . . . . . . 39 2.9. Transport Layers . . . . . . . . . . . . . . . . . . . . 39
2.9.1. REQUIRED and RECOMMENDED Properties of Transports . 39 2.9.1. REQUIRED and RECOMMENDED Properties of Transports . 39
2.9.2. Client and Server Transport Behavior . . . . . . . . 39 2.9.2. Client and Server Transport Behavior . . . . . . . . 40
2.9.3. Ports . . . . . . . . . . . . . . . . . . . . . . . 41 2.9.3. Ports . . . . . . . . . . . . . . . . . . . . . . . 41
2.10. Session . . . . . . . . . . . . . . . . . . . . . . . . 41 2.10. Session . . . . . . . . . . . . . . . . . . . . . . . . 41
2.10.1. Motivation and Overview . . . . . . . . . . . . . . 41 2.10.1. Motivation and Overview . . . . . . . . . . . . . . 41
2.10.2. NFSv4 Integration . . . . . . . . . . . . . . . . . 42 2.10.2. NFSv4 Integration . . . . . . . . . . . . . . . . . 43
2.10.3. Channels . . . . . . . . . . . . . . . . . . . . . . 44 2.10.3. Channels . . . . . . . . . . . . . . . . . . . . . . 44
2.10.4. Server Scope . . . . . . . . . . . . . . . . . . . . 45 2.10.4. Server Scope . . . . . . . . . . . . . . . . . . . . 45
2.10.5. Trunking . . . . . . . . . . . . . . . . . . . . . . 47 2.10.5. Trunking . . . . . . . . . . . . . . . . . . . . . . 48
2.10.6. Exactly Once Semantics . . . . . . . . . . . . . . . 51 2.10.6. Exactly Once Semantics . . . . . . . . . . . . . . . 51
2.10.7. RDMA Considerations . . . . . . . . . . . . . . . . 64 2.10.7. RDMA Considerations . . . . . . . . . . . . . . . . 64
2.10.8. Sessions Security . . . . . . . . . . . . . . . . . 66 2.10.8. Sessions Security . . . . . . . . . . . . . . . . . 67
2.10.9. The SSV GSS Mechanism . . . . . . . . . . . . . . . 72 2.10.9. The Secret State Verifier (SSV) GSS Mechanism . . . 72
2.10.10. Session Mechanics - Steady State . . . . . . . . . . 76 2.10.10. Session Mechanics - Steady State . . . . . . . . . . 76
2.10.11. Session Inactivity Timer . . . . . . . . . . . . . . 78 2.10.11. Session Inactivity Timer . . . . . . . . . . . . . . 78
2.10.12. Session Mechanics - Recovery . . . . . . . . . . . . 78 2.10.12. Session Mechanics - Recovery . . . . . . . . . . . . 78
2.10.13. Parallel NFS and Sessions . . . . . . . . . . . . . 83 2.10.13. Parallel NFS and Sessions . . . . . . . . . . . . . 83
3. Protocol Constants and Data Types . . . . . . . . . . . . . . 83 3. Protocol Constants and Data Types . . . . . . . . . . . . . . 84
3.1. Basic Constants . . . . . . . . . . . . . . . . . . . . 84 3.1. Basic Constants . . . . . . . . . . . . . . . . . . . . 84
3.2. Basic Data Types . . . . . . . . . . . . . . . . . . . . 84 3.2. Basic Data Types . . . . . . . . . . . . . . . . . . . . 85
3.3. Structured Data Types . . . . . . . . . . . . . . . . . 86 3.3. Structured Data Types . . . . . . . . . . . . . . . . . 86
4. Filehandles . . . . . . . . . . . . . . . . . . . . . . . . . 94 4. Filehandles . . . . . . . . . . . . . . . . . . . . . . . . . 95
4.1. Obtaining the First Filehandle . . . . . . . . . . . . . 95 4.1. Obtaining the First Filehandle . . . . . . . . . . . . . 95
4.1.1. Root Filehandle . . . . . . . . . . . . . . . . . . 95 4.1.1. Root Filehandle . . . . . . . . . . . . . . . . . . 95
4.1.2. Public Filehandle . . . . . . . . . . . . . . . . . 95 4.1.2. Public Filehandle . . . . . . . . . . . . . . . . . 96
4.2. Filehandle Types . . . . . . . . . . . . . . . . . . . . 96 4.2. Filehandle Types . . . . . . . . . . . . . . . . . . . . 96
4.2.1. General Properties of a Filehandle . . . . . . . . . 96 4.2.1. General Properties of a Filehandle . . . . . . . . . 97
4.2.2. Persistent Filehandle . . . . . . . . . . . . . . . 97 4.2.2. Persistent Filehandle . . . . . . . . . . . . . . . 97
4.2.3. Volatile Filehandle . . . . . . . . . . . . . . . . 97 4.2.3. Volatile Filehandle . . . . . . . . . . . . . . . . 98
4.3. One Method of Constructing a Volatile Filehandle . . . . 98 4.3. One Method of Constructing a Volatile Filehandle . . . . 99
4.4. Client Recovery from Filehandle Expiration . . . . . . . 99 4.4. Client Recovery from Filehandle Expiration . . . . . . . 99
5. File Attributes . . . . . . . . . . . . . . . . . . . . . . . 100 5. File Attributes . . . . . . . . . . . . . . . . . . . . . . . 100
5.1. REQUIRED Attributes . . . . . . . . . . . . . . . . . . 101 5.1. REQUIRED Attributes . . . . . . . . . . . . . . . . . . 102
5.2. RECOMMENDED Attributes . . . . . . . . . . . . . . . . . 101 5.2. RECOMMENDED Attributes . . . . . . . . . . . . . . . . . 102
5.3. Named Attributes . . . . . . . . . . . . . . . . . . . . 102 5.3. Named Attributes . . . . . . . . . . . . . . . . . . . . 102
5.4. Classification of Attributes . . . . . . . . . . . . . . 103 5.4. Classification of Attributes . . . . . . . . . . . . . . 104
5.5. Set-Only and Get-Only Attributes . . . . . . . . . . . . 104 5.5. Set-Only and Get-Only Attributes . . . . . . . . . . . . 105
5.6. REQUIRED Attributes - List and Definition References . . 104 5.6. REQUIRED Attributes - List and Definition References . . 105
5.7. RECOMMENDED Attributes - List and Definition 5.7. RECOMMENDED Attributes - List and Definition
References . . . . . . . . . . . . . . . . . . . . . . . 105 References . . . . . . . . . . . . . . . . . . . . . . . 106
5.8. Attribute Definitions . . . . . . . . . . . . . . . . . 107 5.8. Attribute Definitions . . . . . . . . . . . . . . . . . 108
5.8.1. Definitions of REQUIRED Attributes . . . . . . . . . 107 5.8.1. Definitions of REQUIRED Attributes . . . . . . . . . 108
5.8.2. Definitions of Uncategorized RECOMMENDED 5.8.2. Definitions of Uncategorized RECOMMENDED
Attributes . . . . . . . . . . . . . . . . . . . . . 109 Attributes . . . . . . . . . . . . . . . . . . . . . 110
5.9. Interpreting owner and owner_group . . . . . . . . . . . 116 5.9. Interpreting owner and owner_group . . . . . . . . . . . 116
5.10. Character Case Attributes . . . . . . . . . . . . . . . 118 5.10. Character Case Attributes . . . . . . . . . . . . . . . 118
5.11. Directory Notification Attributes . . . . . . . . . . . 118 5.11. Directory Notification Attributes . . . . . . . . . . . 119
5.12. pNFS Attribute Definitions . . . . . . . . . . . . . . . 118 5.12. pNFS Attribute Definitions . . . . . . . . . . . . . . . 119
5.13. Retention Attributes . . . . . . . . . . . . . . . . . . 120 5.13. Retention Attributes . . . . . . . . . . . . . . . . . . 121
6. Access Control Attributes . . . . . . . . . . . . . . . . . . 123 6. Access Control Attributes . . . . . . . . . . . . . . . . . . 124
6.1. Goals . . . . . . . . . . . . . . . . . . . . . . . . . 123 6.1. Goals . . . . . . . . . . . . . . . . . . . . . . . . . 124
6.2. File Attributes Discussion . . . . . . . . . . . . . . . 124 6.2. File Attributes Discussion . . . . . . . . . . . . . . . 125
6.2.1. Attribute 12: acl . . . . . . . . . . . . . . . . . 124 6.2.1. Attribute 12: acl . . . . . . . . . . . . . . . . . 125
6.2.2. Attribute 58: dacl . . . . . . . . . . . . . . . . . 139 6.2.2. Attribute 58: dacl . . . . . . . . . . . . . . . . . 140
6.2.3. Attribute 59: sacl . . . . . . . . . . . . . . . . . 139 6.2.3. Attribute 59: sacl . . . . . . . . . . . . . . . . . 140
6.2.4. Attribute 33: mode . . . . . . . . . . . . . . . . . 139 6.2.4. Attribute 33: mode . . . . . . . . . . . . . . . . . 140
6.2.5. Attribute 74: mode_set_masked . . . . . . . . . . . 140 6.2.5. Attribute 74: mode_set_masked . . . . . . . . . . . 141
6.3. Common Methods . . . . . . . . . . . . . . . . . . . . . 141 6.3. Common Methods . . . . . . . . . . . . . . . . . . . . . 142
6.3.1. Interpreting an ACL . . . . . . . . . . . . . . . . 141 6.3.1. Interpreting an ACL . . . . . . . . . . . . . . . . 142
6.3.2. Computing a Mode Attribute from an ACL . . . . . . . 142 6.3.2. Computing a Mode Attribute from an ACL . . . . . . . 143
6.4. Requirements . . . . . . . . . . . . . . . . . . . . . . 143 6.4. Requirements . . . . . . . . . . . . . . . . . . . . . . 144
6.4.1. Setting the mode and/or ACL Attributes . . . . . . . 143 6.4.1. Setting the mode and/or ACL Attributes . . . . . . . 144
6.4.2. Retrieving the mode and/or ACL Attributes . . . . . 145 6.4.2. Retrieving the mode and/or ACL Attributes . . . . . 146
6.4.3. Creating New Objects . . . . . . . . . . . . . . . . 145 6.4.3. Creating New Objects . . . . . . . . . . . . . . . . 146
7. Single-server Namespace . . . . . . . . . . . . . . . . . . . 149 7. Single-server Namespace . . . . . . . . . . . . . . . . . . . 150
7.1. Server Exports . . . . . . . . . . . . . . . . . . . . . 149 7.1. Server Exports . . . . . . . . . . . . . . . . . . . . . 151
7.2. Browsing Exports . . . . . . . . . . . . . . . . . . . . 150 7.2. Browsing Exports . . . . . . . . . . . . . . . . . . . . 151
7.3. Server Pseudo File System . . . . . . . . . . . . . . . 150 7.3. Server Pseudo File System . . . . . . . . . . . . . . . 151
7.4. Multiple Roots . . . . . . . . . . . . . . . . . . . . . 151 7.4. Multiple Roots . . . . . . . . . . . . . . . . . . . . . 152
7.5. Filehandle Volatility . . . . . . . . . . . . . . . . . 151 7.5. Filehandle Volatility . . . . . . . . . . . . . . . . . 152
7.6. Exported Root . . . . . . . . . . . . . . . . . . . . . 151 7.6. Exported Root . . . . . . . . . . . . . . . . . . . . . 153
7.7. Mount Point Crossing . . . . . . . . . . . . . . . . . . 152 7.7. Mount Point Crossing . . . . . . . . . . . . . . . . . . 153
7.8. Security Policy and Namespace Presentation . . . . . . . 152 7.8. Security Policy and Namespace Presentation . . . . . . . 153
8. State Management . . . . . . . . . . . . . . . . . . . . . . 153 8. State Management . . . . . . . . . . . . . . . . . . . . . . 154
8.1. Client and Session ID . . . . . . . . . . . . . . . . . 154 8.1. Client and Session ID . . . . . . . . . . . . . . . . . 155
8.2. Stateid Definition . . . . . . . . . . . . . . . . . . . 154 8.2. Stateid Definition . . . . . . . . . . . . . . . . . . . 156
8.2.1. Stateid Types . . . . . . . . . . . . . . . . . . . 155 8.2.1. Stateid Types . . . . . . . . . . . . . . . . . . . 156
8.2.2. Stateid Structure . . . . . . . . . . . . . . . . . 156 8.2.2. Stateid Structure . . . . . . . . . . . . . . . . . 157
8.2.3. Special Stateids . . . . . . . . . . . . . . . . . . 158 8.2.3. Special Stateids . . . . . . . . . . . . . . . . . . 159
8.2.4. Stateid Lifetime and Validation . . . . . . . . . . 159 8.2.4. Stateid Lifetime and Validation . . . . . . . . . . 160
8.2.5. Stateid Use for I/O Operations . . . . . . . . . . . 162 8.2.5. Stateid Use for I/O Operations . . . . . . . . . . . 163
8.2.6. Stateid Use for SETATTR Operations . . . . . . . . . 163 8.2.6. Stateid Use for SETATTR Operations . . . . . . . . . 164
8.3. Lease Renewal . . . . . . . . . . . . . . . . . . . . . 163 8.3. Lease Renewal . . . . . . . . . . . . . . . . . . . . . 164
8.4. Crash Recovery . . . . . . . . . . . . . . . . . . . . . 165 8.4. Crash Recovery . . . . . . . . . . . . . . . . . . . . . 167
8.4.1. Client Failure and Recovery . . . . . . . . . . . . 166 8.4.1. Client Failure and Recovery . . . . . . . . . . . . 167
8.4.2. Server Failure and Recovery . . . . . . . . . . . . 167 8.4.2. Server Failure and Recovery . . . . . . . . . . . . 168
8.4.3. Network Partitions and Recovery . . . . . . . . . . 171 8.4.3. Network Partitions and Recovery . . . . . . . . . . 173
8.5. Server Revocation of Locks . . . . . . . . . . . . . . . 176 8.5. Server Revocation of Locks . . . . . . . . . . . . . . . 178
8.6. Short and Long Leases . . . . . . . . . . . . . . . . . 177 8.6. Short and Long Leases . . . . . . . . . . . . . . . . . 179
8.7. Clocks, Propagation Delay, and Calculating Lease 8.7. Clocks, Propagation Delay, and Calculating Lease
Expiration . . . . . . . . . . . . . . . . . . . . . . . 177 Expiration . . . . . . . . . . . . . . . . . . . . . . . 179
8.8. Obsolete Locking Infrastructure From NFSv4.0 . . . . . . 178 8.8. Obsolete Locking Infrastructure From NFSv4.0 . . . . . . 180
9. File Locking and Share Reservations . . . . . . . . . . . . . 179 9. File Locking and Share Reservations . . . . . . . . . . . . . 181
9.1. Opens and Byte-Range Locks . . . . . . . . . . . . . . . 179 9.1. Opens and Byte-Range Locks . . . . . . . . . . . . . . . 181
9.1.1. State-owner Definition . . . . . . . . . . . . . . . 179 9.1.1. State-owner Definition . . . . . . . . . . . . . . . 181
9.1.2. Use of the Stateid and Locking . . . . . . . . . . . 179 9.1.2. Use of the Stateid and Locking . . . . . . . . . . . 181
9.2. Lock Ranges . . . . . . . . . . . . . . . . . . . . . . 182 9.2. Lock Ranges . . . . . . . . . . . . . . . . . . . . . . 184
9.3. Upgrading and Downgrading Locks . . . . . . . . . . . . 183 9.3. Upgrading and Downgrading Locks . . . . . . . . . . . . 185
9.4. Stateid Seqid Values and Byte-Range Locks . . . . . . . 183 9.4. Stateid Seqid Values and Byte-Range Locks . . . . . . . 185
9.5. Issues with Multiple Open-Owners . . . . . . . . . . . . 184 9.5. Issues with Multiple Open-Owners . . . . . . . . . . . . 186
9.6. Blocking Locks . . . . . . . . . . . . . . . . . . . . . 184 9.6. Blocking Locks . . . . . . . . . . . . . . . . . . . . . 186
9.7. Share Reservations . . . . . . . . . . . . . . . . . . . 185 9.7. Share Reservations . . . . . . . . . . . . . . . . . . . 187
9.8. OPEN/CLOSE Operations . . . . . . . . . . . . . . . . . 186 9.8. OPEN/CLOSE Operations . . . . . . . . . . . . . . . . . 188
9.9. Open Upgrade and Downgrade . . . . . . . . . . . . . . . 187 9.9. Open Upgrade and Downgrade . . . . . . . . . . . . . . . 189
9.10. Parallel OPENs . . . . . . . . . . . . . . . . . . . . . 188 9.10. Parallel OPENs . . . . . . . . . . . . . . . . . . . . . 190
9.11. Reclaim of Open and Byte-Range Locks . . . . . . . . . . 188 9.11. Reclaim of Open and Byte-Range Locks . . . . . . . . . . 190
10. Client-Side Caching . . . . . . . . . . . . . . . . . . . . . 189 10. Client-Side Caching . . . . . . . . . . . . . . . . . . . . . 191
10.1. Performance Challenges for Client-Side Caching . . . . . 189 10.1. Performance Challenges for Client-Side Caching . . . . . 191
10.2. Delegation and Callbacks . . . . . . . . . . . . . . . . 190 10.2. Delegation and Callbacks . . . . . . . . . . . . . . . . 192
10.2.1. Delegation Recovery . . . . . . . . . . . . . . . . 192 10.2.1. Delegation Recovery . . . . . . . . . . . . . . . . 194
10.3. Data Caching . . . . . . . . . . . . . . . . . . . . . . 195 10.3. Data Caching . . . . . . . . . . . . . . . . . . . . . . 197
10.3.1. Data Caching and OPENs . . . . . . . . . . . . . . . 195 10.3.1. Data Caching and OPENs . . . . . . . . . . . . . . . 197
10.3.2. Data Caching and File Locking . . . . . . . . . . . 196 10.3.2. Data Caching and File Locking . . . . . . . . . . . 198
10.3.3. Data Caching and Mandatory File Locking . . . . . . 198 10.3.3. Data Caching and Mandatory File Locking . . . . . . 200
10.3.4. Data Caching and File Identity . . . . . . . . . . . 198 10.3.4. Data Caching and File Identity . . . . . . . . . . . 200
10.4. Open Delegation . . . . . . . . . . . . . . . . . . . . 199 10.4. Open Delegation . . . . . . . . . . . . . . . . . . . . 201
10.4.1. Open Delegation and Data Caching . . . . . . . . . . 202 10.4.1. Open Delegation and Data Caching . . . . . . . . . . 204
10.4.2. Open Delegation and File Locks . . . . . . . . . . . 203 10.4.2. Open Delegation and File Locks . . . . . . . . . . . 205
10.4.3. Handling of CB_GETATTR . . . . . . . . . . . . . . . 203 10.4.3. Handling of CB_GETATTR . . . . . . . . . . . . . . . 205
10.4.4. Recall of Open Delegation . . . . . . . . . . . . . 206 10.4.4. Recall of Open Delegation . . . . . . . . . . . . . 208
10.4.5. Clients that Fail to Honor Delegation Recalls . . . 208 10.4.5. Clients that Fail to Honor Delegation Recalls . . . 210
10.4.6. Delegation Revocation . . . . . . . . . . . . . . . 209 10.4.6. Delegation Revocation . . . . . . . . . . . . . . . 211
10.4.7. Delegations via WANT_DELEGATION . . . . . . . . . . 209 10.4.7. Delegations via WANT_DELEGATION . . . . . . . . . . 211
10.5. Data Caching and Revocation . . . . . . . . . . . . . . 210 10.5. Data Caching and Revocation . . . . . . . . . . . . . . 212
10.5.1. Revocation Recovery for Write Open Delegation . . . 211 10.5.1. Revocation Recovery for Write Open Delegation . . . 213
10.6. Attribute Caching . . . . . . . . . . . . . . . . . . . 211 10.6. Attribute Caching . . . . . . . . . . . . . . . . . . . 213
10.7. Data and Metadata Caching and Memory Mapped Files . . . 213 10.7. Data and Metadata Caching and Memory Mapped Files . . . 215
10.8. Name and Directory Caching without Directory 10.8. Name and Directory Caching without Directory
Delegations . . . . . . . . . . . . . . . . . . . . . . 216 Delegations . . . . . . . . . . . . . . . . . . . . . . 218
10.8.1. Name Caching . . . . . . . . . . . . . . . . . . . . 216 10.8.1. Name Caching . . . . . . . . . . . . . . . . . . . . 218
10.8.2. Directory Caching . . . . . . . . . . . . . . . . . 217 10.8.2. Directory Caching . . . . . . . . . . . . . . . . . 219
10.9. Directory Delegations . . . . . . . . . . . . . . . . . 218 10.9. Directory Delegations . . . . . . . . . . . . . . . . . 220
10.9.1. Introduction to Directory Delegations . . . . . . . 218 10.9.1. Introduction to Directory Delegations . . . . . . . 220
10.9.2. Directory Delegation Design . . . . . . . . . . . . 219 10.9.2. Directory Delegation Design . . . . . . . . . . . . 221
10.9.3. Attributes in Support of Directory Notifications . . 220 10.9.3. Attributes in Support of Directory Notifications . . 222
10.9.4. Directory Delegation Recall . . . . . . . . . . . . 220 10.9.4. Directory Delegation Recall . . . . . . . . . . . . 222
10.9.5. Directory Delegation Recovery . . . . . . . . . . . 221 10.9.5. Directory Delegation Recovery . . . . . . . . . . . 223
11. Multi-Server Namespace . . . . . . . . . . . . . . . . . . . 221 11. Multi-Server Namespace . . . . . . . . . . . . . . . . . . . 223
11.1. Location Attributes . . . . . . . . . . . . . . . . . . 222 11.1. Location Attributes . . . . . . . . . . . . . . . . . . 224
11.2. File System Presence or Absence . . . . . . . . . . . . 222 11.2. File System Presence or Absence . . . . . . . . . . . . 224
11.3. Getting Attributes for an Absent File System . . . . . . 223 11.3. Getting Attributes for an Absent File System . . . . . . 225
11.3.1. GETATTR Within an Absent File System . . . . . . . . 224 11.3.1. GETATTR Within an Absent File System . . . . . . . . 226
11.3.2. READDIR and Absent File Systems . . . . . . . . . . 225 11.3.2. READDIR and Absent File Systems . . . . . . . . . . 227
11.4. Uses of Location Information . . . . . . . . . . . . . . 225 11.4. Uses of Location Information . . . . . . . . . . . . . . 227
11.4.1. File System Replication . . . . . . . . . . . . . . 226 11.4.1. File System Replication . . . . . . . . . . . . . . 228
11.4.2. File System Migration . . . . . . . . . . . . . . . 227 11.4.2. File System Migration . . . . . . . . . . . . . . . 229
11.4.3. Referrals . . . . . . . . . . . . . . . . . . . . . 228 11.4.3. Referrals . . . . . . . . . . . . . . . . . . . . . 230
11.5. Location Entries and Server Identity . . . . . . . . . . 230 11.5. Location Entries and Server Identity . . . . . . . . . . 232
11.6. Additional Client-side Considerations . . . . . . . . . 230 11.6. Additional Client-side Considerations . . . . . . . . . 232
11.7. Effecting File System Transitions . . . . . . . . . . . 231 11.7. Effecting File System Transitions . . . . . . . . . . . 233
11.7.1. File System Transitions and Simultaneous Access . . 232 11.7.1. File System Transitions and Simultaneous Access . . 234
11.7.2. Simultaneous Use and Transparent Transitions . . . . 233 11.7.2. Simultaneous Use and Transparent Transitions . . . . 235
11.7.3. Filehandles and File System Transitions . . . . . . 236 11.7.3. Filehandles and File System Transitions . . . . . . 238
11.7.4. Fileids and File System Transitions . . . . . . . . 236 11.7.4. Fileids and File System Transitions . . . . . . . . 238
11.7.5. Fsids and File System Transitions . . . . . . . . . 237 11.7.5. Fsids and File System Transitions . . . . . . . . . 239
11.7.6. The Change Attribute and File System Transitions . . 238 11.7.6. The Change Attribute and File System Transitions . . 240
11.7.7. Lock State and File System Transitions . . . . . . . 238 11.7.7. Lock State and File System Transitions . . . . . . . 240
11.7.8. Write Verifiers and File System Transitions . . . . 243 11.7.8. Write Verifiers and File System Transitions . . . . 245
11.7.9. Readdir Cookies and Verifiers and File System 11.7.9. Readdir Cookies and Verifiers and File System
Transitions . . . . . . . . . . . . . . . . . . . . 243 Transitions . . . . . . . . . . . . . . . . . . . . 245
11.7.10. File System Data and File System Transitions . . . . 243 11.7.10. File System Data and File System Transitions . . . . 245
11.8. Effecting File System Referrals . . . . . . . . . . . . 245 11.8. Effecting File System Referrals . . . . . . . . . . . . 247
11.8.1. Referral Example (LOOKUP) . . . . . . . . . . . . . 245 11.8.1. Referral Example (LOOKUP) . . . . . . . . . . . . . 247
11.8.2. Referral Example (READDIR) . . . . . . . . . . . . . 249 11.8.2. Referral Example (READDIR) . . . . . . . . . . . . . 251
11.9. The Attribute fs_locations . . . . . . . . . . . . . . . 251 11.9. The Attribute fs_locations . . . . . . . . . . . . . . . 253
11.10. The Attribute fs_locations_info . . . . . . . . . . . . 254 11.10. The Attribute fs_locations_info . . . . . . . . . . . . 256
11.10.1. The fs_locations_server4 Structure . . . . . . . . . 258 11.10.1. The fs_locations_server4 Structure . . . . . . . . . 260
11.10.2. The fs_locations_info4 Structure . . . . . . . . . . 263 11.10.2. The fs_locations_info4 Structure . . . . . . . . . . 265
11.10.3. The fs_locations_item4 Structure . . . . . . . . . . 264 11.10.3. The fs_locations_item4 Structure . . . . . . . . . . 266
11.11. The Attribute fs_status . . . . . . . . . . . . . . . . 266 11.11. The Attribute fs_status . . . . . . . . . . . . . . . . 268
12. Parallel NFS (pNFS) . . . . . . . . . . . . . . . . . . . . . 270 12. Parallel NFS (pNFS) . . . . . . . . . . . . . . . . . . . . . 272
12.1. Introduction . . . . . . . . . . . . . . . . . . . . . . 270 12.1. Introduction . . . . . . . . . . . . . . . . . . . . . . 272
12.2. pNFS Definitions . . . . . . . . . . . . . . . . . . . . 272 12.2. pNFS Definitions . . . . . . . . . . . . . . . . . . . . 273
12.2.1. Metadata . . . . . . . . . . . . . . . . . . . . . . 273 12.2.1. Metadata . . . . . . . . . . . . . . . . . . . . . . 274
12.2.2. Metadata Server . . . . . . . . . . . . . . . . . . 273 12.2.2. Metadata Server . . . . . . . . . . . . . . . . . . 274
12.2.3. pNFS Client . . . . . . . . . . . . . . . . . . . . 273 12.2.3. pNFS Client . . . . . . . . . . . . . . . . . . . . 274
12.2.4. Storage Device . . . . . . . . . . . . . . . . . . . 273 12.2.4. Storage Device . . . . . . . . . . . . . . . . . . . 274
12.2.5. Storage Protocol . . . . . . . . . . . . . . . . . . 273 12.2.5. Storage Protocol . . . . . . . . . . . . . . . . . . 275
12.2.6. Control Protocol . . . . . . . . . . . . . . . . . . 273 12.2.6. Control Protocol . . . . . . . . . . . . . . . . . . 275
12.2.7. Layout Types . . . . . . . . . . . . . . . . . . . . 274 12.2.7. Layout Types . . . . . . . . . . . . . . . . . . . . 276
12.2.8. Layout . . . . . . . . . . . . . . . . . . . . . . . 274 12.2.8. Layout . . . . . . . . . . . . . . . . . . . . . . . 276
12.2.9. Layout Iomode . . . . . . . . . . . . . . . . . . . 275 12.2.9. Layout Iomode . . . . . . . . . . . . . . . . . . . 277
12.2.10. Device IDs . . . . . . . . . . . . . . . . . . . . . 275 12.2.10. Device IDs . . . . . . . . . . . . . . . . . . . . . 277
12.3. pNFS Operations . . . . . . . . . . . . . . . . . . . . 277 12.3. pNFS Operations . . . . . . . . . . . . . . . . . . . . 279
12.4. pNFS Attributes . . . . . . . . . . . . . . . . . . . . 278 12.4. pNFS Attributes . . . . . . . . . . . . . . . . . . . . 280
12.5. Layout Semantics . . . . . . . . . . . . . . . . . . . . 278 12.5. Layout Semantics . . . . . . . . . . . . . . . . . . . . 280
12.5.1. Guarantees Provided by Layouts . . . . . . . . . . . 278 12.5.1. Guarantees Provided by Layouts . . . . . . . . . . . 280
12.5.2. Getting a Layout . . . . . . . . . . . . . . . . . . 279 12.5.2. Getting a Layout . . . . . . . . . . . . . . . . . . 281
12.5.3. Layout Stateid . . . . . . . . . . . . . . . . . . . 280 12.5.3. Layout Stateid . . . . . . . . . . . . . . . . . . . 282
12.5.4. Committing a Layout . . . . . . . . . . . . . . . . 281 12.5.4. Committing a Layout . . . . . . . . . . . . . . . . 283
12.5.5. Recalling a Layout . . . . . . . . . . . . . . . . . 284 12.5.5. Recalling a Layout . . . . . . . . . . . . . . . . . 286
12.5.6. Revoking Layouts . . . . . . . . . . . . . . . . . . 292 12.5.6. Revoking Layouts . . . . . . . . . . . . . . . . . . 294
12.5.7. Metadata Server Write Propagation . . . . . . . . . 293 12.5.7. Metadata Server Write Propagation . . . . . . . . . 295
12.6. pNFS Mechanics . . . . . . . . . . . . . . . . . . . . . 293 12.6. pNFS Mechanics . . . . . . . . . . . . . . . . . . . . . 295
12.7. Recovery . . . . . . . . . . . . . . . . . . . . . . . . 294 12.7. Recovery . . . . . . . . . . . . . . . . . . . . . . . . 296
12.7.1. Recovery from Client Restart . . . . . . . . . . . . 295 12.7.1. Recovery from Client Restart . . . . . . . . . . . . 297
12.7.2. Dealing with Lease Expiration on the Client . . . . 295 12.7.2. Dealing with Lease Expiration on the Client . . . . 297
12.7.3. Dealing with Loss of Layout State on the Metadata 12.7.3. Dealing with Loss of Layout State on the Metadata
Server . . . . . . . . . . . . . . . . . . . . . . . 296 Server . . . . . . . . . . . . . . . . . . . . . . . 298
12.7.4. Recovery from Metadata Server Restart . . . . . . . 297 12.7.4. Recovery from Metadata Server Restart . . . . . . . 299
12.7.5. Operations During Metadata Server Grace Period . . . 299 12.7.5. Operations During Metadata Server Grace Period . . . 301
12.7.6. Storage Device Recovery . . . . . . . . . . . . . . 299 12.7.6. Storage Device Recovery . . . . . . . . . . . . . . 301
12.8. Metadata and Storage Device Roles . . . . . . . . . . . 299 12.8. Metadata and Storage Device Roles . . . . . . . . . . . 301
12.9. Security Considerations for pNFS . . . . . . . . . . . . 300 12.9. Security Considerations for pNFS . . . . . . . . . . . . 302
13. PNFS: NFSv4.1 File Layout Type . . . . . . . . . . . . . . . 301 13. NFSv4.1 as a Storage Protocol in pNFS: the File Layout Type . 303
13.1. Client ID and Session Considerations . . . . . . . . . . 301 13.1. Client ID and Session Considerations . . . . . . . . . . 303
13.1.1. Sessions Considerations for Data Servers . . . . . . 303 13.1.1. Sessions Considerations for Data Servers . . . . . . 306
13.2. File Layout Definitions . . . . . . . . . . . . . . . . 304 13.2. File Layout Definitions . . . . . . . . . . . . . . . . 306
13.3. File Layout Data Types . . . . . . . . . . . . . . . . . 305 13.3. File Layout Data Types . . . . . . . . . . . . . . . . . 307
13.4. Interpreting the File Layout . . . . . . . . . . . . . . 309 13.4. Interpreting the File Layout . . . . . . . . . . . . . . 311
13.4.1. Determining the Stripe Unit Number . . . . . . . . . 309 13.4.1. Determining the Stripe Unit Number . . . . . . . . . 311
13.4.2. Interpreting the File Layout Using Sparse Packing . 309 13.4.2. Interpreting the File Layout Using Sparse Packing . 311
13.4.3. Interpreting the File Layout Using Dense Packing . . 311 13.4.3. Interpreting the File Layout Using Dense Packing . . 314
13.4.4. Sparse and Dense Stripe Unit Packing . . . . . . . . 314 13.4.4. Sparse and Dense Stripe Unit Packing . . . . . . . . 316
13.5. Data Server Multipathing . . . . . . . . . . . . . . . . 315 13.5. Data Server Multipathing . . . . . . . . . . . . . . . . 318
13.6. Operations Sent to NFSv4.1 Data Servers . . . . . . . . 316 13.6. Operations Sent to NFSv4.1 Data Servers . . . . . . . . 319
13.7. COMMIT Through Metadata Server . . . . . . . . . . . . . 319 13.7. COMMIT Through Metadata Server . . . . . . . . . . . . . 321
13.8. The Layout Iomode . . . . . . . . . . . . . . . . . . . 320 13.8. The Layout Iomode . . . . . . . . . . . . . . . . . . . 323
13.9. Metadata and Data Server State Coordination . . . . . . 320 13.9. Metadata and Data Server State Coordination . . . . . . 323
13.9.1. Global Stateid Requirements . . . . . . . . . . . . 320 13.9.1. Global Stateid Requirements . . . . . . . . . . . . 323
13.9.2. Data Server State Propagation . . . . . . . . . . . 321 13.9.2. Data Server State Propagation . . . . . . . . . . . 324
13.10. Data Server Component File Size . . . . . . . . . . . . 323 13.10. Data Server Component File Size . . . . . . . . . . . . 326
13.11. Layout Revocation and Fencing . . . . . . . . . . . . . 324 13.11. Layout Revocation and Fencing . . . . . . . . . . . . . 327
13.12. Security Considerations for the File Layout Type . . . . 325 13.12. Security Considerations for the File Layout Type . . . . 327
14. Internationalization . . . . . . . . . . . . . . . . . . . . 326 14. Internationalization . . . . . . . . . . . . . . . . . . . . 328
14.1. Stringprep profile for the utf8str_cs type . . . . . . . 327 14.1. Stringprep profile for the utf8str_cs type . . . . . . . 329
14.2. Stringprep profile for the utf8str_cis type . . . . . . 328 14.2. Stringprep profile for the utf8str_cis type . . . . . . 331
14.3. Stringprep profile for the utf8str_mixed type . . . . . 330 14.3. Stringprep profile for the utf8str_mixed type . . . . . 332
14.4. UTF-8 Capabilities . . . . . . . . . . . . . . . . . . . 331 14.4. UTF-8 Capabilities . . . . . . . . . . . . . . . . . . . 334
14.5. UTF-8 Related Errors . . . . . . . . . . . . . . . . . . 331 14.5. UTF-8 Related Errors . . . . . . . . . . . . . . . . . . 334
15. Error Values . . . . . . . . . . . . . . . . . . . . . . . . 332 15. Error Values . . . . . . . . . . . . . . . . . . . . . . . . 335
15.1. Error Definitions . . . . . . . . . . . . . . . . . . . 332 15.1. Error Definitions . . . . . . . . . . . . . . . . . . . 335
15.1.1. General Errors . . . . . . . . . . . . . . . . . . . 334 15.1.1. General Errors . . . . . . . . . . . . . . . . . . . 337
15.1.2. Filehandle Errors . . . . . . . . . . . . . . . . . 336 15.1.2. Filehandle Errors . . . . . . . . . . . . . . . . . 339
15.1.3. Compound Structure Errors . . . . . . . . . . . . . 338 15.1.3. Compound Structure Errors . . . . . . . . . . . . . 340
15.1.4. File System Errors . . . . . . . . . . . . . . . . . 339 15.1.4. File System Errors . . . . . . . . . . . . . . . . . 342
15.1.5. State Management Errors . . . . . . . . . . . . . . 341 15.1.5. State Management Errors . . . . . . . . . . . . . . 344
15.1.6. Security Errors . . . . . . . . . . . . . . . . . . 342 15.1.6. Security Errors . . . . . . . . . . . . . . . . . . 345
15.1.7. Name Errors . . . . . . . . . . . . . . . . . . . . 343 15.1.7. Name Errors . . . . . . . . . . . . . . . . . . . . 345
15.1.8. Locking Errors . . . . . . . . . . . . . . . . . . . 343 15.1.8. Locking Errors . . . . . . . . . . . . . . . . . . . 346
15.1.9. Reclaim Errors . . . . . . . . . . . . . . . . . . . 345 15.1.9. Reclaim Errors . . . . . . . . . . . . . . . . . . . 347
15.1.10. pNFS Errors . . . . . . . . . . . . . . . . . . . . 345 15.1.10. pNFS Errors . . . . . . . . . . . . . . . . . . . . 348
15.1.11. Session Use Errors . . . . . . . . . . . . . . . . . 347 15.1.11. Session Use Errors . . . . . . . . . . . . . . . . . 349
15.1.12. Session Management Errors . . . . . . . . . . . . . 348 15.1.12. Session Management Errors . . . . . . . . . . . . . 351
15.1.13. Client Management Errors . . . . . . . . . . . . . . 348 15.1.13. Client Management Errors . . . . . . . . . . . . . . 351
15.1.14. Delegation Errors . . . . . . . . . . . . . . . . . 349 15.1.14. Delegation Errors . . . . . . . . . . . . . . . . . 352
15.1.15. Attribute Handling Errors . . . . . . . . . . . . . 350 15.1.15. Attribute Handling Errors . . . . . . . . . . . . . 352
15.1.16. Obsoleted Errors . . . . . . . . . . . . . . . . . . 350 15.1.16. Obsoleted Errors . . . . . . . . . . . . . . . . . . 353
15.2. Operations and their valid errors . . . . . . . . . . . 351 15.2. Operations and their valid errors . . . . . . . . . . . 354
15.3. Callback operations and their valid errors . . . . . . . 367 15.3. Callback operations and their valid errors . . . . . . . 370
15.4. Errors and the operations that use them . . . . . . . . 369 15.4. Errors and the operations that use them . . . . . . . . 372
16. NFSv4.1 Procedures . . . . . . . . . . . . . . . . . . . . . 383 16. NFSv4.1 Procedures . . . . . . . . . . . . . . . . . . . . . 387
16.1. Procedure 0: NULL - No Operation . . . . . . . . . . . . 383 16.1. Procedure 0: NULL - No Operation . . . . . . . . . . . . 387
16.2. Procedure 1: COMPOUND - Compound Operations . . . . . . 384 16.2. Procedure 1: COMPOUND - Compound Operations . . . . . . 388
17. Operations: REQUIRED, RECOMMENDED, or OPTIONAL . . . . . . . 395 17. Operations: REQUIRED, RECOMMENDED, or OPTIONAL . . . . . . . 399
18. NFSv4.1 Operations . . . . . . . . . . . . . . . . . . . . . 398 18. NFSv4.1 Operations . . . . . . . . . . . . . . . . . . . . . 402
18.1. Operation 3: ACCESS - Check Access Rights . . . . . . . 398 18.1. Operation 3: ACCESS - Check Access Rights . . . . . . . 402
18.2. Operation 4: CLOSE - Close File . . . . . . . . . . . . 404 18.2. Operation 4: CLOSE - Close File . . . . . . . . . . . . 408
18.3. Operation 5: COMMIT - Commit Cached Data . . . . . . . . 405 18.3. Operation 5: COMMIT - Commit Cached Data . . . . . . . . 409
18.4. Operation 6: CREATE - Create a Non-Regular File Object . 408 18.4. Operation 6: CREATE - Create a Non-Regular File Object . 412
18.5. Operation 7: DELEGPURGE - Purge Delegations Awaiting 18.5. Operation 7: DELEGPURGE - Purge Delegations Awaiting
Recovery . . . . . . . . . . . . . . . . . . . . . . . . 411 Recovery . . . . . . . . . . . . . . . . . . . . . . . . 415
18.6. Operation 8: DELEGRETURN - Return Delegation . . . . . . 412 18.6. Operation 8: DELEGRETURN - Return Delegation . . . . . . 416
18.7. Operation 9: GETATTR - Get Attributes . . . . . . . . . 412 18.7. Operation 9: GETATTR - Get Attributes . . . . . . . . . 416
18.8. Operation 10: GETFH - Get Current Filehandle . . . . . . 414 18.8. Operation 10: GETFH - Get Current Filehandle . . . . . . 418
18.9. Operation 11: LINK - Create Link to a File . . . . . . . 415 18.9. Operation 11: LINK - Create Link to a File . . . . . . . 419
18.10. Operation 12: LOCK - Create Lock . . . . . . . . . . . . 418 18.10. Operation 12: LOCK - Create Lock . . . . . . . . . . . . 422
18.11. Operation 13: LOCKT - Test For Lock . . . . . . . . . . 422 18.11. Operation 13: LOCKT - Test For Lock . . . . . . . . . . 426
18.12. Operation 14: LOCKU - Unlock File . . . . . . . . . . . 423 18.12. Operation 14: LOCKU - Unlock File . . . . . . . . . . . 427
18.13. Operation 15: LOOKUP - Lookup Filename . . . . . . . . . 425 18.13. Operation 15: LOOKUP - Lookup Filename . . . . . . . . . 429
18.14. Operation 16: LOOKUPP - Lookup Parent Directory . . . . 426 18.14. Operation 16: LOOKUPP - Lookup Parent Directory . . . . 430
18.15. Operation 17: NVERIFY - Verify Difference in 18.15. Operation 17: NVERIFY - Verify Difference in
Attributes . . . . . . . . . . . . . . . . . . . . . . . 428 Attributes . . . . . . . . . . . . . . . . . . . . . . . 432
18.16. Operation 18: OPEN - Open a Regular File . . . . . . . . 429 18.16. Operation 18: OPEN - Open a Regular File . . . . . . . . 433
18.17. Operation 19: OPENATTR - Open Named Attribute 18.17. Operation 19: OPENATTR - Open Named Attribute
Directory . . . . . . . . . . . . . . . . . . . . . . . 448 Directory . . . . . . . . . . . . . . . . . . . . . . . 452
18.18. Operation 21: OPEN_DOWNGRADE - Reduce Open File Access . 449 18.18. Operation 21: OPEN_DOWNGRADE - Reduce Open File Access . 453
18.19. Operation 22: PUTFH - Set Current Filehandle . . . . . . 451 18.19. Operation 22: PUTFH - Set Current Filehandle . . . . . . 455
18.20. Operation 23: PUTPUBFH - Set Public Filehandle . . . . . 451 18.20. Operation 23: PUTPUBFH - Set Public Filehandle . . . . . 455
18.21. Operation 24: PUTROOTFH - Set Root Filehandle . . . . . 453 18.21. Operation 24: PUTROOTFH - Set Root Filehandle . . . . . 457
18.22. Operation 25: READ - Read from File . . . . . . . . . . 454 18.22. Operation 25: READ - Read from File . . . . . . . . . . 458
18.23. Operation 26: READDIR - Read Directory . . . . . . . . . 456 18.23. Operation 26: READDIR - Read Directory . . . . . . . . . 460
18.24. Operation 27: READLINK - Read Symbolic Link . . . . . . 460 18.24. Operation 27: READLINK - Read Symbolic Link . . . . . . 464
18.25. Operation 28: REMOVE - Remove File System Object . . . . 461 18.25. Operation 28: REMOVE - Remove File System Object . . . . 465
18.26. Operation 29: RENAME - Rename Directory Entry . . . . . 463 18.26. Operation 29: RENAME - Rename Directory Entry . . . . . 467
18.27. Operation 31: RESTOREFH - Restore Saved Filehandle . . . 467 18.27. Operation 31: RESTOREFH - Restore Saved Filehandle . . . 471
18.28. Operation 32: SAVEFH - Save Current Filehandle . . . . . 468 18.28. Operation 32: SAVEFH - Save Current Filehandle . . . . . 472
18.29. Operation 33: SECINFO - Obtain Available Security . . . 469 18.29. Operation 33: SECINFO - Obtain Available Security . . . 473
18.30. Operation 34: SETATTR - Set Attributes . . . . . . . . . 473 18.30. Operation 34: SETATTR - Set Attributes . . . . . . . . . 477
18.31. Operation 37: VERIFY - Verify Same Attributes . . . . . 476 18.31. Operation 37: VERIFY - Verify Same Attributes . . . . . 480
18.32. Operation 38: WRITE - Write to File . . . . . . . . . . 477 18.32. Operation 38: WRITE - Write to File . . . . . . . . . . 481
18.33. Operation 40: BACKCHANNEL_CTL - Backchannel Control . . 481 18.33. Operation 40: BACKCHANNEL_CTL - Backchannel Control . . 485
18.34. Operation 41: BIND_CONN_TO_SESSION - Associate 18.34. Operation 41: BIND_CONN_TO_SESSION - Associate
Connection with Session . . . . . . . . . . . . . . . . 483 Connection with Session . . . . . . . . . . . . . . . . 487
18.35. Operation 42: EXCHANGE_ID - Instantiate Client ID . . . 486 18.35. Operation 42: EXCHANGE_ID - Instantiate Client ID . . . 490
18.36. Operation 43: CREATE_SESSION - Create New Session and 18.36. Operation 43: CREATE_SESSION - Create New Session and
Confirm Client ID . . . . . . . . . . . . . . . . . . . 503 Confirm Client ID . . . . . . . . . . . . . . . . . . . 507
18.37. Operation 44: DESTROY_SESSION - Destroy a Session . . . 513 18.37. Operation 44: DESTROY_SESSION - Destroy a Session . . . 517
18.38. Operation 45: FREE_STATEID - Free Stateid with No 18.38. Operation 45: FREE_STATEID - Free Stateid with No
Locks . . . . . . . . . . . . . . . . . . . . . . . . . 514 Locks . . . . . . . . . . . . . . . . . . . . . . . . . 518
18.39. Operation 46: GET_DIR_DELEGATION - Get a directory 18.39. Operation 46: GET_DIR_DELEGATION - Get a directory
delegation . . . . . . . . . . . . . . . . . . . . . . . 515 delegation . . . . . . . . . . . . . . . . . . . . . . . 519
18.40. Operation 47: GETDEVICEINFO - Get Device Information . . 519 18.40. Operation 47: GETDEVICEINFO - Get Device Information . . 523
18.41. Operation 48: GETDEVICELIST - Get All Device Mappings 18.41. Operation 48: GETDEVICELIST - Get All Device Mappings
for a File System . . . . . . . . . . . . . . . . . . . 521 for a File System . . . . . . . . . . . . . . . . . . . 525
18.42. Operation 49: LAYOUTCOMMIT - Commit Writes Made Using 18.42. Operation 49: LAYOUTCOMMIT - Commit Writes Made Using
a Layout . . . . . . . . . . . . . . . . . . . . . . . . 523 a Layout . . . . . . . . . . . . . . . . . . . . . . . . 527
18.43. Operation 50: LAYOUTGET - Get Layout Information . . . . 526 18.43. Operation 50: LAYOUTGET - Get Layout Information . . . . 530
18.44. Operation 51: LAYOUTRETURN - Release Layout 18.44. Operation 51: LAYOUTRETURN - Release Layout
Information . . . . . . . . . . . . . . . . . . . . . . 536 Information . . . . . . . . . . . . . . . . . . . . . . 540
18.45. Operation 52: SECINFO_NO_NAME - Get Security on 18.45. Operation 52: SECINFO_NO_NAME - Get Security on
Unnamed Object . . . . . . . . . . . . . . . . . . . . . 540 Unnamed Object . . . . . . . . . . . . . . . . . . . . . 544
18.46. Operation 53: SEQUENCE - Supply Per-Procedure 18.46. Operation 53: SEQUENCE - Supply Per-Procedure
Sequencing and Control . . . . . . . . . . . . . . . . . 541 Sequencing and Control . . . . . . . . . . . . . . . . . 545
18.47. Operation 54: SET_SSV - Update SSV for a Client ID . . . 547 18.47. Operation 54: SET_SSV - Update SSV for a Client ID . . . 551
18.48. Operation 55: TEST_STATEID - Test Stateids for 18.48. Operation 55: TEST_STATEID - Test Stateids for
Validity . . . . . . . . . . . . . . . . . . . . . . . . 549 Validity . . . . . . . . . . . . . . . . . . . . . . . . 553
18.49. Operation 56: WANT_DELEGATION - Request Delegation . . . 551 18.49. Operation 56: WANT_DELEGATION - Request Delegation . . . 555
18.50. Operation 57: DESTROY_CLIENTID - Destroy a Client ID . . 555 18.50. Operation 57: DESTROY_CLIENTID - Destroy a Client ID . . 559
18.51. Operation 58: RECLAIM_COMPLETE - Indicates Reclaims 18.51. Operation 58: RECLAIM_COMPLETE - Indicates Reclaims
Finished . . . . . . . . . . . . . . . . . . . . . . . . 555 Finished . . . . . . . . . . . . . . . . . . . . . . . . 559
18.52. Operation 10044: ILLEGAL - Illegal operation . . . . . . 558 18.52. Operation 10044: ILLEGAL - Illegal operation . . . . . . 562
19. NFSv4.1 Callback Procedures . . . . . . . . . . . . . . . . . 558 19. NFSv4.1 Callback Procedures . . . . . . . . . . . . . . . . . 562
19.1. Procedure 0: CB_NULL - No Operation . . . . . . . . . . 559 19.1. Procedure 0: CB_NULL - No Operation . . . . . . . . . . 563
19.2. Procedure 1: CB_COMPOUND - Compound Operations . . . . . 559 19.2. Procedure 1: CB_COMPOUND - Compound Operations . . . . . 563
20. NFSv4.1 Callback Operations . . . . . . . . . . . . . . . . . 563 20. NFSv4.1 Callback Operations . . . . . . . . . . . . . . . . . 567
20.1. Operation 3: CB_GETATTR - Get Attributes . . . . . . . . 563 20.1. Operation 3: CB_GETATTR - Get Attributes . . . . . . . . 567
20.2. Operation 4: CB_RECALL - Recall a Delegation . . . . . . 564 20.2. Operation 4: CB_RECALL - Recall a Delegation . . . . . . 568
20.3. Operation 5: CB_LAYOUTRECALL - Recall Layout from 20.3. Operation 5: CB_LAYOUTRECALL - Recall Layout from
Client . . . . . . . . . . . . . . . . . . . . . . . . . 565 Client . . . . . . . . . . . . . . . . . . . . . . . . . 569
20.4. Operation 6: CB_NOTIFY - Notify Client of Directory 20.4. Operation 6: CB_NOTIFY - Notify Client of Directory
Changes . . . . . . . . . . . . . . . . . . . . . . . . 569 Changes . . . . . . . . . . . . . . . . . . . . . . . . 573
20.5. Operation 7: CB_PUSH_DELEG - Offer Previously 20.5. Operation 7: CB_PUSH_DELEG - Offer Previously
Requested Delegation to Client . . . . . . . . . . . . . 573 Requested Delegation to Client . . . . . . . . . . . . . 577
20.6. Operation 8: CB_RECALL_ANY - Keep Any N Recallable 20.6. Operation 8: CB_RECALL_ANY - Keep Any N Recallable
Objects . . . . . . . . . . . . . . . . . . . . . . . . 574 Objects . . . . . . . . . . . . . . . . . . . . . . . . 578
20.7. Operation 9: CB_RECALLABLE_OBJ_AVAIL - Signal 20.7. Operation 9: CB_RECALLABLE_OBJ_AVAIL - Signal
Resources for Recallable Objects . . . . . . . . . . . . 577 Resources for Recallable Objects . . . . . . . . . . . . 581
20.8. Operation 10: CB_RECALL_SLOT - Change Flow Control 20.8. Operation 10: CB_RECALL_SLOT - Change Flow Control
Limits . . . . . . . . . . . . . . . . . . . . . . . . . 578 Limits . . . . . . . . . . . . . . . . . . . . . . . . . 582
20.9. Operation 11: CB_SEQUENCE - Supply Backchannel 20.9. Operation 11: CB_SEQUENCE - Supply Backchannel
Sequencing and Control . . . . . . . . . . . . . . . . . 579 Sequencing and Control . . . . . . . . . . . . . . . . . 583
20.10. Operation 12: CB_WANTS_CANCELLED - Cancel Pending 20.10. Operation 12: CB_WANTS_CANCELLED - Cancel Pending
Delegation Wants . . . . . . . . . . . . . . . . . . . . 581 Delegation Wants . . . . . . . . . . . . . . . . . . . . 585
20.11. Operation 13: CB_NOTIFY_LOCK - Notify Client of 20.11. Operation 13: CB_NOTIFY_LOCK - Notify Client of
Possible Lock Availability . . . . . . . . . . . . . . . 582 Possible Lock Availability . . . . . . . . . . . . . . . 586
20.12. Operation 14: CB_NOTIFY_DEVICEID - Notify Client of 20.12. Operation 14: CB_NOTIFY_DEVICEID - Notify Client of
Device ID Changes . . . . . . . . . . . . . . . . . . . 584 Device ID Changes . . . . . . . . . . . . . . . . . . . 588
20.13. Operation 10044: CB_ILLEGAL - Illegal Callback 20.13. Operation 10044: CB_ILLEGAL - Illegal Callback
Operation . . . . . . . . . . . . . . . . . . . . . . . 586 Operation . . . . . . . . . . . . . . . . . . . . . . . 590
21. Security Considerations . . . . . . . . . . . . . . . . . . . 586 21. Security Considerations . . . . . . . . . . . . . . . . . . . 590
22. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 588 22. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 592
22.1. Named Attribute Definitions . . . . . . . . . . . . . . 588 22.1. Named Attribute Definitions . . . . . . . . . . . . . . 592
22.1.1. Initial Registry . . . . . . . . . . . . . . . . . . 589 22.1.1. Initial Registry . . . . . . . . . . . . . . . . . . 593
22.1.2. Updating Registrations . . . . . . . . . . . . . . . 589 22.1.2. Updating Registrations . . . . . . . . . . . . . . . 593
22.2. Device ID Notifications . . . . . . . . . . . . . . . . 589 22.2. Device ID Notifications . . . . . . . . . . . . . . . . 593
22.2.1. Initial Registry . . . . . . . . . . . . . . . . . . 590 22.2.1. Initial Registry . . . . . . . . . . . . . . . . . . 594
22.2.2. Updating Registrations . . . . . . . . . . . . . . . 590 22.2.2. Updating Registrations . . . . . . . . . . . . . . . 594
22.3. Object Recall Types . . . . . . . . . . . . . . . . . . 590 22.3. Object Recall Types . . . . . . . . . . . . . . . . . . 594
22.3.1. Initial Registry . . . . . . . . . . . . . . . . . . 592 22.3.1. Initial Registry . . . . . . . . . . . . . . . . . . 596
22.3.2. Updating Registrations . . . . . . . . . . . . . . . 592 22.3.2. Updating Registrations . . . . . . . . . . . . . . . 596
22.4. Layout Types . . . . . . . . . . . . . . . . . . . . . . 592 22.4. Layout Types . . . . . . . . . . . . . . . . . . . . . . 596
22.4.1. Initial Registry . . . . . . . . . . . . . . . . . . 593 22.4.1. Initial Registry . . . . . . . . . . . . . . . . . . 597
22.4.2. Updating Registrations . . . . . . . . . . . . . . . 593 22.4.2. Updating Registrations . . . . . . . . . . . . . . . 597
22.4.3. Guidelines for Writing Layout Type Specifications . 593 22.4.3. Guidelines for Writing Layout Type Specifications . 597
22.5. Path Variable Definitions . . . . . . . . . . . . . . . 595 22.5. Path Variable Definitions . . . . . . . . . . . . . . . 599
22.5.1. Path Variables Registry . . . . . . . . . . . . . . 595 22.5.1. Path Variables Registry . . . . . . . . . . . . . . 599
22.5.2. Values for the ${ietf.org:CPU_ARCH} Variable . . . . 597 22.5.2. Values for the ${ietf.org:CPU_ARCH} Variable . . . . 601
22.5.3. Values for the ${ietf.org:OS_TYPE} Variable . . . . 597 22.5.3. Values for the ${ietf.org:OS_TYPE} Variable . . . . 601
23. References . . . . . . . . . . . . . . . . . . . . . . . . . 598 23. References . . . . . . . . . . . . . . . . . . . . . . . . . 602
23.1. Normative References . . . . . . . . . . . . . . . . . . 598 23.1. Normative References . . . . . . . . . . . . . . . . . . 602
23.2. Informative References . . . . . . . . . . . . . . . . . 600 23.2. Informative References . . . . . . . . . . . . . . . . . 605
Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . 601 Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . 606
Appendix B. RFC Editor Notes . . . . . . . . . . . . . . . . . . 603 Appendix B. RFC Editor Notes . . . . . . . . . . . . . . . . . . 608
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 604 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 609
Intellectual Property and Copyright Statements . . . . . . . . . 605 Intellectual Property and Copyright Statements . . . . . . . . . 610
1. Introduction 1. Introduction
1.1. The NFS Version 4 Minor Version 1 Protocol 1.1. The NFS Version 4 Minor Version 1 Protocol
The NFS version 4 minor version 1 (NFSv4.1) protocol is the second The NFS version 4 minor version 1 (NFSv4.1) protocol is the second
minor version of the NFS version 4 (NFSv4) protocol. The first minor minor version of the NFS version 4 (NFSv4) protocol. The first minor
version, NFSv4.0, is described in [20]. NFSv4.1 generally follows the version, NFSv4.0, is described in [30]. NFSv4.1 generally follows the
guidelines for the minor versioning model listed in Section 10 of RFC guidelines for the minor versioning model listed in Section 10 of RFC
3530. However, it diverges from guidelines 11 ("a client and server 3530. However, it diverges from guidelines 11 ("a client and server
that supports minor version X must support minor versions 0 through that supports minor version X must support minor versions 0 through
X-1"), and 12 ("no features may be introduced as mandatory in a minor X-1"), and 12 ("no features may be introduced as mandatory in a minor
version"). These divergences are due to the introduction of the version"). These divergences are due to the introduction of the
sessions model for managing non-idempotent operations and the sessions model for managing non-idempotent operations and the
RECLAIM_COMPLETE operation. These two new features are RECLAIM_COMPLETE operation. These two new features are
infrastructural in nature and simplify implementation of existing and infrastructural in nature and simplify implementation of existing and
other new features. Making them anything but REQUIRED would add other new features. Making them anything but REQUIRED would add
undue complexity to protocol definition and implementation. NFSv4.1 undue complexity to protocol definition and implementation. NFSv4.1
skipping to change at page 11, line 45 skipping to change at page 11, line 45
o describe the NFSv4.0 protocol, except where needed to contrast o describe the NFSv4.0 protocol, except where needed to contrast
with NFSv4.1. with NFSv4.1.
o modify the specification of the NFSv4.0 protocol. o modify the specification of the NFSv4.0 protocol.
o clarify the NFSv4.0 protocol. o clarify the NFSv4.0 protocol.
1.3. NFSv4 Goals 1.3. NFSv4 Goals
The NFSv4 protocol is a further revision of the NFS protocol defined The NFSv4 protocol is a further revision of the NFS protocol defined
already by NFSv3 [21]. It retains the essential characteristics of already by NFSv3 [31]. It retains the essential characteristics of
previous versions: easy recovery; independence of transport previous versions: easy recovery; independence of transport
protocols, operating systems and file systems; simplicity; and good protocols, operating systems and file systems; simplicity; and good
performance. NFSv4 has the following goals: performance. NFSv4 has the following goals:
o Improved access and good performance on the Internet. o Improved access and good performance on the Internet.
The protocol is designed to transit firewalls easily, perform well The protocol is designed to transit firewalls easily, perform well
where latency is high and bandwidth is low, and scale to very where latency is high and bandwidth is low, and scale to very
large numbers of clients per server. large numbers of clients per server.
skipping to change at page 14, line 9 skipping to change at page 14, line 9
All leases granted by a server have the same fixed interval. Note All leases granted by a server have the same fixed interval. Note
that the fixed interval was chosen to alleviate the expense a that the fixed interval was chosen to alleviate the expense a
server would have in maintaining state about variable-length server would have in maintaining state about variable-length
leases across server failures. leases across server failures.
Lock The term "lock" is used to refer to byte-range (in UNIX Lock The term "lock" is used to refer to byte-range (in UNIX
environments, also known as record) locks, share reservations, environments, also known as record) locks, share reservations,
delegations, or layouts unless specifically stated otherwise. delegations, or layouts unless specifically stated otherwise.
Secret State Verifier (SSV) The SSV is a unique secret key shared
between a client and server. The SSV serves as the secret key for
an internal (that is, internal to NFSv4.1) GSS mechanism (the SSV
GSS mechanism, see Section 2.10.9). The SSV GSS mechanism uses
the SSV to compute Message Integrity Code (MIC) and Wrap tokens.
See Section 2.10.8.3 for more details on how NFSv4.1 uses the SSV
and the SSV GSS mechanism.
Server The "Server" is the entity responsible for coordinating Server The "Server" is the entity responsible for coordinating
client access to a set of file systems and is identified by a client access to a set of file systems and is identified by a
Server owner. A server can span multiple network addresses. Server owner. A server can span multiple network addresses.
Server Owner The "Server Owner" identifies the server to the client. Server Owner The "Server Owner" identifies the server to the client.
The server owner consists of a major and minor identifier. When The server owner consists of a major and minor identifier. When
the client has two connections each to a peer with the same major the client has two connections each to a peer with the same major
identifier, the client assumes both peers are the same server (the identifier, the client assumes both peers are the same server (the
server namespace is the same via each connection), and assumes that server namespace is the same via each connection), and assumes that
lock state is sharable across both connections. When each peer lock state is sharable across both connections. When each peer
has both the same major and minor identifier, the client assumes has both the same major and minor identifier, the client assumes
each connection might be associable with the same session. each connection might be associable with the same session.
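The Server Owner comparison above can be illustrated with a small sketch. It is not part of either draft revision. The names ServerOwner, Relation, classify, major_id, and minor_id are hypothetical, the field types are assumptions, and treating differing major identifiers as "different servers" is an assumption as well, since the definition only states what the client assumes when the identifiers match.

```python
from dataclasses import dataclass
from enum import Enum


class Relation(Enum):
    # Assumed outcome when major identifiers differ (not stated in the text).
    DIFFERENT_SERVERS = "different servers"
    # Same major identifier: same server, lock state sharable.
    SAME_SERVER = "same server; lock state sharable"
    # Same major and minor identifier: connections might share a session.
    SAME_SERVER_MAYBE_SAME_SESSION = "same server; might share a session"


@dataclass(frozen=True)
class ServerOwner:
    major_id: bytes  # hypothetical field name and type
    minor_id: int    # hypothetical field name and type


def classify(a: ServerOwner, b: ServerOwner) -> Relation:
    """Classify two peers by the server owners they report."""
    if a.major_id != b.major_id:
        return Relation.DIFFERENT_SERVERS
    if a.minor_id == b.minor_id:
        return Relation.SAME_SERVER_MAYBE_SAME_SESSION
    return Relation.SAME_SERVER
```

A client might apply such a check to each pair of connections before deciding whether lock state, or possibly a session, can be shared between them.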
Stable Storage NFSv4.1 servers must be able to recover without data Stable Storage Stable storage is storage from which data stored by
loss from multiple power failures (including cascading power an NFSv4.1 server can be recovered without data loss from multiple
failures, that is, several power failures in quick succession), power failures (including cascading power failures, that is,
operating system failures, and hardware failure of components several power failures in quick succession), operating system
other than the storage medium itself (for example, disk, failures, and/or hardware failure of components other than the
nonvolatile RAM). storage medium itself (such as disk, nonvolatile RAM, flash
memory, etc.).
Some examples of stable storage that are allowable for an NFS Some examples of stable storage that are allowable for an NFS
server include: server include:
1. Media commit of data, that is, the modified data has been 1. Media commit of data, that is, the modified data has been
successfully written to the disk media, for example, the disk successfully written to the disk media, for example, the disk
platter. platter.
2. An immediate reply disk drive with battery-backed on-drive 2. An immediate reply disk drive with battery-backed on-drive
intermediate storage or uninterruptible power system (UPS). intermediate storage or uninterruptible power system (UPS).
skipping to change at page 15, line 22 skipping to change at page 15, line 27
the NFSv4.1 protocol will be reviewed in brief. This will be done to the NFSv4.1 protocol will be reviewed in brief. This will be done to
provide an appropriate context for both the reader who is familiar provide an appropriate context for both the reader who is familiar
with the previous versions of the NFS protocol and the reader that is with the previous versions of the NFS protocol and the reader that is
new to the NFS protocols. For the reader new to the NFS protocols, new to the NFS protocols. For the reader new to the NFS protocols,
there is still a set of fundamental knowledge that is expected. The there is still a set of fundamental knowledge that is expected. The
reader should be familiar with the XDR and RPC protocols as described reader should be familiar with the XDR and RPC protocols as described
in [2] and [3]. A basic knowledge of file systems and distributed in [2] and [3]. A basic knowledge of file systems and distributed
file systems is expected as well. file systems is expected as well.
In general this specification of NFSv4.1 will not distinguish those In general this specification of NFSv4.1 will not distinguish those
added in minor version one from those present in the base protocol features added in minor version one from those present in the base
but will treat NFSv4.1 as a unified whole. See Section 1.7 for a protocol but will treat NFSv4.1 as a unified whole. See Section 1.7
summary of the differences between NFSv4.0 and NFSv4.1. for a summary of the differences between NFSv4.0 and NFSv4.1.
1.6.1. RPC and Security 1.6.1. RPC and Security
As with previous versions of NFS, the External Data Representation As with previous versions of NFS, the External Data Representation
(XDR) and Remote Procedure Call (RPC) mechanisms used for the NFSv4.1 (XDR) and Remote Procedure Call (RPC) mechanisms used for the NFSv4.1
protocol are those defined in [2] and [3]. To meet end-to-end protocol are those defined in [2] and [3]. To meet end-to-end
security requirements, the RPCSEC_GSS framework [4] will be used to security requirements, the RPCSEC_GSS framework [4] is used to extend
extend the basic RPC security. With the use of RPCSEC_GSS, various the basic RPC security. With the use of RPCSEC_GSS, various
mechanisms can be provided to offer authentication, integrity, and mechanisms can be provided to offer authentication, integrity, and
privacy to the NFSv4 protocol. Kerberos V5 will be used as described privacy to the NFSv4 protocol. Kerberos V5 is used as described in
in [5] to provide one security framework. The LIPKEY and SPKM-3 GSS- [5] to provide one security framework. The LIPKEY and SPKM-3 GSS-API
API mechanisms described in [6] will be used to provide for the use mechanisms described in [6] are used to provide for the use of user
of user password and client/server public key certificates by the password and client/server public key certificates by the NFSv4
NFSv4 protocol. With the use of RPCSEC_GSS, other mechanisms may protocol. With the use of RPCSEC_GSS, other mechanisms may also be
also be specified and used for NFSv4.1 security. specified and used for NFSv4.1 security.
To enable in-band security negotiation, the NFSv4.1 protocol has To enable in-band security negotiation, the NFSv4.1 protocol has
operations which provide the client a method of querying the server operations which provide the client a method of querying the server
about its policies regarding which security mechanisms must be used about its policies regarding which security mechanisms must be used
for access to the server's file system resources. With this, the for access to the server's file system resources. With this, the
client can securely match the security mechanism that meets the client can securely match the security mechanism that meets the
policies specified at both the client and server. policies specified at both the client and server.
1.6.2. Protocol Structure 1.6.2. Protocol Structure
skipping to change at page 17, line 14 skipping to change at page 17, line 16
filehandles with more limited validity guarantees, called volatile filehandles with more limited validity guarantees, called volatile
filehandles. filehandles.
1.6.3.2. File Attributes 1.6.3.2. File Attributes
The NFSv4.1 protocol has a rich and extensible file object attribute The NFSv4.1 protocol has a rich and extensible file object attribute
structure, which is divided into REQUIRED, RECOMMENDED, and named structure, which is divided into REQUIRED, RECOMMENDED, and named
attributes (see Section 5). attributes (see Section 5).
Several (but not all) of the REQUIRED attributes are derived from the Several (but not all) of the REQUIRED attributes are derived from the
attributes of NFSv3 (see definition of the fattr3 data type in [21]). attributes of NFSv3 (see the definition of the fattr3 data type in
An example of a REQUIRED attribute is the file object's type [31]). An example of a REQUIRED attribute is the file object's type
(Section 5.8.1.2) so that regular files can be distinguished from (Section 5.8.1.2) so that regular files can be distinguished from
directories (also known as folders in some operating environments) directories (also known as folders in some operating environments)
and other types of objects. REQUIRED attributes are discussed in and other types of objects. REQUIRED attributes are discussed in
Section 5.1. Section 5.1.
Three examples of RECOMMENDED attributes are acl, sacl, and dacl. Three examples of RECOMMENDED attributes are acl, sacl, and dacl.
These attributes define an Access Control List (ACL) on a file object These attributes define an Access Control List (ACL) on a file object
(Section 6). An ACL provides directory and file access control (Section 6). An ACL provides directory and file access control
beyond the model used in NFSv3. The ACL definition allows for beyond the model used in NFSv3. The ACL definition allows for
specification of specific sets of permissions for individual users specification of specific sets of permissions for individual users
skipping to change at page 19, line 42 skipping to change at page 19, line 45
* A method to allow a server to indicate it is recalling one or * A method to allow a server to indicate it is recalling one or
more delegations for resource management reasons, and thus a more delegations for resource management reasons, and thus a
method to allow the client to pick which delegations to return method to allow the client to pick which delegations to return
(Section 20.6). (Section 20.6).
o Attributes can be set atomically during exclusive file create via o Attributes can be set atomically during exclusive file create via
the OPEN operation (see the new EXCLUSIVE4_1 creation method in the OPEN operation (see the new EXCLUSIVE4_1 creation method in
Section 18.16). Section 18.16).
o Open files can be preserved if removed and the hard link count o Open files can be preserved if removed and the hard link count
goes to zero thus obviating the need for clients to rename deleted ("hard link" is defined in an Open Group [7] standard) goes to
files to partially hidden names -- colloquially called "silly zero thus obviating the need for clients to rename deleted files
rename" (see the new OPEN4_RESULT_PRESERVE_UNLINKED reply flag in to partially hidden names -- colloquially called "silly rename"
(see the new OPEN4_RESULT_PRESERVE_UNLINKED reply flag in
Section 18.16). Section 18.16).
o Improved compatibility with Microsoft Windows for Access Control o Improved compatibility with Microsoft Windows for Access Control
Lists (Section 6.2.3, Section 6.2.2, Section 6.4.3.2). Lists (Section 6.2.3, Section 6.2.2, Section 6.4.3.2).
o Data retention (Section 5.13). o Data retention (Section 5.13).
o Identification of the implementation of the NFS client and server o Identification of the implementation of the NFS client and server
(Section 18.35). (Section 18.35).
skipping to change at page 21, line 7 skipping to change at page 21, line 9
Every RPC header conveys information used to identify and Every RPC header conveys information used to identify and
authenticate a client and server. As discussed in Section 2.2.1.1.1, authenticate a client and server. As discussed in Section 2.2.1.1.1,
some security flavors provide additional security services. some security flavors provide additional security services.
NFSv4.1 clients and servers MUST implement RPCSEC_GSS. (This NFSv4.1 clients and servers MUST implement RPCSEC_GSS. (This
requirement to implement is not a requirement to use.) Other requirement to implement is not a requirement to use.) Other
flavors, such as AUTH_NONE and AUTH_SYS, MAY be implemented as well. flavors, such as AUTH_NONE and AUTH_SYS, MAY be implemented as well.
2.2.1.1.1. RPCSEC_GSS and Security Services 2.2.1.1.1. RPCSEC_GSS and Security Services
RPCSEC_GSS ([4]) uses the functionality of GSS-API [7]. This allows RPCSEC_GSS ([4]) uses the functionality of GSS-API [8]. This allows
for the use of various security mechanisms by the RPC layer without for the use of various security mechanisms by the RPC layer without
the additional implementation overhead of adding RPC security the additional implementation overhead of adding RPC security
flavors. flavors.
2.2.1.1.1.1. Identification, Authentication, Integrity, Privacy 2.2.1.1.1.1. Identification, Authentication, Integrity, Privacy
Via the GSS-API, RPCSEC_GSS can be used to identify and authenticate Via the GSS-API, RPCSEC_GSS can be used to identify and authenticate
users on clients to servers, and servers to users. It can also users on clients to servers, and servers to users. It can also
perform integrity checking on the entire RPC message, including the perform integrity checking on the entire RPC message, including the
RPC header, and the arguments or results. Finally, privacy, usually RPC header, and the arguments or results. Finally, privacy, usually
skipping to change at page 22, line 31 skipping to change at page 22, line 34
------------------------------------------------------------------ ------------------------------------------------------------------
390003 krb5 1.2.840.113554.1.2.2 rpc_gss_svc_none yes yes 390003 krb5 1.2.840.113554.1.2.2 rpc_gss_svc_none yes yes
390004 krb5i 1.2.840.113554.1.2.2 rpc_gss_svc_integrity yes yes 390004 krb5i 1.2.840.113554.1.2.2 rpc_gss_svc_integrity yes yes
390005 krb5p 1.2.840.113554.1.2.2 rpc_gss_svc_privacy no yes 390005 krb5p 1.2.840.113554.1.2.2 rpc_gss_svc_privacy no yes
Note that the number and name of the pseudo flavor is presented here Note that the number and name of the pseudo flavor is presented here
as a mapping aid to the implementor. Because the NFSv4.1 protocol as a mapping aid to the implementor. Because the NFSv4.1 protocol
includes a method to negotiate security and it understands the GSS- includes a method to negotiate security and it understands the GSS-
API mechanism, the pseudo flavor is not needed. The pseudo flavor is API mechanism, the pseudo flavor is not needed. The pseudo flavor is
needed for NFSv3 since the security negotiation is done via the needed for NFSv3 since the security negotiation is done via the
MOUNT protocol as described in [22]. MOUNT protocol as described in [32].
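
As a mapping aid, the three Kerberos V5 rows above can be held in a small lookup table. The following is a minimal sketch in Python; the dictionary layout and the function name resolve_pseudo_flavor are illustrative assumptions and are not defined by the protocol. Only the numbers, names, OID, and service values come from the table.

```python
# Kerberos V5 rows copied from the table above. The dict layout and the
# function name are illustrative assumptions, not part of the protocol.
KRB5_OID = "1.2.840.113554.1.2.2"

PSEUDO_FLAVORS = {
    390003: ("krb5", KRB5_OID, "rpc_gss_svc_none"),
    390004: ("krb5i", KRB5_OID, "rpc_gss_svc_integrity"),
    390005: ("krb5p", KRB5_OID, "rpc_gss_svc_privacy"),
}


def resolve_pseudo_flavor(number):
    """Return (name, mechanism OID, RPCSEC_GSS service), or None if unknown."""
    return PSEUDO_FLAVORS.get(number)
```

An NFSv4.1 implementation does not need such a table for negotiation, since it works with the mechanism OID and service directly; the table is only useful when relating NFSv4.1 settings to NFSv3-style pseudo flavors.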
2.2.1.1.1.2.2. LIPKEY 2.2.1.1.1.2.2. LIPKEY
The LIPKEY V5 GSS-API mechanism as described in [6] MUST be The LIPKEY V5 GSS-API mechanism as described in [6] MUST be
implemented with the RPCSEC_GSS services as specified in the implemented with the RPCSEC_GSS services as specified in the
following table: following table:
1 2 3 4 5 6 1 2 3 4 5 6
------------------------------------------------------------------ ------------------------------------------------------------------
390006 lipkey 1.3.6.1.5.5.9 rpc_gss_svc_none yes yes 390006 lipkey 1.3.6.1.5.5.9 rpc_gss_svc_none yes yes
skipping to change at page 26, line 5 skipping to change at page 26, line 12
same string. The implementor is cautioned against an approach that same string. The implementor is cautioned against an approach that
requires the string to be recorded in a local file because this requires the string to be recorded in a local file because this
precludes the use of the implementation in an environment where precludes the use of the implementation in an environment where
there is no local disk and all file access is from an NFSv4.1 there is no local disk and all file access is from an NFSv4.1
server. server.
o The string should be the same for each server network address that o The string should be the same for each server network address that
the client accesses. This way, if a server has multiple the client accesses. This way, if a server has multiple
interfaces, the client can trunk traffic over multiple network interfaces, the client can trunk traffic over multiple network
paths as described in Section 2.10.5. (Note: the precise opposite paths as described in Section 2.10.5. (Note: the precise opposite
was advised in the NFSv4.0 specification [20].) was advised in the NFSv4.0 specification [30].)
o The algorithm for generating the string should not assume that the o The algorithm for generating the string should not assume that the
client's network address will not change, unless the client client's network address will not change, unless the client
implementation knows it is using statically assigned network implementation knows it is using statically assigned network
addresses. This includes changes between client incarnations and addresses. This includes changes between client incarnations and
even changes while the client is still running in its current even changes while the client is still running in its current
incarnation. Thus with dynamic address assignment, if the client incarnation. Thus with dynamic address assignment, if the client
includes just the client's network address in the co_ownerid includes just the client's network address in the co_ownerid
string, there is a real risk that after the client gives up the string, there is a real risk that after the client gives up the
network address, another client, using a similar algorithm for network address, another client, using a similar algorithm for
skipping to change at page 28, line 4 skipping to change at page 28, line 11
See the descriptions of EXCHANGE_ID (Section 18.35) and See the descriptions of EXCHANGE_ID (Section 18.35) and
CREATE_SESSION (Section 18.36) for a complete specification of these CREATE_SESSION (Section 18.36) for a complete specification of these
operations. operations.
2.4.1. Upgrade from NFSv4.0 to NFSv4.1 2.4.1. Upgrade from NFSv4.0 to NFSv4.1
To facilitate upgrade from NFSv4.0 to NFSv4.1, a server may compare a To facilitate upgrade from NFSv4.0 to NFSv4.1, a server may compare a
client_owner4 in an EXCHANGE_ID with an nfs_client_id4 established client_owner4 in an EXCHANGE_ID with an nfs_client_id4 established
using the SETCLIENTID operation of NFSv4.0. A server that does so using the SETCLIENTID operation of NFSv4.0. A server that does so
will allow an upgraded client to avoid waiting until the lease (i.e. will allow an upgraded client to avoid waiting until the lease (i.e.
the lease established by the NFSv4.0 instance client) expires. This the lease established by the NFSv4.0 instance client) expires. This
requires the client_owner4 be constructed the same way as the requires the client_owner4 be constructed the same way as the
nfs_client_id4. If the latter's contents included the server's nfs_client_id4. If the latter's contents included the server's
network address (per the recommendations of the NFSv4.0 specification network address (per the recommendations of the NFSv4.0 specification
[20]), and the NFSv4.1 client does not wish to use a client ID that [30]), and the NFSv4.1 client does not wish to use a client ID that
prevents trunking, it should send two EXCHANGE_ID operations. The prevents trunking, it should send two EXCHANGE_ID operations. The
first EXCHANGE_ID will have a client_owner4 equal to the first EXCHANGE_ID will have a client_owner4 equal to the
nfs_client_id4. This will clear the state created by the NFSv4.0 nfs_client_id4. This will clear the state created by the NFSv4.0
client. The second EXCHANGE_ID will not have the server's network client. The second EXCHANGE_ID will not have the server's network
address. The state created for the second EXCHANGE_ID will not have address. The state created for the second EXCHANGE_ID will not have
to wait for lease expiration, because there will be no state to to wait for lease expiration, because there will be no state to
expire. expire.
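
The upgrade sequence described in this section can be sketched as a small planning function. This is a hedged illustration in Python, not protocol text: the function name plan_exchange_ids, its parameters, and the use of byte strings for owner values are all assumptions made for the example. It assumes the client wants a client ID that permits trunking.

```python
def plan_exchange_ids(v40_client_id, v40_id_has_server_addr, trunkable_owner):
    """Return the client_owner4 values to send in EXCHANGE_ID, in order.

    v40_client_id          -- the nfs_client_id4 contents used with SETCLIENTID
    v40_id_has_server_addr -- True if that identifier embedded the server's
                              network address
    trunkable_owner        -- an owner value without the server's address
    (All three names are hypothetical, chosen for this sketch.)
    """
    if not v40_id_has_server_addr:
        # The old identifier does not prevent trunking, so one EXCHANGE_ID
        # that matches the NFSv4.0 identifier is enough.
        return [v40_client_id]
    # First EXCHANGE_ID matches the NFSv4.0 identifier and lets the server
    # clear the old state; the second establishes the trunkable owner.
    return [v40_client_id, trunkable_owner]
```

For example, plan_exchange_ids(b"host1/192.0.2.7", True, b"host1") yields two owner values, the first equal to the old nfs_client_id4 and the second without the server address.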
2.4.2. Server Release of Client ID 2.4.2. Server Release of Client ID
skipping to change at page 35, line 49 skipping to change at page 36, line 12
worth the security benefits, and so allow the security policy of the worth the security benefits, and so allow the security policy of the
current filehandle to override that of the saved filehandle. current filehandle to override that of the saved filehandle.
2.7. Minor Versioning 2.7. Minor Versioning
To address the requirement of an NFS protocol that can evolve as the To address the requirement of an NFS protocol that can evolve as the
need arises, the NFSv4.1 protocol contains the rules and framework to need arises, the NFSv4.1 protocol contains the rules and framework to
allow for future minor changes or versioning. allow for future minor changes or versioning.
The base assumption with respect to minor versioning is that any The base assumption with respect to minor versioning is that any
future accepted minor version must follow the IETF process and be future accepted minor version will be documented in one or more
documented in a standards track RFC. Therefore, each minor version standards track RFCs. Minor version zero of the NFSv4 protocol is
number will correspond to one or more new RFCs. Minor version zero represented by [30], and minor version one is represented by this
of the NFSv4 protocol is represented by [20], and minor version one document [[Comment.1: RFC Editor: change "document" to "RFC" when we
is represented by this document [[Comment.1: RFC Editor: change publish]]. The COMPOUND and CB_COMPOUND procedures support the
"document" to "RFC" when we publish]]. The COMPOUND and CB_COMPOUND encoding of the minor version being requested by the client.
procedures support the encoding of the minor version being requested
by the client.
The following items represent the basic rules for the development of The following items represent the basic rules for the development of
minor versions. Note that a future minor version may decide to minor versions. Note that a future minor version may modify or add
modify or add to the following rules as part of the minor version to the following rules as part of the minor version definition.
definition.
1. Procedures are not added or deleted 1. Procedures are not added or deleted
To maintain the general RPC model, NFSv4 minor versions will not To maintain the general RPC model, NFSv4 minor versions will not
add to or delete procedures from the NFS program. add to or delete procedures from the NFS program.
2. Minor versions may add operations to the COMPOUND and 2. Minor versions may add operations to the COMPOUND and
CB_COMPOUND procedures. CB_COMPOUND procedures.
The addition of operations to the COMPOUND and CB_COMPOUND The addition of operations to the COMPOUND and CB_COMPOUND
skipping to change at page 37, line 18 skipping to change at page 37, line 24
with such bitmaps. with such bitmaps.
* adding bits to existing attributes like ACLs that have flag * adding bits to existing attributes like ACLs that have flag
words words
* extending enumerated types (including NFS4ERR_*) with new * extending enumerated types (including NFS4ERR_*) with new
values values
* adding cases to a switched union * adding cases to a switched union
4. Minor versions may not modify the structure of existing 4. Minor versions must not modify the structure of existing
attributes. attributes.
5. Minor versions may not delete operations. 5. Minor versions must not delete operations.
This prevents the potential reuse of a particular operation This prevents the potential reuse of a particular operation
"slot" in a future minor version. "slot" in a future minor version.
6. Minor versions may not delete attributes. 6. Minor versions must not delete attributes.
7. Minor versions may not delete flag bits or enumeration values. 7. Minor versions must not delete flag bits or enumeration values.
8. Minor versions may declare an operation MUST NOT be implemented. 8. Minor versions may declare an operation MUST NOT be implemented.
Specifying an operation MUST NOT be implemented is equivalent to Specifying an operation MUST NOT be implemented is equivalent to
obsoleting an operation. For the client, it means that the obsoleting an operation. For the client, it means that the
operation should not be sent to the server. For the server, an operation should not be sent to the server. For the server, an
NFS error can be returned as opposed to "dropping" the request NFS error can be returned as opposed to "dropping" the request
as an XDR decode error. This approach allows for the as an XDR decode error. This approach allows for the
obsolescence of an operation while maintaining its structure so obsolescence of an operation while maintaining its structure so
that a future minor version can reintroduce the operation. that a future minor version can reintroduce the operation.
skipping to change at page 38, line 8 skipping to change at page 38, line 14
9. Minor versions may downgrade features from REQUIRED to 9. Minor versions may downgrade features from REQUIRED to
RECOMMENDED, or RECOMMENDED to OPTIONAL. RECOMMENDED, or RECOMMENDED to OPTIONAL.
10. Minor versions may upgrade features from OPTIONAL to RECOMMENDED 10. Minor versions may upgrade features from OPTIONAL to RECOMMENDED
or RECOMMENDED to REQUIRED. or RECOMMENDED to REQUIRED.
11. A client and server that supports minor version X should support 11. A client and server that supports minor version X should support
minor versions 0 (zero) through X-1 as well. minor versions 0 (zero) through X-1 as well.
12. Except for infrastructural changes, no new features may be 12. Except for infrastructural changes, a minor version must not
introduced as REQUIRED in a minor version. introduce REQUIRED new features.
This rule allows for the introduction of new functionality and This rule allows for the introduction of new functionality and
forces the use of implementation experience before designating a forces the use of implementation experience before designating a
feature as REQUIRED. On the other hand, some classes of feature as REQUIRED. On the other hand, some classes of
features are infrastructural and have broad effects. Allowing features are infrastructural and have broad effects. Allowing
such features to not be REQUIRED complicates implementation of infrastructural features to be RECOMMENDED or OPTIONAL
the minor version. complicates implementation of the minor version.
13. A client MUST NOT attempt to use a stateid, filehandle, or 13. A client MUST NOT attempt to use a stateid, filehandle, or
similar returned object from the COMPOUND procedure with minor similar returned object from the COMPOUND procedure with minor
version X for another COMPOUND procedure with minor version Y, version X for another COMPOUND procedure with minor version Y,
where X != Y. where X != Y.
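
Rule 13 can be enforced on the client by recording the minor version under which each returned object was obtained. The following Python sketch shows one way to do that; the class VersionedObject and the function use_in_compound are hypothetical names introduced for this illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VersionedObject:
    """A stateid, filehandle, or similar object plus its minor version."""
    value: bytes
    minor_version: int


def use_in_compound(obj, compound_minor_version):
    """Return the raw value, refusing use across minor versions (rule 13)."""
    if obj.minor_version != compound_minor_version:
        raise ValueError(
            "object obtained with minor version %d cannot be used in a "
            "COMPOUND with minor version %d"
            % (obj.minor_version, compound_minor_version)
        )
    return obj.value
```

Tagging at the point where the object is returned keeps the check local, so code that builds COMPOUND requests does not need to track where each object came from.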
2.8. Non-RPC-based Security Services 2.8. Non-RPC-based Security Services
As described in Section 2.2.1.1.1.1, NFSv4.1 relies on RPC for As described in Section 2.2.1.1.1.1, NFSv4.1 relies on RPC for
identification, authentication, integrity, and privacy. NFSv4.1 identification, authentication, integrity, and privacy. NFSv4.1
skipping to change at page 39, line 17 skipping to change at page 39, line 23
2.9. Transport Layers 2.9. Transport Layers
2.9.1. REQUIRED and RECOMMENDED Properties of Transports 2.9.1. REQUIRED and RECOMMENDED Properties of Transports
NFSv4.1 works over RDMA and non-RDMA-based transports with the NFSv4.1 works over RDMA and non-RDMA-based transports with the
following attributes: following attributes:
o The transport supports reliable delivery of data, which NFSv4.1 o The transport supports reliable delivery of data, which NFSv4.1
requires but neither NFSv4.1 nor RPC has facilities for ensuring. requires but neither NFSv4.1 nor RPC has facilities for ensuring.
[23] [33]
o The transport delivers data in the order it was sent. Ordered o The transport delivers data in the order it was sent. Ordered
delivery simplifies detection of transmit errors, and simplifies delivery simplifies detection of transmit errors, and simplifies
the sending of arbitrary sized requests and responses, via the the sending of arbitrary sized requests and responses, via the
record marking protocol [3]. record marking protocol [3].
Where an NFSv4.1 implementation supports operation over the IP Where an NFSv4.1 implementation supports operation over the IP
network protocol, any transport used between NFS and IP MUST be among network protocol, any transport used between NFS and IP MUST be among
the IETF-approved congestion control transport protocols. At the the IETF-approved congestion control transport protocols. At the
time this document was written, the only two transports that had the time this document was written, the only two transports that had the
above attributes were TCP and SCTP. To enhance the possibilities for above attributes were TCP and SCTP. To enhance the possibilities for
interoperability, an NFSv4.1 implementation MUST support operation interoperability, an NFSv4.1 implementation MUST support operation
over the TCP transport protocol. over the TCP transport protocol.
Even if NFSv4.1 is used over a non-IP network protocol, it is Even if NFSv4.1 is used over a non-IP network protocol, it is
RECOMMENDED that the transport support congestion control. RECOMMENDED that the transport support congestion control.
It is permissible for a connectionless transport to be used under It is permissible for a connectionless transport to be used under
NFSv4.1; however, reliable and in-order delivery of data combined with NFSv4.1; however, reliable and in-order delivery of data combined with
congestion control by the connectionless transport is REQUIRED. congestion control by the connectionless transport is REQUIRED; as a
consequence UDP by itself MUST NOT be used as an NFSv4.1 transport.
NFSv4.1 assumes that a client transport address and server transport NFSv4.1 assumes that a client transport address and server transport
address used to send data over a transport together constitute a address used to send data over a transport together constitute a
connection, even if the underlying transport eschews the concept of a connection, even if the underlying transport eschews the concept of a
connection. connection.
2.9.2. Client and Server Transport Behavior 2.9.2. Client and Server Transport Behavior
If a connection-oriented transport (e.g. TCP) is used, the client If a connection-oriented transport (e.g. TCP) is used, the client
and server SHOULD use long lived connections for at least three and server SHOULD use long lived connections for at least three
reasons: reasons:
skipping to change at page 41, line 21 skipping to change at page 41, line 31
contents must not be blindly used when replies are sent from it, contents must not be blindly used when replies are sent from it,
and credit information appropriate to the channel must be and credit information appropriate to the channel must be
refreshed by the RPC layer. refreshed by the RPC layer.
In addition, as described in Section 2.10.6.2, while a session is In addition, as described in Section 2.10.6.2, while a session is
active, the NFSv4.1 requester MUST NOT stop waiting for a reply. active, the NFSv4.1 requester MUST NOT stop waiting for a reply.
2.9.3. Ports 2.9.3. Ports
Historically, NFSv3 servers have listened over TCP port 2049. The Historically, NFSv3 servers have listened over TCP port 2049. The
registered port 2049 [24] for the NFS protocol should be the default registered port 2049 [34] for the NFS protocol should be the default
configuration. NFSv4.1 clients SHOULD NOT use the RPC binding configuration. NFSv4.1 clients SHOULD NOT use the RPC binding
protocols as described in [25]. protocols as described in [35].
2.10. Session 2.10. Session
NFSv4.1 clients and servers MUST support and MUST use the session
feature as described in this section.
2.10.1. Motivation and Overview 2.10.1. Motivation and Overview
Previous versions and minor versions of NFS have suffered from the Previous versions and minor versions of NFS have suffered from the
following: following:
o Lack of support for Exactly Once Semantics (EOS). This includes o Lack of support for Exactly Once Semantics (EOS). This includes
lack of support for EOS through server failure and recovery. lack of support for EOS through server failure and recovery.
o Limited callback support, including no support for sending o Limited callback support, including no support for sending
callbacks through firewalls, and races between replies to normal callbacks through firewalls, and races between replies to normal
skipping to change at page 55, line 23 skipping to change at page 55, line 39
Given that well formulated XIDs continue to be required, this raises Given that well formulated XIDs continue to be required, this raises
the question of why SEQUENCE and CB_SEQUENCE replies have a session ID, the question of why SEQUENCE and CB_SEQUENCE replies have a session ID,
slot ID and sequence ID. Having the session ID in the reply means slot ID and sequence ID. Having the session ID in the reply means
the requester does not have to use the XID to look up the session ID, the requester does not have to use the XID to look up the session ID,
which would be necessary if the connection were associated with which would be necessary if the connection were associated with
multiple sessions. Having the slot ID and sequence ID in the reply multiple sessions. Having the slot ID and sequence ID in the reply
means the requester does not have to use the XID to look up the slot ID means the requester does not have to use the XID to look up the slot ID
and sequence ID. Furthermore, since the XID is only 32 bits, it is and sequence ID. Furthermore, since the XID is only 32 bits, it is
too small to guarantee the re-association of a reply with its request too small to guarantee the re-association of a reply with its request
([26]); having session ID, slot ID, and sequence ID in the reply ([36]); having session ID, slot ID, and sequence ID in the reply
allows the client to validate that the reply in fact belongs to the allows the client to validate that the reply in fact belongs to the
matched request. matched request.
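
The validation described above can be sketched as a lookup keyed by session ID and slot ID rather than by XID. This is a minimal Python illustration under stated assumptions: the SequenceIds tuple, the pending table, and the function match_reply are hypothetical names, and a real requester would also handle retransmissions and errors.

```python
from typing import NamedTuple


class SequenceIds(NamedTuple):
    """The identifiers carried in a SEQUENCE or CB_SEQUENCE reply."""
    session_id: bytes
    slot_id: int
    sequence_id: int


def match_reply(pending, reply):
    """Return the outstanding request the reply belongs to, or None.

    pending maps (session_id, slot_id) to (sequence_id, request) for each
    request still awaiting a reply. The XID is not consulted.
    """
    entry = pending.get((reply.session_id, reply.slot_id))
    if entry is None:
        return None  # no outstanding request on that session and slot
    expected_sequence_id, request = entry
    if expected_sequence_id != reply.sequence_id:
        return None  # reply does not belong to the outstanding request
    return request
```

Because the key includes the session ID, the same table works when one connection is associated with multiple sessions.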
The SEQUENCE (and CB_SEQUENCE) operation also carries a The SEQUENCE (and CB_SEQUENCE) operation also carries a
"highest_slotid" value which carries additional requester slot usage "highest_slotid" value which carries additional requester slot usage
information. The requester must always indicate the slot ID information. The requester must always indicate the slot ID
representing the outstanding request with the highest-numbered slot representing the outstanding request with the highest-numbered slot
value. The requester should in all cases provide the most value. The requester should in all cases provide the most
conservative value possible, although it can be increased somewhat conservative value possible, although it can be increased somewhat
above the actual instantaneous usage to maintain some minimum or above the actual instantaneous usage to maintain some minimum or
skipping to change at page 57, line 48 skipping to change at page 58, line 19
cache entry for the slot whenever an error is returned from SEQUENCE cache entry for the slot whenever an error is returned from SEQUENCE
or CB_SEQUENCE. or CB_SEQUENCE.
2.10.6.1.3. Optional Reply Caching 2.10.6.1.3. Optional Reply Caching
On a per-request basis the requester can choose to direct the replier On a per-request basis the requester can choose to direct the replier
to cache the reply to all operations after the first operation to cache the reply to all operations after the first operation
(SEQUENCE or CB_SEQUENCE) via the sa_cachethis or csa_cachethis (SEQUENCE or CB_SEQUENCE) via the sa_cachethis or csa_cachethis
fields of the arguments to SEQUENCE or CB_SEQUENCE. The reason it fields of the arguments to SEQUENCE or CB_SEQUENCE. The reason it
would not direct the replier to cache the entire reply is that the would not direct the replier to cache the entire reply is that the
request is composed of all idempotent operations [23]. Caching the request is composed of all idempotent operations [33]. Caching the
reply may offer little benefit. If the reply is too large (see reply may offer little benefit. If the reply is too large (see
Section 2.10.6.4), it may not be cacheable anyway. Even if the reply Section 2.10.6.4), it may not be cacheable anyway. Even if the reply
to an idempotent request is small enough to cache, unnecessarily caching to an idempotent request is small enough to cache, unnecessarily caching
the reply slows down the server and increases RPC latency. the reply slows down the server and increases RPC latency.
Whether the requester requests the reply to be cached or not has no Whether the requester requests the reply to be cached or not has no
effect on the slot processing. If the results of SEQUENCE or effect on the slot processing. If the results of SEQUENCE or
CB_SEQUENCE are NFS4_OK, then the slot's sequence ID MUST be CB_SEQUENCE are NFS4_OK, then the slot's sequence ID MUST be
incremented by one. If a requester does not direct the replier to incremented by one. If a requester does not direct the replier to
cache the reply, the replier MUST do one of the following: cache the reply, the replier MUST do one of the following:
skipping to change at page 59, line 13 skipping to change at page 59, line 31
requester does not know what sequence ID to use for the slot on its requester does not know what sequence ID to use for the slot on its
next request. For example, suppose a requester sends a request with next request. For example, suppose a requester sends a request with
sequence ID 1, and does not wait for the response. The next time it sequence ID 1, and does not wait for the response. The next time it
uses the slot, it sends the new request with sequence ID 2. If the uses the slot, it sends the new request with sequence ID 2. If the
replier has not seen the request with sequence ID 1, then the replier replier has not seen the request with sequence ID 1, then the replier
is not expecting sequence ID 2, and rejects the requester's new is not expecting sequence ID 2, and rejects the requester's new
request with NFS4ERR_SEQ_MISORDERED (as the result from SEQUENCE or request with NFS4ERR_SEQ_MISORDERED (as the result from SEQUENCE or
CB_SEQUENCE). CB_SEQUENCE).
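
The example above can be sketched as a small check on the replier side. This is an illustrative sketch only, not the specification's algorithm text: the function name and the REPLAY marker are hypothetical, and it assumes sequence IDs are 32-bit unsigned values that wrap, with a request matching the slot's current sequence ID treated as a retry.

```python
NFS4_OK = "NFS4_OK"
NFS4ERR_SEQ_MISORDERED = "NFS4ERR_SEQ_MISORDERED"
REPLAY = "REPLAY_FROM_CACHE"  # hypothetical marker, not a protocol status


def check_slot_sequence(slot_seqid: int, request_seqid: int) -> str:
    """Classify a request against the slot's last accepted sequence ID."""
    # Assumption: sequence IDs are 32-bit unsigned and wrap around.
    if request_seqid == (slot_seqid + 1) & 0xFFFFFFFF:
        return NFS4_OK  # new request; the caller then increments the slot
    if request_seqid == slot_seqid:
        return REPLAY  # retry of the most recent request on this slot
    return NFS4ERR_SEQ_MISORDERED


# The replier never saw sequence ID 1, so its slot is still at 0
# and a request carrying sequence ID 2 is rejected.
status = check_slot_sequence(slot_seqid=0, request_seqid=2)
```

In the scenario from the text, the requester cannot tell whether to send 1 or 2 next, which is why it has to wait for (or retry until it gets) a reply before reusing the slot.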
RDMA fabrics do not guarantee that the memory handles (Steering Tags) RDMA fabrics do not guarantee that the memory handles (Steering Tags)
within each RPC/RDMA "chunk" ([8]) are valid on a scope outside that within each RPC/RDMA "chunk" ([9]) are valid on a scope outside that
of a single connection. Therefore, handles used by the direct of a single connection. Therefore, handles used by the direct
operations become invalid after connection loss. The server must operations become invalid after connection loss. The server must
ensure that any RDMA operations which must be replayed from the reply ensure that any RDMA operations which must be replayed from the reply
cache use the newly provided handle(s) from the most recent request. cache use the newly provided handle(s) from the most recent request.
A retry might be sent while the original request is still in progress A retry might be sent while the original request is still in progress
on the replier. The replier SHOULD deal with the issue by returning on the replier. The replier SHOULD deal with the issue by returning
NFS4ERR_DELAY as the reply to SEQUENCE or CB_SEQUENCE operation, but NFS4ERR_DELAY as the reply to SEQUENCE or CB_SEQUENCE operation, but
implementations MAY return NFS4ERR_SEQ_MISORDERED. Since errors from implementations MAY return NFS4ERR_SEQ_MISORDERED. Since errors from
SEQUENCE and CB_SEQUENCE are never recorded in the reply cache, this SEQUENCE and CB_SEQUENCE are never recorded in the reply cache, this
skipping to change at page 63, line 51 skipping to change at page 64, line 21
view the problem is as a single transaction consisting of each view the problem is as a single transaction consisting of each
operation in the COMPOUND followed by storing the result in operation in the COMPOUND followed by storing the result in
persistent storage, then finally a transaction commit. If there is a persistent storage, then finally a transaction commit. If there is a
failure before the transaction is committed, then the server rolls failure before the transaction is committed, then the server rolls
back the transaction. If the server itself fails, then when it restarts, back the transaction. If the server itself fails, then when it restarts,
its recovery logic could roll back the transaction before starting its recovery logic could roll back the transaction before starting
the NFSv4.1 server. the NFSv4.1 server.
While the description of the implementation for atomic execution of While the description of the implementation for atomic execution of
the request and caching of the reply is beyond the scope of this the request and caching of the reply is beyond the scope of this
document, an example implementation for NFSv2 [27] is described in document, an example implementation for NFSv2 [37] is described in
[28]. [38].
2.10.7. RDMA Considerations 2.10.7. RDMA Considerations
A complete discussion of the operation of RPC-based protocols over A complete discussion of the operation of RPC-based protocols over
RDMA transports is in [8]. A discussion of the operation of NFSv4, RDMA transports is in [9]. A discussion of the operation of NFSv4,
including NFSv4.1, over RDMA is in [9]. Where RDMA is considered, including NFSv4.1, over RDMA is in [10]. Where RDMA is considered,
this specification assumes the use of such a layering; it addresses this specification assumes the use of such a layering; it addresses
only the upper layer issues relevant to making best use of RPC/RDMA. only the upper layer issues relevant to making best use of RPC/RDMA.
2.10.7.1. RDMA Connection Resources 2.10.7.1. RDMA Connection Resources
RDMA requires its consumers to register memory and post buffers of a RDMA requires its consumers to register memory and post buffers of a
specific size and number for receive operations. specific size and number for receive operations.
Registration of memory can be a relatively high-overhead operation, Registration of memory can be a relatively high-overhead operation,
since it requires pinning of buffers, assignment of attributes (e.g. since it requires pinning of buffers, assignment of attributes (e.g.
skipping to change at page 65, line 10 skipping to change at page 65, line 28
flow control and will terminate a connection in error when limits are flow control and will terminate a connection in error when limits are
exceeded. Limits such as maximum number of requests outstanding are exceeded. Limits such as maximum number of requests outstanding are
therefore negotiated when a session is created (see the therefore negotiated when a session is created (see the
ca_maxrequests field in Section 18.36). These limits then provide ca_maxrequests field in Section 18.36). These limits then provide
the maxima which each connection associated with the session's the maxima which each connection associated with the session's
channel(s) must remain within. RDMA connections are managed within channel(s) must remain within. RDMA connections are managed within
these limits as described in section 3.3 ("Flow Control"[[Comment.2: these limits as described in section 3.3 ("Flow Control"[[Comment.2:
RFC Editor: please verify section and title of the RPCRDMA document RFC Editor: please verify section and title of the RPCRDMA document
which is currently at which is currently at
http://tools.ietf.org/html/draft-ietf-nfsv4-rpcrdma-08#section-3.3]]) http://tools.ietf.org/html/draft-ietf-nfsv4-rpcrdma-08#section-3.3]])
of [8]; if there are multiple RDMA connections, then the maximum of [9]; if there are multiple RDMA connections, then the maximum
number of requests for a channel will be divided among the RDMA number of requests for a channel will be divided among the RDMA
connections. Put a different way, the onus is on the replier to connections. Put a different way, the onus is on the replier to
ensure that the total number of RDMA credits across all connections ensure that the total number of RDMA credits across all connections
associated with the replier's channel does not exceed the channel's associated with the replier's channel does not exceed the channel's
maximum number of outstanding requests. maximum number of outstanding requests.
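
One way a replier might divide a channel's request limit among several RDMA connections is sketched below. The function name and the even-split policy are assumptions made for illustration; the text only requires that the per-connection credits stay within the channel's negotiated maximum (ca_maxrequests).

```python
from typing import List


def divide_credits(ca_maxrequests: int, num_connections: int) -> List[int]:
    """Split a channel's request limit across its RDMA connections."""
    if num_connections < 1:
        raise ValueError("at least one connection is required")
    base, extra = divmod(ca_maxrequests, num_connections)
    # The first `extra` connections get one additional credit, so the
    # per-connection grants sum exactly to ca_maxrequests.
    return [base + (1 if i < extra else 0) for i in range(num_connections)]


credits = divide_credits(ca_maxrequests=10, num_connections=3)
```

Any other policy (for example, weighting by link speed) would serve equally well, provided the sum of the grants stays within the channel maximum.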
The limits may also be modified dynamically at the replier's choosing The limits may also be modified dynamically at the replier's choosing
by manipulating certain parameters present in each NFSv4.1 reply. In by manipulating certain parameters present in each NFSv4.1 reply. In
addition, the CB_RECALL_SLOT callback operation (see Section 20.8) addition, the CB_RECALL_SLOT callback operation (see Section 20.8)
can be sent by a server to a client to return RDMA credits to the can be sent by a server to a client to return RDMA credits to the
server, thereby lowering the maximum number of requests a client can server, thereby lowering the maximum number of requests a client can
have outstanding to the server. have outstanding to the server.
2.10.7.3. Padding 2.10.7.3. Padding
Header padding is requested by each peer at session initiation (see Header padding is requested by each peer at session initiation (see
the ca_headerpadsize argument to CREATE_SESSION in Section 18.36), the ca_headerpadsize argument to CREATE_SESSION in Section 18.36),
and subsequently used by the RPC RDMA layer, as described in [8]. and subsequently used by the RPC RDMA layer, as described in [9].
Zero padding is permitted. Zero padding is permitted.
Padding leverages the useful property that RDMA preserve alignment of Padding leverages the useful property that RDMA preserve alignment of
data, even when they are placed into anonymous (untagged) buffers. data, even when they are placed into anonymous (untagged) buffers.
If requested, client inline writes will insert appropriate pad bytes If requested, client inline writes will insert appropriate pad bytes
within the request header to align the data payload on the specified within the request header to align the data payload on the specified
boundary. The client is encouraged to add sufficient padding (up to boundary. The client is encouraged to add sufficient padding (up to
the negotiated size) so that the "data" field of the NFSv4.1 WRITE the negotiated size) so that the "data" field of the NFSv4.1 WRITE
operation is aligned. Most servers can make good use of such operation is aligned. Most servers can make good use of such
padding, which allows them to chain receive buffers in such a way padding, which allows them to chain receive buffers in such a way
skipping to change at page 66, line 31 skipping to change at page 66, line 49
posted receive if unused by the actual received request, or may pass posted receive if unused by the actual received request, or may pass
the now-complete buffers by reference for normal write processing. the now-complete buffers by reference for normal write processing.
For a server which can make use of it, this removes any need for data For a server which can make use of it, this removes any need for data
copies of incoming data, without resorting to complicated end-to-end copies of incoming data, without resorting to complicated end-to-end
buffer advertisement and management. This includes most kernel-based buffer advertisement and management. This includes most kernel-based
and integrated server designs, among many others. The client may and integrated server designs, among many others. The client may
perform similar optimizations, if desired. perform similar optimizations, if desired.
2.10.7.4. Dual RDMA and Non-RDMA Transports 2.10.7.4. Dual RDMA and Non-RDMA Transports
Some RDMA transports (e.g., [10]), permit a "streaming" (non-RDMA) Some RDMA transports (e.g., [11]), permit a "streaming" (non-RDMA)
phase, where ordinary traffic might flow before "stepping up" to RDMA phase, where ordinary traffic might flow before "stepping up" to RDMA
mode, commencing RDMA traffic. Some RDMA transports start mode, commencing RDMA traffic. Some RDMA transports start
connections always in RDMA mode. NFSv4.1 allows, but does not connections always in RDMA mode. NFSv4.1 allows, but does not
assume, a streaming phase before RDMA mode. When a connection is assume, a streaming phase before RDMA mode. When a connection is
associated with a session, the client and server negotiate whether associated with a session, the client and server negotiate whether
the connection is used in RDMA or non-RDMA mode (see Section 18.36 the connection is used in RDMA or non-RDMA mode (see Section 18.36
and Section 18.34). and Section 18.34).
2.10.8. Sessions Security 2.10.8. Sessions Security
2.10.8.1. Session Callback Security 2.10.8.1. Session Callback Security
Via session / connection association, NFSv4.1 improves security over Via session / connection association, NFSv4.1 improves security over
that provided by NFSv4.0 for the backchannel. The connection is that provided by NFSv4.0 for the backchannel. The connection is
client-initiated (see Section 18.34), and subject to the same client-initiated (see Section 18.34), and subject to the same
firewall and routing checks as the fore channel. The connection firewall and routing checks as the fore channel. At the client's
cannot be hijacked by an attacker who connects to the client port option (see Section 18.35), connection association is fully
prior to the intended server as is possible with NFSv4.0. At the
client's option (see Section 18.35), connection association is fully
authenticated before being activated (see Section 18.34). Traffic authenticated before being activated (see Section 18.34). Traffic
from the server over the backchannel is authenticated exactly as the from the server over the backchannel is authenticated exactly as the
client specifies (see Section 2.10.8.2). client specifies (see Section 2.10.8.2).
2.10.8.2. Backchannel RPC Security 2.10.8.2. Backchannel RPC Security
When the NFSv4.1 client establishes the backchannel, it informs the When the NFSv4.1 client establishes the backchannel, it informs the
server of the security flavors and principals to use when sending server of the security flavors and principals to use when sending
requests. If the security flavor is RPCSEC_GSS, the client expresses requests. If the security flavor is RPCSEC_GSS, the client expresses
the principal in the form of an established RPCSEC_GSS context. The the principal in the form of an established RPCSEC_GSS context. The
server is free to use any of the flavor/principal combinations the server is free to use any of the flavor/principal combinations the
client offers, but it MUST NOT use unoffered combinations. This way, client offers, but it MUST NOT use unoffered combinations. This way,
the client need not provide a target GSS principal for the the client need not provide a target GSS principal for the
backchannel as it did with NFSv4.0, nor the server have to implement backchannel as it did with NFSv4.0, nor the server have to implement
an RPCSEC_GSS initiator as it did with NFSv4.0 [20]. an RPCSEC_GSS initiator as it did with NFSv4.0 [30].
The CREATE_SESSION (Section 18.36) and BACKCHANNEL_CTL The CREATE_SESSION (Section 18.36) and BACKCHANNEL_CTL
(Section 18.33) operations allow the client to specify flavor/ (Section 18.33) operations allow the client to specify flavor/
principal combinations. principal combinations.
Also note that the SP4_SSV state protection mode (see Section 18.35 Also note that the SP4_SSV state protection mode (see Section 18.35
and Section 2.10.8.3) has the side benefit of providing SSV-derived and Section 2.10.8.3) has the side benefit of providing SSV-derived
RPCSEC_GSS contexts (Section 2.10.9). RPCSEC_GSS contexts (Section 2.10.9).
2.10.8.3. Protection from Unauthorized State Changes 2.10.8.3. Protection from Unauthorized State Changes
skipping to change at page 69, line 14 skipping to change at page 69, line 32
expires, then session and client ID maintenance cannot occur, but expires, then session and client ID maintenance cannot occur, but
since the client has a single user, only that user is since the client has a single user, only that user is
inconvenienced. inconvenienced.
3. The physical client has multiple users, but the client 3. The physical client has multiple users, but the client
implementation has a unique client ID for each user. This is implementation has a unique client ID for each user. This is
effectively the same as the second scenario, but a disadvantage effectively the same as the second scenario, but a disadvantage
is that each user must be allocated at least one session each, so is that each user must be allocated at least one session each, so
the approach suffers from lack of economy. the approach suffers from lack of economy.
The SP4_SSV protection option uses a Secret State Verifier (SSV) The SP4_SSV protection option uses the SSV (Section 1.5), via
which is shared between a client and server. The SSV serves as the RPCSEC_GSS and the SSV GSS mechanism (Section 2.10.9) to protect
secret key for an internal (that is, internal to NFSv4.1) GSS state from attack. The SP4_SSV protection option is intended for the
mechanism that uses the secret key for Message Integrity Code (MIC) situation comprised of a client that has multiple active users, and a
and Wrap tokens (Section 2.10.9). The SP4_SSV protection option is system administrator who wants to avoid the burden of installing a
intended for the client that has multiple users, and the system permanent machine credential on each client. The SSV is established
administrator does not wish to configure a permanent machine and updated on the server via SET_SSV (see Section 18.47). To
credential for each client. The SSV is established on the server via prevent eavesdropping, a client SHOULD send SET_SSV via RPCSEC_GSS
SET_SSV (see Section 18.47). To prevent eavesdropping, a client with the privacy service. Several aspects of the SSV make it
SHOULD send SET_SSV via RPCSEC_GSS with the privacy service. Several intractable for an attacker to guess the SSV, and thus associate
aspects of the SSV make it intractable for an attacker to guess the rogue connections with a session, and rogue sessions with a client
SSV, and thus associate rogue connections with a session, and rogue ID:
sessions with a client ID:
o The arguments to and results of SET_SSV include digests of the old o The arguments to and results of SET_SSV include digests of the old
and new SSV, respectively. and new SSV, respectively.
o Because the initial value of the SSV is zero, therefore known, the o Because the initial value of the SSV is zero, therefore known, the
client that opts for SP4_SSV protection and opts to apply SP4_SSV client that opts for SP4_SSV protection and opts to apply SP4_SSV
protection to BIND_CONN_TO_SESSION and CREATE_SESSION MUST send at protection to BIND_CONN_TO_SESSION and CREATE_SESSION MUST send at
least one SET_SSV operation before the first BIND_CONN_TO_SESSION least one SET_SSV operation before the first BIND_CONN_TO_SESSION
operation or before the second CREATE_SESSION operation on a operation or before the second CREATE_SESSION operation on a
client ID. If it does not, the SSV mechanism will not generate client ID. If it does not, the SSV mechanism will not generate
tokens (Section 2.10.9). A client SHOULD send SET_SSV as soon as tokens (Section 2.10.9). A client SHOULD send SET_SSV as soon as
a session is created. a session is created.
o A SET_SSV does not replace the SSV with the argument to SET_SSV. o A SET_SSV request does not replace the SSV with the argument to
Instead, the current SSV on the server is logically exclusive ORed SET_SSV. Instead, the current SSV on the server is logically
(XORed) with the argument to SET_SSV. Each time a new principal exclusive ORed (XORed) with the argument to SET_SSV. Each time a
uses a client ID for the first time, the client SHOULD send a new principal uses a client ID for the first time, the client
SET_SSV with that principal's RPCSEC_GSS credentials, with SHOULD send a SET_SSV with that principal's RPCSEC_GSS
RPCSEC_GSS service set to RPC_GSS_SVC_PRIVACY. credentials, with RPCSEC_GSS service set to RPC_GSS_SVC_PRIVACY.
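
The XOR update described in the last item can be sketched as follows. The function and variable names are hypothetical, and the 32-byte SSV length is an assumption chosen only for the example.

```python
def apply_set_ssv(current_ssv: bytes, set_ssv_arg: bytes) -> bytes:
    """Return the new SSV: the current SSV XORed with the SET_SSV argument."""
    if len(current_ssv) != len(set_ssv_arg):
        raise ValueError("SSV and SET_SSV argument must be the same length")
    return bytes(a ^ b for a, b in zip(current_ssv, set_ssv_arg))


SSV_LEN = 32                      # assumed length, for illustration only
ssv = bytes(SSV_LEN)              # the initial SSV is zero, hence known
ssv = apply_set_ssv(ssv, bytes(range(SSV_LEN)))         # first principal
ssv = apply_set_ssv(ssv, bytes([0xA5]) * SSV_LEN)       # second principal
```

Because each contribution is folded into the existing value rather than replacing it, a principal that knows only its own SET_SSV argument does not thereby know the resulting SSV once another principal has contributed.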
Here are the types of attacks that can be attempted by an attacker Here are the types of attacks that can be attempted by an attacker
named Eve on a victim named Bob, and how SP4_SSV protection foils named Eve on a victim named Bob, and how SP4_SSV protection foils
each attack: each attack:
o Suppose Eve is the first user to log into a legitimate client. o Suppose Eve is the first user to log into a legitimate client.
Eve's use of an NFSv4.1 file system will cause the legitimate Eve's use of an NFSv4.1 file system will cause the legitimate
client to create a client ID with SP4_SSV protection, specifying client to create a client ID with SP4_SSV protection, specifying
that the BIND_CONN_TO_SESSION operation MUST use the SSV that the BIND_CONN_TO_SESSION operation MUST use the SSV
credential. Eve's use of the file system also causes an SSV to be credential. Eve's use of the file system also causes an SSV to be
skipping to change at page 72, line 11 skipping to change at page 72, line 26
is to prevent connection hijacking, the use of IPsec is RECOMMENDED. is to prevent connection hijacking, the use of IPsec is RECOMMENDED.
If a connection hijack occurs, the hijacker could in theory change If a connection hijack occurs, the hijacker could in theory change
locking state and negatively impact the service to legitimate locking state and negatively impact the service to legitimate
clients. However if the server is configured to require the use of clients. However if the server is configured to require the use of
RPCSEC_GSS with integrity or privacy on the affected file objects, RPCSEC_GSS with integrity or privacy on the affected file objects,
and if EXCHGID4_FLAG_BIND_PRINC_STATEID capability (Section 18.35), and if EXCHGID4_FLAG_BIND_PRINC_STATEID capability (Section 18.35),
is in force, this will thwart unauthorized attempts to change locking is in force, this will thwart unauthorized attempts to change locking
state. state.
2.10.9. The SSV GSS Mechanism 2.10.9. The Secret State Verifier (SSV) GSS Mechanism
The SSV provides the secret key for a mechanism that NFSv4.1 uses for The SSV provides the secret key for a GSS mechanism internal to
state protection. Contexts for this mechanism are not established NFSv4.1 that NFSv4.1 uses for state protection. Contexts for this
via the RPCSEC_GSS protocol. Instead, the contexts are automatically mechanism are not established via the RPCSEC_GSS protocol. Instead,
created when EXCHANGE_ID specifies SP4_SSV protection. The only the contexts are automatically created when EXCHANGE_ID specifies
tokens defined are the PerMsgToken (emitted by GSS_GetMIC) and the SP4_SSV protection. The only tokens defined are the PerMsgToken
SealedMessage token (emitted by GSS_Wrap). (emitted by GSS_GetMIC) and the SealedMessage token (emitted by
GSS_Wrap).
The mechanism OID for the SSV mechanism is: The mechanism OID for the SSV mechanism is:
iso.org.dod.internet.private.enterprise.Michael Eisler.nfs.ssv_mech iso.org.dod.internet.private.enterprise.Michael Eisler.nfs.ssv_mech
(1.3.6.1.4.1.28882.1.1). While the SSV mechanism does not define any (1.3.6.1.4.1.28882.1.1). While the SSV mechanism does not define any
initial context tokens, the OID can be used to let servers indicate initial context tokens, the OID can be used to let servers indicate
that the SSV mechanism is acceptable whenever the client sends a that the SSV mechanism is acceptable whenever the client sends a
SECINFO or SECINFO_NO_NAME operation (see Section 2.6). SECINFO or SECINFO_NO_NAME operation (see Section 2.6).
The SSV mechanism defines four subkeys derived from the SSV value. The SSV mechanism defines four subkeys derived from the SSV value.
Each time SET_SSV is invoked the subkeys are recalculated by the Each time SET_SSV is invoked the subkeys are recalculated by the
client and server. The calculation of each of the four subkeys client and server. The calculation of each of the four subkeys
depends on each of the four respective ssv_subkey4 enumerated values. depends on each of the four respective ssv_subkey4 enumerated values.
The calculation uses the HMAC [11] algorithm, using the current SSV The calculation uses the HMAC [12] algorithm, using the current SSV
as the key, the one way hash algorithm as negotiated by EXCHANGE_ID, as the key, the one way hash algorithm as negotiated by EXCHANGE_ID,
and the input text as represented by the XDR encoded enumeration of and the input text as represented by the XDR encoded enumeration
type ssv_subkey4. value for that subkey of data type ssv_subkey4.
/* Input for computing subkeys */ /* Input for computing subkeys */
enum ssv_subkey4 { enum ssv_subkey4 {
SSV4_SUBKEY_MIC_I2T = 1, SSV4_SUBKEY_MIC_I2T = 1,
SSV4_SUBKEY_MIC_T2I = 2, SSV4_SUBKEY_MIC_T2I = 2,
SSV4_SUBKEY_SEAL_I2T = 3, SSV4_SUBKEY_SEAL_I2T = 3,
SSV4_SUBKEY_SEAL_T2I = 4 SSV4_SUBKEY_SEAL_T2I = 4
}; };
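
The subkey calculation described above can be sketched as below. This is a minimal illustration, not a normative implementation: the function name is hypothetical, and SHA-256 stands in for whatever one way hash algorithm EXCHANGE_ID actually negotiated. An XDR enumeration is encoded as a four byte big-endian signed integer.

```python
import hashlib
import hmac
import struct

SSV4_SUBKEY_MIC_I2T = 1
SSV4_SUBKEY_MIC_T2I = 2
SSV4_SUBKEY_SEAL_I2T = 3
SSV4_SUBKEY_SEAL_T2I = 4


def derive_subkey(ssv: bytes, subkey_id: int, hash_name: str = "sha256") -> bytes:
    """HMAC keyed with the current SSV over the XDR-encoded enum value."""
    xdr_enum = struct.pack(">i", subkey_id)  # XDR enum: 4 bytes, big-endian
    return hmac.new(ssv, xdr_enum, hash_name).digest()


ssv = bytes(range(32))  # placeholder SSV value, for illustration only
subkeys = {
    n: derive_subkey(ssv, n)
    for n in (SSV4_SUBKEY_MIC_I2T, SSV4_SUBKEY_MIC_T2I,
              SSV4_SUBKEY_SEAL_I2T, SSV4_SUBKEY_SEAL_T2I)
}
```

Since the SSV is the HMAC key, every SET_SSV changes all four subkeys, which is why both client and server recalculate them each time.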
The subkey derived from SSV4_SUBKEY_MIC_I2T is used for calculating The subkey derived from SSV4_SUBKEY_MIC_I2T is used for calculating
skipping to change at page 73, line 28 skipping to change at page 73, line 43
uint32_t smt_ssv_seq; uint32_t smt_ssv_seq;
opaque smt_hmac<>; opaque smt_hmac<>;
}; };
The field smt_hmac is an HMAC calculated by using the subkey derived The field smt_hmac is an HMAC calculated by using the subkey derived
from SSV4_SUBKEY_MIC_I2T or SSV4_SUBKEY_MIC_T2I as the key, the one from SSV4_SUBKEY_MIC_I2T or SSV4_SUBKEY_MIC_T2I as the key, the one
way hash algorithm as negotiated by EXCHANGE_ID, and the input text way hash algorithm as negotiated by EXCHANGE_ID, and the input text
as represented by data of type ssv_mic_plain_tkn4. The field as represented by data of type ssv_mic_plain_tkn4. The field
smpt_ssv_seq is the same as smt_ssv_seq. The field smpt_orig_plain smpt_ssv_seq is the same as smt_ssv_seq. The field smpt_orig_plain
is the "message" input passed to GSS_GetMIC() (see Section 2.3.1 of is the "message" input passed to GSS_GetMIC() (see Section 2.3.1 of
[7]). The caller of GSS_GetMIC() provides a pointer to a buffer [8]). The caller of GSS_GetMIC() provides a pointer to a buffer
containing the plain text. The SSV mechanism's entry point for containing the plain text. The SSV mechanism's entry point for
GSS_GetMIC() encodes this into an opaque array, and the encoding will GSS_GetMIC() encodes this into an opaque array, and the encoding will
include an initial four byte length, plus any necessary padding. include an initial four byte length, plus any necessary padding.
Prepended to this will be the XDR encoded value of smpt_ssv_seq thus Prepended to this will be the XDR encoded value of smpt_ssv_seq thus
making up an XDR encoding of a value of data type ssv_mic_plain_tkn4, making up an XDR encoding of a value of data type ssv_mic_plain_tkn4,
which in turn is the input into the HMAC. which in turn is the input into the HMAC.
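
The construction of the HMAC input described above can be sketched as follows. The helper names are hypothetical and SHA-256 is again only a stand-in for the negotiated hash algorithm; the layout follows XDR rules, where a variable length opaque is a four byte length, the data, and zero padding to a multiple of four bytes.

```python
import hashlib
import hmac
import struct


def xdr_opaque(data: bytes) -> bytes:
    """XDR variable-length opaque: length, data, zero pad to 4 bytes."""
    pad = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * pad


def ssv_mic_hmac(mic_subkey: bytes, ssv_seq: int, message: bytes,
                 hash_name: str = "sha256") -> bytes:
    """HMAC over the XDR encoding of an ssv_mic_plain_tkn4 value."""
    # smpt_ssv_seq is prepended to the encoded smpt_orig_plain.
    plain_tkn = struct.pack(">I", ssv_seq) + xdr_opaque(message)
    return hmac.new(mic_subkey, plain_tkn, hash_name).digest()


mic = ssv_mic_hmac(mic_subkey=b"k" * 32, ssv_seq=1, message=b"abcde")
```

A verifier that recomputes the HMAC over the same sequence number and message, using the same subkey, should obtain an identical value.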
The token emitted by GSS_GetMIC() is XDR encoded and of XDR data type The token emitted by GSS_GetMIC() is XDR encoded and of XDR data type
ssv_mic_tkn4. The field smt_ssv_seq comes from the SSV sequence ssv_mic_tkn4. The field smt_ssv_seq comes from the SSV sequence
number which is equal to 1 after SET_SSV (Section 18.47) is called number which is equal to 1 after SET_SSV (Section 18.47) is called
the first time on a client ID. Thereafter, it is incremented on each the first time on a client ID. Thereafter, it is incremented on each
SET_SSV. Thus smt_ssv_seq represents the version of the SSV at the SET_SSV. Thus smt_ssv_seq represents the version of the SSV at the
time GSS_GetMIC() was called. As noted in Section 18.35, the client time GSS_GetMIC() was called. As noted in Section 18.35, the client
skipping to change at page 75, line 11 skipping to change at page 75, line 26
key is the subkey derived from SSV4_SUBKEY_MIC_I2T or key is the subkey derived from SSV4_SUBKEY_MIC_I2T or
SSV4_SUBKEY_MIC_T2I, and the one way hash algorithm is that SSV4_SUBKEY_MIC_T2I, and the one way hash algorithm is that
negotiated by EXCHANGE_ID. negotiated by EXCHANGE_ID.
The sspt_confounder field is a random value. The sspt_confounder field is a random value.
The sspt_ssv_seq field is the same as ssvt_ssv_seq. The sspt_ssv_seq field is the same as ssvt_ssv_seq.
The sspt_orig_plain field is the original plaintext and is the The sspt_orig_plain field is the original plaintext and is the
"input_message" input passed to GSS_Wrap() (see Section 2.3.3 of "input_message" input passed to GSS_Wrap() (see Section 2.3.3 of
[7]). As with the handling of the plaintext by the SSV mechanism's [8]). As with the handling of the plaintext by the SSV mechanism's
GSS_GetMIC() entry point, the entry point for GSS_Wrap() expects a GSS_GetMIC() entry point, the entry point for GSS_Wrap() expects a
pointer to the plaintext, and will XDR encode an opaque array into pointer to the plaintext, and will XDR encode an opaque array into
sspt_orig_plain representing the plain text, along with the other sspt_orig_plain representing the plain text, along with the other
fields of an instance of data type ssv_seal_plain_tkn4. fields of an instance of data type ssv_seal_plain_tkn4.
The sspt_pad field is present to support encryption algorithms that The sspt_pad field is present to support encryption algorithms that
require inputs to be in fixed sized blocks. The content of sspt_pad require inputs to be in fixed sized blocks. The content of sspt_pad
is zero filled except for the length. Beware that the XDR encoding is zero filled except for the length. Beware that the XDR encoding
of ssv_seal_plain_tkn4 contains three variable length arrays, and so of ssv_seal_plain_tkn4 contains three variable length arrays, and so
each array consumes four bytes for an array length, and each array each array consumes four bytes for an array length, and each array
skipping to change at page 81, line 17 skipping to change at page 81, line 39
4. If the client knows of no other connections associated with the 4. If the client knows of no other connections associated with the
session ID, and server network addresses that are, or have been session ID, and server network addresses that are, or have been
associated with the session ID, then the client can use DNS to associated with the session ID, then the client can use DNS to
find other network addresses. If it does not, or if DNS does not find other network addresses. If it does not, or if DNS does not
find any other addresses for the server, then the client will be find any other addresses for the server, then the client will be
unable to provide NFSv4.1 service, and fatal errors should be unable to provide NFSv4.1 service, and fatal errors should be
returned to processes that were using the server. If the client returned to processes that were using the server. If the client
is using a "mount" paradigm, unmounting the server is advised. is using a "mount" paradigm, unmounting the server is advised.
If there is a reconfiguration event which results in the same network If there is a reconfiguration event which results in the same network
being assigned to servers where the eir_server_scope value is address being assigned to servers where the eir_server_scope value is
different, it cannot be guaranteed that a session ID generated by the different, it cannot be guaranteed that a session ID generated by the
first will be recognized as invalid by the second. Therefore, in first will be recognized as invalid by the second. Therefore, in
managing server reconfigurations among servers with different server managing server reconfigurations among servers with different server
scope values, it is necessary to make sure that all clients have scope values, it is necessary to make sure that all clients have
disconnected from the first server before effecting the disconnected from the first server before effecting the
reconfiguration. Nonetheless, clients should not assume that servers reconfiguration. Nonetheless, clients should not assume that servers
will always adhere to this requirement; clients MUST be prepared to will always adhere to this requirement; clients MUST be prepared to
deal with unexpected effects of server reconfigurations. Even where deal with unexpected effects of server reconfigurations. Even where
a session ID is inappropriately recognized as valid, it is likely a session ID is inappropriately recognized as valid, it is likely
that either the connection will not be recognized as valid, or that a that either the connection will not be recognized as valid, or that a
skipping to change at page 83, line 37 skipping to change at page 84, line 12
created under the client ID, and to allow the server to indicate how created under the client ID, and to allow the server to indicate how
it will allow the sessions to be used. See Section 13.1 for pNFS it will allow the sessions to be used. See Section 13.1 for pNFS
sessions considerations. sessions considerations.
3. Protocol Constants and Data Types 3. Protocol Constants and Data Types
The syntax and semantics to describe the data types of the NFSv4.1 The syntax and semantics to describe the data types of the NFSv4.1
protocol are defined in the XDR RFC4506 [2] and RPC RFC1831 [3] protocol are defined in the XDR RFC4506 [2] and RPC RFC1831 [3]
documents. The next sections build upon the XDR data types to define documents. The next sections build upon the XDR data types to define
constants, types and structures specific to this protocol. The full constants, types and structures specific to this protocol. The full
list of XDR data types is in [12]. list of XDR data types is in [13].
3.1. Basic Constants 3.1. Basic Constants
const NFS4_FHSIZE = 128; const NFS4_FHSIZE = 128;
const NFS4_VERIFIER_SIZE = 8; const NFS4_VERIFIER_SIZE = 8;
const NFS4_OPAQUE_LIMIT = 1024; const NFS4_OPAQUE_LIMIT = 1024;
const NFS4_SESSIONID_SIZE = 16; const NFS4_SESSIONID_SIZE = 16;
const NFS4_INT64_MAX = 0x7fffffffffffffff; const NFS4_INT64_MAX = 0x7fffffffffffffff;
const NFS4_UINT64_MAX = 0xffffffffffffffff; const NFS4_UINT64_MAX = 0xffffffffffffffff;
skipping to change at page 85, line 18 skipping to change at page 85, line 31
| | Used for file/directory attributes. | | | Used for file/directory attributes. |
| bitmap4 | typedef uint32_t bitmap4<>; | | bitmap4 | typedef uint32_t bitmap4<>; |
| | Used in attribute array encoding. | | | Used in attribute array encoding. |
| changeid4 | typedef uint64_t changeid4; | | changeid4 | typedef uint64_t changeid4; |
| | Used in the definition of change_info4. | | | Used in the definition of change_info4. |
| clientid4 | typedef uint64_t clientid4; | | clientid4 | typedef uint64_t clientid4; |
| | Shorthand reference to client identification. | | | Shorthand reference to client identification. |
| count4 | typedef uint32_t count4; | | count4 | typedef uint32_t count4; |
| | Various count parameters (READ, WRITE, COMMIT). | | | Various count parameters (READ, WRITE, COMMIT). |
| length4 | typedef uint64_t length4; | | length4 | typedef uint64_t length4; |
| | Describes LOCK lengths. | | | The length of a byte range within a file. |
| mode4 | typedef uint32_t mode4; | | mode4 | typedef uint32_t mode4; |
| | Mode attribute data type. | | | Mode attribute data type. |
| nfs_cookie4 | typedef uint64_t nfs_cookie4; | | nfs_cookie4 | typedef uint64_t nfs_cookie4; |
| | Opaque cookie value for READDIR. | | | Opaque cookie value for READDIR. |
| nfs_fh4 | typedef opaque nfs_fh4<NFS4_FHSIZE>; | | nfs_fh4 | typedef opaque nfs_fh4<NFS4_FHSIZE>; |
| | Filehandle definition. | | | Filehandle definition. |
| nfs_ftype4 | enum nfs_ftype4; | | nfs_ftype4 | enum nfs_ftype4; |
| | Various defined file types. | | | Various defined file types. |
| nfsstat4 | enum nfsstat4; | | nfsstat4 | enum nfsstat4; |
| | Return value for operations. | | | Return value for operations. |
| offset4 | typedef uint64_t offset4; | | offset4 | typedef uint64_t offset4; |
| | Various offset designations (READ, WRITE, LOCK, | | | Various offset designations (READ, WRITE, LOCK, |
| | COMMIT). | | | COMMIT). |
| qop4 | typedef uint32_t qop4; | | qop4 | typedef uint32_t qop4; |
| | Quality of protection designation in SECINFO. | | | Quality of protection designation in SECINFO. |
| sec_oid4 | typedef opaque sec_oid4<>; | | sec_oid4 | typedef opaque sec_oid4<>; |
| | Security Object Identifier. The sec_oid4 data | | | Security Object Identifier. The sec_oid4 data |
| | type is not really opaque. Instead it contains an | | | type is not really opaque. Instead it contains an |
| | ASN.1 OBJECT IDENTIFIER as used by GSS-API in the | | | ASN.1 OBJECT IDENTIFIER as used by GSS-API in the |
| | mech_type argument to GSS_Init_sec_context. See | | | mech_type argument to GSS_Init_sec_context. See |
| | [7] for details. | | | [8] for details. |
| sequenceid4 | typedef uint32_t sequenceid4; | | sequenceid4 | typedef uint32_t sequenceid4; |
| | Sequence number used for various session | | | Sequence number used for various session |
| | operations (EXCHANGE_ID, CREATE_SESSION, | | | operations (EXCHANGE_ID, CREATE_SESSION, |
| | SEQUENCE, CB_SEQUENCE). | | | SEQUENCE, CB_SEQUENCE). |
| seqid4 | typedef uint32_t seqid4; | | seqid4 | typedef uint32_t seqid4; |
| | Sequence identifier used for file locking. | | | Sequence identifier used for file locking. |
| sessionid4 | typedef opaque sessionid4[NFS4_SESSIONID_SIZE]; | | sessionid4 | typedef opaque sessionid4[NFS4_SESSIONID_SIZE]; |
| | Session identifier. | | | Session identifier. |
| slotid4 | typedef uint32_t slotid4; | | slotid4 | typedef uint32_t slotid4; |
| | Sequencing artifact for various session | | | Sequencing artifact for various session |
skipping to change at page 86, line 15 skipping to change at page 86, line 27
| utf8str_cis | typedef utf8string utf8str_cis; | | utf8str_cis | typedef utf8string utf8str_cis; |
| | Case-insensitive UTF-8 string. | | | Case-insensitive UTF-8 string. |
| utf8str_cs | typedef utf8string utf8str_cs; | | utf8str_cs | typedef utf8string utf8str_cs; |
| | Case-sensitive UTF-8 string. | | | Case-sensitive UTF-8 string. |
| utf8str_mixed | typedef utf8string utf8str_mixed; | | utf8str_mixed | typedef utf8string utf8str_mixed; |
| | UTF-8 strings with a case sensitive prefix and a | | | UTF-8 strings with a case sensitive prefix and a |
| | case insensitive suffix. | | | case insensitive suffix. |
| component4 | typedef utf8str_cs component4; | | component4 | typedef utf8str_cs component4; |
| | Represents path name components. | | | Represents path name components. |
| linktext4 | typedef utf8str_cs linktext4; | | linktext4 | typedef utf8str_cs linktext4; |
| | Symbolic link contents. | | | Symbolic link contents ("symbolic link" is |
| | defined in an Open Group [14] standard). |
| pathname4 | typedef component4 pathname4<>; | | pathname4 | typedef component4 pathname4<>; |
| | Represents path name for fs_locations. | | | Represents path name for fs_locations. |
| verifier4 | typedef opaque verifier4[NFS4_VERIFIER_SIZE]; | | verifier4 | typedef opaque verifier4[NFS4_VERIFIER_SIZE]; |
| | Verifier used for various operations (COMMIT, | | | Verifier used for various operations (COMMIT, |
| | CREATE, EXCHANGE_ID, OPEN, READDIR, WRITE) | | | CREATE, EXCHANGE_ID, OPEN, READDIR, WRITE) |
| | NFS4_VERIFIER_SIZE is defined as 8. | | | NFS4_VERIFIER_SIZE is defined as 8. |
+---------------+---------------------------------------------------+ +---------------+---------------------------------------------------+
End of Base Data Types End of Base Data Types
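To make the bitmap4 entry in the table above more concrete, here is a small sketch of setting and testing attribute bits in an array of 32-bit words. It assumes the convention used elsewhere in the NFSv4.1 specification, that attribute number n occupies bit (n mod 32) of word (n div 32). The function names are hypothetical.

```python
from typing import List


def bitmap4_set(bitmap: List[int], attr: int) -> List[int]:
    """Return a copy of the bitmap with the bit for attribute `attr` set,
    growing the word array if needed."""
    word, bit = divmod(attr, 32)
    words = list(bitmap) + [0] * max(0, word + 1 - len(bitmap))
    words[word] |= 1 << bit
    return words


def bitmap4_test(bitmap: List[int], attr: int) -> bool:
    """Report whether the bit for attribute `attr` is set."""
    word, bit = divmod(attr, 32)
    return word < len(bitmap) and bool(bitmap[word] & (1 << bit))


if __name__ == "__main__":
    # Attribute numbers 41 (rawdev) and 47 (time_access) both land in word 1.
    bm = bitmap4_set(bitmap4_set([], 41), 47)
    print([hex(w) for w in bm])
```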
skipping to change at page 89, line 15 skipping to change at page 89, line 28
3.3.9. netaddr4 3.3.9. netaddr4
struct netaddr4 { struct netaddr4 {
/* see struct rpcb in RFC 1833 */ /* see struct rpcb in RFC 1833 */
string na_r_netid<>; /* network id */ string na_r_netid<>; /* network id */
string na_r_addr<>; /* universal address */ string na_r_addr<>; /* universal address */
}; };
The netaddr4 data type is used to identify network transport The netaddr4 data type is used to identify network transport
endpoints. The r_netid and r_addr fields respectively contain a endpoints. The r_netid and r_addr fields respectively contain a
netid and uaddr. The netid and uaddr concepts are defined in [13]. netid and uaddr. The netid and uaddr concepts are defined in [15].
The netid and uaddr formats for TCP over IPv4 and TCP over IPv6 are The netid and uaddr formats for TCP over IPv4 and TCP over IPv6 are
defined in [13], specifically Tables 2 and 3 and Sections 3.2.3.3 and defined in [15], specifically Tables 2 and 3 and Sections 3.2.3.3 and
3.2.3.4. 3.2.3.4.
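As a rough sketch of how a client might handle the na_r_addr field for TCP over IPv4, the code below assumes the conventional universal address form "h1.h2.h3.h4.p1.p2", in which p1 and p2 are the high and low octets of the port. The function names are hypothetical, and IPv6 handling is omitted.

```python
from typing import Tuple


def parse_tcp4_uaddr(uaddr: str) -> Tuple[str, int]:
    """Split an IPv4 universal address into (dotted host, port)."""
    parts = uaddr.split(".")
    if len(parts) != 6:
        raise ValueError("expected six dot-separated fields")
    octets = [int(p) for p in parts]
    if any(not 0 <= o <= 255 for o in octets):
        raise ValueError("each field must be in 0..255")
    host = ".".join(str(o) for o in octets[:4])
    port = octets[4] * 256 + octets[5]
    return host, port


def format_tcp4_uaddr(host: str, port: int) -> str:
    """Build an IPv4 universal address from a dotted host and a port."""
    if not 0 <= port <= 0xFFFF:
        raise ValueError("port out of range")
    return "%s.%d.%d" % (host, port >> 8, port & 0xFF)


if __name__ == "__main__":
    print(parse_tcp4_uaddr("192.0.2.10.8.1"))
```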
3.3.10. state_owner4 3.3.10. state_owner4
struct state_owner4 { struct state_owner4 {
clientid4 clientid; clientid4 clientid;
opaque owner<NFS4_OPAQUE_LIMIT>; opaque owner<NFS4_OPAQUE_LIMIT>;
}; };
typedef state_owner4 open_owner4; typedef state_owner4 open_owner4;
skipping to change at page 90, line 48 skipping to change at page 91, line 16
The layouttype4 data type is 32 bits in length. The range The layouttype4 data type is 32 bits in length. The range
represented by the layout type is split into three parts. Type 0x0 represented by the layout type is split into three parts. Type 0x0
is reserved. Types within the range 0x00000001-0x7FFFFFFF are is reserved. Types within the range 0x00000001-0x7FFFFFFF are
globally unique and are assigned according to the description in globally unique and are assigned according to the description in
Section 22.4; they are maintained by IANA. Types within the range Section 22.4; they are maintained by IANA. Types within the range
0x80000000-0xFFFFFFFF are site specific and for private use only. 0x80000000-0xFFFFFFFF are site specific and for private use only.
The LAYOUT4_NFSV4_1_FILES enumeration specifies that the NFSv4.1 file The LAYOUT4_NFSV4_1_FILES enumeration specifies that the NFSv4.1 file
layout type, as defined in Section 13, is to be used. The layout type, as defined in Section 13, is to be used. The
LAYOUT4_OSD2_OBJECTS enumeration specifies that the object layout, as LAYOUT4_OSD2_OBJECTS enumeration specifies that the object layout, as
defined in [29], is to be used. Similarly, the LAYOUT4_BLOCK_VOLUME defined in [39], is to be used. Similarly, the LAYOUT4_BLOCK_VOLUME
enumeration specifies that the block/volume layout, as defined in enumeration specifies that the block/volume layout, as defined in
[30], is to be used. [40], is to be used.
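The three-way split of the layouttype4 range described above can be sketched as a simple classifier. The function name and the returned labels are hypothetical; the boundaries are the ones stated in the text.

```python
def classify_layouttype(value: int) -> str:
    """Classify a 32-bit layout type value by the range it falls in."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("layouttype4 is a 32-bit value")
    if value == 0:
        return "reserved"
    if value <= 0x7FFFFFFF:
        return "iana-assigned"      # globally unique, maintained by IANA
    return "private-use"            # site specific


if __name__ == "__main__":
    for v in (0x0, 0x1, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF):
        print(hex(v), classify_layouttype(v))
```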
3.3.14. deviceid4 3.3.14. deviceid4
const NFS4_DEVICEID4_SIZE = 16; const NFS4_DEVICEID4_SIZE = 16;
typedef opaque deviceid4[NFS4_DEVICEID4_SIZE]; typedef opaque deviceid4[NFS4_DEVICEID4_SIZE];
Layout information includes device IDs that specify a storage device Layout information includes device IDs that specify a storage device
through a compact handle. Addressing and type information is through a compact handle. Addressing and type information is
obtained with the GETDEVICEINFO operation. Device IDs are not obtained with the GETDEVICEINFO operation. Device IDs are not
skipping to change at page 91, line 36 skipping to change at page 91, line 51
storage device. Different layout types will require different data storage device. Different layout types will require different data
types to define how they communicate with storage devices. The types to define how they communicate with storage devices. The
opaque da_addr_body field must be interpreted based on the specified opaque da_addr_body field must be interpreted based on the specified
da_layout_type field. da_layout_type field.
This document defines the device address for the NFSv4.1 file layout This document defines the device address for the NFSv4.1 file layout
(see Section 13.3), which identifies a storage device by network IP (see Section 13.3), which identifies a storage device by network IP
address and port number. This is sufficient for the clients to address and port number. This is sufficient for the clients to
communicate with the NFSv4.1 storage devices, and may be sufficient communicate with the NFSv4.1 storage devices, and may be sufficient
for other layout types as well. Device types for object storage for other layout types as well. Device types for object storage
devices and block storage devices (e.g., SCSI volume labels) will be devices and block storage devices (e.g., SCSI volume labels) are
defined by their respective layout specifications. defined by their respective layout specifications.
3.3.16. layout_content4 3.3.16. layout_content4
struct layout_content4 { struct layout_content4 {
layouttype4 loc_type; layouttype4 loc_type;
opaque loc_body<>; opaque loc_body<>;
}; };
The loc_body field must be interpreted based on the layout type The loc_body field must be interpreted based on the layout type
skipping to change at page 95, line 10 skipping to change at page 95, line 29
for a file system object. The contents of the filehandle are opaque for a file system object. The contents of the filehandle are opaque
to the client. Therefore, the server is responsible for translating to the client. Therefore, the server is responsible for translating
the filehandle to an internal representation of the file system the filehandle to an internal representation of the file system
object. object.
4.1. Obtaining the First Filehandle 4.1. Obtaining the First Filehandle
The operations of the NFS protocol are defined in terms of one or The operations of the NFS protocol are defined in terms of one or
more filehandles. Therefore, the client needs a filehandle to more filehandles. Therefore, the client needs a filehandle to
initiate communication with the server. With the NFSv3 protocol initiate communication with the server. With the NFSv3 protocol
RFC1813 [21], there exists an ancillary protocol to obtain this first RFC1813 [31], there exists an ancillary protocol to obtain this first
filehandle. The MOUNT protocol, RPC program number 100005, provides filehandle. The MOUNT protocol, RPC program number 100005, provides
the mechanism of translating a string based file system path name to the mechanism of translating a string based file system path name to
a filehandle which can then be used by the NFS protocols. a filehandle which can then be used by the NFS protocols.
The MOUNT protocol has deficiencies in the area of security and use The MOUNT protocol has deficiencies in the area of security and use
via firewalls. This is one reason that the use of the public via firewalls. This is one reason that the use of the public
filehandle was introduced in RFC2054 [31] and RFC2055 [32]. With the filehandle was introduced in RFC2054 [41] and RFC2055 [42]. With the
use of the public filehandle in combination with the LOOKUP operation use of the public filehandle in combination with the LOOKUP operation
in the NFSv3 protocol, it has been demonstrated that the MOUNT in the NFSv3 protocol, it has been demonstrated that the MOUNT
protocol is unnecessary for viable interaction between NFS client and protocol is unnecessary for viable interaction between NFS client and
server. server.
Therefore, the NFSv4.1 protocol will not use an ancillary protocol Therefore, the NFSv4.1 protocol will not use an ancillary protocol
for translation from string based path names to a filehandle. Two for translation from string based path names to a filehandle. Two
special filehandles will be used as starting points for the NFS special filehandles will be used as starting points for the NFS
client. client.
skipping to change at page 97, line 4 skipping to change at page 97, line 27
behavior. All clients need to be prepared for situations in which it behavior. All clients need to be prepared for situations in which it
cannot be determined whether two filehandles denote the same object cannot be determined whether two filehandles denote the same object
and in such cases, avoid making invalid assumptions which might cause and in such cases, avoid making invalid assumptions which might cause
incorrect behavior. Further discussion of filehandle and attribute incorrect behavior. Further discussion of filehandle and attribute
comparison in the context of data caching is presented in the comparison in the context of data caching is presented in the
Section 10.3.4. Section 10.3.4.
As an example, in the case that two different path names when As an example, in the case that two different path names when
traversed at the server terminate at the same file system object, the traversed at the server terminate at the same file system object, the
server SHOULD return the same filehandle for each path. This can server SHOULD return the same filehandle for each path. This can
occur if a hard link is used to create two file names which refer to occur if a hard link (see [7]) is used to create two file names which
the same underlying file object and associated data. For example, if refer to the same underlying file object and associated data. For
paths /a/b/c and /a/d/c refer to the same file, the server SHOULD example, if paths /a/b/c and /a/d/c refer to the same file, the
return the same filehandle for both path names traversals. server SHOULD return the same filehandle for both path names
traversals.
4.2.2. Persistent Filehandle 4.2.2. Persistent Filehandle
A persistent filehandle is defined as having a fixed value for the A persistent filehandle is defined as having a fixed value for the
lifetime of the file system object to which it refers. Once the lifetime of the file system object to which it refers. Once the
server creates the filehandle for a file system object, the server server creates the filehandle for a file system object, the server
MUST accept the same filehandle for the object for the lifetime of MUST accept the same filehandle for the object for the lifetime of
the object. If the server restarts, the NFS server must honor the the object. If the server restarts, the NFS server must honor the
same filehandle value as it did in the server's previous same filehandle value as it did in the server's previous
instantiation. Similarly, if the file system is migrated, the new instantiation. Similarly, if the file system is migrated, the new
skipping to change at page 103, line 43 skipping to change at page 104, line 16
named attribute directories and ordinary directories is not named attribute directories and ordinary directories is not
allowed. allowed.
Names of attributes will not be controlled by this document or other Names of attributes will not be controlled by this document or other
IETF standards track documents. See Section 22.1 for further IETF standards track documents. See Section 22.1 for further
discussion. discussion.
5.4. Classification of Attributes 5.4. Classification of Attributes
Each of the REQUIRED and RECOMMENDED attributes can be classified in Each of the REQUIRED and RECOMMENDED attributes can be classified in
one of three categories: per server, per file system, or per file one of three categories: per server (i.e. the value of the attribute
system object. Note that it is possible that some per file system will be the same for all file objects that share the same server
attributes may vary within the file system. See the "homogeneous" owner; see Section 2.5 for a definition of server owner), per file
attribute for its definition. Note that the attributes system (i.e. the value of the attribute will be the same for some or
all file objects that share the same fsid attribute (Section 5.8.1.9)
and Server Owner), or per file system object. Note that it is
possible that some per file system attributes may vary within the
file system, depending on the value of the "homogeneous"
(Section 5.8.2.16) attribute. Note that the attributes
time_access_set and time_modify_set are not listed in this section time_access_set and time_modify_set are not listed in this section
because they are write-only attributes corresponding to time_access because they are write-only attributes corresponding to time_access
and time_modify, and are used in a special instance of SETATTR. and time_modify, and are used in a special instance of SETATTR.
o The per server attribute is: o The per server attribute is:
lease_time lease_time
o The per file system attributes are: o The per file system attributes are:
skipping to change at page 104, line 52 skipping to change at page 105, line 27
MUST return NFS4ERR_INVAL. MUST return NFS4ERR_INVAL.
5.6. REQUIRED Attributes - List and Definition References 5.6. REQUIRED Attributes - List and Definition References
The list of REQUIRED attributes appears in Table 2. The meaning of The list of REQUIRED attributes appears in Table 2. The meaning of
the columns of the table are: the columns of the table are:
o Name: the name of attribute o Name: the name of attribute
o Id: the number assigned to the attribute. In the event of o Id: the number assigned to the attribute. In the event of
conflicts between the assigned number and [12], the latter is conflicts between the assigned number and [13], the latter is
authoritative. authoritative.
o Data Type: The XDR data type of the attribute. o Data Type: The XDR data type of the attribute.
o Acc: Access allowed to the attribute. R means read-only (GETATTR o Acc: Access allowed to the attribute. R means read-only (GETATTR
may retrieve, SETATTR may not set). W means write-only (SETATTR may retrieve, SETATTR may not set). W means write-only (SETATTR
may set, GETATTR may not retrieve). R W means read/write (GETATTR may set, GETATTR may not retrieve). R W means read/write (GETATTR
may retrieve, SETATTR may set). may retrieve, SETATTR may set).
o Defined in: the section of this specification that describes the o Defined in: the section of this specification that describes the
skipping to change at page 114, line 25 skipping to change at page 115, line 10
The value in bytes which represent the amount of disc space used by The value in bytes which represent the amount of disc space used by
this file or directory and possibly a number of other similar files this file or directory and possibly a number of other similar files
or directories, where the set of "similar" meets at least the or directories, where the set of "similar" meets at least the
criterion that allocating space to any file or directory in the set criterion that allocating space to any file or directory in the set
will reduce the "quota_avail_hard" of every other file or directory will reduce the "quota_avail_hard" of every other file or directory
in the set. in the set.
Note that there may be a number of distinct but overlapping sets of Note that there may be a number of distinct but overlapping sets of
files or directories for which a quota_used value is maintained. files or directories for which a quota_used value is maintained.
E.g. "all files with a given owner", "all files with a given group E.g. "all files with a given owner", "all files with a given group
owner". etc. owner". etc. The server is at liberty to choose any of those sets
when providing the content of the quota_used attribute, but should do
The server is at liberty to choose any of those sets but should do so so in a repeatable way. The rule may be configured per file system
in a repeatable way. The rule may be configured per file system or or may be "choose the set with the smallest quota".
may be "choose the set with the smallest quota".
5.8.2.31. Attribute 41: rawdev 5.8.2.31. Attribute 41: rawdev
Raw device identifier; the UNIX device major/minor node information. Raw device identifier; the UNIX device major/minor node information.
If the value of type is not NF4BLK or NF4CHR, the value returned If the value of type is not NF4BLK or NF4CHR, the value returned
SHOULD NOT be considered useful. SHOULD NOT be considered useful.
5.8.2.32. Attribute 42: space_avail 5.8.2.32. Attribute 42: space_avail
Disk space in bytes available to this user on the file system Disk space in bytes available to this user on the file system
skipping to change at page 115, line 20 skipping to change at page 115, line 50
This attribute is TRUE if this file is a "system" file with respect This attribute is TRUE if this file is a "system" file with respect
to the Windows operating environment. to the Windows operating environment.
5.8.2.37. Attribute 47: time_access 5.8.2.37. Attribute 47: time_access
The time_access attribute represents the time of last access to the The time_access attribute represents the time of last access to the
object by a read that was satisfied by the server. The notion of object by a read that was satisfied by the server. The notion of
what is an "access" depends on server's operating environment and/or what is an "access" depends on server's operating environment and/or
the server's file system semantics. For example, for servers obeying the server's file system semantics. For example, for servers obeying
POSIX semantics, time_access would be updated only by the READLINK, POSIX semantics, time_access would be updated only by the READ and
READ, and READDIR operations and not any of the operations that READDIR operations and not any of the operations that modify the
modify the content of the object. Of course, setting the content of the object [16], [17], [18]. Of course, setting the
corresponding time_access_set attribute is another way to modify the corresponding time_access_set attribute is another way to modify the
time_access attribute. time_access attribute.
Whenever the file object resides on a writable file system, the Whenever the file object resides on a writable file system, the
server should make best efforts to record time_access into stable server should make best efforts to record time_access into stable
storage. However, to mitigate the performance effects of doing so, storage. However, to mitigate the performance effects of doing so,
and most especially whenever the server is satisfying the read of the and most especially whenever the server is satisfying the read of the
object's content from its cache, the server MAY cache access time object's content from its cache, the server MAY cache access time
updates and lazily write them to stable storage. It is also updates and lazily write them to stable storage. It is also
acceptable to give administrators of the server the option to disable acceptable to give administrators of the server the option to disable
skipping to change at page 116, line 23 skipping to change at page 117, line 5
5.8.2.44. Attribute 54: time_modify_set 5.8.2.44. Attribute 54: time_modify_set
Set the time of last modification to the object. SETATTR use only. Set the time of last modification to the object. SETATTR use only.
5.9. Interpreting owner and owner_group 5.9. Interpreting owner and owner_group
The RECOMMENDED attributes "owner" and "owner_group" (and also users The RECOMMENDED attributes "owner" and "owner_group" (and also users
and groups within the "acl" attribute) are represented in terms of a and groups within the "acl" attribute) are represented in terms of a
UTF-8 string. To avoid a representation that is tied to a particular UTF-8 string. To avoid a representation that is tied to a particular
underlying implementation at the client or server, the use of the underlying implementation at the client or server, the use of the
UTF-8 string has been chosen. Note that section 6.1 of RFC2624 [33] UTF-8 string has been chosen. Note that section 6.1 of RFC2624 [43]
provides additional rationale. It is expected that the client and provides additional rationale. It is expected that the client and
server will have their own local representation of owner and server will have their own local representation of owner and
owner_group that is used for local storage or presentation to the end owner_group that is used for local storage or presentation to the end
user. Therefore, it is expected that when these attributes are user. Therefore, it is expected that when these attributes are
transferred between the client and server that the local transferred between the client and server that the local
representation is translated to a syntax of the form "user@ representation is translated to a syntax of the form "user@
dns_domain". This will allow for a client and server that do not use dns_domain". This will allow for a client and server that do not use
the same local representation the ability to translate to a common the same local representation the ability to translate to a common
syntax that can be interpreted by both. syntax that can be interpreted by both.
skipping to change at page 117, line 23 skipping to change at page 118, line 6
owner and group strings in an acl), it is promising to return that owner and group strings in an acl), it is promising to return that
same string when a corresponding GETATTR is done. Configuration same string when a corresponding GETATTR is done. Configuration
changes (including changes from the mapping of the string to the changes (including changes from the mapping of the string to the
local representation) and ill-constructed name translations (those local representation) and ill-constructed name translations (those
that contain aliasing) may make that promise impossible to honor. that contain aliasing) may make that promise impossible to honor.
Servers should make appropriate efforts to avoid a situation in which Servers should make appropriate efforts to avoid a situation in which
these attributes have their values changed when no real change to these attributes have their values changed when no real change to
ownership has occurred. ownership has occurred.
The "dns_domain" portion of the owner string is meant to be a DNS The "dns_domain" portion of the owner string is meant to be a DNS
domain name. For example, user@ietf.org. Servers should accept as domain name. For example, user@example.org. Servers should accept
valid a set of users for at least one domain. A server may treat as valid a set of users for at least one domain. A server may treat
other domains as having no valid translations. A more general other domains as having no valid translations. A more general
service is provided when a server is capable of accepting users for service is provided when a server is capable of accepting users for
multiple domains, or for all domains, subject to security multiple domains, or for all domains, subject to security
constraints. constraints.
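The translation between an owner string and a local representation might look like the following sketch. The translation table, the sample user, and the function names are all hypothetical; a real server would consult its own name service. Folding the domain part to lower case is an assumption made here because DNS names compare case-insensitively.

```python
from typing import Dict, Optional

# Hypothetical table: accepted domain -> {user name -> local numeric ID}.
TRANSLATIONS: Dict[str, Dict[str, int]] = {
    "example.org": {"user": 1001},
}


def owner_to_local(owner: str, table: Dict[str, Dict[str, int]]) -> Optional[int]:
    """Map "user@dns_domain" to a local ID, or None if no translation exists."""
    name, sep, domain = owner.rpartition("@")
    if not sep:
        return None                 # no "@": not in user@dns_domain form
    return table.get(domain.lower(), {}).get(name)


def local_to_owner(local_id: int, domain: str,
                   table: Dict[str, Dict[str, int]]) -> Optional[str]:
    """Map a local ID back to "user@dns_domain" within one accepted domain."""
    for name, ident in table.get(domain, {}).items():
        if ident == local_id:
            return "%s@%s" % (name, domain)
    return None


if __name__ == "__main__":
    print(owner_to_local("user@example.org", TRANSLATIONS))
```

In this sketch a domain that is absent from the table is simply treated as having no valid translations, which matches the latitude the text gives to servers.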
In the case where there is no translation available to the client or In the case where there is no translation available to the client or
server, the attribute value must be constructed without the "@". server, the attribute value must be constructed without the "@".
Therefore, the absence of the @ from the owner or owner_group Therefore, the absence of the @ from the owner or owner_group
attribute signifies that no translation was available at the sender attribute signifies that no translation was available at the sender
and that the receiver of the attribute should not use that string as and that the receiver of the attribute should not use that string as
skipping to change at page 118, line 12 skipping to change at page 118, line 43
group translation, so that a client might pass all of the owners and group translation, so that a client might pass all of the owners and
groups in numeric form, a server SHOULD return an NFS4ERR_BADOWNER groups in numeric form, a server SHOULD return an NFS4ERR_BADOWNER
error when there is a valid translation for the user or owner error when there is a valid translation for the user or owner
designated in this way. In that case, the client must use the designated in this way. In that case, the client must use the
appropriate name@domain string and not the special form for appropriate name@domain string and not the special form for
compatibility. compatibility.
The owner string "nobody" may be used to designate an anonymous user, The owner string "nobody" may be used to designate an anonymous user,
which will be associated with a file created by a security principal which will be associated with a file created by a security principal
that cannot be mapped through normal means to the owner attribute. that cannot be mapped through normal means to the owner attribute.
Users and implementations of NFSv4.1 SHOULD NOT use "nobody" to
designate a real user whose access is not anonymous.
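The owner string conventions above can be illustrated with a minimal sketch. It is not part of the protocol definition; the names parse_owner, OwnerString, and is_translatable are hypothetical, and splitting at the last "@" is an assumption made for illustration.

```python
from typing import NamedTuple, Optional, Set


class OwnerString(NamedTuple):
    name: str
    domain: Optional[str]  # None when the value carries no "@"


def parse_owner(value: str) -> OwnerString:
    """Split an owner or owner_group value into name and dns_domain.

    Assumption: the value is split at the last "@". A value without "@"
    signifies that no translation was available at the sender.
    """
    name, sep, domain = value.rpartition("@")
    if not sep:
        return OwnerString(value, None)
    return OwnerString(name, domain)


def is_translatable(owner: OwnerString, accepted_domains: Set[str]) -> bool:
    """Hypothetical receiver policy: translate only names in accepted domains."""
    return owner.domain is not None and owner.domain.lower() in accepted_domains
```

A receiver built along these lines would treat "user@example.org" as a candidate for translation and would not attempt to translate a bare string such as "12345".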
5.10. Character Case Attributes 5.10. Character Case Attributes
With respect to the case_insensitive and case_preserving attributes, With respect to the case_insensitive and case_preserving attributes,
each UCS-4 character (which UTF-8 encodes) has a "long descriptive each UCS-4 character (which UTF-8 encodes) can be mapped according to
name" RFC1345 [34] which may or may not include the word "CAPITAL" or Appendix B.2 of RFC3454 [19]. For general character handling and
"SMALL". The presence of SMALL or CAPITAL allows an NFS server to internationalization issues, see Section 14.
implement unambiguous and efficient table driven mappings for case
insensitive comparisons, and non-case-preserving storage. For
general character handling and internationalization issues, see
Section 14.
5.11. Directory Notification Attributes 5.11. Directory Notification Attributes
As described in Section 18.39, the client can request a minimum delay As described in Section 18.39, the client can request a minimum delay
for notifications of changes to attributes, but the server is free to for notifications of changes to attributes, but the server is free to
ignore what the client requests. The client can determine in advance ignore what the client requests. The client can determine in advance
what notification delays the server will accept by issuing a GETATTR what notification delays the server will accept by issuing a GETATTR
for either or both of two directory notification attributes. When for either or both of two directory notification attributes. When
the client calls the GET_DIR_DELEGATION operation and asks for the client calls the GET_DIR_DELEGATION operation and asks for
attribute change notifications, it should request notification delays attribute change notifications, it should request notification delays
skipping to change at page 135, line 28 skipping to change at page 136, line 13
chgrp(). chgrp().
ACE4_SYNCHRONIZE ACE4_SYNCHRONIZE
Operation(s) affected: Operation(s) affected:
NONE NONE
Discussion: Discussion:
Permission to access file locally at the server with Permission to use the file object as a synchronization
synchronized reads and writes. primitive for interprocess communication. This permission is
not enforced or interpreted by the NFSv4.1 server on behalf of
the client.
Typically, the ACE4_SYNCHRONIZE permission is only meaningful
on local file systems, i.e. file systems not accessed via
NFSv4.1. The reason that the permission bit exists is that
some operating environments, such as Windows, use
ACE4_SYNCHRONIZE.
For example, if a client copies a file that has
ACE4_SYNCHRONIZE set from a local file system to an NFSv4.1
server, and then later copies the file from the NFSv4.1 server
to a local file system, it is likely that if ACE4_SYNCHRONIZE
was set in the original file, the client will want it set in
the second copy. The first copy will not have the permission
set unless the NFSv4.1 server has the means to set the
ACE4_SYNCHRONIZE bit. The second copy will not have the
permission set unless the NFSv4.1 server has the means to
retrieve the ACE4_SYNCHRONIZE bit.
Server implementations need not provide the granularity of control Server implementations need not provide the granularity of control
that is implied by this list of masks. For example, POSIX-based that is implied by this list of masks. For example, POSIX-based
systems might not distinguish ACE4_APPEND_DATA (the ability to append systems might not distinguish ACE4_APPEND_DATA (the ability to append
to a file) from ACE4_WRITE_DATA (the ability to modify existing to a file) from ACE4_WRITE_DATA (the ability to modify existing
contents); both masks would be tied to a single "write" permission. contents); both masks would be tied to a single "write" permission
When such a server returns attributes to the client, it would show [20]. When such a server returns attributes to the client, it would
both ACE4_APPEND_DATA and ACE4_WRITE_DATA if and only if the write show both ACE4_APPEND_DATA and ACE4_WRITE_DATA if and only if the
permission is enabled. write permission is enabled.
If a server receives a SETATTR request that it cannot accurately If a server receives a SETATTR request that it cannot accurately
implement, it should err in the direction of more restricted access, implement, it should err in the direction of more restricted access,
except in the previously discussed cases of execute and read. For except in the previously discussed cases of execute and read. For
example, suppose a server cannot distinguish overwriting data from example, suppose a server cannot distinguish overwriting data from
appending new data, as described in the previous paragraph. If a appending new data, as described in the previous paragraph. If a
client submits an ALLOW ACE where ACE4_APPEND_DATA is set but client submits an ALLOW ACE where ACE4_APPEND_DATA is set but
ACE4_WRITE_DATA is not (or vice versa), the server should either turn ACE4_WRITE_DATA is not (or vice versa), the server should either turn
off ACE4_APPEND_DATA or reject the request with NFS4ERR_ATTRNOTSUPP. off ACE4_APPEND_DATA or reject the request with NFS4ERR_ATTRNOTSUPP.
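As a rough sketch of the rule in the preceding paragraph, a server that ties ACE4_WRITE_DATA and ACE4_APPEND_DATA to a single write permission might process the mask of an ALLOW ACE as shown below. The function name restrict_allow_mask, the reject parameter, and the exception class are hypothetical; only the ALLOW case is covered.

```python
ACE4_WRITE_DATA = 0x00000002
ACE4_APPEND_DATA = 0x00000004
NFS4ERR_ATTRNOTSUPP = 10032

WRITE_PAIR = ACE4_WRITE_DATA | ACE4_APPEND_DATA


class AttrNotSupported(Exception):
    """Hypothetical exception carrying the NFS4ERR_ATTRNOTSUPP status."""
    status = NFS4ERR_ATTRNOTSUPP


def restrict_allow_mask(mask: int, reject: bool = False) -> int:
    """Return an ALLOW mask that this server can implement accurately.

    If exactly one of the two write bits is set, either reject the
    request or clear the bit, erring toward more restricted access.
    """
    pair = mask & WRITE_PAIR
    if pair in (0, WRITE_PAIR):
        return mask  # both bits agree, nothing to adjust
    if reject:
        raise AttrNotSupported("cannot separate append from write")
    return mask & ~WRITE_PAIR
```

Clearing the lone bit never grants more than the client asked for, which is why the sketch prefers it as the default over silently setting both bits.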
skipping to change at page 141, line 25 skipping to change at page 142, line 25
6.3. Common Methods 6.3. Common Methods
The requirements in this section will be referred to in future The requirements in this section will be referred to in future
sections, especially Section 6.4. sections, especially Section 6.4.
6.3.1. Interpreting an ACL 6.3.1. Interpreting an ACL
6.3.1.1. Server Considerations 6.3.1.1. Server Considerations
The server uses the algorithm described in Section 6.2.1 to determine The server uses the algorithm described in Section 6.2.1 to determine
whether an ACL allows access to an object. However, the ACL may not whether an ACL allows access to an object. However, the ACL might
be the sole determiner of access. For example: not be the sole determiner of access. For example:
o In the case of a file system exported as read-only, the server may o In the case of a file system exported as read-only, the server may
deny write permissions even though an object's ACL grants it. deny write permissions even though an object's ACL grants it.
o Server implementations MAY grant ACE4_WRITE_ACL and ACE4_READ_ACL o Server implementations MAY grant ACE4_WRITE_ACL and ACE4_READ_ACL
permissions to prevent a situation from arising in which there is permissions to prevent a situation from arising in which there is
no valid way to ever modify the ACL. no valid way to ever modify the ACL.
o All servers will allow a user the ability to read the data of the o All servers will allow a user the ability to read the data of the
file when only the execute permission is granted (i.e. If the ACL file when only the execute permission is granted (i.e. If the ACL
skipping to change at page 142, line 5 skipping to change at page 143, line 5
of the object is allowed to override accesses that are denied by of the object is allowed to override accesses that are denied by
the ACL. This may be helpful, for example, to allow users the ACL. This may be helpful, for example, to allow users
continued access to open files on which the permissions have continued access to open files on which the permissions have
changed. changed.
o Many servers have the notion of a "superuser" that has privileges o Many servers have the notion of a "superuser" that has privileges
beyond an ordinary user. The superuser may be able to read or beyond an ordinary user. The superuser may be able to read or
write data or metadata in ways that would not be permitted by the write data or metadata in ways that would not be permitted by the
ACL. ACL.
o A retention attribute might also block access otherwise allowed by
ACLs (see Section 5.13).
6.3.1.2. Client Considerations 6.3.1.2. Client Considerations
Clients SHOULD NOT do their own access checks based on their Clients SHOULD NOT do their own access checks based on their
interpretation of the ACL, but rather use the OPEN and ACCESS operations interpretation of the ACL, but rather use the OPEN and ACCESS operations
to do access checks. This allows the client to act on the results of to do access checks. This allows the client to act on the results of
having the server determine whether or not access should be granted having the server determine whether or not access should be granted
based on its interpretation of the ACL. based on its interpretation of the ACL.
Clients must be aware of situations in which an object's ACL will Clients must be aware of situations in which an object's ACL will
define a certain access even though the server will not enforce it. define a certain access even though the server will not enforce it.
skipping to change at page 153, line 43 skipping to change at page 155, line 7
clients should use strong security mechanisms to access the pseudo clients should use strong security mechanisms to access the pseudo
file system in order to prevent man-in-the-middle attacks. file system in order to prevent man-in-the-middle attacks.
8. State Management 8. State Management
Integrating locking into the NFS protocol necessarily causes it to be Integrating locking into the NFS protocol necessarily causes it to be
stateful. With the inclusion of such features as share reservations, stateful. With the inclusion of such features as share reservations,
file and directory delegations, recallable layouts, and support for file and directory delegations, recallable layouts, and support for
mandatory byte-range locking, the protocol becomes substantially more mandatory byte-range locking, the protocol becomes substantially more
dependent on proper management of state than the traditional dependent on proper management of state than the traditional
combination of NFS and NLM [35]. These features include expanded combination of NFS and NLM [44]. These features include expanded
locking facilities, which provide some measure of interclient locking facilities, which provide some measure of interclient
exclusion, but the state also offers features not readily providable exclusion, but the state also offers features not readily providable
using a stateless model. There are three components to making this using a stateless model. There are three components to making this
state manageable: state manageable:
o Clear division between client and server o Clear division between client and server
o Ability to reliably detect inconsistency in state between client o Ability to reliably detect inconsistency in state between client
and server and server
o Simple and robust recovery mechanisms o Simple and robust recovery mechanisms
In this model, the server owns the state information. The client In this model, the server owns the state information. The client
requests changes in locks and the server responds with the changes requests changes in locks and the server responds with the changes
made. Non-client-initiated changes in locking state are infrequent. made. Non-client-initiated changes in locking state are infrequent.
The client receives prompt notification of such changes and can The client receives prompt notification of such changes and can
adjust its view of the locking state to reflect the server's changes. adjust its view of the locking state to reflect the server's changes.
skipping to change at page 162, line 48 skipping to change at page 164, line 11
o If there is no lock stateid, then the open stateid for the open o If there is no lock stateid, then the open stateid for the open
file in question SHOULD be used. file in question SHOULD be used.
o Finally, if none of the above apply, then a special stateid SHOULD o Finally, if none of the above apply, then a special stateid SHOULD
be used. be used.
Ignoring these rules may result in situations in which the server Ignoring these rules may result in situations in which the server
does not have information necessary to properly process the request. does not have information necessary to properly process the request.
For example, when mandatory byte-range locks are in effect, if the For example, when mandatory byte-range locks are in effect, if the
stateid does not indicate the proper lockowner, via a lock stateid, a stateid does not indicate the proper lock-owner, via a lock stateid,
request might be avoidably rejected. a request might be avoidably rejected.
The server however should not try to enforce these ordering rules and The server however should not try to enforce these ordering rules and
should use whatever information is available to properly process I/O should use whatever information is available to properly process I/O
requests. In particular, when a client has a delegation for a given requests. In particular, when a client has a delegation for a given
file, it SHOULD take note of this fact in processing a request, even file, it SHOULD take note of this fact in processing a request, even
if it is sent with a special stateid. if it is sent with a special stateid.
8.2.6. Stateid Use for SETATTR Operations 8.2.6. Stateid Use for SETATTR Operations
Because each operation is associated with a session ID and from that Because each operation is associated with a session ID and from that
skipping to change at page 170, line 27 skipping to change at page 171, line 34
requests to be processed during the grace period, it MUST determine requests to be processed during the grace period, it MUST determine
that no lock subsequently reclaimed will be rejected and that no lock that no lock subsequently reclaimed will be rejected and that no lock
subsequently reclaimed would have prevented any I/O operation subsequently reclaimed would have prevented any I/O operation
processed during the grace period. processed during the grace period.
Clients should be prepared for the return of NFS4ERR_GRACE errors for Clients should be prepared for the return of NFS4ERR_GRACE errors for
non-reclaim lock and I/O requests. In this case the client should non-reclaim lock and I/O requests. In this case the client should
employ a retry mechanism for the request. A delay (on the order of employ a retry mechanism for the request. A delay (on the order of
several seconds) between retries should be used to avoid overwhelming several seconds) between retries should be used to avoid overwhelming
the server. Further discussion of the general issue is included in the server. Further discussion of the general issue is included in
[36]. The client must account for the server that can perform I/O [45]. The client must account for the server that can perform I/O
and non-reclaim locking requests within the grace period as well as and non-reclaim locking requests within the grace period as well as
those that cannot do so. those that cannot do so.
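The retry behavior described above might be sketched as follows. This is an illustration under stated assumptions, not a prescribed client design: send_request stands for any hypothetical callable that sends one non-reclaim request and returns its status, and the delay and attempt limit are arbitrary example values.

```python
import time
from typing import Callable

NFS4_OK = 0
NFS4ERR_GRACE = 10013


def retry_during_grace(
    send_request: Callable[[], int],
    delay_seconds: float = 5.0,
    max_attempts: int = 20,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Resend a request while the server answers NFS4ERR_GRACE.

    A delay is inserted between retries to avoid overwhelming the
    server. The last status received is returned to the caller.
    """
    status = send_request()
    attempts = 1
    while status == NFS4ERR_GRACE and attempts < max_attempts:
        sleep(delay_seconds)
        status = send_request()
        attempts += 1
    return status
```

Because the loop only reacts to NFS4ERR_GRACE, the same code path works against a server that processes the request during the grace period and against one that does not.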
A reclaim-type locking request outside the server's grace period can A reclaim-type locking request outside the server's grace period can
only succeed if the server can guarantee that no conflicting lock or only succeed if the server can guarantee that no conflicting lock or
I/O request has been granted since restart. I/O request has been granted since restart.
A server may, upon restart, establish a new value for the lease A server may, upon restart, establish a new value for the lease
period. Therefore, clients should, once a new client ID is period. Therefore, clients should, once a new client ID is
established, refetch the lease_time attribute and use it as the basis established, refetch the lease_time attribute and use it as the basis
skipping to change at page 171, line 24 skipping to change at page 172, line 29
reclaim locks, even if the eir_server_owner value is different. reclaim locks, even if the eir_server_owner value is different.
In this situation, it is the responsibility of the server to In this situation, it is the responsibility of the server to
return NFS4ERR_NO_GRACE if it cannot provide correct support for return NFS4ERR_NO_GRACE if it cannot provide correct support for
lock reclaim operations, including the prevention of edge lock reclaim operations, including the prevention of edge
conditions. conditions.
The eir_server_owner field is not used in making this determination. The eir_server_owner field is not used in making this determination.
Its function is to specify trunking possibilities for the client (see Its function is to specify trunking possibilities for the client (see
Section 2.10.5) and not to control lock reclaim. Section 2.10.5) and not to control lock reclaim.
8.4.2.1.1. Security Considerations for State Reclaim
During the grace period, a client can reclaim state it believes or
asserts it had before the server restarted. Unless the server
maintained a complete record of all the state the client had, the
server has little choice but to trust the client. (Of course if the
server maintained a complete record, then it would not have to force
the client to reclaim state after server restart.) While the server
has to trust the client to tell the truth, such trust does not have
any negative consequences for security. The fundamental rule for the
server when processing reclaim requests is that it MUST NOT grant the
reclaim if an equivalent non-reclaim request would not be granted
during steady-state due to access control or access conflict issues.
For example, an OPEN request during a reclaim will be refused with
NFS4ERR_ACCESS if the principal making the request does not have
access to open the file according to the discretionary ACL
(Section 6.2.2) on the file.
Nonetheless, it is possible that a client operating in error or
maliciously could, during reclaim, prevent another client from
reclaiming access to state. For example, an attacker could send an
OPEN reclaim operation with a deny mode that prevents another client
from reclaiming the open state it had before the server restarted.
The attacker could perform the same denial of service during steady
state prior to server restart, as long as the attacker had
permissions. Given that the attack vectors are equivalent, the grace
period does not offer any additional opportunity for denial of
service, and any concerns about this attack vector, whether during
grace or steady state, are addressed the same way: use RPCSEC_GSS for
authentication, and limit access to the file only to principals the
owner of the file trusts.
Note that if prior to restart the server had client IDs with the
EXCHGID4_FLAG_BIND_PRINC_STATEID (Section 18.35) capability set, then
the server SHOULD record in stable storage the client owner and the
principal that established the client ID via EXCHANGE_ID. If the
server does not, then there is a risk a client will be unable to
reclaim state if it does not have a credential for a principal that
was originally authorized to establish the state.
8.4.3. Network Partitions and Recovery 8.4.3. Network Partitions and Recovery
If the duration of a network partition is greater than the lease If the duration of a network partition is greater than the lease
period provided by the server, the server will not have received a period provided by the server, the server will not have received a
lease renewal from the client. If this occurs, the server may free lease renewal from the client. If this occurs, the server may free
all locks held for the client, or it may allow the lock state to all locks held for the client, or it may allow the lock state to
remain for a considerable period, subject to the constraint that if a remain for a considerable period, subject to the constraint that if a
request for a conflicting lock is made, locks associated with an request for a conflicting lock is made, locks associated with an
expired lease do not prevent such a conflicting lock from being expired lease do not prevent such a conflicting lock from being
granted but MUST be revoked as necessary so as not to interfere with granted but MUST be revoked as necessary so as not to interfere with
skipping to change at page 196, line 12 skipping to change at page 198, line 12
Since the change attribute is updated for data and metadata Since the change attribute is updated for data and metadata
modifications, some client implementors may be tempted to use the modifications, some client implementors may be tempted to use the
time_modify attribute and not the change attribute to validate time_modify attribute and not the change attribute to validate
cached data, so that metadata changes do not spuriously invalidate cached data, so that metadata changes do not spuriously invalidate
clean data. The implementor is cautioned in this approach. The clean data. The implementor is cautioned in this approach. The
change attribute is guaranteed to change for each update to the change attribute is guaranteed to change for each update to the
file, whereas time_modify is guaranteed to change only at the file, whereas time_modify is guaranteed to change only at the
granularity of the time_delta attribute. Use by the client's data granularity of the time_delta attribute. Use by the client's data
cache validation logic of time_modify and not change runs the risk cache validation logic of time_modify and not change runs the risk
of the client incorrectly marking stale data as valid. of the client incorrectly marking stale data as valid. Thus any
cache validation approach by the client MUST include the use of
the change attribute.
o Second, modified data must be flushed to the server before closing o Second, modified data must be flushed to the server before closing
a file OPENed for write. This is complementary to the first rule. a file OPENed for write. This is complementary to the first rule.
If the data is not flushed at CLOSE, the revalidation done after If the data is not flushed at CLOSE, the revalidation done after
client OPENs a file is unable to achieve its purpose. The other client OPENs a file is unable to achieve its purpose. The other
aspect to flushing the data before close is that the data must be aspect to flushing the data before close is that the data must be
committed to stable storage, at the server, before the CLOSE committed to stable storage, at the server, before the CLOSE
operation is requested by the client. In the case of a server operation is requested by the client. In the case of a server
restart and a CLOSEd file, it may not be possible to retransmit restart and a CLOSEd file, it may not be possible to retransmit
the data to be written to the file. Hence, this requirement. the data to be written to the file. Hence, this requirement.
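The caution about validating cached data with time_modify instead of the change attribute can be made concrete with a small sketch. The FileAttrs type and both helper functions are hypothetical, and time_modify is shown in whole seconds purely as an example of coarse granularity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileAttrs:
    change: int       # change attribute value
    time_modify: int  # modification time, example granularity of one second


def valid_by_change(cached: FileAttrs, current: FileAttrs) -> bool:
    """Cached data is considered valid only if change is unchanged."""
    return cached.change == current.change


def valid_by_time_modify(cached: FileAttrs, current: FileAttrs) -> bool:
    """Unsafe check shown for contrast: it compares time_modify only."""
    return cached.time_modify == current.time_modify
```

If a file is updated twice within one time_delta interval, the second check can report stale data as valid while the first cannot.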
skipping to change at page 201, line 35 skipping to change at page 203, line 39
o If the nfsace4 indicates that the open may be done, then it should o If the nfsace4 indicates that the open may be done, then it should
be granted without reference to the server. be granted without reference to the server.
o If the nfsace4 indicates that the open may not be done, then an o If the nfsace4 indicates that the open may not be done, then an
ACCESS request must be sent to the server to obtain the definitive ACCESS request must be sent to the server to obtain the definitive
answer. answer.
The server may return an nfsace4 that is more restrictive than the The server may return an nfsace4 that is more restrictive than the
actual ACL of the file. This includes an nfsace4 that specifies actual ACL of the file. This includes an nfsace4 that specifies
denial of all access. Note that some common practices such as denial of all access. Note that some common practices such as
mapping the traditional user "root" to the user "nobody" may make it mapping the traditional user "root" to the user "nobody" (see
incorrect to return the actual ACL of the file in the delegation Section 5.9) may make it incorrect to return the actual ACL of the
response. file in the delegation response.
The use of a delegation together with various other forms of caching The use of a delegation together with various other forms of caching
creates the possibility that no server authentication and creates the possibility that no server authentication and
authorization will ever be performed for a given user since all of authorization will ever be performed for a given user since all of
the user's requests might be satisfied locally. Where the client is the user's requests might be satisfied locally. Where the client is
depending on the server for authentication and authorization, the depending on the server for authentication and authorization, the
client should be sure authentication and authorization occurs for client should be sure authentication and authorization occurs for
each user by use of the ACCESS operation. This should be the case each user by use of the ACCESS operation. This should be the case
even if an ACCESS operation would not be required otherwise. As even if an ACCESS operation would not be required otherwise. As
mentioned before, the server may enforce frequent authentication by mentioned before, the server may enforce frequent authentication by
skipping to change at page 213, line 14 skipping to change at page 215, line 14
attributes obtained via GETATTR. attributes obtained via GETATTR.
A client may validate its cached version of attributes for a file by A client may validate its cached version of attributes for a file by
fetching both the change and time_access attributes and assuming fetching both the change and time_access attributes and assuming
that if the change attribute has the same value as it did when the that if the change attribute has the same value as it did when the
attributes were cached, then no attributes other than time_access attributes were cached, then no attributes other than time_access
have changed. The reason why time_access is also fetched is because have changed. The reason why time_access is also fetched is because
many servers operate in environments where the operation that updates many servers operate in environments where the operation that updates
change does not update time_access. For example, POSIX file change does not update time_access. For example, POSIX file
semantics do not update access time when a file is modified by the semantics do not update access time when a file is modified by the
write system call. Therefore, the client that wants a current write system call [18]. Therefore, the client that wants a current
time_access value should fetch it with change during the attribute time_access value should fetch it with change during the attribute
cache validation processing and update its cached time_access. cache validation processing and update its cached time_access.
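One possible shape for this validation step is sketched below. CachedAttrs and revalidate are hypothetical names, and the cached attribute set is reduced to three fields for brevity.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class CachedAttrs:
    change: int
    time_access: int
    size: int


def revalidate(
    cached: CachedAttrs, fetched_change: int, fetched_time_access: int
) -> Optional[CachedAttrs]:
    """Validate cached attributes from a GETATTR of change and time_access.

    Returns the cache entry with a refreshed time_access when change is
    unchanged, or None when the caller needs to refetch the attributes.
    """
    if fetched_change != cached.change:
        return None
    return replace(cached, time_access=fetched_time_access)
```

Fetching time_access in the same request lets the client keep that one attribute current without treating an access-time update as a reason to discard the rest of the cache.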
The client may maintain a cache of modified attributes for those The client may maintain a cache of modified attributes for those
attributes intimately connected with data of modified regular files attributes intimately connected with data of modified regular files
(size, time_modify, and change). Other than those three attributes, (size, time_modify, and change). Other than those three attributes,
the client MUST NOT maintain a cache of modified attributes. the client MUST NOT maintain a cache of modified attributes.
Instead, attribute changes are immediately sent to the server. Instead, attribute changes are immediately sent to the server.
In some operating environments, the equivalent to time_access is In some operating environments, the equivalent to time_access is
skipping to change at page 241, line 11 skipping to change at page 243, line 11
Servers are encouraged to provide facilities to allow locks to be Servers are encouraged to provide facilities to allow locks to be
reclaimed on the new server after a file system transition. Often, reclaimed on the new server after a file system transition. Often,
however, in cases in which the two servers do not share a server however, in cases in which the two servers do not share a server
scope value, such facilities may not be available and the client should scope value, such facilities may not be available and the client should
be prepared to re-obtain locks, even though it is possible that the be prepared to re-obtain locks, even though it is possible that the
client may have its LOCK or OPEN request denied due to a conflicting client may have its LOCK or OPEN request denied due to a conflicting
lock. lock.
The consequences of having no facilities available to reclaim locks The consequences of having no facilities available to reclaim locks
on the sew server will depend on the type of environment. In some on the new server will depend on the type of environment. In some
environments, such as the transition between read-only file systems, environments, such as the transition between read-only file systems,
such denial of locks should not pose large difficulties in practice. such denial of locks should not pose large difficulties in practice.
When an attempt to re-establish a lock on a new server is denied, the When an attempt to re-establish a lock on a new server is denied, the
client should treat the situation as if its original lock had been client should treat the situation as if its original lock had been
revoked. Note that when the lock is granted, the client cannot revoked. Note that when the lock is granted, the client cannot
assume that no conflicting lock could have been granted in the assume that no conflicting lock could have been granted in the
interim. Where change attribute continuity is present, the client interim. Where change attribute continuity is present, the client
may check the change attribute to check for unwanted file may check the change attribute to check for unwanted file
modifications. Where even this is not available, and the file system modifications. Where even this is not available, and the file system
is not read-only, a client may reasonably treat all pending locks as is not read-only, a client may reasonably treat all pending locks as
skipping to change at page 252, line 14 skipping to change at page 254, line 14
}; };
The fs_location4 data type is used to represent the location of a The fs_location4 data type is used to represent the location of a
file system by providing a server name and the path to the root of file system by providing a server name and the path to the root of
the file system within that server's namespace. When a set of the file system within that server's namespace. When a set of
servers have corresponding file systems at the same path within their servers have corresponding file systems at the same path within their
namespaces, an array of server names may be provided. An entry in namespaces, an array of server names may be provided. An entry in
the server array is a UTF-8 string and represents one of a the server array is a UTF-8 string and represents one of a
traditional DNS host name, IPv4 address, or IPv6 address, or a zero- traditional DNS host name, IPv4 address, or IPv6 address, or a zero-
length string. An IPv4 or IPv6 address is represented as a universal length string. An IPv4 or IPv6 address is represented as a universal
address (see Section 3.3.9 and [13]), minus the netid, and either address (see Section 3.3.9 and [15]), minus the netid, and either
with or without the trailing ".p1.p2" suffix that represents the port with or without the trailing ".p1.p2" suffix that represents the port
number. If the suffix is omitted, then the default port, 2049, number. If the suffix is omitted, then the default port, 2049,
SHOULD be assumed. A zero-length string SHOULD be used to indicate SHOULD be assumed. A zero-length string SHOULD be used to indicate
the current address being used for the RPC call. It is not a the current address being used for the RPC call. It is not a
requirement that all servers that share the same rootpath be listed requirement that all servers that share the same rootpath be listed
in one fs_location4 instance. The array of server names is provided in one fs_location4 instance. The array of server names is provided
for convenience. Servers that share the same rootpath may also be for convenience. Servers that share the same rootpath may also be
listed in separate fs_location4 entries in the fs_locations listed in separate fs_location4 entries in the fs_locations
attribute. attribute.
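
The server-entry rules above can be sketched as follows. This is a minimal illustration rather than a normative parser: the function name parse_server_entry, the helper _is_ip_literal, and the current_address parameter are hypothetical, and the rule used to tell an address with a ".p1.p2" suffix apart from a DNS host name is a heuristic assumed for this example. The port is recovered from the suffix as p1 * 256 + p2, following the universal address convention.

```python
import ipaddress

DEFAULT_NFS_PORT = 2049  # default port named in the text above


def _is_ip_literal(text):
    """Return True if text is a bare IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(text)
        return True
    except ValueError:
        return False


def _is_port_octet(text):
    """Return True if text is an ASCII decimal number in 0..255."""
    return text.isascii() and text.isdigit() and int(text) < 256


def parse_server_entry(entry, current_address):
    """Return (host, port) for one fs_location4 server string.

    current_address is a hypothetical (host, port) tuple describing the
    address already in use for the RPC call; a zero-length entry maps
    to it.
    """
    if entry == "":
        return current_address

    # Case 1: a bare address with no ".p1.p2" suffix.
    if _is_ip_literal(entry):
        return entry, DEFAULT_NFS_PORT

    # Case 2: an address followed by ".p1.p2".
    parts = entry.rsplit(".", 2)
    if len(parts) == 3 and _is_ip_literal(parts[0]):
        if _is_port_octet(parts[1]) and _is_port_octet(parts[2]):
            return parts[0], int(parts[1]) * 256 + int(parts[2])

    # Case 3: anything else is treated as a DNS host name (assumption).
    return entry, DEFAULT_NFS_PORT
```

For instance, the entry "192.0.2.10.8.1" carries the suffix ".8.1", which encodes port 8 * 256 + 1 = 2049, the same port that is assumed when the suffix is omitted.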
skipping to change at page 270, line 42 skipping to change at page 272, line 42
12.1. Introduction 12.1. Introduction
pNFS is an OPTIONAL feature within NFSv4.1; the pNFS feature set pNFS is an OPTIONAL feature within NFSv4.1; the pNFS feature set
allows direct client access to the storage devices containing file allows direct client access to the storage devices containing file
data. When file data for a single NFSv4 server is stored on multiple data. When file data for a single NFSv4 server is stored on multiple
and/or higher throughput storage devices (by comparison to the and/or higher throughput storage devices (by comparison to the
server's throughput capability), the result can be significantly server's throughput capability), the result can be significantly
better file access performance. The relationship among multiple better file access performance. The relationship among multiple
clients, a single server, and multiple storage devices for pNFS clients, a single server, and multiple storage devices for pNFS
(server and clients have access to all storage devices) is shown in (server and clients have access to all storage devices) is shown in
this diagram: Figure 1.
+-----------+ +-----------+
|+-----------+ +-----------+ |+-----------+ +-----------+
||+-----------+ | | ||+-----------+ | |
||| | NFSv4.1 + pNFS | | ||| | NFSv4.1 + pNFS | |
+|| Clients |<------------------------------>| Server | +|| Clients |<------------------------------>| Server |
+| | | | +| | | |
+-----------+ | | +-----------+ | |
||| +-----------+ ||| +-----------+
||| | ||| |
skipping to change at page 271, line 29 skipping to change at page 273, line 29
+------------------+|| Storage |------------+ +------------------+|| Storage |------------+
+| Devices | +| Devices |
+-----------+ +-----------+
Figure 1 Figure 1
In this model, the clients, server, and storage devices are In this model, the clients, server, and storage devices are
responsible for managing file access. This is in contrast to NFSv4 responsible for managing file access. This is in contrast to NFSv4
without pNFS where it is primarily the server's responsibility; some without pNFS where it is primarily the server's responsibility; some
of this responsibility may be delegated to the client under strictly of this responsibility may be delegated to the client under strictly
specified conditions. specified conditions. See Section 12.2.6 for a discussion of the
Control Protocol. See Section 12.2.5 for a discussion of the Storage
Protocol.
pNFS takes the form of OPTIONAL operations that manage protocol pNFS takes the form of OPTIONAL operations that manage protocol
objects called 'layouts' which contain a byte-range and storage objects called 'layouts' (Section 12.2.7) which contain a byte-range
location information. The layout is managed in a similar fashion as and storage location information. The layout is managed in a similar
NFSv4.1 data delegations. For example, the layout is leased, fashion as NFSv4.1 data delegations. For example, the layout is
recallable and revocable. However, layouts are distinct abstractions leased, recallable and revocable. However, layouts are distinct
and are manipulated with new operations. When a client holds a abstractions and are manipulated with new operations. When a client
layout, it is granted the ability to directly access the byte-range holds a layout, it is granted the ability to directly access the
at the storage location specified in the layout. byte-range at the storage location specified in the layout.
There are interactions between layouts and other NFSv4.1 abstractions There are interactions between layouts and other NFSv4.1 abstractions
such as data delegations and byte-range locking. Delegation issues such as data delegations and byte-range locking. Delegation issues
are discussed in Section 12.5.5. Byte-range locking issues are are discussed in Section 12.5.5. Byte-range locking issues are
discussed in Section 12.2.9 and Section 12.5.1. discussed in Section 12.2.9 and Section 12.5.1.
The NFSv4.1 pNFS feature has been structured to allow for a variety
of storage protocols to be defined and used. As noted in the diagram
above, the storage protocol is the method used by the client to store
and retrieve data directly from the storage devices. The NFSv4.1
protocol directly defines one storage protocol, the NFSv4.1 storage
type, and its use.
Examples of other storage protocols that could be used with NFSv4.1's
pNFS are:
o Block/volume protocols such as iSCSI ([37]), and FCP ([38]). The
block/volume protocol support can be independent of the addressing
structure of the block/volume protocol used, allowing more than
one protocol to access the same file data and enabling
extensibility to other block/volume protocols.
o Object protocols such as OSD over iSCSI or Fibre Channel [39].
o Other storage protocols, including PVFS and other file systems
that are in use in HPC environments.
It is possible that various storage protocols are available to both
client and server and it may be possible that a client and server do
not have a matching storage protocol available to them. Because of
this, the pNFS server MUST support normal NFSv4.1 access to any file
accessible by the pNFS feature; this will allow for continued
interoperability between an NFSv4.1 client and server.
12.2. pNFS Definitions 12.2. pNFS Definitions
NFSv4.1's pNFS feature provides parallel data access to a file system NFSv4.1's pNFS feature provides parallel data access to a file system
that stripes its content across multiple storage servers. The first that stripes its content across multiple storage servers. The first
instantiation of pNFS, as part of NFSv4.1, separates the file system instantiation of pNFS, as part of NFSv4.1, separates the file system
protocol processing into two parts: metadata processing and data protocol processing into two parts: metadata processing and data
processing. Data consist of the contents of regular files which are processing. Data consist of the contents of regular files which are
striped across storage servers. Data striping occurs in at least two striped across storage servers. Data striping occurs in at least two
ways: on a file-by-file basis, and within sufficiently large files, ways: on a file-by-file basis, and within sufficiently large files,
on a block-by-block basis. In contrast, striped access to metadata on a block-by-block basis. In contrast, striped access to metadata
skipping to change at page 273, line 36 skipping to change at page 275, line 7
12.2.4. Storage Device 12.2.4. Storage Device
A storage device stores a regular file's data, but leaves metadata A storage device stores a regular file's data, but leaves metadata
management to the metadata server. A storage device could be another management to the metadata server. A storage device could be another
NFSv4.1 server, an object storage device (OSD), a block device NFSv4.1 server, an object storage device (OSD), a block device
accessed over a SAN (e.g., either Fibre Channel or iSCSI SAN), or some accessed over a SAN (e.g., either Fibre Channel or iSCSI SAN), or some
other entity. other entity.
12.2.5. Storage Protocol 12.2.5. Storage Protocol
A storage protocol is the protocol used between the pNFS client and As noted in Figure 1, the storage protocol is the method used by
the storage device to access the file data. the client to store and retrieve data directly from the storage
devices.
The NFSv4.1 pNFS feature has been structured to allow for a variety
of storage protocols to be defined and used. One example storage
protocol is NFSv4.1 itself (as documented in Section 13). Other
options for the storage protocol are described elsewhere and include:
o Block/volume protocols such as iSCSI ([46]), and FCP ([47]). The
block/volume protocol support can be independent of the addressing
structure of the block/volume protocol used, allowing more than
one protocol to access the same file data and enabling
extensibility to other block/volume protocols. See [40] for a
layout specification that allows pNFS to use block/volume storage
protocols.
o Object protocols such as OSD over iSCSI or Fibre Channel [48].
See [39] for a layout specification that allows pNFS to use
object storage protocols.
It is possible that various storage protocols are available to both
client and server and it may be possible that a client and server do
not have a matching storage protocol available to them. Because of
this, the pNFS server MUST support normal NFSv4.1 access to any file
accessible by the pNFS feature; this will allow for continued
interoperability between an NFSv4.1 client and server.
12.2.6. Control Protocol 12.2.6. Control Protocol
The control protocol is used by the exported file system between the The control protocol is used by the exported file system between the
metadata server and storage devices. Specification of such protocols metadata server and storage devices. Specification of such protocols
is outside the scope of the NFSv4.1 protocol. Such control protocols is outside the scope of the NFSv4.1 protocol. Such control protocols
would be used to control activities such as the allocation and would be used to control activities such as the allocation and
deallocation of storage and the management of state required by the deallocation of storage, the management of state required by the
storage devices to perform client access control. storage devices to perform client access control, and, depending on
the storage protocol, the enforcement of authentication and
authorization so that restrictions that would be enforced by the
metadata server are also enforced by the storage device.
A particular control protocol is not REQUIRED by NFSv4.1 but A particular control protocol is not REQUIRED by NFSv4.1 but
requirements are placed on the control protocol for maintaining requirements are placed on the control protocol for maintaining
attributes like modify time, the change attribute, and the end-of- attributes like modify time, the change attribute, and the end-of-
file (EOF) position. file (EOF) position. Note that if pNFS is layered over a clustered,
parallel file system (e.g., PVFS [49]), the mechanisms that enable
clustering and parallelism in that file system can be considered the
control protocol.
12.2.7. Layout Types 12.2.7. Layout Types
A layout describes the mapping of a file's data to the storage A layout describes the mapping of a file's data to the storage
devices that hold the data. A layout is said to belong to a specific devices that hold the data. A layout is said to belong to a specific
layout type (data type layouttype4, see Section 3.3.13). The layout layout type (data type layouttype4, see Section 3.3.13). The layout
type allows for variants to handle different storage protocols, such type allows for variants to handle different storage protocols, such
as those associated with block/volume [30], object [29], and file as those associated with block/volume [40], object [39], and file
(Section 13) layout types. A metadata server, along with its control (Section 13) layout types. A metadata server, along with its control
protocol, MUST support at least one layout type. A private sub-range protocol, MUST support at least one layout type. A private sub-range
of the layout type name space is also defined. Values from the of the layout type name space is also defined. Values from the
private layout type range MAY be used for internal testing or private layout type range MAY be used for internal testing or
experimentation. experimentation.
As an example, the organization of the file layout type could be an As an example, the organization of the file layout type could be an
array of tuples (e.g., device ID, filehandle), along with a array of tuples (e.g., device ID, filehandle), along with a
definition of how the data is stored across the devices (e.g., definition of how the data is stored across the devices (e.g.,
striping). A block/volume layout might be an array of tuples that striping). A block/volume layout might be an array of tuples that
skipping to change at page 278, line 43 skipping to change at page 280, line 46
which a layout is held, does not necessarily conflict with the which a layout is held, does not necessarily conflict with the
holding of the layout that describes the file being modified. holding of the layout that describes the file being modified.
Therefore, it is the requirement of the storage protocol or layout Therefore, it is the requirement of the storage protocol or layout
type that determines the necessary behavior. For example, block/ type that determines the necessary behavior. For example, block/
volume layout types require that the layout's iomode agree with the volume layout types require that the layout's iomode agree with the
type of I/O being performed. type of I/O being performed.
Depending upon the layout type and storage protocol in use, storage Depending upon the layout type and storage protocol in use, storage
device access permissions may be granted by LAYOUTGET and may be device access permissions may be granted by LAYOUTGET and may be
encoded within the type-specific layout. For an example of storage encoded within the type-specific layout. For an example of storage
device access permissions see an object based protocol such as [39]. device access permissions see an object based protocol such as [48].
If access permissions are encoded within the layout, the metadata If access permissions are encoded within the layout, the metadata
server SHOULD recall the layout when those permissions become invalid server SHOULD recall the layout when those permissions become invalid
for any reason; for example when a file becomes unwritable or for any reason; for example when a file becomes unwritable or
inaccessible to a client. Note, clients are still required to inaccessible to a client. Note, clients are still required to
perform the appropriate access operations with open, lock and access perform the appropriate access operations with open, lock and access
as described above. The degree to which it is possible for the as described above. The degree to which it is possible for the
client to circumvent these access operations and the consequences of client to circumvent these access operations and the consequences of
doing so must be clearly specified by the individual layout type doing so must be clearly specified by the individual layout type
specifications. In addition, these specifications must be clear specifications. In addition, these specifications must be clear
about the requirements and non-requirements for the checking about the requirements and non-requirements for the checking
skipping to change at page 287, line 45 skipping to change at page 289, line 45
return type (FILE, FSID, or ALL), and byte range; even if layouts return type (FILE, FSID, or ALL), and byte range; even if layouts
pertaining to partial ranges were previously returned. In addition, pertaining to partial ranges were previously returned. In addition,
if the client holds no layouts that overlap the range being if the client holds no layouts that overlap the range being
recalled, the client should return the NFS4ERR_NOMATCHING_LAYOUT recalled, the client should return the NFS4ERR_NOMATCHING_LAYOUT
error code to CB_LAYOUTRECALL. This allows the server to update its error code to CB_LAYOUTRECALL. This allows the server to update its
view of the client's layout state. view of the client's layout state.
12.5.5.2. Sequencing of Layout Operations 12.5.5.2. Sequencing of Layout Operations
As with other stateful operations, pNFS requires the correct As with other stateful operations, pNFS requires the correct
sequencing of layout operations. PNFS uses the "seqid" in the layout sequencing of layout operations. pNFS uses the "seqid" in the layout
stateid to provide the correct sequencing between regular operations stateid to provide the correct sequencing between regular operations
and callbacks. It is the server's responsibility to avoid and callbacks. It is the server's responsibility to avoid
inconsistencies regarding the layouts provided and the client's inconsistencies regarding the layouts provided and the client's
responsibility to properly serialize its layout requests and layout responsibility to properly serialize its layout requests and layout
returns. returns.
12.5.5.2.1. Layout Recall and Return Sequencing 12.5.5.2.1. Layout Recall and Return Sequencing
One critical issue with regard to layout operations sequencing One critical issue with regard to layout operations sequencing
concerns callbacks. The protocol must defend against races between concerns callbacks. The protocol must defend against races between
skipping to change at page 291, line 50 skipping to change at page 293, line 50
error is returned to the client. The server further validates the error is returned to the client. The server further validates the
"seqid" to ensure it is within the range of parallelism, "seqid" to ensure it is within the range of parallelism,
VALID_SEQID_RANGE. If the "seqid" value is outside of that range, VALID_SEQID_RANGE. If the "seqid" value is outside of that range,
the error NFS4ERR_OLD_STATEID is returned to the client. Upon the error NFS4ERR_OLD_STATEID is returned to the client. Upon
receipt of NFS4ERR_OLD_STATEID, the client updates the stateid in the receipt of NFS4ERR_OLD_STATEID, the client updates the stateid in the
layout request based on processing of other layout requests and re- layout request based on processing of other layout requests and re-
sends the operation to the server. sends the operation to the server.
12.5.5.2.1.5. Bulk Recall and Return 12.5.5.2.1.5. Bulk Recall and Return
PNFS supports recalling and returning all layouts that are for files pNFS supports recalling and returning all layouts that are for files
belonging to a particular fsid (LAYOUTRECALL4_FSID, belonging to a particular fsid (LAYOUTRECALL4_FSID,
LAYOUTRETURN4_FSID) or client ID (LAYOUTRECALL4_ALL, LAYOUTRETURN4_FSID) or client ID (LAYOUTRECALL4_ALL,
LAYOUTRETURN4_ALL). There are no "bulk" stateids, so detection of LAYOUTRETURN4_ALL). There are no "bulk" stateids, so detection of
races via the seqid is not possible. The server MUST NOT initiate races via the seqid is not possible. The server MUST NOT initiate
bulk recall while another recall is in progress, or the corresponding bulk recall while another recall is in progress, or the corresponding
LAYOUTRETURN is in progress or pending. In the event the server LAYOUTRETURN is in progress or pending. In the event the server
sends a bulk recall while the client has pending or in progress sends a bulk recall while the client has pending or in progress
LAYOUTRETURN, CB_LAYOUTRECALL, or LAYOUTGET, the client returns LAYOUTRETURN, CB_LAYOUTRECALL, or LAYOUTGET, the client returns
NFS4ERR_DELAY. In the event the client sends a LAYOUTGET or NFS4ERR_DELAY. In the event the client sends a LAYOUTGET or
LAYOUTRETURN while a bulk recall is in progress, the server returns LAYOUTRETURN while a bulk recall is in progress, the server returns
skipping to change at page 301, line 15 skipping to change at page 303, line 15
NFSv4.1, it is beyond the scope of this document to specify the NFSv4.1, it is beyond the scope of this document to specify the
security mechanisms for storage access protocols. security mechanisms for storage access protocols.
pNFS implementations MUST NOT remove NFSv4.1's access controls. The pNFS implementations MUST NOT remove NFSv4.1's access controls. The
combination of clients, storage devices, and the metadata server are combination of clients, storage devices, and the metadata server are
responsible for ensuring that all client to storage device file data responsible for ensuring that all client to storage device file data
access respects NFSv4.1's ACLs and file open modes. This entails access respects NFSv4.1's ACLs and file open modes. This entails
performing both of these checks on every access in the client, the performing both of these checks on every access in the client, the
storage device, or both (as applicable; when the storage device is an storage device, or both (as applicable; when the storage device is an
NFSv4.1 server, the storage device is ultimately responsible for NFSv4.1 server, the storage device is ultimately responsible for
controlling access). If a pNFS configuration performs these checks controlling access as described in Section 13.9.2). If a pNFS
only in the client, the risk of a misbehaving client obtaining configuration performs these checks only in the client, the risk of a
unauthorized access is an important consideration in determining when misbehaving client obtaining unauthorized access is an important
it is appropriate to use such a pNFS configuration. Such layout consideration in determining when it is appropriate to use such a
types SHOULD NOT be used when client-only access checks do not pNFS configuration. Such layout types SHOULD NOT be used when
provide sufficient assurance that NFSv4.1 access control is being client-only access checks do not provide sufficient assurance that
applied correctly. NFSv4.1 access control is being applied correctly. (This is not a
problem for the file layout type described in Section 13 because the
storage access protocol for LAYOUT4_NFSV4_1_FILES is NFSv4.1, and
thus the security model for storage device access via
LAYOUT4_NFSV4_1_FILES is the same as that of the metadata server.)
For handling of access control specific to a layout, the reader
should examine the layout specification, such as the NFSv4.1/
files-based layout (Section 13) of this document, the blocks layout
[40], and objects layout [39].
13. PNFS: NFSv4.1 File Layout Type 13. NFSv4.1 as a Storage Protocol in pNFS: the File Layout Type
This section describes the semantics and format of NFSv4.1 file-based This section describes the semantics and format of NFSv4.1 file-based
layouts for pNFS. NFSv4.1 file-based layouts use the layouts for pNFS. NFSv4.1 file-based layouts use the
LAYOUT4_NFSV4_1_FILES layout type. The LAYOUT4_NFSV4_1_FILES type LAYOUT4_NFSV4_1_FILES layout type. The LAYOUT4_NFSV4_1_FILES type
defines striping data across multiple NFSv4.1 data servers. defines striping data across multiple NFSv4.1 data servers.
13.1. Client ID and Session Considerations 13.1. Client ID and Session Considerations
Sessions are a REQUIRED feature of NFSv4.1, and this extends to both Sessions are a REQUIRED feature of NFSv4.1, and this extends to both
the metadata server and file-based (NFSv4.1-based) data servers. the metadata server and file-based (NFSv4.1-based) data servers.
skipping to change at page 302, line 24 skipping to change at page 304, line 32
+--------------------------------------------------------+ +--------------------------------------------------------+
As the above table implies, a server can have one or two roles. A As the above table implies, a server can have one or two roles. A
server can be both a metadata server and a data server or it can be server can be both a metadata server and a data server or it can be
both a data server and non-metadata server. In addition to returning both a data server and non-metadata server. In addition to returning
two roles in EXCHANGE_ID's results, and thus serving both roles via a two roles in EXCHANGE_ID's results, and thus serving both roles via a
common client ID, a server can serve two roles by returning a unique common client ID, a server can serve two roles by returning a unique
client ID and server owner for each role in each of two EXCHANGE_ID client ID and server owner for each role in each of two EXCHANGE_ID
results, with each result indicating each role. results, with each result indicating each role.
In the case of a server with concurrent PNFS roles that are served by In the case of a server with concurrent pNFS roles that are served by
a common client ID, if the EXCHANGE_ID request from the client has a common client ID, if the EXCHANGE_ID request from the client has
zero or a combination of the bits set in eia_flags, the server result zero or a combination of the bits set in eia_flags, the server result
should set bits which represent the higher of the acceptable should set bits which represent the higher of the acceptable
combination of the server roles, with a preference to match the roles combination of the server roles, with a preference to match the roles
requested by the client. Thus if a client request has requested by the client. Thus if a client request has
(EXCHGID4_FLAG_USE_NON_PNFS | EXCHGID4_FLAG_USE_PNFS_MDS | (EXCHGID4_FLAG_USE_NON_PNFS | EXCHGID4_FLAG_USE_PNFS_MDS |
EXCHGID4_FLAG_USE_PNFS_DS) flags set, and the server is both a EXCHGID4_FLAG_USE_PNFS_DS) flags set, and the server is both a
metadata server and a data server, serving both the roles by a common metadata server and a data server, serving both the roles by a common
client ID, the server SHOULD return with (EXCHGID4_FLAG_USE_PNFS_MDS client ID, the server SHOULD return with (EXCHGID4_FLAG_USE_PNFS_MDS
| EXCHGID4_FLAG_USE_PNFS_DS) set. | EXCHGID4_FLAG_USE_PNFS_DS) set.
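
One possible reading of the role selection described above, for a server that serves its roles through a common client ID, is sketched below. The function name choose_roles is hypothetical, the numeric flag values are assumed to match the protocol's XDR definitions and should be checked against them, and the rule "return the overlap, else everything the server offers" is an interpretation of the preference to match the client's request, not a quotation of the protocol.

```python
# Role bits carried in eia_flags / eir_flags (values assumed; verify
# against the XDR definitions before relying on them).
EXCHGID4_FLAG_USE_NON_PNFS = 0x00010000
EXCHGID4_FLAG_USE_PNFS_MDS = 0x00020000
EXCHGID4_FLAG_USE_PNFS_DS = 0x00040000

ROLE_MASK = (EXCHGID4_FLAG_USE_NON_PNFS
             | EXCHGID4_FLAG_USE_PNFS_MDS
             | EXCHGID4_FLAG_USE_PNFS_DS)


def choose_roles(requested_flags, server_roles):
    """Pick the role bits a common-client-ID server reports.

    requested_flags: eia_flags from the client's EXCHANGE_ID request.
    server_roles: the role bits this server is able to serve.
    """
    requested = requested_flags & ROLE_MASK
    # Prefer the roles the client asked for that the server also serves.
    overlap = requested & server_roles
    # With no role bits requested (or no overlap), report every role
    # the server serves.
    return overlap if overlap else server_roles
```

With all three bits requested and a server acting as both metadata server and data server, the sketch yields (EXCHGID4_FLAG_USE_PNFS_MDS | EXCHGID4_FLAG_USE_PNFS_DS), which agrees with the example in the text.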
In the case of a server that has multiple concurrent PNFS roles, each In the case of a server that has multiple concurrent pNFS roles, each
role served by a unique client ID, if the client specifies zero or a role served by a unique client ID, if the client specifies zero or a
combination of roles in the request, the server results SHOULD return combination of roles in the request, the server results SHOULD return
only one of the roles from the combination specified by the client only one of the roles from the combination specified by the client
request. If the role specified by the server result does not match request. If the role specified by the server result does not match
the intended use by the client, the client should send the the intended use by the client, the client should send the
EXCHANGE_ID specifying just the interested PNFS role. EXCHANGE_ID specifying just the interested pNFS role.
If a pNFS metadata client gets a layout that refers it to an NFSv4.1 If a pNFS metadata client gets a layout that refers it to an NFSv4.1
data server, it needs a client ID on that data server. If it does data server, it needs a client ID on that data server. If it does
not yet have a client ID from the server that had the not yet have a client ID from the server that had the
EXCHGID4_FLAG_USE_PNFS_DS flag set in the EXCHANGE_ID results, then EXCHGID4_FLAG_USE_PNFS_DS flag set in the EXCHANGE_ID results, then
the client must send an EXCHANGE_ID to the data server, using the the client must send an EXCHANGE_ID to the data server, using the
same co_ownerid as it sent to the metadata server, with the same co_ownerid as it sent to the metadata server, with the
EXCHGID4_FLAG_USE_PNFS_DS flag set in the arguments. If the server's EXCHGID4_FLAG_USE_PNFS_DS flag set in the arguments. If the server's
EXCHANGE_ID results have EXCHGID4_FLAG_USE_PNFS_DS set, then the EXCHANGE_ID results have EXCHGID4_FLAG_USE_PNFS_DS set, then the
client may use the client ID to create sessions that will exchange client may use the client ID to create sessions that will exchange
skipping to change at page 303, line 41 skipping to change at page 305, line 50
If a server is both a metadata server and a data server, the server If a server is both a metadata server and a data server, the server
might need to distinguish operations on files that are directed to might need to distinguish operations on files that are directed to
the metadata server from those that are directed to the data server. the metadata server from those that are directed to the data server.
It is RECOMMENDED that the values of the filehandles returned by the It is RECOMMENDED that the values of the filehandles returned by the
LAYOUTGET operation be different from the value of the filehandle LAYOUTGET operation be different from the value of the filehandle
returned by the OPEN of the same file. returned by the OPEN of the same file.
Another scenario is for the metadata server and the storage device to Another scenario is for the metadata server and the storage device to
be distinct from one client's point of view, and the roles reversed be distinct from one client's point of view, and the roles reversed
from another client's point of view. For example, in the cluster from another client's point of view. For example, in the cluster
file system model, a metadata server to one client may be a data file system model, a metadata server to one client might be a data
server to another client. If NFSv4.1 is being used as the storage server to another client. If NFSv4.1 is being used as the storage
protocol, then pNFS servers need to encode the values of filehandles protocol, then pNFS servers need to encode the values of filehandles
according to their specific roles. according to their specific roles.
13.1.1. Sessions Considerations for Data Servers 13.1.1. Sessions Considerations for Data Servers
Section 2.10.10.2 states that a client has to keep its lease renewed Section 2.10.10.2 states that a client has to keep its lease renewed
in order to prevent a session from being deleted by the server. If in order to prevent a session from being deleted by the server. If
the reply to EXCHANGE_ID has just the EXCHGID4_FLAG_USE_PNFS_DS role the reply to EXCHANGE_ID has just the EXCHGID4_FLAG_USE_PNFS_DS role
set, then as noted in Section 13.6 the client will not be able to set, then as noted in Section 13.6 the client will not be able to
skipping to change at page 306, line 46 skipping to change at page 309, line 5
The nfsv4_1_file_layout_ds_addr4 data type represents the device The nfsv4_1_file_layout_ds_addr4 data type represents the device
address. It is composed of two fields: address. It is composed of two fields:
1. nflda_multipath_ds_list: An array of lists of data servers, where 1. nflda_multipath_ds_list: An array of lists of data servers, where
each list can be one or more elements, and each element each list can be one or more elements, and each element
represents a data server address (see Section 13.5) which may represents a data server address (see Section 13.5) which may
serve equally as the target of IO operations. The length of this serve equally as the target of IO operations. The length of this
array might be different than the stripe count. array might be different than the stripe count.
2. nflda_stripe_indices: An array of indexes used to index into 2. nflda_stripe_indices: An array of indices used to index into
nflda_multipath_ds_list. Each element of nflda_stripe_indices nflda_multipath_ds_list. The value of each element of
MUST be less than the number of elements in nflda_stripe_indices MUST be less than the number of elements in
nflda_multipath_ds_list. Each element of nflda_multipath_ds_list nflda_multipath_ds_list. Each element of nflda_multipath_ds_list
SHOULD be referred to by one or more elements of SHOULD be referred to by one or more elements of
nflda_stripe_indices. The number of elements in nflda_stripe_indices. The number of elements in
nflda_stripe_indices is always equal to the stripe count. nflda_stripe_indices is always equal to the stripe count.
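The index constraint described above can be sketched as a small check. This is only an illustration, not part of the protocol definition: the function name stripe_indices_valid and its parameters are hypothetical, and the arrays are assumed to have already been decoded into plain C arrays.

```c
#include <assert.h>   /* only needed if the check is exercised with assert() */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true when every element of the stripe
 * index array is a valid index into the multipath list array, i.e. is
 * less than the number of elements in that array.
 */
static bool stripe_indices_valid(const uint32_t *stripe_indices,
                                 size_t stripe_count,
                                 size_t multipath_list_count)
{
    for (size_t i = 0; i < stripe_count; i++) {
        if (stripe_indices[i] >= multipath_list_count)
            return false;   /* index falls outside the multipath list array */
    }
    return true;
}
```

A client might run such a check on a decoded device address before using it. What a client does when the check fails is not covered by this sketch.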
/* Encoded in the loc_body field of type layout_content4: */ /* Encoded in the loc_body field of type layout_content4: */
struct nfsv4_1_file_layout4 { struct nfsv4_1_file_layout4 {
deviceid4 nfl_deviceid; deviceid4 nfl_deviceid;
nfl_util4 nfl_util; nfl_util4 nfl_util;
uint32_t nfl_first_stripe_index; uint32_t nfl_first_stripe_index;
skipping to change at page 317, line 7 skipping to change at page 319, line 32
only, and the data consist of exact replicas. only, and the data consist of exact replicas.
13.6. Operations Sent to NFSv4.1 Data Servers 13.6. Operations Sent to NFSv4.1 Data Servers
Clients accessing data on an NFSv4.1 data server MUST send only the Clients accessing data on an NFSv4.1 data server MUST send only the
NULL procedure and COMPOUND procedures whose operations are taken NULL procedure and COMPOUND procedures whose operations are taken
only from two restricted subsets of the operations defined as valid only from two restricted subsets of the operations defined as valid
NFSv4.1 operations. Clients MUST use the filehandle specified by the NFSv4.1 operations. Clients MUST use the filehandle specified by the
layout when accessing data on NFSv4.1 data servers. layout when accessing data on NFSv4.1 data servers.
The first of these operation subsets consists of management operations The first of these operation subsets consists of management
where the current filehandle is not relevant. This subset consists operations. This subset consists of the BACKCHANNEL_CTL,
of the BACKCHANNEL_CTL, BIND_CONN_TO_SESSION, CREATE_SESSION, BIND_CONN_TO_SESSION, CREATE_SESSION, DESTROY_CLIENTID,
DESTROY_CLIENTID, DESTROY_SESSION, EXCHANGE_ID, SECINFO_NO_NAME, DESTROY_SESSION, EXCHANGE_ID, SECINFO_NO_NAME, SET_SSV, and SEQUENCE
SET_SSV, and SEQUENCE operations. The client may use these operations. The client may use these operations in order to set up
operations in order to set up and maintain the appropriate client and maintain the appropriate client IDs, sessions, and security
IDs, sessions, and security contexts involved in communication with contexts involved in communication with the data server. Henceforth
the data server. Henceforth these will be referred to as data-server these will be referred to as data-server housekeeping operations.
housekeeping operations.
The second subset consists of COMMIT, READ, WRITE, and PUTFH. These The second subset consists of COMMIT, READ, WRITE, and PUTFH. These
operations must be used with a current filehandle specified by the operations must be used with a current filehandle specified by the
layout. In the case of PUTFH, the new current filehandle must be one layout. In the case of PUTFH, the new current filehandle must be one
taken from the layout. Henceforth, these will be referred to as taken from the layout. Henceforth, these will be referred to as
data-server I/O operations. As described in Section 12.5.1, a client data-server I/O operations. As described in Section 12.5.1, a client
MUST NOT send an I/O to a data server for which it does not hold a MUST NOT send an I/O to a data server for which it does not hold a
valid layout; the data server MUST reject such an I/O. valid layout; the data server MUST reject such an I/O.
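The two operation subsets named above can be sketched as a simple classifier. This is a minimal illustration under stated assumptions: operations are identified by name strings rather than by their numeric codes, and the type ds_op_class and the function ds_classify_op are hypothetical names introduced here. The sketch only classifies; it does not decide whether an operation outside both subsets is accepted.

```c
#include <assert.h>   /* only needed if the classifier is exercised with assert() */
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical classification of an operation sent to a data server. */
enum ds_op_class { DS_OP_HOUSEKEEPING, DS_OP_IO, DS_OP_OTHER };

/* Data-server housekeeping operations, as listed in the text. */
static const char *const ds_housekeeping_ops[] = {
    "BACKCHANNEL_CTL", "BIND_CONN_TO_SESSION", "CREATE_SESSION",
    "DESTROY_CLIENTID", "DESTROY_SESSION", "EXCHANGE_ID",
    "SECINFO_NO_NAME", "SET_SSV", "SEQUENCE",
};

/* Data-server I/O operations, as listed in the text. */
static const char *const ds_io_ops[] = { "COMMIT", "READ", "WRITE", "PUTFH" };

/* Returns true when name matches one of the n entries in list. */
static bool name_in(const char *name, const char *const *list, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(name, list[i]) == 0)
            return true;
    }
    return false;
}

static enum ds_op_class ds_classify_op(const char *name)
{
    if (name_in(name, ds_housekeeping_ops,
                sizeof ds_housekeeping_ops / sizeof ds_housekeeping_ops[0]))
        return DS_OP_HOUSEKEEPING;
    if (name_in(name, ds_io_ops, sizeof ds_io_ops / sizeof ds_io_ops[0]))
        return DS_OP_IO;
    return DS_OP_OTHER;   /* in neither subset */
}
```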
Unless the server has a concurrent non-data-server personality, i.e. Unless the server has a concurrent non-data-server personality, i.e.
skipping to change at page 321, line 46 skipping to change at page 324, line 26
use of the open stateid, then the client should use the lock use of the open stateid, then the client should use the lock
stateid whenever one exists for that open file with the current stateid whenever one exists for that open file with the current
lock-owner. lock-owner.
o Special stateids should never be used and if used the data server o Special stateids should never be used and if used the data server
MUST reject the I/O with an NFS4ERR_BAD_STATEID error. MUST reject the I/O with an NFS4ERR_BAD_STATEID error.
13.9.2. Data Server State Propagation 13.9.2. Data Server State Propagation
Since the metadata server, which handles lock and open-mode state Since the metadata server, which handles lock and open-mode state
changes, as well as ACLs, may not be co-located with the data servers changes, as well as ACLs, might not be co-located with the data
where I/O accesses are validated, the server implementation MUST take servers where I/O accesses are validated, the server implementation
care of propagating changes of this state to the data servers. Once MUST take care of propagating changes of this state to the data
the propagation to the data servers is complete, the full effect of servers. Once the propagation to the data servers is complete, the
those changes MUST be in effect at the data servers. However, some full effect of those changes MUST be in effect at the data servers.
state changes need not be propagated immediately, although all However, some state changes need not be propagated immediately,
changes SHOULD be propagated promptly. These state propagations have although all changes SHOULD be propagated promptly. These state
an impact on the design of the control protocol, even though the propagations have an impact on the design of the control protocol,
control protocol is outside of the scope of this specification. even though the control protocol is outside of the scope of this
Immediate propagation refers to the synchronous propagation of state specification. Immediate propagation refers to the synchronous
from the metadata server to the data server(s); the propagation must propagation of state from the metadata server to the data server(s);
be complete before returning to the client. the propagation must be complete before returning to the client.
13.9.2.1. Lock State Propagation 13.9.2.1. Lock State Propagation
If the pNFS server supports mandatory locking, any mandatory locks on If the pNFS server supports mandatory locking, any mandatory locks on
a file MUST be made effective at the data servers before the request a file MUST be made effective at the data servers before the request
that establishes them returns to the caller. The effect MUST be the that establishes them returns to the caller. The effect MUST be the
same as if the mandatory lock state were synchronously propagated to same as if the mandatory lock state were synchronously propagated to
the data servers, even though the details of the control protocol may the data servers, even though the details of the control protocol may
avoid actual transfer of the state under certain circumstances. avoid actual transfer of the state under certain circumstances.
skipping to change at page 324, line 30 skipping to change at page 327, line 10
a LAYOUTCOMMIT will be done at close (along with the data WRITEs) and a LAYOUTCOMMIT will be done at close (along with the data WRITEs) and
will update the file's size and change attribute. Access from will update the file's size and change attribute. Access from
another client after that point will result in the appropriate size another client after that point will result in the appropriate size
being returned. being returned.
13.11. Layout Revocation and Fencing 13.11. Layout Revocation and Fencing
As described in Section 12.7, the layout type-specific storage As described in Section 12.7, the layout type-specific storage
protocol is responsible for handling the effects of I/Os started protocol is responsible for handling the effects of I/Os started
before lease expiration, extending through lease expiration. The before lease expiration, extending through lease expiration. The
LAYOUT4_NFSV4_1_FILES layout type can prevents all I/Os to data LAYOUT4_NFSV4_1_FILES layout type can prevent all I/Os to data
servers from being executed after lease expiration, without relying servers from being executed after lease expiration, without relying
on a precise client lease timer and without requiring data servers to on a precise client lease timer and without requiring data servers to
maintain lease timers. However, while LAYOUT4_NFSV4_1_FILES pNFS maintain lease timers. However, while LAYOUT4_NFSV4_1_FILES pNFS
server is free to deny the client all access to the data servers, server is free to deny the client all access to the data servers,
because it supports revocation of layouts, it is also free to perform because it supports revocation of layouts, it is also free to perform
a denial on a per file basis only when revoking a layout. a denial on a per file basis only when revoking a layout.
In addition to lease expiration, the reasons a layout can be revoked In addition to lease expiration, the reasons a layout can be revoked
include: client fails to respond to a CB_LAYOUTRECALL, the metadata include: client fails to respond to a CB_LAYOUTRECALL, the metadata
server restarts, or administrative intervention. Regardless of the server restarts, or administrative intervention. Regardless of the
skipping to change at page 326, line 12 skipping to change at page 328, line 38
layouts, then the implementation MUST support the SECINFO_NO_NAME layouts, then the implementation MUST support the SECINFO_NO_NAME
operation, on both the metadata and data servers. operation, on both the metadata and data servers.
14. Internationalization 14. Internationalization
The primary issue in which NFSv4.1 needs to deal with The primary issue in which NFSv4.1 needs to deal with
internationalization, or I18N, is with respect to file names and internationalization, or I18N, is with respect to file names and
other strings as used within the protocol. The choice of string other strings as used within the protocol. The choice of string
representation must allow reasonable name/string access to clients representation must allow reasonable name/string access to clients
which use various languages. The UTF-8 encoding of the UCS as which use various languages. The UTF-8 encoding of the UCS as
defined by ISO10646 [14] allows for this type of access and follows defined by ISO10646 [21] allows for this type of access and follows
the policy described in "IETF Policy on Character Sets and the policy described in "IETF Policy on Character Sets and
Languages", RFC2277 [15]. Languages", RFC2277 [22].
RFC3454 [16], otherwise known as "stringprep", documents a framework RFC3454 [19], otherwise known as "stringprep", documents a framework
for using Unicode/UTF-8 in networking protocols, so as "to increase for using Unicode/UTF-8 in networking protocols, so as "to increase
the likelihood that string input and string comparison work in ways the likelihood that string input and string comparison work in ways
that make sense for typical users throughout the world." A protocol that make sense for typical users throughout the world." A protocol
must define a profile of stringprep "in order to fully specify the must define a profile of stringprep "in order to fully specify the
processing options." The remainder of this Internationalization processing options." The remainder of this Internationalization
section defines the NFSv4.1 stringprep profiles. Much of the terminology section defines the NFSv4.1 stringprep profiles. Much of the terminology
used for the remainder of this section comes from stringprep. used for the remainder of this section comes from stringprep.
There are three UTF-8 string types defined for NFSv4.1: utf8str_cs, There are three UTF-8 string types defined for NFSv4.1: utf8str_cs,
utf8str_cis, and utf8str_mixed. Separate profiles are defined for utf8str_cis, and utf8str_mixed. Separate profiles are defined for
skipping to change at page 327, line 8 skipping to change at page 329, line 37
section 6 of stringprep) section 6 of stringprep)
o Any additional characters that are prohibited as output specific o Any additional characters that are prohibited as output specific
to the profile to the profile
Stringprep discusses Unicode characters, whereas NFSv4.1 renders Stringprep discusses Unicode characters, whereas NFSv4.1 renders
UTF-8 characters. Since there is a one-to-one mapping from UTF-8 to UTF-8 characters. Since there is a one-to-one mapping from UTF-8 to
Unicode, when the remainder of this document refers to Unicode, the Unicode, when the remainder of this document refers to Unicode, the
reader should assume UTF-8. reader should assume UTF-8.
Much of the text for the profiles comes from RFC3491 [17]. Much of the text for the profiles comes from RFC3491 [23].
14.1. Stringprep profile for the utf8str_cs type 14.1. Stringprep profile for the utf8str_cs type
Every use of the utf8str_cs type definition in the NFSv4 protocol Every use of the utf8str_cs type definition in the NFSv4 protocol
specification follows the profile named nfs4_cs_prep. specification follows the profile named nfs4_cs_prep.
14.1.1. Intended applicability of the nfs4_cs_prep profile 14.1.1. Intended applicability of the nfs4_cs_prep profile
The utf8str_cs type is a case sensitive string of UTF-8 characters. The utf8str_cs type is a case sensitive string of UTF-8 characters.
Its primary use in NFSv4.1 is for naming components and pathnames. Its primary use in NFSv4.1 is for naming components and pathnames.
skipping to change at page 344, line 36 skipping to change at page 347, line 21
15.1.8.6. NFS4ERR_LOCK_NOTSUPP (Error Code 10043) 15.1.8.6. NFS4ERR_LOCK_NOTSUPP (Error Code 10043)
A locking request was attempted which would require the upgrade or A locking request was attempted which would require the upgrade or
downgrade of a lock range already held by the owner when the server downgrade of a lock range already held by the owner when the server
does not support atomic upgrade or downgrade of locks. does not support atomic upgrade or downgrade of locks.
15.1.8.7. NFS4ERR_LOCK_RANGE (Error Code 10028) 15.1.8.7. NFS4ERR_LOCK_RANGE (Error Code 10028)
A lock request is operating on a range that overlaps in part a A lock request is operating on a range that overlaps in part a
currently held lock for the current lock owner and does not precisely currently held lock for the current lock-owner and does not precisely
match a single such lock where the server does not support this type match a single such lock where the server does not support this type
of request, and thus does not implement POSIX locking semantics. See of request, and thus does not implement POSIX locking semantics [24].
Section 18.10.4, Section 18.11.4, and Section 18.12.4 for a See Section 18.10.4, Section 18.11.4, and Section 18.12.4 for a
discussion of how this applies to LOCK, LOCKT, and LOCKU discussion of how this applies to LOCK, LOCKT, and LOCKU
respectively. respectively.
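The range condition behind NFS4ERR_LOCK_RANGE can be sketched as follows. This is an illustration only: struct byte_range and the function names are hypothetical, each range is assumed to have a non-zero length, and offset plus length is assumed not to wrap around (special length values are not modelled). A request that lies entirely inside a held lock without matching it exactly is treated here as a partial overlap.

```c
#include <assert.h>   /* only needed if the check is exercised with assert() */
#include <stdbool.h>
#include <stdint.h>

/* Illustrative byte range: assumes length > 0 and no wrap-around. */
struct byte_range {
    uint64_t offset;
    uint64_t length;
};

/* True when the two ranges share at least one byte. */
static bool ranges_overlap(struct byte_range a, struct byte_range b)
{
    return a.offset < b.offset + b.length &&
           b.offset < a.offset + a.length;
}

/* True when the two ranges cover exactly the same bytes. */
static bool ranges_equal(struct byte_range a, struct byte_range b)
{
    return a.offset == b.offset && a.length == b.length;
}

/*
 * True when the request overlaps a held lock of the same lock-owner
 * without precisely matching it, which is the situation in which a
 * server lacking support for such requests would return the error.
 */
static bool lock_range_mismatch(struct byte_range req, struct byte_range held)
{
    return ranges_overlap(req, held) && !ranges_equal(req, held);
}
```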
15.1.8.8. NFS4ERR_OPENMODE (Error Code 10038) 15.1.8.8. NFS4ERR_OPENMODE (Error Code 10038)
The client attempted a READ, WRITE, LOCK or other operation not The client attempted a READ, WRITE, LOCK or other operation not
sanctioned by the stateid passed (e.g. writing to a file opened only sanctioned by the stateid passed (e.g. writing to a file opened only
for read). for read).
15.1.8.9. NFS4ERR_SHARE_DENIED (Error Code 10015) 15.1.8.9. NFS4ERR_SHARE_DENIED (Error Code 10015)
skipping to change at page 368, line 4 skipping to change at page 371, line 4
| | NFS4ERR_WRONG_TYPE | | | NFS4ERR_WRONG_TYPE |
| CB_NOTIFY | NFS4ERR_BADHANDLE, NFS4ERR_BADXDR, | | CB_NOTIFY | NFS4ERR_BADHANDLE, NFS4ERR_BADXDR, |
| | NFS4ERR_BAD_STATEID, NFS4ERR_DELAY, | | | NFS4ERR_BAD_STATEID, NFS4ERR_DELAY, |
| | NFS4ERR_INVAL, NFS4ERR_NOTSUPP, | | | NFS4ERR_INVAL, NFS4ERR_NOTSUPP, |
| | NFS4ERR_OP_NOT_IN_SESSION, | | | NFS4ERR_OP_NOT_IN_SESSION, |
| | NFS4ERR_REP_TOO_BIG, | | | NFS4ERR_REP_TOO_BIG, |
| | NFS4ERR_REP_TOO_BIG_TO_CACHE, | | | NFS4ERR_REP_TOO_BIG_TO_CACHE, |
| | NFS4ERR_REQ_TOO_BIG, | | | NFS4ERR_REQ_TOO_BIG, |
| | NFS4ERR_SERVERFAULT, | | | NFS4ERR_SERVERFAULT, |
| | NFS4ERR_TOO_MANY_OPS | | | NFS4ERR_TOO_MANY_OPS |
| CB_NOTIFY_DEVICEID | NFS4ERR_BADXDR, NFS4ERR_DELAY, |
| | NFS4ERR_INVAL, NFS4ERR_NOTSUPP, |
| | NFS4ERR_OP_NOT_IN_SESSION, |
| | NFS4ERR_REP_TOO_BIG, |
| | NFS4ERR_REP_TOO_BIG_TO_CACHE, |
| | NFS4ERR_REQ_TOO_BIG, |
| | NFS4ERR_SERVERFAULT, |
| | NFS4ERR_TOO_MANY_OPS |
| CB_NOTIFY_LOCK | NFS4ERR_BADHANDLE, NFS4ERR_BADXDR, | | CB_NOTIFY_LOCK | NFS4ERR_BADHANDLE, NFS4ERR_BADXDR, |
| | NFS4ERR_BAD_STATEID, NFS4ERR_DELAY, | | | NFS4ERR_BAD_STATEID, NFS4ERR_DELAY, |
| | NFS4ERR_NOTSUPP, | | | NFS4ERR_NOTSUPP, |
| | NFS4ERR_OP_NOT_IN_SESSION, | | | NFS4ERR_OP_NOT_IN_SESSION, |
| | NFS4ERR_REP_TOO_BIG, | | | NFS4ERR_REP_TOO_BIG, |
| | NFS4ERR_REP_TOO_BIG_TO_CACHE, | | | NFS4ERR_REP_TOO_BIG_TO_CACHE, |
| | NFS4ERR_REQ_TOO_BIG, | | | NFS4ERR_REQ_TOO_BIG, |
| | NFS4ERR_SERVERFAULT, | | | NFS4ERR_SERVERFAULT, |
| | NFS4ERR_TOO_MANY_OPS | | | NFS4ERR_TOO_MANY_OPS |
| CB_PUSH_DELEG | NFS4ERR_BADHANDLE, NFS4ERR_BADXDR, | | CB_PUSH_DELEG | NFS4ERR_BADHANDLE, NFS4ERR_BADXDR, |
skipping to change at page 371, line 8 skipping to change at page 374, line 8
| NFS4ERR_BADOWNER | CREATE, OPEN, SETATTR | | NFS4ERR_BADOWNER | CREATE, OPEN, SETATTR |
| NFS4ERR_BADSESSION | BIND_CONN_TO_SESSION, | | NFS4ERR_BADSESSION | BIND_CONN_TO_SESSION, |
| | CB_SEQUENCE, DESTROY_SESSION, | | | CB_SEQUENCE, DESTROY_SESSION, |
| | SEQUENCE | | | SEQUENCE |
| NFS4ERR_BADSLOT | CB_SEQUENCE, SEQUENCE | | NFS4ERR_BADSLOT | CB_SEQUENCE, SEQUENCE |
| NFS4ERR_BADTYPE | CREATE | | NFS4ERR_BADTYPE | CREATE |
| NFS4ERR_BADXDR | ACCESS, BACKCHANNEL_CTL, | | NFS4ERR_BADXDR | ACCESS, BACKCHANNEL_CTL, |
| | BIND_CONN_TO_SESSION, | | | BIND_CONN_TO_SESSION, |
| | CB_GETATTR, CB_ILLEGAL, | | | CB_GETATTR, CB_ILLEGAL, |
| | CB_LAYOUTRECALL, CB_NOTIFY, | | | CB_LAYOUTRECALL, CB_NOTIFY, |
| | CB_NOTIFY_DEVICEID, |
| | CB_NOTIFY_LOCK, | | | CB_NOTIFY_LOCK, |
| | CB_PUSH_DELEG, CB_RECALL, | | | CB_PUSH_DELEG, CB_RECALL, |
| | CB_RECALLABLE_OBJ_AVAIL, | | | CB_RECALLABLE_OBJ_AVAIL, |
| | CB_RECALL_ANY, | | | CB_RECALL_ANY, |
| | CB_RECALL_SLOT, CB_SEQUENCE, | | | CB_RECALL_SLOT, CB_SEQUENCE, |
| | CB_WANTS_CANCELLED, CLOSE, | | | CB_WANTS_CANCELLED, CLOSE, |
| | COMMIT, CREATE, | | | COMMIT, CREATE, |
| | CREATE_SESSION, DELEGPURGE, | | | CREATE_SESSION, DELEGPURGE, |
| | DELEGRETURN, | | | DELEGRETURN, |
| | DESTROY_CLIENTID, | | | DESTROY_CLIENTID, |
skipping to change at page 373, line 7 skipping to change at page 376, line 7
| | READ, READDIR, READLINK, | | | READ, READDIR, READLINK, |
| | RECLAIM_COMPLETE, REMOVE, | | | RECLAIM_COMPLETE, REMOVE, |
| | RENAME, RESTOREFH, SAVEFH, | | | RENAME, RESTOREFH, SAVEFH, |
| | SECINFO, SECINFO_NO_NAME, | | | SECINFO, SECINFO_NO_NAME, |
| | SEQUENCE, SETATTR, SET_SSV, | | | SEQUENCE, SETATTR, SET_SSV, |
| | TEST_STATEID, VERIFY, | | | TEST_STATEID, VERIFY, |
| | WANT_DELEGATION, WRITE | | | WANT_DELEGATION, WRITE |
| NFS4ERR_DELAY | ACCESS, BACKCHANNEL_CTL, | | NFS4ERR_DELAY | ACCESS, BACKCHANNEL_CTL, |
| | BIND_CONN_TO_SESSION, | | | BIND_CONN_TO_SESSION, |
| | CB_GETATTR, CB_LAYOUTRECALL, | | | CB_GETATTR, CB_LAYOUTRECALL, |
| | CB_NOTIFY, CB_NOTIFY_LOCK, | | | CB_NOTIFY, |
| | CB_NOTIFY_DEVICEID, |
| | CB_NOTIFY_LOCK, |
| | CB_PUSH_DELEG, CB_RECALL, | | | CB_PUSH_DELEG, CB_RECALL, |
| | CB_RECALLABLE_OBJ_AVAIL, | | | CB_RECALLABLE_OBJ_AVAIL, |
| | CB_RECALL_ANY, | | | CB_RECALL_ANY, |
| | CB_RECALL_SLOT, CB_SEQUENCE, | | | CB_RECALL_SLOT, CB_SEQUENCE, |
| | CB_WANTS_CANCELLED, CLOSE, | | | CB_WANTS_CANCELLED, CLOSE, |
| | COMMIT, CREATE, | | | COMMIT, CREATE, |
| | CREATE_SESSION, DELEGPURGE, | | | CREATE_SESSION, DELEGPURGE, |
| | DELEGRETURN, | | | DELEGRETURN, |
| | DESTROY_CLIENTID, | | | DESTROY_CLIENTID, |
| | DESTROY_SESSION, EXCHANGE_ID, | | | DESTROY_SESSION, EXCHANGE_ID, |
skipping to change at page 374, line 31 skipping to change at page 378, line 7
| | LAYOUTCOMMIT, LAYOUTGET, | | | LAYOUTCOMMIT, LAYOUTGET, |
| | LAYOUTRETURN, LINK, LOCK, | | | LAYOUTRETURN, LINK, LOCK, |
| | LOCKT, NVERIFY, OPEN, READ, | | | LOCKT, NVERIFY, OPEN, READ, |
| | REMOVE, RENAME, SETATTR, | | | REMOVE, RENAME, SETATTR, |
| | VERIFY, WANT_DELEGATION, | | | VERIFY, WANT_DELEGATION, |
| | WRITE | | | WRITE |
| NFS4ERR_HASH_ALG_UNSUPP | EXCHANGE_ID | | NFS4ERR_HASH_ALG_UNSUPP | EXCHANGE_ID |
| NFS4ERR_INVAL | ACCESS, BACKCHANNEL_CTL, | | NFS4ERR_INVAL | ACCESS, BACKCHANNEL_CTL, |
| | BIND_CONN_TO_SESSION, | | | BIND_CONN_TO_SESSION, |
| | CB_GETATTR, CB_LAYOUTRECALL, | | | CB_GETATTR, CB_LAYOUTRECALL, |
| | CB_NOTIFY, CB_PUSH_DELEG, | | | CB_NOTIFY, |
| | CB_NOTIFY_DEVICEID, |
| | CB_PUSH_DELEG, |
| | CB_RECALLABLE_OBJ_AVAIL, | | | CB_RECALLABLE_OBJ_AVAIL, |
| | CB_RECALL_ANY, CREATE, | | | CB_RECALL_ANY, CREATE, |
| | CREATE_SESSION, DELEGRETURN, | | | CREATE_SESSION, DELEGRETURN, |
| | EXCHANGE_ID, GETATTR, | | | EXCHANGE_ID, GETATTR, |
| | GETDEVICEINFO, GETDEVICELIST, | | | GETDEVICEINFO, GETDEVICELIST, |
| | GET_DIR_DELEGATION, | | | GET_DIR_DELEGATION, |
| | LAYOUTCOMMIT, LAYOUTGET, | | | LAYOUTCOMMIT, LAYOUTGET, |
| | LAYOUTRETURN, LINK, LOCK, | | | LAYOUTRETURN, LINK, LOCK, |
| | LOCKT, LOCKU, LOOKUP, | | | LOCKT, LOCKU, LOOKUP, |
| | NVERIFY, OPEN, | | | NVERIFY, OPEN, |
skipping to change at page 376, line 30 skipping to change at page 380, line 5
| NFS4ERR_NOSPC | CREATE, CREATE_SESSION, | | NFS4ERR_NOSPC | CREATE, CREATE_SESSION, |
| | LAYOUTGET, LINK, OPEN, | | | LAYOUTGET, LINK, OPEN, |
| | OPENATTR, RENAME, SETATTR, | | | OPENATTR, RENAME, SETATTR, |
| | WRITE | | | WRITE |
| NFS4ERR_NOTDIR | CREATE, GET_DIR_DELEGATION, | | NFS4ERR_NOTDIR | CREATE, GET_DIR_DELEGATION, |
| | LINK, LOOKUP, LOOKUPP, OPEN, | | | LINK, LOOKUP, LOOKUPP, OPEN, |
| | READDIR, REMOVE, RENAME, | | | READDIR, REMOVE, RENAME, |
| | SECINFO, SECINFO_NO_NAME | | | SECINFO, SECINFO_NO_NAME |
| NFS4ERR_NOTEMPTY | REMOVE, RENAME | | NFS4ERR_NOTEMPTY | REMOVE, RENAME |
| NFS4ERR_NOTSUPP | CB_LAYOUTRECALL, CB_NOTIFY, | | NFS4ERR_NOTSUPP | CB_LAYOUTRECALL, CB_NOTIFY, |
| | CB_NOTIFY_DEVICEID, |
| | CB_NOTIFY_LOCK, | | | CB_NOTIFY_LOCK, |
| | CB_PUSH_DELEG, | | | CB_PUSH_DELEG, |
| | CB_RECALLABLE_OBJ_AVAIL, | | | CB_RECALLABLE_OBJ_AVAIL, |
| | CB_WANTS_CANCELLED, | | | CB_WANTS_CANCELLED, |
| | DELEGPURGE, DELEGRETURN, | | | DELEGPURGE, DELEGRETURN, |
| | GETDEVICEINFO, GETDEVICELIST, | | | GETDEVICEINFO, GETDEVICELIST, |
| | GET_DIR_DELEGATION, | | | GET_DIR_DELEGATION, |
| | LAYOUTCOMMIT, LAYOUTGET, | | | LAYOUTCOMMIT, LAYOUTGET, |
| | LAYOUTRETURN, LINK, OPENATTR, | | | LAYOUTRETURN, LINK, OPENATTR, |
| | OPEN_CONFIRM, | | | OPEN_CONFIRM, |
skipping to change at page 377, line 14 skipping to change at page 381, line 6
| NFS4ERR_OLD_STATEID | CLOSE, DELEGRETURN, | | NFS4ERR_OLD_STATEID | CLOSE, DELEGRETURN, |
| | FREE_STATEID, LAYOUTGET, | | | FREE_STATEID, LAYOUTGET, |
| | LAYOUTRETURN, LOCK, LOCKU, | | | LAYOUTRETURN, LOCK, LOCKU, |
| | OPEN, OPEN_DOWNGRADE, READ, | | | OPEN, OPEN_DOWNGRADE, READ, |
| | SETATTR, WRITE | | | SETATTR, WRITE |
| NFS4ERR_OPENMODE | LAYOUTGET, LOCK, READ, | | NFS4ERR_OPENMODE | LAYOUTGET, LOCK, READ, |
| | SETATTR, WRITE | | | SETATTR, WRITE |
| NFS4ERR_OP_ILLEGAL | CB_ILLEGAL, ILLEGAL | | NFS4ERR_OP_ILLEGAL | CB_ILLEGAL, ILLEGAL |
| NFS4ERR_OP_NOT_IN_SESSION | ACCESS, BACKCHANNEL_CTL, | | NFS4ERR_OP_NOT_IN_SESSION | ACCESS, BACKCHANNEL_CTL, |
| | CB_GETATTR, CB_LAYOUTRECALL, | | | CB_GETATTR, CB_LAYOUTRECALL, |
| | CB_NOTIFY, CB_NOTIFY_LOCK, | | | CB_NOTIFY, |
| | CB_NOTIFY_DEVICEID, |
| | CB_NOTIFY_LOCK, |
| | CB_PUSH_DELEG, CB_RECALL, | | | CB_PUSH_DELEG, CB_RECALL, |
| | CB_RECALLABLE_OBJ_AVAIL, | | | CB_RECALLABLE_OBJ_AVAIL, |
| | CB_RECALL_ANY, | | | CB_RECALL_ANY, |
| | CB_RECALL_SLOT, | | | CB_RECALL_SLOT, |
| | CB_WANTS_CANCELLED, CLOSE, | | | CB_WANTS_CANCELLED, CLOSE, |
| | COMMIT, CREATE, DELEGPURGE, | | | COMMIT, CREATE, DELEGPURGE, |
| | DELEGRETURN, FREE_STATEID, | | | DELEGRETURN, FREE_STATEID, |
| | GETATTR, GETDEVICEINFO, | | | GETATTR, GETDEVICEINFO, |
| | GETDEVICELIST, GETFH, | | | GETDEVICELIST, GETFH, |
| | GET_DIR_DELEGATION, | | | GET_DIR_DELEGATION, |
skipping to change at page 378, line 7 skipping to change at page 382, line 7
| NFS4ERR_PNFS_NO_LAYOUT | READ, WRITE | | NFS4ERR_PNFS_NO_LAYOUT | READ, WRITE |
| NFS4ERR_RECALLCONFLICT | LAYOUTGET, WANT_DELEGATION | | NFS4ERR_RECALLCONFLICT | LAYOUTGET, WANT_DELEGATION |
| NFS4ERR_RECLAIM_BAD | LAYOUTCOMMIT, LOCK, OPEN, | | NFS4ERR_RECLAIM_BAD | LAYOUTCOMMIT, LOCK, OPEN, |
| | WANT_DELEGATION | | | WANT_DELEGATION |
| NFS4ERR_RECLAIM_CONFLICT | LAYOUTCOMMIT, LOCK, OPEN, | | NFS4ERR_RECLAIM_CONFLICT | LAYOUTCOMMIT, LOCK, OPEN, |
| | WANT_DELEGATION | | | WANT_DELEGATION |
| NFS4ERR_REJECT_DELEG | CB_PUSH_DELEG | | NFS4ERR_REJECT_DELEG | CB_PUSH_DELEG |
| NFS4ERR_REP_TOO_BIG | ACCESS, BACKCHANNEL_CTL, | | NFS4ERR_REP_TOO_BIG | ACCESS, BACKCHANNEL_CTL, |
| | BIND_CONN_TO_SESSION, | | | BIND_CONN_TO_SESSION, |
| | CB_GETATTR, CB_LAYOUTRECALL, | | | CB_GETATTR, CB_LAYOUTRECALL, |
| | CB_NOTIFY, CB_NOTIFY_LOCK, | | | CB_NOTIFY, |
| | CB_NOTIFY_DEVICEID, |
| | CB_NOTIFY_LOCK, |
| | CB_PUSH_DELEG, CB_RECALL, | | | CB_PUSH_DELEG, CB_RECALL, |
| | CB_RECALLABLE_OBJ_AVAIL, | | | CB_RECALLABLE_OBJ_AVAIL, |
| | CB_RECALL_ANY, | | | CB_RECALL_ANY, |
| | CB_RECALL_SLOT, CB_SEQUENCE, | | | CB_RECALL_SLOT, CB_SEQUENCE, |
| | CB_WANTS_CANCELLED, CLOSE, | | | CB_WANTS_CANCELLED, CLOSE, |
| | COMMIT, CREATE, | | | COMMIT, CREATE, |
| | CREATE_SESSION, DELEGPURGE, | | | CREATE_SESSION, DELEGPURGE, |
| | DELEGRETURN, | | | DELEGRETURN, |
| | DESTROY_CLIENTID, | | | DESTROY_CLIENTID, |
| | DESTROY_SESSION, EXCHANGE_ID, | | | DESTROY_SESSION, EXCHANGE_ID, |
skipping to change at page 379, line 7 skipping to change at page 383, line 7
| | READ, READDIR, READLINK, | | | READ, READDIR, READLINK, |
| | RECLAIM_COMPLETE, REMOVE, | | | RECLAIM_COMPLETE, REMOVE, |
| | RENAME, RESTOREFH, SAVEFH, | | | RENAME, RESTOREFH, SAVEFH, |
| | SECINFO, SECINFO_NO_NAME, | | | SECINFO, SECINFO_NO_NAME, |
| | SEQUENCE, SETATTR, SET_SSV, | | | SEQUENCE, SETATTR, SET_SSV, |
| | TEST_STATEID, VERIFY, | | | TEST_STATEID, VERIFY, |
| | WANT_DELEGATION, WRITE | | | WANT_DELEGATION, WRITE |
| NFS4ERR_REP_TOO_BIG_TO_CACHE | ACCESS, BACKCHANNEL_CTL, | | NFS4ERR_REP_TOO_BIG_TO_CACHE | ACCESS, BACKCHANNEL_CTL, |
| | BIND_CONN_TO_SESSION, | | | BIND_CONN_TO_SESSION, |
| | CB_GETATTR, CB_LAYOUTRECALL, | | | CB_GETATTR, CB_LAYOUTRECALL, |
| | CB_NOTIFY, CB_NOTIFY_LOCK, | | | CB_NOTIFY, |
| | CB_NOTIFY_DEVICEID, |
| | CB_NOTIFY_LOCK, |
| | CB_PUSH_DELEG, CB_RECALL, | | | CB_PUSH_DELEG, CB_RECALL, |
| | CB_RECALLABLE_OBJ_AVAIL, | | | CB_RECALLABLE_OBJ_AVAIL, |
| | CB_RECALL_ANY, | | | CB_RECALL_ANY, |
| | CB_RECALL_SLOT, CB_SEQUENCE, | | | CB_RECALL_SLOT, CB_SEQUENCE, |
| | CB_WANTS_CANCELLED, CLOSE, | | | CB_WANTS_CANCELLED, CLOSE, |
| | COMMIT, CREATE, | | | COMMIT, CREATE, |
| | CREATE_SESSION, DELEGPURGE, | | | CREATE_SESSION, DELEGPURGE, |
| | DELEGRETURN, | | | DELEGRETURN, |
| | DESTROY_CLIENTID, | | | DESTROY_CLIENTID, |
| | DESTROY_SESSION, EXCHANGE_ID, | | | DESTROY_SESSION, EXCHANGE_ID, |
skipping to change at page 380, line 7 skipping to change at page 384, line 7
| | READ, READDIR, READLINK, | | | READ, READDIR, READLINK, |
| | RECLAIM_COMPLETE, REMOVE, | | | RECLAIM_COMPLETE, REMOVE, |
| | RENAME, RESTOREFH, SAVEFH, | | | RENAME, RESTOREFH, SAVEFH, |
| | SECINFO, SECINFO_NO_NAME, | | | SECINFO, SECINFO_NO_NAME, |
| | SEQUENCE, SETATTR, SET_SSV, | | | SEQUENCE, SETATTR, SET_SSV, |
| | TEST_STATEID, VERIFY, | | | TEST_STATEID, VERIFY, |
| | WANT_DELEGATION, WRITE | | | WANT_DELEGATION, WRITE |
| NFS4ERR_REQ_TOO_BIG | ACCESS, BACKCHANNEL_CTL, | | NFS4ERR_REQ_TOO_BIG | ACCESS, BACKCHANNEL_CTL, |
| | BIND_CONN_TO_SESSION, | | | BIND_CONN_TO_SESSION, |
| | CB_GETATTR, CB_LAYOUTRECALL, | | | CB_GETATTR, CB_LAYOUTRECALL, |
| | CB_NOTIFY, CB_NOTIFY_LOCK, | | | CB_NOTIFY, |
| | CB_NOTIFY_DEVICEID, |
| | CB_NOTIFY_LOCK, |
| | CB_PUSH_DELEG, CB_RECALL, | | | CB_PUSH_DELEG, CB_RECALL, |
| | CB_RECALLABLE_OBJ_AVAIL, | | | CB_RECALLABLE_OBJ_AVAIL, |
| | CB_RECALL_ANY, | | | CB_RECALL_ANY, |
| | CB_RECALL_SLOT, CB_SEQUENCE, | | | CB_RECALL_SLOT, CB_SEQUENCE, |
| | CB_WANTS_CANCELLED, CLOSE, | | | CB_WANTS_CANCELLED, CLOSE, |
| | COMMIT, CREATE, | | | COMMIT, CREATE, |
| | CREATE_SESSION, DELEGPURGE, | | | CREATE_SESSION, DELEGPURGE, |
| | DELEGRETURN, | | | DELEGRETURN, |
| | DESTROY_CLIENTID, | | | DESTROY_CLIENTID, |
| | DESTROY_SESSION, EXCHANGE_ID, | | | DESTROY_SESSION, EXCHANGE_ID, |
skipping to change at page 381, line 6 skipping to change at page 385, line 6
| | OPEN, OPENATTR, | | | OPEN, OPENATTR, |
| | OPEN_DOWNGRADE, REMOVE, | | | OPEN_DOWNGRADE, REMOVE, |
| | RENAME, SETATTR, WRITE | | | RENAME, SETATTR, WRITE |
| NFS4ERR_SAME | NVERIFY | | NFS4ERR_SAME | NVERIFY |
| NFS4ERR_SEQUENCE_POS | CB_SEQUENCE, SEQUENCE | | NFS4ERR_SEQUENCE_POS | CB_SEQUENCE, SEQUENCE |
| NFS4ERR_SEQ_FALSE_RETRY | CB_SEQUENCE, SEQUENCE | | NFS4ERR_SEQ_FALSE_RETRY | CB_SEQUENCE, SEQUENCE |
| NFS4ERR_SEQ_MISORDERED | CB_SEQUENCE, CREATE_SESSION, | | NFS4ERR_SEQ_MISORDERED | CB_SEQUENCE, CREATE_SESSION, |
| | SEQUENCE | | | SEQUENCE |
| NFS4ERR_SERVERFAULT | ACCESS, BIND_CONN_TO_SESSION, | | NFS4ERR_SERVERFAULT | ACCESS, BIND_CONN_TO_SESSION, |
| | CB_GETATTR, CB_NOTIFY, | | | CB_GETATTR, CB_NOTIFY, |
| | CB_NOTIFY_DEVICEID, |
| | CB_NOTIFY_LOCK, | | | CB_NOTIFY_LOCK, |
| | CB_PUSH_DELEG, CB_RECALL, | | | CB_PUSH_DELEG, CB_RECALL, |
| | CB_RECALLABLE_OBJ_AVAIL, | | | CB_RECALLABLE_OBJ_AVAIL, |
| | CB_WANTS_CANCELLED, CLOSE, | | | CB_WANTS_CANCELLED, CLOSE, |
| | COMMIT, CREATE, | | | COMMIT, CREATE, |
| | CREATE_SESSION, DELEGPURGE, | | | CREATE_SESSION, DELEGPURGE, |
| | DELEGRETURN, | | | DELEGRETURN, |
| | DESTROY_CLIENTID, | | | DESTROY_CLIENTID, |
| | DESTROY_SESSION, EXCHANGE_ID, | | | DESTROY_SESSION, EXCHANGE_ID, |
| | FREE_STATEID, GETATTR, | | | FREE_STATEID, GETATTR, |
skipping to change at page 382, line 13 skipping to change at page 386, line 13
| | DESTROY_SESSION | | | DESTROY_SESSION |
| NFS4ERR_SYMLINK | COMMIT, LAYOUTCOMMIT, LINK, | | NFS4ERR_SYMLINK | COMMIT, LAYOUTCOMMIT, LINK, |
| | LOCK, LOCKT, LOOKUP, LOOKUPP, | | | LOCK, LOCKT, LOOKUP, LOOKUPP, |
| | OPEN, READ, WRITE | | | OPEN, READ, WRITE |
| NFS4ERR_TOOSMALL | CREATE_SESSION, | | NFS4ERR_TOOSMALL | CREATE_SESSION, |
| | GETDEVICEINFO, LAYOUTGET, | | | GETDEVICEINFO, LAYOUTGET, |
| | READDIR | | | READDIR |
| NFS4ERR_TOO_MANY_OPS | ACCESS, BACKCHANNEL_CTL, | | NFS4ERR_TOO_MANY_OPS | ACCESS, BACKCHANNEL_CTL, |
| | BIND_CONN_TO_SESSION, | | | BIND_CONN_TO_SESSION, |
| | CB_GETATTR, CB_LAYOUTRECALL, | | | CB_GETATTR, CB_LAYOUTRECALL, |
| | CB_NOTIFY, CB_NOTIFY_LOCK, | | | CB_NOTIFY, |
| | CB_NOTIFY_DEVICEID, |
| | CB_NOTIFY_LOCK, |
| | CB_PUSH_DELEG, CB_RECALL, | | | CB_PUSH_DELEG, CB_RECALL, |
| | CB_RECALLABLE_OBJ_AVAIL, | | | CB_RECALLABLE_OBJ_AVAIL, |
| | CB_RECALL_ANY, | | | CB_RECALL_ANY, |
| | CB_RECALL_SLOT, CB_SEQUENCE, | | | CB_RECALL_SLOT, CB_SEQUENCE, |
| | CB_WANTS_CANCELLED, CLOSE, | | | CB_WANTS_CANCELLED, CLOSE, |
| | COMMIT, CREATE, | | | COMMIT, CREATE, |
| | CREATE_SESSION, DELEGPURGE, | | | CREATE_SESSION, DELEGPURGE, |
| | DELEGRETURN, | | | DELEGRETURN, |
| | DESTROY_CLIENTID, | | | DESTROY_CLIENTID, |
| | DESTROY_SESSION, EXCHANGE_ID, | | | DESTROY_SESSION, EXCHANGE_ID, |
skipping to change at page 392, line 32 skipping to change at page 396, line 32
GETFH GETFH
Figure 2 Figure 2
In this example, the PUTFH (Section 18.19) operation explicitly sets In this example, the PUTFH (Section 18.19) operation explicitly sets
the current filehandle value while the result of each LOOKUP the current filehandle value while the result of each LOOKUP
operation sets the current filehandle value to the resultant file operation sets the current filehandle value to the resultant file
system object. Also, the client is able to insert GETATTR operations system object. Also, the client is able to insert GETATTR operations
using the current filehandle as an argument. using the current filehandle as an argument.
The PUTROOTFH (Section 18.21) and PUTPUBFH (Section 18.21) operations The PUTROOTFH (Section 18.21) and PUTPUBFH (Section 18.20) operations
also set the current filehandle. The above example would replace also set the current filehandle. The above example would replace
"PUTFH fh1" with PUTROOTFH or PUTPUBFH with no filehandle argument in "PUTFH fh1" with PUTROOTFH or PUTPUBFH with no filehandle argument in
order to achieve the same effect (on the assumption that "compA" is order to achieve the same effect (on the assumption that "compA" is
directly below the root of the namespace). directly below the root of the namespace).
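The way successive operations set the current filehandle can be sketched as a toy model. This is only an illustration under stated assumptions: the class name CompoundState, the use of path strings as filehandles (real filehandles are opaque), and the component "compB" are all hypothetical and do not come from the specification.

```python
class CompoundState:
    """Toy model of the current filehandle within one COMPOUND (hypothetical)."""

    def __init__(self, namespace, root="/"):
        self.namespace = namespace  # maps a directory path to its child names
        self.root = root
        self.current_fh = None      # nothing is set until a PUT*FH runs

    def putfh(self, fh):
        # PUTFH explicitly sets the current filehandle.
        self.current_fh = fh

    def putrootfh(self):
        # PUTROOTFH takes no filehandle argument.
        self.current_fh = self.root

    def lookup(self, name):
        # LOOKUP replaces the current filehandle with the object it found.
        if self.current_fh is None:
            raise RuntimeError("no current filehandle")
        if name not in self.namespace.get(self.current_fh, set()):
            raise FileNotFoundError(name)
        self.current_fh = self.current_fh.rstrip("/") + "/" + name

    def getfh(self):
        # GETFH returns the current filehandle without changing it.
        return self.current_fh


namespace = {"/": {"compA"}, "/compA": {"compB"}}

via_root = CompoundState(namespace)
via_root.putrootfh()
via_root.lookup("compA")
via_root.lookup("compB")

via_putfh = CompoundState(namespace)
via_putfh.putfh("/")          # stands in for "PUTFH fh1" when fh1 is the root
via_putfh.lookup("compA")
via_putfh.lookup("compB")
```

Both sequences end with the same current filehandle, which mirrors the point that PUTROOTFH can stand in for "PUTFH fh1" when "compA" sits directly below the root.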
Along with the current filehandle, there is a saved filehandle. Along with the current filehandle, there is a saved filehandle.
While the current filehandle is set as the result of operations like While the current filehandle is set as the result of operations like
LOOKUP, the saved filehandle must be set directly with the use of the LOOKUP, the saved filehandle must be set directly with the use of the
SAVEFH operation. The SAVEFH operation copies the current SAVEFH operation. The SAVEFH operation copies the current
filehandle value to the saved value. The saved filehandle value is filehandle value to the saved value. The saved filehandle value is
skipping to change at page 398, line 21 skipping to change at page 402, line 21
Callback Operations Callback Operations
+-------------------------+-----------+-------------+---------------+ +-------------------------+-----------+-------------+---------------+
| Operation | REQ, REC, | Feature | Definition | | Operation | REQ, REC, | Feature | Definition |
| | OPT, or | (REQ, REC, | | | | OPT, or | (REQ, REC, | |
| | MNI | or OPT) | | | | MNI | or OPT) | |
+-------------------------+-----------+-------------+---------------+ +-------------------------+-----------+-------------+---------------+
| CB_GETATTR | OPT | FDELG (REQ) | Section 20.1 | | CB_GETATTR | OPT | FDELG (REQ) | Section 20.1 |
| CB_LAYOUTRECALL | OPT | pNFS (REQ) | Section 20.3 | | CB_LAYOUTRECALL | OPT | pNFS (REQ) | Section 20.3 |
| CB_NOTIFY | OPT | DDELG (REQ) | Section 20.4 | | CB_NOTIFY | OPT | DDELG (REQ) | Section 20.4 |
| CB_NOTIFY_DEVICEID | OPT | pNFS (OPT) | Section 20.4 | | CB_NOTIFY_DEVICEID | OPT | pNFS (OPT) | Section 20.12 |
| CB_NOTIFY_LOCK | OPT | | Section 20.11 | | CB_NOTIFY_LOCK | OPT | | Section 20.11 |
| CB_PUSH_DELEG | OPT | FDELG (OPT) | Section 20.5 | | CB_PUSH_DELEG | OPT | FDELG (OPT) | Section 20.5 |
| CB_RECALL | OPT | FDELG, | Section 20.2 | | CB_RECALL | OPT | FDELG, | Section 20.2 |
| | | DDELG, pNFS | | | | | DDELG, pNFS | |
| | | (REQ) | | | | | (REQ) | |
| CB_RECALL_ANY | OPT | FDELG, | Section 20.6 | | CB_RECALL_ANY | OPT | FDELG, | Section 20.6 |
| | | DDELG, pNFS | | | | | DDELG, pNFS | |
| | | (REQ) | | | | | (REQ) | |
| CB_RECALL_SLOT | REQ | | Section 20.8 | | CB_RECALL_SLOT | REQ | | Section 20.8 |
| CB_RECALLABLE_OBJ_AVAIL | OPT | DDELG, pNFS | Section 20.7 | | CB_RECALLABLE_OBJ_AVAIL | OPT | DDELG, pNFS | Section 20.7 |
skipping to change at page 401, line 5 skipping to change at page 405, line 5
o When a client executes a regular file, it has to read the file o When a client executes a regular file, it has to read the file
from the server. Strictly speaking, the server should not allow from the server. Strictly speaking, the server should not allow
the client to read a file being executed unless the user has read the client to read a file being executed unless the user has read
permissions on the file. Requiring users and administrators to set permissions on the file. Requiring users and administrators to set
read permissions on executable files in order to access them over read permissions on executable files in order to access them over
NFS is not going to be acceptable to some people. Historically, NFS is not going to be acceptable to some people. Historically,
NFS servers have allowed a user to READ a file if the user has NFS servers have allowed a user to READ a file if the user has
execute access to the file. execute access to the file.
As a practical example, the UNIX specification [40] states that an As a practical example, the UNIX specification [50] states that an
implementation claiming conformance to UNIX may indicate in the implementation claiming conformance to UNIX may indicate in the
access() programming interface's result that a privileged user has access() programming interface's result that a privileged user has
execute rights, even if no execute permission bits are set on the execute rights, even if no execute permission bits are set on the
regular file's attributes. It is possible to claim conformance to regular file's attributes. It is possible to claim conformance to
the UNIX specification and instead not indicate execute rights in the UNIX specification and instead not indicate execute rights in
that situation, which is true for some operating environments. that situation, which is true for some operating environments.
Suppose the operating environments of the client and server are Suppose the operating environments of the client and server are
implementing the access() semantics for privileged users differently, implementing the access() semantics for privileged users differently,
and the ACCESS operation implementations of the client and server and the ACCESS operation implementations of the client and server
follow their respective access() semantics. This can cause undesired follow their respective access() semantics. This can cause undesired
skipping to change at page 406, line 47 skipping to change at page 410, line 47
event or instantiation that may lead to a loss of uncommitted data. event or instantiation that may lead to a loss of uncommitted data.
Most commonly this occurs when the server is restarted; however, Most commonly this occurs when the server is restarted; however,
other events at the server may result in uncommitted data loss as other events at the server may result in uncommitted data loss as
well. well.
On success, the current filehandle retains its value. On success, the current filehandle retains its value.
18.3.4. IMPLEMENTATION 18.3.4. IMPLEMENTATION
The COMMIT operation is similar in operation and semantics to the The COMMIT operation is similar in operation and semantics to the
POSIX fsync(2) system call that synchronizes a file's state with the POSIX fsync() [25] system interface that synchronizes a file's state
disk (file data and metadata is flushed to disk or stable storage). with the disk (file data and metadata is flushed to disk or stable
COMMIT performs the same operation for a client, flushing any storage). COMMIT performs the same operation for a client, flushing
unsynchronized data and metadata on the server to the server's disk any unsynchronized data and metadata on the server to the server's
or stable storage for the specified file. Like fsync(2), it may be disk or stable storage for the specified file. Like fsync(2), it may
that there is some modified data or no modified data to synchronize. be that there is some modified data or no modified data to
The data may have been synchronized by the server's normal periodic synchronize. The data may have been synchronized by the server's
buffer synchronization activity. COMMIT should return NFS4_OK, normal periodic buffer synchronization activity. COMMIT should
unless there has been an unexpected error. return NFS4_OK, unless there has been an unexpected error.
COMMIT differs from fsync(2) in that it is possible for the client to COMMIT differs from fsync(2) in that it is possible for the client to
flush a range of the file (most likely triggered by a buffer- flush a range of the file (most likely triggered by a buffer-
reclamation scheme on the client before the file has been completely reclamation scheme on the client before the file has been completely
written). written).
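The relationship between COMMIT and fsync() can be sketched as a conservative server-side handler. This is a minimal sketch, not the specified implementation: the function name commit and its return shape are hypothetical, and it assumes a server that has no range-flush primitive and therefore flushes the whole file for every request, which covers any requested range at the cost of extra I/O.

```python
import os
import tempfile


def commit(fd, offset, count):
    """Conservative sketch of a COMMIT handler (hypothetical helper).

    Returns (status, whole_file), where whole_file reports whether the
    request was a full-file COMMIT (offset 0 and count 0).
    """
    whole_file = (offset == 0 and count == 0)
    # Assumption: no portable range flush is available, so the whole file
    # is flushed even for a ranged request. An OSError propagates to the
    # caller, which would map it to an error status of its choosing.
    os.fsync(fd)
    return "NFS4_OK", whole_file


with tempfile.TemporaryFile() as f:
    f.write(b"data")
    f.flush()  # move Python's userspace buffer to the OS before fsync
    full = commit(f.fileno(), 0, 0)
    ranged = commit(f.fileno(), 0, 2)
```

A server with a real range-flush facility could narrow the second case; the sketch deliberately trades efficiency for simplicity.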
The server implementation of COMMIT is reasonably simple. If the The server implementation of COMMIT is reasonably simple. If the
server receives a full file COMMIT request, that is starting at server receives a full file COMMIT request, that is starting at
offset 0 and count 0, it should do the equivalent of fsync()'ing the offset 0 and count 0, it should do the equivalent of fsync()'ing the
file. Otherwise, it should arrange to have the modified data in the file. Otherwise, it should arrange to have the modified data in the
skipping to change at page 410, line 24 skipping to change at page 414, line 24
MUST derive the owner (or the owner ACE). This would typically be MUST derive the owner (or the owner ACE). This would typically be
from the principal indicated in the RPC credentials of the call, but from the principal indicated in the RPC credentials of the call, but
the server's operating environment or file system semantics may the server's operating environment or file system semantics may
dictate other methods of derivation. Similarly, if createattrs dictate other methods of derivation. Similarly, if createattrs
includes neither the group attribute nor a group ACE, and if the includes neither the group attribute nor a group ACE, and if the
server's file system both supports and requires the notion of a group server's file system both supports and requires the notion of a group
attribute (or group ACE), the server MUST derive the group attribute attribute (or group ACE), the server MUST derive the group attribute
(or the corresponding owner ACE) for the file. This could be from (or the corresponding owner ACE) for the file. This could be from
the RPC call's credentials, such as the group principal if the the RPC call's credentials, such as the group principal if the
credentials include it (such as with AUTH_SYS), from the group credentials include it (such as with AUTH_SYS), from the group
identifier associated with the principal in the credentials (for identifier associated with the principal in the credentials (e.g.,
e.g., POSIX systems have a passwd database that has the group POSIX systems have a user database [26] that has a group identifier
identifier for every user identifier), inherited from the directory the for every user identifier), inherited from the directory the object is
object is created in, or whatever else the server's operating created in, or whatever else the server's operating environment or
environment or file system semantics dictate. This applies to the file system semantics dictate. This applies to the OPEN operation
OPEN operation too. too.
Conversely, it is possible the client will specify in createattrs an Conversely, it is possible the client will specify in createattrs an
owner attribute, group attribute, or ACL that the principal indicated owner attribute, group attribute, or ACL that the principal indicated
in the RPC call's credentials does not have permissions to create files in the RPC call's credentials does not have permissions to create files
for. The error to be returned in this instance is NFS4ERR_PERM. for. The error to be returned in this instance is NFS4ERR_PERM.
This applies to the OPEN operation too. This applies to the OPEN operation too.
If the current filehandle designates a directory for which another If the current filehandle designates a directory for which another
client holds a directory delegation, then, unless the delegation is client holds a directory delegation, then, unless the delegation is
such that the situation can be resolved by sending a notification, such that the situation can be resolved by sending a notification,
skipping to change at page 423, line 30 skipping to change at page 427, line 30
be available. be available.
As noted in Section 18.10.4, some servers may return As noted in Section 18.10.4, some servers may return
NFS4ERR_LOCK_RANGE to certain (otherwise non-conflicting) lock NFS4ERR_LOCK_RANGE to certain (otherwise non-conflicting) lock
requests that overlap ranges already granted to the current lock- requests that overlap ranges already granted to the current lock-
owner. owner.
The LOCKT operation's test for conflicting locks SHOULD exclude locks The LOCKT operation's test for conflicting locks SHOULD exclude locks
for the current lock-owner, and thus should return NFS4_OK in such for the current lock-owner, and thus should return NFS4_OK in such
cases. Note that this means that a server might return NFS4_OK to a cases. Note that this means that a server might return NFS4_OK to a
LOCKT request even though a LOCK request for the same range and lock LOCKT request even though a LOCK request for the same range and lock-
owner would fail with NFS4ERR_LOCK_RANGE. owner would fail with NFS4ERR_LOCK_RANGE.
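The LOCKT conflict test described above can be sketched as follows. This is an illustration under stated assumptions: the Lock record, the half-open byte ranges, and the function names are hypothetical, and NFS4ERR_DENIED is used here as the conflict status even though it is not named in this passage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lock:
    owner: str       # lock-owner identity (hypothetical representation)
    start: int       # first byte covered
    end: int         # one past the last byte covered
    exclusive: bool  # True for a write lock, False for a read lock


def overlaps(a_start, a_end, b_start, b_end):
    # Two half-open ranges overlap when each starts before the other ends.
    return a_start < b_end and b_start < a_end


def lockt(held, owner, start, end, exclusive):
    """Test for conflicting locks, excluding those of the requesting owner."""
    for lock in held:
        if lock.owner == owner:
            continue  # locks of the current lock-owner are not conflicts
        if not overlaps(lock.start, lock.end, start, end):
            continue
        if lock.exclusive or exclusive:
            return "NFS4ERR_DENIED"  # read/read overlaps do not conflict
    return "NFS4_OK"


held = [Lock("owner-A", 0, 100, True)]
same_owner = lockt(held, "owner-A", 10, 20, True)
other_owner = lockt(held, "owner-B", 10, 20, False)
disjoint = lockt(held, "owner-B", 100, 200, True)
```

Because the owner's own locks are skipped, same_owner reports success even on a server whose LOCK for that range would return NFS4ERR_LOCK_RANGE, which is exactly the mismatch the paragraph warns about.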
When a client holds a write delegation, it may choose (see When a client holds a write delegation, it may choose (see
Section 18.10.4) to handle LOCK requests locally. In such a case, Section 18.10.4) to handle LOCK requests locally. In such a case,
LOCKT requests will similarly be handled locally. LOCKT requests will similarly be handled locally.
18.12. Operation 14: LOCKU - Unlock File 18.12. Operation 14: LOCKU - Unlock File
18.12.1. ARGUMENTS 18.12.1. ARGUMENTS
skipping to change at page 424, line 51 skipping to change at page 428, line 51
Section 18.35) to send LOCKU. Section 18.35) to send LOCKU.
18.12.4. IMPLEMENTATION 18.12.4. IMPLEMENTATION
If the area to be unlocked does not correspond exactly to a lock If the area to be unlocked does not correspond exactly to a lock
actually held by the lock-owner the server may return the error actually held by the lock-owner the server may return the error
NFS4ERR_LOCK_RANGE. This includes the case in which the area is not NFS4ERR_LOCK_RANGE. This includes the case in which the area is not
locked, where the area is a sub-range of the area locked, where it locked, where the area is a sub-range of the area locked, where it
overlaps the area locked without matching exactly or the area overlaps the area locked without matching exactly or the area
specified includes multiple locks held by the lock-owner. In all of specified includes multiple locks held by the lock-owner. In all of
these cases, allowed by POSIX locking semantics, a client receiving these cases, allowed by POSIX locking [24] semantics, a client
this error should, if it desires support for such operations, receiving this error should, if it desires support for such
simulate the operation using LOCKU on ranges corresponding to locks operations, simulate the operation using LOCKU on ranges
it actually holds, possibly followed by LOCK requests for the sub- corresponding to locks it actually holds, possibly followed by LOCK
ranges not being unlocked. requests for the sub-ranges not being unlocked.
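The client-side simulation described above can be sketched as a planning step that turns one arbitrary unlock request into LOCKU calls on exactly-held ranges plus LOCK calls for the parts that must stay locked. The function name plan_unlock and the half-open (start, end) tuples are hypothetical; a real client would also carry the lock type and stateid for each range.

```python
def plan_unlock(held, start, end):
    """Plan the emulation of unlocking [start, end) (hypothetical helper).

    held is a list of (start, end) half-open ranges the lock-owner holds.
    Returns (locku_ranges, relock_ranges).
    """
    locku, relock = [], []
    for h_start, h_end in held:
        if h_end <= start or end <= h_start:
            continue                         # this lock is not affected
        locku.append((h_start, h_end))       # LOCKU exactly what is held
        if h_start < start:
            relock.append((h_start, start))  # keep the part below the range
        if end < h_end:
            relock.append((end, h_end))      # keep the part above the range
    return locku, relock


sub_range = plan_unlock([(0, 100)], 40, 60)
spanning = plan_unlock([(0, 10), (20, 30)], 5, 25)
not_locked = plan_unlock([(0, 10)], 20, 30)
```

Note that the sequence is not atomic: between the LOCKU and the follow-up LOCK requests another lock-owner may acquire part of the range, so a client using this approach has to be prepared for the re-lock to fail.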
When a client holds a write delegation, it may choose (See When a client holds a write delegation, it may choose (See
Section 18.10.4) to handle LOCK requests locally. In such a case, Section 18.10.4) to handle LOCK requests locally. In such a case,
LOCKU requests will similarly be handled locally. LOCKU requests will similarly be handled locally.
18.13. Operation 15: LOOKUP - Lookup Filename 18.13. Operation 15: LOOKUP - Lookup Filename
18.13.1. ARGUMENTS 18.13.1. ARGUMENTS
struct LOOKUP4args { struct LOOKUP4args {
skipping to change at page 440, line 25 skipping to change at page 444, line 25
In this case, delegation will always be granted, although the server In this case, delegation will always be granted, although the server
may specify an immediate recall in the delegation structure. may specify an immediate recall in the delegation structure.
The rflags returned by a successful OPEN allow the server to return The rflags returned by a successful OPEN allow the server to return
information governing how the open file is to be handled. information governing how the open file is to be handled.
o OPEN4_RESULT_CONFIRM is deprecated and MUST NOT be returned by an o OPEN4_RESULT_CONFIRM is deprecated and MUST NOT be returned by an
NFSv4.1 server. NFSv4.1 server.
o OPEN4_RESULT_LOCKTYPE_POSIX indicates the server's file locking o OPEN4_RESULT_LOCKTYPE_POSIX indicates the server's file locking
behavior supports the complete set of Posix locking techniques. behavior supports the complete set of POSIX locking techniques
From this the client can choose to manage file locking state in a [24]. From this the client can choose to manage file locking
way to handle a mis-match of file locking management. state in a way to handle a mis-match of file locking management.
o OPEN4_RESULT_PRESERVE_UNLINKED indicates the server will preserve o OPEN4_RESULT_PRESERVE_UNLINKED indicates the server will preserve
the open file if the client (or any other client) removes the file the open file if the client (or any other client) removes the file
as long as it is open. Furthermore, the server promises to as long as it is open. Furthermore, the server promises to
preserve the file through the grace period after server restart, preserve the file through the grace period after server restart,
thereby giving the client the opportunity to reclaim its open. thereby giving the client the opportunity to reclaim its open.
o OPEN4_RESULT_MAY_NOTIFY_LOCK indicates that the server may attempt o OPEN4_RESULT_MAY_NOTIFY_LOCK indicates that the server may attempt
CB_NOTIFY_LOCK callbacks for locks on this file. This flag is a CB_NOTIFY_LOCK callbacks for locks on this file. This flag is a
hint only, and may be safely ignored by the client. hint only, and may be safely ignored by the client.
skipping to change at page 444, line 5 skipping to change at page 448, line 5
operation by setting delegation_type in the results to operation by setting delegation_type in the results to
OPEN_DELEGATE_NONE_EXT, ond_why to WND4_RESOURCE, and OPEN_DELEGATE_NONE_EXT, ond_why to WND4_RESOURCE, and
ond_server_will_signal_avail set to TRUE. If ond_server_will_signal_avail set to TRUE. If
ond_server_will_signal_avail is set to TRUE, the server MUST later ond_server_will_signal_avail is set to TRUE, the server MUST later
send a CB_RECALLABLE_OBJ_AVAIL operation. send a CB_RECALLABLE_OBJ_AVAIL operation.
If the client specifies If the client specifies
OPEN4_SHARE_ACCESS_WANT_SIGNAL_DELEG_WHEN_UNCONTENDED, then it wishes OPEN4_SHARE_ACCESS_WANT_SIGNAL_DELEG_WHEN_UNCONTENDED, then it wishes
to register a "want" for a delegation, in the event the OPEN results to register a "want" for a delegation, in the event the OPEN results
do not include a delegation. If so and the server denies the do not include a delegation. If so and the server denies the
delegation due to insufficient resources, the server MAY later inform delegation due to contention, the server MAY later inform the client,
the client, via the CB_PUSH_DELEG operation, that the resource via the CB_PUSH_DELEG operation, that the contention condition has
limitation condition has eased. The server will tell the client that eased. The server will tell the client that it intends to send a
it intends to send a future CB_PUSH_DELEG operation by setting future CB_PUSH_DELEG operation by setting delegation_type in the
delegation_type in the results to OPEN_DELEGATE_NONE_EXT, ond_why to results to OPEN_DELEGATE_NONE_EXT, ond_why to WND4_CONTENTION, and
WND4_CONTENTION, and ond_server_will_push_deleg to TRUE. If ond_server_will_push_deleg to TRUE. If ond_server_will_push_deleg is
ond_server_will_push_deleg is TRUE, the server MUST later send a TRUE, the server MUST later send a CB_PUSH_DELEG operation.
CB_RECALLABLE_OBJ_AVAIL operation.
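The two "want" outcomes can be summarized in a small sketch. It follows the right-hand (-27) wording, in which a TRUE ond_server_will_push_deleg obliges a later CB_PUSH_DELEG. The NoDelegation record flattens what the protocol encodes as a union, and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class NoDelegation:
    """Flattened stand-in for the 'no delegation' result (hypothetical)."""
    delegation_type: str
    ond_why: str
    ond_server_will_signal_avail: bool = False
    ond_server_will_push_deleg: bool = False


def deny_delegation(reason, will_notify):
    # reason is "resource" or "contention"; will_notify is the server's
    # decision on whether to promise a later callback.
    if reason == "resource":
        return NoDelegation("OPEN_DELEGATE_NONE_EXT", "WND4_RESOURCE",
                            ond_server_will_signal_avail=will_notify)
    if reason == "contention":
        return NoDelegation("OPEN_DELEGATE_NONE_EXT", "WND4_CONTENTION",
                            ond_server_will_push_deleg=will_notify)
    raise ValueError(reason)


def promised_callback(result):
    """Return the callback the server is obliged to send later, if any."""
    if result.ond_server_will_signal_avail:
        return "CB_RECALLABLE_OBJ_AVAIL"
    if result.ond_server_will_push_deleg:
        return "CB_PUSH_DELEG"
    return None


resource_denial = deny_delegation("resource", True)
contention_denial = deny_delegation("contention", True)
no_promise = deny_delegation("contention", False)
```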
If the client has previously registered a want for a delegation on a If the client has previously registered a want for a delegation on a
file, and then sends a request to register a want for a delegation on file, and then sends a request to register a want for a delegation on
the same file, the server MUST return a new error: the same file, the server MUST return a new error:
NFS4ERR_DELEG_ALREADY_WANTED. If the client wishes to register a NFS4ERR_DELEG_ALREADY_WANTED. If the client wishes to register a
different type of delegation want for the same file, it MUST cancel different type of delegation want for the same file, it MUST cancel
the existing delegation WANT. the existing delegation WANT.
18.16.4. IMPLEMENTATION 18.16.4. IMPLEMENTATION
skipping to change at page 452, line 25 skipping to change at page 456, line 25
18.20.3. DESCRIPTION 18.20.3. DESCRIPTION
Replaces the current filehandle with the filehandle that represents Replaces the current filehandle with the filehandle that represents
the public filehandle of the server's name space. This filehandle the public filehandle of the server's name space. This filehandle
may be different from the "root" filehandle which may be associated may be different from the "root" filehandle which may be associated
with some other directory on the server. with some other directory on the server.
PUTPUBFH also clears the current stateid. PUTPUBFH also clears the current stateid.
The public filehandle represents the concepts embodied in RFC2054 The public filehandle represents the concepts embodied in RFC2054
[31], RFC2055 [32], RFC2224 [41]. The intent for NFSv4.1 is that the [41], RFC2055 [42], RFC2224 [51]. The intent for NFSv4.1 is that the
public filehandle (represented by the PUTPUBFH operation) be used as public filehandle (represented by the PUTPUBFH operation) be used as
a method of providing WebNFS server compatibility with NFSv3. a method of providing WebNFS server compatibility with NFSv3.
The public filehandle and the root filehandle (represented by the The public filehandle and the root filehandle (represented by the
PUTROOTFH operation) SHOULD be equivalent. If the public and root PUTROOTFH operation) SHOULD be equivalent. If the public and root
filehandles are not equivalent, then the public filehandle MUST be a filehandles are not equivalent, then the public filehandle MUST be a
descendant of the root filehandle. descendant of the root filehandle.
See Section 16.2.3.1.1 for more details on the current filehandle. See Section 16.2.3.1.1 for more details on the current filehandle.
skipping to change at page 452, line 47 skipping to change at page 456, line 47
18.20.4. IMPLEMENTATION 18.20.4. IMPLEMENTATION
Used as the second operator (after SEQUENCE) in an NFS request to set Used as the second operator (after SEQUENCE) in an NFS request to set
the context for file accessing operations that follow in the same the context for file accessing operations that follow in the same
COMPOUND request. COMPOUND request.
With the NFSv3 public filehandle, the client is able to specify With the NFSv3 public filehandle, the client is able to specify
whether the path name provided in the LOOKUP should be evaluated as whether the path name provided in the LOOKUP should be evaluated as
either an absolute path relative to the server's root or relative to either an absolute path relative to the server's root or relative to
the public filehandle. RFC2224 [41] contains further discussion of the public filehandle. RFC2224 [51] contains further discussion of
the functionality. With NFSv4.1, that type of specification is not the functionality. With NFSv4.1, that type of specification is not
directly available in the LOOKUP operation. The reason for this is directly available in the LOOKUP operation. The reason for this is
that the component separators needed to specify absolute vs. that the component separators needed to specify absolute vs.
relative are not allowed in NFSv4. Therefore, the client is relative are not allowed in NFSv4. Therefore, the client is
responsible for constructing its request such that either responsible for constructing its request such that either
PUTROOTFH or PUTPUBFH is used to signify absolute or relative PUTROOTFH or PUTPUBFH is used to signify absolute or relative
evaluation of an NFS URL, respectively. evaluation of an NFS URL, respectively.
Note that there are warnings mentioned in RFC2224 [41] with respect Note that there are warnings mentioned in RFC2224 [51] with respect
to the use of absolute evaluation and the restrictions the server may to the use of absolute evaluation and the restrictions the server may
place on that evaluation with respect to how much of its namespace place on that evaluation with respect to how much of its namespace
has been made available. These same warnings apply to NFSv4.1. It has been made available. These same warnings apply to NFSv4.1. It
is likely, therefore, that because of server implementation details, is likely, therefore, that because of server implementation details,
an NFSv3 absolute public filehandle lookup may behave differently an NFSv3 absolute public filehandle lookup may behave differently
than an NFSv4.1 absolute resolution. than an NFSv4.1 absolute resolution.
There is a form of security negotiation as described in RFC2755 [42] There is a form of security negotiation as described in RFC2755 [52]
that uses the public filehandle and an overloading of the pathname. that uses the public filehandle and an overloading of the pathname.
This method is not available with NFSv4.1 as filehandles are not This method is not available with NFSv4.1 as filehandles are not
overloaded with special meaning and therefore do not provide the same overloaded with special meaning and therefore do not provide the same
framework as NFSv3. Clients should therefore use the security framework as NFSv3. Clients should therefore use the security
negotiation mechanisms described in Section 2.6. negotiation mechanisms described in Section 2.6.
18.21. Operation 24: PUTROOTFH - Set Root Filehandle 18.21. Operation 24: PUTROOTFH - Set Root Filehandle
18.21.1. ARGUMENTS 18.21.1. ARGUMENTS
skipping to change at page 462, line 5 skipping to change at page 466, line 5
the UTF-8 definition (and the server is enforcing UTF-8 encoding, see the UTF-8 definition (and the server is enforcing UTF-8 encoding, see
Section 14.4), the error NFS4ERR_INVAL will be returned. Section 14.4), the error NFS4ERR_INVAL will be returned.
On success, the current filehandle retains its value. On success, the current filehandle retains its value.
18.25.4. IMPLEMENTATION 18.25.4. IMPLEMENTATION
NFSv3 required a different operator RMDIR for directory removal and NFSv3 required a different operator RMDIR for directory removal and
REMOVE for non-directory removal. This allowed clients to skip REMOVE for non-directory removal. This allowed clients to skip
checking the file type when being passed a non-directory delete checking the file type when being passed a non-directory delete
system call (e.g. unlink() in POSIX) to remove a directory, as well system call (e.g. unlink() [27] in POSIX) to remove a directory, as
as the converse (e.g. a rmdir() on a non-directory) because they knew well as the converse (e.g. a rmdir() on a non-directory) because they
the server would check the file type. NFSv4.1 REMOVE can be used to knew the server would check the file type. NFSv4.1 REMOVE can be
delete any directory entry independent of its file type. The used to delete any directory entry independent of its file type. The
implementor of an NFSv4.1 client's entry points from the unlink() and implementor of an NFSv4.1 client's entry points from the unlink() and
rmdir() system calls should first check the file type against the rmdir() system calls should first check the file type against the
types the system call is allowed to remove before issuing a REMOVE. types the system call is allowed to remove before issuing a REMOVE.
Alternatively, the implementor can produce a COMPOUND call that Alternatively, the implementor can produce a COMPOUND call that
includes a LOOKUP/VERIFY sequence to verify the file type before a includes a LOOKUP/VERIFY sequence to verify the file type before a
REMOVE operation in the same COMPOUND call. REMOVE operation in the same COMPOUND call.
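The client-side type check recommended above can be sketched as a small pre-check run before REMOVE is sent. This is a sketch under assumptions: the function name precheck_remove is hypothetical, the file types are represented by the strings "NF4REG" and "NF4DIR" rather than protocol enum values, and the errno chosen for unlink() on a directory is EISDIR although some systems report EPERM instead.

```python
import errno

NF4REG = "NF4REG"  # stand-in for the regular-file type
NF4DIR = "NF4DIR"  # stand-in for the directory type


def precheck_remove(syscall, file_type):
    """Return 0 if REMOVE may be sent, else an errno to fail with locally.

    syscall is "unlink" or "rmdir"; file_type is the type the client
    obtained for the target, for example from cached attributes.
    """
    is_dir = (file_type == NF4DIR)
    if syscall == "rmdir" and not is_dir:
        return errno.ENOTDIR
    if syscall == "unlink" and is_dir:
        return errno.EISDIR  # assumption: EPERM on some systems
    return 0


unlink_file = precheck_remove("unlink", NF4REG)
unlink_dir = precheck_remove("unlink", NF4DIR)
rmdir_file = precheck_remove("rmdir", NF4REG)
rmdir_dir = precheck_remove("rmdir", NF4DIR)
```

A check against cached attributes can be stale, which is one reason the text also offers the alternative of verifying the type inside the same COMPOUND as the REMOVE.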
The concept of last reference is server specific. However, if the The concept of last reference is server specific. However, if the
numlinks field in the previous attributes of the object had the value numlinks field in the previous attributes of the object had the value
1, the client should not rely on referring to the object via a 1, the client should not rely on referring to the object via a
skipping to change at page 471, line 16 skipping to change at page 475, line 16
it supports. The array entries are represented by the secinfo4 it supports. The array entries are represented by the secinfo4
structure. The field 'flavor' will contain a value of AUTH_NONE, structure. The field 'flavor' will contain a value of AUTH_NONE,
AUTH_SYS (as defined in RFC1831 [3]), or RPCSEC_GSS (as defined in AUTH_SYS (as defined in RFC1831 [3]), or RPCSEC_GSS (as defined in
RFC2203 [4]). The field flavor can also be any other security flavor RFC2203 [4]). The field flavor can also be any other security flavor
registered with IANA. registered with IANA.
For the flavors AUTH_NONE and AUTH_SYS, no additional security For the flavors AUTH_NONE and AUTH_SYS, no additional security
information is returned. The same is true of many (if not most) information is returned. The same is true of many (if not most)
other security flavors, including AUTH_DH. For a return value of other security flavors, including AUTH_DH. For a return value of
RPCSEC_GSS, a security triple is returned that contains the mechanism RPCSEC_GSS, a security triple is returned that contains the mechanism
object identifier (OID, as defined in RFC2743 [7]), the quality of object identifier (OID, as defined in RFC2743 [8]), the quality of
protection (as defined in RFC2743 [7]) and the service type (as protection (as defined in RFC2743 [8]) and the service type (as
defined in RFC2203 [4]). It is possible for SECINFO to return defined in RFC2203 [4]). It is possible for SECINFO to return
multiple entries with flavor equal to RPCSEC_GSS with different multiple entries with flavor equal to RPCSEC_GSS with different
security triple values. security triple values.
On success, the current filehandle is consumed (see On success, the current filehandle is consumed (see
Section 2.6.3.1.1.8), and if the next operation after SECINFO tries Section 2.6.3.1.1.8), and if the next operation after SECINFO tries
to use the current filehandle, that operation will fail with the to use the current filehandle, that operation will fail with the
status NFS4ERR_NOFILEHANDLE. status NFS4ERR_NOFILEHANDLE.
If the name has a length of 0 (zero), or if name does not obey the If the name has a length of 0 (zero), or if name does not obey the
skipping to change at page 495, line 33 skipping to change at page 499, line 33
spo_must_allow and the server agrees. spo_must_allow and the server agrees.
The SP4_SSV protection parameters also have: The SP4_SSV protection parameters also have:
ssp_hash_algs: ssp_hash_algs:
This is the set of algorithms the client supports for the purpose This is the set of algorithms the client supports for the purpose
of computing the digests needed for the internal SSV GSS mechanism of computing the digests needed for the internal SSV GSS mechanism
and for the SET_SSV operation. Each algorithm is specified as an and for the SET_SSV operation. Each algorithm is specified as an
object identifier (OID). The REQUIRED algorithms for a server are object identifier (OID). The REQUIRED algorithms for a server are
id-sha1, id-sha224, id-sha256, id-sha384, and id-sha512 [18]. The id-sha1, id-sha224, id-sha256, id-sha384, and id-sha512 [28]. The
algorithm the server selects among the set is indicated in algorithm the server selects among the set is indicated in
spi_hash_alg, a field of spr_ssv_prot_info. The field spi_hash_alg, a field of spr_ssv_prot_info. The field
spi_hash_alg is an index into the array ssp_hash_algs. If the spi_hash_alg is an index into the array ssp_hash_algs. If the
server does not support any of the offered algorithms, it returns server does not support any of the offered algorithms, it returns
NFS4ERR_HASH_ALG_UNSUPP. If ssp_hash_algs is empty, the server NFS4ERR_HASH_ALG_UNSUPP. If ssp_hash_algs is empty, the server
MUST return NFS4ERR_INVAL. MUST return NFS4ERR_INVAL.
ssp_encr_algs: ssp_encr_algs:
This is the set of algorithms the client supports for the purpose This is the set of algorithms the client supports for the purpose
of providing privacy protection for the internal SSV GSS of providing privacy protection for the internal SSV GSS
mechanism. Each algorithm is specified as an OID. The REQUIRED mechanism. Each algorithm is specified as an OID. The REQUIRED
algorithm for a server is id-aes256-CBC. The RECOMMENDED algorithm for a server is id-aes256-CBC. The RECOMMENDED
algorithms are id-aes192-CBC and id-aes128-CBC [19]. The selected algorithms are id-aes192-CBC and id-aes128-CBC [29]. The selected
algorithm is returned in spi_encr_alg, an index into algorithm is returned in spi_encr_alg, an index into
ssp_encr_algs. If the server does not support any of the offered ssp_encr_algs. If the server does not support any of the offered
algorithms, it returns NFS4ERR_ENCR_ALG_UNSUPP. If ssp_encr_algs algorithms, it returns NFS4ERR_ENCR_ALG_UNSUPP. If ssp_encr_algs
is empty, the server MUST return NFS4ERR_INVAL. is empty, the server MUST return NFS4ERR_INVAL.
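The selection rule shared by ssp_hash_algs and ssp_encr_algs can be sketched as follows. This is a minimal illustration under stated assumptions, not the behavior of any particular server: the function name select_alg, the use of plain strings in place of OIDs, and the "first supported offer wins" preference are all hypothetical. The text above requires only that the result be an index into the array the client offered.

```python
def select_alg(offered, supported, unsupp_error):
    """Pick an algorithm from the client's offered list.

    offered      -- list of algorithm identifiers sent by the client
                    (strings stand in for OIDs in this sketch)
    supported    -- set of identifiers the server implements
    unsupp_error -- error name to return when nothing matches, e.g.
                    "NFS4ERR_HASH_ALG_UNSUPP" or "NFS4ERR_ENCR_ALG_UNSUPP"

    Returns (status, index); index is a position in `offered`.
    """
    if not offered:
        # An empty offer is rejected outright.
        return ("NFS4ERR_INVAL", None)
    for index, alg in enumerate(offered):
        if alg in supported:
            # Assumption: the server takes the first offer it supports.
            return ("NFS4_OK", index)
    return (unsupp_error, None)
```

The returned index corresponds to spi_hash_alg or spi_encr_alg, depending on which array was passed in.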
ssp_window: ssp_window:
This is the number of SSV versions the client wants the server to This is the number of SSV versions the client wants the server to
maintain (i.e. each successful call to SET_SSV produces a new maintain (i.e. each successful call to SET_SSV produces a new
version of the SSV). If ssp_window is zero, the server MUST version of the SSV). If ssp_window is zero, the server MUST
skipping to change at page 506, line 19 skipping to change at page 510, line 19
If CREATE_SESSION4_FLAG_CONN_RDMA is set in csa_flags, and if If CREATE_SESSION4_FLAG_CONN_RDMA is set in csa_flags, and if
the connection CREATE_SESSION is called over is currently in the connection CREATE_SESSION is called over is currently in
non-RDMA mode, but has the capability to operate in RDMA mode, non-RDMA mode, but has the capability to operate in RDMA mode,
then the client is requesting that the server agree to "step up" to RDMA then the client is requesting that the server agree to "step up" to RDMA
mode on the connection. The server sets mode on the connection. The server sets
CREATE_SESSION4_FLAG_CONN_RDMA in the result field csr_flags if CREATE_SESSION4_FLAG_CONN_RDMA in the result field csr_flags if
it agrees. If CREATE_SESSION4_FLAG_CONN_RDMA is not set in it agrees. If CREATE_SESSION4_FLAG_CONN_RDMA is not set in
csa_flags, then CREATE_SESSION4_FLAG_CONN_RDMA MUST NOT be set csa_flags, then CREATE_SESSION4_FLAG_CONN_RDMA MUST NOT be set
in csr_flags. Note that once the server agrees to step up, it in csr_flags. Note that once the server agrees to step up, it
and the client MUST exchange all future traffic on the and the client MUST exchange all future traffic on the
connection with RPC RDMA framing and not Record Marking ([8]). connection with RPC RDMA framing and not Record Marking ([9]).
csa_fore_chan_attrs, csa_back_chan_attrs: csa_fore_chan_attrs, csa_back_chan_attrs:
The csa_fore_chan_attrs and csa_back_chan_attrs fields apply to The csa_fore_chan_attrs and csa_back_chan_attrs fields apply to
attributes of the fore channel (which conveys requests originating attributes of the fore channel (which conveys requests originating
from the client to the server), and the backchannel (the channel from the client to the server), and the backchannel (the channel
that conveys callback requests originating from the server to the that conveys callback requests originating from the server to the
client), respectively. The results are in corresponding client), respectively. The results are in corresponding
structures called csr_fore_chan_attrs and csr_back_chan_attrs. structures called csr_fore_chan_attrs and csr_back_chan_attrs.
The results establish attributes for each channel, and on all The results establish attributes for each channel, and on all
skipping to change at page 548, line 34 skipping to change at page 552, line 34
This operation is used to update the SSV for a client ID. Before This operation is used to update the SSV for a client ID. Before
SET_SSV is called the first time on a client ID, the SSV is zero (0). SET_SSV is called the first time on a client ID, the SSV is zero (0).
The SSV is the key used for the SSV GSS mechanism (Section 2.10.9). The SSV is the key used for the SSV GSS mechanism (Section 2.10.9).
SET_SSV MUST be preceded by a SEQUENCE operation in the same SET_SSV MUST be preceded by a SEQUENCE operation in the same
COMPOUND. It MUST NOT be used if the client did not opt for SP4_SSV COMPOUND. It MUST NOT be used if the client did not opt for SP4_SSV
state protection when the client ID was created (see Section 18.35); state protection when the client ID was created (see Section 18.35);
the server returns NFS4ERR_INVAL in that case. the server returns NFS4ERR_INVAL in that case.
The field ssa_digest is computed as the output of the HMAC RFC2104 The field ssa_digest is computed as the output of the HMAC RFC2104
[11] using the subkey derived from the SSV4_SUBKEY_MIC_I2T and [12] using the subkey derived from the SSV4_SUBKEY_MIC_I2T and
current SSV as the key (See Section 2.10.9 for a description of current SSV as the key (See Section 2.10.9 for a description of
subkeys), and an XDR encoded value of data type ssa_digest_input4. subkeys), and an XDR encoded value of data type ssa_digest_input4.
The field sdi_seqargs is equal to the arguments of the SEQUENCE The field sdi_seqargs is equal to the arguments of the SEQUENCE
operation for the COMPOUND procedure that SET_SSV is within. operation for the COMPOUND procedure that SET_SSV is within.
The argument ssa_ssv is XORed with the current SSV to produce the new The argument ssa_ssv is XORed with the current SSV to produce the new
SSV. The argument ssa_ssv SHOULD be generated randomly. SSV. The argument ssa_ssv SHOULD be generated randomly.
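The two computations described above, the XOR update of the SSV and the HMAC over the encoded digest input, can be sketched as follows. This is an illustration under assumptions, not a complete SET_SSV implementation: the function names are hypothetical, the subkey is taken as an input because its derivation is described in Section 2.10.9 and not here, the XDR encoding of ssa_digest_input4 is assumed to have been done by the caller, SHA-256 stands in for whichever hash was negotiated, and the equal-length check is a choice made for this sketch.

```python
import hmac
import secrets


def new_ssv(current_ssv: bytes, ssa_ssv: bytes) -> bytes:
    """XOR ssa_ssv with the current SSV to produce the new SSV."""
    if len(current_ssv) != len(ssa_ssv):
        # Assumption of this sketch: both values have the same length.
        raise ValueError("ssa_ssv and current SSV differ in length")
    return bytes(a ^ b for a, b in zip(current_ssv, ssa_ssv))


def ssv_digest(subkey: bytes, xdr_encoded_input: bytes,
               hash_name: str = "sha256") -> bytes:
    """HMAC (RFC 2104) of the already XDR-encoded digest input."""
    return hmac.new(subkey, xdr_encoded_input, hash_name).digest()


def random_ssa_ssv(length: int) -> bytes:
    """ssa_ssv SHOULD be generated randomly."""
    return secrets.token_bytes(length)
```

Because the update is an XOR, applying the same ssa_ssv twice restores the previous SSV, which is one reason a fresh random value is wanted for each call.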
In the response, ssr_digest is the output of the HMAC using the In the response, ssr_digest is the output of the HMAC using the
subkey derived from SSV4_SUBKEY_MIC_T2I and new SSV as the key, and subkey derived from SSV4_SUBKEY_MIC_T2I and new SSV as the key, and
skipping to change at page 556, line 49 skipping to change at page 560, line 49
system transition have been completed. Presence of a current system transition have been completed. Presence of a current
filehandle is only required when rca_one_fs is set to TRUE. filehandle is only required when rca_one_fs is set to TRUE.
Once a RECLAIM_COMPLETE is done, there can be no further reclaim Once a RECLAIM_COMPLETE is done, there can be no further reclaim
operations for locks whose scope is defined as having completed operations for locks whose scope is defined as having completed
recovery. Once the client sends RECLAIM_COMPLETE, the server will recovery. Once the client sends RECLAIM_COMPLETE, the server will
not allow the client to do subsequent reclaims of locking state for not allow the client to do subsequent reclaims of locking state for
that scope and if these are attempted, will return NFS4ERR_NO_GRACE. that scope and if these are attempted, will return NFS4ERR_NO_GRACE.
Whenever a client establishes a new client ID and before it does the Whenever a client establishes a new client ID and before it does the
first non-reclaim operation that obtains a lock, it MUST do a global first non-reclaim operation that obtains a lock, it MUST send a
RECLAIM_COMPLETE, even if there are no locks to reclaim. If non- RECLAIM_COMPLETE with rca_one_fs set to FALSE, even if there are no
reclaim locking operations are done before the RECLAIM_COMPLETE, an locks to reclaim. If non-reclaim locking operations are done before
NFS4ERR_GRACE error will be returned. the RECLAIM_COMPLETE, an NFS4ERR_GRACE error will be returned.
Similarly, when the client accesses a file system on a new server, Similarly, when the client accesses a file system on a new server,
before it sends the first non-reclaim operation that obtains a lock before it sends the first non-reclaim operation that obtains a lock
on this new server, it must do a RECLAIM_COMPLETE with rca_one_fs set on this new server, it MUST send a RECLAIM_COMPLETE with rca_one_fs
to TRUE and current filehandle within that file system, even if there set to TRUE and current filehandle within that file system, even if
are no locks to reclaim. If non-reclaim locking operations are done there are no locks to reclaim. If non-reclaim locking operations are
on that file system before the RECLAIM_COMPLETE, an NFS4ERR_GRACE done on that file system before the RECLAIM_COMPLETE, an
error will be returned. NFS4ERR_GRACE error will be returned.
Any locks not reclaimed at the point at which RECLAIM_COMPLETE is Any locks not reclaimed at the point at which RECLAIM_COMPLETE is
done become non-reclaimable. The client MUST NOT attempt to reclaim done become non-reclaimable. The client MUST NOT attempt to reclaim
them, either during the current server instance or in any subsequent them, either during the current server instance or in any subsequent
server instance, or on another server to which responsibility for server instance, or on another server to which responsibility for
that file system is transferred. If the client were to do so, it that file system is transferred. If the client were to do so, it
would be violating the protocol by representing itself as owning would be violating the protocol by representing itself as owning
locks that it does not own, and so has no right to reclaim. See locks that it does not own, and so has no right to reclaim. See
Section 8.4.3 for a discussion of edge conditions related to lock Section 8.4.3 for a discussion of edge conditions related to lock
reclaim. reclaim.
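The ordering rules above can be sketched as a small server-side bookkeeping class. This is a simplified model under assumptions, not the protocol's full recovery logic: the class name, the "global" scope label, and the use of an opaque fsid to stand for "the file system of the current filehandle" are hypothetical, status codes are shown by name only, and grace period expiry and the edge conditions of Section 8.4.3 are ignored.

```python
class ReclaimState:
    """Per-client record of which scopes have seen RECLAIM_COMPLETE."""

    GLOBAL = "global"  # hypothetical label for rca_one_fs == FALSE

    def __init__(self):
        self.completed = set()

    def reclaim_complete(self, rca_one_fs, fsid=None):
        """Record RECLAIM_COMPLETE for the global scope or one file system."""
        if rca_one_fs and fsid is None:
            # A current filehandle is required only when rca_one_fs is TRUE.
            raise ValueError("rca_one_fs TRUE requires a current filehandle")
        self.completed.add(fsid if rca_one_fs else self.GLOBAL)
        return "NFS4_OK"

    def lock_request(self, is_reclaim, scope=GLOBAL):
        """Decide the status of a locking operation in the given scope."""
        done = scope in self.completed
        if is_reclaim and done:
            # Reclaims after RECLAIM_COMPLETE are refused.
            return "NFS4ERR_NO_GRACE"
        if not is_reclaim and not done:
            # Non-reclaim locks before RECLAIM_COMPLETE are refused.
            return "NFS4ERR_GRACE"
        return "NFS4_OK"
```

A client with nothing to reclaim still has to send RECLAIM_COMPLETE first; in this model, a non-reclaim lock_request on a fresh ReclaimState returns NFS4ERR_GRACE until reclaim_complete has been called for the matching scope.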
skipping to change at page 571, line 14 skipping to change at page 575, line 14
20.4.2. RESULT 20.4.2. RESULT
struct CB_NOTIFY4res { struct CB_NOTIFY4res {
nfsstat4 cnr_status; nfsstat4 cnr_status;
}; };
20.4.3. DESCRIPTION 20.4.3. DESCRIPTION
The CB_NOTIFY operation is used by the server to send notifications The CB_NOTIFY operation is used by the server to send notifications
to clients about changes to delegated directories The registration of to clients about changes to delegated directories. The registration
notifications for the directories occurs when the delegation is of notifications for the directories occurs when the delegation is
established using GET_DIR_DELEGATION. These notifications are sent established using GET_DIR_DELEGATION. These notifications are sent
over the backchannel. The notification is sent once the original over the backchannel. The notification is sent once the original
request has been processed on the server. The server will send an request has been processed on the server. The server will send an
array of notifications for changes that might have occurred in the array of notifications for changes that might have occurred in the
directory. The notifications are sent as a list of pairs of bitmaps directory. The notifications are sent as a list of pairs of bitmaps
and values. See Section 3.3.7 for a description of how NFSv4.1 and values. See Section 3.3.7 for a description of how NFSv4.1
bitmaps work. bitmaps work.
If the server has more notifications than can fit in the CB_COMPOUND If the server has more notifications than can fit in the CB_COMPOUND
request, it SHOULD send a sequence of serial CB_COMPOUND requests so request, it SHOULD send a sequence of serial CB_COMPOUND requests so
skipping to change at page 576, line 4 skipping to change at page 580, line 4
RCA4_TYPE_MASK_DIR_DLG RCA4_TYPE_MASK_DIR_DLG
The client is to return directory delegations. The client is to return directory delegations.
RCA4_TYPE_MASK_FILE_LAYOUT RCA4_TYPE_MASK_FILE_LAYOUT
The client is to return layouts of type LAYOUT4_NFSV4_1_FILES. The client is to return layouts of type LAYOUT4_NFSV4_1_FILES.
RCA4_TYPE_MASK_BLK_LAYOUT RCA4_TYPE_MASK_BLK_LAYOUT
See [30] for a description. See [40] for a description.
RCA4_TYPE_MASK_OBJ_LAYOUT_MIN to RCA4_TYPE_MASK_OBJ_LAYOUT_MAX RCA4_TYPE_MASK_OBJ_LAYOUT_MIN to RCA4_TYPE_MASK_OBJ_LAYOUT_MAX
See [29] for a description. See [39] for a description.
RCA4_TYPE_MASK_OTHER_LAYOUT_MIN to RCA4_TYPE_MASK_OTHER_LAYOUT_MAX RCA4_TYPE_MASK_OTHER_LAYOUT_MIN to RCA4_TYPE_MASK_OTHER_LAYOUT_MAX
This range is reserved for telling the client to recall layouts of This range is reserved for telling the client to recall layouts of
experimental or site specific layout types (see Section 3.3.13). experimental or site specific layout types (see Section 3.3.13).
When a bit is set in the type mask that corresponds to an undefined When a bit is set in the type mask that corresponds to an undefined
type of recallable object, NFS4ERR_INVAL MUST be returned. When a type of recallable object, NFS4ERR_INVAL MUST be returned. When a
bit is set that corresponds to a defined type of object, but the bit is set that corresponds to a defined type of object, but the
client does not support an object of the type, NFS4ERR_INVAL MUST NOT client does not support an object of the type, NFS4ERR_INVAL MUST NOT
skipping to change at page 587, line 49 skipping to change at page 591, line 49
protection is any GETATTR for the fs_locations and protection is any GETATTR for the fs_locations and
fs_locations_info attributes. The attack has two steps. First fs_locations_info attributes. The attack has two steps. First
the attacker modifies the unprotected results of some operation to the attacker modifies the unprotected results of some operation to
return NFS4ERR_MOVED. Second, when the client follows up with a return NFS4ERR_MOVED. Second, when the client follows up with a
GETATTR for the fs_locations or fs_locations_info attributes, the GETATTR for the fs_locations or fs_locations_info attributes, the
attacker modifies the results to cause the client to migrate its attacker modifies the results to cause the client to migrate its
traffic to a server controlled by the attacker. traffic to a server controlled by the attacker.
Relative to previous NFS versions, NFSv4.1 has additional security Relative to previous NFS versions, NFSv4.1 has additional security
considerations for pNFS (see Section 12.9 and Section 13.12), locking considerations for pNFS (see Section 12.9 and Section 13.12), locking
and session state (see Section 2.10.8.3). and session state (see Section 2.10.8.3), and state recovery during
grace period (see Section 8.4.2.1.1).
22. IANA Considerations 22. IANA Considerations
This section uses terms that are defined in [43]. This section uses terms that are defined in [53].
22.1. Named Attribute Definitions 22.1. Named Attribute Definitions
IANA will create a registry called the "NFSv4 Named Attribute IANA will create a registry called the "NFSv4 Named Attribute
Definitions Registry". Definitions Registry".
The NFSv4.1 protocol supports the association of a file with zero or The NFSv4.1 protocol supports the association of a file with zero or
more named attributes. The name space identifiers for these more named attributes. The name space identifiers for these
attributes are defined as string names. The protocol does not define attributes are defined as string names. The protocol does not define
the specific assignment of the name space for these file attributes. the specific assignment of the name space for these file attributes.
skipping to change at page 588, line 30 skipping to change at page 592, line 30
attributes as needed, they are encouraged to register the attributes attributes as needed, they are encouraged to register the attributes
with IANA. with IANA.
Such registered named attributes are presumed to apply to all minor Such registered named attributes are presumed to apply to all minor
versions of NFSv4, including those defined subsequently to the versions of NFSv4, including those defined subsequently to the
registration. Where the named attribute is intended to be limited registration. Where the named attribute is intended to be limited
with regard to the minor versions for which it is not to be used, the with regard to the minor versions for which it is not to be used, the
assignment in the registry will clearly state the applicable limits. assignment in the registry will clearly state the applicable limits.
All assignments to the registry are made on a First Come First Served All assignments to the registry are made on a First Come First Served
basis, per section 4.1 of [43]. The policy for each assignment is basis, per section 4.1 of [53]. The policy for each assignment is
Specification Required, per section 4.1 of [43]. Specification Required, per section 4.1 of [53].
Under the NFSv4.1 specification, the name of a named attribute can in Under the NFSv4.1 specification, the name of a named attribute can in
theory be up to 2^32 - 1 bytes in length, but in practice NFSv4.1 theory be up to 2^32 - 1 bytes in length, but in practice NFSv4.1
clients and servers will be unable to handle a string that long. clients and servers will be unable to handle a string that long.
IANA should reject any assignment request with a named attribute that IANA should reject any assignment request with a named attribute that
exceeds 128 UTF-8 characters. To give IESG the flexibility to set up exceeds 128 UTF-8 characters. To give IESG the flexibility to set up
bases of assignment of Experimental Use and Standards Action, the bases of assignment of Experimental Use and Standards Action, the
prefixes of "EXPE" and "STDS" are Reserved. The zero length named prefixes of "EXPE" and "STDS" are Reserved. The zero length named
attribute name is Reserved. attribute name is Reserved.
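The naming constraints in the paragraph above can be sketched as a simple check. This is an illustration only: the function name and the three result labels are hypothetical, and "128 UTF-8 characters" is interpreted here as 128 Unicode code points, which is an assumption about the intended unit.

```python
MAX_NAME_CHARS = 128                    # limit stated in the text above
RESERVED_PREFIXES = ("EXPE", "STDS")    # reserved for IESG use


def classify_named_attribute(name: str) -> str:
    """Classify a requested named attribute name.

    Returns "reserved", "reject", or "assignable" (labels are
    illustrative, not registry terminology).
    """
    if name == "":
        # The zero length name is Reserved.
        return "reserved"
    if len(name) > MAX_NAME_CHARS:
        # Assumption: length is counted in code points, not bytes.
        return "reject"
    if name.startswith(RESERVED_PREFIXES):
        return "reserved"
    return "assignable"
```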
skipping to change at page 589, line 42 skipping to change at page 593, line 42
The potential exists for new notification types to be added to the The potential exists for new notification types to be added to the
CB_NOTIFY_DEVICEID operation (see Section 20.12). This can be done via CB_NOTIFY_DEVICEID operation (see Section 20.12). This can be done via
changes to the operations that register notifications, or by adding changes to the operations that register notifications, or by adding
new operations to NFSv4. This requires a new minor version of NFSv4, new operations to NFSv4. This requires a new minor version of NFSv4,
and requires a standards track document from IETF. Another way to and requires a standards track document from IETF. Another way to
add a notification is to specify a new layout type (see add a notification is to specify a new layout type (see
Section 22.4). Section 22.4).
Hence all assignments to the registry are made on a Standards Action Hence all assignments to the registry are made on a Standards Action
basis per section 4.1 of [43], with Expert Review required. basis per section 4.1 of [53], with Expert Review required.
The registry is a list of assignments, each containing five fields The registry is a list of assignments, each containing five fields
per assignment. per assignment.
1. The name of the notification type. This name must have the 1. The name of the notification type. This name must have the
prefix: "NOTIFY_DEVICEID4_". This name must be unique. prefix: "NOTIFY_DEVICEID4_". This name must be unique.
2. The value of the notification. IANA will assign this number, and 2. The value of the notification. IANA will assign this number, and
the request from the registrant will use TBD1 instead of an the request from the registrant will use TBD1 instead of an
actual value. IANA MUST use a whole number which can be no actual value. IANA MUST use a whole number which can be no
higher than 2^32-1, and should be the next available value. The higher than 2^32-1, and should be the next available value. The
value assigned must be unique. A Designated Expert must be used value assigned must be unique. A Designated Expert must be used
to ensure that when the name of the notification type and its to ensure that when the name of the notification type and its
value are added to the NFSv4.1 notify_deviceid_type4 enumerated value are added to the NFSv4.1 notify_deviceid_type4 enumerated
data type in the NFSv4.1 XDR description ([12]), the result data type in the NFSv4.1 XDR description ([13]), the result
continues to be a valid XDR description. continues to be a valid XDR description.
3. The Standards Track RFC(s) that describe the notification. If 3. The Standards Track RFC(s) that describe the notification. If
the RFC(s) have not yet been published, the registrant will use the RFC(s) have not yet been published, the registrant will use
RFCTBD2, RFCTBD3, etc. instead of an actual RFC number. RFCTBD2, RFCTBD3, etc. instead of an actual RFC number.
4. How the RFC introduces the notification. This is indicated by a 4. How the RFC introduces the notification. This is indicated by a
single US-ASCII value. If the value is N, it means a minor single US-ASCII value. If the value is N, it means a minor
revision to the NFSv4 protocol. If the value is L, it means a revision to the NFSv4 protocol. If the value is L, it means a
new pNFS layout type. Other values can be used with IESG new pNFS layout type. Other values can be used with IESG
skipping to change at page 591, line 14 skipping to change at page 595, line 14
The potential exists for new object types to be added to the The potential exists for new object types to be added to the
CB_RECALL_ANY operation (see Section 20.6). This can be done via CB_RECALL_ANY operation (see Section 20.6). This can be done via
changes to the operations that add recallable types, or by adding new changes to the operations that add recallable types, or by adding new
operations to NFSv4. This requires a new minor version of NFSv4, and operations to NFSv4. This requires a new minor version of NFSv4, and
requires a standards track document from IETF. Another way to add a requires a standards track document from IETF. Another way to add a
new recallable object is to specify a new layout type (see new recallable object is to specify a new layout type (see
Section 22.4). Section 22.4).
All assignments to the registry are made on a Standards Action basis All assignments to the registry are made on a Standards Action basis
per section 4.1 of [43], with Expert Review required. per section 4.1 of [53], with Expert Review required.
Recallable object types are 32 bit unsigned numbers. There are no Recallable object types are 32 bit unsigned numbers. There are no
Reserved values. Values in the range 12 through 15, inclusive, are Reserved values. Values in the range 12 through 15, inclusive, are
for Private Use. for Private Use.
The registry is a list of assignments, each containing five fields The registry is a list of assignments, each containing five fields
per assignment. per assignment.
1. The name of the recallable object type. This name must have the 1. The name of the recallable object type. This name must have the
prefix: "RCA4_TYPE_MASK_". The name must be unique. prefix: "RCA4_TYPE_MASK_". The name must be unique.
2. The value of the recallable object type. IANA will assign this 2. The value of the recallable object type. IANA will assign this
number, and the request from the registrant will use TBD1 instead number, and the request from the registrant will use TBD1 instead
of an actual value. IANA MUST use a whole number which can be no of an actual value. IANA MUST use a whole number which can be no
higher than 2^32-1, and should be the next available value. The higher than 2^32-1, and should be the next available value. The
value must be unique. A Designated Expert must be used to ensure value must be unique. A Designated Expert must be used to ensure
that when the name of the recallable type and its value are added that when the name of the recallable type and its value are added
to the NFSv4 XDR description [12], the result continues to be a to the NFSv4 XDR description [13], the result continues to be a
valid XDR description. valid XDR description.
3. The Standards Track RFC(s) that describe the recallable object 3. The Standards Track RFC(s) that describe the recallable object
type. If the RFC(s) have not yet been published, the registrant type. If the RFC(s) have not yet been published, the registrant
will use RFCTBD2, RFCTBD3, etc. instead of an actual RFC number. will use RFCTBD2, RFCTBD3, etc. instead of an actual RFC number.
4. How the RFC introduces the recallable object type. This is 4. How the RFC introduces the recallable object type. This is
indicated by a single US-ASCII value. If the value is N, it indicated by a single US-ASCII value. If the value is N, it
means a minor revision to the NFSv4 protocol. If the value is L, means a minor revision to the NFSv4 protocol. If the value is L,
it means a new pNFS layout type. Other values can be used with it means a new pNFS layout type. Other values can be used with
skipping to change at page 592, line 52 skipping to change at page 596, line 52
The registry is a list of assignments, each containing five fields. The registry is a list of assignments, each containing five fields.
1. The name of the layout type. This name must have the prefix: 1. The name of the layout type. This name must have the prefix:
"LAYOUT4_". The name must be unique. "LAYOUT4_". The name must be unique.
2. The value of the layout type. IANA will assign this number, and 2. The value of the layout type. IANA will assign this number, and
the request from the registrant will use TBD1 instead of an the request from the registrant will use TBD1 instead of an
actual value. The value assigned must be unique. A Designated actual value. The value assigned must be unique. A Designated
Expert must be used to ensure that when the name of the layout Expert must be used to ensure that when the name of the layout
type and its value are added to the NFSv4.1 layouttype4 type and its value are added to the NFSv4.1 layouttype4
enumerated data type in the NFSv4.1 XDR description ([12]), the enumerated data type in the NFSv4.1 XDR description ([13]), the
result continues to be a valid XDR description. result continues to be a valid XDR description.
3. The Standards Track RFC(s) that describe the notification. If 3. The Standards Track RFC(s) that describe the notification. If
the RFC(s) have not yet been published, the registrant will use the RFC(s) have not yet been published, the registrant will use
RFCTBD2, RFCTBD3, etc. instead of an actual RFC number. RFCTBD2, RFCTBD3, etc. instead of an actual RFC number.
Collectively, the RFC(s) must adhere to the guidelines listed in Collectively, the RFC(s) must adhere to the guidelines listed in
Section 22.4.3. Section 22.4.3.
4. How the RFC introduces the notification. This is indicated by a 4. How the RFC introduces the layout type. This is indicated by a
single US-ASCII value. If the value is N, it means a minor single US-ASCII value. If the value is N, it means a minor
revision to the NFSv4 protocol. If the value is L, it means a revision to the NFSv4 protocol. If the value is L, it means a
new pNFS layout type. Other values can be used with IESG new pNFS layout type. Other values can be used with IESG
Approval. Approval.
5. The minor versions of NFSv4 that are allowed to use the 5. The minor versions of NFSv4 that are allowed to use the
notification. While these are numeric values, IANA will not notification. While these are numeric values, IANA will not
allocate and assign them; the author of the relevant RFCs with allocate and assign them; the author of the relevant RFCs with
IESG Approval assigns these numbers. Each time there is a new IESG Approval assigns these numbers. Each time there is a new
minor version of NFSv4 approved, a Designated Expert should minor version of NFSv4 approved, a Designated Expert should
skipping to change at page 594, line 46 skipping to change at page 598, line 46
+ A request to IANA for a new layout type per Section 22.4. + A request to IANA for a new layout type per Section 22.4.
+ A list of requests to IANA for any new recallable object + A list of requests to IANA for any new recallable object
types for CB_RECALL_ANY; each entry is to be presented in the types for CB_RECALL_ANY; each entry is to be presented in the
form described in Section 22.3. form described in Section 22.3.
+ A list of requests to IANA for any new notification values + A list of requests to IANA for any new notification values
for CB_NOTIFY_DEVICEID; each entry is to be presented in the for CB_NOTIFY_DEVICEID; each entry is to be presented in the
form described in Section 22.2. form described in Section 22.2.
* Include a security considerations section. * Include a security considerations section. This section MUST
explain how the NFSv4.1 authentication, authorization, and
access control models are preserved. I.e. if a metadata
server would restrict a READ or WRITE operation, how would
pNFS via the layout similarly restrict a corresponding input
or output operation?
3. The author documents the new layout specification as an Internet 3. The author documents the new layout specification as an Internet
Draft. Draft.
4. The author submits the Internet Draft for review through the IETF 4. The author submits the Internet Draft for review through the IETF
standards process as defined in "Internet Official Protocol standards process as defined in "Internet Official Protocol
Standards" (STD 1). The new layout specification will be Standards" (STD 1). The new layout specification will be
submitted for eventual publication as a standards track RFC. submitted for eventual publication as a standards track RFC.
5. The layout specification progresses through the IETF standards 5. The layout specification progresses through the IETF standards
process; the new option will be reviewed by the NFSv4 Working process; the new option will be reviewed by the NFSv4 Working
skipping to change at page 598, line 31 skipping to change at page 602, line 33
22.5.3.2. Updating Registrations 22.5.3.2. Updating Registrations
The registrant is free to update the assignment, i.e. change the The registrant is free to update the assignment, i.e. change the
explanation and/or point of contact fields. explanation and/or point of contact fields.
23. References 23. References
23.1. Normative References 23.1. Normative References
[1] Bradner, S., "Key words for use in RFCs to Indicate Requirement [1] Bradner, S., "Key words for use in RFCs to Indicate Requirement
Levels", March 1997. Levels", RFC 2119, March 1997.
[2] Eisler, M., "XDR: External Data Representation Standard", [2] Eisler, M., "XDR: External Data Representation Standard",
STD 67, RFC 4506, May 2006. STD 67, RFC 4506, May 2006.
[3] Srinivasan, R., "RPC: Remote Procedure Call Protocol [3] Srinivasan, R., "RPC: Remote Procedure Call Protocol
Specification Version 2", RFC 1831, August 1995. Specification Version 2", RFC 1831, August 1995.
[4] Eisler, M., Chiu, A., and L. Ling, "RPCSEC_GSS Protocol [4] Eisler, M., Chiu, A., and L. Ling, "RPCSEC_GSS Protocol
Specification", RFC 2203, September 1997. Specification", RFC 2203, September 1997.
[5] Zhu, L., Jaganathan, K., and S. Hartman, "The Kerberos Version [5] Zhu, L., Jaganathan, K., and S. Hartman, "The Kerberos Version
5 Generic Security Service Application Program Interface (GSS- 5 Generic Security Service Application Program Interface (GSS-
API) Mechanism Version 2", RFC 4121, July 2005. API) Mechanism Version 2", RFC 4121, July 2005.
[6] Eisler, M., "LIPKEY - A Low Infrastructure Public Key Mechanism [6] Eisler, M., "LIPKEY - A Low Infrastructure Public Key Mechanism
Using SPKM", RFC 2847, June 2000. Using SPKM", RFC 2847, June 2000.
[7] Linn, J., "Generic Security Service Application Program [7] The Open Group, "Section 3.191 of Chapter 3 of Base Definitions
of The Open Group Base Specifications Issue 6 IEEE Std 1003.1,
2004 Edition, HTML Version (www.opengroup.org), ISBN
1931624232", 2004.
[8] Linn, J., "Generic Security Service Application Program
Interface Version 2, Update 1", RFC 2743, January 2000. Interface Version 2, Update 1", RFC 2743, January 2000.
[8] Talpey, T. and B. Callaghan, "Remote Direct Memory Access [9] Talpey, T. and B. Callaghan, "Remote Direct Memory Access
Transport for Remote Procedure Call", Transport for Remote Procedure Call",
draft-ietf-nfsv4-rpcrdma-08 (work in progress), April 2008. draft-ietf-nfsv4-rpcrdma-08 (work in progress), April 2008.
[9] Talpey, T., Callaghan, B., and I. Property, "NFS Direct Data [10] Talpey, T., Callaghan, B., and I. Property, "NFS Direct Data
Placement", draft-ietf-nfsv4-nfsdirect-08 (work in progress), Placement", draft-ietf-nfsv4-nfsdirect-08 (work in progress),
April 2008. April 2008.
[10] Recio, P., Metzler, B., Culley, P., Hilland, J., and D. Garcia, [11] Recio, P., Metzler, B., Culley, P., Hilland, J., and D. Garcia,
"A Remote Direct Memory Access Protocol Specification", "A Remote Direct Memory Access Protocol Specification",
RFC 5040, October 2007. RFC 5040, October 2007.
[11] Krawczyk, H., Bellare, M., and R. Canetti, "HMAC: Keyed-Hashing [12] Krawczyk, H., Bellare, M., and R. Canetti, "HMAC: Keyed-Hashing
for Message Authentication", RFC 2104, February 1997. for Message Authentication", RFC 2104, February 1997.
[12] Shepler, S., Eisler, M., and D. Noveck, "NFSv4 Minor Version 1 [13] Shepler, S., Eisler, M., and D. Noveck, "NFSv4 Minor Version 1
XDR Description", draft-ietf-nfsv4-minorversion1-dot-x-08 (work XDR Description", draft-ietf-nfsv4-minorversion1-dot-x-10 (work
in progress), Aug 2008. in progress), Dec 2008.
[13] Eisler, M., "IANA Considerations for RPC Net Identifiers and [14] The Open Group, "Section 3.372 of Chapter 3 of Base Definitions
Universal Address Formats", draft-ietf-nfsv4-rpc-netid-03 (work of The Open Group Base Specifications Issue 6 IEEE Std 1003.1,
in progress), Aug 2008. 2004 Edition, HTML Version (www.opengroup.org), ISBN
1931624232", 2004.
[14] International Organization for Standardization, "Information [15] Eisler, M., "IANA Considerations for RPC Net Identifiers and
Universal Address Formats", draft-ietf-nfsv4-rpc-netid-04 (work
in progress), December 2008.
[16] The Open Group, "Section 'read()' of System Interfaces of The
Open Group Base Specifications Issue 6 IEEE Std 1003.1, 2004
Edition, HTML Version (www.opengroup.org), ISBN 1931624232",
2004.
[17] The Open Group, "Section 'readdir()' of System Interfaces of
The Open Group Base Specifications Issue 6 IEEE Std 1003.1,
2004 Edition, HTML Version (www.opengroup.org), ISBN
1931624232", 2004.
[18] The Open Group, "Section 'write()' of System Interfaces of The
Open Group Base Specifications Issue 6 IEEE Std 1003.1, 2004
Edition, HTML Version (www.opengroup.org), ISBN 1931624232",
2004.
[19] Hoffman, P. and M. Blanchet, "Preparation of Internationalized
Strings ("stringprep")", RFC 3454, December 2002.
[20] The Open Group, "Section 'chmod()' of System Interfaces of The
Open Group Base Specifications Issue 6 IEEE Std 1003.1, 2004
Edition, HTML Version (www.opengroup.org), ISBN 1931624232",
2004.
[21] International Organization for Standardization, "Information
Technology - Universal Multiple-octet coded Character Set (UCS) Technology - Universal Multiple-octet coded Character Set (UCS)
- Part 1: Architecture and Basic Multilingual Plane", - Part 1: Architecture and Basic Multilingual Plane",
ISO Standard 10646-1, May 1993. ISO Standard 10646-1, May 1993.
[15] Alvestrand, H., "IETF Policy on Character Sets and Languages", [22] Alvestrand, H., "IETF Policy on Character Sets and Languages",
BCP 18, RFC 2277, January 1998. BCP 18, RFC 2277, January 1998.
[16] Hoffman, P. and M. Blanchet, "Preparation of Internationalized [23] Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep Profile
Strings ("stringprep")", RFC 3454, December 2002.
[17] Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep Profile
for Internationalized Domain Names (IDN)", RFC 3491, for Internationalized Domain Names (IDN)", RFC 3491,
March 2003. March 2003.
[18] Schaad, J., Kaliski, B., and R. Housley, "Additional Algorithms [24] The Open Group, "Section 'fcntl()' of System Interfaces of The
Open Group Base Specifications Issue 6 IEEE Std 1003.1, 2004
Edition, HTML Version (www.opengroup.org), ISBN 1931624232",
2004.
[25] The Open Group, "Section 'fsync()' of System Interfaces of The
Open Group Base Specifications Issue 6 IEEE Std 1003.1, 2004
Edition, HTML Version (www.opengroup.org), ISBN 1931624232",
2004.
[26] The Open Group, "Section 'getpwnam()' of System Interfaces of
The Open Group Base Specifications Issue 6 IEEE Std 1003.1,
2004 Edition, HTML Version (www.opengroup.org), ISBN
1931624232", 2004.
[27] The Open Group, "Section 'unlink()' of System Interfaces of The
Open Group Base Specifications Issue 6 IEEE Std 1003.1, 2004
Edition, HTML Version (www.opengroup.org), ISBN 1931624232",
2004.
[28] Schaad, J., Kaliski, B., and R. Housley, "Additional Algorithms
and Identifiers for RSA Cryptography for use in the Internet and Identifiers for RSA Cryptography for use in the Internet
X.509 Public Key Infrastructure Certificate and Certificate X.509 Public Key Infrastructure Certificate and Certificate
Revocation List (CRL) Profile", RFC 4055, June 2005. Revocation List (CRL) Profile", RFC 4055, June 2005.
[19] National Institute of Standards and Technology, "Cryptographic [29] National Institute of Standards and Technology, "Cryptographic
Algorithm Object Registration", December 2005. Algorithm Object Registration", URL http://csrc.nist.gov/
groups/ST/crypto_apps_infra/csor/algorithms.html,
November 2007.
23.2. Informative References 23.2. Informative References
[20] Shepler, S., Callaghan, B., Robinson, D., Thurlow, R., Beame, [30] Shepler, S., Callaghan, B., Robinson, D., Thurlow, R., Beame,
C., Eisler, M., and D. Noveck, "Network File System (NFS) C., Eisler, M., and D. Noveck, "Network File System (NFS)
version 4 Protocol", RFC 3530, April 2003. version 4 Protocol", RFC 3530, April 2003.
[21] Callaghan, B., Pawlowski, B., and P. Staubach, "NFS Version 3 [31] Callaghan, B., Pawlowski, B., and P. Staubach, "NFS Version 3
Protocol Specification", RFC 1813, June 1995. Protocol Specification", RFC 1813, June 1995.
[22] Eisler, M., "NFS Version 2 and Version 3 Security Issues and [32] Eisler, M., "NFS Version 2 and Version 3 Security Issues and
the NFS Protocol's Use of RPCSEC_GSS and Kerberos V5", the NFS Protocol's Use of RPCSEC_GSS and Kerberos V5",
RFC 2623, June 1999. RFC 2623, June 1999.
[23] Juszczak, C., "Improving the Performance and Correctness of an [33] Juszczak, C., "Improving the Performance and Correctness of an
NFS Server", USENIX Conference Proceedings , June 1990. NFS Server", USENIX Conference Proceedings , June 1990.
[24] Reynolds, J., "Assigned Numbers: RFC 1700 is Replaced by an On- [34] Reynolds, J., "Assigned Numbers: RFC 1700 is Replaced by an On-
line Database", RFC 3232, January 2002. line Database", RFC 3232, January 2002.
[25] Srinivasan, R., "Binding Protocols for ONC RPC Version 2", [35] Srinivasan, R., "Binding Protocols for ONC RPC Version 2",
RFC 1833, August 1995. RFC 1833, August 1995.
[26] Werme, R., "RPC XID Issues", USENIX Conference Proceedings , [36] Werme, R., "RPC XID Issues", USENIX Conference Proceedings ,
February 1996. February 1996.
[27] Nowicki, B., "NFS: Network File System Protocol specification", [37] Nowicki, B., "NFS: Network File System Protocol specification",
RFC 1094, March 1989. RFC 1094, March 1989.
[28] Bhide, A., Elnozahy, E., and S. Morgan, "A Highly Available [38] Bhide, A., Elnozahy, E., and S. Morgan, "A Highly Available
Network Server", USENIX Conference Proceedings , January 1991. Network Server", USENIX Conference Proceedings , January 1991.
[29] Halevy, B., Welch, B., and J. Zelenka, "Object-based pNFS [39] Halevy, B., Welch, B., and J. Zelenka, "Object-based pNFS
Operations", draft-ietf-nfsv4-pnfs-obj-09 (work in progress), Operations", draft-ietf-nfsv4-pnfs-obj-10 (work in progress),
June 2008. December 2008.
[30] Black, D., Fridella, S., and J. Glasgow, "pNFS Block/Volume [40] Black, D., Fridella, S., and J. Glasgow, "pNFS Block/Volume
Layout", draft-ietf-nfsv4-pnfs-block-09 (work in progress), Layout", draft-ietf-nfsv4-pnfs-block-10 (work in progress),
June 2008. November 2008.
[31] Callaghan, B., "WebNFS Client Specification", RFC 2054, [41] Callaghan, B., "WebNFS Client Specification", RFC 2054,
October 1996. October 1996.
[32] Callaghan, B., "WebNFS Server Specification", RFC 2055, [42] Callaghan, B., "WebNFS Server Specification", RFC 2055,
October 1996. October 1996.
[33] Shepler, S., "NFS Version 4 Design Considerations", RFC 2624, [43] Shepler, S., "NFS Version 4 Design Considerations", RFC 2624,
June 1999. June 1999.
[34] Simonsen, K., "Character Mnemonics and Character Sets", [44] The Open Group, "Protocols for Interworking: XNFS, Version 3W,
RFC 1345, June 1992.
[35] The Open Group, "Protocols for Interworking: XNFS, Version 3W,
ISBN 1-85912-184-5", February 1998. ISBN 1-85912-184-5", February 1998.
[36] Floyd, S. and V. Jacobson, "The Synchronization of Periodic [45] Floyd, S. and V. Jacobson, "The Synchronization of Periodic
Routing Messages", IEEE/ACM Transactions on Networking 2(2), Routing Messages", IEEE/ACM Transactions on Networking 2(2),
pp. 122-136, April 1994. pp. 122-136, April 1994.
[37] Satran, J., Meth, K., Sapuntzakis, C., Chadalapaka, M., and E. [46] Satran, J., Meth, K., Sapuntzakis, C., Chadalapaka, M., and E.
Zeidner, "Internet Small Computer Systems Interface (iSCSI)", Zeidner, "Internet Small Computer Systems Interface (iSCSI)",
RFC 3720, April 2004. RFC 3720, April 2004.
[38] Snively, R., "Fibre Channel Protocol for SCSI, 2nd Version [47] Snively, R., "Fibre Channel Protocol for SCSI, 2nd Version
(FCP-2)", ANSI/INCITS 350-2003, Oct 2003. (FCP-2)", ANSI/INCITS 350-2003, Oct 2003.
[39] Weber, R., "Object-Based Storage Device Commands (OSD)", ANSI/ [48] Weber, R., "Object-Based Storage Device Commands (OSD)", ANSI/
INCITS 400-2004, July 2004, INCITS 400-2004, July 2004,
<http://www.t10.org/ftp/t10/drafts/osd/osd-r10.pdf>. <http://www.t10.org/ftp/t10/drafts/osd/osd-r10.pdf>.
[40] The Open Group, "The Open Group Base Specifications Issue 6, [49] Carns, P., Ligon III, W., Ross, R., and R. Thakur, "PVFS: A
Parallel File System for Linux Clusters.", Proceedings of the
4th Annual Linux Showcase and Conference , 2000.
[50] The Open Group, "The Open Group Base Specifications Issue 6,
IEEE Std 1003.1, 2004 Edition", 2004. IEEE Std 1003.1, 2004 Edition", 2004.
[41] Callaghan, B., "NFS URL Scheme", RFC 2224, October 1997. [51] Callaghan, B., "NFS URL Scheme", RFC 2224, October 1997.
[42] Chiu, A., Eisler, M., and B. Callaghan, "Security Negotiation [52] Chiu, A., Eisler, M., and B. Callaghan, "Security Negotiation
for WebNFS", RFC 2755, January 2000. for WebNFS", RFC 2755, January 2000.
[43] Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA [53] Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA
Considerations Section in RFCs", BCP 26, RFC 5226, May 2008. Considerations Section in RFCs", BCP 26, RFC 5226, May 2008.
Appendix A. Acknowledgments Appendix A. Acknowledgments
The initial drafts for the SECINFO extensions were edited by Mike The initial drafts for the SECINFO extensions were edited by Mike
Eisler with contributions from Peng Dai, Sergey Klyushin, and Carl Eisler with contributions from Peng Dai, Sergey Klyushin, and Carl
Burnett. Burnett.
The initial drafts for the SESSIONS extensions were edited by Tom The initial drafts for the SESSIONS extensions were edited by Tom
Talpey, Spencer Shepler, Jon Bauman with contributions from Charles Talpey, Spencer Shepler, Jon Bauman with contributions from Charles
skipping to change at page 603, line 33 skipping to change at page 608, line 37
o Final pNFS inspection, with the following inspectors: Andy o Final pNFS inspection, with the following inspectors: Andy
Adamson, Mike Eisler, Mark Eshel, Sam Falkner, Jason Glasgow, Adamson, Mike Eisler, Mark Eshel, Sam Falkner, Jason Glasgow,
Garth Goodson, Robert Gordon, Benny Halevy, Dean Hildebrand, Rahul Garth Goodson, Robert Gordon, Benny Halevy, Dean Hildebrand, Rahul
Iyer, Suchit Kaura, Trond Myklebust, Anatoly Pinchuk, Spencer Iyer, Suchit Kaura, Trond Myklebust, Anatoly Pinchuk, Spencer
Shepler, Renu Tewari, Lisa Week, and Brent Welch. Shepler, Renu Tewari, Lisa Week, and Brent Welch.
A review team worked together to generate the tables of assignments A review team worked together to generate the tables of assignments
of error sets to operations and make sure that each such assignment of error sets to operations and make sure that each such assignment
had two or more people validating it. Participating in the process had two or more people validating it. Participating in the process
were: Andy Adamson, Mike Eisler, Sam Falkner, Garth Goodson, Robert were: Andy Adamson, Mike Eisler, Sam Falkner, Garth Goodson, Robert
Gordon, Trond Myklebust, Dave Noveck Spencer Shepler, Tom Talpey, Amy Gordon, Trond Myklebust, Dave Noveck, Spencer Shepler, Tom Talpey,
Weaver, and Lisa Week. Amy Weaver, and Lisa Week.
Lars Eggert provided valuable review and guidance. David Black, Scott Bradner, Lisa Dusseault, and Lars Eggert provided
valuable review and guidance.
Others who provided comments include: Jason Goldschmidt, James Others who provided comments include: Jason Goldschmidt, Vijay K.
Lentini, Archana Ramani, Jim Rees, and Mahesh Siddheshwar. Gurbani, James Lentini, Anshul Madan, Archana Ramani, Jim Rees,
Mahesh Siddheshwar, and Sunil Bhargo.
Appendix B. RFC Editor Notes Appendix B. RFC Editor Notes
[RFC Editor: please remove this section prior to publishing this [RFC Editor: please remove this section prior to publishing this
document as an RFC] document as an RFC]
[RFC Editor: prior to publishing this document as an RFC, please [RFC Editor: prior to publishing this document as an RFC, please
replace all occurrences of RFCTBD10 with RFCxxxx where xxxx is the replace all occurrences of RFCTBD10 with RFCxxxx where xxxx is the
RFC number of this document] RFC number of this document]
[RFC Editor: prior to publishing this document as an RFC, please [RFC Editor: prior to publishing this document as an RFC, please
replace all occurrences of RFCTBD20 with RFCyyyy where yyyy is the replace all occurrences of RFCTBD20 with RFCyyyy where yyyy is the
RFC number of the document referenced in [30]] RFC number of the document referenced in [40]]
[RFC Editor: prior to publishing this document as an RFC, please [RFC Editor: prior to publishing this document as an RFC, please
replace all occurrences of RFCTBD30 with RFCzzzz where zzzz is the replace all occurrences of RFCTBD30 with RFCzzzz where zzzz is the
RFC number of the document referenced in [29]] RFC number of the document referenced in [39]]
Authors' Addresses Authors' Addresses
Spencer Shepler Spencer Shepler
Storspeed, Inc. Storspeed, Inc.
7808 Moonflower Drive 7808 Moonflower Drive
Austin, TX 78750 Austin, TX 78750
USA USA
Phone: +1-512-402-5811 ext 8530 Phone: +1-512-402-5811 ext 8530
 End of changes. 249 change blocks. 
723 lines changed or deleted 904 lines changed or added
