draft-ietf-nfsv4-rfc3530bis-03.txt   draft-ietf-nfsv4-rfc3530bis-04.txt 
NFSv4 T. Haynes NFSv4 T. Haynes
Internet-Draft Editor Internet-Draft D. Noveck
Intended status: Standards Track March 05, 2010 Intended status: Standards Track Editors
Expires: September 6, 2010 Expires: January 8, 2011 July 07, 2010
NFS Version 4 Protocol NFS Version 4 Protocol
draft-ietf-nfsv4-rfc3530bis-03.txt draft-ietf-nfsv4-rfc3530bis-04.txt
Abstract Abstract
The Network File System (NFS) version 4 is a distributed filesystem The Network File System (NFS) version 4 is a distributed filesystem
protocol which owes heritage to NFS protocol version 2, RFC 1094, and protocol which owes heritage to NFS protocol version 2, RFC 1094, and
version 3, RFC 1813. Unlike earlier versions, the NFS version 4 version 3, RFC 1813. Unlike earlier versions, the NFS version 4
protocol supports traditional file access while integrating support protocol supports traditional file access while integrating support
for file locking and the mount protocol. In addition, support for for file locking and the mount protocol. In addition, support for
strong security (and its negotiation), compound operations, client strong security (and its negotiation), compound operations, client
caching, and internationalization have been added. Of course, caching, and internationalization have been added. Of course,
attention has been applied to making NFS version 4 operate well in an attention has been applied to making NFS version 4 operate well in an
Internet environment. This document replaces RFC 3530 as the Internet environment.
definition of the NFS version 4 protocol.
This document, together with the companion XDR description document,
replaces RFC 3530 as the definition of the NFS version 4 protocol.
Requirements Language Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [1]. document are to be interpreted as described in RFC 2119 [1].
Status of this Memo Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with the This Internet-Draft is submitted to IETF in full conformance with the
skipping to change at page 2, line 5 skipping to change at page 2, line 8
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt. http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
This Internet-Draft will expire on September 6, 2010. This Internet-Draft will expire on January 8, 2011.
Copyright Notice Copyright Notice
Copyright (c) 2010 IETF Trust and the persons identified as the Copyright (c) 2010 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 3, line 7 skipping to change at page 3, line 7
modifications of such material outside the IETF Standards Process. modifications of such material outside the IETF Standards Process.
Without obtaining an adequate license from the person(s) controlling Without obtaining an adequate license from the person(s) controlling
the copyright in such materials, this document may not be modified the copyright in such materials, this document may not be modified
outside the IETF Standards Process, and derivative works of it may outside the IETF Standards Process, and derivative works of it may
not be created outside the IETF Standards Process, except to format not be created outside the IETF Standards Process, except to format
it for publication as an RFC or to translate it into languages other it for publication as an RFC or to translate it into languages other
than English. than English.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 7 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 8
1.1. Changes since RFC 3530 . . . . . . . . . . . . . . . . . 7 1.1. Changes since RFC 3530 . . . . . . . . . . . . . . . . . 8
1.2. Changes since RFC 3010 . . . . . . . . . . . . . . . . . 7 1.2. Changes since RFC 3010 . . . . . . . . . . . . . . . . . 8
1.3. NFS Version 4 Goals . . . . . . . . . . . . . . . . . . 8 1.3. NFS Version 4 Goals . . . . . . . . . . . . . . . . . . 10
1.4. Inconsistencies of this Document with Section 18 . . . . 9 1.4. Inconsistencies of this Document with the companion
1.5. Overview of NFS version 4 Features . . . . . . . . . . . 9 document NFS Version 4 Protocol . . . . . . . . . . . . 10
1.5.1. RPC and Security . . . . . . . . . . . . . . . . . . 9 1.5. Overview of NFS version 4 Features . . . . . . . . . . . 11
1.5.2. Procedure and Operation Structure . . . . . . . . . . 10 1.5.1. RPC and Security . . . . . . . . . . . . . . . . . . 11
1.5.3. Filesystem Model . . . . . . . . . . . . . . . . . . 10 1.5.2. Procedure and Operation Structure . . . . . . . . . 11
1.5.4. OPEN and CLOSE . . . . . . . . . . . . . . . . . . . 12 1.5.3. Filesystem Model . . . . . . . . . . . . . . . . . . 12
1.5.5. File locking . . . . . . . . . . . . . . . . . . . . 12 1.5.4. OPEN and CLOSE . . . . . . . . . . . . . . . . . . . 14
1.5.6. Client Caching and Delegation . . . . . . . . . . . . 13 1.5.5. File Locking . . . . . . . . . . . . . . . . . . . . 14
1.6. General Definitions . . . . . . . . . . . . . . . . . . 13 1.5.6. Client Caching and Delegation . . . . . . . . . . . 14
2. Protocol Data Types . . . . . . . . . . . . . . . . . . . . . 15 1.6. General Definitions . . . . . . . . . . . . . . . . . . 15
2.1. Basic Data Types . . . . . . . . . . . . . . . . . . . . 15 2. Protocol Data Types . . . . . . . . . . . . . . . . . . . . . 17
2.2. Structured Data Types . . . . . . . . . . . . . . . . . 17 2.1. Basic Data Types . . . . . . . . . . . . . . . . . . . . 17
3. RPC and Security Flavor . . . . . . . . . . . . . . . . . . . 22 2.2. Structured Data Types . . . . . . . . . . . . . . . . . 18
3.1. Ports and Transports . . . . . . . . . . . . . . . . . . 22 3. RPC and Security Flavor . . . . . . . . . . . . . . . . . . . 24
3.1.1. Client Retransmission Behavior . . . . . . . . . . . 23 3.1. Ports and Transports . . . . . . . . . . . . . . . . . . 24
3.2. Security Flavors . . . . . . . . . . . . . . . . . . . . 23 3.1.1. Client Retransmission Behavior . . . . . . . . . . . 25
3.2.1. Security mechanisms for NFS version 4 . . . . . . . . 24 3.2. Security Flavors . . . . . . . . . . . . . . . . . . . . 25
3.3. Security Negotiation . . . . . . . . . . . . . . . . . . 26 3.2.1. Security mechanisms for NFS version 4 . . . . . . . 26
3.3.1. SECINFO . . . . . . . . . . . . . . . . . . . . . . . 26 3.3. Security Negotiation . . . . . . . . . . . . . . . . . . 28
3.3.2. Security Error . . . . . . . . . . . . . . . . . . . 27 3.3.1. SECINFO . . . . . . . . . . . . . . . . . . . . . . 28
3.3.3. Callback RPC Authentication . . . . . . . . . . . . . 27 3.3.2. Security Error . . . . . . . . . . . . . . . . . . . 28
4. Filehandles . . . . . . . . . . . . . . . . . . . . . . . . . 29 3.3.3. Callback RPC Authentication . . . . . . . . . . . . 29
4.1. Obtaining the First Filehandle . . . . . . . . . . . . . 29 4. Filehandles . . . . . . . . . . . . . . . . . . . . . . . . . 31
4.1.1. Root Filehandle . . . . . . . . . . . . . . . . . . . 29 4.1. Obtaining the First Filehandle . . . . . . . . . . . . . 31
4.1.2. Public Filehandle . . . . . . . . . . . . . . . . . . 30 4.1.1. Root Filehandle . . . . . . . . . . . . . . . . . . 31
4.2. Filehandle Types . . . . . . . . . . . . . . . . . . . . 30 4.1.2. Public Filehandle . . . . . . . . . . . . . . . . . 31
4.2.1. General Properties of a Filehandle . . . . . . . . . 30 4.2. Filehandle Types . . . . . . . . . . . . . . . . . . . . 32
4.2.2. Persistent Filehandle . . . . . . . . . . . . . . . . 31 4.2.1. General Properties of a Filehandle . . . . . . . . . 32
4.2.3. Volatile Filehandle . . . . . . . . . . . . . . . . . 31 4.2.2. Persistent Filehandle . . . . . . . . . . . . . . . 33
4.2.4. One Method of Constructing a Volatile Filehandle . . 33 4.2.3. Volatile Filehandle . . . . . . . . . . . . . . . . 33
4.3. Client Recovery from Filehandle Expiration . . . . . . . 33 4.2.4. One Method of Constructing a Volatile Filehandle . . 35
5. File Attributes . . . . . . . . . . . . . . . . . . . . . . . 34 4.3. Client Recovery from Filehandle Expiration . . . . . . . 35
5.1. Mandatory Attributes . . . . . . . . . . . . . . . . . . 35 5. File Attributes . . . . . . . . . . . . . . . . . . . . . . . 36
5.2. Recommended Attributes . . . . . . . . . . . . . . . . . 35 5.1. REQUIRED Attributes . . . . . . . . . . . . . . . . . . 37
5.3. Named Attributes . . . . . . . . . . . . . . . . . . . . 36 5.2. RECOMMENDED Attributes . . . . . . . . . . . . . . . . . 37
5.4. Classification of Attributes . . . . . . . . . . . . . . 36 5.3. Named Attributes . . . . . . . . . . . . . . . . . . . . 38
5.5. Mandatory Attributes - Definitions . . . . . . . . . . . 37 5.4. Classification of Attributes . . . . . . . . . . . . . . 39
5.6. Recommended Attributes - Definitions . . . . . . . . . . 39 5.5. Set-Only and Get-Only Attributes . . . . . . . . . . . . 40
5.7. Time Access . . . . . . . . . . . . . . . . . . . . . . 45 5.6. REQUIRED Attributes - List and Definition References . . 40
5.8. Interpreting owner and owner_group . . . . . . . . . . . 46 5.7. RECOMMENDED Attributes - List and Definition
5.9. Character Case Attributes . . . . . . . . . . . . . . . 48 References . . . . . . . . . . . . . . . . . . . . . . . 41
5.10. Quota Attributes . . . . . . . . . . . . . . . . . . . . 48 5.8. Attribute Definitions . . . . . . . . . . . . . . . . . 42
5.11. Access Control Lists . . . . . . . . . . . . . . . . . . 49 5.8.1. Definitions of REQUIRED Attributes . . . . . . . . . 42
5.11.1. ACE type . . . . . . . . . . . . . . . . . . . . . . 50 5.8.2. Definitions of Uncategorized RECOMMENDED
5.11.2. ACE Access Mask . . . . . . . . . . . . . . . . . . . 51 Attributes . . . . . . . . . . . . . . . . . . . . . 44
5.11.3. ACE flag . . . . . . . . . . . . . . . . . . . . . . 53 5.9. Interpreting owner and owner_group . . . . . . . . . . . 50
5.11.4. ACE who . . . . . . . . . . . . . . . . . . . . . . . 55 5.10. Character Case Attributes . . . . . . . . . . . . . . . 52
5.11.5. Mode Attribute . . . . . . . . . . . . . . . . . . . 56 6. Access Control Attributes . . . . . . . . . . . . . . . . . . 53
5.11.6. Mode and ACL Attribute . . . . . . . . . . . . . . . 57 6.1. Goals . . . . . . . . . . . . . . . . . . . . . . . . . 53
5.11.7. mounted_on_fileid . . . . . . . . . . . . . . . . . . 57 6.2. File Attributes Discussion . . . . . . . . . . . . . . . 54
6. Filesystem Migration and Replication . . . . . . . . . . . . 58 6.2.1. Attribute 12: acl . . . . . . . . . . . . . . . . . 54
6.1. Replication . . . . . . . . . . . . . . . . . . . . . . 59 6.2.2. Attribute 33: mode . . . . . . . . . . . . . . . . . 68
6.2. Migration . . . . . . . . . . . . . . . . . . . . . . . 59 6.3. Common Methods . . . . . . . . . . . . . . . . . . . . . 68
6.3. Interpretation of the fs_locations Attribute . . . . . . 60 6.3.1. Interpreting an ACL . . . . . . . . . . . . . . . . 68
6.4. Filehandle Recovery for Migration or Replication . . . . 61 6.3.2. Computing a Mode Attribute from an ACL . . . . . . . 69
7. NFS Server Name Space . . . . . . . . . . . . . . . . . . . . 61 6.4. Requirements . . . . . . . . . . . . . . . . . . . . . . 70
7.1. Server Exports . . . . . . . . . . . . . . . . . . . . . 61 6.4.1. Setting the mode and/or ACL Attributes . . . . . . . 71
7.2. Browsing Exports . . . . . . . . . . . . . . . . . . . . 62 6.4.2. Retrieving the mode and/or ACL Attributes . . . . . 72
7.3. Server Pseudo Filesystem . . . . . . . . . . . . . . . . 62 6.4.3. Creating New Objects . . . . . . . . . . . . . . . . 72
7.4. Multiple Roots . . . . . . . . . . . . . . . . . . . . . 63 7. Multi-Server Namespace . . . . . . . . . . . . . . . . . . . 74
7.5. Filehandle Volatility . . . . . . . . . . . . . . . . . 63 7.1. Location Attributes . . . . . . . . . . . . . . . . . . 74
7.6. Exported Root . . . . . . . . . . . . . . . . . . . . . 63 7.2. File System Presence or Absence . . . . . . . . . . . . 75
7.7. Mount Point Crossing . . . . . . . . . . . . . . . . . . 63 7.3. Getting Attributes for an Absent File System . . . . . . 76
7.8. Security Policy and Name Space Presentation . . . . . . 64 7.3.1. GETATTR Within an Absent File System . . . . . . . . 76
8. File Locking and Share Reservations . . . . . . . . . . . . . 65 7.3.2. READDIR and Absent File Systems . . . . . . . . . . 77
8.1. Locking . . . . . . . . . . . . . . . . . . . . . . . . 65 7.4. Uses of Location Information . . . . . . . . . . . . . . 78
8.1.1. Client ID . . . . . . . . . . . . . . . . . . . . . . 66 7.4.1. File System Replication . . . . . . . . . . . . . . 78
8.1.2. Server Release of Clientid . . . . . . . . . . . . . 69 7.4.2. File System Migration . . . . . . . . . . . . . . . 79
8.1.3. lock_owner and stateid Definition . . . . . . . . . . 69 7.4.3. Referrals . . . . . . . . . . . . . . . . . . . . . 80
8.1.4. Use of the stateid and Locking . . . . . . . . . . . 71 7.5. Location Entries and Server Identity . . . . . . . . . . 80
8.1.5. Sequencing of Lock Requests . . . . . . . . . . . . . 73 7.6. Additional Client-side Considerations . . . . . . . . . 81
8.1.6. Recovery from Replayed Requests . . . . . . . . . . . 74 7.7. Effecting File System Transitions . . . . . . . . . . . 82
8.1.7. Releasing lock_owner State . . . . . . . . . . . . . 74 7.7.1. File System Transitions and Simultaneous Access . . 83
8.1.8. Use of Open Confirmation . . . . . . . . . . . . . . 75 7.7.2. Filehandles and File System Transitions . . . . . . 83
8.2. Lock Ranges . . . . . . . . . . . . . . . . . . . . . . 76 7.7.3. Fileids and File System Transitions . . . . . . . . 84
8.3. Upgrading and Downgrading Locks . . . . . . . . . . . . 76 7.7.4. Fsids and File System Transitions . . . . . . . . . 85
8.4. Blocking Locks . . . . . . . . . . . . . . . . . . . . . 77 7.7.5. The Change Attribute and File System Transitions . . 85
8.5. Lease Renewal . . . . . . . . . . . . . . . . . . . . . 77 7.7.6. Lock State and File System Transitions . . . . . . . 86
8.6. Crash Recovery . . . . . . . . . . . . . . . . . . . . . 78 7.7.7. Write Verifiers and File System Transitions . . . . 88
8.6.1. Client Failure and Recovery . . . . . . . . . . . . . 78 7.7.8. Readdir Cookies and Verifiers and File System
8.6.2. Server Failure and Recovery . . . . . . . . . . . . . 79 Transitions . . . . . . . . . . . . . . . . . . . . 88
8.6.3. Network Partitions and Recovery . . . . . . . . . . . 81 7.7.9. File System Data and File System Transitions . . . . 88
8.7. Recovery from a Lock Request Timeout or Abort . . . . . 84 7.8. Effecting File System Referrals . . . . . . . . . . . . 90
8.8. Server Revocation of Locks . . . . . . . . . . . . . . . 85 7.8.1. Referral Example (LOOKUP) . . . . . . . . . . . . . 90
8.9. Share Reservations . . . . . . . . . . . . . . . . . . . 86 7.8.2. Referral Example (READDIR) . . . . . . . . . . . . . 94
8.10. OPEN/CLOSE Operations . . . . . . . . . . . . . . . . . 87 7.9. The Attribute fs_locations . . . . . . . . . . . . . . . 97
8.10.1. Close and Retention of State Information . . . . . . 87 7.9.1. Inferring Transition Modes . . . . . . . . . . . . . 98
8.11. Open Upgrade and Downgrade . . . . . . . . . . . . . . . 88 8. NFS Server Name Space . . . . . . . . . . . . . . . . . . . . 99
8.12. Short and Long Leases . . . . . . . . . . . . . . . . . 89 8.1. Server Exports . . . . . . . . . . . . . . . . . . . . . 100
8.13. Clocks, Propagation Delay, and Calculating Lease 8.2. Browsing Exports . . . . . . . . . . . . . . . . . . . . 100
Expiration . . . . . . . . . . . . . . . . . . . . . . . 89 8.3. Server Pseudo Filesystem . . . . . . . . . . . . . . . . 100
8.14. Migration, Replication and State . . . . . . . . . . . . 90 8.4. Multiple Roots . . . . . . . . . . . . . . . . . . . . . 101
8.14.1. Migration and State . . . . . . . . . . . . . . . . . 90 8.5. Filehandle Volatility . . . . . . . . . . . . . . . . . 101
8.14.2. Replication and State . . . . . . . . . . . . . . . . 91 8.6. Exported Root . . . . . . . . . . . . . . . . . . . . . 101
8.14.3. Notification of Migrated Lease . . . . . . . . . . . 91 8.7. Mount Point Crossing . . . . . . . . . . . . . . . . . . 102
8.14.4. Migration and the Lease_time Attribute . . . . . . . 92 8.8. Security Policy and Name Space Presentation . . . . . . 102
9. Client-Side Caching . . . . . . . . . . . . . . . . . . . . . 93 9. File Locking and Share Reservations . . . . . . . . . . . . . 103
9.1. Performance Challenges for Client-Side Caching . . . . . 93 9.1. Locking . . . . . . . . . . . . . . . . . . . . . . . . 104
9.2. Delegation and Callbacks . . . . . . . . . . . . . . . . 94 9.1.1. Client ID . . . . . . . . . . . . . . . . . . . . . 104
9.2.1. Delegation Recovery . . . . . . . . . . . . . . . . . 95 9.1.2. Server Release of Clientid . . . . . . . . . . . . . 107
9.3. Data Caching . . . . . . . . . . . . . . . . . . . . . . 97 9.1.3. lock_owner and stateid Definition . . . . . . . . . 107
9.3.1. Data Caching and OPENs . . . . . . . . . . . . . . . 98 9.1.4. Use of the stateid and Locking . . . . . . . . . . . 109
9.3.2. Data Caching and File Locking . . . . . . . . . . . . 99 9.1.5. Sequencing of Lock Requests . . . . . . . . . . . . 111
9.3.3. Data Caching and Mandatory File Locking . . . . . . . 100 9.1.6. Recovery from Replayed Requests . . . . . . . . . . 112
9.3.4. Data Caching and File Identity . . . . . . . . . . . 101 9.1.7. Releasing lock_owner State . . . . . . . . . . . . . 112
9.4. Open Delegation . . . . . . . . . . . . . . . . . . . . 102 9.1.8. Use of Open Confirmation . . . . . . . . . . . . . . 113
9.4.1. Open Delegation and Data Caching . . . . . . . . . . 104 9.2. Lock Ranges . . . . . . . . . . . . . . . . . . . . . . 114
9.4.2. Open Delegation and File Locks . . . . . . . . . . . 105 9.3. Upgrading and Downgrading Locks . . . . . . . . . . . . 114
9.4.3. Handling of CB_GETATTR . . . . . . . . . . . . . . . 106 9.4. Blocking Locks . . . . . . . . . . . . . . . . . . . . . 115
9.4.4. Recall of Open Delegation . . . . . . . . . . . . . . 109 9.5. Lease Renewal . . . . . . . . . . . . . . . . . . . . . 115
9.4.5. Clients that Fail to Honor Delegation Recalls . . . . 111 9.6. Crash Recovery . . . . . . . . . . . . . . . . . . . . . 116
9.4.6. Delegation Revocation . . . . . . . . . . . . . . . . 111 9.6.1. Client Failure and Recovery . . . . . . . . . . . . 117
9.5. Data Caching and Revocation . . . . . . . . . . . . . . 112 9.6.2. Server Failure and Recovery . . . . . . . . . . . . 117
9.5.1. Revocation Recovery for Write Open Delegation . . . . 112 9.6.3. Network Partitions and Recovery . . . . . . . . . . 119
9.6. Attribute Caching . . . . . . . . . . . . . . . . . . . 113 9.7. Recovery from a Lock Request Timeout or Abort . . . . . 122
9.7. Data and Metadata Caching and Memory Mapped Files . . . 115 9.8. Server Revocation of Locks . . . . . . . . . . . . . . . 123
9.8. Name Caching . . . . . . . . . . . . . . . . . . . . . . 117 9.9. Share Reservations . . . . . . . . . . . . . . . . . . . 124
9.9. Directory Caching . . . . . . . . . . . . . . . . . . . 118 9.10. OPEN/CLOSE Operations . . . . . . . . . . . . . . . . . 125
10. Minor Versioning . . . . . . . . . . . . . . . . . . . . . . 119 9.10.1. Close and Retention of State Information . . . . . . 125
11. Internationalization . . . . . . . . . . . . . . . . . . . . 122 9.11. Open Upgrade and Downgrade . . . . . . . . . . . . . . . 126
11.1. Stringprep profile for the utf8str_cs type . . . . . . . 123 9.12. Short and Long Leases . . . . . . . . . . . . . . . . . 127
11.1.1. Intended applicability of the nfs4_cs_prep profile . 123 9.13. Clocks, Propagation Delay, and Calculating Lease
11.1.2. Character repertoire of nfs4_cs_prep . . . . . . . . 123 Expiration . . . . . . . . . . . . . . . . . . . . . . . 127
11.1.3. Mapping used by nfs4_cs_prep . . . . . . . . . . . . 123 9.14. Migration, Replication and State . . . . . . . . . . . . 128
11.1.4. Normalization used by nfs4_cs_prep . . . . . . . . . 124 9.14.1. Migration and State . . . . . . . . . . . . . . . . 128
11.1.5. Prohibited output for nfs4_cs_prep . . . . . . . . . 124 9.14.2. Replication and State . . . . . . . . . . . . . . . 129
11.1.6. Bidirectional output for nfs4_cs_prep . . . . . . . . 124 9.14.3. Notification of Migrated Lease . . . . . . . . . . . 129
11.2. Stringprep profile for the utf8str_cis type . . . . . . 124 9.14.4. Migration and the Lease_time Attribute . . . . . . . 130
11.2.1. Intended applicability of the nfs4_cis_prep profile . 125 10. Client-Side Caching . . . . . . . . . . . . . . . . . . . . . 131
11.2.2. Character repertoire of nfs4_cis_prep . . . . . . . . 125 10.1. Performance Challenges for Client-Side Caching . . . . . 131
11.2.3. Mapping used by nfs4_cis_prep . . . . . . . . . . . . 125 10.2. Delegation and Callbacks . . . . . . . . . . . . . . . . 132
11.2.4. Normalization used by nfs4_cis_prep . . . . . . . . . 125 10.2.1. Delegation Recovery . . . . . . . . . . . . . . . . 133
11.2.5. Prohibited output for nfs4_cis_prep . . . . . . . . . 125 10.3. Data Caching . . . . . . . . . . . . . . . . . . . . . . 135
11.2.6. Bidirectional output for nfs4_cis_prep . . . . . . . 126 10.3.1. Data Caching and OPENs . . . . . . . . . . . . . . . 136
11.3. Stringprep profile for the utf8str_mixed type . . . . . 126 10.3.2. Data Caching and File Locking . . . . . . . . . . . 137
11.3.1. Intended applicability of the nfs4_mixed_prep 10.3.3. Data Caching and Mandatory File Locking . . . . . . 138
profile . . . . . . . . . . . . . . . . . . . . . . . 126 10.3.4. Data Caching and File Identity . . . . . . . . . . . 139
11.3.2. Character repertoire of nfs4_mixed_prep . . . . . . . 126 10.4. Open Delegation . . . . . . . . . . . . . . . . . . . . 140
11.3.3. Mapping used by nfs4_mixed_prep . . . . . . . . . . . 126 10.4.1. Open Delegation and Data Caching . . . . . . . . . . 142
11.3.4. Normalization used by nfs4_mixed_prep . . . . . . . . 126 10.4.2. Open Delegation and File Locks . . . . . . . . . . . 143
11.3.5. Prohibited output for nfs4_mixed_prep . . . . . . . . 126 10.4.3. Handling of CB_GETATTR . . . . . . . . . . . . . . . 144
11.3.6. Bidirectional output for nfs4_mixed_prep . . . . . . 127 10.4.4. Recall of Open Delegation . . . . . . . . . . . . . 147
11.4. UTF-8 Related Errors . . . . . . . . . . . . . . . . . . 127 10.4.5. Clients that Fail to Honor Delegation Recalls . . . 149
12. Error Definitions . . . . . . . . . . . . . . . . . . . . . . 128 10.4.6. Delegation Revocation . . . . . . . . . . . . . . . 149
13. NFS version 4 Requests . . . . . . . . . . . . . . . . . . . 133 10.5. Data Caching and Revocation . . . . . . . . . . . . . . 150
13.1. Compound Procedure . . . . . . . . . . . . . . . . . . . 133 10.5.1. Revocation Recovery for Write Open Delegation . . . 150
13.2. Evaluation of a Compound Request . . . . . . . . . . . . 134 10.6. Attribute Caching . . . . . . . . . . . . . . . . . . . 151
13.3. Synchronous Modifying Operations . . . . . . . . . . . . 135 10.7. Data and Metadata Caching and Memory Mapped Files . . . 153
13.4. Operation Values . . . . . . . . . . . . . . . . . . . . 135 10.8. Name Caching . . . . . . . . . . . . . . . . . . . . . . 155
14. NFS version 4 Procedures . . . . . . . . . . . . . . . . . . 135 10.9. Directory Caching . . . . . . . . . . . . . . . . . . . 156
14.1. Procedure 0: NULL - No Operation . . . . . . . . . . . . 135 11. Minor Versioning . . . . . . . . . . . . . . . . . . . . . . 157
14.2. Procedure 1: COMPOUND - Compound Operations . . . . . . 136 12. Internationalization . . . . . . . . . . . . . . . . . . . . 160
14.3. Operation 3: ACCESS - Check Access Rights . . . . . . . 139 12.1. Use of UTF-8 . . . . . . . . . . . . . . . . . . . . . . 161
14.4. Operation 4: CLOSE - Close File . . . . . . . . . . . . 142 12.1.1. Relation to Stringprep . . . . . . . . . . . . . . . 161
14.5. Operation 5: COMMIT - Commit Cached Data . . . . . . . . 143 12.1.2. Normalization, Equivalence, and Confusability . . . 162
14.6. Operation 6: CREATE - Create a Non-Regular File Object . 146 12.2. String Type Overview . . . . . . . . . . . . . . . . . . 164
14.7. Operation 7: DELEGPURGE - Purge Delegations Awaiting 12.2.1. Overall String Class Divisions . . . . . . . . . . . 164
Recovery . . . . . . . . . . . . . . . . . . . . . . . . 149 12.2.2. Divisions by Typedef Parent types . . . . . . . . . 165
14.8. Operation 8: DELEGRETURN - Return Delegation . . . . . . 150 12.2.3. Individual Types and Their Handling . . . . . . . . 166
14.9. Operation 9: GETATTR - Get Attributes . . . . . . . . . 151 12.3. Errors Related to Strings . . . . . . . . . . . . . . . 167
14.10. Operation 10: GETFH - Get Current Filehandle . . . . . . 153 12.4. Types with Pre-processing to Resolve Mixture Issues . . 168
14.11. Operation 11: LINK - Create Link to a File . . . . . . . 154 12.4.1. Processing of Principal Strings . . . . . . . . . . 168
14.12. Operation 12: LOCK - Create Lock . . . . . . . . . . . . 156 12.4.2. Processing of Server Id Strings . . . . . . . . . . 168
14.13. Operation 13: LOCKT - Test For Lock . . . . . . . . . . 160 12.5. String Types without Internationalization Processing . . 169
14.14. Operation 14: LOCKU - Unlock File . . . . . . . . . . . 162 12.6. Types with Processing Defined by Other Internet Areas . 169
14.15. Operation 15: LOOKUP - Lookup Filename . . . . . . . . . 164 12.7. String Types with NFS-specific Processing . . . . . . . 170
14.16. Operation 16: LOOKUPP - Lookup Parent Directory . . . . 166 12.7.1. Handling of File Name Components . . . . . . . . . . 171
14.17. Operation 17: NVERIFY - Verify Difference in 12.7.2. Processing of Link Text . . . . . . . . . . . . . . 178
Attributes . . . . . . . . . . . . . . . . . . . . . . . 167 12.7.3. Processing of Principal Prefixes . . . . . . . . . . 179
14.18. Operation 18: OPEN - Open a Regular File . . . . . . . . 169 13. Error Values . . . . . . . . . . . . . . . . . . . . . . . . 179
14.19. Operation 19: OPENATTR - Open Named Attribute 13.1. Error Definitions . . . . . . . . . . . . . . . . . . . 180
Directory . . . . . . . . . . . . . . . . . . . . . . . 179 13.1.1. General Errors . . . . . . . . . . . . . . . . . . . 181
14.20. Operation 20: OPEN_CONFIRM - Confirm Open . . . . . . . 181 13.1.2. Filehandle Errors . . . . . . . . . . . . . . . . . 183
14.21. Operation 21: OPEN_DOWNGRADE - Reduce Open File Access . 184 13.1.3. Compound Structure Errors . . . . . . . . . . . . . 184
14.22. Operation 22: PUTFH - Set Current Filehandle . . . . . . 185 13.1.4. File System Errors . . . . . . . . . . . . . . . . . 185
14.23. Operation 23: PUTPUBFH - Set Public Filehandle . . . . . 186 13.1.5. State Management Errors . . . . . . . . . . . . . . 187
14.24. Operation 24: PUTROOTFH - Set Root Filehandle . . . . . 188 13.1.6. Security Errors . . . . . . . . . . . . . . . . . . 188
14.25. Operation 25: READ - Read from File . . . . . . . . . . 188 13.1.7. Name Errors . . . . . . . . . . . . . . . . . . . . 188
14.26. Operation 26: READDIR - Read Directory . . . . . . . . . 191 13.1.8. Locking Errors . . . . . . . . . . . . . . . . . . . 189
14.27. Operation 27: READLINK - Read Symbolic Link . . . . . . 195 13.1.9. Reclaim Errors . . . . . . . . . . . . . . . . . . . 190
14.28. Operation 28: REMOVE - Remove Filesystem Object . . . . 196 13.1.10. Client Management Errors . . . . . . . . . . . . . . 191
14.29. Operation 29: RENAME - Rename Directory Entry . . . . . 199 13.1.11. Attribute Handling Errors . . . . . . . . . . . . . 191
14.30. Operation 30: RENEW - Renew a Lease . . . . . . . . . . 202 13.2. Operations and their valid errors . . . . . . . . . . . 192
14.31. Operation 31: RESTOREFH - Restore Saved Filehandle . . . 204 13.3. Callback operations and their valid errors . . . . . . . 199
14.32. Operation 32: SAVEFH - Save Current Filehandle . . . . . 205 13.4. Errors and the operations that use them . . . . . . . . 199
14.33. Operation 33: SECINFO - Obtain Available Security . . . 206 14. NFS version 4 Requests . . . . . . . . . . . . . . . . . . . 204
14.34. Operation 34: SETATTR - Set Attributes . . . . . . . . . 210 14.1. Compound Procedure . . . . . . . . . . . . . . . . . . . 204
14.35. Operation 35: SETCLIENTID - Negotiate Clientid . . . . . 213 14.2. Evaluation of a Compound Request . . . . . . . . . . . . 205
14.36. Operation 36: SETCLIENTID_CONFIRM - Confirm Clientid . . 216 14.3. Synchronous Modifying Operations . . . . . . . . . . . . 206
14.37. Operation 37: VERIFY - Verify Same Attributes . . . . . 220 14.4. Operation Values . . . . . . . . . . . . . . . . . . . . 206
14.38. Operation 38: WRITE - Write to File . . . . . . . . . . 222 15. NFS version 4 Procedures . . . . . . . . . . . . . . . . . . 206
14.39. Operation 39: RELEASE_LOCKOWNER - Release Lockowner 15.1. Procedure 0: NULL - No Operation . . . . . . . . . . . . 206
State . . . . . . . . . . . . . . . . . . . . . . . . . 226 15.2. Procedure 1: COMPOUND - Compound Operations . . . . . . 207
14.40. Operation 10044: ILLEGAL - Illegal operation . . . . . . 228 15.3. Operation 3: ACCESS - Check Access Rights . . . . . . . 209
15. NFS version 4 Callback Procedures . . . . . . . . . . . . . . 228 15.4. Operation 4: CLOSE - Close File . . . . . . . . . . . . 212
15.1. Procedure 0: CB_NULL - No Operation . . . . . . . . . . 229 15.5. Operation 5: COMMIT - Commit Cached Data . . . . . . . . 213
15.2. Procedure 1: CB_COMPOUND - Compound Operations . . . . . 229 15.6. Operation 6: CREATE - Create a Non-Regular File Object . 216
15.2.7. Operation 3: CB_GETATTR - Get Attributes . . . . . . 231 15.7. Operation 7: DELEGPURGE - Purge Delegations Awaiting
15.2.8. Operation 4: CB_RECALL - Recall an Open Delegation . 232 Recovery . . . . . . . . . . . . . . . . . . . . . . . . 218
15.2.9. Operation 10044: CB_ILLEGAL - Illegal Callback 15.8. Operation 8: DELEGRETURN - Return Delegation . . . . . . 219
Operation . . . . . . . . . . . . . . . . . . . . . . 234 15.9. Operation 9: GETATTR - Get Attributes . . . . . . . . . 220
16. Security Considerations . . . . . . . . . . . . . . . . . . . 234 15.10. Operation 10: GETFH - Get Current Filehandle . . . . . . 221
17. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 236 15.11. Operation 11: LINK - Create Link to a File . . . . . . . 222
17.1. Named Attribute Definition . . . . . . . . . . . . . . . 236 15.12. Operation 12: LOCK - Create Lock . . . . . . . . . . . . 224
17.2. ONC RPC Network Identifiers (netids) . . . . . . . . . . 236 15.13. Operation 13: LOCKT - Test For Lock . . . . . . . . . . 228
18. References . . . . . . . . . . . . . . . . . . . . . . . . . 238 15.14. Operation 14: LOCKU - Unlock File . . . . . . . . . . . 229
18.1. Normative References . . . . . . . . . . . . . . . . . . 238 15.15. Operation 15: LOOKUP - Lookup Filename . . . . . . . . . 230
18.2. Informative References . . . . . . . . . . . . . . . . . 238 15.16. Operation 16: LOOKUPP - Lookup Parent Directory . . . . 232
Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . 240 15.17. Operation 17: NVERIFY - Verify Difference in
Appendix B. RFC Editor Notes . . . . . . . . . . . . . . . . . . 240 Attributes . . . . . . . . . . . . . . . . . . . . . . . 233
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 240 15.18. Operation 18: OPEN - Open a Regular File . . . . . . . . 234
15.19. Operation 19: OPENATTR - Open Named Attribute
Directory . . . . . . . . . . . . . . . . . . . . . . . 243
15.20. Operation 20: OPEN_CONFIRM - Confirm Open . . . . . . . 244
15.21. Operation 21: OPEN_DOWNGRADE - Reduce Open File Access . 246
15.22. Operation 22: PUTFH - Set Current Filehandle . . . . . . 248
15.23. Operation 23: PUTPUBFH - Set Public Filehandle . . . . . 248
15.24. Operation 24: PUTROOTFH - Set Root Filehandle . . . . . 250
15.25. Operation 25: READ - Read from File . . . . . . . . . . 250
15.26. Operation 26: READDIR - Read Directory . . . . . . . . . 252
15.27. Operation 27: READLINK - Read Symbolic Link . . . . . . 256
15.28. Operation 28: REMOVE - Remove Filesystem Object . . . . 257
15.29. Operation 29: RENAME - Rename Directory Entry . . . . . 259
15.30. Operation 30: RENEW - Renew a Lease . . . . . . . . . . 261
15.31. Operation 31: RESTOREFH - Restore Saved Filehandle . . . 262
15.32. Operation 32: SAVEFH - Save Current Filehandle . . . . . 263
15.33. Operation 33: SECINFO - Obtain Available Security . . . 263
15.34. Operation 34: SETATTR - Set Attributes . . . . . . . . . 266
15.35. Operation 35: SETCLIENTID - Negotiate Clientid . . . . . 269
15.36. Operation 36: SETCLIENTID_CONFIRM - Confirm Clientid . . 272
15.37. Operation 37: VERIFY - Verify Same Attributes . . . . . 276
15.38. Operation 38: WRITE - Write to File . . . . . . . . . . 277
15.39. Operation 39: RELEASE_LOCKOWNER - Release Lockowner
State . . . . . . . . . . . . . . . . . . . . . . . . . 281
15.40. Operation 10044: ILLEGAL - Illegal operation . . . . . . 282
16. NFS version 4 Callback Procedures . . . . . . . . . . . . . . 283
16.1. Procedure 0: CB_NULL - No Operation . . . . . . . . . . 283
16.2. Procedure 1: CB_COMPOUND - Compound Operations . . . . . 284
16.2.6. Operation 3: CB_GETATTR - Get Attributes . . . . . . 285
16.2.7. Operation 4: CB_RECALL - Recall an Open Delegation . 286
16.2.8. Operation 10044: CB_ILLEGAL - Illegal Callback
Operation . . . . . . . . . . . . . . . . . . . . . 287
17. Security Considerations . . . . . . . . . . . . . . . . . . . 288
18. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 290
18.1. Named Attribute Definition . . . . . . . . . . . . . . . 290
18.2. ONC RPC Network Identifiers (netids) . . . . . . . . . . 290
19. References . . . . . . . . . . . . . . . . . . . . . . . . . 291
19.1. Normative References . . . . . . . . . . . . . . . . . . 291
19.2. Informative References . . . . . . . . . . . . . . . . . 292
Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . 294
Appendix B. RFC Editor Notes . . . . . . . . . . . . . . . . . . 294
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 294
1. Introduction 1. Introduction
1.1. Changes since RFC 3530 1.1. Changes since RFC 3530
This document obsoletes RFC 3530 [10] as the authoritative document This document, together with the companion XDR description document
describing NFSv4, without introducing any over-the-wire protocol [2], obsoletes RFC 3530 [11] as the authoritative document describing
changes. The main changes from RFC 3530 are: NFSv4. It does not introduce any over-the-wire protocol changes, in
the sense that previously valid requests remain valid.
However, some requests previously defined as invalid, although not
generally rejected, are now explicitly allowed, in that
internationalization handling has been generalized and liberalized.
The main changes from RFC 3530 are:
o The RPC definition has been moved to a companion document [2] o The XDR definition has been moved to a companion document [2]
o Updates for the latest IETF intellectual property statements o Updates for the latest IETF intellectual property statements
o There is a restructured and more complete explanation of multi-
server namespace features. In particular, this explanation
explicitly describes handling of inter-server referrals, even
where neither migration nor replication is involved.
o More liberal handling of internationalization for file names and
user and group names, with the elimination of restrictions imposed
by stringprep, with the recognition that rules for the forms of
these names are the province of the receiving entity.
o Updating handling of domain names to reflect IDNA.
o Restructuring of string types to more appropriately reflect the
reality of required string processing.
o LIPKEY SPKM/3 has been moved from being mandatory to optional o LIPKEY SPKM/3 has been moved from being mandatory to optional
o Some clarification on a client re-establishing callback o Some clarification on a client re-establishing callback
information to the new server if state has been migrated information to the new server if state has been migrated
1.2. Changes since RFC 3010 1.2. Changes since RFC 3010
This definition of the NFS version 4 protocol replaces or obsoletes This definition of the NFS version 4 protocol replaces or obsoletes
the definition present in [11]. While portions of the two documents the definition present in [12]. While portions of the two documents
have remained the same, there have been substantive changes in have remained the same, there have been substantive changes in
others. The changes made between [11] and this document represent others. The changes made between [12] and this document represent
implementation experience and further review of the protocol. While implementation experience and further review of the protocol. While
some modifications were made for ease of implementation or some modifications were made for ease of implementation or
clarification, most updates represent errors or situations where the clarification, most updates represent errors or situations where the
[11] definition was untenable. [12] definition was untenable.
The following list is not all inclusive of all changes but presents The following list is not all inclusive of all changes but presents
some of the most notable changes or additions made: some of the most notable changes or additions made:
o The state model has added an open_owner4 identifier. This was o The state model has added an open_owner4 identifier. This was
done to accommodate Posix based clients and the model they use for done to accommodate Posix based clients and the model they use for
file locking. For Posix clients, an open_owner4 would correspond file locking. For Posix clients, an open_owner4 would correspond
to a file descriptor potentially shared amongst a set of processes to a file descriptor potentially shared amongst a set of processes
and the lock_owner4 identifier would correspond to a process that and the lock_owner4 identifier would correspond to a process that
is locking a file. is locking a file.
skipping to change at page 9, line 34 skipping to change at page 11, line 8
o Remove use of the pathname4 data type from LOOKUP and OPEN in o Remove use of the pathname4 data type from LOOKUP and OPEN in
favor of having the client construct a sequence of LOOKUP favor of having the client construct a sequence of LOOKUP
operations to achieve the same effect. operations to achieve the same effect.
o Clarification of the internationalization issues and adoption of o Clarification of the internationalization issues and adoption of
the new stringprep profile framework. the new stringprep profile framework.
1.3. NFS Version 4 Goals 1.3. NFS Version 4 Goals
The NFS version 4 protocol is a further revision of the NFS protocol The NFS version 4 protocol is a further revision of the NFS protocol
defined already by versions 2 [12] and 3 [13]. It retains the defined already by versions 2 [13] and 3 [14]. It retains the
essential characteristics of previous versions: design for easy essential characteristics of previous versions: design for easy
recovery, independent of transport protocols, operating systems and recovery, independent of transport protocols, operating systems and
filesystems, simplicity, and good performance. The NFS version 4 filesystems, simplicity, and good performance. The NFS version 4
revision has the following goals: revision has the following goals:
o Improved access and good performance on the Internet. o Improved access and good performance on the Internet.
The protocol is designed to transit firewalls easily, perform well The protocol is designed to transit firewalls easily, perform well
where latency is high and bandwidth is low, and scale to very where latency is high and bandwidth is low, and scale to very
large numbers of clients per server. large numbers of clients per server.
skipping to change at page 10, line 16 skipping to change at page 11, line 39
The protocol features a filesystem model that provides a useful, The protocol features a filesystem model that provides a useful,
common set of features that does not unduly favor one filesystem common set of features that does not unduly favor one filesystem
or operating system over another. or operating system over another.
o Designed for protocol extensions. o Designed for protocol extensions.
The protocol is designed to accept standard extensions that do not The protocol is designed to accept standard extensions that do not
compromise backward compatibility. compromise backward compatibility.
1.4. Inconsistencies of this Document with Section 18 1.4. Inconsistencies of this Document with the companion document NFS
Version 4 Protocol
Section 18, RPC Definition File, contains the definitions in XDR [2], NFS Version 4 Protocol, contains the definitions in XDR
description language of the constructs used by the protocol. Prior description language of the constructs used by the protocol. Inside
to Section 18, several of the constructs are reproduced for purposes this document, several of the constructs are reproduced for purposes
of explanation. The reader is warned of the possibility of errors in of explanation. The reader is warned of the possibility of errors in
the reproduced constructs outside of Section 18. For any part of the the reproduced constructs outside of [2]. For any part of the
document that is inconsistent with Section 18, Section 18 is to be document that is inconsistent with [2], [2] is to be considered
considered authoritative. authoritative.
1.5. Overview of NFS version 4 Features 1.5. Overview of NFS version 4 Features
To provide a reasonable context for the reader, the major features of To provide a reasonable context for the reader, the major features of
NFS version 4 protocol will be reviewed in brief. This will be done NFS version 4 protocol will be reviewed in brief. This will be done
to provide an appropriate context for both the reader who is familiar to provide an appropriate context for both the reader who is familiar
with the previous versions of the NFS protocol and the reader that is with the previous versions of the NFS protocol and the reader that is
new to the NFS protocols. For the reader new to the NFS protocols, new to the NFS protocols. For the reader new to the NFS protocols,
there is still a fundamental knowledge that is expected. The reader there is still a fundamental knowledge that is expected. The reader
should be familiar with the XDR and RPC protocols as described in [3] should be familiar with the XDR and RPC protocols as described in [3]
and [14]. A basic knowledge of filesystems and distributed and [15]. A basic knowledge of filesystems and distributed
filesystems is expected as well. filesystems is expected as well.
1.5.1. RPC and Security 1.5.1. RPC and Security
As with previous versions of NFS, the External Data Representation As with previous versions of NFS, the External Data Representation
(XDR) and Remote Procedure Call (RPC) mechanisms used for the NFS (XDR) and Remote Procedure Call (RPC) mechanisms used for the NFS
version 4 protocol are those defined in [3] and [14]. To meet end to version 4 protocol are those defined in [3] and [15]. To meet end to
end security requirements, the RPCSEC_GSS framework [4] will be used end security requirements, the RPCSEC_GSS framework [4] will be used
to extend the basic RPC security. With the use of RPCSEC_GSS, to extend the basic RPC security. With the use of RPCSEC_GSS,
various mechanisms can be provided to offer authentication, various mechanisms can be provided to offer authentication,
integrity, and privacy to the NFS version 4 protocol. Kerberos V5 integrity, and privacy to the NFS version 4 protocol. Kerberos V5
will be used as described in [15] to provide one security framework. will be used as described in [16] to provide one security framework.
The LIPKEY GSS-API mechanism described in [5] will be used to provide The LIPKEY GSS-API mechanism described in [5] will be used to provide
for the use of user password and server public key by the NFS version for the use of user password and server public key by the NFS version
4 protocol. With the use of RPCSEC_GSS, other mechanisms may also be 4 protocol. With the use of RPCSEC_GSS, other mechanisms may also be
specified and used for NFS version 4 security. specified and used for NFS version 4 security.
To enable in-band security negotiation, the NFS version 4 protocol To enable in-band security negotiation, the NFS version 4 protocol
has added a new operation which provides the client a method of has added a new operation which provides the client a method of
querying the server about its policies regarding which security querying the server about its policies regarding which security
mechanisms must be used for access to the server's filesystem mechanisms must be used for access to the server's filesystem
resources. With this, the client can securely match the security resources. With this, the client can securely match the security
skipping to change at page 13, line 17 skipping to change at page 14, line 45
application specific data with a regular file or directory. application specific data with a regular file or directory.
One significant addition to the recommended set of file attributes is One significant addition to the recommended set of file attributes is
the Access Control List (ACL) attribute. This attribute provides for the Access Control List (ACL) attribute. This attribute provides for
directory and file access control beyond the model used in previous directory and file access control beyond the model used in previous
versions of the NFS protocol. The ACL definition allows for versions of the NFS protocol. The ACL definition allows for
specification of user and group level access control. specification of user and group level access control.
1.5.3.3. Filesystem Replication and Migration 1.5.3.3. Filesystem Replication and Migration
With the use of a special file attribute, the ability to migrate or With the use of a special file attribute, the ability to inform the
replicate server filesystems is enabled within the protocol. The client of filesystem locations on another server is enabled. The
filesystem locations attribute provides a method for the client to filesystem locations attribute provides a method for the client to
probe the server about the location of a filesystem. In the event of probe the server about the location of a filesystem. In the event
a migration of a filesystem, the client will receive an error when that a filesystem is not present on a server, the client will receive an
operating on the filesystem and it can then query as to the new file error when attempting to operate on the filesystem and it can then
system location. Similar steps are used for replication, the client query as to the correct filesystem location. This allows the
is able to query the server for the multiple available locations of a construction of multi-server namespaces.
particular filesystem. From this information, the client can use its
own policies to access the appropriate filesystem location. These features also allow file system replication and migration. In
the event of a migration of a filesystem, the client will receive an
error when operating on the filesystem and it can then query the
attribute to determine the new file system location. Similar steps
are used for replication, the client is able to query the server for
the multiple available locations of a particular filesystem. From
this information, the client can use its own policies to access the
appropriate filesystem location.
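The client-side choice among several reported locations can be sketched as follows. This is only an illustration under stated assumptions: the FsLocation class loosely mirrors the fs_location4 structure, while choose_location, its preference-list policy, and the example host names are hypothetical and not defined by the protocol.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FsLocation:
    """Hypothetical client-side mirror of fs_location4."""
    servers: List[str]   # server names or addresses offering the filesystem
    rootpath: List[str]  # path components of the filesystem root on them


def choose_location(locations: List[FsLocation],
                    preferred_servers: List[str]) -> Optional[FsLocation]:
    """Pick a location using a simple, assumed client policy.

    Servers earlier in preferred_servers win; if none match, fall back
    to the first location the server reported, or None if it reported
    no locations at all.
    """
    for wanted in preferred_servers:
        for loc in locations:
            if wanted in loc.servers:
                return loc
    return locations[0] if locations else None
```

A real client would typically weigh other inputs as well (reachability, locality, load); the preference list stands in for whatever policy the client applies.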
1.5.4. OPEN and CLOSE 1.5.4. OPEN and CLOSE
The NFS version 4 protocol introduces OPEN and CLOSE operations. The The NFS version 4 protocol introduces OPEN and CLOSE operations. The
OPEN operation provides a single point where file lookup, creation, OPEN operation provides a single point where file lookup, creation,
and share semantics can be combined. The CLOSE operation also and share semantics can be combined. The CLOSE operation also
provides for the release of state accumulated by OPEN. provides for the release of state accumulated by OPEN.
1.5.5. File locking 1.5.5. File Locking
With the NFS version 4 protocol, the support for byte range file With the NFS version 4 protocol, the support for byte range file
locking is part of the NFS protocol. The file locking support is locking is part of the NFS protocol. The file locking support is
structured so that an RPC callback mechanism is not required. This structured so that an RPC callback mechanism is not required. This
is a departure from the previous versions of the NFS file locking is a departure from the previous versions of the NFS file locking
protocol, Network Lock Manager (NLM). The state associated with file protocol, Network Lock Manager (NLM). The state associated with file
locks is maintained at the server under a lease-based model. The locks is maintained at the server under a lease-based model. The
server defines a single lease period for all state held by a NFS server defines a single lease period for all state held by a NFS
client. If the client does not renew its lease within the defined client. If the client does not renew its lease within the defined
period, all state associated with the client's lease may be released period, all state associated with the client's lease may be released
skipping to change at page 16, line 25 skipping to change at page 18, line 12
Stateids composed of all bits 0 or all bits 1 have special meaning Stateids composed of all bits 0 or all bits 1 have special meaning
and are reserved values. and are reserved values.
Verifier A 64-bit quantity generated by the client that the server Verifier A 64-bit quantity generated by the client that the server
can use to determine if the client has restarted and lost all can use to determine if the client has restarted and lost all
previous lock state. previous lock state.
2. Protocol Data Types 2. Protocol Data Types
The syntax and semantics to describe the data types of the NFS The syntax and semantics to describe the data types of the NFS
version 4 protocol are defined in the XDR [14] and RPC [3] documents. version 4 protocol are defined in the XDR [15] and RPC [3] documents.
The next sections build upon the XDR data types to define types and The next sections build upon the XDR data types to define types and
structures specific to this protocol. structures specific to this protocol.
2.1. Basic Data Types 2.1. Basic Data Types
These are the base NFSv4 data types. These are the base NFSv4 data types.
+---------------+---------------------------------------------------+ +----------------+--------------------------------------------------+
| Data Type | Definition | | Data Type | Definition |
+---------------+---------------------------------------------------+ +----------------+--------------------------------------------------+
| int32_t | typedef int int32_t; | | int32_t | typedef int int32_t; |
| uint32_t | typedef unsigned int uint32_t; | | uint32_t | typedef unsigned int uint32_t; |
| int64_t | typedef hyper int64_t; | | int64_t | typedef hyper int64_t; |
| uint64_t | typedef unsigned hyper uint64_t; | | uint64_t | typedef unsigned hyper uint64_t; |
| attrlist4 | typedef opaque attrlist4<>; | | attrlist4 | typedef opaque attrlist4<>; |
| | Used for file/directory attributes. | | | Used for file/directory attributes. |
| bitmap4 | typedef uint32_t bitmap4<>; | | bitmap4 | typedef uint32_t bitmap4<>; |
| | Used in attribute array encoding. | | | Used in attribute array encoding. |
| changeid4 | typedef uint64_t changeid4; | | changeid4 | typedef uint64_t changeid4; |
| | Used in the definition of change_info4. | | | Used in the definition of change_info4. |
| clientid4 | typedef uint64_t clientid4; | | clientid4 | typedef uint64_t clientid4; |
| | Shorthand reference to client identification. | | | Shorthand reference to client identification. |
| count4 | typedef uint32_t count4; | | count4 | typedef uint32_t count4; |
| | Various count parameters (READ, WRITE, COMMIT). | | | Various count parameters (READ, WRITE, COMMIT). |
| length4 | typedef uint64_t length4; | | length4 | typedef uint64_t length4; |
| | Describes LOCK lengths. | | | Describes LOCK lengths. |
| mode4 | typedef uint32_t mode4; | | mode4 | typedef uint32_t mode4; |
| | Mode attribute data type. | | | Mode attribute data type. |
| nfs_cookie4 | typedef uint64_t nfs_cookie4; | | nfs_cookie4 | typedef uint64_t nfs_cookie4; |
| | Opaque cookie value for READDIR. | | | Opaque cookie value for READDIR. |
| nfs_fh4 | typedef opaque nfs_fh4<NFS4_FHSIZE>; | | nfs_fh4 | typedef opaque nfs_fh4<NFS4_FHSIZE>; |
| | Filehandle definition. | | | Filehandle definition. |
| nfs_ftype4 | enum nfs_ftype4; | | nfs_ftype4 | enum nfs_ftype4; |
| | Various defined file types. | | | Various defined file types. |
| nfsstat4 | enum nfsstat4; | | nfsstat4 | enum nfsstat4; |
| | Return value for operations. | | | Return value for operations. |
| offset4 | typedef uint64_t offset4; | | offset4 | typedef uint64_t offset4; |
| | Various offset designations (READ, WRITE, LOCK, | | | Various offset designations (READ, WRITE, LOCK, |
| | COMMIT). | | | COMMIT). |
| qop4 | typedef uint32_t qop4; | | qop4 | typedef uint32_t qop4; |
| | Quality of protection designation in SECINFO. | | | Quality of protection designation in SECINFO. |
| sec_oid4 | typedef opaque sec_oid4<>; | | sec_oid4 | typedef opaque sec_oid4<>; |
| | Security Object Identifier. The sec_oid4 data | | | Security Object Identifier. The sec_oid4 data |
| | type is not really opaque. Instead it contains an | | | type is not really opaque. Instead it contains |
| | ASN.1 OBJECT IDENTIFIER as used by GSS-API in the | | | an ASN.1 OBJECT IDENTIFIER as used by GSS-API in |
| | mech_type argument to GSS_Init_sec_context. See | | | the mech_type argument to GSS_Init_sec_context. |
| | [6] for details. | | | See [6] for details. |
| seqid4 | typedef uint32_t seqid4; | | seqid4 | typedef uint32_t seqid4; |
| | Sequence identifier used for file locking. | | | Sequence identifier used for file locking. |
| utf8string | typedef opaque utf8string<>; | | utf8string | typedef opaque utf8string<>; |
| | UTF-8 encoding for strings. | | | UTF-8 encoding for strings. |
| utf8str_cis | typedef utf8string utf8str_cis; | | utf8_should | typedef utf8string utf8_should; |
| | Case-insensitive UTF-8 string. | | | String expected to be UTF8 but no validation |
| utf8str_cs | typedef utf8string utf8str_cs; | | utf8val_should | typedef utf8string utf8val_should; |
| | Case-sensitive UTF-8 string. | | | String SHOULD be sent UTF8 and SHOULD be |
| utf8str_mixed | typedef utf8string utf8str_mixed; | | | validated |
| | UTF-8 strings with a case sensitive prefix and a | | utf8val_must | typedef utf8string utf8val_must; |
| | case insensitive suffix. | | | String MUST be sent UTF8 and MUST be validated |
| component4 | typedef utf8str_cs component4; | | ascii_must | typedef utf8string ascii_must; |
| | Represents path name components. | | | String MUST be sent as ASCII and thus is |
| linktext4 | typedef utf8str_cs linktext4; | | | automatically UTF8 |
| | Symbolic link contents. | | comptag4 | typedef utf8_should comptag4; |
| pathname4 | typedef component4 pathname4<>; | | | Tag should be UTF8 but is not checked |
| | Represents path name for fs_locations. | | component4 | typedef utf8val_should component4; |
| nfs_lockid4 | typedef uint64_t nfs_lockid4; | | | Represents path name components. |
| verifier4 | typedef opaque verifier4[NFS4_VERIFIER_SIZE]; | | linktext4 | typedef utf8val_should linktext4; |
| | Verifier used for various operations (COMMIT, | | | Symbolic link contents. |
| | CREATE, EXCHANGE_ID, OPEN, READDIR, WRITE) | | pathname4 | typedef component4 pathname4<>; |
| | NFS4_VERIFIER_SIZE is defined as 8. | | | Represents path name for fs_locations. |
+---------------+---------------------------------------------------+ | nfs_lockid4 | typedef uint64_t nfs_lockid4; |
| verifier4 | typedef opaque verifier4[NFS4_VERIFIER_SIZE]; |
| | Verifier used for various operations (COMMIT, |
| | CREATE, EXCHANGE_ID, OPEN, READDIR, WRITE) |
| | NFS4_VERIFIER_SIZE is defined as 8. |
+----------------+--------------------------------------------------+
End of Base Data Types End of Base Data Types
Table 1 Table 1
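As an illustration of how several of the base types in Table 1 appear on the wire, the following sketch encodes them using the standard XDR rules (big-endian integers, 4-octet alignment, and a length prefix for variable-length data). The helper names are hypothetical and chosen for this example only; the authoritative definitions remain those of the XDR description.

```python
import struct

NFS4_VERIFIER_SIZE = 8  # value given in Table 1


def xdr_uint32(value: int) -> bytes:
    """uint32_t (count4, seqid4, mode4, ...): 4 octets, big-endian."""
    return struct.pack(">I", value)


def xdr_uint64(value: int) -> bytes:
    """uint64_t (clientid4, offset4, length4, ...): 8 octets, big-endian."""
    return struct.pack(">Q", value)


def xdr_opaque_var(data: bytes) -> bytes:
    """opaque<> (utf8string, attrlist4, ...): length, data, zero padding."""
    pad = (4 - len(data) % 4) % 4
    return xdr_uint32(len(data)) + data + b"\x00" * pad


def xdr_opaque_fixed(data: bytes, size: int) -> bytes:
    """opaque[size] (verifier4): data only, padded to a multiple of 4."""
    if len(data) != size:
        raise ValueError("fixed-length opaque has the wrong size")
    pad = (4 - size % 4) % 4
    return data + b"\x00" * pad


def xdr_bitmap4(words) -> bytes:
    """bitmap4: a counted array of uint32_t words."""
    return xdr_uint32(len(words)) + b"".join(xdr_uint32(w) for w in words)


def xdr_utf8string(text: str) -> bytes:
    """utf8string: the UTF-8 octets of the text as a variable opaque."""
    return xdr_opaque_var(text.encode("utf-8"))
```

For example, xdr_utf8string("nfs") yields a 4-octet length of 3, the three octets "nfs", and one octet of zero padding.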
2.2. Structured Data Types 2.2. Structured Data Types
2.2.1. nfstime4 2.2.1. nfstime4
struct nfstime4 { struct nfstime4 {
int64_t seconds; int64_t seconds;
uint32_t nseconds; uint32_t nseconds;
}; };
The nfstime4 structure gives the number of seconds and nanoseconds The nfstime4 structure gives the number of seconds and nanoseconds
since midnight or 0 hour January 1, 1970 Coordinated Universal Time since midnight or 0 hour January 1, 1970 Coordinated Universal Time
(UTC). Values greater than zero for the seconds field denote dates (UTC). Values greater than zero for the seconds field denote dates
skipping to change at page 19, line 28 skipping to change at page 21, line 28
uint64_t major; uint64_t major;
uint64_t minor; uint64_t minor;
}; };
This type is the filesystem identifier that is used as a mandatory This type is the filesystem identifier that is used as a mandatory
attribute. attribute.
2.2.6. fs_location4 2.2.6. fs_location4
struct fs_location4 { struct fs_location4 {
utf8str_cis server<>; utf8val_must server<>;
pathname4 rootpath; pathname4 rootpath;
}; };
2.2.7. fs_locations4 2.2.7. fs_locations4
struct fs_locations4 { struct fs_locations4 {
pathname4 fs_root; pathname4 fs_root;
fs_location4 locations<>; fs_location4 locations<>;
}; };
skipping to change at page 20, line 38 skipping to change at page 22, line 38
struct clientaddr4 { struct clientaddr4 {
/* see struct rpcb in RFC 1833 */ /* see struct rpcb in RFC 1833 */
string r_netid<>; /* network id */ string r_netid<>; /* network id */
string r_addr<>; /* universal address */ string r_addr<>; /* universal address */
}; };
The clientaddr4 structure is used as part of the SETCLIENTID The clientaddr4 structure is used as part of the SETCLIENTID
operation to either specify the address of the client that is using a operation to either specify the address of the client that is using a
clientid or as part of the callback registration. The r_netid and clientid or as part of the callback registration. The r_netid and
r_addr fields are specified in [16], but they are underspecified in r_addr fields are specified in [17], but they are underspecified in
[16] as far as what they should look like for specific protocols. [17] as far as what they should look like for specific protocols.
For TCP over IPv4 and for UDP over IPv4, the format of r_addr is the For TCP over IPv4 and for UDP over IPv4, the format of r_addr is the
US-ASCII string: US-ASCII string:
h1.h2.h3.h4.p1.p2 h1.h2.h3.h4.p1.p2
The prefix, "h1.h2.h3.h4", is the standard textual form for The prefix, "h1.h2.h3.h4", is the standard textual form for
representing an IPv4 address, which is always four octets long. representing an IPv4 address, which is always four octets long.
Assuming big-endian ordering, h1, h2, h3, and h4, are respectively, Assuming big-endian ordering, h1, h2, h3, and h4, are respectively,
the first through fourth octets each converted to ASCII-decimal. the first through fourth octets each converted to ASCII-decimal.
skipping to change at page 21, line 20 skipping to change at page 23, line 20
over IPv4 the value of r_netid is the string "udp". over IPv4 the value of r_netid is the string "udp".
For TCP over IPv6 and for UDP over IPv6, the format of r_addr is the For TCP over IPv6 and for UDP over IPv6, the format of r_addr is the
US-ASCII string: US-ASCII string:
x1:x2:x3:x4:x5:x6:x7:x8.p1.p2 x1:x2:x3:x4:x5:x6:x7:x8.p1.p2
The suffix "p1.p2" is the service port, and is computed the same way The suffix "p1.p2" is the service port, and is computed the same way
as with universal addresses for TCP and UDP over IPv4. The prefix, as with universal addresses for TCP and UDP over IPv4. The prefix,
"x1:x2:x3:x4:x5:x6:x7:x8", is the standard textual form for "x1:x2:x3:x4:x5:x6:x7:x8", is the standard textual form for
representing an IPv6 address as defined in Section 2.2 of [17]. representing an IPv6 address as defined in Section 2.2 of [18].
Additionally, the two alternative forms specified in Section 2.2 of Additionally, the two alternative forms specified in Section 2.2 of
[17] are also acceptable. [18] are also acceptable.
For TCP over IPv6 the value of r_netid is the string "tcp6". For UDP For TCP over IPv6 the value of r_netid is the string "tcp6". For UDP
over IPv6 the value of r_netid is the string "udp6". over IPv6 the value of r_netid is the string "udp6".
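The universal address formats above can be sketched in code as follows. This is a minimal illustration, not part of the protocol definition: the function names are hypothetical, and the port encoding assumes the usual convention that p1 and p2 are the high-order and low-order octets of the port number, each written in decimal (the passage describing this is elided from this excerpt).

```python
import ipaddress
from typing import Tuple


def make_uaddr(host: str, port: int) -> str:
    """Build an r_addr string from a literal IP address and a port."""
    if not 0 <= port <= 0xFFFF:
        raise ValueError("port out of range")
    ipaddress.ip_address(host)          # raises ValueError if not an IP literal
    p1, p2 = port >> 8, port & 0xFF     # assumed: high octet, then low octet
    return f"{host}.{p1}.{p2}"


def parse_uaddr(uaddr: str) -> Tuple[str, int]:
    """Split an r_addr string back into (host, port)."""
    # The last two dot-separated fields are the port octets; everything
    # before them is the address, which may itself contain dots or colons.
    host, p1, p2 = uaddr.rsplit(".", 2)
    ipaddress.ip_address(host)
    hi, lo = int(p1), int(p2)
    if not (0 <= hi <= 255 and 0 <= lo <= 255):
        raise ValueError("port octet out of range")
    return host, (hi << 8) | lo


def netid_for(host: str, transport: str = "tcp") -> str:
    """Return the r_netid ("tcp", "udp", "tcp6", "udp6") for an address."""
    version = ipaddress.ip_address(host).version
    return transport + ("6" if version == 6 else "")
```

Under these assumptions, port 2049 on 192.0.2.1 becomes "192.0.2.1.8.1", since 2049 = 8 * 256 + 1.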
2.2.11. cb_client4 2.2.11. cb_client4
struct cb_client4 { struct cb_client4 {
unsigned int cb_program; unsigned int cb_program;
clientaddr4 cb_location; clientaddr4 cb_location;
}; };
skipping to change at page 23, line 11 skipping to change at page 25, line 11
is read-only. The starting value of the seqid field is undefined. is read-only. The starting value of the seqid field is undefined.
The server is required to increment the seqid field monotonically at The server is required to increment the seqid field monotonically at
each transition of the stateid. This is important since the client each transition of the stateid. This is important since the client
will inspect the seqid in OPEN stateids to determine the order of will inspect the seqid in OPEN stateids to determine the order of
OPEN processing done by the server. OPEN processing done by the server.
3. RPC and Security Flavor 3. RPC and Security Flavor
The NFS version 4 protocol is a Remote Procedure Call (RPC) The NFS version 4 protocol is a Remote Procedure Call (RPC)
application that uses RPC version 2 and the corresponding eXternal application that uses RPC version 2 and the corresponding eXternal
Data Representation (XDR) as defined in [3] and [14]. The RPCSEC_GSS Data Representation (XDR) as defined in [3] and [15]. The RPCSEC_GSS
security flavor as defined in [4] MUST be used as the mechanism to security flavor as defined in [4] MUST be used as the mechanism to
deliver stronger security for the NFS version 4 protocol. deliver stronger security for the NFS version 4 protocol.
3.1. Ports and Transports 3.1. Ports and Transports
Historically, NFS version 2 and version 3 servers have resided on Historically, NFS version 2 and version 3 servers have resided on
port 2049. The registered port 2049 [18] for the NFS protocol should port 2049. The registered port 2049 [19] for the NFS protocol should
be the default configuration. Using the registered port for NFS be the default configuration. Using the registered port for NFS
services means the NFS client will not need to use the RPC binding services means the NFS client will not need to use the RPC binding
protocols as described in [16]; this will allow NFS to transit protocols as described in [17]; this will allow NFS to transit
firewalls. firewalls.
Where an NFS version 4 implementation supports operation over the IP Where an NFS version 4 implementation supports operation over the IP
network protocol, the supported transports between NFS and IP MUST be network protocol, the supported transports between NFS and IP MUST be
among the IETF-approved congestion control transport protocols, which among the IETF-approved congestion control transport protocols, which
include TCP and SCTP. To enhance the possibilities for include TCP and SCTP. To enhance the possibilities for
interoperability, an NFS version 4 implementation MUST support interoperability, an NFS version 4 implementation MUST support
operation over the TCP transport protocol, at least until such time operation over the TCP transport protocol, at least until such time
as a standards track RFC revises this requirement to use a different as a standards track RFC revises this requirement to use a different
IETF-approved congestion control transport protocol. IETF-approved congestion control transport protocol.
If TCP is used as the transport, the client and server SHOULD use If TCP is used as the transport, the client and server SHOULD use
persistent connections. This will prevent the weakening of TCP's persistent connections. This will prevent the weakening of TCP's
congestion control via short-lived connections and will improve congestion control via short-lived connections and will improve
performance for the WAN environment by eliminating the need for SYN performance for the WAN environment by eliminating the need for SYN
handshakes. handshakes.
As noted in the Security Considerations section, the authentication As noted in Section 17, the authentication model for NFS version 4
model for NFS version 4 has moved from machine-based to principal- has moved from machine-based to principal-based. However, this
based. However, this modification of the authentication model does modification of the authentication model does not imply a technical
not imply a technical requirement to move the TCP connection requirement to move the TCP connection management model from whole
management model from whole machine-based to one based on a per user machine-based to one based on a per user model. In particular, NFS
model. In particular, NFS over TCP client implementations have over TCP client implementations have traditionally multiplexed
traditionally multiplexed traffic for multiple users over a common traffic for multiple users over a common TCP connection between an
TCP connection between an NFS client and server. This has been true, NFS client and server. This has been true, regardless of whether the
regardless of whether the NFS client is using AUTH_SYS, AUTH_DH, NFS client is using AUTH_SYS, AUTH_DH, RPCSEC_GSS or any other
RPCSEC_GSS or any other flavor. Similarly, NFS over TCP server flavor. Similarly, NFS over TCP server implementations have assumed
implementations have assumed such a model and thus scale the such a model and thus scale the implementation of TCP connection
implementation of TCP connection management in proportion to the management in proportion to the number of expected client machines.
number of expected client machines. It is intended that NFS version
4 will not modify this connection management model. NFS version 4 It is intended that NFS version 4 will not modify this connection
clients that violate this assumption can expect scaling issues on the management model. NFS version 4 clients that violate this assumption
server and hence reduced service. can expect scaling issues on the server and hence reduced service.
Note that for various timers, the client and server should avoid Note that for various timers, the client and server should avoid
inadvertent synchronization of those timers. For further discussion inadvertent synchronization of those timers. For further discussion
of the general issue refer to [19]. of the general issue refer to [20].
3.1.1. Client Retransmission Behavior 3.1.1. Client Retransmission Behavior
When processing a request received over a reliable transport such as When processing a request received over a reliable transport such as
TCP, the NFS version 4 server MUST NOT silently drop the request, TCP, the NFS version 4 server MUST NOT silently drop the request,
except if the transport connection has been broken. Given such a except if the transport connection has been broken. Given such a
contract between NFS version 4 clients and servers, clients MUST NOT contract between NFS version 4 clients and servers, clients MUST NOT
retry a request unless one or both of the following are true: retry a request unless one or both of the following are true:
o The transport connection has been broken o The transport connection has been broken
skipping to change at page 25, line 18 skipping to change at page 27, line 17
3.2.1. Security mechanisms for NFS version 4 3.2.1. Security mechanisms for NFS version 4
The use of RPCSEC_GSS requires selection of: mechanism, quality of The use of RPCSEC_GSS requires selection of: mechanism, quality of
protection, and service (authentication, integrity, privacy). The protection, and service (authentication, integrity, privacy). The
remainder of this document will refer to these three parameters of remainder of this document will refer to these three parameters of
the RPCSEC_GSS security as the security triple. the RPCSEC_GSS security as the security triple.
3.2.1.1. Kerberos V5 as a security triple 3.2.1.1. Kerberos V5 as a security triple
The Kerberos V5 GSS-API mechanism as described in [15] MUST be The Kerberos V5 GSS-API mechanism as described in [16] MUST be
implemented and provide the following security triples. implemented and provide the following security triples.
column descriptions: column descriptions:
1 == number of pseudo flavor 1 == number of pseudo flavor
2 == name of pseudo flavor 2 == name of pseudo flavor
3 == mechanism's OID 3 == mechanism's OID
4 == mechanism's algorithm(s) 4 == mechanism's algorithm(s)
5 == RPCSEC_GSS service 5 == RPCSEC_GSS service
skipping to change at page 25, line 46 skipping to change at page 27, line 45
for privacy. for privacy.
Note that the pseudo flavor is presented here as a mapping aid to the Note that the pseudo flavor is presented here as a mapping aid to the
implementor. Because this NFS protocol includes a method to implementor. Because this NFS protocol includes a method to
negotiate security and it understands the GSS-API mechanism, the negotiate security and it understands the GSS-API mechanism, the
pseudo flavor is not needed. The pseudo flavor is needed for NFS pseudo flavor is not needed. The pseudo flavor is needed for NFS
version 3 since the security negotiation is done via the MOUNT version 3 since the security negotiation is done via the MOUNT
protocol. protocol.
For a discussion of NFS' use of RPCSEC_GSS and Kerberos V5, please For a discussion of NFS' use of RPCSEC_GSS and Kerberos V5, please
see [20]. see [21].
Users and implementors are warned that 56 bit DES is no longer Users and implementors are warned that 56 bit DES is no longer
considered state of the art in terms of resistance to brute force considered state of the art in terms of resistance to brute force
attacks. Once a revision to [15] is available that adds support for attacks. Once a revision to [16] is available that adds support for
AES, implementors are urged to incorporate AES into their NFSv4 over AES, implementors are urged to incorporate AES into their NFSv4 over
Kerberos V5 protocol stacks, and users are similarly urged to migrate Kerberos V5 protocol stacks, and users are similarly urged to migrate
to the use of AES. to the use of AES.
3.2.1.2. LIPKEY as a security triple 3.2.1.2. LIPKEY as a security triple
The LIPKEY GSS-API mechanism as described in [5] MAY be implemented The LIPKEY GSS-API mechanism as described in [5] MAY be implemented
and provide the following security triples. The definition of the and provide the following security triples. The definition of the
columns matches the previous subsection "Kerberos V5 as security columns matches those in Section 3.2.1.1.
triple".
1 2 3 4 5 1 2 3 4 5
-------------------------------------------------------------------- --------------------------------------------------------------------
390006 lipkey 1.3.6.1.5.5.9 negotiated rpc_gss_svc_none 390006 lipkey 1.3.6.1.5.5.9 negotiated rpc_gss_svc_none
390007 lipkey-i 1.3.6.1.5.5.9 negotiated rpc_gss_svc_integrity 390007 lipkey-i 1.3.6.1.5.5.9 negotiated rpc_gss_svc_integrity
390008 lipkey-p 1.3.6.1.5.5.9 negotiated rpc_gss_svc_privacy 390008 lipkey-p 1.3.6.1.5.5.9 negotiated rpc_gss_svc_privacy
The mechanism algorithm is listed as "negotiated". This is because The mechanism algorithm is listed as "negotiated". This is because
LIPKEY is layered on SPKM-3 and in SPKM-3 [5] the confidentiality and LIPKEY is layered on SPKM-3 and in SPKM-3 [5] the confidentiality and
integrity algorithms are negotiated. Since SPKM-3 specifies HMAC-MD5 integrity algorithms are negotiated. Since SPKM-3 specifies HMAC-MD5
for integrity as MANDATORY, 128 bit cast5CBC for confidentiality for for integrity as MANDATORY, 128 bit cast5CBC for confidentiality for
privacy as MANDATORY, and further specifies that HMAC-MD5 and privacy as MANDATORY, and further specifies that HMAC-MD5 and
cast5CBC MUST be listed first before weaker algorithms, specifying cast5CBC MUST be listed first before weaker algorithms, specifying
"negotiated" in column 4 does not impair interoperability. In the "negotiated" in column 4 does not impair interoperability. In the
event an SPKM-3 peer does not support the mandatory algorithms, the event an SPKM-3 peer does not support the mandatory algorithms, the
other peer is free to accept or reject the GSS-API context creation. other peer is free to accept or reject the GSS-API context creation.
Because SPKM-3 negotiates the algorithms, subsequent calls to Because SPKM-3 negotiates the algorithms, subsequent calls to
LIPKEY's GSS_Wrap() and GSS_GetMIC() by RPCSEC_GSS will use a quality LIPKEY's GSS_Wrap() and GSS_GetMIC() by RPCSEC_GSS will use a quality
of protection value of 0 (zero). See section 5.2 of [21] for an of protection value of 0 (zero). See section 5.2 of [22] for an
explanation. explanation.
LIPKEY uses SPKM-3 to create a secure channel in which to pass a user LIPKEY uses SPKM-3 to create a secure channel in which to pass a user
name and password from the client to the server. Once the user name name and password from the client to the server. Once the user name
and password have been accepted by the server, calls to the LIPKEY and password have been accepted by the server, calls to the LIPKEY
context are redirected to the SPKM-3 context. See [5] for more context are redirected to the SPKM-3 context. See [5] for more
details. details.
3.2.1.3. SPKM-3 as a security triple 3.2.1.3. SPKM-3 as a security triple
The SPKM-3 GSS-API mechanism as described in [5] MAY be implemented The SPKM-3 GSS-API mechanism as described in [5] MAY be implemented
and provide the following security triples. The definition of the and provide the following security triples. The definition of the
columns matches the previous subsection "Kerberos V5 as security columns matches those in Section 3.2.1.1.
triple".
1 2 3 4 5 1 2 3 4 5
-------------------------------------------------------------------- --------------------------------------------------------------------
390009 spkm3 1.3.6.1.5.5.1.3 negotiated rpc_gss_svc_none 390009 spkm3 1.3.6.1.5.5.1.3 negotiated rpc_gss_svc_none
390010 spkm3i 1.3.6.1.5.5.1.3 negotiated rpc_gss_svc_integrity 390010 spkm3i 1.3.6.1.5.5.1.3 negotiated rpc_gss_svc_integrity
390011 spkm3p 1.3.6.1.5.5.1.3 negotiated rpc_gss_svc_privacy 390011 spkm3p 1.3.6.1.5.5.1.3 negotiated rpc_gss_svc_privacy
For a discussion as to why the mechanism algorithm is listed as For a discussion as to why the mechanism algorithm is listed as
"negotiated", see Section 3.2.1.2 "LIPKEY as a security triple." "negotiated", see Section 3.2.1.2.
Because SPKM-3 negotiates the algorithms, subsequent calls to SPKM- Because SPKM-3 negotiates the algorithms, subsequent calls to SPKM-
3's GSS_Wrap() and GSS_GetMIC() by RPCSEC_GSS will use a quality of 3's GSS_Wrap() and GSS_GetMIC() by RPCSEC_GSS will use a quality of
protection value of 0 (zero). See section 5.2 of [21] for an protection value of 0 (zero). See section 5.2 of [22] for an
explanation. explanation.
Even though LIPKEY is layered over SPKM-3, SPKM-3 is specified as a Even though LIPKEY is layered over SPKM-3, SPKM-3 is specified as a
mandatory set of triples to handle the situations where the initiator mandatory set of triples to handle the situations where the initiator
(the client) is anonymous or where the initiator has its own (the client) is anonymous or where the initiator has its own
certificate. If the initiator is anonymous, there will not be a user certificate. If the initiator is anonymous, there will not be a user
name and password to send to the target (the server). If the name and password to send to the target (the server). If the
initiator has its own certificate, then using passwords is initiator has its own certificate, then using passwords is
superfluous. superfluous.
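The LIPKEY and SPKM-3 tables above can be held in a small lookup structure. The Python sketch below is only an illustration: the class and function names are hypothetical, the rows are copied from the two tables, and the qop field is fixed at 0 because the text states that RPCSEC_GSS uses a quality of protection value of 0 for these mechanisms.

```python
from typing import Dict, NamedTuple


class PseudoFlavor(NamedTuple):
    name: str           # column 2: name of pseudo flavor
    mechanism_oid: str  # column 3: mechanism's OID
    algorithm: str      # column 4: mechanism's algorithm(s)
    service: str        # column 5: RPCSEC_GSS service
    qop: int            # quality of protection; 0 for LIPKEY and SPKM-3


LIPKEY_OID = "1.3.6.1.5.5.9"
SPKM3_OID = "1.3.6.1.5.5.1.3"

# Column 1 (number of pseudo flavor) is the dictionary key.
PSEUDO_FLAVORS: Dict[int, PseudoFlavor] = {
    390006: PseudoFlavor("lipkey", LIPKEY_OID, "negotiated", "rpc_gss_svc_none", 0),
    390007: PseudoFlavor("lipkey-i", LIPKEY_OID, "negotiated", "rpc_gss_svc_integrity", 0),
    390008: PseudoFlavor("lipkey-p", LIPKEY_OID, "negotiated", "rpc_gss_svc_privacy", 0),
    390009: PseudoFlavor("spkm3", SPKM3_OID, "negotiated", "rpc_gss_svc_none", 0),
    390010: PseudoFlavor("spkm3i", SPKM3_OID, "negotiated", "rpc_gss_svc_integrity", 0),
    390011: PseudoFlavor("spkm3p", SPKM3_OID, "negotiated", "rpc_gss_svc_privacy", 0),
}


def security_triple(pseudo_flavor: int) -> tuple:
    """Return (mechanism OID, QOP, service) for a pseudo flavor number."""
    row = PSEUDO_FLAVORS[pseudo_flavor]
    return (row.mechanism_oid, row.qop, row.service)
```

For example, security_triple(390010) yields the SPKM-3 OID, a QOP of 0, and rpc_gss_svc_integrity. The pseudo flavor number itself is only a mapping aid; the triple is what the protocol negotiates.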
skipping to change at page 27, line 41 skipping to change at page 29, line 36
mechanism is to be used for its communication with the server. The mechanism is to be used for its communication with the server. The
NFS server may have multiple points within its filesystem name space NFS server may have multiple points within its filesystem name space
that are available for use by NFS clients. In turn the NFS server that are available for use by NFS clients. In turn the NFS server
may be configured such that each of these entry points may have may be configured such that each of these entry points may have
different or multiple security mechanisms in use. different or multiple security mechanisms in use.
The security negotiation between client and server must be done with The security negotiation between client and server must be done with
a secure channel to eliminate the possibility of a third party a secure channel to eliminate the possibility of a third party
intercepting the negotiation sequence and forcing the client and intercepting the negotiation sequence and forcing the client and
server to choose a lower level of security than required or desired. server to choose a lower level of security than required or desired.
See Section 16 "Security Considerations" for further discussion. See Section 17 for further discussion.
3.3.1. SECINFO 3.3.1. SECINFO
The new SECINFO operation will allow the client to determine, on a The new SECINFO operation will allow the client to determine, on a
per filehandle basis, what security triple is to be used for server per filehandle basis, what security triple is to be used for server
access. In general, the client will not have to use the SECINFO access. In general, the client will not have to use the SECINFO
operation except during initial communication with the server or when operation except during initial communication with the server or when
the client crosses policy boundaries at the server. It is possible the client crosses policy boundaries at the server. It is possible
that the server's policies change during the client's interaction, that the server's policies change during the client's interaction,
therefore forcing the client to negotiate a new security triple. therefore forcing the client to negotiate a new security triple.
skipping to change at page 28, line 17 skipping to change at page 30, line 12
Based on the assumption that each NFS version 4 client and server Based on the assumption that each NFS version 4 client and server
must support a minimum set of security (i.e., LIPKEY, SPKM-3, and must support a minimum set of security (i.e., LIPKEY, SPKM-3, and
Kerberos-V5 all under RPCSEC_GSS), the NFS client will start its Kerberos-V5 all under RPCSEC_GSS), the NFS client will start its
communication with the server with one of the minimal security communication with the server with one of the minimal security
triples. During communication with the server, the client may triples. During communication with the server, the client may
receive an NFS error of NFS4ERR_WRONGSEC. This error allows the receive an NFS error of NFS4ERR_WRONGSEC. This error allows the
server to notify the client that the security triple currently being server to notify the client that the security triple currently being
used is not appropriate for access to the server's filesystem used is not appropriate for access to the server's filesystem
resources. The client is then responsible for determining what resources. The client is then responsible for determining what
security triples are available at the server and choosing one which is security triples are available at the server and choosing one which is
appropriate for the client. See Section 14.33 for the "SECINFO" appropriate for the client. See Section 15.33 for further discussion
operation for further discussion of how the client will respond to of how the client will respond to the NFS4ERR_WRONGSEC error and use
the NFS4ERR_WRONGSEC error and use SECINFO. SECINFO.
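The selection step after an NFS4ERR_WRONGSEC error might look like the Python sketch below. It is a minimal illustration, not the protocol's definition: the function name is hypothetical, the triples are modeled as plain tuples, and it assumes that the list obtained via SECINFO is ordered by server preference, so the first entry the client also supports is taken.

```python
from typing import Iterable, Optional, Set, Tuple

# (mechanism OID, quality of protection, RPCSEC_GSS service)
Triple = Tuple[str, int, str]


def choose_triple(server_triples: Iterable[Triple],
                  client_supported: Set[Triple]) -> Optional[Triple]:
    """Pick the first server-offered triple that the client supports.

    Assumption: server_triples is in the server's order of preference.
    Returns None when there is no triple in common, in which case the
    client cannot access the object with any security it supports.
    """
    for triple in server_triples:
        if triple in client_supported:
            return triple
    return None
```

A client would typically call this with the SECINFO results for the filehandle that triggered the error, then retry the failed request using the chosen triple.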
3.3.3. Callback RPC Authentication 3.3.3. Callback RPC Authentication
Except as noted elsewhere in this section, the callback RPC Except as noted elsewhere in this section, the callback RPC
(described later) MUST mutually authenticate the NFS server to the (described later) MUST mutually authenticate the NFS server to the
principal that acquired the clientid (also described later), using principal that acquired the clientid (also described later), using
the security flavor the original SETCLIENTID operation used. the security flavor the original SETCLIENTID operation used.
For AUTH_NONE, there are no principals, so this is a non-issue. For AUTH_NONE, there are no principals, so this is a non-issue.
skipping to change at page 30, line 22 skipping to change at page 32, line 18
for a filesystem object. The contents of the filehandle are opaque for a filesystem object. The contents of the filehandle are opaque
to the client. Therefore, the server is responsible for translating to the client. Therefore, the server is responsible for translating
the filehandle to an internal representation of the filesystem the filehandle to an internal representation of the filesystem
object. object.
4.1. Obtaining the First Filehandle 4.1. Obtaining the First Filehandle
The operations of the NFS protocol are defined in terms of one or The operations of the NFS protocol are defined in terms of one or
more filehandles. Therefore, the client needs a filehandle to more filehandles. Therefore, the client needs a filehandle to
initiate communication with the server. With the NFS version 2 initiate communication with the server. With the NFS version 2
protocol [12] and the NFS version 3 protocol [13], there exists an protocol [13] and the NFS version 3 protocol [14], there exists an
ancillary protocol to obtain this first filehandle. The MOUNT ancillary protocol to obtain this first filehandle. The MOUNT
protocol, RPC program number 100005, provides the mechanism of protocol, RPC program number 100005, provides the mechanism of
translating a string based filesystem path name to a filehandle which translating a string based filesystem path name to a filehandle which
can then be used by the NFS protocols. can then be used by the NFS protocols.
The MOUNT protocol has deficiencies in the area of security and use The MOUNT protocol has deficiencies in the area of security and use
via firewalls. This is one reason that the use of the public via firewalls. This is one reason that the use of the public
filehandle was introduced in [22] and [23]. With the use of the filehandle was introduced in [23] and [24]. With the use of the
public filehandle in combination with the LOOKUP operation in the NFS public filehandle in combination with the LOOKUP operation in the NFS
version 2 and 3 protocols, it has been demonstrated that the MOUNT version 2 and 3 protocols, it has been demonstrated that the MOUNT
protocol is unnecessary for viable interaction between NFS client and protocol is unnecessary for viable interaction between NFS client and
server. server.
Therefore, the NFS version 4 protocol will not use an ancillary Therefore, the NFS version 4 protocol will not use an ancillary
protocol for translation from string based path names to a protocol for translation from string based path names to a
filehandle. Two special filehandles will be used as starting points filehandle. Two special filehandles will be used as starting points
for the NFS client. for the NFS client.
4.1.1. Root Filehandle 4.1.1. Root Filehandle
The first of the special filehandles is the ROOT filehandle. The The first of the special filehandles is the ROOT filehandle. The
ROOT filehandle is the "conceptual" root of the filesystem name space ROOT filehandle is the "conceptual" root of the filesystem name space
at the NFS server. The client uses or starts with the ROOT at the NFS server. The client uses or starts with the ROOT
filehandle by employing the PUTROOTFH operation. The PUTROOTFH filehandle by employing the PUTROOTFH operation. The PUTROOTFH
operation instructs the server to set the "current" filehandle to the operation instructs the server to set the "current" filehandle to the
ROOT of the server's file tree. Once this PUTROOTFH operation is ROOT of the server's file tree. Once this PUTROOTFH operation is
used, the client can then traverse the entirety of the server's file used, the client can then traverse the entirety of the server's file
tree with the LOOKUP operation. A complete discussion of the server tree with the LOOKUP operation. A complete discussion of the server
name space is in the section "NFS Server Name Space". name space is in Section 8.
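The traversal described above can be sketched as a sequence of operations built from a path. The Python below is illustrative only: the function name and the tuple representation of operations are hypothetical, and the trailing GETFH is added on the assumption that the client wants the resulting filehandle returned.

```python
from typing import List, Tuple


def ops_for_path(path: str) -> List[Tuple[str, ...]]:
    """Build the operation list that walks from the ROOT filehandle.

    PUTROOTFH sets the current filehandle to the root of the server's
    file tree; one LOOKUP per path component then descends the tree.
    """
    ops: List[Tuple[str, ...]] = [("PUTROOTFH",)]
    for component in path.split("/"):
        if component:  # skip empty pieces from leading or doubled slashes
            ops.append(("LOOKUP", component))
    ops.append(("GETFH",))  # assumed: caller wants the final filehandle
    return ops
```

No MOUNT-style ancillary protocol appears anywhere in this sequence; the path is resolved entirely with operations of the NFS protocol itself.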
4.1.2. Public Filehandle 4.1.2. Public Filehandle
The second special filehandle is the PUBLIC filehandle. Unlike the The second special filehandle is the PUBLIC filehandle. Unlike the
ROOT filehandle, the PUBLIC filehandle may be bound to or represent an ROOT filehandle, the PUBLIC filehandle may be bound to or represent an
arbitrary filesystem object at the server. The server is responsible arbitrary filesystem object at the server. The server is responsible
for this binding. It may be that the PUBLIC filehandle and the ROOT for this binding. It may be that the PUBLIC filehandle and the ROOT
filehandle refer to the same filesystem object. However, it is up to filehandle refer to the same filesystem object. However, it is up to
the administrative software at the server and the policies of the the administrative software at the server and the policies of the
server administrator to define the binding of the PUBLIC filehandle server administrator to define the binding of the PUBLIC filehandle
skipping to change at page 32, line 13 skipping to change at page 34, line 7
doing a byte-by-byte comparison. However, the client MUST NOT doing a byte-by-byte comparison. However, the client MUST NOT
otherwise interpret the contents of filehandles. If two filehandles otherwise interpret the contents of filehandles. If two filehandles
from the same server are equal, they MUST refer to the same file. from the same server are equal, they MUST refer to the same file.
Servers SHOULD try to maintain a one-to-one correspondence between Servers SHOULD try to maintain a one-to-one correspondence between
filehandles and files but this is not required. Clients MUST use filehandles and files but this is not required. Clients MUST use
filehandle comparisons only to improve performance, not for correct filehandle comparisons only to improve performance, not for correct
behavior. All clients need to be prepared for situations in which it behavior. All clients need to be prepared for situations in which it
cannot be determined whether two filehandles denote the same object cannot be determined whether two filehandles denote the same object
and in such cases, avoid making invalid assumptions which might cause and in such cases, avoid making invalid assumptions which might cause
incorrect behavior. Further discussion of filehandle and attribute incorrect behavior. Further discussion of filehandle and attribute
comparison in the context of data caching is presented in the section comparison in the context of data caching is presented in
"Data Caching and File Identity". Section 10.3.4.
As an example, in the case that two different path names when As an example, in the case that two different path names when
traversed at the server terminate at the same filesystem object, the traversed at the server terminate at the same filesystem object, the
server SHOULD return the same filehandle for each path. This can server SHOULD return the same filehandle for each path. This can
occur if a hard link is used to create two file names which refer to occur if a hard link is used to create two file names which refer to
the same underlying file object and associated data. For example, if the same underlying file object and associated data. For example, if
paths /a/b/c and /a/d/c refer to the same file, the server SHOULD paths /a/b/c and /a/d/c refer to the same file, the server SHOULD
return the same filehandle for both path name traversals. return the same filehandle for both path name traversals.
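The comparison rule can be captured in a three-valued check. The Python sketch below uses a hypothetical function name and assumes both filehandles came from the same server. Equal bytes mean the same object, while unequal bytes prove nothing, since a one-to-one correspondence between filehandles and files is not required.

```python
from typing import Optional


def same_object(fh_a: bytes, fh_b: bytes) -> Optional[bool]:
    """Compare two opaque filehandles from the SAME server.

    Returns True when the handles are byte-for-byte equal (they MUST
    then refer to the same file). Returns None, meaning "unknown",
    when they differ: different handles may still name one object.
    The function never returns False.
    """
    if fh_a == fh_b:
        return True
    return None
```

A caller should treat None as "cannot tell" and fall back to behavior that is correct either way, using a True result only as a performance shortcut.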
4.2.2. Persistent Filehandle 4.2.2. Persistent Filehandle
skipping to change at page 35, line 25 skipping to change at page 37, line 21
GETFH GETFH
Note that the COMPOUND procedure does not provide atomicity. This Note that the COMPOUND procedure does not provide atomicity. This
example only reduces the overhead of recovering from an expired example only reduces the overhead of recovering from an expired
filehandle. filehandle.
5. File Attributes 5. File Attributes
To meet the requirements of extensibility and increased To meet the requirements of extensibility and increased
interoperability with non-UNIX platforms, attributes must be handled interoperability with non-UNIX platforms, attributes must be handled
in a flexible manner. The NFS version 3 fattr3 structure contains a in a flexible manner. The NFSv3 fattr3 structure contains a fixed
fixed list of attributes that not all clients and servers are able to list of attributes that not all clients and servers are able to
support or care about. The fattr3 structure cannot be extended as support or care about. The fattr3 structure cannot be extended as
new needs arise and it provides no way to indicate non-support. With new needs arise and it provides no way to indicate non-support. With
the NFS version 4 protocol, the client is able to query what attributes the NFSv4.0 protocol, the client is able to query what attributes the
the server supports and construct requests with only those supported server supports and construct requests with only those supported
attributes (or a subset thereof). attributes (or a subset thereof).
To this end, attributes are divided into three groups: mandatory, To this end, attributes are divided into three groups: REQUIRED,
recommended, and named. Both mandatory and recommended attributes RECOMMENDED, and named. Both REQUIRED and RECOMMENDED attributes are
are supported in the NFS version 4 protocol by a specific and well- supported in the NFSv4.0 protocol by a specific and well-defined
defined encoding and are identified by number. They are requested by encoding and are identified by number. They are requested by setting
setting a bit in the bit vector sent in the GETATTR request; the a bit in the bit vector sent in the GETATTR request; the server
server response includes a bit vector to list what attributes were response includes a bit vector to list what attributes were returned
returned in the response. New mandatory or recommended attributes in the response. New REQUIRED or RECOMMENDED attributes may be added
may be added to the NFS protocol between major revisions by to the NFSv4 protocol as part of a new minor version by publishing a
publishing a standards-track RFC which allocates a new attribute standards-track RFC which allocates a new attribute number value and
number value and defines the encoding for the attribute. See defines the encoding for the attribute. See Section 11 for further
Section 10 "Minor Versioning" for further discussion. discussion.
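The bit vector used to request and report attributes can be sketched as follows. This Python example is an illustration under stated assumptions: the function names are hypothetical, and the layout (attribute number n maps to bit n mod 32 of 32-bit word n div 32, least significant bit first) is assumed from the protocol's bitmap4 type rather than taken from the text of this section.

```python
from typing import Iterable, List


def encode_bitmap(attr_numbers: Iterable[int]) -> List[int]:
    """Encode attribute numbers as a list of 32-bit words.

    Assumed layout: attribute n sets bit (n % 32) of word (n // 32).
    """
    words: List[int] = []
    for n in attr_numbers:
        word, bit = divmod(n, 32)
        while len(words) <= word:  # grow the vector on demand
            words.append(0)
        words[word] |= 1 << bit
    return words


def decode_bitmap(words: List[int]) -> List[int]:
    """Return the attribute numbers whose bits are set, in ascending order."""
    return [
        index * 32 + bit
        for index, value in enumerate(words)
        for bit in range(32)
        if value & (1 << bit)
    ]
```

Under that assumed layout, requesting attributes 1, 4, and 33 produces a two-word vector, and decoding the vector returned by the server tells the client which of the requested attributes were actually supplied.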
Named attributes are accessed by the new OPENATTR operation, which Named attributes are accessed by the new OPENATTR operation, which
accesses a hidden directory of attributes associated with a file accesses a hidden directory of attributes associated with a file
system object. OPENATTR takes a filehandle for the object and system object. OPENATTR takes a filehandle for the object and
returns the filehandle for the attribute hierarchy. The filehandle returns the filehandle for the attribute hierarchy. The filehandle
for the named attributes is a directory object accessible by LOOKUP for the named attributes is a directory object accessible by LOOKUP
or READDIR and contains files whose names represent the named or READDIR and contains files whose names represent the named
attributes and whose data bytes are the value of the attribute. For attributes and whose data bytes are the value of the attribute. For
example: example:
LOOKUP "foo" ; look up file +----------+-----------+---------------------------------+
GETATTR attrbits | LOOKUP | "foo" | ; look up file |
OPENATTR ; access foo's named attributes | GETATTR | attrbits | |
LOOKUP "x11icon" ; look up specific attribute | OPENATTR | | ; access foo's named attributes |
READ 0,4096 ; read stream of bytes | LOOKUP | "x11icon" | ; look up specific attribute |
| READ | 0,4096 | ; read stream of bytes |
+----------+-----------+---------------------------------+
Named attributes are intended for data needed by applications rather Named attributes are intended for data needed by applications rather
than by an NFS client implementation. NFS implementors are strongly than by an NFS client implementation. NFS implementors are strongly
encouraged to define their new attributes as recommended attributes encouraged to define their new attributes as RECOMMENDED attributes
by bringing them to the IETF standards-track process. by bringing them to the IETF standards-track process.
The set of attributes which are classified as mandatory is The set of attributes which are classified as REQUIRED is
deliberately small since servers must do whatever it takes to support deliberately small since servers must do whatever it takes to support
them. A server should support as many of the recommended attributes them. A server should support as many of the RECOMMENDED attributes
as possible but by their definition, the server is not required to as possible but by their definition, the server is not required to
support all of them. Attributes are deemed mandatory if the data is support all of them. Attributes are deemed REQUIRED if the data is
both needed by a large number of clients and is not otherwise both needed by a large number of clients and is not otherwise
reasonably computable by the client when support is not provided on reasonably computable by the client when support is not provided on
the server. the server.
Note that the hidden directory returned by OPENATTR is a convenience Note that the hidden directory returned by OPENATTR is a convenience
for protocol processing. The client should not make any assumptions for protocol processing. The client should not make any assumptions
about the server's implementation of named attributes and whether the about the server's implementation of named attributes and whether the
underlying filesystem at the server has a named attribute directory underlying file system at the server has a named attribute directory
or not. Therefore, operations such as SETATTR and GETATTR on the or not. Therefore, operations such as SETATTR and GETATTR on the
named attribute directory are undefined. named attribute directory are undefined.
5.1. Mandatory Attributes 5.1. REQUIRED Attributes
These MUST be supported by every NFS version 4 client and server in These MUST be supported by every NFSv4.0 client and server in order
order to ensure a minimum level of interoperability. The server must to ensure a minimum level of interoperability. The server MUST store
store and return these attributes and the client must be able to and return these attributes and the client MUST be able to function
function with an attribute set limited to these attributes. With with an attribute set limited to these attributes. With just the
just the mandatory attributes some client functionality may be REQUIRED attributes some client functionality may be impaired or
impaired or limited in some ways. A client may ask for any of these limited in some ways. A client may ask for any of these attributes
attributes to be returned by setting a bit in the GETATTR request and to be returned by setting a bit in the GETATTR request and the server
the server must return their value. must return their value.
5.2. Recommended Attributes 5.2. RECOMMENDED Attributes
These attributes are understood well enough to warrant support in the These attributes are understood well enough to warrant support in the
NFS version 4 protocol. However, they may not be supported on all NFSv4.0 protocol. However, they may not be supported on all clients
clients and servers. A client may ask for any of these attributes to and servers. A client may ask for any of these attributes to be
be returned by setting a bit in the GETATTR request but must handle returned by setting a bit in the GETATTR request but must handle the
the case where the server does not return them. A client may ask for case where the server does not return them. A client may ask for the
the set of attributes the server supports and should not request set of attributes the server supports and SHOULD NOT request
attributes the server does not support. A server should be tolerant attributes the server does not support. A server should be tolerant
of requests for unsupported attributes and simply not return them of requests for unsupported attributes and simply not return them
rather than considering the request an error. It is expected that rather than considering the request an error. It is expected that
servers will support all attributes they comfortably can and only servers will support all attributes they comfortably can and only
fail to support attributes which are difficult to support in their fail to support attributes which are difficult to support in their
operating environments. A server should provide attributes whenever operating environments. A server should provide attributes whenever
they don't have to "tell lies" to the client. For example, a file they don't have to "tell lies" to the client. For example, a file
modification time should be either an accurate time or should not be modification time should be either an accurate time or should not be
supported by the server. This will not always be comfortable to supported by the server. This will not always be comfortable to
clients but the client is better positioned to decide whether and how to clients but the client is better positioned to decide whether and how to
fabricate or construct an attribute or whether to do without the fabricate or construct an attribute or whether to do without the
attribute. attribute.
5.3. Named Attributes 5.3. Named Attributes
These attributes are not supported by direct encoding in the NFS These attributes are not supported by direct encoding in the NFSv4
Version 4 protocol but are accessed by string names rather than protocol but are accessed by string names rather than numbers and
numbers and correspond to an uninterpreted stream of bytes which are correspond to an uninterpreted stream of bytes which are stored with
stored with the filesystem object. The name space for these the file system object. The name space for these attributes may be
attributes may be accessed by using the OPENATTR operation. The accessed by using the OPENATTR operation. The OPENATTR operation
OPENATTR operation returns a filehandle for a virtual "attribute returns a filehandle for a virtual "named attribute directory" and
directory" and further perusal of the name space may be done using further perusal and modification of the name space may be done using
READDIR and LOOKUP operations on this filehandle. Named attributes operations that work on more typical directories. In particular,
may then be examined or changed by normal READ and WRITE and CREATE READDIR may be used to get a list of such named attributes and LOOKUP
operations on the filehandles returned from READDIR and LOOKUP. and OPEN may select a particular attribute. Creation of a new named
Named attributes may have attributes. attribute may be the result of an OPEN specifying file creation.
It is recommended that servers support arbitrary named attributes. A Once an OPEN is done, named attributes may be examined and changed by
normal READ and WRITE operations using the filehandles and stateids
returned by OPEN.
Named attributes and the named attribute directory may have their own
(non-named) attributes. Each of these objects must have all of the
REQUIRED attributes and may have additional RECOMMENDED attributes.
However, the set of attributes for named attributes and the named
attribute directory need not be as large as, and typically will not
be as large as that for other objects in that file system.
Named attributes and the named attribute directory may be the target
of delegations (in the case of the named attribute directory these
will be directory delegations). However, since granting of
delegations or not is within the server's discretion, a server need
not support delegations on named attributes or the named attribute
directory.
It is RECOMMENDED that servers support arbitrary named attributes. A
client should not depend on the ability to store any named attributes client should not depend on the ability to store any named attributes
in the server's filesystem. If a server does support named in the server's file system. If a server does support named
attributes, a client which is also able to handle them should be able attributes, a client which is also able to handle them should be able
to copy a file's data and meta-data with complete transparency from to copy a file's data and metadata with complete transparency from
one location to another; this would imply that names allowed for one location to another; this would imply that names allowed for
regular directory entries are valid for named attribute names as regular directory entries are valid for named attribute names as
well. well.
In NFSv4.0, the structure of named attribute directories is
restricted in a number of ways, in order to prevent the development
of non-interoperable implementations in which some servers support a
fully general hierarchical directory structure for named attributes
while others support a more limited structure that is nonetheless
fully adequate to the feature's goals. In such an environment, clients
or applications
might come to depend on non-portable extensions. The restrictions
are:
o CREATE is not allowed in a named attribute directory. Thus, such
objects as symbolic links and special files are not allowed to be
named attributes. Further, directories may not be created in a
named attribute directory so no hierarchical structure of named
attributes for a single object is allowed.
o If OPENATTR is done on a named attribute directory or on a named
attribute, the server MUST return NFS4ERR_WRONG_TYPE.
o Doing a RENAME of a named attribute to a different named attribute
directory or to an ordinary (i.e. non-named-attribute) directory
is not allowed.
o Creating hard links between named attribute directories or between
named attribute directories and ordinary directories is not
allowed.
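The restrictions above can be sketched as a set of server-side checks.
This is an illustrative sketch only, not part of the protocol: the Dir
record and the function names are hypothetical, object types are given
as strings rather than XDR enum values, and only the OPENATTR case
returns a named error because that is the only error the list above
specifies.

```python
from dataclasses import dataclass

# Object types that belong to the named attribute name space.
NAMED_ATTR_TYPES = {"NF4ATTRDIR", "NF4NAMEDATTR"}


@dataclass(frozen=True)
class Dir:
    """Hypothetical directory record: an identifier plus its object type."""
    ident: int
    type: str  # "NF4DIR" or "NF4ATTRDIR"


def openattr_status(obj_type: str) -> str:
    # OPENATTR on a named attribute directory or a named attribute
    # MUST fail with NFS4ERR_WRONG_TYPE.
    return "NFS4ERR_WRONG_TYPE" if obj_type in NAMED_ATTR_TYPES else "NFS4_OK"


def create_allowed(parent: Dir) -> bool:
    # CREATE (symlinks, special files, directories) is not allowed in a
    # named attribute directory.
    return parent.type != "NF4ATTRDIR"


def named_attr_rename_allowed(src: Dir, dst: Dir) -> bool:
    # A named attribute may only be renamed within its own directory.
    return src == dst


def link_allowed(src: Dir, dst: Dir) -> bool:
    # Hard links may not cross between named attribute directories, or
    # between a named attribute directory and an ordinary directory.
    if src.type == "NF4ATTRDIR" or dst.type == "NF4ATTRDIR":
        return src == dst
    return True  # the restriction does not apply to ordinary directories
```

Named attributes themselves are created through OPEN with file
creation, as described earlier in this section, so create_allowed()
only models the CREATE operation.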
Names of attributes will not be controlled by this document or other Names of attributes will not be controlled by this document or other
IETF standards track documents. See Section 17 "IANA Considerations" IETF standards track documents. See Section 18 for further
for further discussion. discussion.
5.4. Classification of Attributes 5.4. Classification of Attributes
Each of the Mandatory and Recommended attributes can be classified in Each of the REQUIRED and RECOMMENDED attributes can be classified in
one of three categories: per server, per filesystem, or per one of three categories: per server, per file system, or per file
filesystem object. Note that it is possible that some per filesystem system object. Note that it is possible that some per file system
attributes may vary within the filesystem. See the "homogeneous" attributes may vary within the file system. See the "homogeneous"
attribute for its definition. Note that the attributes attribute for its definition. Note that the attributes
time_access_set and time_modify_set are not listed in this section time_access_set and time_modify_set are not listed in this section
because they are write-only attributes corresponding to time_access because they are write-only attributes corresponding to time_access
and time_modify, and are used in a special instance of SETATTR. and time_modify, and are used in a special instance of SETATTR.
o The per server attribute is: o The per server attribute is:
lease_time lease_time
o The per filesystem attributes are: o The per file system attributes are:
supp_attr, fh_expire_type, link_support, symlink_support, supported_attrs, fh_expire_type, link_support, symlink_support,
unique_handles, aclsupport, cansettime, case_insensitive, unique_handles, aclsupport, cansettime, case_insensitive,
case_preserving, chown_restricted, files_avail, files_free, case_preserving, chown_restricted, files_avail, files_free,
files_total, fs_locations, homogeneous, maxfilesize, maxname, files_total, fs_locations, homogeneous, maxfilesize, maxname,
maxread, maxwrite, no_trunc, space_avail, space_free, space_total, maxread, maxwrite, no_trunc, space_avail, space_free,
time_delta space_total, time_delta
o The per filesystem object attributes are: o The per file system object attributes are:
type, change, size, named_attr, fsid, rdattr_error, filehandle, type, change, size, named_attr, fsid, rdattr_error, filehandle,
ACL, archive, fileid, hidden, maxlink, mimetype, mode, numlinks, acl, archive, fileid, hidden, maxlink, mimetype, mode,
owner, owner_group, rawdev, space_used, system, time_access, numlinks, owner, owner_group, rawdev, space_used, system,
time_backup, time_create, time_metadata, time_modify, time_access, time_backup, time_create, time_metadata,
mounted_on_fileid time_modify, mounted_on_fileid
For quota_avail_hard, quota_avail_soft, and quota_used see their For quota_avail_hard, quota_avail_soft, and quota_used see their
definitions below for the appropriate classification. definitions below for the appropriate classification.
5.5. Mandatory Attributes - Definitions 5.5. Set-Only and Get-Only Attributes
+-----------------+----+------------+--------+----------------------+ Some REQUIRED and RECOMMENDED attributes are set-only, i.e. they can
| Name | Id | Data Type | Access | Description | be set via SETATTR but not retrieved via GETATTR. Similarly, some
+-----------------+----+------------+--------+----------------------+ REQUIRED and RECOMMENDED attributes are get-only, i.e. they can be
| supp_attr | 0 | bitmap | READ | The bit vector which | retrieved via GETATTR but not set via SETATTR. If a client attempts to
| | | | | would retrieve all | set a get-only attribute or get a set-only attribute, the server
| | | | | mandatory and | MUST return NFS4ERR_INVAL.
| | | | | recommended |
| | | | | attributes that are | 5.6. REQUIRED Attributes - List and Definition References
| | | | | supported for this |
| | | | | object. The scope of | The list of REQUIRED attributes appears in Table 2. The meanings of
| | | | | this attribute | the columns of the table are:
| | | | | applies to all |
| | | | | objects with a | o Name: the name of the attribute
| | | | | matching fsid. |
| type | 1 | nfs4_ftype | READ | The type of the | o Id: the number assigned to the attribute. In the event of
| | | | | object (file, | conflicts between the assigned number and [2], the latter is
| | | | | directory, symlink, | authoritative.
| | | | | etc.) |
| fh_expire_type | 2 | uint32 | READ | Server uses this to | o Data Type: The XDR data type of the attribute.
| | | | | specify filehandle |
| | | | | expiration behavior | o Acc: Access allowed to the attribute. R means read-only (GETATTR
| | | | | to the client. See | may retrieve, SETATTR may not set). W means write-only (SETATTR
| | | | | Section 4 | may set, GETATTR may not retrieve). R W means read/write (GETATTR
| | | | | "Filehandles" for | may retrieve, SETATTR may set).
| | | | | additional |
| | | | | description. | o Defined in: the section of this specification that describes the
| change | 3 | uint64 | READ | A value created by | attribute.
| | | | | the server that the |
| | | | | client can use to | +-----------------+----+------------+-----+------------------+
| | | | | determine if file | | Name | Id | Data Type | Acc | Defined in: |
| | | | | data, directory | +-----------------+----+------------+-----+------------------+
| | | | | contents or | | supported_attrs | 0 | bitmap4 | R | Section 5.8.1.1 |
| | | | | attributes of the | | type | 1 | nfs_ftype4 | R | Section 5.8.1.2 |
| | | | | object have been | | fh_expire_type | 2 | uint32_t | R | Section 5.8.1.3 |
| | | | | modified. The server | | change | 3 | uint64_t | R | Section 5.8.1.4 |
| | | | | may return the | | size | 4 | uint64_t | R W | Section 5.8.1.5 |
| | | | | object's | | link_support | 5 | bool | R | Section 5.8.1.6 |
| | | | | time_metadata | | symlink_support | 6 | bool | R | Section 5.8.1.7 |
| | | | | attribute for this | | named_attr | 7 | bool | R | Section 5.8.1.8 |
| | | | | attribute's value | | fsid | 8 | fsid4 | R | Section 5.8.1.9 |
| | | | | but only if the | | unique_handles | 9 | bool | R | Section 5.8.1.10 |
| | | | | filesystem object | | lease_time | 10 | nfs_lease4 | R | Section 5.8.1.11 |
| | | | | can not be updated | | rdattr_error | 11 | enum | R | Section 5.8.1.12 |
| | | | | more frequently than | | filehandle | 19 | nfs_fh4 | R | Section 5.8.1.13 |
| | | | | the resolution of | +-----------------+----+------------+-----+------------------+
| | | | | time_metadata. |
| size | 4 | uint64 | R/W | The size of the |
| | | | | object in bytes. |
| link_support | 5 | bool | READ | True, if the |
| | | | | object's filesystem |
| | | | | supports hard links. |
| symlink_support | 6 | bool | READ | True, if the |
| | | | | object's filesystem |
| | | | | supports symbolic |
| | | | | links. |
| named_attr | 7 | bool | READ | True, if this object |
| | | | | has named |
| | | | | attributes. In other |
| | | | | words, object has a |
| | | | | non-empty named |
| | | | | attribute directory. |
| fsid | 8 | fsid4 | READ | Unique filesystem |
| | | | | identifier for the |
| | | | | filesystem holding |
| | | | | this object. fsid |
| | | | | contains major and |
| | | | | minor components |
| | | | | each of which are |
| | | | | uint64. |
| unique_handles | 9 | bool | READ | True, if two |
| | | | | distinct filehandles |
| | | | | guaranteed to refer |
| | | | | to two different |
| | | | | filesystem objects. |
| lease_time | 10 | nfs_lease4 | READ | Duration of leases |
| | | | | at server in |
| | | | | seconds. |
| rdattr_error | 11 | enum | READ | Error returned from |
| | | | | getattr during |
| | | | | readdir. |
| filehandle | 19 | nfs_fh4 | READ | The filehandle of |
| | | | | this object |
| | | | | (primarily for |
| | | | | readdir requests). |
+-----------------+----+------------+--------+----------------------+
Table 2 Table 2
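The Acc column can be combined with the set-only and get-only rule of
Section 5.5 in a simple check. The following is a minimal sketch, not
a normative procedure: the dictionary holds a few rows copied from
Table 2 plus one write-only row (time_access_set, attribute 48) taken
from Table 3, the function name is hypothetical, and the numeric value
22 for NFS4ERR_INVAL is taken from the protocol's error list rather
than from this section.

```python
NFS4_OK = 0
NFS4ERR_INVAL = 22  # assumed from the protocol's error code list

# name -> (attribute id, access), using the "R", "W", "R W" notation
# of the Acc column.
ATTR_ACCESS = {
    "supported_attrs": (0, "R"),
    "type": (1, "R"),
    "change": (3, "R"),
    "size": (4, "R W"),
    "filehandle": (19, "R"),
    "time_access_set": (48, "W"),  # set-only example from Table 3
}


def check_attr_access(name: str, operation: str) -> int:
    """Return NFS4_OK if `operation` (GETATTR or SETATTR) may touch `name`."""
    if operation not in ("GETATTR", "SETATTR"):
        raise ValueError("operation must be GETATTR or SETATTR")
    _attr_id, acc = ATTR_ACCESS[name]
    needed = "R" if operation == "GETATTR" else "W"
    # Setting a get-only attribute, or getting a set-only attribute,
    # yields NFS4ERR_INVAL.
    return NFS4_OK if needed in acc.split() else NFS4ERR_INVAL
```

For instance, check_attr_access("size", "SETATTR") succeeds because
size is marked R W, while the same call for "type" does not.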
5.6. Recommended Attributes - Definitions 5.7. RECOMMENDED Attributes - List and Definition References
+-------------------+----+--------------+--------+------------------+ The RECOMMENDED attributes are defined in Table 3. The meanings of
| Name | Id | Data Type | Access | Description | the column headers are the same as Table 2; see Section 5.6 for the
+-------------------+----+--------------+--------+------------------+ meanings.
| ACL | 12 | nfsace4<> | R/W | The access |
| | | | | control list for | +-------------------+----+--------------+-----+------------------+
| | | | | the object. | | Name | Id | Data Type | Acc | Defined in: |
| aclsupport | 13 | uint32 | READ | Indicates what | +-------------------+----+--------------+-----+------------------+
| | | | | types of ACLs | | acl | 12 | nfsace4<> | R W | Section 6.2.1 |
| | | | | are supported on | | aclsupport | 13 | uint32_t | R | Section 6.2.1.2 |
| | | | | the current | | archive | 14 | bool | R W | Section 5.8.2.1 |
| | | | | filesystem. | | cansettime | 15 | bool | R | Section 5.8.2.2 |
| archive | 14 | bool | R/W | True, if this | | case_insensitive | 16 | bool | R | Section 5.8.2.3 |
| | | | | file has been | | case_preserving | 17 | bool | R | Section 5.8.2.4 |
| | | | | archived since | | chown_restricted | 18 | bool | R | Section 5.8.2.5 |
| | | | | the time of last | | fileid | 20 | uint64_t | R | Section 5.8.2.6 |
| | | | | modification | | files_avail | 21 | uint64_t | R | Section 5.8.2.7 |
| | | | | (deprecated in | | files_free | 22 | uint64_t | R | Section 5.8.2.8 |
| | | | | favor of | | files_total | 23 | uint64_t | R | Section 5.8.2.9 |
| | | | | time_backup). | | fs_locations | 24 | fs_locations | R | Section 5.8.2.10 |
| cansettime | 15 | bool | READ | True, if the | | hidden | 25 | bool | R W | Section 5.8.2.11 |
| | | | | server is able | | homogeneous | 26 | bool | R | Section 5.8.2.12 |
| | | | | to change the | | maxfilesize | 27 | uint64_t | R | Section 5.8.2.13 |
| | | | | times for a | | maxlink | 28 | uint32_t | R | Section 5.8.2.14 |
| | | | | filesystem | | maxname | 29 | uint32_t | R | Section 5.8.2.15 |
| | | | | object as | | maxread | 30 | uint64_t | R | Section 5.8.2.16 |
| | | | | specified in a | | maxwrite | 31 | uint64_t | R | Section 5.8.2.17 |
| | | | | SETATTR | | mimetype | 32 | utf8<> | R W | Section 5.8.2.18 |
| | | | | operation. | | mode | 33 | mode4 | R W | Section 6.2.2 |
| case_insensitive | 16 | bool | READ | True, if | | mounted_on_fileid | 55 | uint64_t | R | Section 5.8.2.19 |
| | | | | filename | | no_trunc | 34 | bool | R | Section 5.8.2.20 |
| | | | | comparisons on | | numlinks | 35 | uint32_t | R | Section 5.8.2.21 |
| | | | | this filesystem | | owner | 36 | utf8<> | R W | Section 5.8.2.22 |
| | | | | are case | | owner_group | 37 | utf8<> | R W | Section 5.8.2.23 |
| | | | | insensitive. | | quota_avail_hard | 38 | uint64_t | R | Section 5.8.2.24 |
| case_preserving | 17 | bool | READ | True, if | | quota_avail_soft | 39 | uint64_t | R | Section 5.8.2.25 |
| | | | | filename case on | | quota_used | 40 | uint64_t | R | Section 5.8.2.26 |
| | | | | this filesystem | | rawdev | 41 | specdata4 | R | Section 5.8.2.27 |
| | | | | are preserved. | | space_avail | 42 | uint64_t | R | Section 5.8.2.28 |
| chown_restricted | 18 | bool | READ | If TRUE, the | | space_free | 43 | uint64_t | R | Section 5.8.2.29 |
| | | | | server will | | space_total | 44 | uint64_t | R | Section 5.8.2.30 |
| | | | | reject any | | space_used | 45 | uint64_t | R | Section 5.8.2.31 |
| | | | | request to | | system | 46 | bool | R W | Section 5.8.2.32 |
| | | | | change either | | time_access | 47 | nfstime4 | R | Section 5.8.2.33 |
| | | | | the owner or the | | time_access_set | 48 | settime4 | W | Section 5.8.2.34 |
| | | | | group associated | | time_backup | 49 | nfstime4 | R W | Section 5.8.2.35 |
| | | | | with a file if | | time_create | 50 | nfstime4 | R W | Section 5.8.2.36 |
| | | | | the caller is | | time_delta | 51 | nfstime4 | R | Section 5.8.2.37 |
| | | | | not a privileged | | time_metadata | 52 | nfstime4 | R | Section 5.8.2.38 |
| | | | | user (for | | time_modify | 53 | nfstime4 | R | Section 5.8.2.39 |
| | | | | example, "root" | | time_modify_set | 54 | settime4 | W | Section 5.8.2.40 |
| | | | | in UNIX | +-------------------+----+--------------+-----+------------------+
| | | | | operating |
| | | | | environments or |
| | | | | in Windows 2000 |
| | | | | the "Take |
| | | | | Ownership" |
| | | | | privilege). |
| fileid | 20 | uint64 | READ | A number |
| | | | | uniquely |
| | | | | identifying the |
| | | | | file within the |
| | | | | filesystem. |
| files_avail | 21 | uint64 | READ | File slots |
| | | | | available to |
| | | | | this user on the |
| | | | | filesystem |
| | | | | containing this |
| | | | | object - this |
| | | | | should be the |
| | | | | smallest |
| | | | | relevant limit. |
| files_free | 22 | uint64 | READ | Free file slots |
| | | | | on the |
| | | | | filesystem |
| | | | | containing this |
| | | | | object - this |
| | | | | should be the |
| | | | | smallest |
| | | | | relevant limit. |
| files_total | 23 | uint64 | READ | Total file slots |
| | | | | on the |
| | | | | filesystem |
| | | | | containing this |
| | | | | object. |
| fs_locations | 24 | fs_locations | READ | Locations where |
| | | | | this filesystem |
| | | | | may be found. If |
| | | | | the server |
| | | | | returns |
| | | | | NFS4ERR_MOVED as |
| | | | | an error, this |
| | | | | attribute MUST |
| | | | | be supported. |
| hidden | 25 | bool | R/W | True, if the |
| | | | | file is |
| | | | | considered |
| | | | | hidden with |
| | | | | respect to the |
| | | | | Windows API. |
| homogeneous | 26 | bool | READ | True, if this |
| | | | | object's |
| | | | | filesystem is |
| | | | | homogeneous, |
| | | | | i.e., are per |
| | | | | filesystem |
| | | | | attributes the |
| | | | | same for all |
| | | | | filesystem's |
| | | | | objects? |
| maxfilesize | 27 | uint64 | READ | Maximum |
| | | | | supported file |
| | | | | size for the |
| | | | | filesystem of |
| | | | | this object. |
| maxlink | 28 | uint32 | READ | Maximum number |
| | | | | of links for |
| | | | | this object. |
| maxname | 29 | uint32 | READ | Maximum filename |
| | | | | size supported |
| | | | | for this object. |
| maxread | 30 | uint64 | READ | Maximum read |
| | | | | size supported |
| | | | | for this object. |
| maxwrite | 31 | uint64 | READ | Maximum write |
| | | | | size supported |
| | | | | for this object. |
| | | | | This attribute |
| | | | | SHOULD be |
| | | | | supported if the |
| | | | | file is |
| | | | | writable. Lack |
| | | | | of this |
| | | | | attribute can |
| | | | | lead to the |
| | | | | client either |
| | | | | wasting |
| | | | | bandwidth or not |
| | | | | receiving the |
| | | | | best |
| | | | | performance. |
| mimetype | 32 | utf8<> | R/W | MIME body |
| | | | | type/subtype of |
| | | | | this object. |
| mode | 33 | mode4 | R/W | UNIX-style mode |
| | | | | and permission |
| | | | | bits for this |
| | | | | object. |
| no_trunc | 34 | bool | READ | True, if a name |
| | | | | longer than |
| | | | | name_max is |
| | | | | used, an error |
| | | | | be returned and |
| | | | | name is not |
| | | | | truncated. |
| numlinks | 35 | uint32 | READ | Number of hard |
| | | | | links to this |
| | | | | object. |
| owner | 36 | utf8<> | R/W | The string name |
| | | | | of the owner of |
| | | | | this object. |
| owner_group | 37 | utf8<> | R/W | The string name |
| | | | | of the group |
| | | | | ownership of |
| | | | | this object. |
| quota_avail_hard | 38 | uint64 | READ | For definition |
| | | | | see Section 5.10 |
| | | | | "Quota |
| | | | | Attributes" |
| | | | | below. |
| quota_avail_soft | 39 | uint64 | READ | For definition |
| | | | | see Section 5.10 |
| | | | | "Quota |
| | | | | Attributes" |
| | | | | below. |
| quota_used | 40 | uint64 | READ | For definition |
| | | | | see Section 5.10 |
| | | | | "Quota |
| | | | | Attributes" |
| | | | | below. |
| rawdev | 41 | specdata4 | READ | Raw device |
| | | | | identifier. UNIX |
| | | | | device |
| | | | | major/minor node |
| | | | | information. If |
| | | | | the value of |
| | | | | type is not |
| | | | | NF4BLK or |
| | | | | NF4CHR, the |
| | | | | value return |
| | | | | SHOULD NOT be |
| | | | | considered |
| | | | | useful. |
| space_avail | 42 | uint64 | READ | Disk space in |
| | | | | bytes available |
| | | | | to this user on |
| | | | | the filesystem |
| | | | | containing this |
| | | | | object - this |
| | | | | should be the |
| | | | | smallest |
| | | | | relevant limit. |
| space_free | 43 | uint64 | READ | Free disk space |
| | | | | in bytes on the |
| | | | | filesystem |
| | | | | containing this |
| | | | | object - this |
| | | | | should be the |
| | | | | smallest |
| | | | | relevant limit. |
| space_total | 44 | uint64 | READ | Total disk space |
| | | | | in bytes on the |
| | | | | filesystem |
| | | | | containing this |
| | | | | object. |
| space_used | 45 | uint64 | READ | Number of |
| | | | | filesystem bytes |
| | | | | allocated to |
| | | | | this object. |
| system | 46 | bool | R/W | True, if this |
| | | | | file is a |
| | | | | "system" file |
| | | | | with respect to |
| | | | | the Windows API. |
| time_access | 47 | nfstime4 | READ | The time of last |
| | | | | access to the |
| | | | | object by a read |
| | | | | that was |
| | | | | satisfied by the |
| | | | | server. |
| time_access_set | 48 | settime4 | WRITE | Set the time of |
| | | | | last access to |
| | | | | the object. |
| | | | | SETATTR use |
| | | | | only. |
| time_backup | 49 | nfstime4 | R/W | The time of last |
| | | | | backup of the |
| | | | | object. |
| time_create | 50 | nfstime4 | R/W | The time of |
| | | | | creation of the |
| | | | | object. This |
| | | | | attribute does |
| | | | | not have any |
| | | | | relation to the |
| | | | | traditional UNIX |
| | | | | file attribute |
| | | | | "ctime" or |
| | | | | "change time". |
| time_delta | 51 | nfstime4 | READ | Smallest useful |
| | | | | server time |
| | | | | granularity. |
| time_metadata | 52 | nfstime4 | READ | The time of last |
| | | | | meta-data |
| | | | | modification of |
| | | | | the object. |
| time_modify | 53 | nfstime4 | READ | The time of last |
| | | | | modification to |
| | | | | the object. |
| time_modify_set | 54 | settime4 | WRITE | Set the time of |
| | | | | last |
| | | | | modification to |
| | | | | the object. |
| | | | | SETATTR use |
| | | | | only. |
| mounted_on_fileid | 55 | uint64 | READ | Like fileid, but |
| | | | | if the target |
| | | | | filehandle is |
| | | | | the root of a |
| | | | | filesystem |
| | | | | return the |
| | | | | fileid of the |
| | | | | underlying |
| | | | | directory. |
+-------------------+----+--------------+--------+------------------+
Table 3 Table 3
5.7. Time Access 5.8. Attribute Definitions
As defined above, the time_access attribute represents the time of 5.8.1. Definitions of REQUIRED Attributes
last access to the object by a read that was satisfied by the server.
The notion of what is an "access" depends on server's operating
environment and/or the server's filesystem semantics. For example,
for servers obeying POSIX semantics, time_access would be updated
only by the READLINK, READ, and READDIR operations and not any of the
operations that modify the content of the object. Of course, setting
the corresponding time_access_set attribute is another way to modify
the time_access attribute.
Whenever the file object resides on a writable filesystem, the server 5.8.1.1. Attribute 0: supported_attrs
should make best efforts to record time_access into stable storage.
However, to mitigate the performance effects of doing so, and most
especially whenever the server is satisfying the read of the object's
content from its cache, the server MAY cache access time updates and
lazily write them to stable storage. It is also acceptable to give
administrators of the server the option to disable time_access
updates.
5.8. Interpreting owner and owner_group The bit vector which would retrieve all REQUIRED and RECOMMENDED
attributes that are supported for this object. The scope of this
attribute applies to all objects with a matching fsid.
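As an illustration of how such a bit vector relates to the attribute
numbers in the Id column, the sketch below packs attribute numbers
into a list of 32-bit words and unpacks them again. It assumes the
bitmap4 layout in which attribute number n occupies bit (n mod 32) of
word (n / 32); that layout is defined elsewhere in the specification,
not in this section, and the function names are hypothetical.

```python
def attrs_to_bitmap(attr_ids):
    """Pack attribute numbers into a list of 32-bit words (bitmap4-style)."""
    words = []
    for n in attr_ids:
        word_index, bit = divmod(n, 32)
        # Grow the word list on demand so trailing zero words are omitted.
        while len(words) <= word_index:
            words.append(0)
        words[word_index] |= 1 << bit
    return words


def bitmap_to_attrs(words):
    """Return the sorted attribute numbers whose bits are set."""
    return [
        index * 32 + bit
        for index, word in enumerate(words)
        for bit in range(32)
        if (word >> bit) & 1
    ]
```

A client might intersect the bitmap it wants with the server's
supported_attrs value before issuing GETATTR, so that it does not
request attributes the server does not support.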
The recommended attributes "owner" and "owner_group" (and also users 5.8.1.2. Attribute 1: type
Designates the type of an object in terms of one of a number of
special constants:
o NF4REG designates a regular file.
o NF4DIR designates a directory.
o NF4BLK designates a block device special file.
o NF4CHR designates a character device special file.
o NF4LNK designates a symbolic link.
o NF4SOCK designates a named socket special file.
o NF4FIFO designates a fifo special file.
o NF4ATTRDIR designates a named attribute directory.
o NF4NAMEDATTR designates a named attribute.
Within the explanatory text and operation descriptions, the following
phrases will be used with the meanings given below:
o The phrase "is a directory" means that the object is of type
NF4DIR or of type NF4ATTRDIR.
o The phrase "is a special file" means that the object is of one of
the types NF4BLK, NF4CHR, NF4SOCK, or NF4FIFO.
o The phrase "is an ordinary file" means that the object is of type
NF4REG or of type NF4NAMEDATTR.
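The three phrases can be read as membership tests over the type
constants. The sketch below is illustrative only; it uses the constant
names as strings rather than their XDR enum values, and the function
names are hypothetical.

```python
DIRECTORY_TYPES = {"NF4DIR", "NF4ATTRDIR"}
SPECIAL_FILE_TYPES = {"NF4BLK", "NF4CHR", "NF4SOCK", "NF4FIFO"}
ORDINARY_FILE_TYPES = {"NF4REG", "NF4NAMEDATTR"}


def is_directory(obj_type: str) -> bool:
    # "is a directory": NF4DIR or NF4ATTRDIR
    return obj_type in DIRECTORY_TYPES


def is_special_file(obj_type: str) -> bool:
    # "is a special file": NF4BLK, NF4CHR, NF4SOCK, or NF4FIFO
    return obj_type in SPECIAL_FILE_TYPES


def is_ordinary_file(obj_type: str) -> bool:
    # "is an ordinary file": NF4REG or NF4NAMEDATTR
    return obj_type in ORDINARY_FILE_TYPES
```

Note that NF4LNK falls into none of the three groups, so a symbolic
link is neither a directory, a special file, nor an ordinary file in
this terminology.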
5.8.1.3. Attribute 2: fh_expire_type
Server uses this to specify filehandle expiration behavior to the
client. See Section 4 for additional description.
5.8.1.4. Attribute 3: change
A value created by the server that the client can use to determine if
file data, directory contents or attributes of the object have been
modified. The server may return the object's time_metadata attribute
for this attribute's value but only if the file system object can not
be updated more frequently than the resolution of time_metadata.
5.8.1.5. Attribute 4: size
The size of the object in bytes.
5.8.1.6. Attribute 5: link_support
True, if the object's file system supports hard links.
5.8.1.7. Attribute 6: symlink_support
True, if the object's file system supports symbolic links.
5.8.1.8. Attribute 7: named_attr
True, if this object has named attributes. In other words, the object
has a non-empty named attribute directory.
5.8.1.9. Attribute 8: fsid
Unique file system identifier for the file system holding this
object. fsid contains major and minor components, each of which is of
data type uint64_t.
5.8.1.10. Attribute 9: unique_handles
True, if two distinct filehandles are guaranteed to refer to two
different file system objects.
5.8.1.11. Attribute 10: lease_time
Duration of leases at server in seconds.
5.8.1.12. Attribute 11: rdattr_error
Error returned from an attempt to retrieve attributes during a
READDIR operation.
5.8.1.13. Attribute 19: filehandle
The filehandle of this object (primarily for READDIR requests).
5.8.2. Definitions of Uncategorized RECOMMENDED Attributes
The definitions of most of the RECOMMENDED attributes follow.
Collections that share a common category are defined in other
sections.
5.8.2.1. Attribute 14: archive
True, if this file has been archived since the time of last
modification (deprecated in favor of time_backup).
5.8.2.2. Attribute 15: cansettime
True, if the server is able to change the times for a file system
object as specified in a SETATTR operation.
5.8.2.3. Attribute 16: case_insensitive
True, if file name comparisons on this file system are case
insensitive.
5.8.2.4. Attribute 17: case_preserving
True, if file name case on this file system is preserved.
5.8.2.5. Attribute 18: chown_restricted
If TRUE, the server will reject any request to change either the
owner or the group associated with a file if the caller is not a
privileged user (for example, "root" in UNIX operating environments
or the holder of the "Take Ownership" privilege in Windows 2000).
5.8.2.6. Attribute 20: fileid
A number uniquely identifying the file within the file system.
5.8.2.7. Attribute 21: files_avail
File slots available to this user on the file system containing this
object - this should be the smallest relevant limit.
5.8.2.8. Attribute 22: files_free
Free file slots on the file system containing this object - this
should be the smallest relevant limit.
5.8.2.9. Attribute 23: files_total
Total file slots on the file system containing this object.
5.8.2.10. Attribute 24: fs_locations
Locations where this file system may be found. If the server returns
NFS4ERR_MOVED as an error, this attribute MUST be supported.
5.8.2.11. Attribute 25: hidden
True, if the file is considered hidden with respect to the Windows
API.
5.8.2.12. Attribute 26: homogeneous
True, if this object's file system is homogeneous, i.e., all per file
system attributes are the same for all of the file system's objects.
5.8.2.13. Attribute 27: maxfilesize
Maximum supported file size for the file system of this object.
5.8.2.14. Attribute 28: maxlink
Maximum number of links for this object.
5.8.2.15. Attribute 29: maxname
Maximum file name size supported for this object.
5.8.2.16. Attribute 30: maxread
Maximum read size supported for this object.
5.8.2.17. Attribute 31: maxwrite
Maximum write size supported for this object. This attribute SHOULD
be supported if the file is writable. Lack of this attribute can
lead to the client either wasting bandwidth or not receiving the best
performance.
5.8.2.18. Attribute 32: mimetype
MIME body type/subtype of this object.
5.8.2.19. Attribute 55: mounted_on_fileid
Like fileid, but if the target filehandle is the root of a file
system, this attribute represents the fileid of the underlying
directory.
UNIX-based operating environments connect a file system into the
namespace by connecting (mounting) the file system onto the existing
file object (the mount point, usually a directory) of an existing
file system. When the mount point's parent directory is read via an
API like readdir(), the return results are directory entries, each
with a component name and a fileid. The fileid of the mount point's
directory entry will be different from the fileid that the stat()
system call returns. The stat() system call is returning the fileid
of the root of the mounted file system, whereas readdir() is
returning the fileid stat() would have returned before any file
systems were mounted on the mount point.
Unlike NFSv3, NFSv4.0 allows a client's LOOKUP request to cross other
file systems. The client detects the file system crossing whenever
the filehandle argument of LOOKUP has an fsid attribute different
from that of the filehandle returned by LOOKUP. A UNIX-based client
will consider this a "mount point crossing". UNIX has a legacy
scheme for allowing a process to determine its current working
directory. This relies on readdir() of a mount point's parent and
stat() of the mount point returning fileids as previously described.
The mounted_on_fileid attribute corresponds to the fileid that
readdir() would have returned as described previously.
While the NFSv4.0 client could simply fabricate a fileid
corresponding to what mounted_on_fileid provides (and if the server
does not support mounted_on_fileid, the client has no choice), there
is a risk that the client will generate a fileid that conflicts with
one that is already assigned to another object in the file system.
Instead, if the server can provide the mounted_on_fileid, the
potential for client operational problems in this area is eliminated.
If the server detects that there is no mount point at the target
file object, then the value for mounted_on_fileid that it returns is
the same as that of the fileid attribute.
The mounted_on_fileid attribute is RECOMMENDED, so the server SHOULD
provide it if possible, and for a UNIX-based server, this is
straightforward. Usually, mounted_on_fileid will be requested during
a READDIR operation, in which case it is trivial (at least for UNIX-
based servers) to return mounted_on_fileid since it is equal to the
fileid of a directory entry returned by readdir(). If
mounted_on_fileid is requested in a GETATTR operation, the server
should obey an invariant that has it returning a value that is equal
to the file object's entry in the object's parent directory, i.e.
what readdir() would have returned. Some operating environments
allow a series of two or more file systems to be mounted onto a
single mount point. In this case, for the server to obey the
aforementioned invariant, it will need to find the base mount point,
and not the intermediate mount points.
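The client-side logic described above can be sketched roughly as follows. This is only an illustration: the Fsid tuple and the two helper functions are hypothetical names, and returning None to mean "the client has to fabricate a fileid" is an assumption made for this example.

```python
from typing import NamedTuple, Optional


class Fsid(NamedTuple):
    """The fsid attribute: major and minor components (both uint64_t)."""
    major: int
    minor: int


def crossed_file_system(parent_fsid: Fsid, child_fsid: Fsid) -> bool:
    """A LOOKUP crossed a file system boundary if the fsid of the
    directory looked up in differs from the fsid of the result."""
    return parent_fsid != child_fsid


def dirent_fileid(fileid: int,
                  mounted_on_fileid: Optional[int],
                  crossed: bool) -> Optional[int]:
    """Choose the fileid a client would report for a directory entry.

    mounted_on_fileid is None when the server does not support it.
    A None result means the client has no choice but to fabricate one.
    """
    if mounted_on_fileid is not None:
        # Equals fileid when nothing is mounted on the object.
        return mounted_on_fileid
    if not crossed:
        # No mount point crossing: the ordinary fileid is what
        # readdir() would have returned.
        return fileid
    return None
```

For example, dirent_fileid(2, 128, True) yields 128, the fileid of the underlying directory, rather than 2, the fileid of the root of the mounted file system.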
5.8.2.20. Attribute 34: no_trunc
If this attribute is TRUE, then if the client uses a file name longer
than name_max, an error will be returned instead of the name being
truncated.
5.8.2.21. Attribute 35: numlinks
Number of hard links to this object.
5.8.2.22. Attribute 36: owner
The string name of the owner of this object.
5.8.2.23. Attribute 37: owner_group
The string name of the group ownership of this object.
5.8.2.24. Attribute 38: quota_avail_hard
The value in bytes which represents the amount of additional disk
space beyond the current allocation that can be allocated to this
file or directory before further allocations will be refused. It is
understood that this space may be consumed by allocations to other
files or directories.
5.8.2.25. Attribute 39: quota_avail_soft
The value in bytes which represents the amount of additional disk
space that can be allocated to this file or directory before the user
may reasonably be warned. It is understood that this space may be
consumed by allocations to other files or directories though there is
a rule as to which other files or directories.
5.8.2.26. Attribute 40: quota_used
The value in bytes which represents the amount of disk space used by
this file or directory and possibly a number of other similar files
or directories, where the set of "similar" meets at least the
criterion that allocating space to any file or directory in the set
will reduce the "quota_avail_hard" of every other file or directory
in the set.
Note that there may be a number of distinct but overlapping sets of
files or directories for which a quota_used value is maintained.
E.g., "all files with a given owner", "all files with a given group
owner", etc.
The server is at liberty to choose any of those sets but should do so
in a repeatable way. The rule may be configured per file system or
may be "choose the set with the smallest quota".
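One way a server might apply the "choose the set with the smallest quota" rule in a repeatable way is sketched below. The QuotaSet record and its field names are hypothetical, and breaking ties by set name is an assumption added only to make the choice deterministic.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class QuotaSet:
    """Hypothetical record for one set of files sharing a quota."""
    name: str        # e.g. "owner" or "group"
    hard_limit: int  # quota for the set, in bytes
    used: int        # bytes currently used by files in the set


def pick_quota_set(sets: Iterable[QuotaSet]) -> QuotaSet:
    """Pick the set with the smallest quota.

    Ties are broken by name so that repeated calls over the same
    sets always give the same answer.
    """
    return min(sets, key=lambda s: (s.hard_limit, s.name))


def quota_used(sets: Iterable[QuotaSet]) -> int:
    """Value reported for quota_used under this rule."""
    return pick_quota_set(sets).used
```

With an owner set limited to 1000 bytes and a group set limited to 500 bytes, the group set is chosen and its usage is what quota_used reports.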
5.8.2.27. Attribute 41: rawdev
Raw device identifier; the UNIX device major/minor node information.
If the value of type is not NF4BLK or NF4CHR, the value returned
SHOULD NOT be considered useful.
5.8.2.28. Attribute 42: space_avail
Disk space in bytes available to this user on the file system
containing this object - this should be the smallest relevant limit.
5.8.2.29. Attribute 43: space_free
Free disk space in bytes on the file system containing this object -
this should be the smallest relevant limit.
5.8.2.30. Attribute 44: space_total
Total disk space in bytes on the file system containing this object.
5.8.2.31. Attribute 45: space_used
Number of file system bytes allocated to this object.
5.8.2.32. Attribute 46: system
This attribute is TRUE if this file is a "system" file with respect
to the Windows operating environment.
5.8.2.33. Attribute 47: time_access
The time_access attribute represents the time of last access to the
object by a read that was satisfied by the server. The notion of
what is an "access" depends on the server's operating environment and/or
the server's file system semantics. For example, for servers obeying
POSIX semantics, time_access would be updated only by the READLINK,
READ, and READDIR operations and not any of the operations that
modify the content of the object. Of course, setting the
corresponding time_access_set attribute is another way to modify the
time_access attribute.
Whenever the file object resides on a writable file system, the
server should make best efforts to record time_access into stable
storage. However, to mitigate the performance effects of doing so,
and most especially whenever the server is satisfying the read of the
object's content from its cache, the server MAY cache access time
updates and lazily write them to stable storage. It is also
acceptable to give administrators of the server the option to disable
time_access updates.
5.8.2.34. Attribute 48: time_access_set
Set the time of last access to the object. SETATTR use only.
5.8.2.35. Attribute 49: time_backup
The time of last backup of the object.
5.8.2.36. Attribute 50: time_create
The time of creation of the object. This attribute does not have any
relation to the traditional UNIX file attribute "ctime" or "change
time".
5.8.2.37. Attribute 51: time_delta
Smallest useful server time granularity.
5.8.2.38. Attribute 52: time_metadata
The time of last metadata modification of the object.
5.8.2.39. Attribute 53: time_modify
The time of last modification to the object.
5.8.2.40. Attribute 54: time_modify_set
Set the time of last modification to the object. SETATTR use only.
5.9. Interpreting owner and owner_group
The RECOMMENDED attributes "owner" and "owner_group" (and also users
and groups within the "acl" attribute) are represented in terms of a and groups within the "acl" attribute) are represented in terms of a
UTF-8 string. To avoid a representation that is tied to a particular UTF-8 string. To avoid a representation that is tied to a particular
underlying implementation at the client or server, the use of the underlying implementation at the client or server, the use of the
UTF-8 string has been chosen. Note that section 6.1 of [24] provides UTF-8 string has been chosen. Note that section 6.1 of RFC2624 [25]
additional rationale. It is expected that the client and server will provides additional rationale. It is expected that the client and
have their own local representation of owner and owner_group that is server will have their own local representation of owner and
used for local storage or presentation to the end user. Therefore, owner_group that is used for local storage or presentation to the end
it is expected that when these attributes are transferred between the user. Therefore, it is expected that when these attributes are
client and server that the local representation is translated to a transferred between the client and server that the local
syntax of the form "user@dns_domain". This will allow for a client representation is translated to a syntax of the form "user@
and server that do not use the same local representation the ability dns_domain". This will allow for a client and server that do not use
to translate to a common syntax that can be interpreted by both. the same local representation the ability to translate to a common
syntax that can be interpreted by both.
Similarly, security principals may be represented in different ways Similarly, security principals may be represented in different ways
by different security mechanisms. Servers normally translate these by different security mechanisms. Servers normally translate these
representations into a common format, generally that used by local representations into a common format, generally that used by local
storage, to serve as a means of identifying the users corresponding storage, to serve as a means of identifying the users corresponding
to these security principals. When these local identifiers are to these security principals. When these local identifiers are
translated to the form of the owner attribute, associated with files translated to the form of the owner attribute, associated with files
created by such principals they identify, in a common format, the created by such principals they identify, in a common format, the
users associated with each corresponding set of security principals. users associated with each corresponding set of security principals.
The translation used to interpret owner and group strings is not The translation used to interpret owner and group strings is not
specified as part of the protocol. This allows various solutions to specified as part of the protocol. This allows various solutions to
be employed. For example, a local translation table may be consulted be employed. For example, a local translation table may be consulted
that maps between a numeric id to the user@dns_domain syntax. A name that maps between a numeric identifier to the user@dns_domain syntax.
service may also be used to accomplish the translation. A server may A name service may also be used to accomplish the translation. A
provide a more general service, not limited by any particular server may provide a more general service, not limited by any
translation (which would only translate a limited set of possible particular translation (which would only translate a limited set of
strings) by storing the owner and owner_group attributes in local possible strings) by storing the owner and owner_group attributes in
storage without any translation or it may augment a translation local storage without any translation or it may augment a translation
method by storing the entire string for attributes for which no method by storing the entire string for attributes for which no
translation is available while using the local representation for translation is available while using the local representation for
those cases in which a translation is available. those cases in which a translation is available.
Servers that do not provide support for all possible values of the Servers that do not provide support for all possible values of the
owner and owner_group attributes, should return an error owner and owner_group attributes, SHOULD return an error
(NFS4ERR_BADOWNER) when a string is presented that has no (NFS4ERR_BADOWNER) when a string is presented that has no
translation, as the value to be set for a SETATTR of the owner, translation, as the value to be set for a SETATTR of the owner,
owner_group, or acl attributes. When a server does accept an owner owner_group, or acl attributes. When a server does accept an owner
or owner_group value as valid on a SETATTR (and similarly for the or owner_group value as valid on a SETATTR (and similarly for the
owner and group strings in an acl), it is promising to return that owner and group strings in an acl), it needs to try to return that
same string when a corresponding GETATTR is done. Configuration same string for which see below) when a corresponding GETATTR is
changes and ill-constructed name translations (those that contain done. For some internationalization-related exceptions where this is
aliasing) may make that promise impossible to honor. Servers should not possible, see below. Configuration changes (including changes
make appropriate efforts to avoid a situation in which these from the mapping of the string to the local representation) and ill-
attributes have their values changed when no real change to ownership constructed name translations (those that contain aliasing) may make
has occurred. that promise impossible to honor. Servers should make appropriate
efforts to avoid a situation in which these attributes have their
values changed when no real change to ownership has occurred.
The "dns_domain" portion of the owner string is meant to be a DNS The "dns_domain" portion of the owner string is meant to be a DNS
domain name. For example, user@ietf.org. Servers should accept as domain name. For example, user@ietf.org. Servers should accept as
valid a set of users for at least one domain. A server may treat valid a set of users for at least one domain. A server may treat
other domains as having no valid translations. A more general other domains as having no valid translations. A more general
service is provided when a server is capable of accepting users for service is provided when a server is capable of accepting users for
multiple domains, or for all domains, subject to security multiple domains, or for all domains, subject to security
constraints. constraints.
As mentioned above, it is desirable that a server when accepting a
string of the form user@domain or group@domain in an attribute,
return this same string when that corresponding attribute is fetched.
Internationalization issues (for a general discussion of which see
Section 12) can make this impossible, and the client needs to take note of
the following situations:
o The string representing the domain may be converted to an equivalent
U-label, if presented using a form other than a U-label. See
Section 12.6 for details.
o The user or group may be returned in a different form, due to
normalization issues, although it will always be a canonically
equivalent string. See Section 12.7.3 for details.
In the case where there is no translation available to the client or In the case where there is no translation available to the client or
server, the attribute value must be constructed without the "@". server, the attribute value must be constructed without the "@".
Therefore, the absence of the @ from the owner or owner_group Therefore, the absence of the @ from the owner or owner_group
attribute signifies that no translation was available at the sender attribute signifies that no translation was available at the sender
and that the receiver of the attribute should not use that string as and that the receiver of the attribute should not use that string as
a basis for translation into its own internal format. Even though a basis for translation into its own internal format. Even though
the attribute value can not be translated, it may still be useful. the attribute value can not be translated, it may still be useful.
In the case of a client, the attribute string may be used for local In the case of a client, the attribute string may be used for local
display of ownership. display of ownership.
To provide a greater degree of compatibility with previous versions To provide a greater degree of compatibility with NFSv3, which
of NFS (i.e., v2 and v3), which identified users and groups by 32-bit identified users and groups by 32-bit unsigned user identifiers and
unsigned uid's and gid's, owner and group strings that consist of group identifiers, owner and group strings that consist of decimal
decimal numeric values with no leading zeros can be given a special numeric values with no leading zeros can be given a special
interpretation by clients and servers which choose to provide such interpretation by clients and servers which choose to provide such
support. The receiver may treat such a user or group string as support. The receiver may treat such a user or group string as
representing the same user as would be represented by a v2/v3 uid or representing the same user as would be represented by an NFSv3 uid or
gid having the corresponding numeric value. A server is not gid having the corresponding numeric value. A server is not
obligated to accept such a string, but may return an NFS4ERR_BADOWNER obligated to accept such a string, but may return an NFS4ERR_BADOWNER
instead. To avoid this mechanism being used to subvert user and instead. To avoid this mechanism being used to subvert user and
group translation, so that a client might pass all of the owners and group translation, so that a client might pass all of the owners and
groups in numeric form, a server SHOULD return an NFS4ERR_BADOWNER groups in numeric form, a server SHOULD return an NFS4ERR_BADOWNER
error when there is a valid translation for the user or owner error when there is a valid translation for the user or owner
designated in this way. In that case, the client must use the designated in this way. In that case, the client must use the
appropriate name@domain string and not the special form for appropriate name@domain string and not the special form for
compatibility. compatibility.
The owner string "nobody" may be used to designate an anonymous user, The owner string "nobody" may be used to designate an anonymous user,
which will be associated with a file created by a security principal which will be associated with a file created by a security principal
that cannot be mapped through normal means to the owner attribute. that cannot be mapped through normal means to the owner attribute.
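The three forms of owner and owner_group string discussed in this section can be told apart with a small classifier, sketched here in Python. The function name and the returned tuples are hypothetical. Splitting at the last "@", accepting "0" as a numeric value, and checking the 32-bit range are assumptions of this sketch and are not stated by the text.

```python
import re

# Decimal digits with no leading zeros; a lone "0" is accepted
# (an assumption of this sketch).
_NUMERIC = re.compile(r"0|[1-9][0-9]*")


def classify_owner(value: str):
    """Classify an owner or owner_group attribute string.

    Returns one of:
      ("name", user, dns_domain)  for the user@dns_domain form
      ("numeric", n)              for the optional NFSv3-compatible form
      ("untranslated", value)     when no "@" is present, meaning the
                                  sender had no translation available
    """
    if "@" in value:
        user, _, domain = value.rpartition("@")
        return ("name", user, domain)
    if _NUMERIC.fullmatch(value) and int(value) <= 0xFFFFFFFF:
        return ("numeric", int(value))
    return ("untranslated", value)
```

A receiver that does not support the numeric form would treat the "numeric" case like any other string it cannot translate, and a server may answer it with NFS4ERR_BADOWNER as described above.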
5.9. Character Case Attributes 5.10. Character Case Attributes
With respect to the case_insensitive and case_preserving attributes, With respect to the case_insensitive and case_preserving attributes,
each UCS-4 character (which UTF-8 encodes) has a "long descriptive each UCS-4 character (which UTF-8 encodes) has a "long descriptive
name" [25] which may or may not included the word "CAPITAL" or name" RFC1345 [26] which may or may not include the word "CAPITAL" or
"SMALL". The presence of SMALL or CAPITAL allows an NFS server to "SMALL". The presence of SMALL or CAPITAL allows an NFS server to
implement unambiguous and efficient table driven mappings for case implement unambiguous and efficient table driven mappings for case
insensitive comparisons, and non-case-preserving storage. For insensitive comparisons, and non-case-preserving storage, although
general character handling and internationalization issues, see there are variations that occur additional characters with a name
Section 1 "Internationalization". including "SMALL" or "CAPITAL" are added in a subsequent version of
Unicode.
5.10. Quota Attributes For general character handling and internationalization issues, see
Section 12. For details regarding case mapping, see the section
Case-based Mapping Used for Component4 Strings.
For the attributes related to filesystem quotas, the following 6. Access Control Attributes
definitions apply:
quota_avail_soft The value in bytes which represents the amount of Access Control Lists (ACLs) are file attributes that specify fine
additional disk space that can be allocated to this file or grained access control. This chapter covers the "acl", "aclsupport",
directory before the user may reasonably be warned. It is "mode", file attributes, and their interactions. Note that file
understood that this space may be consumed by allocations to other attributes may apply to any file system object.
files or directories though there is a rule as to which other
files or directories.
quota_avail_hard The value in bytes which represent the amount of 6.1. Goals
additional disk space beyond the current allocation that can be
allocated to this file or directory before further allocations
will be refused. It is understood that this space may be consumed
by allocations to other files or directories.
quota_used The value in bytes which represent the amount of disc ACLs and modes represent two well established models for specifying
space used by this file or directory and possibly a number of permissions. This chapter specifies requirements that attempt to
other similar files or directories, where the set of "similar" meet the following goals:
meets at least the criterion that allocating space to any file or
directory in the set will reduce the "quota_avail_hard" of every
other file or directory in the set.
Note that there may be a number of distinct but overlapping sets o If a server supports the mode attribute, it should provide
of files or directories for which a quota_used value is maintained reasonable semantics to clients that only set and retrieve the
(e.g., "all files with a given owner", "all files with a given mode attribute.
group owner", etc.).
The server is at liberty to choose any of those sets but should do o If a server supports ACL attributes, it should provide reasonable
so in a repeatable way. The rule may be configured per-filesystem semantics to clients that only set and retrieve those attributes.
or may be "choose the set with the smallest quota".
5.11. Access Control Lists o On servers that support the mode attribute, if ACL attributes have
never been set on an object, via inheritance or explicitly, the
behavior should be traditional UNIX-like behavior.
The NFS version 4 ACL attribute is an array of access control entries o On servers that support the mode attribute, if the ACL attributes
(ACE). Although, the client can read and write the ACL attribute, have been previously set on an object, either explicitly or via
the NFSv4 model is the server does all access control based on the inheritance:
server's interpretation of the ACL. If at any point the client wants
to check access without issuing an operation that modifies or reads
data or metadata, the client can use the OPEN and ACCESS operations
to do so. There are various access control entry types, as defined
in the Section "ACE type". The server is able to communicate which
ACE types are supported by returning the appropriate value within the
aclsupport attribute. Each ACE covers one or more operations on a
file or directory as described in the Section "ACE Access Mask". It
may also contain one or more flags that modify the semantics of the
ACE as defined in the Section "ACE flag".
The NFS ACE attribute is defined as follows: * Setting only the mode attribute should effectively control the
traditional UNIX-like permissions of read, write, and execute
on owner, owner_group, and other.
* Setting only the mode attribute should provide reasonable
security. For example, setting a mode of 000 should be enough
to ensure that future opens for read or write by any principal
fail, regardless of a previously existing or inherited ACL.
o When a mode attribute is set on an object, the ACL attributes may
need to be modified so as to not conflict with the new mode. In
such cases, it is desirable that the ACL keep as much information
as possible. This includes information about inheritance, AUDIT
and ALARM ACEs, and permissions granted and denied that do not
conflict with the new mode.
6.2. File Attributes Discussion
6.2.1. Attribute 12: acl
The NFSv4.0 ACL attribute contains an array of access control entries
(ACEs) that are associated with the file system object. Although the
client can read and write the acl attribute, the server is
responsible for using the ACL to perform access control. The client
can use the OPEN or ACCESS operations to check access without
modifying or reading data or metadata.
The NFS ACE structure is defined as follows:
typedef uint32_t acetype4; typedef uint32_t acetype4;
typedef uint32_t aceflag4; typedef uint32_t aceflag4;
typedef uint32_t acemask4; typedef uint32_t acemask4;
struct nfsace4 { struct nfsace4 {
acetype4 type; acetype4 type;
aceflag4 flag; aceflag4 flag;
acemask4 access_mask; acemask4 access_mask;
utf8str_mixed who; utf8_must who;
}; };
To determine if a request succeeds, each nfsace4 entry is processed To determine if a request succeeds, the server processes each nfsace4
in order by the server. Only ACEs which have a "who" that matches entry in order. Only ACEs which have a "who" that matches the
the requester are considered. Each ACE is processed until all of the requester are considered. Each ACE is processed until all of the
bits of the requester's access have been ALLOWED. Once a bit (see bits of the requester's access have been ALLOWED. Once a bit (see
below) has been ALLOWED by an ACCESS_ALLOWED_ACE, it is no longer below) has been ALLOWED by an ACCESS_ALLOWED_ACE, it is no longer
considered in the processing of later ACEs. If an ACCESS_DENIED_ACE considered in the processing of later ACEs. If an ACCESS_DENIED_ACE
is encountered where the requester's access still has unALLOWED bits is encountered where the requester's access still has unALLOWED bits
in common with the "access_mask" of the ACE, the request is denied. in common with the "access_mask" of the ACE, the request is denied.
However, unlike the ALLOWED and DENIED ACE types, the ALARM and AUDIT When the ACL is fully processed, if there are bits in the requester's
ACE types do not affect a requester's access, and instead are for mask that have not been ALLOWED or DENIED, access is denied.
triggering events as a result of a requester's access attempt.
Therefore, all AUDIT and ALARM ACEs are processed until end of the Unlike the ALLOW and DENY ACE types, the ALARM and AUDIT ACE types do
ACL. When the ACL is fully processed, if there are bits in not affect a requester's access, and instead are for triggering
requester's mask that have not been considered whether the server events as a result of a requester's access attempt. Therefore, AUDIT
allows or denies the access is undefined. If there is a mode and ALARM ACEs are processed only after processing ALLOW and DENY
attribute on the file, then this cannot happen, since the mode's ACEs.
MODE4_*OTH bits will map to EVERYONE@ ACEs that unambiguously specify
the requester's access.
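The ALLOW and DENY processing described above can be sketched as follows. It follows the newer wording, in which bits left neither ALLOWED nor DENIED at the end of the ACL result in denial. The Ace class, the access_permitted() function, and the representation of the requester as a set of "who" strings are hypothetical simplifications. AUDIT and ALARM ACEs are ignored here because they do not affect the access decision.

```python
from dataclasses import dataclass
from typing import Iterable, Set

ACE4_ACCESS_ALLOWED_ACE_TYPE = 0x00000000
ACE4_ACCESS_DENIED_ACE_TYPE = 0x00000001


@dataclass(frozen=True)
class Ace:
    """Simplified stand-in for nfsace4 (the flag field is omitted)."""
    type: int
    access_mask: int
    who: str


def access_permitted(acl: Iterable[Ace],
                     requester_ids: Set[str],
                     requested_mask: int) -> bool:
    """Evaluate ALLOW and DENY ACEs in order for one request."""
    remaining = requested_mask  # bits not yet ALLOWED
    for ace in acl:
        if remaining == 0:
            break  # every requested bit has been ALLOWED
        if ace.who not in requester_ids:
            continue  # only ACEs whose "who" matches are considered
        if ace.type == ACE4_ACCESS_ALLOWED_ACE_TYPE:
            remaining &= ~ace.access_mask
        elif ace.type == ACE4_ACCESS_DENIED_ACE_TYPE:
            if remaining & ace.access_mask:
                return False  # a still-unALLOWED bit is denied
    # Bits neither ALLOWED nor DENIED are denied.
    return remaining == 0
```

Because ACEs are processed in order, a DENY entry placed before a broader ALLOW entry takes precedence for the bits the two have in common, while the same DENY entry placed after the ALLOW entry has no effect on those bits.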
The NFS version 4 ACL model is quite rich. Some server platforms may The NFSv4.0 ACL model is quite rich. Some server platforms may
provide access control functionality that goes beyond the UNIX-style provide access control functionality that goes beyond the UNIX-style
mode attribute, but which is not as rich as the NFS ACL model. So mode attribute, but which is not as rich as the NFS ACL model. So
that users can take advantage of this more limited functionality, the that users can take advantage of this more limited functionality, the
server may indicate that it supports ACLs as long as it follows the server may support the acl attributes by mapping between its ACL
guidelines for mapping between its ACL model and the NFS version 4 model and the NFSv4.0 ACL model. Servers must ensure that the ACL
ACL model. they actually store or enforce is at least as strict as the NFSv4 ACL
that was set. It is tempting to accomplish this by rejecting any ACL
that falls outside the small set that can be represented accurately.
However, such an approach can render ACLs unusable without special
client-side knowledge of the server's mapping, which defeats the
purpose of having a common NFSv4 ACL protocol. Therefore servers
should accept every ACL that they can without compromising security.
To help accomplish this, servers may make a special exception, in the
case of unsupported permission bits, to the rule that bits not
ALLOWED or DENIED by an ACL must be denied. For example, a UNIX-
style server might choose to silently allow read attribute
permissions even though an ACL does not explicitly allow those
permissions. (An ACL that explicitly denies permission to read
attributes should still be rejected.)
The situation is complicated by the fact that a server may have The situation is complicated by the fact that a server may have
multiple modules that enforce ACLs. For example, the enforcement for multiple modules that enforce ACLs. For example, the enforcement for
NFS version 4 access may be different from the enforcement for local NFSv4.0 access may be different from, but not weaker than, the
access, and both may be different from the enforcement for access enforcement for local access, and both may be different from the
through other protocols such as SMB. So it may be useful for a enforcement for access through other protocols such as SMB. So it
server to accept an ACL even if not all of its modules are able to may be useful for a server to accept an ACL even if not all of its
support it. modules are able to support it.
The guiding principle in all cases is that the server must not accept The guiding principle with regard to NFSv4 access is that the server
ACLs that appear to make the file more secure than it really is. must not accept ACLs that appear to make access to the file more
restrictive than it really is.
5.11.1. ACE type
+-------+-----------------------------------------------------------+
| Type | Description |
+-------+-----------------------------------------------------------+
| ALLOW | Explicitly grants the access defined in acemask4 to the |
| | file or directory. |
| DENY | Explicitly denies the access defined in acemask4 to the |
| | file or directory. |
| AUDIT | LOG (system dependent) any access attempt to a file or |
| | directory which uses any of the access methods specified |
| | in acemask4. |
| ALARM | Generate a system ALARM (system dependent) when any |
| | access attempt is made to a file or directory for the |
| | access methods specified in acemask4. |
+-------+-----------------------------------------------------------+
Table 4
A server need not support all of the above ACE types. The bitmask
constants used to represent the above definitions within the
aclsupport attribute are as follows:
6.2.1.1. ACE Type
The constants used for the type field (acetype4) are as follows:
const ACE4_ACCESS_ALLOWED_ACE_TYPE = 0x00000000;
const ACE4_ACCESS_DENIED_ACE_TYPE = 0x00000001;
const ACE4_SYSTEM_AUDIT_ACE_TYPE = 0x00000002;
const ACE4_SYSTEM_ALARM_ACE_TYPE = 0x00000003;
All four types are permitted in the acl attribute.
+------------------------------+--------------+---------------------+
| Value | Abbreviation | Description |
+------------------------------+--------------+---------------------+
| ACE4_ACCESS_ALLOWED_ACE_TYPE | ALLOW | Explicitly grants |
| | | the access defined |
| | | in acemask4 to the |
| | | file or directory. |
| ACE4_ACCESS_DENIED_ACE_TYPE | DENY | Explicitly denies |
| | | the access defined |
| | | in acemask4 to the |
| | | file or directory. |
| ACE4_SYSTEM_AUDIT_ACE_TYPE | AUDIT | LOG (in a system |
| | | dependent way) any |
| | | access attempt to a |
| | | file or directory |
| | | which uses any of |
| | | the access methods |
| | | specified in |
| | | acemask4. |
| ACE4_SYSTEM_ALARM_ACE_TYPE | ALARM | Generate a system |
| | | ALARM (system |
| | | dependent) when any |
| | | access attempt is |
| | | made to a file or |
| | | directory for the |
| | | access methods |
| | | specified in |
| | | acemask4. |
+------------------------------+--------------+---------------------+
The "Abbreviation" column denotes how the types will be referred to
throughout the rest of this chapter.
6.2.1.2. Attribute 13: aclsupport
A server need not support all of the above ACE types. This attribute
indicates which ACE types are supported for the current file system.
The bitmask constants used to represent the above definitions within
the aclsupport attribute are as follows:
const ACL4_SUPPORT_ALLOW_ACL = 0x00000001;
const ACL4_SUPPORT_DENY_ACL = 0x00000002;
const ACL4_SUPPORT_AUDIT_ACL = 0x00000004;
const ACL4_SUPPORT_ALARM_ACL = 0x00000008;
Servers which support either the ALLOW or DENY ACE type SHOULD
support both ALLOW and DENY ACE types.
The semantics of the "type" field follow the descriptions provided
above.
The constants used for the type field (acetype4) are as follows:
const ACE4_ACCESS_ALLOWED_ACE_TYPE = 0x00000000;
const ACE4_ACCESS_DENIED_ACE_TYPE = 0x00000001;
const ACE4_SYSTEM_AUDIT_ACE_TYPE = 0x00000002;
const ACE4_SYSTEM_ALARM_ACE_TYPE = 0x00000003;
Clients should not attempt to set an ACE unless the server claims
support for that ACE type. If the server receives a request to set
an ACE that it cannot store, it MUST reject the request with
NFS4ERR_ATTRNOTSUPP. If the server receives a request to set an ACE
that it can store but cannot enforce, the server SHOULD reject the
request with NFS4ERR_ATTRNOTSUPP.
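As a rough illustration of the client-side check described above, the
following Python sketch maps each ACE type to the aclsupport bit that
advertises it. The constant values are those listed in this document;
the table SUPPORT_BIT_FOR_TYPE and the function server_claims_support
are hypothetical names introduced only for this example.

```python
# ACE type values (acetype4) and aclsupport bits, as listed in the text.
ACE4_ACCESS_ALLOWED_ACE_TYPE = 0x00000000
ACE4_ACCESS_DENIED_ACE_TYPE = 0x00000001
ACE4_SYSTEM_AUDIT_ACE_TYPE = 0x00000002
ACE4_SYSTEM_ALARM_ACE_TYPE = 0x00000003

ACL4_SUPPORT_ALLOW_ACL = 0x00000001
ACL4_SUPPORT_DENY_ACL = 0x00000002
ACL4_SUPPORT_AUDIT_ACL = 0x00000004
ACL4_SUPPORT_ALARM_ACL = 0x00000008

# Hypothetical helper table: the aclsupport bit for each ACE type.
SUPPORT_BIT_FOR_TYPE = {
    ACE4_ACCESS_ALLOWED_ACE_TYPE: ACL4_SUPPORT_ALLOW_ACL,
    ACE4_ACCESS_DENIED_ACE_TYPE: ACL4_SUPPORT_DENY_ACL,
    ACE4_SYSTEM_AUDIT_ACE_TYPE: ACL4_SUPPORT_AUDIT_ACL,
    ACE4_SYSTEM_ALARM_ACE_TYPE: ACL4_SUPPORT_ALARM_ACL,
}


def server_claims_support(aclsupport: int, ace_type: int) -> bool:
    """Return True if the aclsupport bitmask advertises ace_type."""
    bit = SUPPORT_BIT_FOR_TYPE.get(ace_type)
    # Unknown ACE types are treated as unsupported.
    return bit is not None and (aclsupport & bit) != 0
```

A client following the advice above would call such a check before
sending a SETATTR of the acl attribute and would omit ACEs whose type
the server does not claim to support.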
Support for any of the ACL attributes is optional (albeit
RECOMMENDED).
Example: suppose a server can enforce NFS ACLs for NFS access but
cannot enforce ACLs for local access. If arbitrary processes can run
on the server, then the server SHOULD NOT indicate ACL support. On
the other hand, if only trusted administrative programs run locally,
then the server may indicate ACL support.
5.11.2. ACE Access Mask
The access_mask field contains values based on the following:
+-------------------+-----------------------------------------------+
| Access | Description |
+-------------------+-----------------------------------------------+
| READ_DATA | Permission to read the data of the file |
| LIST_DIRECTORY | Permission to list the contents of a |
| | directory |
| WRITE_DATA | Permission to modify the file's data |
| ADD_FILE | Permission to add a new file to a directory |
| APPEND_DATA | Permission to append data to a file |
| ADD_SUBDIRECTORY | Permission to create a subdirectory to a |
| | directory |
| READ_NAMED_ATTRS | Permission to read the named attributes of a |
| | file |
| WRITE_NAMED_ATTRS | Permission to write the named attributes of a |
| | file |
| EXECUTE | Permission to execute a file |
| DELETE_CHILD | Permission to delete a file or directory |
| | within a directory |
| READ_ATTRIBUTES | The ability to read basic attributes |
| | (non-acls) of a file |
| WRITE_ATTRIBUTES | Permission to change basic attributes |
| | (non-acls) of a file |
| DELETE | Permission to Delete the file |
| READ_ACL | Permission to Read the ACL |
| WRITE_ACL | Permission to Write the ACL |
| WRITE_OWNER | Permission to change the owner |
| SYNCHRONIZE | Permission to access file locally at the |
| | server with synchronous reads and writes |
+-------------------+-----------------------------------------------+
Table 5
6.2.1.3. ACE Access Mask
The bitmask constants used for the access mask field are as follows:
const ACE4_READ_DATA = 0x00000001;
const ACE4_LIST_DIRECTORY = 0x00000001;
const ACE4_WRITE_DATA = 0x00000002;
const ACE4_ADD_FILE = 0x00000002;
const ACE4_APPEND_DATA = 0x00000004;
const ACE4_ADD_SUBDIRECTORY = 0x00000004;
const ACE4_READ_NAMED_ATTRS = 0x00000008;
skipping to change at page 54, line 24 skipping to change at page 58, line 39
const ACE4_DELETE_CHILD = 0x00000040;
const ACE4_READ_ATTRIBUTES = 0x00000080;
const ACE4_WRITE_ATTRIBUTES = 0x00000100;
const ACE4_DELETE = 0x00010000;
const ACE4_READ_ACL = 0x00020000;
const ACE4_WRITE_ACL = 0x00040000;
const ACE4_WRITE_OWNER = 0x00080000;
const ACE4_SYNCHRONIZE = 0x00100000;
Server implementations need not provide the granularity of control
that is implied by this list of masks. For example, POSIX-based
systems might not distinguish APPEND_DATA (the ability to append to a
file) from WRITE_DATA (the ability to modify existing contents); both
masks would be tied to a single "write" permission. When such a
server returns attributes to the client, it would show both
APPEND_DATA and WRITE_DATA if and only if the write permission is
enabled.
If a server receives a SETATTR request that it cannot accurately
implement, it should err in the direction of more restricted
access. For example, suppose a server cannot distinguish overwriting
data from appending new data, as described in the previous paragraph.
If a client submits an ACE where APPEND_DATA is set but WRITE_DATA is
not (or vice versa), the server should reject the request with
NFS4ERR_ATTRNOTSUPP. Nonetheless, if the ACE has type DENY, the
server may silently turn on the other bit, so that both APPEND_DATA
and WRITE_DATA are denied.
5.11.3. ACE flag
The "flag" field contains values based on the following descriptions.
ACE4_FILE_INHERIT_ACE Can be placed on a directory and indicates
that this ACE should be added to each new non-directory file
created.
ACE4_DIRECTORY_INHERIT_ACE Can be placed on a directory and
indicates that this ACE should be added to each new directory
created.
ACE4_INHERIT_ONLY_ACE Can be placed on a directory but does not
apply to the directory, only to newly created files/directories as
specified by the above two flags.
ACE4_NO_PROPAGATE_INHERIT_ACE Can be placed on a directory.
Normally when a new directory is created and an ACE exists on the
parent directory which is marked ACE4_DIRECTORY_INHERIT_ACE, two
ACEs are placed on the new directory. One for the directory
itself and one which is an inheritable ACE for newly created
directories. This flag tells the server to not place an ACE on
the newly created directory which is inheritable by subdirectories
of the created directory.
ACE4_SUCCESSFUL_ACCESS_ACE_FLAG
ACE4_FAILED_ACCESS_ACE_FLAG The ACE4_SUCCESSFUL_ACCESS_ACE_FLAG
(SUCCESS) and ACE4_FAILED_ACCESS_ACE_FLAG (FAILED) flag bits
relate only to ACE4_SYSTEM_AUDIT_ACE_TYPE (AUDIT) and
ACE4_SYSTEM_ALARM_ACE_TYPE (ALARM) ACE types. If during the
processing of the file's ACL, the server encounters an AUDIT or
ALARM ACE that matches the principal attempting the OPEN, the
server notes that fact, and the presence, if any, of the SUCCESS
and FAILED flags encountered in the AUDIT or ALARM ACE. Once the
server completes the ACL processing, and the share reservation
processing, and the OPEN call, it then notes if the OPEN succeeded
or failed. If the OPEN succeeded, and if the SUCCESS flag was set
for a matching AUDIT or ALARM, then the appropriate AUDIT or ALARM
event occurs. If the OPEN failed, and if the FAILED flag was set
for the matching AUDIT or ALARM, then the appropriate AUDIT or
ALARM event occurs. Clearly either or both of the SUCCESS or
FAILED can be set, but if neither is set, the AUDIT or ALARM ACE
is not useful.
The previously described processing applies to that of the ACCESS
operation as well. The difference being that "success" or
"failure" does not mean whether ACCESS returns NFS4_OK or not.
Success means whether ACCESS returns all requested and supported
bits. Failure means whether ACCESS failed to return a bit that
was requested and supported.
ACE4_IDENTIFIER_GROUP Indicates that the "who" refers to a GROUP as
defined under UNIX.
Note that some masks have coincident values, for example,
ACE4_READ_DATA and ACE4_LIST_DIRECTORY. The mask entries
ACE4_LIST_DIRECTORY, ACE4_ADD_FILE, and ACE4_ADD_SUBDIRECTORY are
intended to be used with directory objects, while ACE4_READ_DATA,
ACE4_WRITE_DATA, and ACE4_APPEND_DATA are intended to be used with
non-directory objects.
6.2.1.3.1. Discussion of Mask Attributes
ACE4_READ_DATA
Operation(s) affected:
READ
OPEN
Discussion:
Permission to read the data of the file.
Servers SHOULD allow a user the ability to read the data of the
file when only the ACE4_EXECUTE access mask bit is allowed.
ACE4_LIST_DIRECTORY
Operation(s) affected:
READDIR
Discussion:
Permission to list the contents of a directory.
ACE4_WRITE_DATA
Operation(s) affected:
WRITE
OPEN
SETATTR of size
Discussion:
Permission to modify a file's data.
ACE4_ADD_FILE
Operation(s) affected:
CREATE
LINK
OPEN
RENAME
Discussion:
Permission to add a new file in a directory. The CREATE
operation is affected when nfs_ftype4 is NF4LNK, NF4BLK,
NF4CHR, NF4SOCK, or NF4FIFO. (NF4DIR is not listed because it
is covered by ACE4_ADD_SUBDIRECTORY.) OPEN is affected when
used to create a regular file. LINK and RENAME are always
affected.
ACE4_APPEND_DATA
Operation(s) affected:
WRITE
OPEN
SETATTR of size
Discussion:
The ability to modify a file's data, but only starting at EOF.
This allows for the notion of append-only files, by allowing
ACE4_APPEND_DATA and denying ACE4_WRITE_DATA to the same user
or group. If a file has an ACL such as the one described above
and a WRITE request is made for somewhere other than EOF, the
server SHOULD return NFS4ERR_ACCESS.
ACE4_ADD_SUBDIRECTORY
Operation(s) affected:
CREATE
RENAME
Discussion:
Permission to create a subdirectory in a directory. The CREATE
operation is affected when nfs_ftype4 is NF4DIR. The RENAME
operation is always affected.
ACE4_READ_NAMED_ATTRS
Operation(s) affected:
OPENATTR
Discussion:
Permission to read the named attributes of a file or to look up
the named attributes directory. OPENATTR is affected when it
is not used to create a named attribute directory. This is
when 1.) createdir is TRUE, but a named attribute directory
already exists, or 2.) createdir is FALSE.
ACE4_WRITE_NAMED_ATTRS
Operation(s) affected:
OPENATTR
Discussion:
Permission to write the named attributes of a file or to create
a named attribute directory. OPENATTR is affected when it is
used to create a named attribute directory. This is when
createdir is TRUE and no named attribute directory exists. The
ability to check whether or not a named attribute directory
exists depends on the ability to look it up; therefore, users
also need the ACE4_READ_NAMED_ATTRS permission in order to
create a named attribute directory.
ACE4_EXECUTE
Operation(s) affected:
READ
OPEN
REMOVE
RENAME
LINK
CREATE
Discussion:
Permission to execute a file.
Servers SHOULD allow a user the ability to read the data of the
file when only the ACE4_EXECUTE access mask bit is allowed.
This is because there is no way to execute a file without
reading the contents. Though a server may treat ACE4_EXECUTE
and ACE4_READ_DATA bits identically when deciding to permit a
READ operation, it SHOULD still allow the two bits to be set
independently in ACLs, and MUST distinguish between them when
replying to ACCESS operations. In particular, servers SHOULD
NOT silently turn on one of the two bits when the other is set,
as that would make it impossible for the client to correctly
enforce the distinction between read and execute permissions.
As an example, following a SETATTR of the following ACL:
nfsuser:ACE4_EXECUTE:ALLOW
A subsequent GETATTR of ACL for that file SHOULD return:
nfsuser:ACE4_EXECUTE:ALLOW
Rather than:
nfsuser:ACE4_EXECUTE/ACE4_READ_DATA:ALLOW
ACE4_EXECUTE
Operation(s) affected:
LOOKUP
Discussion:
Permission to traverse/search a directory.
ACE4_DELETE_CHILD
Operation(s) affected:
REMOVE
RENAME
Discussion:
Permission to delete a file or directory within a directory.
See Section 6.2.1.3.2 for information on how ACE4_DELETE and
ACE4_DELETE_CHILD interact.
ACE4_READ_ATTRIBUTES
Operation(s) affected:
GETATTR of file system object attributes
VERIFY
NVERIFY
READDIR
Discussion:
The ability to read basic attributes (non-ACLs) of a file. On
a UNIX system, basic attributes can be thought of as the stat
level attributes. Allowing this access mask bit would mean the
entity can execute "ls -l" and stat. If a READDIR operation
requests attributes, this mask must be allowed for the READDIR
to succeed.
ACE4_WRITE_ATTRIBUTES
Operation(s) affected:
SETATTR of time_access_set, time_backup,
time_create, time_modify_set, mimetype, hidden, system
Discussion:
Permission to change the times associated with a file or
directory to an arbitrary value. Also permission to change the
mimetype, hidden and system attributes. A user having
ACE4_WRITE_DATA or ACE4_WRITE_ATTRIBUTES will be allowed to set
the times associated with a file to the current server time.
ACE4_DELETE
Operation(s) affected:
REMOVE
Discussion:
Permission to delete the file or directory. See
Section 6.2.1.3.2 for information on how ACE4_DELETE and
ACE4_DELETE_CHILD interact.
ACE4_READ_ACL
Operation(s) affected:
GETATTR of acl
NVERIFY
VERIFY
Discussion:
Permission to read the ACL.
ACE4_WRITE_ACL
Operation(s) affected:
SETATTR of acl and mode
Discussion:
Permission to write the acl and mode attributes.
ACE4_WRITE_OWNER
Operation(s) affected:
SETATTR of owner and owner_group
Discussion:
Permission to write the owner and owner_group attributes. On
UNIX systems, this is the ability to execute chown() and
chgrp().
ACE4_SYNCHRONIZE
Operation(s) affected:
NONE
Discussion:
Permission to access file locally at the server with
synchronized reads and writes.
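The append-only behavior described under ACE4_APPEND_DATA above can
be sketched as follows. This is a minimal illustration, not a
definitive server algorithm: allowed_mask stands for the set of mask
bits that ACL evaluation has already granted to the requester, and
the function name write_permitted is hypothetical.

```python
# Mask values as listed in this document.
ACE4_WRITE_DATA = 0x00000002
ACE4_APPEND_DATA = 0x00000004


def write_permitted(allowed_mask: int, offset: int, file_size: int) -> bool:
    """Decide whether a WRITE at `offset` is permitted.

    allowed_mask is assumed to hold the bits granted by ACL evaluation.
    A False result corresponds to the server returning NFS4ERR_ACCESS.
    """
    if allowed_mask & ACE4_WRITE_DATA:
        # Full write permission: any offset is acceptable.
        return True
    if allowed_mask & ACE4_APPEND_DATA:
        # Append-only: the write must start exactly at end of file.
        return offset == file_size
    return False
```

With only ACE4_APPEND_DATA granted, a WRITE at offset 0 of a
non-empty file is refused while a WRITE starting at EOF is accepted.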
Server implementations need not provide the granularity of control
that is implied by this list of masks. For example, POSIX-based
systems might not distinguish ACE4_APPEND_DATA (the ability to append
to a file) from ACE4_WRITE_DATA (the ability to modify existing
contents); both masks would be tied to a single "write" permission.
When such a server returns attributes to the client, it would show
both ACE4_APPEND_DATA and ACE4_WRITE_DATA if and only if the write
permission is enabled.
If a server receives a SETATTR request that it cannot accurately
implement, it should err in the direction of more restricted access,
except in the previously discussed cases of execute and read. For
example, suppose a server cannot distinguish overwriting data from
appending new data, as described in the previous paragraph. If a
client submits an ALLOW ACE where ACE4_APPEND_DATA is set but
ACE4_WRITE_DATA is not (or vice versa), the server should either turn
off ACE4_APPEND_DATA or reject the request with NFS4ERR_ATTRNOTSUPP.
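For a server that ties both masks to one "write" permission, the two
options described above (drop the unpaired bit, or reject) might look
like the following sketch. The exception class AttrNotSupp stands in
for an NFS4ERR_ATTRNOTSUPP reply, and normalize_allow_mask and its
reject parameter are hypothetical. Clearing whichever bit of the pair
was set alone is an assumption that generalizes the "or vice versa"
case toward more restricted access.

```python
# Mask values as listed in this document.
ACE4_WRITE_DATA = 0x00000002
ACE4_APPEND_DATA = 0x00000004
WRITE_PAIR = ACE4_WRITE_DATA | ACE4_APPEND_DATA


class AttrNotSupp(Exception):
    """Hypothetical stand-in for an NFS4ERR_ATTRNOTSUPP reply."""


def normalize_allow_mask(mask: int, reject: bool = True) -> int:
    """Normalize the mask of an ALLOW ACE on a single-write-bit server."""
    pair = mask & WRITE_PAIR
    if pair == 0 or pair == WRITE_PAIR:
        # Both bits agree, so the server can represent the mask exactly.
        return mask
    if reject:
        raise AttrNotSupp("cannot separate append from write")
    # Err toward more restricted access: drop the unpaired bit.
    return mask & ~WRITE_PAIR
```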
6.2.1.3.2. ACE4_DELETE vs. ACE4_DELETE_CHILD
Two access mask bits govern the ability to delete a directory entry:
ACE4_DELETE on the object itself (the "target"), and
ACE4_DELETE_CHILD on the containing directory (the "parent").
Many systems also take the "sticky bit" (MODE4_SVTX) on a directory
to allow unlink only to a user that owns either the target or the
parent; on some such systems the decision also depends on whether the
target is writable.
Servers SHOULD allow unlink if either ACE4_DELETE is permitted on the
target, or ACE4_DELETE_CHILD is permitted on the parent. (Note that
this is true even if the parent or target explicitly denies one of
these permissions.)
If the ACLs in question neither explicitly ALLOW nor DENY either of
the above, and if MODE4_SVTX is not set on the parent, then the
server SHOULD allow the removal if and only if ACE4_ADD_FILE is
permitted. In the case where MODE4_SVTX is set, the server may also
require the remover to own either the parent or the target, or may
require the target to be writable.
This allows servers to support something close to traditional UNIX-
like semantics, with ACE4_ADD_FILE taking the place of the write bit.
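The decision procedure above can be sketched in Python as follows.
Each of the two delete permissions is modeled as a tri-state value
(explicitly allowed, explicitly denied, or not mentioned by the ACL).
All names are hypothetical. Two points are assumptions of this sketch
rather than statements of the text: removal is refused when one
permission is explicitly denied and the other is not allowed, and the
sticky-bit case uses the ownership requirement, which is only one of
the options a server may choose.

```python
from typing import Optional

ALLOW = "allow"
DENY = "deny"


def may_unlink(
    target_delete: Optional[str],
    parent_delete_child: Optional[str],
    parent_add_file_permitted: bool,
    parent_sticky: bool = False,
    remover_owns_parent_or_target: bool = False,
) -> bool:
    """Decide whether removing a directory entry should be allowed.

    target_delete: ACE4_DELETE on the target (ALLOW, DENY, or None).
    parent_delete_child: ACE4_DELETE_CHILD on the parent, same encoding.
    """
    # Either permission being allowed is sufficient, even if the other
    # one is explicitly denied.
    if target_delete == ALLOW or parent_delete_child == ALLOW:
        return True
    # Neither ACL mentions the delete bits: fall back to ACE4_ADD_FILE.
    if target_delete is None and parent_delete_child is None:
        if not parent_add_file_permitted:
            return False
        if parent_sticky:
            # One permitted policy when MODE4_SVTX is set on the parent.
            return remover_owns_parent_or_target
        return True
    # Assumption: an explicit DENY with no ALLOW refuses the removal.
    return False
```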
6.2.1.4. ACE flag
The bitmask constants used for the flag field are as follows:
const ACE4_FILE_INHERIT_ACE = 0x00000001;
const ACE4_DIRECTORY_INHERIT_ACE = 0x00000002;
const ACE4_NO_PROPAGATE_INHERIT_ACE = 0x00000004;
const ACE4_INHERIT_ONLY_ACE = 0x00000008;
const ACE4_SUCCESSFUL_ACCESS_ACE_FLAG = 0x00000010;
const ACE4_FAILED_ACCESS_ACE_FLAG = 0x00000020;
const ACE4_IDENTIFIER_GROUP = 0x00000040;
A server need not support any of these flags. If the server supports
flags that are similar to, but not exactly the same as, these flags,
the implementation may define a mapping between the protocol-defined
flags and the implementation-defined flags. Again, the guiding
principle is that the file not appear to be more secure than it
really is.
A server need not support any of these flags. If the server supports
flags that are similar to, but not exactly the same as, these flags,
the implementation may define a mapping between the protocol-defined
flags and the implementation-defined flags.
For example, suppose a client tries to set an ACE with
ACE4_FILE_INHERIT_ACE set but not ACE4_DIRECTORY_INHERIT_ACE. If the
server does not support any form of ACL inheritance, the server
should reject the request with NFS4ERR_ATTRNOTSUPP. If the server
supports a single "inherit ACE" flag that applies to both files and
directories, the server may reject the request (i.e., requiring the
client to set both the file and directory inheritance flags). The
server may also accept the request and silently turn on the
ACE4_DIRECTORY_INHERIT_ACE flag.
5.11.4. ACE who
There are several special identifiers ("who") which need to be
understood universally, rather than in the context of a particular
DNS domain. Some of these identifiers cannot be understood when an
NFS client accesses the server, but have meaning when a local process
accesses the file. The ability to display and modify these
permissions is permitted over NFS, even if none of the access methods
on the server understands the identifiers.
+-----------------+------------------------------------------------+
| Who | Description |
+-----------------+------------------------------------------------+
| "OWNER" | The owner of the file. |
| "GROUP" | The group associated with the file. |
| "EVERYONE" | The world. |
| "INTERACTIVE" | Accessed from an interactive terminal. |
| "NETWORK" | Accessed via the network. |
| "DIALUP" | Accessed as a dialup user to the server. |
| "BATCH" | Accessed from a batch job. |
| "ANONYMOUS" | Accessed without any authentication. |
| "AUTHENTICATED" | Any authenticated user (opposite of ANONYMOUS) |
| "SERVICE" | Access from a system service. |
+-----------------+------------------------------------------------+
Table 6
To avoid conflict, these special identifiers are distinguished by an
appended "@" and should appear in the form "xxxx@" (note: no domain
name after the "@"). For example: ANONYMOUS@.
6.2.1.4.1. Discussion of Flag Bits
ACE4_FILE_INHERIT_ACE
Any non-directory file in any sub-directory will get this ACE
inherited.
ACE4_DIRECTORY_INHERIT_ACE
Can be placed on a directory and indicates that this ACE should be
added to each new directory created.
If this flag is set in an ACE in an ACL attribute to be set on a
non-directory file system object, the operation attempting to set
the ACL SHOULD fail with NFS4ERR_ATTRNOTSUPP.
ACE4_INHERIT_ONLY_ACE
Can be placed on a directory but does not apply to the directory;
ALLOW and DENY ACEs with this bit set do not affect access to the
directory, and AUDIT and ALARM ACEs with this bit set do not
trigger log or alarm events. Such ACEs only take effect once they
are applied (with this bit cleared) to newly created files and
directories as specified by the above two flags.
If this flag is present on an ACE, but neither
ACE4_DIRECTORY_INHERIT_ACE nor ACE4_FILE_INHERIT_ACE is present,
then an operation attempting to set such an attribute SHOULD fail
with NFS4ERR_ATTRNOTSUPP.
ACE4_NO_PROPAGATE_INHERIT_ACE
Can be placed on a directory. This flag tells the server that
inheritance of this ACE should stop at newly created child
directories.
ACE4_SUCCESSFUL_ACCESS_ACE_FLAG
ACE4_FAILED_ACCESS_ACE_FLAG
The ACE4_SUCCESSFUL_ACCESS_ACE_FLAG (SUCCESS) and
ACE4_FAILED_ACCESS_ACE_FLAG (FAILED) flag bits may be set only on
ACE4_SYSTEM_AUDIT_ACE_TYPE (AUDIT) and ACE4_SYSTEM_ALARM_ACE_TYPE
(ALARM) ACE types. If during the processing of the file's ACL,
the server encounters an AUDIT or ALARM ACE that matches the
principal attempting the OPEN, the server notes that fact, and the
presence, if any, of the SUCCESS and FAILED flags encountered in
the AUDIT or ALARM ACE. Once the server completes the ACL
processing, it then notes if the operation succeeded or failed.
If the operation succeeded, and if the SUCCESS flag was set for a
matching AUDIT or ALARM ACE, then the appropriate AUDIT or ALARM
event occurs. If the operation failed, and if the FAILED flag was
set for the matching AUDIT or ALARM ACE, then the appropriate
AUDIT or ALARM event occurs. Either or both of the SUCCESS or
FAILED can be set, but if neither is set, the AUDIT or ALARM ACE
is not useful.
The previously described processing applies to ACCESS operations
even when they return NFS4_OK. For the purposes of AUDIT and
ALARM, we consider an ACCESS operation to be a "failure" if it
fails to return a bit that was requested and supported.
ACE4_IDENTIFIER_GROUP
Indicates that the "who" refers to a GROUP as defined under UNIX
or a GROUP ACCOUNT as defined under Windows. Clients and servers
MUST ignore the ACE4_IDENTIFIER_GROUP flag on ACEs with a who
value equal to one of the special identifiers outlined in
Section 6.2.1.5.
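The flag constraints discussed above lend themselves to a simple
validity check before an ACL is stored. The sketch below is only
illustrative: the function flags_acceptable is a hypothetical name,
and returning False for SUCCESS or FAILED flags on an ALLOW or DENY
ACE is an assumption, since the text states the restriction without
naming an error for it.

```python
# Flag and type values as listed in this document.
ACE4_FILE_INHERIT_ACE = 0x00000001
ACE4_DIRECTORY_INHERIT_ACE = 0x00000002
ACE4_INHERIT_ONLY_ACE = 0x00000008
ACE4_SUCCESSFUL_ACCESS_ACE_FLAG = 0x00000010
ACE4_FAILED_ACCESS_ACE_FLAG = 0x00000020

ACE4_SYSTEM_AUDIT_ACE_TYPE = 0x00000002
ACE4_SYSTEM_ALARM_ACE_TYPE = 0x00000003


def flags_acceptable(flags: int, ace_type: int, is_directory: bool) -> bool:
    """Return False where the text says a SETATTR of the ACL SHOULD fail."""
    inherit = ACE4_FILE_INHERIT_ACE | ACE4_DIRECTORY_INHERIT_ACE
    # Directory inheritance makes no sense on a non-directory object.
    if (flags & ACE4_DIRECTORY_INHERIT_ACE) and not is_directory:
        return False
    # INHERIT_ONLY requires at least one of the two inherit flags.
    if (flags & ACE4_INHERIT_ONLY_ACE) and not (flags & inherit):
        return False
    # SUCCESS/FAILED may be set only on AUDIT and ALARM ACEs
    # (rejecting them elsewhere is an assumption of this sketch).
    audit_bits = ACE4_SUCCESSFUL_ACCESS_ACE_FLAG | ACE4_FAILED_ACCESS_ACE_FLAG
    audit_types = (ACE4_SYSTEM_AUDIT_ACE_TYPE, ACE4_SYSTEM_ALARM_ACE_TYPE)
    if (flags & audit_bits) and ace_type not in audit_types:
        return False
    return True
```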
6.2.1.5. ACE Who
The "who" field of an ACE is an identifier that specifies the
principal or principals to whom the ACE applies. It may refer to a
user or a group, with the flag bit ACE4_IDENTIFIER_GROUP specifying
which.
There are several special identifiers which need to be understood
universally, rather than in the context of a particular DNS domain.
Some of these identifiers cannot be understood when an NFS client
accesses the server, but have meaning when a local process accesses
the file. The ability to display and modify these permissions is
permitted over NFS, even if none of the access methods on the server
understands the identifiers.
+---------------+--------------------------------------------------+
| Who | Description |
+---------------+--------------------------------------------------+
| OWNER | The owner of the file |
| GROUP | The group associated with the file. |
| EVERYONE | The world, including the owner and owning group. |
| INTERACTIVE | Accessed from an interactive terminal. |
| NETWORK | Accessed via the network. |
| DIALUP | Accessed as a dialup user to the server. |
| BATCH | Accessed from a batch job. |
| ANONYMOUS | Accessed without any authentication. |
| AUTHENTICATED | Any authenticated user (opposite of ANONYMOUS) |
| SERVICE | Access from a system service. |
+---------------+--------------------------------------------------+
Table 4
To avoid conflict, these special identifiers are distinguished by an
appended "@" and should appear in the form "xxxx@" (with no domain
name after the "@"). For example: ANONYMOUS@.
The ACE4_IDENTIFIER_GROUP flag MUST be ignored on entries with these
special identifiers. When encoding entries with these special
identifiers, the ACE4_IDENTIFIER_GROUP flag SHOULD be set to zero.
5.11.5. Mode Attribute
The NFS version 4 mode attribute is based on the UNIX mode bits. The
following bits are defined:
6.2.1.5.1. Discussion of EVERYONE@
It is important to note that "EVERYONE@" is not equivalent to the
UNIX "other" entity. This is because, by definition, UNIX "other"
does not include the owner or owning group of a file. "EVERYONE@"
means literally everyone, including the owner or owning group.
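The handling of special identifiers, and the difference between
EVERYONE@ and UNIX "other", can be illustrated with a short Python
sketch. The identifier names and the ACE4_IDENTIFIER_GROUP value come
from this document; the set SPECIAL_WHO and all function names are
hypothetical.

```python
ACE4_IDENTIFIER_GROUP = 0x00000040

# Special identifiers: the name followed by "@" with no domain part.
SPECIAL_WHO = frozenset(
    name + "@"
    for name in (
        "OWNER", "GROUP", "EVERYONE", "INTERACTIVE", "NETWORK",
        "DIALUP", "BATCH", "ANONYMOUS", "AUTHENTICATED", "SERVICE",
    )
)


def is_special_who(who: str) -> bool:
    return who in SPECIAL_WHO


def flags_for_encoding(who: str, flags: int) -> int:
    """Clear ACE4_IDENTIFIER_GROUP when who is a special identifier."""
    if is_special_who(who):
        return flags & ~ACE4_IDENTIFIER_GROUP
    return flags


def everyone_applies(is_owner: bool, in_owning_group: bool) -> bool:
    # EVERYONE@ covers every principal, owner and owning group included.
    return True


def unix_other_applies(is_owner: bool, in_owning_group: bool) -> bool:
    # UNIX "other" excludes the owner and members of the owning group.
    return not (is_owner or in_owning_group)
```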
6.2.2. Attribute 33: mode
The NFSv4.0 mode attribute is based on the UNIX mode bits. The
following bits are defined:
const MODE4_SUID = 0x800; /* set user id on execution */
const MODE4_SGID = 0x400; /* set group id on execution */
const MODE4_SVTX = 0x200; /* save text even after use */
const MODE4_RUSR = 0x100; /* read permission: owner */
const MODE4_WUSR = 0x080; /* write permission: owner */
const MODE4_XUSR = 0x040; /* execute permission: owner */
const MODE4_RGRP = 0x020; /* read permission: group */
const MODE4_WGRP = 0x010; /* write permission: group */
const MODE4_XGRP = 0x008; /* execute permission: group */
const MODE4_ROTH = 0x004; /* read permission: other */
const MODE4_WOTH = 0x002; /* write permission: other */
const MODE4_XOTH = 0x001; /* execute permission: other */
Bits MODE4_RUSR, MODE4_WUSR, and MODE4_XUSR apply to the principal
identified in the owner attribute. Bits MODE4_RGRP, MODE4_WGRP, and
MODE4_XGRP apply to the principals identified in the owner_group
attribute. Bits MODE4_ROTH, MODE4_WOTH, MODE4_XOTH apply to any
principal that does not match that in the owner group, and does not
have a group matching that of the owner_group attribute.
Bits MODE4_RUSR, MODE4_WUSR, and MODE4_XUSR apply to the principal
identified in the owner attribute. Bits MODE4_RGRP, MODE4_WGRP, and
MODE4_XGRP apply to principals identified in the owner_group
attribute but who are not identified in the owner attribute. Bits
MODE4_ROTH, MODE4_WOTH, MODE4_XOTH apply to any principal that does
not match that in the owner attribute, and does not have a group
matching that of the owner_group attribute.
The remaining bits are not defined by this protocol and MUST NOT be
used. The minor version mechanism must be used to define further bit
usage.
Bits within the mode other than those specified above are not defined
by this protocol. A server MUST NOT return bits other than those
defined above in a GETATTR or READDIR operation, and it MUST return
NFS4ERR_INVAL if bits other than those defined above are set in a
SETATTR, CREATE, OPEN, VERIFY or NVERIFY operation.
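The check for undefined mode bits reduces to a single mask test. In
the sketch below the twelve constants are those defined above, while
MODE4_DEFINED_BITS and mode_has_only_defined_bits are hypothetical
names; a False result corresponds to the case in which the server
returns NFS4ERR_INVAL.

```python
MODE4_SUID = 0x800
MODE4_SGID = 0x400
MODE4_SVTX = 0x200
MODE4_RUSR = 0x100
MODE4_WUSR = 0x080
MODE4_XUSR = 0x040
MODE4_RGRP = 0x020
MODE4_WGRP = 0x010
MODE4_XGRP = 0x008
MODE4_ROTH = 0x004
MODE4_WOTH = 0x002
MODE4_XOTH = 0x001

# Union of every bit the protocol defines for the mode attribute.
MODE4_DEFINED_BITS = (
    MODE4_SUID | MODE4_SGID | MODE4_SVTX
    | MODE4_RUSR | MODE4_WUSR | MODE4_XUSR
    | MODE4_RGRP | MODE4_WGRP | MODE4_XGRP
    | MODE4_ROTH | MODE4_WOTH | MODE4_XOTH
)


def mode_has_only_defined_bits(mode: int) -> bool:
    """True if no bit outside the twelve defined mode bits is set."""
    return (mode & ~MODE4_DEFINED_BITS) == 0
```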
Note that in UNIX, if a file has the MODE4_SGID bit set and no
MODE4_XGRP bit set, then READ and WRITE must use mandatory file
locking.
5.11.6. Mode and ACL Attribute
6.3. Common Methods
The requirements in this section will be referred to in future
sections, especially Section 6.4.
6.3.1. Interpreting an ACL
6.3.1.1. Server Considerations
The server uses the algorithm described in Section 6.2.1 to determine
whether an ACL allows access to an object. However, the ACL may not
be the sole determiner of access. For example:
o In the case of a file system exported as read-only, the server may
deny write permissions even though an object's ACL grants it.
o Server implementations MAY grant ACE4_WRITE_ACL and ACE4_READ_ACL
permissions to prevent a situation from arising in which there is
no valid way to ever modify the ACL.
o All servers will allow a user the ability to read the data of the
file when only the execute permission is granted (i.e., if the ACL
denies the user the ACE4_READ_DATA access and allows the user
ACE4_EXECUTE, the server will allow the user to read the data of
the file).
o Many servers have the notion of owner-override in which the owner
of the object is allowed to override accesses that are denied by
the ACL. This may be helpful, for example, to allow users
continued access to open files on which the permissions have
changed.
o Many servers have the notion of a "superuser" that has privileges
beyond an ordinary user. The superuser may be able to read or
write data or metadata in ways that would not be permitted by the
ACL.
6.3.1.2. Client Considerations
Clients SHOULD NOT do their own access checks based on their
interpretation of the ACL, but rather use the OPEN and ACCESS operations
to do access checks. This allows the client to act on the results of
having the server determine whether or not access should be granted
based on its interpretation of the ACL.
Clients must be aware of situations in which an object's ACL will
define a certain access even though the server will not enforce it.
In general, but especially in these situations, the client needs to
do its part in the enforcement of access as defined by the ACL. To
do this, the client MAY send the appropriate ACCESS operation prior
to servicing the request of the user or application in order to
determine whether the user or application should be granted the
access requested. For examples in which the ACL may define accesses
that the server does not enforce, see Section 6.3.1.1.
6.3.2. Computing a Mode Attribute from an ACL
The following method can be used to calculate the MODE4_R*, MODE4_W*
and MODE4_X* bits of a mode attribute, based upon an ACL.
First, for each of the special identifiers OWNER@, GROUP@, and
EVERYONE@, evaluate the ACL in order, considering only ALLOW and DENY
ACEs for the identifier EVERYONE@ and for the identifier under
consideration. The result of the evaluation will be an NFSv4 ACL
mask showing exactly which bits are permitted to that identifier.
Then translate the calculated mask for OWNER@, GROUP@, and EVERYONE@
into mode bits for, respectively, the user, group, and other, as
follows:
1. Set the read bit (MODE4_RUSR, MODE4_RGRP, or MODE4_ROTH) if and
only if ACE4_READ_DATA is set in the corresponding mask.
2. Set the write bit (MODE4_WUSR, MODE4_WGRP, or MODE4_WOTH) if and
only if ACE4_WRITE_DATA and ACE4_APPEND_DATA are both set in the
corresponding mask.
3. Set the execute bit (MODE4_XUSR, MODE4_XGRP, or MODE4_XOTH), if
and only if ACE4_EXECUTE is set in the corresponding mask.
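The method above can be sketched in a few lines of Python. This is an
illustrative sketch only, not a normative algorithm: the Ace tuple and
the function names are hypothetical, the numeric constants are assumed
to match the protocol's ACE type and access mask definitions, and ACE
flags (for example, inherit-only) are ignored for brevity.

```python
from collections import namedtuple

# Values assumed from the protocol's ACE type and access-mask definitions.
ACE4_ACCESS_ALLOWED_ACE_TYPE = 0
ACE4_ACCESS_DENIED_ACE_TYPE = 1
ACE4_READ_DATA = 0x01
ACE4_WRITE_DATA = 0x02
ACE4_APPEND_DATA = 0x04
ACE4_EXECUTE = 0x20

# Hypothetical, simplified ACE: flags are deliberately left out.
Ace = namedtuple("Ace", "type who mask")


def permitted_mask(acl, who):
    """Evaluate the ACL in order for one special identifier.

    Only ALLOW and DENY ACEs naming `who` or EVERYONE@ are considered.
    The first ACE that mentions a bit decides that bit.
    """
    allowed = 0
    decided = 0
    for ace in acl:
        if ace.type not in (ACE4_ACCESS_ALLOWED_ACE_TYPE,
                            ACE4_ACCESS_DENIED_ACE_TYPE):
            continue  # AUDIT and ALARM ACEs do not affect permissions
        if ace.who not in (who, "EVERYONE@"):
            continue
        undecided = ace.mask & ~decided
        if ace.type == ACE4_ACCESS_ALLOWED_ACE_TYPE:
            allowed |= undecided
        decided |= ace.mask
    return allowed


def mode_from_acl(acl):
    """Return the nine low-order mode bits implied by the ACL."""
    mode = 0
    for who, shift in (("OWNER@", 6), ("GROUP@", 3), ("EVERYONE@", 0)):
        mask = permitted_mask(acl, who)
        bits = 0
        if mask & ACE4_READ_DATA:
            bits |= 4
        # The write bit needs both WRITE_DATA and APPEND_DATA.
        if mask & ACE4_WRITE_DATA and mask & ACE4_APPEND_DATA:
            bits |= 2
        if mask & ACE4_EXECUTE:
            bits |= 1
        mode |= bits << shift
    return mode
```

In this sketch, an ACL that grants only ACE4_READ_DATA and
ACE4_WRITE_DATA to EVERYONE@ yields read bits but no write bits,
because ACE4_APPEND_DATA is missing from the mask.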
6.3.2.1. Discussion
Some server implementations also add bits permitted to named users
and groups to the group bits (MODE4_RGRP, MODE4_WGRP, and
MODE4_XGRP).
Implementations are discouraged from doing this, because it has been
found to cause confusion for users who see members of a file's group
denied access that the mode bits appear to allow. (The presence of
DENY ACEs may also lead to such behavior, but DENY ACEs are expected
to be more rarely used.)
The same user confusion seen when fetching the mode also results if
setting the mode does not effectively control permissions for the
owner, group, and other users; this motivates some of the
requirements that follow.
6.4. Requirements
The server that supports both mode and ACL must take care to
synchronize the MODE4_*USR, MODE4_*GRP, and MODE4_*OTH bits with the
ACEs which have respective who fields of "OWNER@", "GROUP@", and
"EVERYONE@" so that the client can see semantically equivalent access
permissions exist whether the client asks for owner, owner_group and
mode attributes, or for just the ACL.
Because the mode attribute includes bits (e.g., MODE4_SVTX) that have In this section, much is made of the methods in Section 6.3.2. Many
nothing to do with ACL semantics, it is permitted for clients to requirements refer to this section. But note that the methods have
specify both the ACL attribute and mode in the same SETATTR behaviors specified with "SHOULD". This is intentional, to avoid
operation. However, because there is no prescribed order for invalidating existing implementations that compute the mode according
processing the attributes in a SETATTR, the client must ensure that to the withdrawn POSIX ACL draft (1003.1e draft 17), rather than by
ACL attribute, if specified without mode, would produce the desired actual permissions on owner, group, and other.
mode bits, and conversely, the mode attribute if specified without
ACL, would produce the desired "OWNER@", "GROUP@", and "EVERYONE@"
ACEs.
5.11.7. mounted_on_fileid 6.4.1. Setting the mode and/or ACL Attributes
UNIX-based operating environments connect a filesystem into the 6.4.1.1. Setting mode and not ACL
namespace by connecting (mounting) the filesystem onto the existing
file object (the mount point, usually a directory) of an existing
filesystem. When the mount point's parent directory is read via an
API like readdir(), the return results are directory entries, each
with a component name and a fileid. The fileid of the mount point's
directory entry will be different from the fileid that the stat()
system call returns. The stat() system call is returning the fileid
of the root of the mounted filesystem, whereas readdir() is returning
the fileid stat() would have returned before any filesystems were
mounted on the mount point.
Unlike NFS version 3, NFS version 4 allows a client's LOOKUP request When any of the nine low-order mode bits are subject to change,
to cross other filesystems. The client detects the filesystem either because the mode attribute was set or because the
crossing whenever the filehandle argument of LOOKUP has an fsid mode_set_masked attribute was set and the mask included one or more
attribute different from that of the filehandle returned by LOOKUP. bits from the nine low-order mode bits, and no ACL attribute is
A UNIX-based client will consider this a "mount point crossing". explicitly set, the acl attribute must be modified in accordance with
UNIX has a legacy scheme for allowing a process to determine its the updated value of those bits. This must happen even if the value
current working directory. This relies on readdir() of a mount of the low-order bits is the same after the mode is set as before.
point's parent and stat() of the mount point returning fileids as
previously described. The mounted_on_fileid attribute corresponds to
the fileid that readdir() would have returned as described
previously.
While the NFS version 4 client could simply fabricate a fileid Note that any AUDIT or ALARM ACEs are unaffected by changes to the
corresponding to what mounted_on_fileid provides (and if the server mode.
does not support mounted_on_fileid, the client has no choice), there
is a risk that the client will generate a fileid that conflicts with
one that is already assigned to another object in the filesystem.
Instead, if the server can provide the mounted_on_fileid, the
potential for client operational problems in this area is eliminated.
If the server detects that there is no mounted point at the target In cases in which the permissions bits are subject to change, the acl
file object, then the value for mounted_on_fileid that it returns is attribute MUST be modified such that the mode computed via the method
the same as that of the fileid attribute. in Section 6.3.2 yields the low-order nine bits (MODE4_R*, MODE4_W*,
MODE4_X*) of the mode attribute as modified by the attribute change.
The ACL attributes SHOULD also be modified such that:
The mounted_on_fileid attribute is RECOMMENDED, so the server SHOULD 1. If MODE4_RGRP is not set, entities explicitly listed in the ACL
provide it if possible, and for a UNIX-based server, this is other than OWNER@ and EVERYONE@ SHOULD NOT be granted
straightforward. Usually, mounted_on_fileid will be requested during ACE4_READ_DATA.
a READDIR operation, in which case it is trivial (at least for UNIX-
based servers) to return mounted_on_fileid since it is equal to the
fileid of a directory entry returned by readdir(). If
mounted_on_fileid is requested in a GETATTR operation, the server
should obey an invariant that has it returning a value that is equal
to the file object's entry in the object's parent directory, i.e.,
what readdir() would have returned. Some operating environments
allow a series of two or more filesystems to be mounted onto a single
mount point. In this case, for the server to obey the aforementioned
invariant, it will need to find the base mount point, and not the
intermediate mount points.
6. Filesystem Migration and Replication 2. If MODE4_WGRP is not set, entities explicitly listed in the ACL
other than OWNER@ and EVERYONE@ SHOULD NOT be granted
ACE4_WRITE_DATA or ACE4_APPEND_DATA.
With the use of the recommended attribute "fs_locations", the NFS 3. If MODE4_XGRP is not set, entities explicitly listed in the ACL
version 4 server has a method of providing filesystem migration or other than OWNER@ and EVERYONE@ SHOULD NOT be granted
replication services. For the purposes of migration and replication, ACE4_EXECUTE.
a filesystem will be defined as all files that share a given fsid
(both major and minor values are the same).
The fs_locations attribute provides a list of filesystem locations. Access mask bits other than those listed above, appearing in ALLOW ACEs,
These locations are specified by providing the server name (either MAY also be disabled.
DNS domain or IP address) and the path name representing the root of
the filesystem. Depending on the type of service being provided, the
list will provide a new location or a set of alternate locations for
the filesystem. The client will use this information to redirect its
requests to the new server.
6.1. Replication Note that ACEs with the flag ACE4_INHERIT_ONLY_ACE set do not affect
the permissions of the ACL itself, nor do ACEs of the type AUDIT and
ALARM. As such, it is desirable to leave these ACEs unmodified when
modifying the ACL attributes.
It is expected that filesystem replication will be used in the case Also note that the requirement may be met by discarding the acl in
of read-only data. Typically, the filesystem will be replicated on favor of an ACL that represents the mode and only the mode. This is
two or more servers. The fs_locations attribute will provide the permitted, but it is preferable for a server to preserve as much of
list of these locations to the client. On first access of the the ACL as possible without violating the above requirements.
filesystem, the client should obtain the value of the fs_locations Discarding the ACL makes it effectively impossible for a file created
attribute. If, in the future, the client finds the server with a mode attribute to inherit an ACL (see Section 6.4.3).
unresponsive, the client may attempt to use another server specified
by fs_locations.
If applicable, the client must take the appropriate steps to recover 6.4.1.2. Setting ACL and not mode
valid filehandles from the new server. This is described in more
detail in the following sections.
6.2. Migration When setting the acl and not setting the mode or mode_set_masked
attributes, the permission bits of the mode need to be derived from
the ACL. In this case, the ACL attribute SHOULD be set as given.
The nine low-order bits of the mode attribute (MODE4_R*, MODE4_W*,
MODE4_X*) MUST be modified to match the result of the method in
Section 6.3.2. The three high-order bits of the mode (MODE4_SUID,
MODE4_SGID, MODE4_SVTX) SHOULD remain unchanged.
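As a rough illustration of the rule for setting the ACL without the
mode, the resulting mode can be viewed as the old high-order bits
combined with the nine bits derived from the new ACL. The function name
below is hypothetical, the mode constants are assumed to match the
protocol's mode bit definitions, and the derived bits are taken as an
input rather than computed here.

```python
# Values assumed from the protocol's mode bit definitions.
MODE4_SUID = 0x800
MODE4_SGID = 0x400
MODE4_SVTX = 0x200
HIGH_ORDER_BITS = MODE4_SUID | MODE4_SGID | MODE4_SVTX
LOW_ORDER_BITS = 0x1FF  # MODE4_R*, MODE4_W*, MODE4_X*


def mode_after_acl_set(old_mode, low_bits_from_acl):
    """Hypothetical helper: mode after an ACL is set without a mode.

    The nine low-order bits are replaced by the bits derived from the
    new ACL (MUST); the three high-order bits are kept (SHOULD).
    """
    return (old_mode & HIGH_ORDER_BITS) | (low_bits_from_acl & LOW_ORDER_BITS)
```

For instance, under these assumptions an object with mode 04755 whose
new ACL maps to 0640 ends up with mode 04640.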
Filesystem migration is used to move a filesystem from one server to 6.4.1.3. Setting both ACL and mode
another. Migration is typically used for a filesystem that is
writable and has a single copy. The expected use of migration is for
load balancing or general resource reallocation. The protocol does
not specify how the filesystem will be moved between servers. This
server-to-server transfer mechanism is left to the server
implementor. However, the method used to communicate the migration
event between client and server is specified here.
Once the servers participating in the migration have completed the When setting both the mode (includes use of either the mode attribute
move of the filesystem, the error NFS4ERR_MOVED will be returned for or the mode_set_masked attribute) and the acl attribute in the same
subsequent requests received by the original server. The operation, the attributes MUST be applied in this order: mode (or
NFS4ERR_MOVED error is returned for all operations except PUTFH and mode_set_masked), then ACL. The mode-related attribute is set as
GETATTR. Upon receiving the NFS4ERR_MOVED error, the client will given, then the ACL attribute is set as given, possibly changing the
obtain the value of the fs_locations attribute. The client will then final mode, as described above in Section 6.4.1.2.
use the contents of the attribute to redirect its requests to the
specified server. To facilitate the use of GETATTR, operations such
as PUTFH must also be accepted by the server for the migrated file
system's filehandles. Note that if the server returns NFS4ERR_MOVED,
the server MUST support the fs_locations attribute.
If the client requests more attributes than just fs_locations, the 6.4.2. Retrieving the mode and/or ACL Attributes
server may return fs_locations only. This is to be expected since
the server has migrated the filesystem and may not have a method of
obtaining additional attribute data.
The server implementor needs to be careful in developing a migration This section applies only to servers that support both the mode and
solution. The server must consider all of the state information ACL attributes.
clients may have outstanding at the server. This includes but is not
limited to locking/share state, delegation state, and asynchronous
file writes which are represented by WRITE and COMMIT verifiers. The
server should strive to minimize the impact on its clients during and
after the migration process.
6.3. Interpretation of the fs_locations Attribute Some server implementations may have a concept of "objects without
ACLs", meaning that all permissions are granted and denied according
to the mode attribute, and that no ACL attribute is stored for that
object. If an ACL attribute is requested of such a server, the
server SHOULD return an ACL that does not conflict with the mode;
that is to say, the ACL returned SHOULD represent the nine low-order
bits of the mode attribute (MODE4_R*, MODE4_W*, MODE4_X*) as
described in Section 6.3.2.
The fs_location attribute is structured in the following way: For other server implementations, the ACL attribute is always present
for every object. Such servers SHOULD store at least the three high-
order bits of the mode attribute (MODE4_SUID, MODE4_SGID,
MODE4_SVTX). The server SHOULD return a mode attribute if one is
requested, and the low-order nine bits of the mode (MODE4_R*,
MODE4_W*, MODE4_X*) MUST match the result of applying the method in
Section 6.3.2 to the ACL attribute.
6.4.3. Creating New Objects
If a server supports any ACL attributes, it may use the ACL
attributes on the parent directory to compute an initial ACL
attribute for a newly created object. This will be referred to as
the inherited ACL within this section. The act of adding one or more
ACEs to the inherited ACL that are based upon ACEs in the parent
directory's ACL will be referred to as inheriting an ACE within this
section.
Implementors should standardize on what the behavior of CREATE and
OPEN must be depending on the presence or absence of the mode and ACL
attributes.
1. If just the mode is given in the call:
In this case, inheritance SHOULD take place, but the mode MUST be
applied to the inherited ACL as described in Section 6.4.1.1,
thereby modifying the ACL.
2. If just the ACL is given in the call:
In this case, inheritance SHOULD NOT take place, and the ACL as
defined in the CREATE or OPEN will be set without modification,
and the mode modified as in Section 6.4.1.2.
3. If both mode and ACL are given in the call:
In this case, inheritance SHOULD NOT take place, and both
attributes will be set as described in Section 6.4.1.3.
4. If neither mode nor ACL is given in the call:
In the case where an object is being created without any initial
attributes at all, e.g., an OPEN operation with an opentype4 of
OPEN4_CREATE and a createmode4 of EXCLUSIVE4, inheritance SHOULD
NOT take place. Instead, the server SHOULD set permissions to
deny all access to the newly created object. It is expected that
the appropriate client will set the desired attributes in a
subsequent SETATTR operation, and the server SHOULD allow that
operation to succeed, regardless of what permissions the object
is created with. For example, an empty ACL denies all
permissions, but the server should allow the owner's SETATTR to
succeed even though WRITE_ACL is implicitly denied.
In other cases, inheritance SHOULD take place, and no
modifications to the ACL will happen. The mode attribute, if
supported, MUST be as computed in Section 6.3.2, with the
MODE4_SUID, MODE4_SGID and MODE4_SVTX bits clear. If no
inheritable ACEs exist on the parent directory, the rules for
creating acl attributes are implementation defined.
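The four cases above can be summarized as a small decision function.
This is only a sketch of the recommended (SHOULD-level) behavior; the
function name, its parameters, and the returned strings are
hypothetical, and the no_initial_attrs flag stands for the situation in
which the object is created without any initial attributes at all.

```python
def creation_plan(mode_given, acl_given, no_initial_attrs=False):
    """Return (inherit, action) for a CREATE or OPEN that creates an object.

    `inherit` says whether an inherited ACL is computed from the parent
    directory; `action` is a short description of what happens next.
    """
    if mode_given and acl_given:
        # Case 3: no inheritance; mode first, then ACL.
        return (False, "set mode, then ACL, per Section 6.4.1.3")
    if mode_given:
        # Case 1: inherit, then apply the mode to the inherited ACL.
        return (True, "apply mode to inherited ACL, per Section 6.4.1.1")
    if acl_given:
        # Case 2: no inheritance; the ACL is set as given.
        return (False, "set ACL as given, derive mode per Section 6.4.1.2")
    if no_initial_attrs:
        # Case 4, first part: deny everything until a later SETATTR.
        return (False, "deny all access until a subsequent SETATTR")
    # Case 4, second part: inherit without modification.
    return (True, "use inherited ACL, derive mode per Section 6.3.2")
```

A server following this sketch would, for example, compute an inherited
ACL for a CREATE that supplies only a mode, but not for one that
supplies an ACL.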
6.4.3.1. The Inherited ACL
If the object being created is not a directory, the inherited ACL
SHOULD NOT inherit ACEs from the parent directory ACL unless the
ACE4_FILE_INHERIT_ACE flag is set.
If the object being created is a directory, the inherited ACL should
inherit all inheritable ACEs from the parent directory, that is, those
that have the ACE4_FILE_INHERIT_ACE or ACE4_DIRECTORY_INHERIT_ACE flag
set.
If the inheritable ACE has ACE4_FILE_INHERIT_ACE set, but
ACE4_DIRECTORY_INHERIT_ACE is clear, the inherited ACE on the newly
created directory MUST have the ACE4_INHERIT_ONLY_ACE flag set to
prevent the directory from being affected by ACEs meant for non-
directories.
When a new directory is created, the server MAY split any inherited
ACE which is both inheritable and effective (in other words, which
has neither ACE4_INHERIT_ONLY_ACE nor ACE4_NO_PROPAGATE_INHERIT_ACE
set), into two ACEs, one with no inheritance flags, and one with
ACE4_INHERIT_ONLY_ACE set. This makes it simpler to modify the
effective permissions on the directory without modifying the ACE
which is to be inherited to the new directory's children.
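The selection of inherited ACEs might be sketched as follows. This is
an illustration under stated assumptions, not a definitive
implementation: the Ace class and the function name are hypothetical,
and the flag values are assumed to match the protocol's ACE flag
definitions. Two choices go beyond the text above and are marked as
assumptions in the code: inheritance flags are cleared on ACEs placed
on non-directories, and ACE4_INHERIT_ONLY_ACE is cleared on a new
directory when the ACE has ACE4_DIRECTORY_INHERIT_ACE set. The optional
splitting of ACEs and the handling of ACE4_NO_PROPAGATE_INHERIT_ACE are
not covered.

```python
from dataclasses import dataclass, replace

# Values assumed from the protocol's ACE flag definitions.
ACE4_FILE_INHERIT_ACE = 0x1
ACE4_DIRECTORY_INHERIT_ACE = 0x2
ACE4_NO_PROPAGATE_INHERIT_ACE = 0x4
ACE4_INHERIT_ONLY_ACE = 0x8
INHERITANCE_FLAGS = (ACE4_FILE_INHERIT_ACE | ACE4_DIRECTORY_INHERIT_ACE |
                     ACE4_NO_PROPAGATE_INHERIT_ACE | ACE4_INHERIT_ONLY_ACE)


@dataclass(frozen=True)
class Ace:  # hypothetical, simplified ACE
    type: int
    flags: int
    mask: int
    who: str


def inherited_acl(parent_acl, child_is_dir):
    """Build the inherited ACL for a newly created object."""
    result = []
    for ace in parent_acl:
        file_inherit = bool(ace.flags & ACE4_FILE_INHERIT_ACE)
        dir_inherit = bool(ace.flags & ACE4_DIRECTORY_INHERIT_ACE)
        if not child_is_dir:
            if file_inherit:
                # Assumption: inheritance flags mean nothing on a file.
                result.append(replace(ace, flags=ace.flags & ~INHERITANCE_FLAGS))
        elif file_inherit or dir_inherit:
            flags = ace.flags
            if file_inherit and not dir_inherit:
                # MUST: keep file-only ACEs from affecting the directory.
                flags |= ACE4_INHERIT_ONLY_ACE
            else:
                # Assumption: the ACE becomes effective on the new directory.
                flags &= ~ACE4_INHERIT_ONLY_ACE
            result.append(replace(ace, flags=flags))
    return result
```

With this sketch, an ACE on the parent that carries only
ACE4_FILE_INHERIT_ACE is copied to a new subdirectory with
ACE4_INHERIT_ONLY_ACE added, so it is passed on to files created below
without affecting access to the subdirectory itself.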
7. Multi-Server Namespace
NFSv4 supports attributes that allow a namespace to extend beyond the
boundaries of a single server. It is RECOMMENDED that clients and
servers support construction of such multi-server namespaces. Use of
such multi-server namespaces is OPTIONAL however, and for many
purposes, single-server namespaces are perfectly acceptable. Use of
multi-server namespaces can provide many advantages, however, by
separating a file system's logical position in a namespace from the
(possibly changing) logistical and administrative considerations that
result in particular file systems being located on particular
servers.
7.1. Location Attributes
NFSv4 contains RECOMMENDED attributes that allow file systems on one
server to be associated with one or more instances of that file
system on other servers. These attributes specify such file system
instances by specifying a server address target (either as a DNS name
representing one or more IP addresses or as a literal IP address)
together with the path of that file system within the associated
single-server namespace.
The fs_locations RECOMMENDED attribute allows specification of the
file system locations where the data corresponding to a given file
system may be found.
7.2. File System Presence or Absence
A given location in an NFSv4 namespace (typically but not necessarily
a multi-server namespace) can have a number of file system instance
locations associated with it via the fs_locations attribute. There
may also be an actual current file system at that location,
accessible via normal namespace operations (e.g., LOOKUP). In this
case, the file system is said to be "present" at that position in the
namespace and clients will typically use it, reserving use of
additional locations specified via the location-related attributes to
situations in which the principal location is no longer available.
When there is no actual file system at the namespace location in
question, the file system is said to be "absent". An absent file
system contains no files or directories other than the root. Any
reference to it, except to access a small set of attributes useful in
determining alternate locations, will result in an error,
NFS4ERR_MOVED. Note that if the server ever returns the error
NFS4ERR_MOVED, it MUST support the fs_locations attribute.
While the error name suggests that we have a case of a file system
which once was present, and has only become absent later, this is
only one possibility. A position in the namespace may be permanently
absent with the set of file system(s) designated by the location
attributes being the only realization. The name NFS4ERR_MOVED
reflects an earlier, more limited conception of its function, but
this error will be returned whenever the referenced file system is
absent, whether it has moved or not.
Except in the case of GETATTR-type operations (to be discussed
later), when the current filehandle at the start of an operation is
within an absent file system, that operation is not performed and the
error NFS4ERR_MOVED is returned, to indicate that the file system is
absent on the current server.
Because a GETFH cannot succeed if the current filehandle is within an
absent file system, filehandles within an absent file system cannot
be transferred to the client. When a client does have filehandles
within an absent file system, it is the result of obtaining them when
the file system was present, and having the file system become absent
subsequently.
It should be noted that because the check for the current filehandle
being within an absent file system happens at the start of every
operation, operations that change the current filehandle so that it
is within an absent file system will not result in an error. This
allows such combinations as PUTFH-GETATTR and LOOKUP-GETATTR to be
used to get attribute information, particularly location attribute
information, as discussed below.
7.3. Getting Attributes for an Absent File System
When a file system is absent, most attributes are not available, but
it is necessary to allow the client access to the small set of
attributes that are available, and most particularly that which gives
information about the correct current locations for this file system,
fs_locations.
7.3.1. GETATTR Within an Absent File System
As mentioned above, an exception is made for GETATTR in that
attributes may be obtained for a filehandle within an absent file
system. This exception only applies if the attribute mask contains
at least the fs_locations attribute bit, which indicates the client
is interested in a result regarding an absent file system. If it is
not requested, GETATTR will result in an NFS4ERR_MOVED error.
When a GETATTR is done on an absent file system, the set of supported
attributes is very limited. Many attributes, including those that
are normally REQUIRED, will not be available on an absent file
system. In addition to the fs_locations attribute, the following
attributes SHOULD be available on absent file systems, in the case of
RECOMMENDED attributes at least to the same degree that they are
available on present file systems.
fsid: This attribute should be provided so that the client can
determine file system boundaries, including, in particular, the
boundary between present and absent file systems. This value must
be different from any other fsid on the current server and need
have no particular relationship to fsids on any particular
destination to which the client might be directed.
mounted_on_fileid: For objects at the top of an absent file system
this attribute needs to be available. Since the fileid is one
which is within the present parent file system, there should be no
need to reference the absent file system to provide this
information.
Other attributes SHOULD NOT be made available for absent file
systems, even when it is possible to provide them. The server should
not assume that more information is always better and should avoid
gratuitously providing additional information.
When a GETATTR operation includes a bit mask for the attribute
fs_locations, but where the bit mask includes attributes which are
not supported, GETATTR will not return an error, but will return the
mask of the actual attributes supported with the results.
Handling of VERIFY/NVERIFY is similar to GETATTR in that if the
attribute mask does not include fs_locations the error NFS4ERR_MOVED
will result. It differs in that any appearance in the attribute mask
of an attribute not supported for an absent file system (and note
that this will include some normally REQUIRED attributes) will also
cause an NFS4ERR_MOVED result.
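The difference between GETATTR and VERIFY/NVERIFY on an absent file
system can be sketched as below. This is a simplified model, not
protocol code: attributes are represented by name rather than as
bitmap bits, the function names and return values are hypothetical,
and the set of available attributes assumes a server that follows the
SHOULD-level guidance above.

```python
# Attributes assumed available on an absent file system.
AVAILABLE_WHEN_ABSENT = frozenset({"fs_locations", "fsid", "mounted_on_fileid"})


def getattr_on_absent_fs(requested):
    """Return (status, attributes actually returned) for GETATTR."""
    requested = set(requested)
    if "fs_locations" not in requested:
        return ("NFS4ERR_MOVED", set())
    # Unavailable attributes are dropped from the result, not an error.
    return ("NFS4_OK", requested & AVAILABLE_WHEN_ABSENT)


def verify_precheck_on_absent_fs(requested):
    """Return an error name, or None if the comparison may proceed."""
    requested = set(requested)
    if "fs_locations" not in requested:
        return "NFS4ERR_MOVED"
    if not requested <= AVAILABLE_WHEN_ABSENT:
        # Unlike GETATTR, any unavailable attribute is an error.
        return "NFS4ERR_MOVED"
    return None
```

Under this model, a GETATTR for fs_locations and size succeeds and
returns only fs_locations, while a VERIFY naming the same two
attributes fails with NFS4ERR_MOVED.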
7.3.2. READDIR and Absent File Systems
A READDIR performed when the current filehandle is within an absent
file system will result in an NFS4ERR_MOVED error, since, unlike the
case of GETATTR, no such exception is made for READDIR.
Attributes for an absent file system may be fetched via a READDIR for
a directory in a present file system, when that directory contains
the root directories of one or more absent file systems. In this
case, the handling is as follows:
o If the attribute set requested includes fs_locations, then
fetching of attributes proceeds normally and no NFS4ERR_MOVED
indication is returned, even when the rdattr_error attribute is
requested.
o If the attribute set requested does not include fs_locations, then
if the rdattr_error attribute is requested, each directory entry
for the root of an absent file system will report NFS4ERR_MOVED
as the value of the rdattr_error attribute.
o If the attribute set requested does not include either of the
attributes fs_locations or rdattr_error then the occurrence of the
root of an absent file system within the directory will result in
the READDIR failing with an NFS4ERR_MOVED error.
o The unavailability of an attribute because of a file system's
absence, even one that is ordinarily REQUIRED, does not result in
any error indication. The set of attributes returned for the root
directory of the absent file system in that case is simply
restricted to those actually available.
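The READDIR rules above amount to a three-way decision for each
directory entry that is the root of an absent file system. The sketch
below models that decision only; attribute names stand in for bitmap
bits, the function name and result tuples are hypothetical, and the
set of available attributes is an assumption carried over from the
GETATTR discussion.

```python
# Attributes assumed available on an absent file system.
AVAILABLE_WHEN_ABSENT = frozenset({"fs_locations", "fsid", "mounted_on_fileid"})


def readdir_absent_root(requested):
    """Decide how READDIR treats an entry that roots an absent file system.

    Returns one of:
      ("ENTRY", names)      entry returned with the available attributes
      ("ENTRY_ERROR", err)  entry returned with rdattr_error set to err
      ("FAIL", err)         the whole READDIR fails with err
    """
    requested = set(requested)
    if "fs_locations" in requested:
        # Normal fetch; no NFS4ERR_MOVED is reported, even in rdattr_error.
        return ("ENTRY", requested & AVAILABLE_WHEN_ABSENT)
    if "rdattr_error" in requested:
        return ("ENTRY_ERROR", "NFS4ERR_MOVED")
    return ("FAIL", "NFS4ERR_MOVED")
```

A client that wants a directory listing to survive the presence of
absent file systems would therefore, in this model, request either
fs_locations or rdattr_error in its attribute set.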
7.4. Uses of Location Information
The location-bearing attribute of fs_locations provides, together
with the possibility of absent file systems, a number of important
facilities in providing reliable, manageable, and scalable data
access.
When a file system is present, these attributes can provide
alternative locations, to be used to access the same data, in the
event of server failures, communications problems, or other
difficulties that make continued access to the current file system
impossible or otherwise impractical. Under some circumstances
multiple alternative locations may be used simultaneously to provide
higher performance access to the file system in question. Provision
of such alternate locations is referred to as "replication" although
there are cases in which replicated sets of data are not in fact
present, and the replicas are instead different paths to the same
data.
When a file system is present and becomes absent, clients can be
given the opportunity to have continued access to their data, at an
alternate location. In this case, a continued attempt to use the
data in the now-absent file system will result in an NFS4ERR_MOVED
error and at that point the successor locations (typically only one
but multiple choices are possible) can be fetched and used to
continue access. Transfer of the file system contents to the new
location is referred to as "migration", but it should be kept in mind
that there are cases in which this term can be used, like
"replication", when there is no actual data migration per se.
Where a file system was not previously present, specification of file
system location provides a means by which file systems located on one
server can be associated with a namespace defined by another server,
thus allowing a general multi-server namespace facility. A
designation of such a location, in place of an absent file system, is
called a "referral".
Because client support for location-related attributes is OPTIONAL, a
server may (but is not required to) take action to hide migration and
referral events from such clients, by acting as a proxy, for example.
7.4.1. File System Replication
The fs_locations attribute provides alternative locations, to be used
to access data in place of or in addition to the current file system
instance. On first access to a file system, the client should obtain
the value of the set of alternate locations by interrogating the
fs_locations attribute.
In the event that server failures, communications problems, or other
difficulties make continued access to the current file system
impossible or otherwise impractical, the client can use the alternate
locations as a way to get continued access to its data. Multiple
locations may be used simultaneously, to provide higher performance
through the exploitation of multiple paths between client and target
file system.
The alternate locations may be physical replicas of the (typically
read-only) file system data, or they may reflect alternate paths to
the same server or provide for the use of various forms of server
clustering in which multiple servers provide alternate ways of
accessing the same physical file system.
Multiple server addresses, whether they are derived from a single
entry with a DNS name representing a set of IP addresses, or from
multiple entries each with its own server address, may correspond to
the same actual server.
7.4.2. File System Migration
When a file system is present and becomes absent, clients can be
given the opportunity to have continued access to their data, at an
alternate location, as specified by the fs_locations attribute.
Typically, a client will be accessing the file system in question,
get an NFS4ERR_MOVED error, and then use the fs_locations attribute
to determine the new location of the data.
Such migration can be helpful in providing load balancing or general
resource reallocation. The protocol does not specify how the file
system will be moved between servers. It is anticipated that a
number of different server-to-server transfer mechanisms might be
used with the choice left to the server implementer. The NFSv4
protocol specifies the method used to communicate the migration event
between client and server.
The new location may be an alternate communication path to the same
server, or, in the case of various forms of server clustering,
another server providing access to the same physical file system.
When an alternate location is designated as the target for migration,
it must designate the same data. Where file systems are writable, a
change made on the original file system must be visible on all
migration targets. Where a file system is not writable but
represents a read-only copy (possibly periodically updated) of a
writable file system, similar requirements apply to the propagation
of updates. Any change visible in the original file system must
already be effected on all migration targets, to avoid any
possibility that a client, in effecting a transition to the migration
target, will see any reversion in file system state.
7.4.3. Referrals
Referrals provide a way of placing a file system in a location within
the namespace essentially without respect to its physical location on
a given server. This allows a single server or a set of servers to
present a multi-server namespace that encompasses file systems
located on multiple servers. Some likely uses of this include
establishment of site-wide or organization-wide namespaces, or even
knitting such together into a truly global namespace.
Referrals occur when a client determines, upon first referencing a
position in the current namespace, that it is part of a new file
system and that the file system is absent. When this occurs,
typically by receiving the error NFS4ERR_MOVED, the actual location
or locations of the file system can be determined by fetching the
fs_locations attribute.
The locations-related attribute may designate a single file system
location or multiple file system locations, to be selected based on
the needs of the client.
Use of multi-server namespaces is enabled by NFSv4 but is not
required. The use of multi-server namespaces and their scope will
depend on the applications used, and system administration
preferences.
Multi-server namespaces can be established by a single server
providing a large set of referrals to all of the included file
systems. Alternatively, a single multi-server namespace may be
administratively segmented with separate referral file systems (on
separate servers) for each separately-administered portion of the
namespace. Any segment or the top-level referral file system may use
replicated referral file systems for higher availability.
Multi-server namespaces are, for the most part, uniform, in
that the same data made available to one client at a given location
in the namespace is made available to all clients at that location.
7.5. Location Entries and Server Identity
As mentioned above, a single location entry may have a server address
target in the form of a DNS name which may represent multiple IP
addresses, while multiple location entries may have their own server
address targets that reference the same server.
When multiple addresses for the same server exist, the client may
assume that for each file system in the namespace of a given server
network address, there exist file systems at corresponding namespace
locations for each of the other server network addresses. It may do
this even in the absence of explicit listing in fs_locations. Such
corresponding file system locations can be used as alternate
locations, just as those explicitly specified via the fs_locations
attribute.
If a single location entry designates multiple server IP addresses,
the client cannot assume that these addresses are multiple paths to
the same server. In most cases they will be, but the client MUST
verify that before acting on that assumption. When two server
addresses are designated by a single location entry and they
correspond to different servers, this normally indicates some sort of
misconfiguration, and so the client should avoid using such location
entries when alternatives are available. When they are not, clients
should pick one of the IP addresses and use it, without using others that
are not directed to the same server.
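The address-selection rule above can be sketched as follows. This is
a minimal illustration and not part of the protocol: the function
name usable_addresses and the identify callback (which stands in for
whatever means the client uses to verify that two addresses reach the
same server) are assumptions made for the example.

```python
from typing import Callable, Hashable, List


def usable_addresses(addresses: List[str],
                     identify: Callable[[str], Hashable]) -> List[str]:
    """Return the addresses of one location entry that may be used together.

    `identify` is a hypothetical callback returning an opaque server
    identity for an address; how a real client verifies server identity
    is outside the scope of this sketch.
    """
    if not addresses:
        return []
    # Pick the first address, then keep only the addresses that were
    # verified to reach the same server as that first address.
    chosen_identity = identify(addresses[0])
    return [a for a in addresses if identify(a) == chosen_identity]
```

Addresses that resolve to a different server are simply left unused,
which matches the advice to pick one address and avoid the others
that are not directed to the same server.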
7.6. Additional Client-side Considerations
When clients make use of servers that implement referrals,
replication, and migration, care should be taken so that a user who
mounts a given file system that includes a referral or a relocated
file system continues to see a coherent picture of that user-side
file system despite the fact that it contains a number of server-side
file systems which may be on different servers.
One important issue is upward navigation from the root of a server-
side file system to its parent (specified as ".." in UNIX), in the
case in which it transitions to that file system as a result of
referral, migration, or a transition as a result of replication.
When the client is at such a point, and it needs to ascend to the
parent, it must go back to the parent as seen within the multi-server
namespace rather than issuing a LOOKUPP call to the server, which would
result in the parent within that server's single-server namespace.
In order to do this, the client needs to remember the filehandles
that represent such file system roots, and use these instead of
issuing a LOOKUPP to the current server. This will allow the client
to present to applications a consistent namespace, where upward
navigation and downward navigation are consistent.
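One possible shape for this bookkeeping is sketched below. The class
and method names are hypothetical, and the sketch is simplified: a
real client would also need to record which server each filehandle
belongs to.

```python
from typing import Dict, Optional


class NamespaceTracker:
    """Remembers which filehandles are roots of server-side file systems."""

    def __init__(self) -> None:
        # Maps a file system root filehandle to the filehandle of its
        # parent directory as seen in the multi-server namespace.
        self._parent_of_root: Dict[bytes, bytes] = {}

    def record_crossing(self, parent_fh: bytes, root_fh: bytes) -> None:
        """Record that root_fh was reached from parent_fh via a referral,
        migration, or replica transition."""
        self._parent_of_root[root_fh] = parent_fh

    def parent_override(self, current_fh: bytes) -> Optional[bytes]:
        """Return the remembered parent for a file system root, or None
        when the filehandle is not a remembered root and LOOKUPP may be
        sent to the current server."""
        return self._parent_of_root.get(current_fh)
```

When parent_override returns a filehandle, the client uses it in
place of a LOOKUPP, so that ascending from a file system root leads
back to the directory from which the client originally descended.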
Another issue concerns refresh of referral locations. When referrals
are used extensively, they may change as server configurations
change. It is expected that clients will cache information related
to traversing referrals so that future client side requests are
resolved locally without server communication. This is usually
rooted in client-side name lookup caching. Clients should
periodically purge this data for referral points in order to detect
changes in location information. When the change_policy attribute
changes for directories that hold referral entries or for the
referral entries themselves, clients should consider any associated
cached referral information to be out of date.
7.7. Effecting File System Transitions
Transitions between file system instances, whether due to switching
between replicas upon server unavailability or in response to
server-initiated migration events, are best dealt with together. This
is so even though for the server, pragmatic considerations will
normally force different implementation strategies for planned and
unplanned transitions. Even though the prototypical use cases of
replication and migration contain distinctive sets of features, when
all possibilities for these operations are considered, there is an
underlying unity of these operations, from the client's point of
view, that makes treating them together desirable.
A number of methods are possible for servers to replicate data and to
track client state in order to allow clients to transition between
file system instances with a minimum of disruption. Such methods
vary between those that use inter-server clustering techniques to
limit the changes seen by the client, to those that are less
aggressive, use more standard methods of replicating data, and impose
a greater burden on the client to adapt to the transition.
The NFSv4 protocol does not impose choices on clients and servers
with regard to that spectrum of transition methods. The NFSv4.0
protocol does not provide the servers a means of communicating the
transition methods. In the NFSv4.1 protocol [27], an additional
attribute "fs_locations_info" is presented, which will define the
specific choices that can be made, how these choices are communicated
to the client and how the client is to deal with any discontinuities.
In the sections below, references will be made to various possible
server issues as a way of illustrating the transition scenarios that
clients may deal with. The intent here is not to define or limit
server implementations but rather to illustrate the range of issues
that clients may face. Again, as the NFSv4.0 protocol does not have
an explicit means of communicating these issues to the client, the
intent is to document the problems that can be faced in a multi-
server name space and allow the client to use the inferred
transitions available via fs_locations and other attributes (see
Section 7.9.1).
In the discussion below, references will be made to a file system
having a particular property or of two file systems (typically the
source and destination) belonging to a common class of any of several
types. Two file systems that belong to such a class share some
important aspect of file system behavior that clients may depend upon
when present, to easily effect a seamless transition between file
system instances. Conversely, where the file systems do not belong
to such a common class, the client has to deal with various sorts of
implementation discontinuities which may cause performance or other
issues in effecting a transition.
While fs_locations is available, default assumptions with regard to
such classifications have to be inferred (see Section 7.9.1 for
details).
In cases in which one server is expected to accept opaque values from
the client that originated from another server, the servers SHOULD
encode the "opaque" values in big endian byte order. If this is
done, servers acting as replicas or immigrating file systems will be
able to parse values like stateids, directory cookies, filehandles,
etc. even if their native byte order is different from that of other
servers cooperating in the replication and migration of the file
system.
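As an illustration of the byte-order recommendation, the sketch below
packs a server-private value in big endian order so that a
cooperating server with a different native byte order can parse it.
The field layout (a 64-bit position followed by a 32-bit generation
number) and the function names are assumptions chosen for the
example; real servers define their own contents for such values.

```python
import struct
from typing import Tuple

# Hypothetical layout for illustration only: a 64-bit position followed
# by a 32-bit generation number.
# ">" selects big endian byte order with standard sizes and no padding.
_OPAQUE_FORMAT = ">QI"


def encode_opaque(position: int, generation: int) -> bytes:
    """Encode the two fields as 12 bytes in big endian order."""
    return struct.pack(_OPAQUE_FORMAT, position, generation)


def decode_opaque(data: bytes) -> Tuple[int, int]:
    """Decode bytes produced by encode_opaque on any cooperating server."""
    position, generation = struct.unpack(_OPAQUE_FORMAT, data)
    return position, generation
```

Because the byte order is fixed by the format string rather than
taken from the host, the encoded bytes are the same on every server.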
7.7.1. File System Transitions and Simultaneous Access
When a single file system may be accessed at multiple locations
because of an indication of file system identity as reported by the
fs_locations attribute, one of the following applies, depending on
specific circumstances as discussed below:
o The client accesses multiple instances simultaneously, as
representing alternate paths to the same data and metadata.
o The client accesses one instance (or set of instances) and then
transitions to an alternative instance (or set of instances) as a
result of network issues, server unresponsiveness, or server-
directed migration.
7.7.2. Filehandles and File System Transitions
There are a number of ways in which filehandles can be handled across
a file system transition. These can be divided into two broad
classes depending upon whether the two file systems across which the
transition happens share sufficient state to effect some sort of
continuity of file system handling.
When there is no such co-operation in filehandle assignment, the two
file systems are reported as being in different _handle_ classes. In
this case, all filehandles are assumed to expire as part of the file
system transition. Note that this behavior does not depend on the
fh_expire_type attribute and depends on the specification of the
FH4_VOL_MIGRATION bit.
When there is co-operation in filehandle assignment, the two file
systems are reported as being in the same _handle_ class. In this
case, persistent filehandles remain valid after the file system
transition, while volatile filehandles (excluding those that are only
volatile due to the FH4_VOL_MIGRATION bit) are subject to expiration
on the target server.
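The two cases above can be summarized in a small decision function.
This is a sketch under stated assumptions: the function name is
hypothetical, and whether two file systems are in the same _handle_
class is taken as an input, since NFSv4.0 gives the client no
explicit means of learning it. The constants are the fh_expire_type
bit values defined by the protocol.

```python
# fh_expire_type bit values as defined by the NFSv4 protocol.
FH4_PERSISTENT = 0x00000000
FH4_NOEXPIRE_WITH_OPEN = 0x00000001
FH4_VOLATILE_ANY = 0x00000002
FH4_VOL_MIGRATION = 0x00000004
FH4_VOL_RENAME = 0x00000008


def filehandle_may_expire(fh_expire_type: int,
                          same_handle_class: bool) -> bool:
    """Return True if the client must be prepared for filehandles to
    expire as part of a file system transition."""
    if not same_handle_class:
        # No co-operation in filehandle assignment: all filehandles
        # are assumed to expire.
        return True
    # Same handle class: volatility that is due only to
    # FH4_VOL_MIGRATION is ignored; other volatility still applies.
    other_volatility = fh_expire_type & (FH4_VOLATILE_ANY | FH4_VOL_RENAME)
    return other_volatility != 0
```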
7.7.3. Fileids and File System Transitions
The issue of continuity of fileids in the event of a file system
transition needs to be addressed. The general expectation had been
that in situations in which the two file system instances are created
by a single vendor using some sort of file system image copy, fileids
will be consistent across the transition while in the analogous
multi-vendor transitions they will not. This poses difficulties,
especially for the client without special knowledge of the transition
mechanisms adopted by the server. Note that although fileid is not a
REQUIRED attribute, many servers support fileids and many clients
provide APIs that depend on fileids.
It is important to note that while clients themselves may have no
trouble with a fileid changing as a result of a file system
transition event, applications do typically have access to the fileid
(e.g. via stat), and the result of this is that an application may
work perfectly well if there is no file system instance transition or
if any such transition is among instances created by a single vendor,
yet be unable to deal with the situation in which a multi-vendor
transition occurs at the wrong time.
Providing the same fileids in a multi-vendor (multiple server
vendors) environment has generally been held to be quite difficult.
While there is work to be done, it needs to be pointed out that this
difficulty is partly self-imposed. Servers have typically identified
fileid with inode number, i.e. with a quantity used to find the file
in question. This identification poses special difficulties for
migration of a file system between vendors where assigning the same
index to a given file may not be possible. Note here that a fileid
is not required to be useful to find the file in question, only that
it is unique within the given file system. Servers prepared to
accept a fileid as a single piece of metadata and store it apart from
the value used to index the file information can relatively easily
maintain a fileid value across a migration event, allowing a truly
transparent migration event.
In any case, where servers can provide continuity of fileids, they
should, and the client should be able to find out that such
continuity is available and take appropriate action. Information
about the continuity (or lack thereof) of fileids across a file
system transition is represented by specifying whether the file
systems in question are of the same _fileid_ class.
Note that when consistent fileids do not exist across a transition
(either because there is no continuity of fileids or because fileid
is not a supported attribute on one of the instances involved), and there
are no reliable filehandles across a transition event (either because
there is no filehandle continuity or because the filehandles are
volatile), the client is in a position where it cannot verify that
files it was accessing before the transition are the same objects.
It is forced to assume that no object has been renamed, and, unless
there are guarantees that provide this (e.g. the file system is read-
only), problems for applications may occur. Therefore, use of such
configurations should be limited to situations where the problems
that this may cause can be tolerated.
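The condition described in the preceding paragraph can be written as
a simple predicate. The function and parameter names are hypothetical
and the inputs are assumed to be known to the client by some means;
the sketch only restates the logic of the text.

```python
def can_verify_object_identity(same_fileid_class: bool,
                               fileid_supported: bool,
                               same_handle_class: bool,
                               handles_persistent: bool) -> bool:
    """Return True if the client has some way to confirm that the files
    it was accessing before a transition are the same objects after it."""
    # Fileids are usable only if both instances support the attribute
    # and the instances are in the same fileid class.
    consistent_fileids = same_fileid_class and fileid_supported
    # Filehandles are usable only if there is filehandle continuity and
    # the filehandles are not volatile.
    reliable_handles = same_handle_class and handles_persistent
    return consistent_fileids or reliable_handles
```

When the predicate is false, the client is in the position described
above: it has to assume that no object has been renamed.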
7.7.4. Fsids and File System Transitions
Since fsids are generally only unique on a per-server basis, it
is likely that they will change during a file system transition.
Clients should not make the fsids received from the server visible to
applications since they may not be globally unique, and because they
may change during a file system transition event. Applications are
best served if they are isolated from such transitions to the extent
possible.
7.7.5. The Change Attribute and File System Transitions
Since the change attribute is defined as a server-specific one,
change attributes fetched from one server are normally presumed to be
invalid on another server. Such a presumption is troublesome since
it would invalidate all cached change attributes, requiring
refetching. Even more disruptive, the absence of any assured
continuity for the change attribute means that even if the same value
is retrieved on refetch, no conclusions can be drawn as to whether the
object in question has changed. The identical change attribute could
be merely an artifact of a modified file with a different change
attribute construction algorithm, with that new algorithm just
happening to result in an identical change value.
When the two file systems have consistent change attribute formats
(in which case we say that they are in the same _change_ class), the
client may assume a continuity of change attribute construction and
handle this situation just as it would be handled without any file
system transition.
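A client-side check that follows this rule might look like the
sketch below. The function name is hypothetical, and membership in
the same _change_ class is treated as an input that the client has
inferred.

```python
def cached_object_unchanged(cached_change: int,
                            fetched_change: int,
                            same_change_class: bool) -> bool:
    """Return True only if the client may conclude, after a transition,
    that the object is unchanged since the value was cached."""
    if not same_change_class:
        # Without continuity, equal values prove nothing: they may be
        # an artifact of a different construction algorithm.
        return False
    return cached_change == fetched_change
```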
7.7.6. Lock State and File System Transitions
In a file system transition, the client needs to handle cases in
which the two servers have cooperated in state management and in
which they have not. Cooperation by two servers in state management
requires coordination of client IDs. Before the client attempts to
use a client ID associated with one server in a request to the server
of the other file system, it must eliminate the possibility that two
non-cooperating servers have assigned the same client ID by accident.
In the case of migration, the servers involved in the migration of a
file system SHOULD transfer all server state from the original to the
new server. When this is done, it must be done in a way that is
transparent to the client. With replication, such a degree of common
state is typically not the case.
This state transfer will reduce disruption to the client when a file
system transition occurs. If the servers are successful in
transferring all state, the client can attempt to establish sessions
associated with the client ID used for the source file system
instance. If the server accepts that as a valid client ID, then the
client may use the existing stateids associated with that client ID
for the old file system instance, together with that same client ID,
when accessing the transitioned file system instance.
File systems co-operating in state management may actually share
state or simply divide the identifier space so as to recognize (and
reject as stale) each other's stateids and client IDs. Servers which
do share state may not do so under all conditions or at all times.
The requirement for the server is that if it cannot be sure in
accepting a client ID that it reflects the locks the client was
given, it must treat all associated state as stale and report it as
such to the client.
The client must establish a new client ID on the destination, if it
does not have one already, and reclaim locks if possible. In this
case, old stateids and client IDs should not be presented to the new
server since there is no assurance that they will not conflict with
IDs valid on that server.
When actual locks are not known to be maintained, the destination
server may establish a grace period specific to the given file
system, with non-reclaim locks being rejected for that file system,
even though normal locks are being granted for other file systems.
Clients should not infer the absence of a grace period for file
systems being transitioned to a server from responses to requests for
other file systems.
In the case of lock reclamation for a given file system after a file
system transition, edge conditions can arise similar to those for
reclaim after server restart (although in the case of the planned
state transfer associated with migration, these can be avoided by
securely recording lock state as part of state migration). Unless
the destination server can guarantee that locks will not be
incorrectly granted, the destination server should not allow lock
reclaims and should avoid establishing a grace period. (See Section 9.14
for further details.)
Information about client identity may be propagated between servers
in the form of client_owner4 and associated verifiers, under the
assumption that the client presents the same values to all the
servers with which it deals.
Servers are encouraged to provide facilities to allow locks to be
reclaimed on the new server after a file system transition. Often
such facilities may not be available and the client should be prepared to
re-obtain locks, even though it is possible that the client may have
its LOCK or OPEN request denied due to a conflicting lock.
The consequences of having no facilities available to reclaim locks
on the new server will depend on the type of environment. In some
environments, such as the transition between read-only file systems,
such denial of locks should not pose large difficulties in practice.
When an attempt to re-establish a lock on a new server is denied, the
client should treat the situation as if its original lock had been
revoked. Note that when the lock is granted, the client cannot
assume that no conflicting lock could have been granted in the
interim. Where change attribute continuity is present, the client
may check the change attribute to check for unwanted file
modifications. Where even this is not available, and the file system
is not read-only, a client may reasonably treat all pending locks as
having been revoked.
7.7.6.1. Transitions and the Lease_time Attribute
In order that the client may appropriately manage its leases in the
case of a file system transition, the destination server must
establish proper values for the lease_time attribute.
When state is transferred transparently, that state should include
the correct value of the lease_time attribute. The lease_time
attribute on the destination server must never be less than that on
the source since this would result in premature expiration of leases
granted by the source server. Upon transitions in which state is
transferred transparently, the client is under no obligation to re-
fetch the lease_time attribute and may continue to use the value
previously fetched (on the source server).
If state has not been transferred transparently because the client ID
is rejected when presented to the new server, the client should fetch
the value of lease_time on the new (i.e. destination) server, and use
it for subsequent locking requests. However the server must respect
a grace period at least as long as the lease_time on the source
server, in order to ensure that clients have ample time to reclaim
their locks before potentially conflicting non-reclaimed locks are
granted.
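The lease_time rules in this section can be sketched as two small
functions, one for the client and one for the destination server.
All names are hypothetical, times are in seconds, and the second
function is a simplification that checks only the constraint stated
for each of the two cases.

```python
from typing import Callable


def lease_time_to_use(state_transferred: bool,
                      source_lease: int,
                      fetch_destination_lease: Callable[[], int]) -> int:
    """Pick the lease_time the client uses after a transition."""
    if state_transferred:
        # No obligation to re-fetch: the value previously fetched on
        # the source server may continue to be used.
        return source_lease
    # The client ID was rejected: fetch and use the destination value.
    return fetch_destination_lease()


def destination_values_acceptable(source_lease: int,
                                  destination_lease: int,
                                  grace_period: int,
                                  state_transferred: bool) -> bool:
    """Check the destination server's values against the text above."""
    if state_transferred:
        # Leases granted by the source must not expire prematurely.
        return destination_lease >= source_lease
    # Clients need at least one source lease period to reclaim locks.
    return grace_period >= source_lease
```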
7.7.7. Write Verifiers and File System Transitions
In a file system transition, the two file systems may be clustered in
the handling of unstably written data. When this is the case, and
the two file systems belong to the same _write-verifier_ class, write
verifiers returned from one system may be compared to those returned
by the other and superfluous writes avoided.
When two file systems belong to different _write-verifier_ classes,
any verifier generated by one must not be compared to one provided by
the other. Instead, it should be treated as not equal even when the
values are identical.
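The comparison rule for write verifiers can be sketched as follows.
The function name is hypothetical, and membership in the same
_write-verifier_ class is treated as an input inferred by the client.

```python
def unstable_writes_need_resend(verifier_at_write: bytes,
                                verifier_at_commit: bytes,
                                same_write_verifier_class: bool) -> bool:
    """Return True if unstably written data has to be written again."""
    if not same_write_verifier_class:
        # Verifiers from different classes are treated as not equal,
        # even when the values happen to be identical.
        return True
    return verifier_at_write != verifier_at_commit
```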
7.7.8. Readdir Cookies and Verifiers and File System Transitions
In a file system transition, the two file systems may be consistent
in their handling of READDIR cookies and verifiers. When this is the
case, and the two file systems belong to the same _readdir_ class,
READDIR cookies and verifiers from one system may be recognized by
the other and READDIR operations started on one server may be validly
continued on the other, simply by presenting the cookie and verifier
returned by a READDIR operation done on the first file system to the
second.
When two file systems belong to different _readdir_ classes, any
READDIR cookie and verifier generated by one is not valid on the
second, and must not be presented to that server by the client. The
client should act as if the verifier was rejected.
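A client might apply this rule when continuing a directory read
after a transition, as in the sketch below. The function name is
hypothetical. Restarting the read from the beginning, with a cookie
of zero and an all-zero verifier, is shown as one way of acting as if
the verifier had been rejected; it is an assumption of this sketch
rather than a requirement stated here.

```python
from typing import Tuple

# An 8-byte verifier of all zeros, used together with a cookie of zero
# to start a READDIR from the beginning of the directory.
ZERO_VERIFIER = bytes(8)


def readdir_resume_args(cookie: int,
                        verifier: bytes,
                        same_readdir_class: bool) -> Tuple[int, bytes]:
    """Return the (cookie, verifier) pair to present to the new server."""
    if same_readdir_class:
        # The values from the first file system remain valid.
        return cookie, verifier
    # Different class: never present the old values to the new server.
    return 0, ZERO_VERIFIER
```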
7.7.9. File System Data and File System Transitions
When multiple replicas exist and are used simultaneously or in
succession by a client, applications using them will normally expect
that they contain the same data or data which is consistent with
the normal sorts of changes that are made by other clients updating
the data of the file system (with metadata being the same to the
degree inferred by the fs_locations attribute). However, when
multiple file systems are presented as replicas of one another, the
precise relationship between the data of one and the data of another
is not, as a general matter, specified by the NFSv4 protocol. It is
quite possible to present as replicas file systems where the data of
those file systems is sufficiently different that some applications
have problems dealing with the transition between replicas. The
namespace will typically be constructed so that applications can
choose an appropriate level of support, so that in one position in
the namespace a varied set of replicas will be listed while in
another only those that are up-to-date may be considered replicas.
The protocol does define four special cases of the relationship
among replicas to be specified by the server and relied upon by
clients:
o When multiple server addresses correspond to the same actual
server, the client may depend on the fact that changes to data,
metadata, or locks made on one file system are immediately
reflected on others.
o When multiple replicas exist and are used simultaneously by a
client, they must designate the same data. Where file systems are
writable, a change made on one instance must be visible on all
instances, immediately upon the earlier of the return of the
modifying requester or the visibility of that change on any of the
associated replicas. This allows a client to use these replicas
simultaneously without any special adaptation to the fact that
there are multiple replicas. In this case, locks, whether shared
or byte-range, and delegations obtained on one replica are
immediately reflected on all replicas, even though these locks
will be managed under a set of client IDs.
o When one replica is designated as the successor instance to
another existing instance after returning NFS4ERR_MOVED (i.e. the
case of migration), the client may depend on the fact that all
changes securely made to data (uncommitted writes are dealt with
in Section 7.7.7) on the original instance are made to the
successor image.
o Where a file system is not writable but represents a read-only
copy (possibly periodically updated) of a writable file system,
clients have similar requirements with regard to the propagation
of updates. They may need a guarantee that any change visible on
the original file system instance must be immediately visible on
any replica before the client transitions access to that replica,
in order to avoid any possibility that a client, in effecting a
transition to a replica, will see any reversion in file system
state. Since these file systems are presumed not to be suitable
for simultaneous use, there is no specification of how locking is
handled and it generally will be the case that locks obtained on one
file system will be separate from those on others. Since these
are going to be read-only file systems, this is not expected to
pose an issue for clients or applications.
7.8. Effecting File System Referrals
Referrals are effected when an absent file system is encountered, and
one or more alternate locations are made available by the
fs_locations attribute. The client will typically get an
NFS4ERR_MOVED error, fetch the appropriate location information and
proceed to access the file system on a different server, even though
it retains its logical position within the original namespace.
Referrals differ from migration events in that they happen only when
the client has not previously referenced the file system in question
(so there is nothing to transition). Referrals can only come into
effect when an absent file system is encountered at its root.
The examples given in the sections below are somewhat artificial in
that an actual client will not typically do a multi-component lookup,
but will have cached information regarding the upper levels of the
name hierarchy. However, these examples are chosen to make the
required behavior clear and easy to put within the scope of a small
number of requests, without getting unduly into details of how
specific clients might choose to cache things.
7.8.1. Referral Example (LOOKUP)
Let us suppose that the following COMPOUND is sent in an environment
in which /this/is/the/path is absent from the target server. This
may be for a number of reasons. It may be the case that the file
system has moved, or, it may be the case that the target server is
functioning mainly, or solely, to refer clients to the servers on
which various file systems are located.
o PUTROOTFH
o LOOKUP "this"
o LOOKUP "is"
o LOOKUP "the"
o LOOKUP "path"
o GETFH
o GETATTR fsid,fileid,size,time_modify
Under the given circumstances, the following will be the result.
o PUTROOTFH --> NFS_OK. The current fh is now the root of the
pseudo-fs.
o LOOKUP "this" --> NFS_OK. The current fh is for /this and is
within the pseudo-fs.
o LOOKUP "is" --> NFS_OK. The current fh is for /this/is and is
within the pseudo-fs.
o LOOKUP "the" --> NFS_OK. The current fh is for /this/is/the and
is within the pseudo-fs.
o LOOKUP "path" --> NFS_OK. The current fh is for /this/is/the/path
and is within a new, absent file system, but ... the client will
never see the value of that fh.
o GETFH --> NFS4ERR_MOVED. Fails because current fh is in an absent
file system at the start of the operation and the spec makes no
exception for GETFH.
o GETATTR fsid,fileid,size,time_modify. Not executed because the
failure of the GETFH stops processing of the COMPOUND.
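The way the failing GETFH cuts the COMPOUND short can be modeled with
a few lines of code. This is a toy model, not a client or server
implementation: operations are represented as callables returning a
status string, and the names run_compound and the status constants
are defined only for this sketch.

```python
from typing import Callable, List, Tuple

NFS_OK = "NFS_OK"
NFS4ERR_MOVED = "NFS4ERR_MOVED"


def run_compound(ops: List[Tuple[str, Callable[[], str]]]
                 ) -> List[Tuple[str, str]]:
    """Evaluate operations in order, stopping after the first failure.

    Returns the (name, status) pairs for the operations that were
    executed; operations after a failure produce no result at all.
    """
    results: List[Tuple[str, str]] = []
    for name, op in ops:
        status = op()
        results.append((name, status))
        if status != NFS_OK:
            # The first error ends processing of the COMPOUND.
            break
    return results
```

Feeding this model the seven operations of the example, with GETFH
returning NFS4ERR_MOVED, yields six results: the final GETATTR is
never evaluated.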
Given the failure of the GETFH, the client has the job of determining
the root of the absent file system and where to find that file
system, i.e. the server and path relative to that server's root fh.
Note here that in this example, the client did not obtain filehandles
and attribute information (e.g. fsid) for the intermediate
directories, so that it would not be sure where the absent file
system starts. It could be the case, for example, that /this/is/the
is the root of the moved file system and that the reason that the
lookup of "path" succeeded is that the file system was not absent on
that operation but was moved between the last LOOKUP and the GETFH
(since COMPOUND is not atomic). Even if we had the fsids for all of
the intermediate directories, we would have no way of knowing that
/this/is/the/path was the root of a new file system, since we don't
yet have its fsid.
In order to get the necessary information, let us re-send the chain
of LOOKUPs with GETFHs and GETATTRs to at least get the fsids so we
can be sure where the appropriate file system boundaries are. The
client could choose to get fs_locations at the same time but in most
cases the client will have a good guess as to where file system
boundaries are (because of where NFS4ERR_MOVED was and was not
received) making fetching of fs_locations unnecessary.
OP01: PUTROOTFH --> NFS_OK
- Current fh is root of pseudo-fs.
OP02: GETATTR(fsid) --> NFS_OK
- Just for completeness. Normally, clients will know the fsid of
the pseudo-fs as soon as they establish communication with a
server.
OP03: LOOKUP "this" --> NFS_OK
OP04: GETATTR(fsid) --> NFS_OK
- Get current fsid to see where file system boundaries are. The
fsid will be that for the pseudo-fs in this example, so no
boundary.
OP05: GETFH --> NFS_OK
- Current fh is for /this and is within pseudo-fs.
OP06: LOOKUP "is" --> NFS_OK
- Current fh is for /this/is and is within pseudo-fs.
OP07: GETATTR(fsid) --> NFS_OK
- Get current fsid to see where file system boundaries are. The
fsid will be that for the pseudo-fs in this example, so no
boundary.
OP08: GETFH --> NFS_OK
- Current fh is for /this/is and is within pseudo-fs.
OP09: LOOKUP "the" --> NFS_OK
- Current fh is for /this/is/the and is within pseudo-fs.
OP10: GETATTR(fsid) --> NFS_OK
- Get current fsid to see where file system boundaries are. The
fsid will be that for the pseudo-fs in this example, so no
boundary.
OP11: GETFH --> NFS_OK
- Current fh is for /this/is/the and is within pseudo-fs.
OP12: LOOKUP "path" --> NFS_OK
- Current fh is for /this/is/the/path and is within a new, absent
file system, but ...
- The client will never see the value of that fh.
OP13: GETATTR(fsid, fs_locations) --> NFS_OK
- We are getting the fsid to know where the file system boundaries
are. In this operation the fsid will be different than that of
the parent directory (which in turn was retrieved in OP10). Note
that the fsid we are given will not necessarily be preserved at
the new location. That fsid might be different and in fact the
fsid we have for this file system might be a valid fsid of a
different file system on that new server.
- In this particular case, we are pretty sure anyway that what has
moved is /this/is/the/path rather than /this/is/the since we have
the fsid of the latter and it is