draft-ietf-nfsv4-minorversion2-40.txt   draft-ietf-nfsv4-minorversion2-41.txt 
NFSv4 T. Haynes NFSv4 T. Haynes
Internet-Draft Primary Data Internet-Draft Primary Data
Intended status: Standards Track January 06, 2016 Intended status: Standards Track January 28, 2016
Expires: July 9, 2016 Expires: July 31, 2016
NFS Version 4 Minor Version 2 NFS Version 4 Minor Version 2
draft-ietf-nfsv4-minorversion2-40.txt draft-ietf-nfsv4-minorversion2-41.txt
Abstract Abstract
This Internet-Draft describes NFS version 4 minor version two, This Internet-Draft describes NFS version 4 minor version two,
describing the protocol extensions made from NFS version 4 minor describing the protocol extensions made from NFS version 4 minor
version 1. Major extensions introduced in NFS version 4 minor version 1. Major extensions introduced in NFS version 4 minor
version two include: Server Side Copy, Application Input/Output (I/O) version two include: Server Side Copy, Application Input/Output (I/O)
Advise, Space Reservations, Sparse Files, Application Data Blocks, Advise, Space Reservations, Sparse Files, Application Data Blocks,
and Labeled NFS. and Labeled NFS.
skipping to change at page 1, line 41 skipping to change at page 1, line 41
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on July 9, 2016. This Internet-Draft will expire on July 31, 2016.
Copyright Notice Copyright Notice
Copyright (c) 2016 IETF Trust and the persons identified as the Copyright (c) 2016 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 2, line 17 skipping to change at page 2, line 17
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 4 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 4
1.1. Scope of This Document . . . . . . . . . . . . . . . . . 5 1.1. Scope of This Document . . . . . . . . . . . . . . . . . 5
1.2. NFSv4.2 Goals . . . . . . . . . . . . . . . . . . . . . . 5 1.2. NFSv4.2 Goals . . . . . . . . . . . . . . . . . . . . . . 5
1.3. Overview of NFSv4.2 Features . . . . . . . . . . . . . . 5 1.3. Overview of NFSv4.2 Features . . . . . . . . . . . . . . 6
1.3.1. Server Side Clone and Copy . . . . . . . . . . . . . 5 1.3.1. Server Side Clone and Copy . . . . . . . . . . . . . 6
1.3.2. Application Input/Output (I/O) Advise . . . . . . . . 6 1.3.2. Application Input/Output (I/O) Advise . . . . . . . . 6
1.3.3. Sparse Files . . . . . . . . . . . . . . . . . . . . 6 1.3.3. Sparse Files . . . . . . . . . . . . . . . . . . . . 6
1.3.4. Space Reservation . . . . . . . . . . . . . . . . . . 6 1.3.4. Space Reservation . . . . . . . . . . . . . . . . . . 7
1.3.5. Application Data Block (ADB) Support . . . . . . . . 6 1.3.5. Application Data Block (ADB) Support . . . . . . . . 7
1.3.6. Labeled NFS . . . . . . . . . . . . . . . . . . . . . 7 1.3.6. Labeled NFS . . . . . . . . . . . . . . . . . . . . . 7
1.3.7. Layout Enhancements . . . . . . . . . . . . . . . . . 7 1.3.7. Layout Enhancements . . . . . . . . . . . . . . . . . 7
1.4. Enhancements to Minor Versioning Model . . . . . . . . . 7 1.4. Enhancements to Minor Versioning Model . . . . . . . . . 7
2. Minor Versioning . . . . . . . . . . . . . . . . . . . . . . 7 2. Minor Versioning . . . . . . . . . . . . . . . . . . . . . . 8
3. pNFS considerations for New Operations . . . . . . . . . . . 8 3. pNFS considerations for New Operations . . . . . . . . . . . 8
3.1. Atomicity for ALLOCATE and DEALLOCATE . . . . . . . . . . 8 3.1. Atomicity for ALLOCATE and DEALLOCATE . . . . . . . . . . 9
3.2. Sharing of stateids with NFSv4.1 . . . . . . . . . . . . 8 3.2. Sharing of stateids with NFSv4.1 . . . . . . . . . . . . 9
3.3. NFSv4.2 as a Storage Protocol in pNFS: the File Layout 3.3. NFSv4.2 as a Storage Protocol in pNFS: the File Layout
Type . . . . . . . . . . . . . . . . . . . . . . . . . . 9 Type . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.3.1. Operations Sent to NFSv4.2 Data Servers . . . . . . . 9 3.3.1. Operations Sent to NFSv4.2 Data Servers . . . . . . . 9
4. Server Side Copy . . . . . . . . . . . . . . . . . . . . . . 9 4. Server Side Copy . . . . . . . . . . . . . . . . . . . . . . 9
4.1. Introduction . . . . . . . . . . . . . . . . . . . . . . 9 4.1. Protocol Overview . . . . . . . . . . . . . . . . . . . . 10
4.2. Protocol Overview . . . . . . . . . . . . . . . . . . . . 9 4.1.1. Copy Operations . . . . . . . . . . . . . . . . . . . 11
4.2.1. Copy Operations . . . . . . . . . . . . . . . . . . . 10 4.1.2. Requirements for Operations . . . . . . . . . . . . . 11
4.2.2. Requirements for Operations . . . . . . . . . . . . . 11 4.2. Requirements for Inter-Server Copy . . . . . . . . . . . 12
4.3. Requirements for Inter-Server Copy . . . . . . . . . . . 12 4.3. Implementation Considerations . . . . . . . . . . . . . . 13
4.4. Implementation Considerations . . . . . . . . . . . . . . 12 4.3.1. Locking the Files . . . . . . . . . . . . . . . . . . 13
4.4.1. Locking the Files . . . . . . . . . . . . . . . . . . 12 4.3.2. Client Caches . . . . . . . . . . . . . . . . . . . . 13
4.4.2. Client Caches . . . . . . . . . . . . . . . . . . . . 13 4.4. Intra-Server Copy . . . . . . . . . . . . . . . . . . . . 13
4.5. Intra-Server Copy . . . . . . . . . . . . . . . . . . . . 13 4.5. Inter-Server Copy . . . . . . . . . . . . . . . . . . . . 15
4.6. Inter-Server Copy . . . . . . . . . . . . . . . . . . . . 14 4.6. Server-to-Server Copy Protocol . . . . . . . . . . . . . 19
4.7. Server-to-Server Copy Protocol . . . . . . . . . . . . . 18 4.6.1. Considerations on Selecting a Copy Protocol . . . . . 19
4.7.1. Considerations on Selecting a Copy Protocol . . . . . 18 4.6.2. Using NFSv4.x as the Copy Protocol . . . . . . . . . 19
4.7.2. Using NFSv4.x as the Copy Protocol . . . . . . . . . 18 4.6.3. Using an Alternative Copy Protocol . . . . . . . . . 19
4.7.3. Using an Alternative Copy Protocol . . . . . . . . . 18 4.7. netloc4 - Network Locations . . . . . . . . . . . . . . . 20
4.8. netloc4 - Network Locations . . . . . . . . . . . . . . . 19 4.8. Copy Offload Stateids . . . . . . . . . . . . . . . . . . 21
4.9. Copy Offload Stateids . . . . . . . . . . . . . . . . . . 20 4.9. Security Considerations . . . . . . . . . . . . . . . . . 21
4.10. Security Considerations . . . . . . . . . . . . . . . . . 20 4.9.1. Inter-Server Copy Security . . . . . . . . . . . . . 22
4.10.1. Inter-Server Copy Security . . . . . . . . . . . . . 21 5. Support for Application I/O Hints . . . . . . . . . . . . . . 29
6. Sparse Files . . . . . . . . . . . . . . . . . . . . . . . . 30
5. Support for Application I/O Hints . . . . . . . . . . . . . . 28 6.1. Terminology . . . . . . . . . . . . . . . . . . . . . . . 31
6. Sparse Files . . . . . . . . . . . . . . . . . . . . . . . . 28 6.2. New Operations . . . . . . . . . . . . . . . . . . . . . 31
6.1. Terminology . . . . . . . . . . . . . . . . . . . . . . . 29 6.2.1. READ_PLUS . . . . . . . . . . . . . . . . . . . . . . 31
6.2. New Operations . . . . . . . . . . . . . . . . . . . . . 30 6.2.2. DEALLOCATE . . . . . . . . . . . . . . . . . . . . . 31
6.2.1. READ_PLUS . . . . . . . . . . . . . . . . . . . . . . 30 7. Space Reservation . . . . . . . . . . . . . . . . . . . . . . 32
6.2.2. DEALLOCATE . . . . . . . . . . . . . . . . . . . . . 30 8. Application Data Block Support . . . . . . . . . . . . . . . 34
7. Space Reservation . . . . . . . . . . . . . . . . . . . . . . 30 8.1. Generic Framework . . . . . . . . . . . . . . . . . . . . 34
8. Application Data Block Support . . . . . . . . . . . . . . . 32 8.1.1. Data Block Representation . . . . . . . . . . . . . . 35
8.1. Generic Framework . . . . . . . . . . . . . . . . . . . . 33 8.2. An Example of Detecting Corruption . . . . . . . . . . . 35
8.1.1. Data Block Representation . . . . . . . . . . . . . . 34 8.3. Example of READ_PLUS . . . . . . . . . . . . . . . . . . 37
8.2. An Example of Detecting Corruption . . . . . . . . . . . 34 8.4. An Example of Zeroing Space . . . . . . . . . . . . . . . 38
8.3. Example of READ_PLUS . . . . . . . . . . . . . . . . . . 36 9. Labeled NFS . . . . . . . . . . . . . . . . . . . . . . . . . 38
8.4. An Example of Zeroing Space . . . . . . . . . . . . . . . 36 9.1. Definitions . . . . . . . . . . . . . . . . . . . . . . . 39
9. Labeled NFS . . . . . . . . . . . . . . . . . . . . . . . . . 37 9.2. MAC Security Attribute . . . . . . . . . . . . . . . . . 40
9.1. Definitions . . . . . . . . . . . . . . . . . . . . . . . 37 9.2.1. Delegations . . . . . . . . . . . . . . . . . . . . . 40
9.2. MAC Security Attribute . . . . . . . . . . . . . . . . . 38 9.2.2. Permission Checking . . . . . . . . . . . . . . . . . 41
9.2.1. Delegations . . . . . . . . . . . . . . . . . . . . . 39 9.2.3. Object Creation . . . . . . . . . . . . . . . . . . . 41
9.2.2. Permission Checking . . . . . . . . . . . . . . . . . 39 9.2.4. Existing Objects . . . . . . . . . . . . . . . . . . 41
9.2.3. Object Creation . . . . . . . . . . . . . . . . . . . 39 9.2.5. Label Changes . . . . . . . . . . . . . . . . . . . . 41
9.2.4. Existing Objects . . . . . . . . . . . . . . . . . . 40 9.3. pNFS Considerations . . . . . . . . . . . . . . . . . . . 42
9.2.5. Label Changes . . . . . . . . . . . . . . . . . . . . 40 9.4. Discovery of Server Labeled NFS Support . . . . . . . . . 42
9.3. pNFS Considerations . . . . . . . . . . . . . . . . . . . 40 9.5. MAC Security NFS Modes of Operation . . . . . . . . . . . 42
9.4. Discovery of Server Labeled NFS Support . . . . . . . . . 41 9.5.1. Full Mode . . . . . . . . . . . . . . . . . . . . . . 43
9.5. MAC Security NFS Modes of Operation . . . . . . . . . . . 41 9.5.2. Guest Mode . . . . . . . . . . . . . . . . . . . . . 44
9.5.1. Full Mode . . . . . . . . . . . . . . . . . . . . . . 41 9.6. Security Considerations for Labeled NFS . . . . . . . . . 44
9.5.2. Guest Mode . . . . . . . . . . . . . . . . . . . . . 43
9.6. Security Considerations for Labeled NFS . . . . . . . . . 43
10. Sharing change attribute implementation characteristics with 10. Sharing change attribute implementation characteristics with
NFSv4 clients . . . . . . . . . . . . . . . . . . . . . . . . 43 NFSv4 clients . . . . . . . . . . . . . . . . . . . . . . . . 45
11. Error Values . . . . . . . . . . . . . . . . . . . . . . . . 44 11. Error Values . . . . . . . . . . . . . . . . . . . . . . . . 45
11.1. Error Definitions . . . . . . . . . . . . . . . . . . . 44 11.1. Error Definitions . . . . . . . . . . . . . . . . . . . 46
11.1.1. General Errors . . . . . . . . . . . . . . . . . . . 45 11.1.1. General Errors . . . . . . . . . . . . . . . . . . . 46
11.1.2. Server to Server Copy Errors . . . . . . . . . . . . 45 11.1.2. Server to Server Copy Errors . . . . . . . . . . . . 46
11.1.3. Labeled NFS Errors . . . . . . . . . . . . . . . . . 46 11.1.3. Labeled NFS Errors . . . . . . . . . . . . . . . . . 47
11.2. New Operations and Their Valid Errors . . . . . . . . . 46 11.2. New Operations and Their Valid Errors . . . . . . . . . 47
11.3. New Callback Operations and Their Valid Errors . . . . . 50 11.3. New Callback Operations and Their Valid Errors . . . . . 52
12. New File Attributes . . . . . . . . . . . . . . . . . . . . . 51 12. New File Attributes . . . . . . . . . . . . . . . . . . . . . 52
12.1. New RECOMMENDED Attributes - List and Definition 12.1. New RECOMMENDED Attributes - List and Definition
References . . . . . . . . . . . . . . . . . . . . . . . 51 References . . . . . . . . . . . . . . . . . . . . . . . 52
12.2. Attribute Definitions . . . . . . . . . . . . . . . . . 52 12.2. Attribute Definitions . . . . . . . . . . . . . . . . . 53
13. Operations: REQUIRED, RECOMMENDED, or OPTIONAL . . . . . . . 54 13. Operations: REQUIRED, RECOMMENDED, or OPTIONAL . . . . . . . 56
14. Modifications to NFSv4.1 Operations . . . . . . . . . . . . . 58 14. Modifications to NFSv4.1 Operations . . . . . . . . . . . . . 59
14.1. Operation 42: EXCHANGE_ID - Instantiate Client ID . . . 58 14.1. Operation 42: EXCHANGE_ID - Instantiate Client ID . . . 59
14.2. Operation 48: GETDEVICELIST - Get All Device Mappings 14.2. Operation 48: GETDEVICELIST - Get All Device Mappings
for a File System . . . . . . . . . . . . . . . . . . . 59 for a File System . . . . . . . . . . . . . . . . . . . 60
15. NFSv4.2 Operations . . . . . . . . . . . . . . . . . . . . . 61 15. NFSv4.2 Operations . . . . . . . . . . . . . . . . . . . . . 62
15.1. Operation 59: ALLOCATE - Reserve Space in A Region of a 15.1. Operation 59: ALLOCATE - Reserve Space in A Region of a
File . . . . . . . . . . . . . . . . . . . . . . . . . . 61 File . . . . . . . . . . . . . . . . . . . . . . . . . . 62
15.2. Operation 60: COPY - Initiate a server-side copy . . . . 63
15.2. Operation 60: COPY - Initiate a server-side copy . . . . 62
15.3. Operation 61: COPY_NOTIFY - Notify a source server of a 15.3. Operation 61: COPY_NOTIFY - Notify a source server of a
future copy . . . . . . . . . . . . . . . . . . . . . . 66 future copy . . . . . . . . . . . . . . . . . . . . . . 68
15.4. Operation 62: DEALLOCATE - Unreserve Space in a Region 15.4. Operation 62: DEALLOCATE - Unreserve Space in a Region
of a File . . . . . . . . . . . . . . . . . . . . . . . 68 of a File . . . . . . . . . . . . . . . . . . . . . . . 70
15.5. Operation 63: IO_ADVISE - Application I/O access pattern 15.5. Operation 63: IO_ADVISE - Application I/O access pattern
hints . . . . . . . . . . . . . . . . . . . . . . . . . 70 hints . . . . . . . . . . . . . . . . . . . . . . . . . 71
15.6. Operation 64: LAYOUTERROR - Provide Errors for the 15.6. Operation 64: LAYOUTERROR - Provide Errors for the
Layout . . . . . . . . . . . . . . . . . . . . . . . . . 76 Layout . . . . . . . . . . . . . . . . . . . . . . . . . 77
15.7. Operation 65: LAYOUTSTATS - Provide Statistics for the 15.7. Operation 65: LAYOUTSTATS - Provide Statistics for the
Layout . . . . . . . . . . . . . . . . . . . . . . . . . 79 Layout . . . . . . . . . . . . . . . . . . . . . . . . . 80
15.8. Operation 66: OFFLOAD_CANCEL - Stop an Offloaded 15.8. Operation 66: OFFLOAD_CANCEL - Stop an Offloaded
Operation . . . . . . . . . . . . . . . . . . . . . . . 80 Operation . . . . . . . . . . . . . . . . . . . . . . . 81
15.9. Operation 67: OFFLOAD_STATUS - Poll for Status of 15.9. Operation 67: OFFLOAD_STATUS - Poll for Status of
Asynchronous Operation . . . . . . . . . . . . . . . . . 81 Asynchronous Operation . . . . . . . . . . . . . . . . . 82
15.10. Operation 68: READ_PLUS - READ Data or Holes from a File 82 15.10. Operation 68: READ_PLUS - READ Data or Holes from a File 83
15.11. Operation 69: SEEK - Find the Next Data or Hole . . . . 87 15.11. Operation 69: SEEK - Find the Next Data or Hole . . . . 88
15.12. Operation 70: WRITE_SAME - WRITE an ADB Multiple Times 15.12. Operation 70: WRITE_SAME - WRITE an ADB Multiple Times
to a File . . . . . . . . . . . . . . . . . . . . . . . 88 to a File . . . . . . . . . . . . . . . . . . . . . . . 89
15.13. Operation 71: CLONE - Clone a range of file into another 15.13. Operation 71: CLONE - Clone a range of file into another
file . . . . . . . . . . . . . . . . . . . . . . . . . . 92 file . . . . . . . . . . . . . . . . . . . . . . . . . . 93
16. NFSv4.2 Callback Operations . . . . . . . . . . . . . . . . . 94 16. NFSv4.2 Callback Operations . . . . . . . . . . . . . . . . . 95
16.1. Operation 15: CB_OFFLOAD - Report results of an 16.1. Operation 15: CB_OFFLOAD - Report results of an
asynchronous operation . . . . . . . . . . . . . . . . . 94 asynchronous operation . . . . . . . . . . . . . . . . . 95
17. Security Considerations . . . . . . . . . . . . . . . . . . . 95 17. Security Considerations . . . . . . . . . . . . . . . . . . . 96
18. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 96 18. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 97
19. References . . . . . . . . . . . . . . . . . . . . . . . . . 96 19. References . . . . . . . . . . . . . . . . . . . . . . . . . 97
19.1. Normative References . . . . . . . . . . . . . . . . . . 96 19.1. Normative References . . . . . . . . . . . . . . . . . . 97
19.2. Informative References . . . . . . . . . . . . . . . . . 97 19.2. Informative References . . . . . . . . . . . . . . . . . 98
Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . 98 Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . 99
Appendix B. RFC Editor Notes . . . . . . . . . . . . . . . . . . 99 Appendix B. RFC Editor Notes . . . . . . . . . . . . . . . . . . 100
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 99 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 101
1. Introduction 1. Introduction
The NFS version 4 minor version 2 (NFSv4.2) protocol is the third The NFS version 4 minor version 2 (NFSv4.2) protocol is the third
minor version of the NFS version 4 (NFSv4) protocol. The first minor minor version of the NFS version 4 (NFSv4) protocol. The first minor
version, NFSv4.0, is described in [RFC7530] and the second minor version, NFSv4.0, is described in [RFC7530] and the second minor
version, NFSv4.1, is described in [RFC5661]. version, NFSv4.1, is described in [RFC5661].
As a minor version, NFSv4.2 is consistent with the overall goals for As a minor version, NFSv4.2 is consistent with the overall goals for
NFSv4, but extends the protocol so as to better meet those goals, NFSv4, but extends the protocol so as to better meet those goals,
based on experiences with NFSv4.1. In addition, NFSv4.2 has adopted based on experiences with NFSv4.1. In addition, NFSv4.2 has adopted
some additional goals, which motivate some of the major extensions in some additional goals, which motivate some of the major extensions in
NFSv4.2. NFSv4.2.
1.1. Scope of This Document 1.1. Scope of This Document
This document describes the NFSv4.2 protocol. With respect to This document describes the NFSv4.2 protocol as a set of extensions
NFSv4.0 and NFSv4.1, this document does not: to the specification for NFSv4.1. That specification remains current
and forms the basis for the additions defined herein. In addition,
the specification for NFSv4.0 remains current as well.
It is necessary to implement all the REQUIRED features of NFSv4.1
before adding NFSv4.2 features to the implementation. With respect
to NFSv4.0 and NFSv4.1, this document does not:
o describe the NFSv4.0 or NFSv4.1 protocols, except where needed to o describe the NFSv4.0 or NFSv4.1 protocols, except where needed to
contrast with NFSv4.2 contrast with NFSv4.2
o modify the specification of the NFSv4.0 or NFSv4.1 protocols o modify the specification of the NFSv4.0 or NFSv4.1 protocols
o clarify the NFSv4.0 or NFSv4.1 protocols, that is any o clarify the NFSv4.0 or NFSv4.1 protocols, that is any
clarifications made here apply only to NFSv4.2 and neither of the clarifications made here apply only to NFSv4.2 and neither of the
prior protocols prior protocols
NFSv4.2 is a superset of NFSv4.1, with all of the new features being NFSv4.2 is a superset of NFSv4.1, with all of the new features being
optional. As such, NFSv4.2 maintains the same compatibility that optional. As such, NFSv4.2 maintains the same compatibility that
NFSv4.1 had with NFSv4.0. Any interactions of a new feature with NFSv4.1 had with NFSv4.0. Any interactions of a new feature with
NFSv4.1 semantics are described in the relevant text. NFSv4.1 semantics are described in the relevant text.
The full External Data Representation (XDR) [RFC4506] for NFSv4.2 is The full External Data Representation (XDR) [RFC4506] for NFSv4.2 is
presented in [I-D.ietf-nfsv4-minorversion2-dot-x]. presented in [I-D.ietf-nfsv4-minorversion2-dot-x].
1.2. NFSv4.2 Goals 1.2. NFSv4.2 Goals
A major goal of the design of NFSv4.2 is to take common local file A major goal of the enhancements provided in NFSv4.2 is to take
system features and offer them remotely. These features might common local file system features that have not been available
through earlier versions of NFS, and to offer them remotely. These
features might
o already be available on the servers, e.g., sparse files o already be available on the servers, e.g., sparse files
o be under development as a new standard, e.g., SEEK pulls in both o be under development as a new standard, e.g., SEEK pulls in both
SEEK_HOLE and SEEK_DATA SEEK_HOLE and SEEK_DATA
o be used by clients with the servers via some proprietary means, o be used by clients with the servers via some proprietary means,
e.g., Labeled NFS e.g., Labeled NFS
NFSv4.2 provides means for clients to leverage these features on the NFSv4.2 provides means for clients to leverage these features on the
skipping to change at page 6, line 51 skipping to change at page 7, line 17
When a file is sparse, one concern applications have is ensuring that When a file is sparse, one concern applications have is ensuring that
there will always be enough data blocks available for the file during there will always be enough data blocks available for the file during
future writes. ALLOCATE (see Section 15.1) allows a client to future writes. ALLOCATE (see Section 15.1) allows a client to
request a guarantee that space will be available. Also DEALLOCATE request a guarantee that space will be available. Also DEALLOCATE
(see Section 15.4) allows the client to punch a hole into a file, (see Section 15.4) allows the client to punch a hole into a file,
thus releasing a space reservation. thus releasing a space reservation.
1.3.5. Application Data Block (ADB) Support 1.3.5. Application Data Block (ADB) Support
Some applications treat a file as if it were a disk and as such want Some applications treat a file as if it were a disk and as such want
to initialize (or format) the file image. We introduce WRITE_SAME to initialize (or format) the file image. The WRITE_SAME (see
(see Section 15.12) to send this metadata to the server to allow it Section 15.12) is introduced to send this metadata to the server to
to write the block contents. allow it to write the block contents.
1.3.6. Labeled NFS 1.3.6. Labeled NFS
While both clients and servers can employ Mandatory Access Control While both clients and servers can employ Mandatory Access Control
(MAC) security models to enforce data access, there has been no (MAC) security models to enforce data access, there has been no
protocol support for interoperability. A new file object attribute, protocol support for interoperability. A new file object attribute,
sec_label (see Section 12.2.4) allows for the server to store MAC sec_label (see Section 12.2.4) allows for the server to store MAC
labels on files, which the client retrieves and uses to enforce data labels on files, which the client retrieves and uses to enforce data
access (see Section 9.5.2). The format of the sec_label accommodates access (see Section 9.5.2). The format of the sec_label accommodates
any MAC security system. any MAC security system.
skipping to change at page 7, line 31 skipping to change at page 7, line 45
any errors or performance characteristics with the storage devices. any errors or performance characteristics with the storage devices.
NFSv4.2 provides two new operations to do so respectively: NFSv4.2 provides two new operations to do so respectively:
LAYOUTERROR (see Section 15.6) and LAYOUTSTATS (see Section 15.7). LAYOUTERROR (see Section 15.6) and LAYOUTSTATS (see Section 15.7).
1.4. Enhancements to Minor Versioning Model 1.4. Enhancements to Minor Versioning Model
In NFSv4.1, the only way to introduce new variants of an operation In NFSv4.1, the only way to introduce new variants of an operation
was to introduce a new operation. For instance, READ would have to was to introduce a new operation. For instance, READ would have to
be replaced or supplemented by, say, either READ2 or READ_PLUS. With be replaced or supplemented by, say, either READ2 or READ_PLUS. With
the use of discriminated unions as parameters to such functions in the use of discriminated unions as parameters to such functions in
NFSv4.2, it is possible to add a new arm in a subsequent minor NFSv4.2, it is possible to add a new arm (i.e., a new entry in the
version. And it is also possible to move such an operation from union and a corresponding new field in the structure) in a subsequent
OPTIONAL/RECOMMENDED to REQUIRED. Forcing an implementation to adopt minor version. And it is also possible to move such an operation
each arm of a discriminated union at such a time does not meet the from OPTIONAL/RECOMMENDED to REQUIRED. Forcing an implementation to
spirit of the minor versioning rules. As such, new arms of a adopt each arm of a discriminated union at such a time does not meet
the spirit of the minor versioning rules. As such, new arms of a
discriminated union MUST follow the same guidelines for minor discriminated union MUST follow the same guidelines for minor
versioning as operations in NFSv4.1 - i.e., they may not be made versioning as operations in NFSv4.1 - i.e., they may not be made
REQUIRED. To support this, a new error code, NFS4ERR_UNION_NOTSUPP, REQUIRED. To support this, a new error code, NFS4ERR_UNION_NOTSUPP,
allows the server to communicate to the client that the operation is allows the server to communicate to the client that the operation is
supported, but the specific arm of the discriminated union is not. supported, but the specific arm of the discriminated union is not.
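The distinction drawn above, between an operation that is unsupported as a whole and an operation that is supported but called with an unsupported arm of its discriminated union, can be sketched as follows. This is a minimal illustration and not part of either draft: the operation name EXAMPLE_OP, the arm names, the SUPPORTED_ARMS table, and the check_support helper are all hypothetical, and the use of NFS4ERR_NOTSUPP for a wholly unsupported operation is taken from NFSv4.1 rather than from the text above. Status codes are shown by name only; their wire values are omitted.

```python
from enum import Enum


class Status(Enum):
    # Symbolic names only; numeric wire values are deliberately omitted.
    NFS4_OK = "NFS4_OK"
    NFS4ERR_NOTSUPP = "NFS4ERR_NOTSUPP"
    NFS4ERR_UNION_NOTSUPP = "NFS4ERR_UNION_NOTSUPP"


# Hypothetical server configuration: the operations this server implements
# and, for each one, the arms of its discriminated union it implements.
SUPPORTED_ARMS = {
    "EXAMPLE_OP": {"ARM_A"},
}


def check_support(op, arm):
    """Return the status a server might report for an (operation, arm) pair."""
    if op not in SUPPORTED_ARMS:
        # The operation as a whole is not implemented.
        return Status.NFS4ERR_NOTSUPP
    if arm not in SUPPORTED_ARMS[op]:
        # The operation is implemented, but this arm of the union is not.
        return Status.NFS4ERR_UNION_NOTSUPP
    return Status.NFS4_OK
```

Under these assumptions, a client that receives NFS4ERR_UNION_NOTSUPP can conclude that the operation itself is usable and retry with a different arm, rather than abandoning the operation altogether.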
2. Minor Versioning 2. Minor Versioning
NFSv4.2 is a minor version of NFSv4 and is built upon NFSv4.1 as NFSv4.2 is a minor version of NFSv4 and is built upon NFSv4.1 as
documented in [RFC5661] and [RFC5662]. documented in [RFC5661] and [RFC5662].
NFSv4.2 does not modify the rules applicable to the NFSv4 versioning NFSv4.2 does not modify the rules applicable to the NFSv4 versioning
process and follows the rules set out in [RFC5661] or in standard- process and follows the rules set out in [RFC5661] or in standard-
track documents updating that document (e.g., in an RFC based on track documents updating that document (e.g., in an RFC based on
[NFSv4-Versioning]). [I-D.ietf-nfsv4-versioning]).
NFSv4.2 only defines extensions to NFSv4.1, each of which may be NFSv4.2 only defines extensions to NFSv4.1, each of which may be
supported (or not) independently. It does not supported (or not) independently. It does not
o introduce infrastructural features o introduce infrastructural features
o make existing features MANDATORY to NOT implement o make existing features MANDATORY to NOT implement
o change the status of existing features (i.e., by changing their o change the status of existing features (i.e., by changing their
status among OPTIONAL, RECOMMENDED, REQUIRED). status among OPTIONAL, RECOMMENDED, REQUIRED).
The following versioning-related considerations should be noted. The following versioning-related considerations should be noted.
o When a new case is added to an existing switch, servers need to o When a new case is added to an existing switch, servers need to
report non-support of that new case by returning report non-support of that new case by returning
NFS4ERR_UNION_NOTSUPP. NFS4ERR_UNION_NOTSUPP.
o As regards the potential cross-minor-version transfer of stateids, o As regards the potential cross-minor-version transfer of stateids,
Parallel NFS (pNFS) (see Section 12 of [RFC5661]) implementations Parallel NFS (pNFS) (see Section 12 of [RFC5661]) implementations
of the file mapping type may support the use of an NFSv4.2 metadata of the file mapping type may support the use of an NFSv4.2 metadata
sever (see Sections 1.7.2.2 and 12.2.2 of [RFC5661]) with NFSv4.1 server (see Sections 1.7.2.2 and 12.2.2 of [RFC5661]) with NFSv4.1
data servers. In this context, a stateid returned by an NFSv4.2 data servers. In this context, a stateid returned by an NFSv4.2
COMPOUND will be used in an NFSv4.1 COMPOUND directed to the data COMPOUND will be used in an NFSv4.1 COMPOUND directed to the data
server (see Sections 3.2 and 3.3). server (see Sections 3.2 and 3.3).
3. pNFS considerations for New Operations 3. pNFS considerations for New Operations
The interactions of the new operations with non-pNFS functionality are The interactions of the new operations with non-pNFS functionality are
straightforward and covered in the relevant sections. However, the straightforward and covered in the relevant sections. However, the
interactions of the new operations with pNFS are more complicated and interactions of the new operations with pNFS are more complicated and
this section provides an overview. this section provides an overview.
skipping to change at page 9, line 15 skipping to change at page 9, line 29
3.3. NFSv4.2 as a Storage Protocol in pNFS: the File Layout Type 3.3. NFSv4.2 as a Storage Protocol in pNFS: the File Layout Type
A file layout provided by a NFSv4.2 server may refer either to a A file layout provided by a NFSv4.2 server may refer either to a
storage device that only implements NFSv4.1 as specified in storage device that only implements NFSv4.1 as specified in
[RFC5661], or to a storage device that implements additions from [RFC5661], or to a storage device that implements additions from
NFSv4.2, in which case the rules in Section 3.3.1 apply. As the File NFSv4.2, in which case the rules in Section 3.3.1 apply. As the File
Layout Type does not provide a means for informing the client as to Layout Type does not provide a means for informing the client as to
which minor version a particular storage device is providing, the which minor version a particular storage device is providing, the
client will have to negotiate this with the storage device via the client will have to negotiate this with the storage device via the
normal Remote Procedure Call (RPC) semantics of major and minor normal Remote Procedure Call (RPC) semantics of major and minor
version discovery. E.g., as per Section 16.2.3 of [RFC5661], the version discovery. For example, as per Section 16.2.3 of [RFC5661],
client could try a COMPOUND with a minorversion of 2 and if it gets the client could try a COMPOUND with a minorversion of 2 and if it
NFS4ERR_MINOR_VERS_MISMATCH, drop back to 1. gets NFS4ERR_MINOR_VERS_MISMATCH, drop back to 1.
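
The fallback described above can be sketched as follows. This is an illustrative sketch only, not part of the protocol definition: send_compound is a hypothetical callable standing in for whatever RPC layer the client uses, and it is assumed to return the status code of a COMPOUND sent with the given minorversion.

```python
NFS4_OK = 0
NFS4ERR_MINOR_VERS_MISMATCH = 10021  # error code value as defined for NFSv4


def negotiate_minor_version(send_compound, candidates=(2, 1)):
    """Return (minorversion, status) for the first version not rejected.

    send_compound is a hypothetical callable: it takes a minorversion and
    returns the COMPOUND status code from the storage device.
    """
    for minor in candidates:
        status = send_compound(minor)
        if status == NFS4ERR_MINOR_VERS_MISMATCH:
            # The storage device does not implement this minor version;
            # drop back to the next lower candidate.
            continue
        return minor, status
    raise RuntimeError("storage device rejected every offered minor version")
```

A client would typically cache the negotiated value per storage device rather than probing on every request; that policy is an assumption of this sketch and is not mandated by the text.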
3.3.1. Operations Sent to NFSv4.2 Data Servers 3.3.1. Operations Sent to NFSv4.2 Data Servers
In addition to the commands listed in [RFC5661], NFSv4.2 data servers In addition to the commands listed in [RFC5661], NFSv4.2 data servers
MAY accept a COMPOUND containing the following additional operations: MAY accept a COMPOUND containing the following additional operations:
IO_ADVISE (see Section 15.5), READ_PLUS (see Section 15.10), IO_ADVISE (see Section 15.5), READ_PLUS (see Section 15.10),
WRITE_SAME (see Section 15.12), and SEEK (see Section 15.11), which WRITE_SAME (see Section 15.12), and SEEK (see Section 15.11), which
will be treated like the subset specified as "Operations Sent to will be treated like the subset specified as "Operations Sent to
NFSv4.1 Data Servers" in Section 13.6 of [RFC5661]. NFSv4.1 Data Servers" in Section 13.6 of [RFC5661].
Additional details on the implementation of these operations in a Additional details on the implementation of these operations in a
pNFS context are documented in the operation specific sections. pNFS context are documented in the operation specific sections.
4. Server Side Copy 4. Server Side Copy
4.1. Introduction
The server-side copy features provide mechanisms which allow an NFS The server-side copy features provide mechanisms which allow an NFS
client to copy file data on a server or between two servers without client to copy file data on a server or between two servers without
the data being transmitted back and forth over the network through the data being transmitted back and forth over the network through
the NFS client. Without these features, an NFS client would copy the NFS client. Without these features, an NFS client would copy
data from one location to another by reading the data from the source data from one location to another by reading the data from the source
server over the network, and then writing the data back over the server over the network, and then writing the data back over the
network to the destination server. network to the destination server.
If the source object and destination object are on different file If the source object and destination object are on different file
servers, the file servers will communicate with one another to servers, the file servers will communicate with one another to
perform the copy operation. The server-to-server protocol by which perform the copy operation. The server-to-server protocol by which
this is accomplished is not defined in this document. this is accomplished is not defined in this document.
4.2. Protocol Overview The copy feature allows the server to perform the copying either
synchronously or asynchronously. The client can request synchronous
copying but the server may not be able to honor this request. If the
server intends to perform asynchronous copying, it supplies the
client with a request identifier that the client can use to monitor
the progress of the copying and, if appropriate, cancel a request in
progress. The request identifier is a stateid representing the
internal state held by the server while the copying is performed.
Multiple asynchronous copies of all or part of a file may be in
progress in parallel on a server; the stateid request identifier
allows monitoring and canceling to be applied to the correct request.
4.1. Protocol Overview
The server-side copy offload operations support both intra-server and The server-side copy offload operations support both intra-server and
inter-server file copies. An intra-server copy is a copy in which inter-server file copies. An intra-server copy is a copy in which
the source file and destination file reside on the same server. In the source file and destination file reside on the same server. In
an inter-server copy, the source file and destination file are on an inter-server copy, the source file and destination file are on
different servers. In both cases, the copy may be performed different servers. In both cases, the copy may be performed
synchronously or asynchronously. synchronously or asynchronously.
In addition, the CLONE operation provides copy-like functionality in In addition, the CLONE operation provides copy-like functionality in
the intra-sever case which is both synchronous and atomic, in that the intra-server case which is both synchronous and atomic, in that
other operations may not see the target file in any state between other operations may not see the target file in any state between
that before the clone operation and after it. that before the clone operation and after it.
Throughout the rest of this document, we refer to the NFS server Throughout the rest of this document, the NFS server containing the
containing the source file as the "source server" and the NFS server source file is referred to as the "source server" and the NFS server
to which the file is transferred as the "destination server". In the to which the file is transferred as the "destination server". In the
case of an intra-server copy, the source server and destination case of an intra-server copy, the source server and destination
server are the same server. Therefore in the context of an intra- server are the same server. Therefore in the context of an intra-
server copy, the terms source server and destination server refer to server copy, the terms source server and destination server refer to
the single server performing the copy. the single server performing the copy.
The new operations are designed to copy files or regions within them. The new operations are designed to copy files or regions within them.
Other file system objects can be copied by building on these Other file system objects can be copied by building on these
operations or using other techniques. For example, if a user wishes operations or using other techniques. For example, if a user wishes
to copy a directory, the client can synthesize a directory copy to copy a directory, the client can synthesize a directory copy
operation by first creating the destination directory and the operation by first creating the destination directory and the
individual (empty) files within it, and then copying the contents of individual (empty) files within it, and then copying the contents of
the source directory's files to files in the new destination the source directory's files to files in the new destination
directory. directory.
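
The directory copy described above can be sketched as a client-side plan. The step names CREATE_DIR, CREATE_EMPTY_FILE, and COPY_CONTENTS are hypothetical labels for this illustration, not protocol operations, and the sketch assumes a flat directory with no subdirectories.

```python
import posixpath


def plan_directory_copy(src_dir, dst_dir, file_names):
    """Return an ordered list of steps that synthesizes a directory copy.

    Step names are illustrative labels only. The order follows the text:
    create the destination directory, create the empty files, then copy
    the contents of each source file.
    """
    steps = [("CREATE_DIR", dst_dir)]
    for name in file_names:
        steps.append(("CREATE_EMPTY_FILE", posixpath.join(dst_dir, name)))
    for name in file_names:
        steps.append((
            "COPY_CONTENTS",
            posixpath.join(src_dir, name),
            posixpath.join(dst_dir, name),
        ))
    return steps
```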
For the inter-server copy, the operations are defined to be For the inter-server copy, the operations are defined to be
compatible with the traditional copy authorization approach. The compatible with the traditional copy authorization approach. The
client and user are authorized at the source for reading. Then they client and user are authorized at the source for reading. Then they
are authorized at the destination for writing. are authorized at the destination for writing.
4.2.1. Copy Operations 4.1.1. Copy Operations
CLONE: Used by the client to request a synchronous atomic copy-like CLONE: Used by the client to request a synchronous atomic copy-like
operation. (Section 15.13) operation. (Section 15.13)
COPY_NOTIFY: Used by the client to request the source server to COPY_NOTIFY: Used by the client to request the source server to
authorize a future file copy that will be made by a given authorize a future file copy that will be made by a given
destination server on behalf of the given user. (Section 15.3) destination server on behalf of the given user. (Section 15.3)
COPY: Used by the client to request a file copy. (Section 15.2) COPY: Used by the client to request a file copy. (Section 15.2)
OFFLOAD_CANCEL: Used by the client to terminate an asynchronous file OFFLOAD_CANCEL: Used by the client to terminate an asynchronous file
copy. (Section 15.8) copy. (Section 15.8)
OFFLOAD_STATUS: Used by the client to poll the status of an OFFLOAD_STATUS: Used by the client to poll the status of an
asynchronous file copy. (Section 15.9) asynchronous file copy. (Section 15.9)
CB_OFFLOAD: Used by the destination server to report the results of CB_OFFLOAD: Used by the destination server to report the results of
an asynchronous file copy to the client. (Section 16.1) an asynchronous file copy to the client. (Section 16.1)
4.2.2. Requirements for Operations 4.1.2. Requirements for Operations
Three OPTIONAL features are provided relative to server-side copy. A Three OPTIONAL features are provided relative to server-side copy. A
server may choose independently to implement any of them. A server server may choose independently to implement any of them. A server
implementing any of these features may be REQUIRED to implement implementing any of these features may be REQUIRED to implement
certain operations. Other operations are OPTIONAL in the context of certain operations. Other operations are OPTIONAL in the context of
a particular feature Section 13, but may become REQUIRED depending on a particular feature (see Table 5 in Section 13), but may become
server behavior. Clients need to use these operations to REQUIRED depending on server behavior. Clients need to use these
successfully copy a file. operations to successfully copy a file.
For a client to do an intra-server file copy, it needs to use either For a client to do an intra-server file copy, it needs to use either
the COPY or the CLONE operation. If COPY is used the client MUST the COPY or the CLONE operation. If COPY is used the client MUST
support the CB_OFFLOAD operation. If COPY is used and it returns a support the CB_OFFLOAD operation. If COPY is used and it returns a
stateid, then the client MAY use the OFFLOAD_CANCEL and stateid, then the client MAY use the OFFLOAD_CANCEL and
OFFLOAD_STATUS operations. OFFLOAD_STATUS operations.
For a client to do an inter-server file copy, it needs to use For a client to do an inter-server file copy, it needs to use
the COPY and COPY_NOTIFY operations and MUST support the CB_OFFLOAD the COPY and COPY_NOTIFY operations and MUST support the CB_OFFLOAD
operation. If COPY returns a stateid, then the client MAY use the operation. If COPY returns a stateid, then the client MAY use the
skipping to change at page 12, line 8 skipping to change at page 12, line 33
the Open Network Computing (ONC) RPC credential of its containing the Open Network Computing (ONC) RPC credential of its containing
COMPOUND or CB_COMPOUND request. For example, an OFFLOAD_CANCEL COMPOUND or CB_COMPOUND request. For example, an OFFLOAD_CANCEL
operation issued by a given user indicates that a specified COPY operation issued by a given user indicates that a specified COPY
operation initiated by the same user be canceled. Therefore an operation initiated by the same user be canceled. Therefore an
OFFLOAD_CANCEL MUST NOT interfere with a copy of the same file OFFLOAD_CANCEL MUST NOT interfere with a copy of the same file
initiated by another user. initiated by another user.
An NFS server MAY allow an administrative user to monitor or cancel An NFS server MAY allow an administrative user to monitor or cancel
copy operations using an implementation specific interface. copy operations using an implementation specific interface.
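
The per-user scoping of OFFLOAD_CANCEL described above can be illustrated with a small bookkeeping sketch. The OffloadTable class and its method names are hypothetical; a real server would key its state on the copy offload stateid and the authenticated RPC principal in whatever form its implementation uses.

```python
class OffloadTable:
    """Hypothetical server-side table of in-progress asynchronous copies."""

    def __init__(self):
        self._owner = {}  # copy offload stateid -> principal that issued COPY

    def start(self, stateid, principal):
        """Record that principal started the copy identified by stateid."""
        self._owner[stateid] = principal

    def cancel(self, stateid, principal):
        """Cancel only if the same principal started the copy.

        Returns True if the copy was cancelled, False otherwise, so a
        cancel from one user never affects another user's copy.
        """
        if stateid not in self._owner or self._owner[stateid] != principal:
            return False
        del self._owner[stateid]
        return True
```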
4.3. Requirements for Inter-Server Copy 4.2. Requirements for Inter-Server Copy
The specification of inter-server copy is driven by several The specification of inter-server copy is driven by several
requirements: requirements:
o The specification MUST NOT mandate the server-to-server protocol. o The specification MUST NOT mandate the server-to-server protocol.
o The specification MUST provide guidance for using NFSv4.x as a o The specification MUST provide guidance for using NFSv4.x as a
copy protocol. For those source and destination servers willing copy protocol. For those source and destination servers willing
to use NFSv4.x, there are specific security considerations that to use NFSv4.x, there are specific security considerations that
this specification MUST address. this specification MUST address.
skipping to change at page 12, line 32 skipping to change at page 13, line 8
destination first have a "copying relationship" increases the destination first have a "copying relationship" increases the
administrative burden. However the specification MUST NOT administrative burden. However the specification MUST NOT
preclude implementations that require preconfiguration. preclude implementations that require preconfiguration.
o The specification MUST NOT mandate a trust relationship between o The specification MUST NOT mandate a trust relationship between
the source and destination server. The NFSv4 security model the source and destination server. The NFSv4 security model
requires mutual authentication between a principal on an NFS requires mutual authentication between a principal on an NFS
client and a principal on an NFS server. This model MUST continue client and a principal on an NFS server. This model MUST continue
with the introduction of COPY. with the introduction of COPY.
4.4. Implementation Considerations 4.3. Implementation Considerations
4.4.1. Locking the Files 4.3.1. Locking the Files
Both the source and destination file may need to be locked to protect Both the source and destination file may need to be locked to protect
the content during the copy operations. A client can achieve this by the content during the copy operations. A client can achieve this by
a combination of OPEN and LOCK operations. I.e., either share or a combination of OPEN and LOCK operations. I.e., either share or
byte range locks might be desired. byte range locks might be desired.
Note that when the client establishes a lock stateid on the source, Note that when the client establishes a lock stateid on the source,
the context of that stateid is for the client and not the the context of that stateid is for the client and not the
destination. As such, there might already be an outstanding stateid, destination. As such, there might already be an outstanding stateid,
issued to the destination as client of the source, with the same issued to the destination as client of the source, with the same
value as that provided for the lock stateid. The source MUST value as that provided for the lock stateid. The source MUST
interpret the lock stateid as that of the client, i.e., when the interpret the lock stateid as that of the client, i.e., when the
destination presents it in the context of an inter-server copy, it is destination presents it in the context of an inter-server copy, it is
on behalf of the client. on behalf of the client.
4.4.2. Client Caches 4.3.2. Client Caches
In a traditional copy, if the client is in the process of writing to In a traditional copy, if the client is in the process of writing to
the file before the copy (and perhaps with a write delegation), it the file before the copy (and perhaps with a write delegation), it
will be straightforward to update the destination server. With an will be straightforward to update the destination server. With an
inter-server copy, the source has no insight into the changes cached inter-server copy, the source has no insight into the changes cached
on the client. The client SHOULD write back the data to the source. on the client. The client SHOULD write back the data to the source.
If it does not do so, it is possible that the destination will If it does not do so, it is possible that the destination will
receive a corrupt copy of the file. receive a corrupt copy of the file.
4.5. Intra-Server Copy 4.4. Intra-Server Copy
To copy a file on a single server, the client uses a COPY operation. To copy a file on a single server, the client uses a COPY operation.
The server may respond to the copy operation with the final results The server may respond to the copy operation with the final results
of the copy or it may perform the copy asynchronously and deliver the of the copy or it may perform the copy asynchronously and deliver the
results using a CB_OFFLOAD operation callback. If the copy is results using a CB_OFFLOAD operation callback. If the copy is
performed asynchronously, the client may poll the status of the copy performed asynchronously, the client may poll the status of the copy
using OFFLOAD_STATUS or cancel the copy using OFFLOAD_CANCEL. using OFFLOAD_STATUS or cancel the copy using OFFLOAD_CANCEL.
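
The two completion paths described above can be sketched from the client's point of view. This is a simplified illustration: copy_reply is a hypothetical dictionary summarizing the COPY result, with "stateid" set to None when the copy completed synchronously, and poll_status is a hypothetical callable standing in for an OFFLOAD_STATUS round trip that returns (bytes_copied, complete).

```python
def wait_for_copy(copy_reply, poll_status, max_polls=100):
    """Return the number of bytes copied once the copy has finished.

    copy_reply and poll_status are hypothetical stand-ins for the COPY
    result and an OFFLOAD_STATUS request, respectively.
    """
    stateid = copy_reply.get("stateid")
    if stateid is None:
        # Synchronous case: the COPY reply already carries the final result.
        return copy_reply["bytes_copied"]
    # Asynchronous case: poll using the copy offload stateid.
    for _ in range(max_polls):
        bytes_copied, complete = poll_status(stateid)
        if complete:
            return bytes_copied
    raise TimeoutError("copy did not complete within max_polls polls")
```

A real client would normally also handle the CB_OFFLOAD callback and pause between polls; both are omitted here to keep the sketch short.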
A synchronous intra-server copy is shown in Figure 1. In this A synchronous intra-server copy is shown in Figure 1. In this
example, the NFS server chooses to perform the copy synchronously. example, the NFS server chooses to perform the copy synchronously.
skipping to change at page 14, line 48 skipping to change at page 15, line 38
|--- CLOSE --------------------------->| Client closes |--- CLOSE --------------------------->| Client closes
|<------------------------------------/| the destination file |<------------------------------------/| the destination file
| | | |
|--- CLOSE --------------------------->| Client closes |--- CLOSE --------------------------->| Client closes
|<------------------------------------/| the source file |<------------------------------------/| the source file
| | | |
| | | |
Figure 2: An asynchronous intra-server copy. Figure 2: An asynchronous intra-server copy.
4.6. Inter-Server Copy 4.5. Inter-Server Copy
A copy may also be performed between two servers. The copy protocol A copy may also be performed between two servers. The copy protocol
is designed to accommodate a variety of network topologies. As shown is designed to accommodate a variety of network topologies. As shown
in Figure 3, the client and servers may be connected by multiple in Figure 3, the client and servers may be connected by multiple
networks. In particular, the servers may be connected by a networks. In particular, the servers may be connected by a
specialized, high speed network (network 192.0.2.0/24 in the diagram) specialized, high speed network (network 192.0.2.0/24 in the diagram)
that does not include the client. The protocol allows the client to that does not include the client. The protocol allows the client to
set up the copy between the servers (over network 203.0.113.0/24 in set up the copy between the servers (over network 203.0.113.0/24 in
the diagram) and for the servers to communicate on the high speed the diagram) and for the servers to communicate on the high speed
network if they choose to do so. network if they choose to do so.
skipping to change at page 18, line 5 skipping to change at page 19, line 5
| | | | | |
|--- LOCKU --->| | Only if LOCK was done |--- LOCKU --->| | Only if LOCK was done
|<------------------/| | |<------------------/| |
| | | | | |
|--- CLOSE --->| | Release os1 |--- CLOSE --->| | Release os1
|<------------------/| | |<------------------/| |
| | | | | |
Figure 5: An asynchronous inter-server copy. Figure 5: An asynchronous inter-server copy.
4.7. Server-to-Server Copy Protocol 4.6. Server-to-Server Copy Protocol
The choice of what protocol to use in an inter-server copy is The choice of what protocol to use in an inter-server copy is
ultimately the destination server's decision. However, the ultimately the destination server's decision. However, the
destination server has to be cognizant that it is working on behalf destination server has to be cognizant that it is working on behalf
of the client. of the client.
4.7.1. Considerations on Selecting a Copy Protocol 4.6.1. Considerations on Selecting a Copy Protocol
The client can have requirements over both the size of transactions The client can have requirements over both the size of transactions
and error recovery semantics. It may want to split the copy up such and error recovery semantics. It may want to split the copy up such
that each chunk is synchronously transferred. It may want the copy that each chunk is synchronously transferred. It may want the copy
protocol to copy the bytes in consecutive order such that upon an protocol to copy the bytes in consecutive order such that upon an
error, the client can restart the copy at the last known good offset. error, the client can restart the copy at the last known good offset.
If the destination server cannot meet these requirements, the client If the destination server cannot meet these requirements, the client
may prefer the traditional copy mechanism such that it can meet those may prefer the traditional copy mechanism such that it can meet those
requirements. requirements.
4.7.2. Using NFSv4.x as the Copy Protocol 4.6.2. Using NFSv4.x as the Copy Protocol
The destination server MAY use standard NFSv4.x (where x >= 1) The destination server MAY use standard NFSv4.x (where x >= 1)
operations to read the data from the source server. If NFSv4.x is operations to read the data from the source server. If NFSv4.x is
used for the server-to-server copy protocol, the destination server used for the server-to-server copy protocol, the destination server
can use the source filehandle and ca_src_stateid provided in the COPY can use the source filehandle and ca_src_stateid provided in the COPY
request with standard NFSv4.x operations to read data from the source request with standard NFSv4.x operations to read data from the source
server. Note that the ca_src_stateid MUST be the cnr_stateid server. Note that the ca_src_stateid MUST be the cnr_stateid
returned from the source via the COPY_NOTIFY (Section 15.3). returned from the source via the COPY_NOTIFY (Section 15.3).
4.7.3. Using an Alternative Copy Protocol 4.6.3. Using an Alternative Copy Protocol
In a homogeneous environment, the source and destination servers In a homogeneous environment, the source and destination servers
might be able to perform the file copy extremely efficiently using might be able to perform the file copy extremely efficiently using
specialized protocols. For example the source and destination specialized protocols. For example the source and destination
servers might be two nodes sharing a common file system format for servers might be two nodes sharing a common file system format for
the source and destination file systems. Thus the source and the source and destination file systems. Thus the source and
destination are in an ideal position to efficiently render the image destination are in an ideal position to efficiently render the image
of the source file to the destination file by replicating the file of the source file to the destination file by replicating the file
system formats at the block level. Another possibility is that the system formats at the block level. Another possibility is that the
source and destination might be two nodes sharing a common storage source and destination might be two nodes sharing a common storage
skipping to change at page 19, line 26 skipping to change at page 20, line 26
destination server receives the source server's URL, it would use destination server receives the source server's URL, it would use
"_FH/0x12345" as the file name to pass to the FTP server listening on "_FH/0x12345" as the file name to pass to the FTP server listening on
port 9999 of s1.example.com. On port 9999 there would be a special port 9999 of s1.example.com. On port 9999 there would be a special
instance of the FTP service that understands how to convert NFS instance of the FTP service that understands how to convert NFS
filehandles to an open file descriptor (in many operating systems, filehandles to an open file descriptor (in many operating systems,
this would require a new system call, one which is the inverse of the this would require a new system call, one which is the inverse of the
makefh() function that the pre-NFSv4 MOUNT service needs). makefh() function that the pre-NFSv4 MOUNT service needs).
Authenticating and identifying the destination server to the source Authenticating and identifying the destination server to the source
server is also a challenge. Recommendations for how to accomplish server is also a challenge. Recommendations for how to accomplish
this are given in Section 4.10.1.3. this are given in Section 4.9.1.3.
4.8. netloc4 - Network Locations 4.7. netloc4 - Network Locations
The server-side copy operations specify network locations using the The server-side copy operations specify network locations using the
netloc4 data type shown below: netloc4 data type shown below:
<CODE BEGINS> <CODE BEGINS>
enum netloc_type4 { enum netloc_type4 {
NL4_NAME = 1, NL4_NAME = 1,
NL4_URL = 2, NL4_URL = 2,
NL4_NETADDR = 3 NL4_NETADDR = 3
skipping to change at page 20, line 16 skipping to change at page 21, line 16
UTF-8 string. If the netloc4 is of type NL4_NETADDR, the nl_addr UTF-8 string. If the netloc4 is of type NL4_NETADDR, the nl_addr
field MUST contain a valid netaddr4 as defined in Section 3.3.9 of field MUST contain a valid netaddr4 as defined in Section 3.3.9 of
[RFC5661]. [RFC5661].
When netloc4 values are used for an inter-server copy as shown in When netloc4 values are used for an inter-server copy as shown in
Figure 3, their values may be evaluated on the source server, Figure 3, their values may be evaluated on the source server,
destination server, and client. The network environment in which destination server, and client. The network environment in which
these systems operate should be configured so that the netloc4 values these systems operate should be configured so that the netloc4 values
are interpreted as intended on each system. are interpreted as intended on each system.
4.9. Copy Offload Stateids 4.8. Copy Offload Stateids
A server may perform a copy offload operation asynchronously. An A server may perform a copy offload operation asynchronously. An
asynchronous copy is tracked using a copy offload stateid. Copy asynchronous copy is tracked using a copy offload stateid. Copy
offload stateids are included in the COPY, OFFLOAD_CANCEL, offload stateids are included in the COPY, OFFLOAD_CANCEL,
OFFLOAD_STATUS, and CB_OFFLOAD operations. OFFLOAD_STATUS, and CB_OFFLOAD operations.
A copy offload stateid will be valid until either (A) the client or A copy offload stateid will be valid until either (A) the client or
server restarts or (B) the client returns the resource by issuing an server restarts or (B) the client returns the resource by issuing an
OFFLOAD_CANCEL operation or the client replies to a CB_OFFLOAD OFFLOAD_CANCEL operation or the client replies to a CB_OFFLOAD
operation. operation.
A copy offload stateid's seqid MUST NOT be zero. In the context of a A copy offload stateid's seqid MUST NOT be zero. In the context of a
copy offload operation, it is ambiguous to indicate the most recent copy offload operation, it is inappropriate to indicate "the most
copy offload operation using a stateid with seqid of zero. Therefore recent copy offload operation" using a stateid with seqid of zero
a copy offload stateid with seqid of zero MUST be considered invalid. (see Section 8.2.2 of [RFC5661]). It is inappropriate because the
stateid refers to internal state in the server and there may be
several asynchronous copy operations being performed in parallel on
the same file by the server. Therefore a copy offload stateid with
seqid of zero MUST be considered invalid.
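
The seqid rule above amounts to a simple validity check. The sketch below models a stateid as a (seqid, other) pair, following the stateid4 layout in [RFC5661]; the Stateid class and function name are hypothetical and chosen only for this illustration.

```python
from typing import NamedTuple


class Stateid(NamedTuple):
    """Minimal model of stateid4: a 32-bit seqid and a 12-byte 'other'."""
    seqid: int
    other: bytes


def is_valid_copy_offload_stateid(stateid):
    """Apply the rule that a copy offload stateid's seqid is never zero.

    A seqid of zero normally means "most recent", which cannot pick out
    one of several parallel asynchronous copies, so it is rejected.
    """
    return stateid.seqid != 0 and len(stateid.other) == 12
```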
4.10. Security Considerations 4.9. Security Considerations
The security considerations pertaining to NFSv4.1 [RFC5661] apply to The security considerations pertaining to NFSv4.1 [RFC5661] apply to
this section. And as such, the standard security mechanisms used by this section. And as such, the standard security mechanisms used by
the protocol can be used to secure the server-to-server operations. the protocol can be used to secure the server-to-server operations.
NFSv4 clients and servers supporting the inter-server copy operations NFSv4 clients and servers supporting the inter-server copy operations
described in this chapter are REQUIRED to implement the mechanism described in this chapter are REQUIRED to implement the mechanism
described in Section 4.10.1.1, and to support rejecting COPY_NOTIFY described in Section 4.9.1.1, and to support rejecting COPY_NOTIFY
requests that do not use RPCSEC_GSS with privacy. If the server-to- requests that do not use RPCSEC_GSS with privacy. If the server-to-
server copy protocol is ONC RPC based, the servers are also REQUIRED server copy protocol is ONC RPC based, the servers are also REQUIRED
to implement [rpcsec_gssv3] including the RPCSEC_GSSv3 copy_to_auth, to implement [I-D.ietf-nfsv4-rpcsec-gssv3] including the RPCSEC_GSSv3
copy_from_auth, and copy_confirm_auth structured privileges. This copy_to_auth, copy_from_auth, and copy_confirm_auth structured
requirement to implement is not a requirement to use; for example, a privileges. This requirement to implement is not a requirement to
server may, depending on configuration, also allow COPY_NOTIFY requests use; for example, a server may, depending on configuration, also allow
that use only AUTH_SYS. COPY_NOTIFY requests that use only AUTH_SYS.
If a server requires the use of RPCSEC_GSSv3 copy_to_auth, If a server requires the use of an RPCSEC_GSSv3 copy_to_auth,
copy_from_auth, or copy_confirm_auth and it is not used, the server copy_from_auth, or copy_confirm_auth privilege and it is not used,
will reject the request with NFS4ERR_PARTNER_NO_AUTH. the server will reject the request with NFS4ERR_PARTNER_NO_AUTH.
4.10.1. Inter-Server Copy Security 4.9.1. Inter-Server Copy Security
4.10.1.1. Inter-Server Copy via ONC RPC with RPCSEC_GSSv3 4.9.1.1. Inter-Server Copy via ONC RPC with RPCSEC_GSSv3
When the client sends a COPY_NOTIFY to the source server to expect When the client sends a COPY_NOTIFY to the source server to expect
the destination to attempt to copy data from the source server, it is the destination to attempt to copy data from the source server, it is
expected that this copy is being done on behalf of the principal expected that this copy is being done on behalf of the principal
(called the "user principal") that sent the RPC request that encloses (called the "user principal") that sent the RPC request that encloses
the COMPOUND procedure that contains the COPY_NOTIFY operation. The the COMPOUND procedure that contains the COPY_NOTIFY operation. The
user principal is identified by the RPC credentials. A mechanism user principal is identified by the RPC credentials. A mechanism
that allows the user principal to authorize the destination server to that allows the user principal to authorize the destination server to
perform the copy, that lets the source server properly authenticate perform the copy, that lets the source server properly authenticate
the destination's copy, and does not allow the destination server to the destination's copy, and does not allow the destination server to
skipping to change at page 21, line 36 skipping to change at page 22, line 38
reason. If the client's user delegated its credentials, the reason. If the client's user delegated its credentials, the
destination would authenticate as the user principal. If the destination would authenticate as the user principal. If the
destination were using the NFSv4 protocol to perform the copy, then destination were using the NFSv4 protocol to perform the copy, then
the source server would authenticate the destination server as the the source server would authenticate the destination server as the
user principal, and the file copy would securely proceed. However, user principal, and the file copy would securely proceed. However,
this approach would allow the destination server to copy other files. this approach would allow the destination server to copy other files.
The user principal would have to trust the destination server to not The user principal would have to trust the destination server to not
do so. This is counter to the requirements, and therefore is not do so. This is counter to the requirements, and therefore is not
considered. considered.
Instead, a feature of the RPCSEC_GSSv3 [rpcsec_gssv3] protocol can be Instead, a feature of the RPCSEC_GSSv3 [I-D.ietf-nfsv4-rpcsec-gssv3]
used: RPC application defined structured privilege assertion. This protocol can be used: RPC application defined structured privilege
features allow the destination server to authenticate to the source assertion. This feature allows the destination server to
server as acting on behalf of the user principal, and to authorize authenticate to the source server as acting on behalf of the user
the destination server to perform READs of the file to be copied from principal, and to authorize the destination server to perform READs
the source on behalf of the user principal. Once the copy is of the file to be copied from the source on behalf of the user
complete, the client can destroy the RPCSEC_GSSv3 handles to end the principal. Once the copy is complete, the client can destroy the
authorization of both the source and destination servers to copy. RPCSEC_GSSv3 handles to end the authorization of both the source and
destination servers to copy.
We define three RPCSEC_GSSv3 structured privilege assertions that For each structured privilege assertion defined by an RPC application
work in tandem to authorize the copy: RPCSEC_GSSv3 requires the application to define a name string and a
data structure that will be encoded and passed between client and
server as opaque data. For NFSv4 the data structures specified below
MUST be serialized using XDR.
Three RPCSEC_GSSv3 structured privilege assertions that work together
to authorize the copy are defined here. For each of the assertions
the description starts with the name string passed in the rp_name
field of the rgss3_privs structure defined in Section 2.7.1.4 of
[I-D.ietf-nfsv4-rpcsec-gssv3] and specifies the XDR encoding of the
associated structured data passed via the rp_privilege field of the
structure.
copy_from_auth: A user principal is authorizing a source principal copy_from_auth: A user principal is authorizing a source principal
("nfs@<source>") to allow a destination principal ("nfs@<source>") to allow a destination principal
("nfs@<destination>") to set up the copy_confirm_auth privilege ("nfs@<destination>") to set up the copy_confirm_auth privilege
required to copy a file from the source to the destination on required to copy a file from the source to the destination on
behalf of the user principal. This privilege is established on behalf of the user principal. This privilege is established on
the source server before the user principal sends a COPY_NOTIFY the source server before the user principal sends a COPY_NOTIFY
operation to the source server, and the resultant RPCSEC_GSSv3 operation to the source server, and the resultant RPCSEC_GSSv3
context is used to secure the COPY_NOTIFY operation. context is used to secure the COPY_NOTIFY operation.
skipping to change at page 22, line 45 skipping to change at page 24, line 17
secret4 ctap_shared_secret; secret4 ctap_shared_secret;
netloc4 ctap_source<>; netloc4 ctap_source<>;
/* the NFSv4 user name that the user principal maps to */ /* the NFSv4 user name that the user principal maps to */
utf8str_mixed ctap_username; utf8str_mixed ctap_username;
}; };
<CODE ENDS> <CODE ENDS>
ctap_shared_secret is the automatically generated secret value ctap_shared_secret is the automatically generated secret value
used to establish the copy_from_auth privilege with the source used to establish the copy_from_auth privilege with the source
principal. See Section 4.10.1.1.1. principal. See Section 4.9.1.1.1.
copy_confirm_auth: A destination principal ("nfs@<destination>") is copy_confirm_auth: A destination principal ("nfs@<destination>") is
confirming with the source principal ("nfs@<source>") that it is confirming with the source principal ("nfs@<source>") that it is
authorized to copy data from the source. This privilege is authorized to copy data from the source. This privilege is
established on the destination server before the file is copied established on the destination server before the file is copied
from the source to the destination. The resultant RPCSEC_GSSv3 from the source to the destination. The resultant RPCSEC_GSSv3
context is used to secure the READ operations from the source to context is used to secure the READ operations from the source to
the destination server. the destination server.
<CODE BEGINS> <CODE BEGINS>
struct copy_confirm_auth_priv { struct copy_confirm_auth_priv {
/* equal to GSS_GetMIC() of cfap_shared_secret */ /* equal to GSS_GetMIC() of cfap_shared_secret */
opaque ccap_shared_secret_mic<>; opaque ccap_shared_secret_mic<>;
/* the NFSv4 user name that the user principal maps to */ /* the NFSv4 user name that the user principal maps to */
utf8str_mixed ccap_username; utf8str_mixed ccap_username;
}; };
<CODE ENDS> <CODE ENDS>
4.10.1.1.1. Establishing a Security Context 4.9.1.1.1. Establishing a Security Context
When the user principal wants to COPY a file between two servers, if When the user principal wants to COPY a file between two servers, if
it has not established copy_from_auth and copy_to_auth privileges on it has not established copy_from_auth and copy_to_auth privileges on
the servers, it establishes them: the servers, it establishes them:
o As noted in [rpcsec_gssv3] the client uses an existing o As noted in [I-D.ietf-nfsv4-rpcsec-gssv3] the client uses an
RPCSEC_GSSv3 context termed the "parent" handle to establish and existing RPCSEC_GSSv3 context termed the "parent" handle to
protect RPCSEC_GSSv3 structured privilege assertion exchanges. establish and protect RPCSEC_GSSv3 structured privilege assertion
The copy_from_auth privilege will use the context established exchanges. The copy_from_auth privilege will use the context
between the user principal and the source server used to OPEN the established between the user principal and the source server used
source file as the RPCSEC_GSSv3 parent handle. The copy_to_auth to OPEN the source file as the RPCSEC_GSSv3 parent handle. The
privilege will use the context established between the user copy_to_auth privilege will use the context established between
principal and the destination server used to OPEN the destination the user principal and the destination server used to OPEN the
file as the RPCSEC_GSSv3 parent handle. destination file as the RPCSEC_GSSv3 parent handle.
o A random number is generated to use as a secret to be shared o A random number is generated to use as a secret to be shared
between the two servers. This shared secret will be placed in the between the two servers. Note that the random number SHOULD NOT
cfap_shared_secret and ctap_shared_secret fields of the be reused between establishing different security contexts. The
appropriate privilege data types, copy_from_auth_priv and resulting shared secret will be placed in the cfap_shared_secret
copy_to_auth_priv. Because of this shared_secret the and ctap_shared_secret fields of the appropriate privilege data
RPCSEC_GSS3_CREATE control messages for copy_from_auth and types, copy_from_auth_priv and copy_to_auth_priv. Because of this
copy_to_auth MUST use a Quality of Protection (QOP) of shared_secret the RPCSEC_GSS3_CREATE control messages for
rpc_gss_svc_privacy. copy_from_auth and copy_to_auth MUST use a Quality of Protection
(QOP) of rpc_gss_svc_privacy.
o An instance of copy_from_auth_priv is filled in with the shared o An instance of copy_from_auth_priv is filled in with the shared
secret, the destination server, and the NFSv4 user id of the user secret, the destination server, and the NFSv4 user id of the user
principal and is placed in rpc_gss3_create_args principal and is placed in rpc_gss3_create_args
assertions[0].privs.privilege. The string "copy_from_auth" is assertions[0].privs.privilege. The string "copy_from_auth" is
placed in assertions[0].privs.name. The source server unwraps the placed in assertions[0].privs.name. The source server unwraps the
rpc_gss_svc_privacy RPCSEC_GSS3_CREATE payload and verifies that rpc_gss_svc_privacy RPCSEC_GSS3_CREATE payload and verifies that
the NFSv4 user id being asserted matches the source server's the NFSv4 user id being asserted matches the source server's
mapping of the user principal. If it does, the privilege is mapping of the user principal. If it does, the privilege is
established on the source server as: <"copy_from_auth", user id, established on the source server as: <"copy_from_auth", user id,
skipping to change at page 24, line 23 skipping to change at page 25, line 43
placed in assertions[0].privs.name. The destination server placed in assertions[0].privs.name. The destination server
unwraps the rpc_gss_svc_privacy RPCSEC_GSS3_CREATE payload and unwraps the rpc_gss_svc_privacy RPCSEC_GSS3_CREATE payload and
verifies that the NFSv4 user id being asserted matches the verifies that the NFSv4 user id being asserted matches the
destination server's mapping of the user principal. If it does, destination server's mapping of the user principal. If it does,
the privilege is established on the destination server as: the privilege is established on the destination server as:
<"copy_to_auth", user id, source list>. The field "handle" in a <"copy_to_auth", user id, source list>. The field "handle" in a
successful reply is the RPCSEC_GSSv3 copy_to_auth "child" handle successful reply is the RPCSEC_GSSv3 copy_to_auth "child" handle
that the client will use on COPY requests to the destination that the client will use on COPY requests to the destination
server involving the source server. server involving the source server.
As noted in [rpcsec_gssv3] Section 2.3.1 "Create Request", both the As noted in [I-D.ietf-nfsv4-rpcsec-gssv3] Section 2.3.1 "Create
client and the source server should associate the RPCSEC_GSSv3 Request", both the client and the source server should associate the
"child" handle with the parent RPCSEC_GSSv3 handle used to create the RPCSEC_GSSv3 "child" handle with the parent RPCSEC_GSSv3 handle used
RPCSEC_GSSv3 child handle. to create the RPCSEC_GSSv3 child handle.
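The shared-secret step above can be sketched as follows. This is a minimal illustration, not part of the protocol definition: the dictionaries stand in for the XDR structures copy_from_auth_priv and copy_to_auth_priv, the 32-byte secret length and the helper name make_copy_privileges are assumptions made for the example, and the field name cfap_username is assumed by analogy with ctap_username.

```python
import secrets

SECRET_LEN = 32  # assumed length; the text above does not fix a size


def make_copy_privileges(source, destination, username):
    """Build stand-ins for copy_from_auth_priv and copy_to_auth_priv.

    A fresh random secret is drawn on every call, so it is not reused
    between different security contexts.
    """
    shared_secret = secrets.token_bytes(SECRET_LEN)
    copy_from_auth_priv = {
        "cfap_shared_secret": shared_secret,
        "cfap_destination": destination,
        "cfap_username": username,
    }
    copy_to_auth_priv = {
        "ctap_shared_secret": shared_secret,
        "ctap_source": [source],  # ctap_source is a variable-length list
        "ctap_username": username,
    }
    return copy_from_auth_priv, copy_to_auth_priv
```

Both structures carry the same secret, which is why the RPCSEC_GSS3_CREATE messages that transport them need rpc_gss_svc_privacy.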
4.10.1.1.2. Starting a Secure Inter-Server Copy 4.9.1.1.2. Starting a Secure Inter-Server Copy
When the client sends a COPY_NOTIFY request to the source server, it When the client sends a COPY_NOTIFY request to the source server, it
uses the privileged "copy_from_auth" RPCSEC_GSSv3 handle. uses the privileged "copy_from_auth" RPCSEC_GSSv3 handle.
cna_destination_server in COPY_NOTIFY MUST be the same as cna_destination_server in COPY_NOTIFY MUST be the same as
cfap_destination specified in copy_from_auth_priv. Otherwise, cfap_destination specified in copy_from_auth_priv. Otherwise,
COPY_NOTIFY will fail with NFS4ERR_ACCESS. The source server COPY_NOTIFY will fail with NFS4ERR_ACCESS. The source server
verifies that the privilege <"copy_from_auth", user id, destination> verifies that the privilege <"copy_from_auth", user id, destination>
exists, and annotates it with the source filehandle, if the user exists, and annotates it with the source filehandle, if the user
principal has read access to the source file, and if administrative principal has read access to the source file, and if administrative
policies give the user principal and the NFS client read access to policies give the user principal and the NFS client read access to
skipping to change at page 25, line 11 skipping to change at page 26, line 29
the source and destination filehandles. If the COPY returns a the source and destination filehandles. If the COPY returns a
wr_callback_id, then this is an asynchronous copy and the wr_callback_id, then this is an asynchronous copy and the
wr_callback_id must also be annotated to the copy_to_auth wr_callback_id must also be annotated to the copy_to_auth
privilege. If the client has failed to establish the "copy_to_auth" privilege. If the client has failed to establish the "copy_to_auth"
privilege it will reject the request with NFS4ERR_PARTNER_NO_AUTH. privilege it will reject the request with NFS4ERR_PARTNER_NO_AUTH.
If either the COPY_NOTIFY, or the COPY operations fail, the If either the COPY_NOTIFY, or the COPY operations fail, the
associated "copy_from_auth" and "copy_to_auth" RPCSEC_GSSv3 handles associated "copy_from_auth" and "copy_to_auth" RPCSEC_GSSv3 handles
MUST be destroyed. MUST be destroyed.
4.10.1.1.3. Securing ONC RPC Server-to-Server Copy Protocols 4.9.1.1.3. Securing ONC RPC Server-to-Server Copy Protocols
After a destination server has a "copy_to_auth" privilege established After a destination server has a "copy_to_auth" privilege established
on it, and it receives a COPY request, if it knows it will use an ONC on it, and it receives a COPY request, if it knows it will use an ONC
RPC protocol to copy data, it will establish a "copy_confirm_auth" RPC protocol to copy data, it will establish a "copy_confirm_auth"
privilege on the source server prior to responding to the COPY privilege on the source server prior to responding to the COPY
operation as follows: operation as follows:
o Before establishing an RPCSEC_GSSv3 context, a parent context o Before establishing an RPCSEC_GSSv3 context, a parent context
needs to exist between nfs@<destination> as the initiator needs to exist between nfs@<destination> as the initiator
principal, and nfs@<source> as the target principal. If NFS is to principal, and nfs@<source> as the target principal. If NFS is to
skipping to change at page 26, line 10 skipping to change at page 27, line 26
established on the source server as < "copy_confirm_auth", established on the source server as < "copy_confirm_auth",
shared_secret_mic, user id>. Because the shared secret has been shared_secret_mic, user id>. Because the shared secret has been
verified, the resultant copy_confirm_auth RPCSEC_GSSv3 child verified, the resultant copy_confirm_auth RPCSEC_GSSv3 child
handle is noted to be acting on behalf of the user principal. handle is noted to be acting on behalf of the user principal.
o If the source server fails to verify the copy_from_auth privilege o If the source server fails to verify the copy_from_auth privilege
the COPY_NOTIFY operation will be rejected with the COPY_NOTIFY operation will be rejected with
NFS4ERR_PARTNER_NO_AUTH. NFS4ERR_PARTNER_NO_AUTH.
o If the destination server fails to verify the copy_to_auth or o If the destination server fails to verify the copy_to_auth or
copy_confirm_auth privilege, the COPY will be rejeced with copy_confirm_auth privilege, the COPY will be rejected with
NFS4ERR_PARTNER_NO_AUTH, causing the client to destroy the NFS4ERR_PARTNER_NO_AUTH, causing the client to destroy the
associated copy_from_auth and copy_to_auth RPCSEC_GSSv3 structured associated copy_from_auth and copy_to_auth RPCSEC_GSSv3 structured
privilege assertion handles. privilege assertion handles.
o All subsequent ONC RPC READ requests sent from the destination to o All subsequent ONC RPC READ requests sent from the destination to
copy data from the source to the destination will use the copy data from the source to the destination will use the
RPCSEC_GSSv3 copy_confirm_auth child handle. RPCSEC_GSSv3 copy_confirm_auth child handle.
Note that the use of the "copy_confirm_auth" privilege accomplishes Note that the use of the "copy_confirm_auth" privilege accomplishes
the following: the following:
o If a protocol like NFS is being used, with export policies, export o If a protocol like NFS is being used, with export policies, export
policies can be overridden in case the destination server as-an- policies can be overridden in case the destination server as-an-
NFS-client is not authorized. NFS-client is not authorized.
o Manual configuration to allow a copy relationship between the o Manual configuration to allow a copy relationship between the
source and destination is not needed. source and destination is not needed.
4.10.1.1.4. Maintaining a Secure Inter-Server Copy 4.9.1.1.4. Maintaining a Secure Inter-Server Copy
If the client determines that either the copy_from_auth or the If the client determines that either the copy_from_auth or the
copy_to_auth handle becomes invalid during a copy, then the copy MUST copy_to_auth handle becomes invalid during a copy, then the copy MUST
be aborted by the client sending an OFFLOAD_CANCEL to both the source be aborted by the client sending an OFFLOAD_CANCEL to both the source
and destination servers and destroying the respective copy related and destination servers and destroying the respective copy related
context handles as described in Section 4.10.1.1.5. context handles as described in Section 4.9.1.1.5.
4.10.1.1.5. Finishing or Stopping a Secure Inter-Server Copy 4.9.1.1.5. Finishing or Stopping a Secure Inter-Server Copy
Under normal operation, the client MUST destroy the copy_from_auth Under normal operation, the client MUST destroy the copy_from_auth
and the copy_to_auth RPCSEC_GSSv3 handle once the COPY operation and the copy_to_auth RPCSEC_GSSv3 handle once the COPY operation
returns for a synchronous inter-server copy or a CB_OFFLOAD reports returns for a synchronous inter-server copy or a CB_OFFLOAD reports
the result of an asynchronous copy. the result of an asynchronous copy.
The copy_confirm_auth privilege is constructed from information held by The copy_confirm_auth privilege is constructed from information held by
the copy_to_auth privilege, and MUST be destroyed by the destination the copy_to_auth privilege, and MUST be destroyed by the destination
server (via an RPCSEC_GSS3_DESTROY call) when the copy_to_auth server (via an RPCSEC_GSS3_DESTROY call) when the copy_to_auth
RPCSEC_GSSv3 handle is destroyed. RPCSEC_GSSv3 handle is destroyed.
skipping to change at page 27, line 34 skipping to change at page 29, line 5
If the client sends an OFFLOAD_CANCEL to the destination server to If the client sends an OFFLOAD_CANCEL to the destination server to
cancel an asynchronous copy, it uses the privileged "copy_to_auth" cancel an asynchronous copy, it uses the privileged "copy_to_auth"
RPCSEC_GSSv3 handle and the oaa_stateid in OFFLOAD_CANCEL MUST be the RPCSEC_GSSv3 handle and the oaa_stateid in OFFLOAD_CANCEL MUST be the
same as the wr_callback_id specified in the "copy_to_auth" privilege same as the wr_callback_id specified in the "copy_to_auth" privilege
stored on the destination server. The destination server will then stored on the destination server. The destination server will then
delete the <"copy_to_auth", user id, source list, nonce, nonce MIC, delete the <"copy_to_auth", user id, source list, nonce, nonce MIC,
context handle, handle version> privilege and the associated context handle, handle version> privilege and the associated
"copy_confirm_auth" RPCSEC_GSSv3 handle. The client MUST destroy "copy_confirm_auth" RPCSEC_GSSv3 handle. The client MUST destroy
both the copy_to_auth and copy_from_auth RPCSEC_GSSv3 handles. both the copy_to_auth and copy_from_auth RPCSEC_GSSv3 handles.
4.10.1.2. Inter-Server Copy via ONC RPC without RPCSEC_GSS 4.9.1.2. Inter-Server Copy via ONC RPC without RPCSEC_GSS
ONC RPC security flavors other than RPCSEC_GSS MAY be used with the ONC RPC security flavors other than RPCSEC_GSS MAY be used with the
server-side copy offload operations described in this chapter. In server-side copy offload operations described in this chapter. In
particular, host-based ONC RPC security flavors such as AUTH_NONE and particular, host-based ONC RPC security flavors such as AUTH_NONE and
AUTH_SYS MAY be used. If a host-based security flavor is used, a AUTH_SYS MAY be used. If a host-based security flavor is used, a
minimal level of protection for the server-to-server copy protocol is minimal level of protection for the server-to-server copy protocol is
possible. possible.
In the absence of a strong security mechanism designed for the In the absence of a strong security mechanism designed for the
purpose, the challenge is how the source server and destination purpose, the challenge is how the source server and destination
skipping to change at page 28, line 14 skipping to change at page 29, line 33
the destination server, but cannot defend against man-in-the-middle the destination server, but cannot defend against man-in-the-middle
attacks after authentication or an eavesdropper that observes the attacks after authentication or an eavesdropper that observes the
opaque stateid on the wire. Other secure communication techniques opaque stateid on the wire. Other secure communication techniques
(e.g., IPsec) are necessary to block these attacks. (e.g., IPsec) are necessary to block these attacks.
Servers SHOULD reject COPY_NOTIFY requests that do not use RPCSEC_GSS Servers SHOULD reject COPY_NOTIFY requests that do not use RPCSEC_GSS
with privacy, thus ensuring the cnr_stateid in the COPY_NOTIFY reply with privacy, thus ensuring the cnr_stateid in the COPY_NOTIFY reply
is encrypted. For the same reason, clients SHOULD send COPY requests is encrypted. For the same reason, clients SHOULD send COPY requests
to the destination using RPCSEC_GSS with privacy. to the destination using RPCSEC_GSS with privacy.
4.10.1.3. Inter-Server Copy without ONC RPC 4.9.1.3. Inter-Server Copy without ONC RPC
The same techniques as Section 4.10.1.2, using unique URLs for each The same techniques as Section 4.9.1.2, using unique URLs for each
destination server, can be used for other protocols (e.g., HTTP destination server, can be used for other protocols (e.g., HTTP
[RFC7230] and FTP [RFC959]) as well. [RFC7230] and FTP [RFC959]) as well.
5. Support for Application I/O Hints 5. Support for Application I/O Hints
Applications can issue client I/O hints via posix_fadvise() Applications can issue client I/O hints via posix_fadvise()
[posix_fadvise] to the NFS client. While this can help the NFS [posix_fadvise] to the NFS client. While this can help the NFS
client optimize I/O and caching for a file, it does not allow the NFS client optimize I/O and caching for a file, it does not allow the NFS
server and its exported file system to do likewise. We add an server and its exported file system to do likewise. The IO_ADVISE
IO_ADVISE procedure (Section 15.5) to communicate the client file procedure (Section 15.5) is used to communicate the client file
access patterns to the NFS server. The NFS server upon receiving an access patterns to the NFS server. The NFS server upon receiving an
IO_ADVISE operation MAY choose to alter its I/O and caching behavior, IO_ADVISE operation MAY choose to alter its I/O and caching behavior,
but is under no obligation to do so. but is under no obligation to do so.
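As an illustration of the application side only, the following sketch issues a local posix_fadvise() hint from Python. It does not show the IO_ADVISE operation itself; how (or whether) an NFS client forwards such a hint to the server is implementation specific. The helper name advise_sequential is hypothetical.

```python
import os


def advise_sequential(path):
    """Hint that `path` will be read sequentially.

    Returns True if the hint was issued, False if the platform lacks
    posix_fadvise() or the call was refused. The hint is advisory.
    """
    if not hasattr(os, "posix_fadvise"):
        return False
    fd = os.open(path, os.O_RDONLY)
    try:
        # offset 0 and length 0 mean "the whole file"
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
        return True
    except OSError:
        return False
    finally:
        os.close(fd)
```

As with IO_ADVISE, the caller cannot rely on any change in behavior; the hint may be ignored.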
Application specific NFS clients such as those used by hypervisors Application specific NFS clients such as those used by hypervisors
and databases can also leverage application hints to communicate and databases can also leverage application hints to communicate
their specialized requirements. their specialized requirements.
6. Sparse Files 6. Sparse Files
skipping to change at page 29, line 46 skipping to change at page 31, line 14
can dramatically improve performance with sparse files. READ_PLUS can dramatically improve performance with sparse files. READ_PLUS
does not depend on pNFS protocol features, but can be used by pNFS to does not depend on pNFS protocol features, but can be used by pNFS to
support sparse files. support sparse files.
6.1. Terminology 6.1. Terminology
Regular file: An object of file type NF4REG or NF4NAMEDATTR. Regular file: An object of file type NF4REG or NF4NAMEDATTR.
Sparse file: A Regular file that contains one or more holes. Sparse file: A Regular file that contains one or more holes.
Hole: A byte range within a Sparse file that contains regions of all Hole: A byte range within a Sparse file that contains all zeroes. A
zeroes. A hole might or might not have space allocated or hole might or might not have space allocated or reserved to it.
reserved to it.
6.2. New Operations 6.2. New Operations
6.2.1. READ_PLUS 6.2.1. READ_PLUS
READ_PLUS is a new variant of the NFSv4.1 READ operation [RFC5661]. READ_PLUS is a new variant of the NFSv4.1 READ operation [RFC5661].
Besides being able to support all of the data semantics of the READ Besides being able to support all of the data semantics of the READ
operation, it can also be used by the client and server to operation, it can also be used by the client and server to
efficiently transfer holes. Note that as the client has no a priori efficiently transfer holes. Because the client does not know in
knowledge of whether a hole is present or not, if the client supports advance whether a hole is present or not, if the client supports
READ_PLUS and so does the server, then it should always use the READ_PLUS and so does the server, then it should always use the
READ_PLUS operation in preference to the READ operation. READ_PLUS operation in preference to the READ operation.
READ_PLUS extends the response with a new arm representing holes to READ_PLUS extends the response with a new arm representing holes to
avoid returning data for portions of the file which are initialized avoid returning data for portions of the file which are initialized
to zero and may or may not contain a backing store. Returning data to zero and may or may not contain a backing store. Returning actual
blocks of uninitialized data wastes computational and network data blocks corresponding to holes wastes computational and network
resources, thus reducing performance. resources, thus reducing performance.
When a client sends a READ operation, it is not prepared to accept a When a client sends a READ operation, it is not prepared to accept a
READ_PLUS-style response providing a compact encoding of the scope of READ_PLUS-style response providing a compact encoding of the scope of
holes. If a READ occurs on a sparse file, then the server must holes. If a READ occurs on a sparse file, then the server must
expand such data to be raw bytes. If a READ occurs in the middle of expand such data to be raw bytes. If a READ occurs in the middle of
a hole, the server can only send back bytes starting from that a hole, the server can only send back bytes starting from that
offset. By contrast, if a READ_PLUS occurs in the middle of a hole, offset. By contrast, if a READ_PLUS occurs in the middle of a hole,
the server can send back a range which starts before the offset and the server can send back a range which starts before the offset and
extends past the range. extends past the requested length.
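The difference between the two replies can be sketched with a toy model. This is a simplification made for illustration: the file is an in-memory byte string, every run of zero bytes is treated as a hole, and segments are clipped to the requested range (a real server may report a hole that extends beyond it). The function names read and read_plus are hypothetical.

```python
def read(content, offset, count):
    """READ-style reply: holes are expanded into raw zero bytes."""
    return bytes(content[offset:offset + count])


def read_plus(content, offset, count):
    """READ_PLUS-style reply: a list of ('data', start, bytes) and
    ('hole', start, length) segments covering the requested range."""
    end = min(offset + count, len(content))
    segments = []
    pos = offset
    while pos < end:
        is_hole = content[pos] == 0
        run = pos
        # extend the run while bytes stay in the same class (zero / non-zero)
        while run < end and (content[run] == 0) == is_hole:
            run += 1
        if is_hole:
            segments.append(("hole", pos, run - pos))
        else:
            segments.append(("data", pos, bytes(content[pos:run])))
        pos = run
    return segments
```

In this model a hole costs a fixed-size descriptor in the READ_PLUS reply, whereas the READ reply carries one zero byte per byte of hole.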
6.2.2. DEALLOCATE 6.2.2. DEALLOCATE
DEALLOCATE can be used to hole punch, which allows the client to The client can use the DEALLOCATE operation on a range of a file as a
avoid the transfer of a repetitive pattern of zeros across the hole punch, which allows the client to avoid the transfer of a
network. repetitive pattern of zeros across the network. This hole punch is a
result of the unreserved space returning all zeros until overwritten.
7. Space Reservation 7. Space Reservation
Applications want to be able to reserve space for a file, report the Applications want to be able to reserve space for a file, report the
amount of actual disk space a file occupies, and free-up the backing amount of actual disk space a file occupies, and free-up the backing
space of a file when it is not required. space of a file when it is not required.
One example is the posix_fallocate operation ([posix_fallocate]) One example is the posix_fallocate operation ([posix_fallocate])
which allows applications to ask for space reservations from the which allows applications to ask for space reservations from the
operating system, usually to provide a better file layout and reduce operating system, usually to provide a better file layout and reduce
skipping to change at page 31, line 43 skipping to change at page 33, line 11
a single block with a block reference count to guard against a single block with a block reference count to guard against
premature freeing. Having a way to tell the number of blocks that premature freeing. Having a way to tell the number of blocks that
would be freed if the file was deleted would be useful to would be freed if the file was deleted would be useful to
applications that wish to migrate files when a volume is low on applications that wish to migrate files when a volume is low on
space. space.
Since virtual disks represent a hard drive in a virtual machine, a Since virtual disks represent a hard drive in a virtual machine, a
virtual disk can be viewed as a file system within a file. Since not virtual disk can be viewed as a file system within a file. Since not
all blocks within a file system are in use, there is an opportunity all blocks within a file system are in use, there is an opportunity
to reclaim blocks that are no longer in use. A call to deallocate to reclaim blocks that are no longer in use. A call to deallocate
blocks could result in better space efficiency. Lesser space MAY be blocks could result in better space efficiency. Lesser space might
consumed for backups after block deallocation. be consumed for backups after block deallocation.
The following operations and attributes can be used to resolve these The following operations and attributes can be used to resolve these
issues: issues:
space_freed This attribute reports the space that would be freed space_freed This attribute reports the space that would be freed
when a file is deleted, taking block sharing into consideration. when a file is deleted, taking block sharing into consideration.
DEALLOCATE This operation deallocates the blocks backing a region of DEALLOCATE This operation deallocates the blocks backing a region of
the file. the file.
skipping to change at page 32, line 43 skipping to change at page 34, line 13
result in the deallocation of all 10 blocks. result in the deallocation of all 10 blocks.
The addition of these attributes does not solve the problem of space The addition of these attributes does not solve the problem of space
being over-reported. However, over-reporting is better than under- being over-reported. However, over-reporting is better than under-
reporting. reporting.
8. Application Data Block Support 8. Application Data Block Support
At the OS level, files are contained on disk blocks. Applications At the OS level, files are contained on disk blocks. Applications
are also free to impose structure on the data contained in a file and are also free to impose structure on the data contained in a file and
we can define an Application Data Block (ADB) to be such a structure. thus can define an Application Data Block (ADB) to be such a
From the application's viewpoint, it only wants to handle ADBs and structure. From the application's viewpoint, it only wants to handle
not raw bytes (see [Strohm11]). An ADB is typically comprised of two ADBs and not raw bytes (see [Strohm11]). An ADB is typically
sections: header and data. The header describes the characteristics comprised of two sections: header and data. The header describes the
of the block and can provide a means to detect corruption in the data characteristics of the block and can provide a means to detect
payload. The data section is typically initialized to all zeros. corruption in the data payload. The data section is typically
initialized to all zeros.
The format of the header is application specific, but there are two The format of the header is application specific, but there are two
main components typically encountered: main components typically encountered:
1. An Application Data Block Number (ADBN) which allows the 1. An Application Data Block Number (ADBN) which allows the
application to determine which data block is being referenced. application to determine which data block is being referenced.
This is useful when the client is not storing the blocks in This is useful when the client is not storing the blocks in
contiguous memory, i.e., a logical block number. contiguous memory, i.e., a logical block number.
2. Fields to describe the state of the ADB and a means to detect 2. Fields to describe the state of the ADB and a means to detect
skipping to change at page 33, line 24 skipping to change at page 34, line 43
between big and little endian architectures is detectable. For between big and little endian architectures is detectable. For
example, 0xF0DEDEF0 has the same (32 wide) bit pattern in both example, 0xF0DEDEF0 has the same (32 wide) bit pattern in both
architectures, making it inappropriate. architectures, making it inappropriate.
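
The byte-order property described above can be checked mechanically.
The following is a minimal sketch, not part of the protocol; the
helper name is_endian_detectable is hypothetical. It serializes a
32-bit magic number in both byte orders and reports whether the two
encodings differ.

```python
import struct


def is_endian_detectable(magic: int) -> bool:
    """Return True if a 32-bit value serializes differently in
    big-endian and little-endian byte order (hypothetical helper)."""
    big = struct.pack(">I", magic)     # most significant byte first
    little = struct.pack("<I", magic)  # least significant byte first
    return big != little


# 0xF0DEDEF0 is a byte-wise palindrome (F0 DE DE F0), so a byte swap
# leaves it unchanged and a byte-order mix-up cannot be detected.
print(is_endian_detectable(0xF0DEDEF0))
# 0xFEEDFACE becomes CE FA ED FE when swapped, so a mix-up is visible.
print(is_endian_detectable(0xFEEDFACE))
```

A value for which this check returns False is a poor choice for a
guard pattern or magic number that is also expected to reveal
byte-order errors.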
Applications already impose structures on files [Strohm11] and detect Applications already impose structures on files [Strohm11] and detect
corruption in data blocks [Ashdown08]. What they are not able to do corruption in data blocks [Ashdown08]. What they are not able to do
is efficiently transfer and store ADBs. To initialize a file with is efficiently transfer and store ADBs. To initialize a file with
ADBs, the client must send each full ADB to the server and that must ADBs, the client must send each full ADB to the server and that must
be stored on the server. be stored on the server.
In this section, we define a framework for transferring the ADB from This section defines a framework for transferring the ADB from client
client to server and present one approach to detecting corruption in to server and presents one approach to detecting corruption in a given
a given ADB implementation. ADB implementation.
8.1. Generic Framework 8.1. Generic Framework
We want the representation of the ADB to be flexible enough to The representation of the ADB needs to be flexible enough to support
support many different applications. The most basic approach is no many different applications. The most basic approach is no
imposition of a block at all, which means we are working with the raw imposition of a block at all, which entails working with the raw
bytes. Such an approach would be useful for storing holes, punching bytes. Such an approach would be useful for storing holes, punching
holes, etc. In more complex deployments, a server might be holes, etc. In more complex deployments, a server might be
supporting multiple applications, each with their own definition of supporting multiple applications, each with their own definition of
the ADB. One might store the ADBN at the start of the block and then the ADB. One might store the ADBN at the start of the block and then
have a guard pattern to detect corruption [McDougall07]. The next have a guard pattern to detect corruption [McDougall07]. The next
might store the ADBN at an offset of 100 bytes within the block and might store the ADBN at an offset of 100 bytes within the block and
have no guard pattern at all, i.e., existing applications might have no guard pattern at all, i.e., existing applications might
already have well defined formats for their data blocks. already have well defined formats for their data blocks.
The guard pattern can be used to represent the state of the block, to The guard pattern can be used to represent the state of the block, to
protect against corruption, or both. Again, it needs to be able to protect against corruption, or both. Again, it needs to be able to
be placed anywhere within the ADB. be placed anywhere within the ADB.
We need to be able to represent the starting offset of the block and Both the starting offset of the block and the size of the block need
the size of the block. Note that nothing prevents the application to be represented. Note that nothing prevents the application from
from defining different sized blocks in a file. defining different sized blocks in a file.
8.1.1. Data Block Representation 8.1.1. Data Block Representation
<CODE BEGINS> <CODE BEGINS>
struct app_data_block4 { struct app_data_block4 {
offset4 adb_offset; offset4 adb_offset;
length4 adb_block_size; length4 adb_block_size;
length4 adb_block_count; length4 adb_block_count;
length4 adb_reloff_blocknum; length4 adb_reloff_blocknum;
count4 adb_block_num; count4 adb_block_num;
length4 adb_reloff_pattern; length4 adb_reloff_pattern;
opaque adb_pattern<>; opaque adb_pattern<>;
}; };
<CODE ENDS> <CODE ENDS>
The app_data_block4 structure captures the abstraction presented for The app_data_block4 structure captures the abstraction presented for
the ADB. The additional fields present are to allow the transmission the ADB. The additional fields present are to allow the transmission
of adb_block_count ADBs at one time. We also use adb_block_num to of adb_block_count ADBs at one time. The adb_block_num is used to
convey the ADBN of the first block in the sequence. Each ADB will convey the ADBN of the first block in the sequence. Each ADB will
contain the same adb_pattern string. contain the same adb_pattern string.
As both adb_block_num and adb_pattern are optional, if either As both adb_block_num and adb_pattern are optional, if either
adb_reloff_pattern or adb_reloff_blocknum is set to NFS4_UINT64_MAX, adb_reloff_pattern or adb_reloff_blocknum is set to NFS4_UINT64_MAX,
then the corresponding field is not set in any of the ADBs. then the corresponding field is not set in any of the ADBs.
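
To illustrate how the fields of app_data_block4 combine, the
following sketch expands one such description into the byte contents
of each block. It is an illustration under stated assumptions, not a
definitive implementation: the function name expand_adbs is
hypothetical, the block number is assumed to be written as an 8-byte
big-endian counter that increases by one per block, and the data
section is assumed to be zero filled.

```python
NFS4_UINT64_MAX = 0xFFFFFFFFFFFFFFFF  # sentinel: field is not set


def expand_adbs(adb_offset, adb_block_size, adb_block_count,
                adb_reloff_blocknum, adb_block_num,
                adb_reloff_pattern, adb_pattern,
                blocknum_width=8):
    """Return {file_offset: block_bytes} for each described ADB.

    blocknum_width and the big-endian encoding are assumptions made
    for this sketch; the application defines the real header layout.
    """
    blocks = {}
    for i in range(adb_block_count):
        block = bytearray(adb_block_size)  # data section starts as zeros
        if adb_reloff_blocknum != NFS4_UINT64_MAX:
            end = adb_reloff_blocknum + blocknum_width
            if end > adb_block_size:
                raise ValueError("block number does not fit in the block")
            block[adb_reloff_blocknum:end] = (
                (adb_block_num + i).to_bytes(blocknum_width, "big"))
        if adb_reloff_pattern != NFS4_UINT64_MAX:
            end = adb_reloff_pattern + len(adb_pattern)
            if end > adb_block_size:
                raise ValueError("pattern does not fit in the block")
            block[adb_reloff_pattern:end] = adb_pattern
        blocks[adb_offset + i * adb_block_size] = bytes(block)
    return blocks


# 100 blocks of 4k, block number at offset 0, guard pattern at offset 8.
blocks = expand_adbs(0, 4096, 100, 0, 0, 8, bytes.fromhex("feedface"))
print(len(blocks), blocks[4 * 4096][:12].hex())
```

Passing NFS4_UINT64_MAX for either relative offset leaves the
corresponding field out of every generated block, matching the
optional nature of adb_block_num and adb_pattern.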
8.2. An Example of Detecting Corruption 8.2. An Example of Detecting Corruption
In this section, we define an ADB format in which corruption can be In this section, an example ADB format is defined in which corruption
detected. Note that this is just one possible format and means to can be detected. Note that this is just one possible format and
detect corruption. means to detect corruption.
Consider a very basic implementation of an operating system's disk Consider a very basic implementation of an operating system's disk
blocks. A block is either data or it is an indirect block which blocks. A block is either data or it is an indirect block which
allows for files to be larger than one block. It is desired to be allows for files to be larger than one block. It is desired to be
able to initialize a block. Lastly, to quickly unlink a file, a able to initialize a block. Lastly, to quickly unlink a file, a
block can be marked invalid. The contents remain intact - which block can be marked invalid. The contents remain intact - which
would enable this OS application to undelete a file. would enable this OS application to undelete a file.
The application defines 4k sized data blocks, with an 8 byte block The application defines 4k sized data blocks, with an 8 byte block
counter occurring at offset 0 in the block, and with the guard counter occurring at offset 0 in the block, and with the guard
skipping to change at page 35, line 46 skipping to change at page 37, line 13
file. file.
o If the guard pattern is INDIRECT and one of the stored indirect o If the guard pattern is INDIRECT and one of the stored indirect
block numbers is a duplicate of another stored indirect block block numbers is a duplicate of another stored indirect block
number. number.
As can be seen, the application can detect errors based on the As can be seen, the application can detect errors based on the
combination of the guard pattern state and the checksum. But also, combination of the guard pattern state and the checksum. But also,
the application can detect corruption based on the state and the the application can detect corruption based on the state and the
contents of the ADB. This last point is important in validating the contents of the ADB. This last point is important in validating the
minimum amount of data we incorporated into our generic framework. minimum amount of data incorporated into the generic framework.
I.e., the guard pattern is sufficient in allowing applications to I.e., the guard pattern is sufficient in allowing applications to
design their own corruption detection. design their own corruption detection.
Finally, it is important to note that none of these corruption checks Finally, it is important to note that none of these corruption checks
occur in the transport layer. The server and client components are occur in the transport layer. The server and client components are
totally unaware of the file format and might report everything as totally unaware of the file format and might report everything as
being transferred correctly even in the case the application detects being transferred correctly even in the case the application detects
corruption. corruption.
8.3. Example of READ_PLUS 8.3. Example of READ_PLUS
The hypothetical application presented in Section 8.2 can be used to The hypothetical application presented in Section 8.2 can be used to
illustrate how READ_PLUS would return an array of results. A file is illustrate how READ_PLUS would return an array of results. A file is
created and initialized with 100 4k ADBs in the FREE state with the created and initialized with 100 4k ADBs in the FREE state with the
WRITE_SAME operation (see Section 15.12): WRITE_SAME operation (see Section 15.12):
WRITE_SAME {0, 4k, 100, 0, 0, 8, 0xfeedface} WRITE_SAME {0, 4k, 100, 0, 0, 8, 0xfeedface}
Further, assume the application writes a single ADB at 16k, changing Further, assume the application writes a single ADB at 16k, changing
the guard pattern to 0xcafedead, we would then have in memory: the guard pattern to 0xcafedead, then there would be in memory:
0k -> (4k - 1) : 00 00 00 00 ... fe ed fa ce 00 00 ... 00 0k -> (4k - 1) : 00 00 00 00 ... fe ed fa ce 00 00 ... 00
4k -> (8k - 1) : 00 00 00 01 ... fe ed fa ce 00 00 ... 00 4k -> (8k - 1) : 00 00 00 01 ... fe ed fa ce 00 00 ... 00
8k -> (12k - 1) : 00 00 00 02 ... fe ed fa ce 00 00 ... 00 8k -> (12k - 1) : 00 00 00 02 ... fe ed fa ce 00 00 ... 00
12k -> (16k - 1) : 00 00 00 03 ... fe ed fa ce 00 00 ... 00 12k -> (16k - 1) : 00 00 00 03 ... fe ed fa ce 00 00 ... 00
16k -> (20k - 1) : 00 00 00 04 ... ca fe de ad 00 00 ... 00 16k -> (20k - 1) : 00 00 00 04 ... ca fe de ad 00 00 ... 00
20k -> (24k - 1) : 00 00 00 05 ... fe ed fa ce 00 00 ... 00 20k -> (24k - 1) : 00 00 00 05 ... fe ed fa ce 00 00 ... 00
24k -> (28k - 1) : 00 00 00 06 ... fe ed fa ce 00 00 ... 00 24k -> (28k - 1) : 00 00 00 06 ... fe ed fa ce 00 00 ... 00
... ...
396k -> (400k - 1) : 00 00 00 63 ... fe ed fa ce 00 00 ... 00 396k -> (400k - 1) : 00 00 00 63 ... fe ed fa ce 00 00 ... 00
skipping to change at page 37, line 18 skipping to change at page 38, line 37
Lists are commonly referred to as Discretionary Access Control (DAC) Lists are commonly referred to as Discretionary Access Control (DAC)
models. These systems base their access decisions on user identity models. These systems base their access decisions on user identity
and resource ownership. In contrast Mandatory Access Control (MAC) and resource ownership. In contrast Mandatory Access Control (MAC)
models base their access control decisions on the label on the models base their access control decisions on the label on the
subject (usually a process) and the object it wishes to access subject (usually a process) and the object it wishes to access
[RFC4949]. These labels may contain user identity information but [RFC4949]. These labels may contain user identity information but
usually contain additional information. In DAC systems users are usually contain additional information. In DAC systems users are
free to specify the access rules for resources that they own. MAC free to specify the access rules for resources that they own. MAC
models base their security decisions on a system wide policy models base their security decisions on a system wide policy
established by an administrator or organization which the users do established by an administrator or organization which the users do
not have the ability to override. In this section, we add a MAC not have the ability to override. In this section, a MAC model is
model to NFSv4.2. added to NFSv4.2.
First we provide a method for transporting and storing security label First, a method is provided for transporting and storing security
data on NFSv4 file objects. Security labels have several semantics label data on NFSv4 file objects. Security labels have several
that are met by NFSv4 recommended attributes such as the ability to semantics that are met by NFSv4 recommended attributes such as the
set the label value upon object creation. Access control on these ability to set the label value upon object creation. Access control
attributes is done through a combination of two mechanisms. As with on these attributes is done through a combination of two mechanisms.
other recommended attributes on file objects the usual DAC checks, As with other recommended attributes on file objects the usual DAC
Access Control Lists (ACLs) and permission bits, will be performed to checks, Access Control Lists (ACLs) and permission bits, will be
ensure that proper file ownership is enforced. In addition a MAC performed to ensure that proper file ownership is enforced. In
system MAY be employed on the client, server, or both to enforce addition a MAC system MAY be employed on the client, server, or both
additional policy on what subjects may modify security label to enforce additional policy on what subjects may modify security
information. label information.
Second, we describe a method for the client to determine if an NFSv4 Second, a method is described for the client to determine if an NFSv4
file object security label has changed. A client which needs to know file object security label has changed. A client which needs to know
if a label on a file or set of files is going to change SHOULD if a label on a file or set of files is going to change SHOULD
request a delegation on each labeled file. In order to change such a request a delegation on each labeled file. In order to change such a
security label, the server will have to recall delegations on any security label, the server will have to recall delegations on any
file affected by the label change, so informing clients of the label file affected by the label change, so informing clients of the label
change. change.
An additional useful feature would be modification to the RPC layer An additional useful feature would be modification to the RPC layer
used by NFSv4 to allow RPC calls to assert client process subject used by NFSv4 to allow RPC calls to assert client process subject
security labels and enable full mode enforcement as described in security labels and enable full mode enforcement as described in
Section 9.5.1. Such modifications are outside the scope of this Section 9.5.1. Such modifications are outside the scope of this
document (see [rpcsec_gssv3]). document (see [I-D.ietf-nfsv4-rpcsec-gssv3]).
9.1. Definitions 9.1. Definitions
Label Format Specifier (LFS): is an identifier used by the client to Label Format Specifier (LFS): is an identifier used by the client to
establish the syntactic format of the security label and the establish the syntactic format of the security label and the
semantic meaning of its components. These specifiers exist in a semantic meaning of its components. These specifiers exist in a
registry associated with documents describing the format and registry associated with documents describing the format and
semantics of the label. semantics of the label.
Label Format Registry: is the IANA registry (see [RFC7569]) Label Format Registry: is the IANA registry (see [RFC7569])
containing all registered LFSes along with references to the containing all registered LFSes along with references to the
documents that describe the syntactic format and semantics of the documents that describe the syntactic format and semantics of the
security label. security label.
Policy Identifier (PI): is an optional part of the definition of a Policy Identifier (PI): is an optional part of the definition of a
Label Format Specifier which allows for clients and server to Label Format Specifier which allows for clients and server to
identify specific security policies. identify specific security policies.
Object: is a passive resource within the system that we wish to be Object: is a passive resource within the system that is to be
protected. Objects can be entities such as files, directories, protected. Objects can be entities such as files, directories,
pipes, sockets, and many other system resources relevant to the pipes, sockets, and many other system resources relevant to the
protection of the system state. protection of the system state.
Subject: is an active entity, usually a process, which is requesting Subject: is an active entity, usually a process, which is requesting
access to an object. access to an object.
MAC-Aware: is a server which can transmit and store object labels. MAC-Aware: is a server which can transmit and store object labels.
MAC-Functional: is a client or server which is Labeled NFS enabled. MAC-Functional: is a client or server which is Labeled NFS enabled.
Such a system can interpret labels and apply policies based on the Such a system can interpret labels and apply policies based on the
security system. security system.
Multi-Level Security (MLS): is a traditional model where objects are Multi-Level Security (MLS): is a traditional model where objects are
given a sensitivity level (Unclassified, Secret, Top Secret, etc) given a sensitivity level (Unclassified, Secret, Top Secret, etc)
and a category set (see [BL73], [RFC1108], and [RFC2401]). and a category set (see [LB96], [RFC1108], [RFC2401], and
[RFC4949]).
9.2. MAC Security Attribute 9.2. MAC Security Attribute
MAC models base access decisions on security attributes bound to MAC models base access decisions on security attributes bound to
subjects (usually processes) and objects (for NFS, file objects). subjects (usually processes) and objects (for NFS, file objects).
This information can range from a user identity for an identity based This information can range from a user identity for an identity based
MAC model, sensitivity levels for Multi-level security, or a type for MAC model, sensitivity levels for Multi-level security, or a type for
Type Enforcement. These models base their decisions on different Type Enforcement. These models base their decisions on different
criteria but the semantics of the security attribute remain the same. criteria but the semantics of the security attribute remain the same.
The semantics required by the security attributes are listed below: The semantics required by the security attributes are listed below:
skipping to change at page 41, line 43 skipping to change at page 43, line 15
9.5.1. Full Mode 9.5.1. Full Mode
Full mode environments consist of MAC-Functional NFSv4 servers and Full mode environments consist of MAC-Functional NFSv4 servers and
clients and may be composed of mixed MAC models and policies. The clients and may be composed of mixed MAC models and policies. The
system requires that both the client and server have an opportunity system requires that both the client and server have an opportunity
to perform an access control check based on all relevant information to perform an access control check based on all relevant information
within the network. The file object security attribute is provided within the network. The file object security attribute is provided
using the mechanism described in Section 9.2. using the mechanism described in Section 9.2.
Fully MAC-Functional NFSv4 servers are not possible in the absence of Fully MAC-Functional NFSv4 servers are not possible in the absence of
RPCSEC_GSSv3 [rpcsec_gssv3] support for client process subject label RPCSEC_GSSv3 [I-D.ietf-nfsv4-rpcsec-gssv3] support for client process
assertion. However, servers may make decisions based on the RPC subject label assertion. However, servers may make decisions based
credential information available. on the RPC credential information available.
9.5.1.1. Initial Labeling and Translation 9.5.1.1. Initial Labeling and Translation
The ability to create a file is an action that a MAC model may wish The ability to create a file is an action that a MAC model may wish
to mediate. The client is given the responsibility to determine the to mediate. The client is given the responsibility to determine the
initial security attribute to be placed on a file. This allows the initial security attribute to be placed on a file. This allows the
client to make a decision as to the acceptable security attributes to client to make a decision as to the acceptable security attributes to
create a file with before sending the request to the server. Once create a file with before sending the request to the server. Once
the server receives the creation request from the client it may the server receives the creation request from the client it may
choose to evaluate if the security attribute is acceptable. choose to evaluate if the security attribute is acceptable.
skipping to change at page 51, line 34 skipping to change at page 52, line 46
12.1. New RECOMMENDED Attributes - List and Definition References 12.1. New RECOMMENDED Attributes - List and Definition References
The list of new RECOMMENDED attributes appears in Table 4. The The list of new RECOMMENDED attributes appears in Table 4. The
meanings of the columns of the table are: meanings of the columns of the table are:
Name: The name of the attribute. Name: The name of the attribute.
Id: The number assigned to the attribute. In the event of conflicts Id: The number assigned to the attribute. In the event of conflicts
between the assigned number and between the assigned number and
[I-D.ietf-nfsv4-minorversion2-dot-x], the latter is likely [I-D.ietf-nfsv4-minorversion2-dot-x], the latter is authoritative,
authoritative, but should be resolved with Errata to this document but in such an event, it should be resolved with Errata to this
and/or [I-D.ietf-nfsv4-minorversion2-dot-x]. See [IESG08] for the document and/or [I-D.ietf-nfsv4-minorversion2-dot-x]. See
Errata process. [IESG08] for the Errata process.
Data Type: The XDR data type of the attribute. Data Type: The XDR data type of the attribute.
Acc: Access allowed to the attribute. Acc: Access allowed to the attribute.
R means read-only (GETATTR may retrieve, SETATTR may not set). R means read-only (GETATTR may retrieve, SETATTR may not set).
W means write-only (SETATTR may set, GETATTR may not retrieve). W means write-only (SETATTR may set, GETATTR may not retrieve).
R W means read/write (GETATTR may retrieve, SETATTR may set). R W means read/write (GETATTR may retrieve, SETATTR may set).
skipping to change at page 65, line 17 skipping to change at page 66, line 24
If the ca_source_server list is specified, then this is an inter- If the ca_source_server list is specified, then this is an inter-
server copy operation and the source file is on a remote server. The server copy operation and the source file is on a remote server. The
client is expected to have previously issued a successful COPY_NOTIFY client is expected to have previously issued a successful COPY_NOTIFY
request to the remote source server. The ca_source_server list MUST request to the remote source server. The ca_source_server list MUST
be the same as the COPY_NOTIFY response's cnr_source_server list. If be the same as the COPY_NOTIFY response's cnr_source_server list. If
the client includes the entries from the COPY_NOTIFY response's the client includes the entries from the COPY_NOTIFY response's
cnr_source_server list in the ca_source_server list, the source cnr_source_server list in the ca_source_server list, the source
server can indicate a specific copy protocol for the destination server can indicate a specific copy protocol for the destination
server to use by returning a URL, which specifies both a protocol server to use by returning a URL, which specifies both a protocol
service and server name. Server-to-server copy protocol service and server name. Server-to-server copy protocol
considerations are described in Section 4.7 and Section 4.10.1. considerations are described in Section 4.6 and Section 4.9.1.
If ca_consecutive is set, then the client has specified that the copy If ca_consecutive is set, then the client has specified that the copy
protocol selected MUST copy bytes in consecutive order from protocol selected MUST copy bytes in consecutive order from
ca_src_offset to ca_count. If the destination server cannot meet ca_src_offset to ca_count. If the destination server cannot meet
this requirement, then it MUST return an error of this requirement, then it MUST return an error of
NFS4ERR_OFFLOAD_NO_REQS and set cr_consecutive to be false. NFS4ERR_OFFLOAD_NO_REQS and set cr_consecutive to be false.
Likewise, if ca_synchronous is set, then the client has required that Likewise, if ca_synchronous is set, then the client has required that
the copy protocol selected MUST perform a synchronous copy. If the the copy protocol selected MUST perform a synchronous copy. If the
destination server cannot meet this requirement, then it MUST return destination server cannot meet this requirement, then it MUST return
an error of NFS4ERR_OFFLOAD_NO_REQS and set cr_synchronous to be an error of NFS4ERR_OFFLOAD_NO_REQS and set cr_synchronous to be
skipping to change at page 68, line 35 skipping to change at page 69, line 43
in Section 8.2 of [RFC5661], a stateid is tied to the current in Section 8.2 of [RFC5661], a stateid is tied to the current
filehandle and if the same stateid is presented by two different filehandle and if the same stateid is presented by two different
clients, it may refer to different state. As the source does not clients, it may refer to different state. As the source does not
know which netloc4 network location the destination might use to know which netloc4 network location the destination might use to
establish the copy operation, it can use the cnr_stateid to identify establish the copy operation, it can use the cnr_stateid to identify
that the destination is operating on behalf of the client. Thus the that the destination is operating on behalf of the client. Thus the
source server MUST construct copy stateids such that they are source server MUST construct copy stateids such that they are
distinct from all other stateids handed out to clients. These copy distinct from all other stateids handed out to clients. These copy
stateids MUST denote the same set of locks as each of the earlier stateids MUST denote the same set of locks as each of the earlier
delegation, locking, and open states for the client on the given file delegation, locking, and open states for the client on the given file
(see Section 4.4.1). (see Section 4.3.1).
A successful response will also contain a list of netloc4 network A successful response will also contain a list of netloc4 network
location formats called cnr_source_server, on which the source is location formats called cnr_source_server, on which the source is
willing to accept connections from the destination. These might not willing to accept connections from the destination. These might not
be reachable from the client and might be located on networks to be reachable from the client and might be located on networks to
which the client has no connection. which the client has no connection.
For a copy only involving one server (the source and destination are For a copy only involving one server (the source and destination are
on the same server), this operation is unnecessary. on the same server), this operation is unnecessary.
skipping to change at page 95, line 50 skipping to change at page 96, line 50
COPY: the total number of bytes copied COPY: the total number of bytes copied
WRITE_SAME: the same information that a synchronous WRITE_SAME would WRITE_SAME: the same information that a synchronous WRITE_SAME would
provide provide
17. Security Considerations 17. Security Considerations
NFSv4.2 has all of the security concerns present in NFSv4.1 (see NFSv4.2 has all of the security concerns present in NFSv4.1 (see
Section 21 of [RFC5661]) and those present in the Server Side Copy Section 21 of [RFC5661]) and those present in the Server Side Copy
(see Section 4.10) and in Labeled NFS (see Section 9.6). (see Section 4.9) and in Labeled NFS (see Section 9.6).
18. IANA Considerations 18. IANA Considerations
The IANA Considerations for Labeled NFS are addressed in [RFC7569]. The IANA Considerations for Labeled NFS are addressed in [RFC7569].
19. References 19. References
19.1. Normative References 19.1. Normative References
[I-D.ietf-nfsv4-minorversion2-dot-x] [I-D.ietf-nfsv4-minorversion2-dot-x]
Haynes, T., "NFSv4 Minor Version 2 Protocol External Data Haynes, T., "NFSv4 Minor Version 2 Protocol External Data
Representation Standard (XDR) Description", draft-ietf- Representation Standard (XDR) Description", draft-ietf-
nfsv4-minorversion2-dot-x-40 (work in progress), January nfsv4-minorversion2-dot-x-40 (work in progress), January
2016. 2016.
[I-D.ietf-nfsv4-rpcsec-gssv3]
Adamson, A. and N. Williams, "Remote Procedure Call (RPC)
Security Version 3", draft-ietf-nfsv4-rpcsec-gssv3-17
(work in progress), January 2016.
[RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform [RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
Resource Identifier (URI): Generic Syntax", STD 66, RFC Resource Identifier (URI): Generic Syntax", STD 66, RFC
3986, January 2005. 3986, January 2005.
[RFC5661] Shepler, S., Eisler, M., and D. Noveck, "Network File [RFC5661] Shepler, S., Eisler, M., and D. Noveck, "Network File
System (NFS) Version 4 Minor Version 1 Protocol", RFC System (NFS) Version 4 Minor Version 1 Protocol", RFC
5661, January 2010. 5661, January 2010.
[RFC5662] Shepler, S., Eisler, M., and D. Noveck, "Network File [RFC5662] Shepler, S., Eisler, M., and D. Noveck, "Network File
System (NFS) Version 4 Minor Version 1 External Data System (NFS) Version 4 Minor Version 1 External Data
skipping to change at page 96, line 46 skipping to change at page 98, line 5
[posix_fadvise] [posix_fadvise]
The Open Group, "Section 'posix_fadvise()' of System The Open Group, "Section 'posix_fadvise()' of System
Interfaces of The Open Group Base Specifications Issue 6, Interfaces of The Open Group Base Specifications Issue 6,
IEEE Std 1003.1, 2004 Edition", 2004. IEEE Std 1003.1, 2004 Edition", 2004.
[posix_fallocate] [posix_fallocate]
The Open Group, "Section 'posix_fallocate()' of System The Open Group, "Section 'posix_fallocate()' of System
Interfaces of The Open Group Base Specifications Issue 6, Interfaces of The Open Group Base Specifications Issue 6,
IEEE Std 1003.1, 2004 Edition", 2004. IEEE Std 1003.1, 2004 Edition", 2004.
[rpcsec_gssv3]
Adamson, W. and N. Williams, "Remote Procedure Call (RPC)
Security Version 3", December 2014.
19.2. Informative References 19.2. Informative References
[Ashdown08] [Ashdown08]
Ashdown, L., "Chapter 15, Validating Database Files and Ashdown, L., "Chapter 15, Validating Database Files and
Backups, of Oracle Database Backup and Recovery User's Backups, of Oracle Database Backup and Recovery User's
Guide 11g Release 1 (11.1)", August 2008. Guide 11g Release 1 (11.1)", August 2008.
[BL73] Bell, D. and L. LaPadula, "Secure Computer Systems:
Mathematical Foundations and Model", Technical Report
M74-244, The MITRE Corporation, Bedford, MA, May 1973.
[Baira08] Bairavasundaram, L., Goodson, G., Schroeder, B., Arpaci- [Baira08] Bairavasundaram, L., Goodson, G., Schroeder, B., Arpaci-
Dusseau, A., and R. Arpaci-Dusseau, "An Analysis of Data Dusseau, A., and R. Arpaci-Dusseau, "An Analysis of Data
Corruption in the Storage Stack", Proceedings of the 6th Corruption in the Storage Stack", Proceedings of the 6th
USENIX Symposium on File and Storage Technologies (FAST USENIX Symposium on File and Storage Technologies (FAST
'08) , 2008. '08) , 2008.
[IESG08] ISEG, "IESG Processing of RFC Errata for the IETF Stream", [I-D.ietf-nfsv4-versioning]
Noveck, D., "NFSv4 Version Management", draft-ietf-
nfsv4-versioning-03 (work in progress), January 2016.
[IESG08] IESG, "IESG Processing of RFC Errata for the IETF Stream",
2008. 2008.
[LB96] LaPadula, L. and D. Bell, "MITRE technical report 2547,
volume II", Journal of Computer Security, Volume 4, Issue
2-3, 249-263 IOS Press, Amsterdam, The Netherlands,
January 1996.
[McDougall07] [McDougall07]
McDougall, R. and J. Mauro, "Section 11.4.3, Detecting McDougall, R. and J. Mauro, "Section 11.4.3, Detecting
Memory Corruption of Solaris Internals", 2007. Memory Corruption of Solaris Internals", 2007.
[NFSv4-Versioning]
Haynes, T. and D. Noveck, "NFSv4 Version Management",
November 2014.
[RFC1108] Kent, S., "Security Options for the Internet Protocol", [RFC1108] Kent, S., "Security Options for the Internet Protocol",
RFC 1108, November 1991. RFC 1108, November 1991.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", March 1997. Requirement Levels", March 1997.
[RFC2401] Kent, S. and R. Atkinson, "Security Architecture for the [RFC2401] Kent, S. and R. Atkinson, "Security Architecture for the
Internet Protocol", RFC 2401, November 1998. Internet Protocol", RFC 2401, November 1998.
[RFC4506] Eisler, M., "XDR: External Data Representation Standard", [RFC4506] Eisler, M., "XDR: External Data Representation Standard",
skipping to change at page 99, line 33 skipping to change at page 100, line 38
Jorge Mora provided a very detailed review and caught some important Jorge Mora provided a very detailed review and caught some important
issues with the tables. issues with the tables.
During the review process, Talia Reyes-Ortiz helped the sessions run During the review process, Talia Reyes-Ortiz helped the sessions run
smoothly. While many people contributed here and there, the core smoothly. While many people contributed here and there, the core
reviewers were Andy Adamson, Pranoop Erasani, Bruce Fields, Chuck reviewers were Andy Adamson, Pranoop Erasani, Bruce Fields, Chuck
Lever, Trond Myklebust, David Noveck, Peter Staubach, and Mike Lever, Trond Myklebust, David Noveck, Peter Staubach, and Mike
Kupfer. Kupfer.
Elwyn Davies was the General Area Reviewer for this document and her
insights as to the relationship of this document and both [RFC5661]
and [RFC7530] were very much appreciated!
Appendix B. RFC Editor Notes Appendix B. RFC Editor Notes
[RFC Editor: please remove this section prior to publishing this [RFC Editor: please remove this section prior to publishing this
document as an RFC] document as an RFC]
[RFC Editor: prior to publishing this document as an RFC, please [RFC Editor: prior to publishing this document as an RFC, please
replace all occurrences of I-D.ietf-nfsv4-minorversion2-dot-x with replace all occurrences of I-D.ietf-nfsv4-minorversion2-dot-x with
RFCxxxx where xxxx is the RFC number of the companion XDR document] RFCxxxx where xxxx is the RFC number of the companion XDR document]
Author's Address Author's Address
 End of changes. 105 change blocks. 
270 lines changed or deleted 311 lines changed or added
