draft-ietf-nfsv4-minorversion2-10.txt   draft-ietf-nfsv4-minorversion2-11.txt 
NFSv4 T. Haynes NFSv4 T. Haynes
Internet-Draft Editor Internet-Draft Editor
Intended status: Standards Track May 08, 2012 Intended status: Standards Track May 23, 2012
Expires: November 9, 2012 Expires: November 24, 2012
NFS Version 4 Minor Version 2 NFS Version 4 Minor Version 2
draft-ietf-nfsv4-minorversion2-10.txt draft-ietf-nfsv4-minorversion2-11.txt
Abstract Abstract
This Internet-Draft describes NFS version 4 minor version two, This Internet-Draft describes NFS version 4 minor version two,
focusing mainly on the protocol extensions made from NFS version 4 focusing mainly on the protocol extensions made from NFS version 4
minor version 0 and NFS version 4 minor version 1. Major extensions minor version 0 and NFS version 4 minor version 1. Major extensions
introduced in NFS version 4 minor version two include: Server-side introduced in NFS version 4 minor version two include: Server-side
Copy, Application I/O Advise, Space Reservations, Sparse Files, Copy, Application I/O Advise, Space Reservations, Sparse Files,
Application Data Blocks, and Labeled NFS. Application Data Blocks, and Labeled NFS.
skipping to change at page 1, line 41 skipping to change at page 1, line 41
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on November 9, 2012. This Internet-Draft will expire on November 24, 2012.
Copyright Notice Copyright Notice
Copyright (c) 2012 IETF Trust and the persons identified as the Copyright (c) 2012 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 4, line 5 skipping to change at page 4, line 5
6.5. Zero Filled Holes . . . . . . . . . . . . . . . . . . . . 36 6.5. Zero Filled Holes . . . . . . . . . . . . . . . . . . . . 36
7. Labeled NFS . . . . . . . . . . . . . . . . . . . . . . . . . 36 7. Labeled NFS . . . . . . . . . . . . . . . . . . . . . . . . . 36
7.1. Introduction . . . . . . . . . . . . . . . . . . . . . . 37 7.1. Introduction . . . . . . . . . . . . . . . . . . . . . . 37
7.2. Definitions . . . . . . . . . . . . . . . . . . . . . . . 38 7.2. Definitions . . . . . . . . . . . . . . . . . . . . . . . 38
7.3. MAC Security Attribute . . . . . . . . . . . . . . . . . 38 7.3. MAC Security Attribute . . . . . . . . . . . . . . . . . 38
7.3.1. Delegations . . . . . . . . . . . . . . . . . . . . . 39 7.3.1. Delegations . . . . . . . . . . . . . . . . . . . . . 39
7.3.2. Permission Checking . . . . . . . . . . . . . . . . . 39 7.3.2. Permission Checking . . . . . . . . . . . . . . . . . 39
7.3.3. Object Creation . . . . . . . . . . . . . . . . . . . 39 7.3.3. Object Creation . . . . . . . . . . . . . . . . . . . 39
7.3.4. Existing Objects . . . . . . . . . . . . . . . . . . . 40 7.3.4. Existing Objects . . . . . . . . . . . . . . . . . . . 40
7.3.5. Label Changes . . . . . . . . . . . . . . . . . . . . 40 7.3.5. Label Changes . . . . . . . . . . . . . . . . . . . . 40
7.4. pNFS Considerations . . . . . . . . . . . . . . . . . . . 40 7.4. pNFS Considerations . . . . . . . . . . . . . . . . . . . 41
7.5. Discovery of Server LNFS Support . . . . . . . . . . . . 41 7.5. Discovery of Server Labeled NFS Support . . . . . . . . . 41
7.6. MAC Security NFS Modes of Operation . . . . . . . . . . . 41 7.6. MAC Security NFS Modes of Operation . . . . . . . . . . . 41
7.6.1. Full Mode . . . . . . . . . . . . . . . . . . . . . . 42 7.6.1. Full Mode . . . . . . . . . . . . . . . . . . . . . . 42
7.6.2. Guest Mode . . . . . . . . . . . . . . . . . . . . . . 43 7.6.2. Guest Mode . . . . . . . . . . . . . . . . . . . . . . 43
7.7. Security Considerations . . . . . . . . . . . . . . . . . 43 7.7. Security Considerations . . . . . . . . . . . . . . . . . 43
8. Sharing change attribute implementation details with NFSv4 8. Sharing change attribute implementation details with NFSv4
clients . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 clients . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
8.1. Introduction . . . . . . . . . . . . . . . . . . . . . . 44 8.1. Introduction . . . . . . . . . . . . . . . . . . . . . . 44
9. Security Considerations . . . . . . . . . . . . . . . . . . . 44 9. Security Considerations . . . . . . . . . . . . . . . . . . . 44
10. Error Values . . . . . . . . . . . . . . . . . . . . . . . . . 44 10. Error Values . . . . . . . . . . . . . . . . . . . . . . . . . 44
10.1. Error Definitions . . . . . . . . . . . . . . . . . . . . 45 10.1. Error Definitions . . . . . . . . . . . . . . . . . . . . 45
skipping to change at page 6, line 39 skipping to change at page 6, line 39
o modify the specification of the NFSv4.0 or NFSv4.1 protocols. o modify the specification of the NFSv4.0 or NFSv4.1 protocols.
o clarify the NFSv4.0 or NFSv4.1 protocols. I.e., any o clarify the NFSv4.0 or NFSv4.1 protocols. I.e., any
clarifications made here apply to NFSv4.2 and neither of the prior clarifications made here apply to NFSv4.2 and neither of the prior
protocols. protocols.
The full XDR for NFSv4.2 is presented in [3]. The full XDR for NFSv4.2 is presented in [3].
1.3. NFSv4.2 Goals 1.3. NFSv4.2 Goals
The goal of the design of NFSv4.2 is to take common local filesystem The goal of the design of NFSv4.2 is to take common local file system
features and offer them remotely. These features might features and offer them remotely. These features might
o already be available on the servers, e.g., sparse files o already be available on the servers, e.g., sparse files
o be under development as a new standard, e.g., SEEK_HOLE and o be under development as a new standard, e.g., SEEK_HOLE and
SEEK_DATA SEEK_DATA
o be used by clients with the servers via some proprietary means, o be used by clients with the servers via some proprietary means,
e.g., Labeled NFS e.g., Labeled NFS
skipping to change at page 30, line 32 skipping to change at page 30, line 32
One such example is space reservation. When a hypervisor creates a One such example is space reservation. When a hypervisor creates a
virtual disk file, it often tries to preallocate the space for the virtual disk file, it often tries to preallocate the space for the
file so that there are no future allocation related errors during the file so that there are no future allocation related errors during the
operation of the virtual machine. Such errors prevent a virtual operation of the virtual machine. Such errors prevent a virtual
machine from continuing execution and result in downtime. machine from continuing execution and result in downtime.
Currently, in order to achieve such a guarantee, applications zero Currently, in order to achieve such a guarantee, applications zero
the entire file. The initial zeroing allocates the backing blocks the entire file. The initial zeroing allocates the backing blocks
and all subsequent writes are overwrites of already allocated blocks. and all subsequent writes are overwrites of already allocated blocks.
This approach is not only inefficient in terms of the amount of I/O This approach is not only inefficient in terms of the amount of I/O
done, it is also not guaranteed to work on filesystems that are log done, it is also not guaranteed to work on file systems that are log
structured or deduplicated. An efficient way of guaranteeing space structured or deduplicated. An efficient way of guaranteeing space
reservation would be beneficial to such applications. reservation would be beneficial to such applications.
If the space_reserved attribute (see Section 11.2.3) is set on a If the space_reserved attribute (see Section 11.2.3) is set on a
file, it is guaranteed that writes that do not grow the file will not file, it is guaranteed that writes that do not grow the file will not
fail with NFS4ERR_NOSPC. fail with NFS4ERR_NOSPC.
Another useful feature would be the ability to report the number of Another useful feature would be the ability to report the number of
blocks that would be freed when a file is deleted. Currently, NFS blocks that would be freed when a file is deleted. Currently, NFS
reports two size attributes: reports two size attributes:
size The logical file size of the file. size The logical file size of the file.
space_used The size in bytes that the file occupies on disk space_used The size in bytes that the file occupies on disk
While these attributes are sufficient for space accounting in While these attributes are sufficient for space accounting in
traditional filesystems, they prove to be inadequate in modern traditional file systems, they prove to be inadequate in modern file
filesystems that support block sharing. In such filesystems, systems that support block sharing. In such file systems, multiple
multiple inodes can point to a single block with a block reference inodes can point to a single block with a block reference count to
count to guard against premature freeing. Having a way to tell the guard against premature freeing. Having a way to tell the number of
number of blocks that would be freed if the file was deleted would be blocks that would be freed if the file was deleted would be useful to
useful to applications that wish to migrate files when a volume is applications that wish to migrate files when a volume is low on
low on space. space.
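The accounting gap described above can be illustrated with a small model. This is a minimal sketch and not part of the protocol: the BLOCK_SIZE constant, the block-id lists, and the functions build_refcounts, space_used, and space_freed are hypothetical names chosen for illustration, and a real file system tracks block sharing in its own way.

```python
from collections import Counter

BLOCK_SIZE = 4096  # assumed block size in bytes, for illustration only


def build_refcounts(files):
    """Count how many files reference each block id."""
    refcount = Counter()
    for blocks in files.values():
        refcount.update(set(blocks))
    return refcount


def space_used(files, name):
    """Bytes the file occupies, counting shared blocks in full."""
    return len(set(files[name])) * BLOCK_SIZE


def space_freed(files, name):
    """Bytes that would be released if the file were deleted.

    Only blocks whose reference count is 1 are actually freed; a shared
    block stays allocated for the other files that point to it.
    """
    refcount = build_refcounts(files)
    return sum(BLOCK_SIZE for b in set(files[name]) if refcount[b] == 1)


# Two files share blocks 2 and 3 (for example, after deduplication).
files = {"a.img": [1, 2, 3], "b.img": [2, 3, 4, 5]}
```

In this model a.img occupies three blocks, but deleting it releases only one, which is the kind of difference that size and space_used alone cannot express.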
Since virtual disks represent a hard drive in a virtual machine, a Since virtual disks represent a hard drive in a virtual machine, a
virtual disk can be viewed as a filesystem within a file. Since not virtual disk can be viewed as a file system within a file. Since not
all blocks within a filesystem are in use, there is an opportunity to all blocks within a file system are in use, there is an opportunity
reclaim blocks that are no longer in use. A call to deallocate to reclaim blocks that are no longer in use. A call to deallocate
blocks could result in better space efficiency. Lesser space MAY be blocks could result in better space efficiency. Lesser space MAY be
consumed for backups after block deallocation. consumed for backups after block deallocation.
The following operations and attributes can be used to resolve these The following operations and attributes can be used to resolve these
issues: issues:
space_reserved This attribute specifies whether the blocks backing space_reserved This attribute specifies whether the blocks backing
the file have been preallocated. the file have been preallocated.
space_freed This attribute specifies the space freed when a file is space_freed This attribute specifies the space freed when a file is
skipping to change at page 37, line 17 skipping to change at page 37, line 17
Lists are commonly referred to as Discretionary Access Control (DAC) Lists are commonly referred to as Discretionary Access Control (DAC)
models. These systems base their access decisions on user identity models. These systems base their access decisions on user identity
and resource ownership. In contrast Mandatory Access Control (MAC) and resource ownership. In contrast Mandatory Access Control (MAC)
models base their access control decisions on the label on the models base their access control decisions on the label on the
subject (usually a process) and the object it wishes to access [7]. subject (usually a process) and the object it wishes to access [7].
These labels may contain user identity information but usually These labels may contain user identity information but usually
contain additional information. In DAC systems users are free to contain additional information. In DAC systems users are free to
specify the access rules for resources that they own. MAC models specify the access rules for resources that they own. MAC models
base their security decisions on a system wide policy established by base their security decisions on a system wide policy established by
an administrator or organization which the users do not have the an administrator or organization which the users do not have the
ability to override. In this section, we add a MAC model to NFSv4. ability to override. In this section, we add a MAC model to NFSv4.2.
The first change necessary is to devise a method for transporting and The first change necessary is to devise a method for transporting and
storing security label data on NFSv4 file objects. Security labels storing security label data on NFSv4 file objects. Security labels
have several semantics that are met by NFSv4 recommended attributes have several semantics that are met by NFSv4 recommended attributes
such as the ability to set the label value upon object creation. such as the ability to set the label value upon object creation.
Access control on these attributes is done through a combination of Access control on these attributes is done through a combination of
two mechanisms. As with other recommended attributes on file objects two mechanisms. As with other recommended attributes on file objects
the usual DAC checks (ACLs and permission bits) will be performed to the usual DAC checks (ACLs and permission bits) will be performed to
ensure that proper file ownership is enforced. In addition a MAC ensure that proper file ownership is enforced. In addition a MAC
system MAY be employed on the client, server, or both to enforce system MAY be employed on the client, server, or both to enforce
skipping to change at page 38, line 26 skipping to change at page 38, line 26
Policy Identifier (PI): is an optional part of the definition of a Policy Identifier (PI): is an optional part of the definition of a
Label Format Specifier which allows for clients and server to Label Format Specifier which allows for clients and server to
identify specific security policies. identify specific security policies.
Object: is a passive resource within the system that we wish to be Object: is a passive resource within the system that we wish to be
protected. Objects can be entities such as files, directories, protected. Objects can be entities such as files, directories,
pipes, sockets, and many other system resources relevant to the pipes, sockets, and many other system resources relevant to the
protection of the system state. protection of the system state.
Subject: A subject is an active entity usually a process which is Subject: is an active entity usually a process which is requesting
requesting access to an object. access to an object.
MAC-Aware: is a server which can transmit and store object labels.
MAC-Functional: is a client or server which is Labeled NFS enabled.
Such a system can interpret labels and apply policies based on the
security system.
Multi-Level Security (MLS): is a traditional model where objects are Multi-Level Security (MLS): is a traditional model where objects are
given a sensitivity level (Unclassified, Secret, Top Secret, etc) given a sensitivity level (Unclassified, Secret, Top Secret, etc)
and a category set [21]. and a category set [21].
7.3. MAC Security Attribute 7.3. MAC Security Attribute
MAC models base access decisions on security attributes bound to MAC models base access decisions on security attributes bound to
subjects and objects. This information can range from a user subjects and objects. This information can range from a user
identity for an identity based MAC model, sensitivity levels for identity for an identity based MAC model, sensitivity levels for
Multi-level security, or a type for Type Enforcement. These models Multi-level security, or a type for Type Enforcement. These models
base their decisions on different criteria but the semantics of the base their decisions on different criteria but the semantics of the
security attribute remain the same. The semantics required by the security attribute remain the same. The semantics required by the
security attributes are listed below: security attributes are listed below:
o Must provide flexibility with respect to MAC model. o MUST provide flexibility with respect to the MAC model.
o Must provide the ability to atomically set security information o MUST provide the ability to atomically set security information
upon object creation. upon object creation.
o Must provide the ability to enforce access control decisions both o MUST provide the ability to enforce access control decisions both
on the client and the server. on the client and the server.
o Must not expose an object to either the client or server name o MUST NOT expose an object to either the client or server name
space before its security information has been bound to it. space before its security information has been bound to it.
NFSv4 implements the security attribute as a recommended attribute. NFSv4 implements the security attribute as a recommended attribute.
These attributes have a fixed format and semantics, which conflicts These attributes have a fixed format and semantics, which conflicts
with the flexible nature of the security attribute. To resolve this with the flexible nature of the security attribute. To resolve this
the security attribute consists of two components. The first the security attribute consists of two components. The first
component is an LFS as defined in [22] to allow for interoperability component is an LFS as defined in [22] to allow for interoperability
between MAC mechanisms. The second component is an opaque field between MAC mechanisms. The second component is an opaque field
which is the actual security attribute data. To allow for various which is the actual security attribute data. To allow for various
MAC models NFSv4 should be used solely as a transport mechanism for MAC models, NFSv4 should be used solely as a transport mechanism for
the security attribute. It is the responsibility of the endpoints to the security attribute. It is the responsibility of the endpoints to
consume the security attribute and make access decisions based on consume the security attribute and make access decisions based on
their respective models. In addition, creation of objects through their respective models. In addition, creation of objects through
OPEN and CREATE allows for the security attribute to be specified OPEN and CREATE allows for the security attribute to be specified
upon creation. By providing an atomic create and set operation for upon creation. By providing an atomic create and set operation for
the security attribute it is possible to enforce the second and the security attribute it is possible to enforce the second and
fourth requirements. The recommended attribute FATTR4_SEC_LABEL (see fourth requirements. The recommended attribute FATTR4_SEC_LABEL (see
Section 11.2.2) will be used to satisfy this requirement. Section 11.2.2) will be used to satisfy this requirement.
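The two-component attribute and the atomic create-and-set requirement can be sketched as follows. This is an illustrative model under stated assumptions, not the XDR definition: the SecLabel and Namespace classes and their field and method names are hypothetical, and the actual wire format of FATTR4_SEC_LABEL is the one given in the XDR description [3].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SecLabel:
    lfs: int      # Label Format Specifier: says how to interpret `data`
    data: bytes   # opaque security attribute, never parsed by the transport


class Namespace:
    """Toy name space in which an object is never visible without a label."""

    def __init__(self):
        self._objects = {}

    def create(self, name, label):
        if not isinstance(label, SecLabel):
            raise TypeError("a SecLabel is required at creation time")
        if name in self._objects:
            raise FileExistsError(name)
        # The name and its label are bound in one step, so there is no
        # window in which `name` is visible but unlabeled.
        self._objects[name] = label

    def get_label(self, name):
        return self._objects[name]
```

The transport side of the model only stores and returns the opaque bytes; interpreting them is left to whatever MAC model the endpoint runs.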
7.3.1. Delegations 7.3.1. Delegations
In the event that a security attribute is changed on the server while In the event that a security attribute is changed on the server while
a client holds a delegation on the file, the client should follow the a client holds a delegation on the file, both the server and the
existing protocol with respect to attribute changes. It should flush client MUST follow the NFSv4.1 protocol (see Chapter 10 of [2]) with
all changes back to the server and relinquish the delegation. respect to attribute changes. It SHOULD flush all changes back to
the server and relinquish the delegation.
7.3.2. Permission Checking 7.3.2. Permission Checking
It is not feasible to enumerate all possible MAC models and even It is not feasible to enumerate all possible MAC models and even
levels of protection within a subset of these models. This means levels of protection within a subset of these models. This means
that the NFSv4 client and servers cannot be expected to directly make that the NFSv4 client and servers cannot be expected to directly make
access control decisions based on the security attribute. Instead access control decisions based on the security attribute. Instead
NFSv4 should defer permission checking on this attribute to the host NFSv4 should defer permission checking on this attribute to the host
system. These checks are performed in addition to existing DAC and system. These checks are performed in addition to existing DAC and
ACL checks outlined in the NFSv4 protocol. Section 7.6 gives a ACL checks outlined in the NFSv4 protocol. Section 7.6 gives a
specific example of how the security attribute is handled under a specific example of how the security attribute is handled under a
particular MAC model. particular MAC model.
7.3.3. Object Creation 7.3.3. Object Creation
When creating files in NFSv4 the OPEN and CREATE operations are used. When creating files in NFSv4 the OPEN and CREATE operations are used.
One of the parameters to these operations is an fattr4 structure One of the parameters to these operations is an fattr4 structure
containing the attributes the file is to be created with. This containing the attributes the file is to be created with. This
allows NFSv4 to atomically set the security attribute of files upon allows NFSv4 to atomically set the security attribute of files upon
creation. When a client is MAC aware it must always provide the creation. When a client is MAC-Functional it must always provide the
initial security attribute upon file creation. In the event that the initial security attribute upon file creation. In the event that the
server is the only MAC aware entity in the system it should ignore server is MAC-Functional as well, it should determine by policy
the security attribute specified by the client and instead make the whether it will accept the attribute from the client or instead make
determination itself. A more in depth explanation can be found in the determination itself. If the client is not MAC-Functional, then
Section 7.6. the MAC-Functional server must decide on a default label. A more in
depth explanation can be found in Section 7.6.
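The creation-time behavior in the revised (-11) wording can be sketched as a small decision function. This is a simplified model: the function name initial_label and its parameters are hypothetical, the server's policy decision is reduced to a boolean, and "make the determination itself" is approximated by returning a single server-chosen label.

```python
def initial_label(client_label, server_accepts_client_label, server_default):
    """Pick the label a MAC-Functional server binds to a new object.

    client_label: label sent by the client, or None if the client is not
        MAC-Functional and therefore sent nothing.
    server_accepts_client_label: outcome of the server's policy decision
        (assumed to be a plain boolean in this sketch).
    server_default: label the server assigns when it decides on its own.
    """
    if client_label is None:
        # Client is not MAC-Functional: the server must pick a default.
        return server_default
    if server_accepts_client_label:
        return client_label
    # Policy says the server makes the determination itself.
    return server_default
```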
7.3.4. Existing Objects 7.3.4. Existing Objects
Note that under the MAC model, all objects must have labels. Note that under the MAC model, all objects must have labels.
Therefore, if an existing server is upgraded to include LNFS support, Therefore, if an existing server is upgraded to include Labeled NFS
then it is the responsibility of the security system to define the support, then it is the responsibility of the security system to
behavior for existing objects. For example, if the security system define the behavior for existing objects.
is LFS 0, which means the server just stores and returns labels, then
existing files should return labels which are set to an empty value.
7.3.5. Label Changes 7.3.5. Label Changes
As per the requirements, when a file's security label is modified, As per the requirements, when a file's security label is modified,
the server must notify all clients which have the file opened of the the server must notify all clients which have the file opened of the
change in label. It does so with CB_ATTR_CHANGED. There are change in label. It does so with CB_ATTR_CHANGED. There are
preconditions to making an attribute change imposed by NFSv4 and the preconditions to making an attribute change imposed by NFSv4 and the
security system might want to impose others. In the process of security system might want to impose others. In the process of
meeting these preconditions, the server may choose to either serve the meeting these preconditions, the server may choose to either serve the
request in whole or return NFS4ERR_DELAY to the SETATTR operation. request in whole or return NFS4ERR_DELAY to the SETATTR operation.
skipping to change at page 40, line 51 skipping to change at page 41, line 9
regardless of the subject label. regardless of the subject label.
The way in which MAC labels are enforced is by the client. So if The way in which MAC labels are enforced is by the client. So if
client A changes a security label on a file, then the server MUST client A changes a security label on a file, then the server MUST
inform all clients that have the file opened that the label has inform all clients that have the file opened that the label has
changed via CB_ATTR_CHANGED. Then the clients MUST retrieve the new changed via CB_ATTR_CHANGED. Then the clients MUST retrieve the new
label and MUST enforce access via the new attribute values. label and MUST enforce access via the new attribute values.
7.4. pNFS Considerations 7.4. pNFS Considerations
This section examines the issues in deploying LNFS in a pNFS This section examines the issues in deploying Labeled NFS in a pNFS
community of servers. community of servers.
7.4.1. MAC Label Checks 7.4.1. MAC Label Checks
The new FATTR4_SEC_LABEL attribute is metadata information and as The new FATTR4_SEC_LABEL attribute is metadata information and as
such the DS is not aware of the value contained on the MDS. such the DS is not aware of the value contained on the MDS.
Fortunately, the NFSv4.1 protocol [2] already has provisions for Fortunately, the NFSv4.1 protocol [2] already has provisions for
doing access level checks from the DS to the MDS. In order for the doing access level checks from the DS to the MDS. In order for the
DS to validate the subject label presented by the client, it SHOULD DS to validate the subject label presented by the client, it SHOULD
utilize this mechanism. utilize this mechanism.
If a file's FATTR4_SEC_LABEL is changed, then the MDS should utilize If a file's FATTR4_SEC_LABEL is changed, then the MDS should utilize
CB_ATTR_CHANGED to inform the client of that fact. If the MDS is CB_ATTR_CHANGED to inform the client of that fact. If the MDS is
maintaining maintaining [[Comment.2: Houston, we seem to have a problem! --TH]]
7.5. Discovery of Server LNFS Support 7.5. Discovery of Server Labeled NFS Support
The server can easily determine that a client supports LNFS when it The server can easily determine that a client supports Labeled NFS
queries for the FATTR4_SEC_LABEL label for an object. Note that it when it queries for the FATTR4_SEC_LABEL label for an object. Note
cannot assume that the presence of RPCSEC_GSSv3 indicates LNFS that it cannot assume that the presence of RPCSEC_GSSv3 indicates
support. The client might need to discover which LFS the server Labeled NFS support. The client might need to discover which LFS the
supports. server supports.
A server which supports LNFS MUST allow a client with any subject A server which supports Labeled NFS MUST allow a client with any
label to retrieve the FATTR4_SEC_LABEL attribute for the root subject label to retrieve the FATTR4_SEC_LABEL attribute for the root
filehandle, ROOTFH. The following compound must always succeed as filehandle, ROOTFH. The following compound must always succeed as
far as a MAC label check is concerned: far as a MAC label check is concerned:
PUTROOTFH, GETATTR {FATTR4_SEC_LABEL} PUTROOTFH, GETATTR {FATTR4_SEC_LABEL}
Note that the server might have imposed a security flavor on the root Note that the server might have imposed a security flavor on the root
that precludes such access. I.e., if the server requires kerberized that precludes such access. I.e., if the server requires kerberized
access and the client presents a compound with AUTH_SYS, then the access and the client presents a compound with AUTH_SYS, then the
server is allowed to return NFS4ERR_WRONGSEC in this case. But if server is allowed to return NFS4ERR_WRONGSEC in this case. But if
the client presents a correct security flavor, then the server MUST the client presents a correct security flavor, then the server MUST
skipping to change at page 42, line 7 skipping to change at page 42, line 12
A system using Labeled NFS may operate in two modes. The first mode A system using Labeled NFS may operate in two modes. The first mode
provides the most protection and is called "full mode". In this mode provides the most protection and is called "full mode". In this mode
both the client and server implement a MAC model allowing each end to both the client and server implement a MAC model allowing each end to
make an access control decision. The remaining mode is called the make an access control decision. The remaining mode is called the
"guest mode" and in this mode one end of the connection is not "guest mode" and in this mode one end of the connection is not
implementing a MAC model and thus offers less protection than full implementing a MAC model and thus offers less protection than full
mode. mode.
7.6.1. Full Mode 7.6.1. Full Mode
Full mode environments consist of MAC aware NFSv4 servers and clients Full mode environments consist of MAC-Functional NFSv4 servers and
and may be composed of mixed MAC models and policies. The system clients and may be composed of mixed MAC models and policies. The
requires that both the client and server have an opportunity to system requires that both the client and server have an opportunity
perform an access control check based on all relevant information to perform an access control check based on all relevant information
within the network. The file object security attribute is provided within the network. The file object security attribute is provided
using the mechanism described in Section 7.3. The security attribute using the mechanism described in Section 7.3. The security attribute
of the subject making the request is transported at the RPC layer of the subject making the request is transported at the RPC layer
using the mechanism described in RPCSEC_GSSv3 [5]. using the mechanism described in RPCSEC_GSSv3 [5].
7.6.1.1. Initial Labeling and Translation 7.6.1.1. Initial Labeling and Translation
The ability to create a file is an action that a MAC model may wish The ability to create a file is an action that a MAC model may wish
to mediate. The client is given the responsibility to determine the to mediate. The client is given the responsibility to determine the
initial security attribute to be placed on a file. This allows the initial security attribute to be placed on a file. This allows the
skipping to change at page 42, line 35 skipping to change at page 42, line 40
Security attributes on the client and server may vary based on MAC Security attributes on the client and server may vary based on MAC
model and policy. To handle this the security attribute field has an model and policy. To handle this the security attribute field has an
LFS component. This component is a mechanism for the host to LFS component. This component is a mechanism for the host to
identify the format and meaning of the opaque portion of the security identify the format and meaning of the opaque portion of the security
attribute. A full mode environment may contain hosts operating in attribute. A full mode environment may contain hosts operating in
several different LFSs. In this case a mechanism for translating the several different LFSs. In this case a mechanism for translating the
opaque portion of the security attribute is needed. The actual opaque portion of the security attribute is needed. The actual
translation function will vary based on MAC model and policy and is translation function will vary based on MAC model and policy and is
out of the scope of this document. If a translation is unavailable out of the scope of this document. If a translation is unavailable
for a given LFS then the request SHOULD be denied. Another recourse for a given LFS then the request MUST be denied. Another recourse is
is to allow the host to provide a fallback mapping for unknown to allow the host to provide a fallback mapping for unknown security
security attributes. attributes.
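
The translation rule above can be sketched as a table lookup. This is only an illustration under stated assumptions: the draft leaves the translation function out of scope, so the table layout, the `translate_label` helper, the `TranslationUnavailable` exception, and the `fallback` parameter are all hypothetical names introduced here.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical: a translator converts the opaque label bytes of one LFS
# into the opaque label bytes of another. Real translators depend on the
# MAC model and policy and are not defined by the draft.
Translator = Callable[[bytes], bytes]


class TranslationUnavailable(Exception):
    """No translation exists for this LFS pair; the request is denied."""


def translate_label(
    src_lfs: int,
    dst_lfs: int,
    opaque: bytes,
    table: Dict[Tuple[int, int], Translator],
    fallback: Optional[bytes] = None,
) -> bytes:
    """Translate an opaque label from src_lfs to dst_lfs."""
    if src_lfs == dst_lfs:
        return opaque  # same format, nothing to translate
    translator = table.get((src_lfs, dst_lfs))
    if translator is not None:
        return translator(opaque)
    if fallback is not None:
        # The host has configured a fallback mapping for unknown labels.
        return fallback
    raise TranslationUnavailable(f"no translation from LFS {src_lfs} to {dst_lfs}")
```

A caller would catch `TranslationUnavailable` and deny the request, unless the host has opted into a fallback mapping.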
7.6.1.2. Policy Enforcement 7.6.1.2. Policy Enforcement
In full mode access control decisions are made by both the clients In full mode access control decisions are made by both the clients
and servers. When a client makes a request it takes the security and servers. When a client makes a request it takes the security
attribute from the requesting process and makes an access control attribute from the requesting process and makes an access control
decision based on that attribute and the security attribute of the decision based on that attribute and the security attribute of the
object it is trying to access. If the client denies that access an object it is trying to access. If the client denies that access an
RPC call to the server is never made. If however the access is RPC call to the server is never made. If however the access is
allowed the client will make a call to the NFS server. allowed the client will make a call to the NFS server.
skipping to change at page 43, line 13 skipping to change at page 43, line 19
trying to access to make an access control decision. If the server's trying to access to make an access control decision. If the server's
policy allows this access it will fulfill the client's request, policy allows this access it will fulfill the client's request,
otherwise it will return NFS4ERR_ACCESS. otherwise it will return NFS4ERR_ACCESS.
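
The two-stage decision in full mode can be sketched as follows. The policy callbacks and function names are hypothetical, and the RPC is replaced by a direct function call. Only the NFS4ERR_ACCESS result and the rule that a client-side denial sends no RPC come from the text.

```python
from typing import Callable, Optional

NFS4_OK = 0
NFS4ERR_ACCESS = 13  # numeric value as assigned in the NFSv4 error list

# Hypothetical policy hook: (subject label, object label) -> allowed?
Policy = Callable[[bytes, bytes], bool]


def server_check(subject: bytes, obj: bytes, server_policy: Policy) -> int:
    """Server side: apply the server's own policy to the request."""
    return NFS4_OK if server_policy(subject, obj) else NFS4ERR_ACCESS


def client_access(
    subject: bytes, obj: bytes, client_policy: Policy, server_policy: Policy
) -> Optional[int]:
    """Client side: decide locally first. None means no RPC was sent."""
    if not client_policy(subject, obj):
        return None
    # Stand-in for the RPC; a real client would send the request on the wire.
    return server_check(subject, obj, server_policy)
```

Either side can deny on its own, so access is granted only when both policies allow it.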
Implementations MAY validate security attributes supplied over the Implementations MAY validate security attributes supplied over the
network to ensure that they are within a set of attributes permitted network to ensure that they are within a set of attributes permitted
from a specific peer, and if not, reject them. Note that a system from a specific peer, and if not, reject them. Note that a system
may permit a different set of attributes to be accepted from each may permit a different set of attributes to be accepted from each
peer. peer.
7.6.1.3. Label Aware Only Server 7.6.1.3. Limited Server
If the LFS is 0, then it indicates a server which is label aware, but A Limited Server mode (see Section 3.5.2 of [7]) consists of a server
does not enforce policies. Such a server will store and retrieve all which is label aware, but does not enforce policies. Such a server
object labels presented by clients, notify the clients of any label will store and retrieve all object labels presented by clients,
changes via CB_ATTR_CHANGED, but will not restrict access via the notify the clients of any label changes via CB_ATTR_CHANGED, but will
subject label. Instead, it will expect the clients to enforce all not restrict access via the subject label. Instead, it will expect
such access locally. the clients to enforce all such access locally.
7.6.2. Guest Mode 7.6.2. Guest Mode
Guest mode implies that either the client or the server does not Guest mode implies that either the client or the server does not
handle labels. If the client is not LNFS aware, then it will not handle labels. If the client is not Labeled NFS aware, then it will
offer subject labels to the server. The server is the only entity not offer subject labels to the server. The server is the only
enforcing policy, and may selectively provide standard NFS services entity enforcing policy, and may selectively provide standard NFS
to clients based on their authentication credentials and/or services to clients based on their authentication credentials and/or
associated network attributes (e.g., IP address, network interface). associated network attributes (e.g., IP address, network interface).
The level of trust and access extended to a client in this mode is The level of trust and access extended to a client in this mode is
configuration-specific. If the server is not LNFS aware, then it configuration-specific. If the server is not Labeled NFS aware, then
will not return object labels to the client. Clients in this it will not return object labels to the client. Clients in this
environment may consist of groups implementing different MAC environment may consist of groups implementing different MAC
model policies. The system requires that all clients in the model policies. The system requires that all clients in the
environment be responsible for access control checks. environment be responsible for access control checks.
7.7. Security Considerations 7.7. Security Considerations
This entire document deals with security issues. This entire chapter deals with security issues.
Depending on the level of protection the MAC system offers there may Depending on the level of protection the MAC system offers there may
be a requirement to tightly bind the security attribute to the data. be a requirement to tightly bind the security attribute to the data.
When only one of the client or server enforces labels, it is When only one of the client or server enforces labels, it is
important to realize that the other side is not enforcing MAC important to realize that the other side is not enforcing MAC
protections. Alternate methods might be in use to handle the lack of protections. Alternate methods might be in use to handle the lack of
MAC support and care should be taken to identify and mitigate threats MAC support and care should be taken to identify and mitigate threats
from possible tampering outside of these methods. from possible tampering outside of these methods.
An example of this is that a server that modifies READDIR or LOOKUP An example of this is that a server that modifies READDIR or LOOKUP
results based on the client's subject label might want to always results based on the client's subject label might want to always
construct the same subject label for a client which does not present construct the same subject label for a client which does not present
one. This will prevent a non-LNFS client from mixing entries in the one. This will prevent a non-Labeled NFS client from mixing entries
directory cache. in the directory cache.
8. Sharing change attribute implementation details with NFSv4 clients 8. Sharing change attribute implementation details with NFSv4 clients
8.1. Introduction 8.1. Introduction
Although both the NFSv4 [10] and NFSv4.1 [2] protocols define the Although both the NFSv4 [10] and NFSv4.1 [2] protocols define the
change attribute as being mandatory to implement, there is little in change attribute as being mandatory to implement, there is little in
the way of guidance. The only feature that is mandated by them is the way of guidance. The only mandated feature is that the value
that the value must change whenever the file data or metadata change. must change whenever the file data or metadata change.
While this allows for a wide range of implementations, it also leaves While this allows for a wide range of implementations, it also leaves
the client with a conundrum: how does it determine which is the most the client with a conundrum: how does it determine which is the most
recent value for the change attribute in a case where several RPC recent value for the change attribute in a case where several RPC
calls have been issued in parallel? In other words if two COMPOUNDs, calls have been issued in parallel? In other words if two COMPOUNDs,
both containing WRITE and GETATTR requests for the same file, have both containing WRITE and GETATTR requests for the same file, have
been issued in parallel, how does the client determine which of the been issued in parallel, how does the client determine which of the
two change attribute values returned in the replies to the GETATTR two change attribute values returned in the replies to the GETATTR
requests correspond to the most recent state of the file? In some requests correspond to the most recent state of the file? In some
cases, the only recourse may be to send another COMPOUND containing a cases, the only recourse may be to send another COMPOUND containing a
third GETATTR that is fully serialised with the first two. third GETATTR that is fully serialised with the first two.
NFSv4.2 avoids this kind of inefficiency by allowing the server to NFSv4.2 avoids this kind of inefficiency by allowing the server to
share details about how the change attribute is expected to evolve, share details about how the change attribute is expected to evolve,
so that the client may immediately determine which, out of the so that the client may immediately determine which, out of the
several change attribute values returned by the server, is the most several change attribute values returned by the server, is the most
recent. change_attr_type is defined as a new recommended attribute recent. change_attr_type is defined as a new recommended attribute
(see Section 11.2.1), and is per filesystem. (see Section 11.2.1), and is per file system.
9. Security Considerations 9. Security Considerations
10. Error Values 10. Error Values
NFS error numbers are assigned to failed operations within a Compound NFS error numbers are assigned to failed operations within a Compound
(COMPOUND or CB_COMPOUND) request. A Compound request contains a (COMPOUND or CB_COMPOUND) request. A Compound request contains a
number of NFS operations that have their results encoded in sequence number of NFS operations that have their results encoded in sequence
in a Compound reply. The results of successful operations will in a Compound reply. The results of successful operations will
consist of an NFS4_OK status followed by the encoded results of the consist of an NFS4_OK status followed by the encoded results of the
skipping to change at page 46, line 20 skipping to change at page 46, line 23
receiving a server-to-server copy offload request after the copy receiving a server-to-server copy offload request after the copy
lease time expired, or for some other permission problem. lease time expired, or for some other permission problem.
10.1.2.4. NFS4ERR_PARTNER_NOTSUPP (Error Code 10088) 10.1.2.4. NFS4ERR_PARTNER_NOTSUPP (Error Code 10088)
The remote server does not support the server-to-server copy offload The remote server does not support the server-to-server copy offload
protocol. protocol.
10.1.3. Labeled NFS Errors 10.1.3. Labeled NFS Errors
These errors are used in LNFS. These errors are used in Labeled NFS.
10.1.3.1. NFS4ERR_BADLABEL (Error Code 10093) 10.1.3.1. NFS4ERR_BADLABEL (Error Code 10093)
The label specified is invalid in some manner. The label specified is invalid in some manner.
10.1.3.2. NFS4ERR_WRONG_LFS (Error Code 10092) 10.1.3.2. NFS4ERR_WRONG_LFS (Error Code 10092)
The LFS specified in the subject label is not compatible with the LFS The LFS specified in the subject label is not compatible with the LFS
in object label. in the object label.
11. New File Attributes 11. New File Attributes
11.1. New RECOMMENDED Attributes - List and Definition References 11.1. New RECOMMENDED Attributes - List and Definition References
The list of new RECOMMENDED attributes appears in Table 2. The The list of new RECOMMENDED attributes appears in Table 2. The
meanings of the columns of the table are: meanings of the columns of the table are:
Name: The name of the attribute. Name: The name of the attribute.
skipping to change at page 47, line 37 skipping to change at page 47, line 39
11.2.1. Attribute 79: change_attr_type 11.2.1. Attribute 79: change_attr_type
enum change_attr_type4 { enum change_attr_type4 {
NFS4_CHANGE_TYPE_IS_MONOTONIC_INCR = 0, NFS4_CHANGE_TYPE_IS_MONOTONIC_INCR = 0,
NFS4_CHANGE_TYPE_IS_VERSION_COUNTER = 1, NFS4_CHANGE_TYPE_IS_VERSION_COUNTER = 1,
NFS4_CHANGE_TYPE_IS_VERSION_COUNTER_NOPNFS = 2, NFS4_CHANGE_TYPE_IS_VERSION_COUNTER_NOPNFS = 2,
NFS4_CHANGE_TYPE_IS_TIME_METADATA = 3, NFS4_CHANGE_TYPE_IS_TIME_METADATA = 3,
NFS4_CHANGE_TYPE_IS_UNDEFINED = 4 NFS4_CHANGE_TYPE_IS_UNDEFINED = 4
}; };
change_attr_type is a per filesystem attribute which enables the change_attr_type is a per file system attribute which enables the
NFSv4.2 server to provide additional information about how it expects NFSv4.2 server to provide additional information about how it expects
the change attribute value to evolve after the file data or metadata the change attribute value to evolve after the file data, or metadata
has changed. has changed. While Section 5.4 of [2] discusses per file system
attributes, it is expected that the value of change_attr_type not
depend on the value of "homogeneous" and only changes in the event of
a migration.
NFS4_CHANGE_TYPE_IS_UNDEFINED: The change attribute does not take
values that fit into any of these categories.
NFS4_CHANGE_TYPE_IS_MONOTONIC_INCR: The change attribute value MUST NFS4_CHANGE_TYPE_IS_MONOTONIC_INCR: The change attribute value MUST
monotonically increase for every atomic change to the file monotonically increase for every atomic change to the file
attributes, data or directory contents. attributes, data, or directory contents.
NFS4_CHANGE_TYPE_IS_VERSION_COUNTER: The change attribute value MUST NFS4_CHANGE_TYPE_IS_VERSION_COUNTER: The change attribute value MUST
be incremented by one unit for every atomic change to the file be incremented by one unit for every atomic change to the file
attributes, data or directory contents. This property is attributes, data, or directory contents. This property is
preserved when writing to pNFS data servers. preserved when writing to pNFS data servers.
NFS4_CHANGE_TYPE_IS_VERSION_COUNTER_NOPNFS: The change attribute NFS4_CHANGE_TYPE_IS_VERSION_COUNTER_NOPNFS: The change attribute
value MUST be incremented by one unit for every atomic change to value MUST be incremented by one unit for every atomic change to
the file attributes, data or directory contents. In the case the file attributes, data, or directory contents. In the case
where the client is writing to pNFS data servers, the number of where the client is writing to pNFS data servers, the number of
increments is not guaranteed to exactly match the number of increments is not guaranteed to exactly match the number of
writes. writes.
NFS4_CHANGE_TYPE_IS_TIME_METADATA: The change attribute is NFS4_CHANGE_TYPE_IS_TIME_METADATA: The change attribute is
implemented as suggested in the NFSv4 spec [10] in terms of the implemented as suggested in the NFSv4 spec [10] in terms of the
time_metadata attribute. time_metadata attribute.
NFS4_CHANGE_TYPE_IS_UNDEFINED: The change attribute does not take
values that fit into any of these categories.
If any of NFS4_CHANGE_TYPE_IS_MONOTONIC_INCR, If any of NFS4_CHANGE_TYPE_IS_MONOTONIC_INCR,
NFS4_CHANGE_TYPE_IS_VERSION_COUNTER, or NFS4_CHANGE_TYPE_IS_VERSION_COUNTER, or
NFS4_CHANGE_TYPE_IS_TIME_METADATA is set, then the client knows at NFS4_CHANGE_TYPE_IS_TIME_METADATA is set, then the client knows at
the very least that the change attribute is monotonically increasing, the very least that the change attribute is monotonically increasing,
which is sufficient to resolve the question of which value is the which is sufficient to resolve the question of which value is the
most recent. most recent.
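
As a rough illustration of the paragraph above, a client could pick the most recent of several change attribute values returned by parallel COMPOUNDs like this. The enum mirrors change_attr_type4. The function name is hypothetical, and the sketch assumes the 64-bit change attribute does not wrap around.

```python
from enum import IntEnum
from typing import Iterable, Optional


class ChangeAttrType(IntEnum):
    MONOTONIC_INCR = 0
    VERSION_COUNTER = 1
    VERSION_COUNTER_NOPNFS = 2
    TIME_METADATA = 3
    UNDEFINED = 4


# Only the three types the text names as guaranteeing monotonic increase.
# VERSION_COUNTER_NOPNFS is left out here because the text does not list it.
_ORDERED = {
    ChangeAttrType.MONOTONIC_INCR,
    ChangeAttrType.VERSION_COUNTER,
    ChangeAttrType.TIME_METADATA,
}


def most_recent_change(
    attr_type: ChangeAttrType, values: Iterable[int]
) -> Optional[int]:
    """Return the newest change value, or None if order cannot be inferred."""
    seen = list(values)
    if not seen or attr_type not in _ORDERED:
        return None  # caller must fall back to a serialised GETATTR
    return max(seen)
```

When the function returns None, the client is back to the fallback described in Section 8.1: a further GETATTR serialised after the earlier requests.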
If the client sees the value NFS4_CHANGE_TYPE_IS_TIME_METADATA, then If the client sees the value NFS4_CHANGE_TYPE_IS_TIME_METADATA, then
by inspecting the value of the 'time_delta' attribute it additionally by inspecting the value of the 'time_delta' attribute it additionally
has the option of detecting rogue server implementations that use has the option of detecting rogue server implementations that use
time_metadata in violation of the spec. time_metadata in violation of the spec.
Finally, if the client sees NFS4_CHANGE_TYPE_IS_VERSION_COUNTER, it If the client sees NFS4_CHANGE_TYPE_IS_VERSION_COUNTER, it has the
has the ability to predict what the resulting change attribute value ability to predict what the resulting change attribute value should
should be after a COMPOUND containing a SETATTR, WRITE, or CREATE. be after a COMPOUND containing a SETATTR, WRITE, or CREATE. This
This again allows it to detect changes made in parallel by another again allows it to detect changes made in parallel by another client.
client. The value NFS4_CHANGE_TYPE_IS_VERSION_COUNTER_NOPNFS permits The value NFS4_CHANGE_TYPE_IS_VERSION_COUNTER_NOPNFS permits the
the same, but only if the client is not doing pNFS WRITEs. same, but only if the client is not doing pNFS WRITEs.
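
The prediction described above might look like the following sketch. It assumes the client knows how many atomic changes its own COMPOUND made (`own_changes`). That count, the function name, and the two boolean flags are hypothetical and not part of the protocol.

```python
from typing import Optional


def changed_elsewhere(
    before: int,
    after: int,
    own_changes: int,
    nopnfs: bool = False,
    pnfs_writes: bool = False,
) -> Optional[bool]:
    """Compare the observed change value with the predicted one.

    before/after are the change attribute values seen before and after the
    client's COMPOUND. Returns True if the value moved by something other
    than the client's own changes, False if it matches the prediction, and
    None if no prediction is possible.
    """
    if nopnfs and pnfs_writes:
        # VERSION_COUNTER_NOPNFS: with pNFS WRITEs the number of increments
        # is not guaranteed to match the number of writes.
        return None
    return after != before + own_changes
```

For example, a client that saw the value 10, issued one WRITE, and then saw 13 can conclude that another change was made in parallel.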
Finally, if the server does not support change_attr_type or if
NFS4_CHANGE_TYPE_IS_UNDEFINED is set, then the server SHOULD make an
effort to implement the change attribute in terms of the
time_metadata attribute.
11.2.2. Attribute 80: sec_label 11.2.2. Attribute 80: sec_label
typedef uint32_t policy4; typedef uint32_t policy4;
struct labelformat_spec4 { struct labelformat_spec4 {
policy4 lfs_lfs; policy4 lfs_lfs;
policy4 lfs_pi; policy4 lfs_pi;
}; };
struct sec_label4 { struct sec_label4 {
labelformat_spec4 slai_lfs; labelformat_spec4 slai_lfs;
opaque slai_data<>; opaque slai_data<>;
}; };
The FATTR4_SEC_LABEL contains an array of two components with the The FATTR4_SEC_LABEL contains an array of two components with the
first component being an LFS. It serves to provide the receiving end first component being an LFS. It serves to provide the receiving end
with the information necessary to translate the security attribute with the information necessary to translate the security attribute
into a form that is usable by the endpoint. Label Formats assigned into a form that is usable by the endpoint. Label Formats assigned
an LFS may optionally choose to include a Policy Identifier field to an LFS may optionally choose to include a Policy Identifier field to
allow for complex policy deployments. The LFS and Label Format allow for complex policy deployments. The LFS and Label Format
Registry are described in detail in [22]. The translation used to Registry are described in detail in [22]. The translation used to
interpret the security attribute is not specified as part of the interpret the security attribute is not specified as part of the
protocol as it may depend on various factors. The second component protocol as it may depend on various factors. The second component
is an opaque section which contains the data of the attribute. This is an opaque section which contains the data of the attribute. This
skipping to change at page 67, line 14 skipping to change at page 67, line 14
13.6.3. MOTIVATION 13.6.3. MOTIVATION
Enterprise applications require guarantees that an operation has Enterprise applications require guarantees that an operation has
either aborted or completed. NFSv4.1 provides this guarantee as long either aborted or completed. NFSv4.1 provides this guarantee as long
as the session is alive: simply send a SEQUENCE operation on the same as the session is alive: simply send a SEQUENCE operation on the same
slot with a new sequence number, and the successful return of slot with a new sequence number, and the successful return of
SEQUENCE indicates the previous operation has completed. However, if SEQUENCE indicates the previous operation has completed. However, if
the session is lost, there is no way to know when any in progress the session is lost, there is no way to know when any in progress
operations have aborted or completed. In hindsight, the NFSv4.1 operations have aborted or completed. In hindsight, the NFSv4.1
specification should have mandated that DESTROY_SESSION abort/ specification should have mandated that DESTROY_SESSION either abort
complete all outstanding operations. or complete all outstanding operations.
13.6.4. DESCRIPTION 13.6.4. DESCRIPTION
A client SHOULD request the EXCHGID4_FLAG_SUPP_FENCE_OPS capability A client SHOULD request the EXCHGID4_FLAG_SUPP_FENCE_OPS capability
when it sends an EXCHANGE_ID operation. The server SHOULD set this when it sends an EXCHANGE_ID operation. The server SHOULD set this
capability in the EXCHANGE_ID reply whether the client requests it or capability in the EXCHANGE_ID reply whether the client requests it or
not. If the client ID is created with this capability then the not. If the client ID is created with this capability then the
following will occur: following will occur:
o The server will not reply to DESTROY_SESSION until all operations o The server will not reply to any DESTROY_SESSION invoked with the
in progress are completed or aborted. client ID until all operations in progress are completed or
aborted.
o The server will not reply to subsequent EXCHANGE_ID invoked on the o The server will not reply to subsequent EXCHANGE_ID invoked on the
same Client Owner with a new verifier until all operations in same client owner with a new verifier until all operations in
progress on the Client ID's session are completed or aborted. progress on the client ID's session are completed or aborted.
o When DESTROY_CLIENTID is invoked, if there are sessions (both idle o When DESTROY_CLIENTID is invoked, if there are sessions (both idle
and non-idle), opens, locks, delegations, layouts, and/or wants and non-idle), opens, locks, delegations, layouts, and/or wants
(Section 18.49 of [2]) associated with the client ID are removed. (Section 18.49 of [2]) associated with the client ID are removed.
Pending operations will be completed or aborted before the Pending operations will be completed or aborted before the
sessions, opens, locks, delegations, layouts, and/or wants are sessions, opens, locks, delegations, layouts, and/or wants are
deleted. deleted.
o The NFS server SHOULD support client ID trunking, and if it does o The NFS server SHOULD support client ID trunking, and if it does
and the EXCHGID4_FLAG_SUPP_FENCE_OPS capability is enabled, then a and the EXCHGID4_FLAG_SUPP_FENCE_OPS capability is enabled, then a
skipping to change at page 69, line 19 skipping to change at page 69, line 19
current filehandle set to the filehandle of the file in question, and current filehandle set to the filehandle of the file in question, and
the equivalent of start offset and length in bytes of the region set the equivalent of start offset and length in bytes of the region set
in ia_hole.di_offset and ia_hole.di_length respectively. If the in ia_hole.di_offset and ia_hole.di_length respectively. If the
ia_hole.di_allocated is set to TRUE, then the blocks will be zeroed ia_hole.di_allocated is set to TRUE, then the blocks will be zeroed
and if it is set to FALSE, then they will be deallocated. All and if it is set to FALSE, then they will be deallocated. All
further reads to this region MUST return zeros until overwritten. further reads to this region MUST return zeros until overwritten.
The filehandle specified must be that of a regular file. The filehandle specified must be that of a regular file.
Situations may arise where di_offset and/or di_offset + di_length Situations may arise where di_offset and/or di_offset + di_length
will not be aligned to a boundary that the server does allocations/ will not be aligned to a boundary that the server does allocations/
deallocations in. For most filesystems, this is the block size of deallocations in. For most file systems, this is the block size of
the file system. In such a case, the server can deallocate as many the file system. In such a case, the server can deallocate as many
bytes as it can in the region. The blocks that cannot be deallocated bytes as it can in the region. The blocks that cannot be deallocated
MUST be zeroed. Except for the block deallocation and maximum hole MUST be zeroed. Except for the block deallocation and maximum hole
punching capability, an INITIALIZE operation is to be treated similarly punching capability, an INITIALIZE operation is to be treated similarly
to a write of zeroes. to a write of zeroes.
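
The alignment handling above comes down to simple arithmetic. The sketch below assumes a single fixed allocation unit (`block_size`) and a hypothetical helper name. It splits a hole-punch request into the block-aligned part that can be deallocated and the unaligned head and tail that must be zeroed instead.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # (offset, length) in bytes


def split_hole(
    offset: int, length: int, block_size: int
) -> Tuple[List[Range], List[Range]]:
    """Return (ranges to deallocate, ranges to zero) for a hole request."""
    end = offset + length
    first = -(-offset // block_size) * block_size  # offset rounded up
    last = (end // block_size) * block_size        # end rounded down
    if first >= last:
        # No whole block lies inside the region: zero all of it.
        return [], ([(offset, length)] if length else [])
    dealloc = [(first, last - first)]
    zero: List[Range] = []
    if offset < first:
        zero.append((offset, first - offset))  # unaligned head
    if last < end:
        zero.append((last, end - last))        # unaligned tail
    return dealloc, zero
```

With a 4096-byte block size, a request at offset 1024 of length 10240 yields one deallocated block at offset 4096 and two zeroed ranges of 3072 bytes each, one at either end.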
The server is not required to complete deallocating the blocks The server is not required to complete deallocating the blocks
specified in the operation before returning. It is acceptable to specified in the operation before returning. It is acceptable to
have the deallocation be deferred. In fact, INITIALIZE is merely a have the deallocation be deferred. In fact, INITIALIZE is merely a
hint; it is valid for a server to return success without ever doing hint; it is valid for a server to return success without ever doing
skipping to change at page 71, line 10 skipping to change at page 71, line 10
misaligned creation of ADBs. Even while it can detect them, it misaligned creation of ADBs. Even while it can detect them, it
cannot disallow them, as the application might be in the process of cannot disallow them, as the application might be in the process of
changing the size of the ADBs. Thus the server must be prepared to changing the size of the ADBs. Thus the server must be prepared to
handle an INITIALIZE into an existing ADB. handle an INITIALIZE into an existing ADB.
This document does not mandate the manner in which the server stores This document does not mandate the manner in which the server stores
ADBs sparsely for a file. It does assume that if ADBs are stored ADBs sparsely for a file. It does assume that if ADBs are stored
sparsely, then the server can detect when an INITIALIZE arrives that sparsely, then the server can detect when an INITIALIZE arrives that
will force a new ADB to start inside an existing ADB. For example, will force a new ADB to start inside an existing ADB. For example,
assume that ADBi has an adb_block_size of 4k and that an INITIALIZE assume that ADBi has an adb_block_size of 4k and that an INITIALIZE
starts 1k inside ADBi. The server should [[Comment.2: Need to flesh starts 1k inside ADBi. The server should [[Comment.3: Need to flesh
this out. --TH]] this out. --TH]]
13.8. Operation 67: IO_ADVISE - Application I/O access pattern hints 13.8. Operation 67: IO_ADVISE - Application I/O access pattern hints
This section introduces a new operation, named IO_ADVISE, which This section introduces a new operation, named IO_ADVISE, which
allows NFS clients to communicate application I/O access pattern allows NFS clients to communicate application I/O access pattern
hints to the NFS server. This new operation will allow hints to be hints to the NFS server. This new operation will allow hints to be
sent to the server when applications use posix_fadvise, direct I/O, sent to the server when applications use posix_fadvise, direct I/O,
or at any other point that the client finds useful. or at any other point that the client finds useful.
 End of changes. 50 change blocks. 
101 lines changed or deleted 116 lines changed or added
