draft-ietf-ipngwg-rfc2292bis-06.txt   draft-ietf-ipngwg-rfc2292bis-07.txt 
INTERNET-DRAFT W. Richard Stevens INTERNET-DRAFT W. Richard Stevens
Expires: August 25, 2002 Matt Thomas (Consultant) Expires: October 19, 2002 Matt Thomas (Consultant)
Obsoletes RFC 2292 Erik Nordmark (Sun) Obsoletes RFC 2292 Erik Nordmark (Sun)
Tatuya Jinmei (Toshiba) Tatuya Jinmei (Toshiba)
February 25, 2002 April 19, 2002
Advanced Sockets API for IPv6 Advanced Sockets API for IPv6
<draft-ietf-ipngwg-rfc2292bis-06.txt> <draft-ietf-ipngwg-rfc2292bis-07.txt>
Abstract Abstract
A separate specification [RFC-2553] contains changes to the sockets A separate specification [RFC-2553] contains changes to the sockets
API to support IP version 6. Those changes are for TCP and UDP-based API to support IP version 6. Those changes are for TCP and UDP-based
applications and will support most end-user applications in use applications and will support most end-user applications in use
today: Telnet and FTP clients and servers, HTTP clients and servers, today: Telnet and FTP clients and servers, HTTP clients and servers,
and the like. and the like.
But another class of applications exists that will also be run under But another class of applications exists that will also be run under
skipping to change at page 2, line 14 skipping to change at page 2, line 14
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
This Internet Draft expires August 25, 2002. This Internet Draft expires October 19, 2002.
Table of Contents Table of Contents
1. Introduction .................................................... 6 1. Introduction .................................................... 6
2. Common Structures and Definitions ............................... 7 2. Common Structures and Definitions ............................... 7
2.1. The ip6_hdr Structure ...................................... 8 2.1. The ip6_hdr Structure ...................................... 8
2.1.1. IPv6 Next Header Values ............................. 8 2.1.1. IPv6 Next Header Values ............................. 8
2.1.2. IPv6 Extension Headers .............................. 9 2.1.2. IPv6 Extension Headers .............................. 9
2.1.3. IPv6 Options ........................................ 10 2.1.3. IPv6 Options ........................................ 10
skipping to change at page 4, line 19 skipping to change at page 4, line 19
10.1. inet6_opt_init ............................................ 42 10.1. inet6_opt_init ............................................ 42
10.2. inet6_opt_append .......................................... 42 10.2. inet6_opt_append .......................................... 42
10.3. inet6_opt_finish .......................................... 43 10.3. inet6_opt_finish .......................................... 43
10.4. inet6_opt_set_val ......................................... 43 10.4. inet6_opt_set_val ......................................... 43
10.5. inet6_opt_next ............................................ 43 10.5. inet6_opt_next ............................................ 43
10.6. inet6_opt_find ............................................ 44 10.6. inet6_opt_find ............................................ 44
10.7. inet6_opt_get_val ......................................... 44 10.7. inet6_opt_get_val ......................................... 44
11. Additional Advanced API Functions ............................... 45 11. Additional Advanced API Functions ............................... 45
11.1. Sending with the Minimum MTU .............................. 45 11.1. Sending with the Minimum MTU .............................. 45
11.2. Sending without fragmentation ............................. 45 11.2. Sending without fragmentation ............................. 46
11.3. Path MTU Discovery and UDP ................................ 46 11.3. Path MTU Discovery and UDP ................................ 47
11.4. Determining the current path MTU .......................... 48 11.4. Determining the current path MTU .......................... 48
12. Ordering of Ancillary Data and IPv6 Extension Headers ........... 48 12. Ordering of Ancillary Data and IPv6 Extension Headers ........... 49
13. IPv6-Specific Options with IPv4-Mapped IPv6 Addresses ........... 50 13. IPv6-Specific Options with IPv4-Mapped IPv6 Addresses ........... 51
14. Extended interfaces for rresvport, rcmd and rexec ............... 51 14. Extended interfaces for rresvport, rcmd and rexec ............... 52
14.1. rresvport_af .............................................. 51 14.1. rresvport_af .............................................. 52
14.2. rcmd_af ................................................... 52 14.2. rcmd_af ................................................... 53
14.3. rexec_af .................................................. 52 14.3. rexec_af .................................................. 53
15. Summary of New Definitions ...................................... 53 15. Summary of New Definitions ...................................... 53
16. Security Considerations ......................................... 56 16. Security Considerations ......................................... 57
17. Change History .................................................. 57 17. Change History .................................................. 58
18. References ...................................................... 62 18. References ...................................................... 63
19. Acknowledgments ................................................. 62 19. Acknowledgments ................................................. 63
20. Authors' Addresses .............................................. 63 20. Authors' Addresses .............................................. 64
21. Appendix A: Ancillary Data Overview ............................. 63 21. Appendix A: Ancillary Data Overview ............................. 64
21.1. The msghdr Structure ...................................... 64 21.1. The msghdr Structure ...................................... 65
21.2. The cmsghdr Structure ..................................... 65 21.2. The cmsghdr Structure ..................................... 66
21.3. Ancillary Data Object Macros .............................. 66 21.3. Ancillary Data Object Macros .............................. 67
21.3.1. CMSG_FIRSTHDR ...................................... 67 21.3.1. CMSG_FIRSTHDR ...................................... 68
21.3.2. CMSG_NXTHDR ........................................ 68 21.3.2. CMSG_NXTHDR ........................................ 69
21.3.3. CMSG_DATA .......................................... 69 21.3.3. CMSG_DATA .......................................... 70
21.3.4. CMSG_SPACE ......................................... 69 21.3.4. CMSG_SPACE ......................................... 70
21.3.5. CMSG_LEN ........................................... 70 21.3.5. CMSG_LEN ........................................... 71
22. Appendix B: Examples using the inet6_rth_XXX() functions ........ 70 22. Appendix B: Examples using the inet6_rth_XXX() functions ........ 71
22.1. Sending a Routing Header .................................. 70 22.1. Sending a Routing Header .................................. 71
22.2. Receiving Routing Headers ................................. 75 22.2. Receiving Routing Headers ................................. 76
23. Appendix C: Examples using the inet6_opt_XXX() functions ........ 77 23. Appendix C: Examples using the inet6_opt_XXX() functions ........ 78
23.1. Building options .......................................... 77 23.1. Building options .......................................... 78
23.2. Parsing received options .................................. 79 23.2. Parsing received options .................................. 80
1. Introduction 1. Introduction
A separate specification [RFC-2553] contains changes to the sockets A separate specification [RFC-2553] contains changes to the sockets
API to support IP version 6. Those changes are for TCP and UDP-based API to support IP version 6. Those changes are for TCP and UDP-based
applications. This document defines some of the "advanced" features applications. This document defines some of the "advanced" features
of the sockets API that are required for applications to take of the sockets API that are required for applications to take
advantage of additional features of IPv6. advantage of additional features of IPv6.
Today, the portability of applications using IPv4 raw sockets is Today, the portability of applications using IPv4 raw sockets is
skipping to change at page 20, line 43 skipping to change at page 20, line 43
setsockopt(fd, IPPROTO_IPV6, IPV6_CHECKSUM, &offset, sizeof(offset)); setsockopt(fd, IPPROTO_IPV6, IPV6_CHECKSUM, &offset, sizeof(offset));
By default, this socket option is disabled. Setting the offset to -1 By default, this socket option is disabled. Setting the offset to -1
also disables the option. By disabled we mean (1) the kernel will also disables the option. By disabled we mean (1) the kernel will
not calculate and store a checksum for outgoing packets, and (2) the not calculate and store a checksum for outgoing packets, and (2) the
kernel will not verify a checksum for received packets. kernel will not verify a checksum for received packets.
This option assumes the use of the 16-bit one's complement of the This option assumes the use of the 16-bit one's complement of the
one's complement sum as the checksum algorithm and that the checksum one's complement sum as the checksum algorithm and that the checksum
field is aligned on a 16-bit boundary. Thus, specifying a positive field is aligned on a 16-bit boundary. Thus, specifying a positive
odd value as offset is invalid, and setsockopt() will fail for odd odd value as offset is invalid, and setsockopt() will fail for such
offset values. offset values.
An attempt to set IPV6_CHECKSUM for an ICMPv6 socket will fail. An attempt to set IPV6_CHECKSUM for an ICMPv6 socket will fail.
Also, an attempt to set or get IPV6_CHECKSUM for a non-raw IPv6 Also, an attempt to set or get IPV6_CHECKSUM for a non-raw IPv6
socket will fail. socket will fail.
(Note: Since the checksum is always calculated by the kernel for an (Note: Since the checksum is always calculated by the kernel for an
ICMPv6 socket, applications are not able to generate ICMPv6 packets ICMPv6 socket, applications are not able to generate ICMPv6 packets
with incorrect checksums (presumably for testing purposes) using this with incorrect checksums (presumably for testing purposes) using this
API.) API.)
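The offset rules above can be sketched as a small validation helper plus a wrapper around setsockopt(). This is only an illustration: the names checksum_offset_valid() and set_raw_checksum_offset() are hypothetical, and rejecting offsets below -1 is an assumption of the sketch rather than something the text states.

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Return 1 if `offset` is acceptable for IPV6_CHECKSUM, else 0.
 * -1 disables the option; any other value must be non-negative and
 * even, because the checksum field is aligned on a 16-bit boundary.
 * Rejecting values below -1 is an assumption of this sketch.
 */
static int checksum_offset_valid(int offset)
{
    if (offset == -1)
        return 1;
    return offset >= 0 && (offset % 2) == 0;
}

/*
 * Hypothetical wrapper: validate `offset` locally, then pass it to the
 * kernel. `fd` is expected to be a raw IPv6 socket that is not ICMPv6.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int set_raw_checksum_offset(int fd, int offset)
{
    if (!checksum_offset_valid(offset)) {
        errno = EINVAL;
        return -1;
    }
    return setsockopt(fd, IPPROTO_IPV6, IPV6_CHECKSUM,
                      &offset, sizeof(offset));
}
```

A caller would typically pass the byte offset of the checksum field within the upper-layer header, or -1 to disable kernel checksum handling.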
skipping to change at page 22, line 19 skipping to change at page 22, line 19
void ICMP6_FILTER_SETBLOCK( int, struct icmp6_filter *); void ICMP6_FILTER_SETBLOCK( int, struct icmp6_filter *);
int ICMP6_FILTER_WILLPASS (int, int ICMP6_FILTER_WILLPASS (int,
const struct icmp6_filter *); const struct icmp6_filter *);
int ICMP6_FILTER_WILLBLOCK(int, int ICMP6_FILTER_WILLBLOCK(int,
const struct icmp6_filter *); const struct icmp6_filter *);
The first argument to the last four macros (an integer) is an ICMPv6 The first argument to the last four macros (an integer) is an ICMPv6
message type, between 0 and 255. The pointer argument to all six message type, between 0 and 255. The pointer argument to all six
macros is a pointer to a filter that is modified by the first four macros is a pointer to a filter that is modified by the first four
macros examined by the last two macros. macros and is examined by the last two macros.
The first two macros, SETPASSALL and SETBLOCKALL, let us specify that The first two macros, SETPASSALL and SETBLOCKALL, let us specify that
all ICMPv6 messages are passed to the application or that all ICMPv6 all ICMPv6 messages are passed to the application or that all ICMPv6
messages are blocked from being passed to the application. messages are blocked from being passed to the application.
The next two macros, SETPASS and SETBLOCK, let us specify that The next two macros, SETPASS and SETBLOCK, let us specify that
messages of a given ICMPv6 type should be passed to the application messages of a given ICMPv6 type should be passed to the application
or not passed to the application (blocked). or not passed to the application (blocked).
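As a rough illustration of how the SET macros combine, the sketch below blocks every ICMPv6 type and then re-enables a single one. The helper name pass_only_type() is hypothetical; the macros and struct icmp6_filter are assumed to come from <netinet/icmp6.h>, as on common BSD-derived and Linux systems.

```c
#include <string.h>
#include <sys/types.h>
#include <netinet/in.h>
#include <netinet/icmp6.h>

/*
 * Hypothetical helper: build a filter that blocks all ICMPv6 message
 * types except `type` (an ICMPv6 message type between 0 and 255).
 */
static void pass_only_type(struct icmp6_filter *filt, int type)
{
    ICMP6_FILTER_SETBLOCKALL(filt);   /* start from "block everything" */
    ICMP6_FILTER_SETPASS(type, filt); /* then let one type through */
}
```

For instance, after pass_only_type(&filt, ICMP6_ECHO_REPLY), the query macro ICMP6_FILTER_WILLPASS(ICMP6_ECHO_REPLY, &filt) should evaluate to true while ICMP6_FILTER_WILLBLOCK(ICMP6_ECHO_REQUEST, &filt) should also be true.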
The final two macros, WILLPASS and WILLBLOCK, return true or false The final two macros, WILLPASS and WILLBLOCK, return true or false
skipping to change at page 45, line 27 skipping to change at page 45, line 27
rely on the alignment. rely on the alignment.
The function returns the offset for the next field (i.e., offset + The function returns the offset for the next field (i.e., offset +
vallen) which can be used when extracting option content with vallen) which can be used when extracting option content with
multiple fields. multiple fields.
11. Additional Advanced API Functions 11. Additional Advanced API Functions
11.1. Sending with the Minimum MTU 11.1. Sending with the Minimum MTU
Some applications might not want to incur the overhead of path MTU Unicast applications should usually let the kernel perform path MTU
discovery, especially if the applications only send a single datagram discovery, as long as the kernel supports it, and should not care
to a destination. A potential example is a DNS server. about the path MTU. Some applications, however, might not want to
incur the overhead of path MTU discovery, especially if the
applications only send a single datagram to a destination. A
potential example is a DNS server.
Also, path MTU discovery for multicast has severe scalability
limitations and should thus be avoided by default. Applications
sending multicast traffic should explicitly enable path MTU discovery
only when they understand that the benefit of possibly larger MTU
usage outweighs the possible impact of MTU discovery for active
sources across the delivery tree(s). This default behavior is based
on today's available path MTU discovery mechanism and may change in
the future once more scalable mechanisms are sufficiently widely
available.
This specification defines a mechanism to avoid path MTU discovery by This specification defines a mechanism to avoid path MTU discovery by
sending at the minimum IPv6 MTU [RFC-2460]. If the packet is larger sending at the minimum IPv6 MTU [RFC-2460]. If the packet is larger
than the minimum MTU and this feature has been enabled the IP layer than the minimum MTU and this feature has been enabled the IP layer
will fragment to the minimum MTU. This can be enabled using the will fragment to the minimum MTU. To control the policy about path
IPV6_USE_MIN_MTU socket option. MTU discovery, applications can use the IPV6_USE_MIN_MTU socket
option.
As described above, the default policy should depend on whether the
destination is unicast or multicast. For unicast destinations path
MTU discovery should be performed by default. For multicast
destinations path MTU discovery should be disabled by default. This
option thus takes the following three types of integer arguments:
-1: Perform path MTU discovery for unicast destinations but do
not perform it for multicast destinations. Packets to multicast
destinations are therefore sent with the minimum MTU.
0: always perform path MTU discovery.
1: always disable path MTU discovery and send packets at the
minimum MTU.
The default value of this option is -1. Values other than -1, 0, and
1 are invalid, and an error EINVAL will be returned for those values.
As an example, if a unicast application intentionally wants to
disable path MTU discovery, it will add the following lines:
int on = 1; int on = 1;
setsockopt(fd, IPPROTO_IPV6, IPV6_USE_MIN_MTU, &on, sizeof(on)); setsockopt(fd, IPPROTO_IPV6, IPV6_USE_MIN_MTU, &on, sizeof(on));
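The three-valued argument introduced in the -07 text can be sketched as a wrapper that checks the value before calling setsockopt(). The function names are hypothetical, and the #ifdef guard reflects an assumption that not every platform's headers define IPV6_USE_MIN_MTU.

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Return 1 for the three values the option accepts (-1, 0, 1), else 0. */
static int min_mtu_policy_valid(int policy)
{
    return policy >= -1 && policy <= 1;
}

/*
 * Hypothetical wrapper around IPV6_USE_MIN_MTU.
 *   -1: path MTU discovery for unicast only (the default)
 *    0: always perform path MTU discovery
 *    1: never perform it; always send at the minimum MTU
 * Returns 0 on success, -1 with errno set on failure.
 */
static int set_min_mtu_policy(int fd, int policy)
{
    if (!min_mtu_policy_valid(policy)) {
        errno = EINVAL;       /* mirrors the error the text describes */
        return -1;
    }
#ifdef IPV6_USE_MIN_MTU
    return setsockopt(fd, IPPROTO_IPV6, IPV6_USE_MIN_MTU,
                      &policy, sizeof(policy));
#else
    (void)fd;                 /* option not available in these headers */
    errno = ENOPROTOOPT;
    return -1;
#endif
}
```

Checking the range locally is optional, since the kernel is expected to return EINVAL itself, but it makes the accepted values explicit at the call site.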
By default, this socket option is disabled. Setting the value to 0 Note that this API intentionally excludes the case where the
also disables the option. This option can also be sent as ancillary application wants to perform path MTU discovery for multicast but to
data. In the cmsghdr structure containing this ancillary data, the disable it for unicast. This is because such usage is not feasible
cmsg_level member will be IPPROTO_IPV6, the cmsg_type member will be considering a scale of performance issues around whether to do path
IPV6_USE_MIN_MTU, and the first byte of cmsg_data[] will be the first MTU discovery or not. When path MTU discovery makes sense to a
byte of the integer. destination but not to a different destination, regardless of whether
the destination is unicast or multicast, applications either need to
toggle the option between sending such packets on the same socket, or
use different sockets for the two classes of destinations.
This option can also be sent as ancillary data. In the cmsghdr
structure containing this ancillary data, the cmsg_level member will
be IPPROTO_IPV6, the cmsg_type member will be IPV6_USE_MIN_MTU, and
the first byte of cmsg_data[] will be the first byte of the integer.
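The ancillary data layout described above might be built as in the following sketch. The helper build_int_cmsg() is hypothetical and takes the level and type as parameters so that it does not depend on IPV6_USE_MIN_MTU being defined by the local headers; the caller is assumed to supply a buffer that is suitably aligned for struct cmsghdr.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Hypothetical helper: place one ancillary data object holding a single
 * int into `buf` and attach it to `msg`. `buf` must be aligned for
 * struct cmsghdr and at least CMSG_SPACE(sizeof(int)) bytes long.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int build_int_cmsg(struct msghdr *msg, void *buf, size_t buflen,
                          int level, int type, int value)
{
    struct cmsghdr *cmsg;

    if (buflen < CMSG_SPACE(sizeof(int)))
        return -1;

    memset(buf, 0, CMSG_SPACE(sizeof(int)));
    msg->msg_control = buf;
    msg->msg_controllen = CMSG_SPACE(sizeof(int));

    cmsg = CMSG_FIRSTHDR(msg);
    cmsg->cmsg_level = level;               /* e.g. IPPROTO_IPV6 */
    cmsg->cmsg_type = type;                 /* e.g. IPV6_USE_MIN_MTU */
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    /* The integer starts at the first byte of cmsg_data[]. */
    memcpy(CMSG_DATA(cmsg), &value, sizeof(int));
    return 0;
}
```

With level IPPROTO_IPV6, type IPV6_USE_MIN_MTU and value 1, the resulting msghdr could be passed to sendmsg() to send one datagram at the minimum MTU without changing the socket-wide setting.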
11.2. Sending without fragmentation 11.2. Sending without fragmentation
In order to provide for easy porting of existing UDP and raw socket In order to provide for easy porting of existing UDP and raw socket
applications IPv6 implementations will, when originating packets, applications IPv6 implementations will, when originating packets,
automatically insert a fragment header in the packet if the packet is automatically insert a fragment header in the packet if the packet is
too big for the path MTU. too big for the path MTU.
Some applications might not want this behavior. An example is Some applications might not want this behavior. An example is
traceroute which might want to discover the actual path MTU. traceroute which might want to discover the actual path MTU.
This specification defines a mechanism to turn off the automatic This specification defines a mechanism to turn off the automatic
inserting of a fragment header for UDP and raw sockets. This can be inserting of a fragment header for UDP and raw sockets. This can be
skipping to change at page 62, line 22 skipping to change at page 63, line 19
- Moved description about the ordering between IPV6_PKTINFO and - Moved description about the ordering between IPV6_PKTINFO and
IPV6_MULTICAST_IF to Section 6.7, which summarized the ordering IPV6_MULTICAST_IF to Section 6.7, which summarized the ordering
among various options. among various options.
- Removed the section for IPV6_REACHCONF and all references to this - Removed the section for IPV6_REACHCONF and all references to this
option, based on a discussion after 04. option, based on a discussion after 04.
- Clarified the header ordering issue much more, to make it clear that - Clarified the header ordering issue much more, to make it clear that
the ordering is just for this particular API. the ordering is just for this particular API.
Changes since -06:
- Revised the "minimum MTU" section so that path MTU discovery would
be disabled for multicast by default. A new (default) value "-1" as
an argument was introduced accordingly.
18. References 18. References
[RFC-2460] Deering, S., Hinden, R., "Internet Protocol, Version 6 [RFC-2460] Deering, S., Hinden, R., "Internet Protocol, Version 6
(IPv6), Specification", RFC 2460, Dec. 1998. (IPv6), Specification", RFC 2460, Dec. 1998.
[RFC-2553] Gilligan, R. E., Thomson, S., Bound, J., Stevens, W., [RFC-2553] Gilligan, R. E., Thomson, S., Bound, J., Stevens, W.,
"Basic Socket Interface Extensions for IPv6", RFC 2553, "Basic Socket Interface Extensions for IPv6", RFC 2553,
March 1999. March 1999.
[RFC-1981] McCann, J., Deering, S., Mogul, J, "Path MTU Discovery for [RFC-1981] McCann, J., Deering, S., Mogul, J, "Path MTU Discovery for
skipping to change at page 63, line 8 skipping to change at page 64, line 11
implementor of ancillary data in the BSD networking code. Craig Metz implementor of ancillary data in the BSD networking code. Craig Metz
provided lots of feedback, suggestions, and comments based on his provided lots of feedback, suggestions, and comments based on his
implementing many of these features as the document was being implementing many of these features as the document was being
written. Mark Andrews first proposed the idea of the written. Mark Andrews first proposed the idea of the
IPV6_USE_MIN_MTU option. Jun-ichiro Hagino contributed text for the IPV6_USE_MIN_MTU option. Jun-ichiro Hagino contributed text for the
traffic class API from a draft of his own. traffic class API from a draft of his own.
The following provided comments on earlier drafts: Pascal Anelli, The following provided comments on earlier drafts: Pascal Anelli,
Hamid Asayesh, Ran Atkinson, Karl Auerbach, Hamid Asayesh, Don Hamid Asayesh, Ran Atkinson, Karl Auerbach, Hamid Asayesh, Don
Coolidge, Matt Crawford, Sam T. Denton, Richard Draves, Francis Coolidge, Matt Crawford, Sam T. Denton, Richard Draves, Francis
Dupont, Lilian Fernandes, Bob Gilligan, Gerri Harter, Tim Hartrick, Dupont, Toerless Eckert, Lilian Fernandes, Bob Gilligan, Gerri
Bob Halley, Masaki Hirabaru, Yoshinobu Inoue, Mukesh Kacker, A. N. Harter, Tim Hartrick, Bob Halley, Masaki Hirabaru, Yoshinobu Inoue,
Kuznetsov, Sam Manthorpe, Pedro Marques, Jack McCann, der Mouse, John Mukesh Kacker, A. N. Kuznetsov, Sam Manthorpe, Pedro Marques, Jack
Moy, Lori Napoli, Thomas Narten, Atsushi Onoe, Steve Parker, Charles McCann, der Mouse, John Moy, Lori Napoli, Thomas Narten, Atsushi
Perkins, Ken Powell, Tom Pusateri, Pedro Roque, Sameer Shah, Peter Onoe, Steve Parker, Charles Perkins, Ken Powell, Tom Pusateri, Pedro
Sjodin, Stephen P. Spackman, Jinmei Tatuya, Karen Tracey, Sowmini Roque, Sameer Shah, Peter Sjodin, Stephen P. Spackman, Jinmei Tatuya,
Varadhan, Quaizar Vohra, Carl Williams, Steve Wise, Eric Wong, Karen Tracey, Sowmini Varadhan, Quaizar Vohra, Carl Williams, Steve
Farrell Woods, Kazu Yamamoto, Vladislav Yasevich, and YOSHIFUJI Wise, Eric Wong, Farrell Woods, Kazu Yamamoto, Vladislav Yasevich,
Hideaki. and YOSHIFUJI Hideaki.
20. Authors' Addresses 20. Authors' Addresses
W. Richard Stevens (deceased) W. Richard Stevens (deceased)
Matt Thomas Matt Thomas
3am Software Foundry 3am Software Foundry
8053 Park Villa Circle 8053 Park Villa Circle
Cupertino, CA 95014 Cupertino, CA 95014
Email: matt@3am-software.com Email: matt@3am-software.com
 End of changes. 24 change blocks. 
54 lines changed or deleted 102 lines changed or added
