Network Working Group                                          M. Mathis
Internet-Draft                                                J. Heffner
Expires: January 8, 2005                                     B. Chandler
                                                                     PSC
                                                           July 10, 2004
Fragmentation Considered Very Harmful
draft-mathis-frag-harmful-00
Status of this Memo
By submitting this Internet-Draft, I certify that any applicable
patent or other IPR claims of which I am aware have been disclosed,
and any of which I become aware will be disclosed, in accordance with
RFC 3668.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that other
groups may also distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on January 8, 2005.
Copyright Notice
Copyright (C) The Internet Society (2004). All Rights Reserved.
Abstract
IPv4 fragmentation is not sufficiently robust for general use in
today's Internet. The 16-bit IP identification field is not large
enough to prevent frequent mis-associated IP fragments, and the TCP
and UDP checksums are insufficient to prevent the resulting corrupted
data from being delivered to higher protocol layers. In this note we
describe some easily reproduced experiments demonstrating the problem
and estimate the scale of the data corruption in the presence of
ever-growing data rates.
Mathis, et al. Expires January 8, 2005 [Page 1]
Internet-Draft Fragmentation Considered Very Harmful July 2004
1. Introduction
The IPv4 header was designed at a time when data rates were several
orders of magnitude lower than those achievable today. In this
document, we describe a consequent scale-related failure in the IP
identification (ID) field, where fragments may be mis-associated at a
rate likely high enough to invalidate assumptions about data
integrity failure rates. We also outline scenarios in which data
corruption may happen reliably and reproducibly.
While a number of problems with IP fragmentation have been well
documented [1], ID wrapping presents a relatively new and serious
operational problem, given the severity of the failure mode and the
fact that it occurs on what is today common communications equipment.
It is especially pertinent due to the recent proliferation of UDP bulk
transport tools that do not perform MTU discovery, and of network
equipment that ignores the Don't Fragment (DF) bit in the IP header
as a work-around for MTU discovery problems [2].
2. Wrapping the IP ID Field
The Internet Protocol standard specifies:
"The choice of the Identifier for a datagram is based on the need
to provide a way to uniquely identify the fragments of a
particular datagram. The protocol module assembling fragments
judges fragments to belong to the same datagram if they have the
same source, destination, protocol, and Identifier. Thus, the
sender must choose the Identifier to be unique for this source,
destination pair and protocol for the time the datagram (or any
fragment of it) could be alive in the internet." [3]
Strict conformance to this standard limits transmissions in one
direction between any address pair to no more than 65536 datagrams
per maximum packet lifetime.
Obviously hosts do not follow the standard so strictly. Assuming a
maximum packet lifetime on the order of seconds, today it is common
for host interfaces to send at rates higher than this. For example,
a host with a 100 Mbps interface sending 1500 byte packets may send
65536 packets in under 8 seconds.
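The arithmetic behind this example can be checked with a short
sketch. The function name id_wrap_seconds and the list of link rates
are illustrative assumptions, not part of the original text. The
sketch also assumes a saturated link carrying equal-sized packets,
each of which consumes one ID value.

```python
def id_wrap_seconds(link_bps: float, packet_bytes: int, id_bits: int = 16) -> float:
    """Seconds for a saturated sender to cycle through every IP ID value."""
    packets_per_second = link_bps / (packet_bytes * 8)
    return (2 ** id_bits) / packets_per_second


if __name__ == "__main__":
    # Hypothetical link rates chosen for illustration only.
    for label, bps in [("10 Mbps", 10e6), ("100 Mbps", 100e6), ("1 Gbps", 1e9)]:
        print(f"{label}: {id_wrap_seconds(bps, 1500):.2f} s")
```

For the 100 Mbps case this gives roughly 7.9 seconds, consistent with
the figure quoted above.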
The problem occurs when a fragment is dropped by the network, and a
later fragment is received that, while part of a different datagram,
has the same ID value and fragment offset as the dropped fragment.
The two fragments will be incorrectly spliced together and delivered
to the layer above IP. It is common that the fragment offset and
length would match since packets of the same size sent along the same
path will be fragmented in the same manner. In 65537 segments, there
must be at least two with matching ID fields. If the sender is
transmitting segments fast enough that datagrams are sent with
duplicate ID fields within the reassembly timeout (a suggested value
is 15 seconds [3]), then fragments may be mis-associated.
The case of particular concern occurs when only the first fragment of
a datagram is lost by the network. The remaining fragments will be
stored in the fragment reassembly buffer, and at some point in the
future a new packet will arrive with the matching ID field. This new
first fragment will be (incorrectly) matched up with the rest of the
old packet and delivered to the upper layer. Assuming the fragments
are delivered in order, the rest of the new datagram will be
buffered, forming a cycle. One of every 65536 datagrams will be
incorrectly reassembled by the IP layer. It is possible to have a
number of simultaneous cycles, bounded by the size of the fragment
reassembly buffer.
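The cycle described above can be reproduced with a toy model. This is
a minimal sketch under stated assumptions, not a model of any real IP
stack: every datagram is split into exactly two fragments, the sender
assigns IDs sequentially, fragments arrive in order, only one first
fragment is ever lost, and the receiver never times out a held
fragment. The name simulate_cycle is hypothetical.

```python
from typing import Dict, List, Tuple


def simulate_cycle(num_datagrams: int, lost_datagram: int, id_bits: int = 16) -> List[int]:
    """Return sequence numbers of datagrams whose first fragment was
    spliced onto the second fragment of an older datagram."""
    id_space = 1 << id_bits
    # IP ID -> (fragment kind, datagram sequence number) awaiting its partner
    pending: Dict[int, Tuple[str, int]] = {}
    corrupted: List[int] = []
    for seq in range(num_datagrams):
        ip_id = seq % id_space  # assumed sequential ID assignment
        for kind in ("first", "second"):
            if kind == "first" and seq == lost_datagram:
                continue  # the network drops this one fragment
            held = pending.get(ip_id)
            if held is not None and held[0] != kind:
                # A complementary fragment with the same ID is waiting:
                # the receiver reassembles, whether or not they belong together.
                del pending[ip_id]
                if held[1] != seq:
                    corrupted.append(seq)
            else:
                pending[ip_id] = (kind, seq)
    return corrupted


if __name__ == "__main__":
    bad = simulate_cycle(num_datagrams=4 * 65536, lost_datagram=100)
    print(len(bad), "mis-assembled datagrams:", bad)
```

In this model a single lost first fragment leaves one stale second
fragment in the buffer at all times, so every datagram that reuses
the lost datagram's ID is mis-assembled.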
Most TCP implementations today participate in MTU discovery [4],
which will avoid this problem by avoiding fragmentation. However, as
a work-around for MTU discovery problems [2], some TCP
implementations and communications gear provide mechanisms to disable
path MTU discovery by clearing or ignoring the DF bit.
3. Harmful Effects of Mis-associated Fragments
When the mis-associated fragments are delivered, transport-layer
checksumming should detect these datagrams as incorrect and discard
them. Discarding these datagrams could pose a problem for congestion
control algorithms that rely on loss feedback, since there will be a
high number of non-congestion-related losses.
However, transport checksums may not be designed to handle such high
error rates, either. The UDP checksum is only 16 bits in length. If
these checksums follow a uniform random distribution, we expect
mis-associated datagrams to be accepted by the checksum at a rate of
one per 65536. With only one mis-association cycle, we expect
corrupt data to be delivered to the application layer once per 2^32
datagrams. This rate can be significantly higher with multiple
cycles.
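To put the 2^32 figure in terms of elapsed time, the following sketch
converts it into an expected interval between undetected corruptions.
It assumes uniformly random checksums, as above, and additionally
assumes that the corruption rate scales linearly with the number of
simultaneous cycles; that scaling and the datagram rate used in the
example are illustrative assumptions, not measurements.

```python
def seconds_per_undetected_corruption(
    datagrams_per_second: float,
    id_bits: int = 16,
    checksum_bits: int = 16,
    cycles: int = 1,
) -> float:
    """Expected seconds between corrupt datagrams reaching the application."""
    # One in 2^id_bits datagrams is mis-associated per cycle, and one in
    # 2^checksum_bits of those passes the checksum.
    datagrams_per_corruption = 2 ** (id_bits + checksum_bits) / cycles
    return datagrams_per_corruption / datagrams_per_second


if __name__ == "__main__":
    rate = 10_000.0  # hypothetical datagram rate
    for cycles in (1, 16, 256):
        days = seconds_per_undetected_corruption(rate, cycles=cycles) / 86400
        print(f"{cycles:>3} cycle(s): one corruption every {days:.3f} days")
```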
With non-random data, the UDP checksum may be weaker still. It
is possible to construct datasets where mis-associated fragments will
always have the same checksum. Such a case may seem unlikely, but it
is worth considering. "Real" data may be more likely
than random data to cause checksum hotspots and increase the
probability of a false checksum match [5]. Also, some applications may
turn off checksumming to increase speed, though this practice has
been found to be dangerous for other reasons [6].
4. Experimental Results
To test the practical impact of fragmentation on UDP, we ran a series
of experiments with a common UDP bulk transport protocol, Reliable
Blast UDP (RBUDP), part of the QUANTA networking toolkit. It is one
of the tools used as an alternative to TCP for high-bandwidth
applications on specialized networks. The choice to use RBUDP has
very little to do with the protocol itself, as any UDP transport tool
without extra corruption detection would work equally well.
In order to diagnose corruption on files transferred with RBUDP, we
used a file format including embedded sequence numbers and MD5
checksums. These were placed such that one set was included in each
fragment of each datagram. Thus it was possible to distinguish
random corruption from that caused by mis-associated fragments.
Two types of dataset were used. In the first, all space not used for
sequence numbers and MD5 checksums was filled with pseudo-random
data, giving datagrams random checksums. The second was constructed
in a similar manner, except that the upper half of each 32-bit word
was filled with the 16-bit ones-complement of the lower half. This
gave each 32-bit word a zero ones-complement sum, so datagrams had
constant checksums. With these constant checksums, mis-associated
fragments were guaranteed not to fail the UDP checksum test. Each
dataset used was 400 MB in size.
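The construction of the constant-checksum dataset can be illustrated
as follows. This is a sketch of the idea only, not the tool used in
the experiments: it omits the embedded sequence numbers and MD5
checksums, covers only the payload (the UDP header and pseudo-header
are assumed identical for equal-sized datagrams in one flow), and the
function names are hypothetical.

```python
import os
import struct


def ones_complement_sum16(data: bytes) -> int:
    """16-bit ones-complement sum over big-endian 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry
    return total


def constant_checksum_payload(num_words: int) -> bytes:
    """Build 32-bit words whose upper half is the complement of the lower half."""
    out = bytearray()
    for _ in range(num_words):
        low = int.from_bytes(os.urandom(2), "big")
        high = ~low & 0xFFFF
        out += struct.pack("!HH", high, low)
    return bytes(out)


if __name__ == "__main__":
    a = constant_checksum_payload(64)
    b = constant_checksum_payload(64)
    spliced = a[:128] + b[128:]  # first half of one payload, second half of another
    # 0xFFFF is the ones-complement representation of zero.
    for name, payload in (("a", a), ("b", b), ("spliced", spliced)):
        print(name, hex(ones_complement_sum16(payload)))
```

Because every 32-bit word sums to ones-complement zero, any splice on
a 4-byte boundary leaves the sum unchanged, which is why a checksum
over such data cannot flag a mis-associated fragment.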
The RBUDP tools were used to send the datasets between a pair of
hosts at slightly less than the available data rate. Near the
beginning of each flow, a brief secondary flow was started to induce
packet loss in the primary flow. Throughout the life of the primary
flow, we typically observed mis-association rates on the order of
0.05%. In datasets with constant checksums, each of these
mis-associations resulted in corrupted data. In sending datasets
with random checksums 100 times (for a total of 100 GB), we observed
one corruption and 41091 bad UDP checksums.
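As a rough consistency check, the observed counts can be compared
with what a uniformly random 16-bit checksum would predict. The
sketch below treats the 41091 checksum failures plus the one
corruption as the total number of mis-associations and uses a Poisson
approximation; both are assumptions made here for illustration, and
the function names are hypothetical.

```python
import math


def expected_undetected(mis_associations: int, checksum_bits: int = 16) -> float:
    """Expected number of mis-associations that pass a uniform random checksum."""
    return mis_associations / 2 ** checksum_bits


def prob_at_least_one(expected: float) -> float:
    """Poisson approximation of the chance of one or more undetected corruptions."""
    return 1.0 - math.exp(-expected)


if __name__ == "__main__":
    bad_checksums = 41091  # from the experiment described above
    corruptions = 1        # from the experiment described above
    lam = expected_undetected(bad_checksums + corruptions)
    print(f"expected undetected corruptions: {lam:.2f}")
    print(f"chance of at least one: {prob_at_least_one(lam):.2f}")
```

Under these assumptions the expectation is a little over 0.6
undetected corruptions, so observing exactly one is in line with the
one-per-65536 estimate.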
5. Remedies
IPv6 is less vulnerable to this type of problem, since its fragment
header contains a 32-bit identification field [7]. Mis-association
will only be a problem at packet rates 65536 times higher than for
IPv4.
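The factor of 65536 follows directly from the field widths. As a
small sketch, the highest datagram rate that keeps IDs unique within
a given lifetime can be computed for both field sizes; the 15-second
value is the suggested reassembly timeout cited earlier, used here as
an illustrative stand-in for the maximum datagram lifetime, and the
function name is hypothetical.

```python
def max_unique_id_rate(id_bits: int, lifetime_seconds: float) -> float:
    """Highest datagrams-per-second rate that avoids reusing an ID within the lifetime."""
    return 2 ** id_bits / lifetime_seconds


if __name__ == "__main__":
    lifetime = 15.0  # assumed lifetime in seconds
    ipv4 = max_unique_id_rate(16, lifetime)
    ipv6 = max_unique_id_rate(32, lifetime)
    print(f"IPv4 (16-bit ID): {ipv4:,.0f} datagrams/s")
    print(f"IPv6 (32-bit ID): {ipv6:,.0f} datagrams/s")
    print(f"ratio: {ipv6 / ipv4:,.0f}")
```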
Since mis-association of fragments will only occur when the IP ID
field is wrapped within the fragment reassembly timeout, it is
possible to reduce the timeout so that this situation is less likely
to occur. Since the timeout is set by the receiving host while the
IP ID field is set by the sending host, it is not generally possible
to set the timeout low enough so that a fast sender's fragments will
not be mis-associated, yet high enough so that a slow sender's
fragments will not be unconditionally discarded before it is possible
to reassemble them. It is not within the scope of this document to
recommend timeout values.
Another means of solving the corruption issue is to add stronger
integrity checking, which can be done at any layer above IP. This is
a natural side effect of using cryptographic authentication. At the
network layer, if IPsec AH [9] is in use, the mis-associated fragments
should be discarded with extremely high probability. Other higher
layers may use longer checksums (for example, SCTP's is 32 bits in
length [8]) or cryptographic authentication (SSH message
authentication codes [10]). While stronger integrity checking may
prevent data corruption, it will not solve the problem of a high
effective loss rate.
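As an illustration of stronger integrity checking above IP, the
sketch below appends an HMAC-SHA256 tag to each datagram payload and
rejects any payload whose tag does not verify. It is not the
mechanism used by IPsec, SCTP, or SSH; the function names, the key,
and the payload sizes are hypothetical, and key distribution is out
of scope.

```python
import hashlib
import hmac
from typing import Optional

TAG_LEN = 32  # SHA-256 digest size in bytes


def protect(key: bytes, payload: bytes) -> bytes:
    """Return the payload followed by its HMAC-SHA256 tag."""
    return payload + hmac.new(key, payload, hashlib.sha256).digest()


def verify(key: bytes, message: bytes) -> Optional[bytes]:
    """Return the payload if the trailing tag is valid, otherwise None."""
    if len(message) < TAG_LEN:
        return None
    payload, tag = message[:-TAG_LEN], message[-TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    return payload if hmac.compare_digest(tag, expected) else None


if __name__ == "__main__":
    key = b"k" * 16  # placeholder key for the example
    old = protect(key, b"A" * 2000)
    new = protect(key, b"B" * 2000)
    # Emulate mis-association: first fragment of the new datagram,
    # remaining bytes from the old one.
    spliced = new[:1480] + old[1480:]
    print("intact accepted:", verify(key, new) is not None)
    print("spliced accepted:", verify(key, spliced) is not None)
```

The spliced message is discarded rather than delivered, which
prevents corruption but, as noted above, still counts as a loss.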
6. Security Considerations
If a malicious entity knows that a pair of hosts is communicating
using a fragmented stream, that entity may have an opportunity to
corrupt the flow. By sending "high" fragments (those with
offset greater than zero) with a forged source address, the attacker
can deliberately cause corruption as described above. Exploiting
this vulnerability requires only knowledge of the source and
destination addresses of the flow, and fragment boundaries. It does
not require knowledge of port or sequence numbers.
If the attacker has visibility of packets on the path, the attack
profile is similar to injecting full segments. Using this attack
makes blind disruptions easier, and could certainly be used
effectively to cause denial of service. However, only streams using
IPv4 fragmentation are vulnerable. Because of the nature of the
problems outlined in this draft, the use of IPv4 fragmentation for
critical applications may not be advisable regardless of security
concerns.
7. References
[1] Kent, C. and J. Mogul, "Fragmentation considered harmful",
Proc. SIGCOMM '87 vol. 17, No. 5, October 1987.
[2] Lahey, K., "TCP Problems with Path MTU Discovery", RFC 2923,
September 2000.
[3] Postel, J., "Internet Protocol", STD 5, RFC 791, September
1981.
[4] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
November 1990.
[5] Stone, J., Greenwald, M., Partridge, C. and J. Hughes,
"Performance of Checksums and CRC's over Real Data", IEEE/ACM
Transactions on Networking vol. 6, No. 5, October 1998.
[6] Stone, J. and C. Partridge, "When The CRC and TCP Checksum
Disagree", Proc. SIGCOMM 2000 vol. 30, No. 4, October 2000.
[7] Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6)
Specification", RFC 2460, December 1998.
[8] Stewart, R., Xie, Q., Morneault, K., Sharp, C., Schwarzbauer,
H., Taylor, T., Rytina, I., Kalla, M., Zhang, L. and V. Paxson,
"Stream Control Transmission Protocol", RFC 2960, October 2000.
[9] Kent, S. and R. Atkinson, "IP Authentication Header", RFC 2402,
November 1998.
[10] Ylonen, T. and C. Lonvick, "SSH Transport Layer Protocol",
draft-ietf-secsh-transport-18 (work in progress), June 2004.
[11] Clark, D., "IP datagram reassembly algorithms", RFC 815, July
1982.
Authors' Addresses
Matt Mathis
Pittsburgh Supercomputing Center
4400 Fifth Avenue
Pittsburgh, PA 15213
US
Phone: 412-268-3319
EMail: mathis@psc.edu
John W. Heffner
Pittsburgh Supercomputing Center
4400 Fifth Avenue
Pittsburgh, PA 15213
US
Phone: 412-268-2329
EMail: jheffner@psc.edu
Ben Chandler
Pittsburgh Supercomputing Center
4400 Fifth Avenue
Pittsburgh, PA 15213
US
Phone: 412-268-9783
EMail: bchandle@psc.edu
Appendix A. Support
This work was supported by the National Science Foundation under
Grant No. 0083285.
Intellectual Property Statement
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the IETF's procedures with respect to rights in IETF Documents can
be found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.
Disclaimer of Validity
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Copyright Statement
Copyright (C) The Internet Society (2004). This document is subject
to the rights, licenses and restrictions contained in BCP 78, and
except as set forth therein, the authors retain all their rights.
Acknowledgment
Funding for the RFC Editor function is currently provided by the
Internet Society.