--- 1/draft-ietf-geopriv-uncertainty-00.txt 2014-07-04 10:14:31.489502044 -0700
+++ 2/draft-ietf-geopriv-uncertainty-01.txt 2014-07-04 10:14:31.557503671 -0700
@@ -1,19 +1,19 @@
GEOPRIV M. Thomson
Internet-Draft Mozilla
Intended status: Standards Track J. Winterbottom
Expires: July 26, 2014 Unaffiliated
 January 22, 2014
+Expires: January 5, 2015 Unaffiliated
+ July 4, 2014
Representation of Uncertainty and Confidence in PIDF-LO
 draft-ietf-geopriv-uncertainty-00
+ draft-ietf-geopriv-uncertainty-01
Abstract
The key concepts of uncertainty and confidence as they pertain to
location information are defined. Methods for the manipulation of
location estimates that include uncertainty information are outlined.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
@@ -22,21 +22,21 @@
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
 This Internet-Draft will expire on July 26, 2014.
+ This Internet-Draft will expire on January 5, 2015.
Copyright Notice
Copyright (c) 2014 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
@@ -66,83 +66,90 @@
5. Manipulation of Uncertainty . . . . . . . . . . . . . . . . . 13
5.1. Reduction of a Location Estimate to a Point . . . . . . . 13
5.1.1. Centroid Calculation . . . . . . . . . . . . . . . . 14
5.1.1.1. ArcBand Centroid . . . . . . . . . . . . . . . . 14
5.1.1.2. Polygon Centroid . . . . . . . . . . . . . . . . 15
5.2. Conversion to Circle or Sphere . . . . . . . . . . . . . 17
5.3. ThreeDimensional to TwoDimensional Conversion . . . . . 18
5.4. Increasing and Decreasing Uncertainty and Confidence . . 19
5.4.1. Rectangular Distributions . . . . . . . . . . . . . . 19
5.4.2. Normal Distributions . . . . . . . . . . . . . . . . 20
 5.5. Determining Whether a Location is Within a Given Region . 20
+ 5.5. Determining Whether a Location is Within a Given Region . 21
5.5.1. Determining the Area of Overlap for Two Circles . . . 22
 5.5.2. Determining the Area of Overlap for Two Polygons . . 22
+ 5.5.2. Determining the Area of Overlap for Two Polygons . . 23
6. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 23
6.1. Reduction to a Point or Circle . . . . . . . . . . . . . 23
 6.2. Increasing and Decreasing Confidence . . . . . . . . . . 26
 6.3. Matching Location Estimates to Regions of Interest . . . 26
 6.4. PIDFLO With Confidence Example . . . . . . . . . . . . . 27
 7. Confidence Schema . . . . . . . . . . . . . . . . . . . . . . 27
 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 29
+ 6.2. Increasing and Decreasing Confidence . . . . . . . . . . 27
+ 6.3. Matching Location Estimates to Regions of Interest . . . 27
+ 6.4. PIDFLO With Confidence Example . . . . . . . . . . . . . 28
+ 7. Confidence Schema . . . . . . . . . . . . . . . . . . . . . . 28
+ 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 30
8.1. URN SubNamespace Registration for
 urn:ietf:params:xml:ns:geopriv:conf . . . . . . . . . . . 29
 8.2. XML Schema Registration . . . . . . . . . . . . . . . . . 29
 9. Security Considerations . . . . . . . . . . . . . . . . . . . 30
 10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 30
 11. References . . . . . . . . . . . . . . . . . . . . . . . . . 30
 11.1. Normative References . . . . . . . . . . . . . . . . . . 30
 11.2. Informative References . . . . . . . . . . . . . . . . . 30
+ urn:ietf:params:xml:ns:geopriv:conf . . . . . . . . . . . 30
+ 8.2. XML Schema Registration . . . . . . . . . . . . . . . . . 30
+ 9. Security Considerations . . . . . . . . . . . . . . . . . . . 31
+ 10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 31
+ 11. References . . . . . . . . . . . . . . . . . . . . . . . . . 31
+ 11.1. Normative References . . . . . . . . . . . . . . . . . . 31
+ 11.2. Informative References . . . . . . . . . . . . . . . . . 31
Appendix A. Conversion Between Cartesian and Geodetic
 Coordinates in WGS84 . . . . . . . . . . . . . . . . 32
 Appendix B. Calculating the Upward Normal of a Polygon . . . . . 33
 B.1. Checking that a Polygon Upward Normal Points Up . . . . . 34
 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 34
+ Coordinates in WGS84 . . . . . . . . . . . . . . . . 33
+ Appendix B. Calculating the Upward Normal of a Polygon . . . . . 34
+ B.1. Checking that a Polygon Upward Normal Points Up . . . . . 35
+ Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 35
1. Introduction
Location information represents an estimation of the position of a
 Target. Under ideal circumstances, a location estimate precisely
 reflects the actual location of the Target. In reality, there are
 many factors that introduce errors into the measurements that are
 used to determine location estimates.
+ Target [RFC6280]. Under ideal circumstances, a location estimate
+ precisely reflects the actual location of the Target. For automated
+ systems that determine location, there are many factors that
+ introduce errors into the measurements that are used to determine
+ location estimates.
The process by which measurements are combined to generate a location
estimate is outside of the scope of work within the IETF. However,
the results of such a process are carried in IETF data formats and
protocols. This document outlines how uncertainty, and its
associated datum, confidence, are expressed and interpreted.
This document provides a common nomenclature for discussing
uncertainty and confidence as they relate to location information.
This document also provides guidance on how to manage location
information that includes uncertainty. Methods for expanding or
reducing uncertainty to obtain a required level of confidence are
described. Methods for determining the probability that a Target is
within a specified region based on their location estimate are
described. These methods are simplified by making certain
assumptions about the location estimate and are designed to be
 applicable to location estimates in a relatively small area.
+ applicable to location estimates in a relatively small geographic
+ area.
A confidence extension for the Presence Information Data Format -
Location Object (PIDF-LO) [RFC4119] is described.
+ This document describes methods that can be used in combination with
+ automatically determined location information. These are
+ statisticallybased methods.
+
1.1. Conventions and Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
This document assumes a basic understanding of the principles of
mathematics, particularly statistics and geometry.
 Some terminology is borrowed from [RFC3693] and [RFC6280].
+ Some terminology is borrowed from [RFC3693] and [RFC6280], in
+ particular Target.
Mathematical formulae are presented using the following notation: add
"+", subtract "-", multiply "*", divide "/", power "^" and absolute
value "|x|". Precedence is indicated using parentheses.
Mathematical functions are represented by common abbreviations:
square root "sqrt(x)", sine "sin(x)", cosine "cos(x)", inverse cosine
"acos(x)", tangent "tan(x)", inverse tangent "atan(x)", error
function "erf(x)", and inverse error function "erfinv(x)".
2. A General Definition of Uncertainty
@@ -158,54 +165,61 @@
possible values for the quantity.
A probability distribution describing a measured quantity can be
arbitrarily complex and so it is desirable to find a simplified
model. One approach commonly taken is to reduce the probability
distribution to a confidence interval. Many alternative models are
used in other areas, but study of those is not the focus of this
document.
In addition to the central estimate of the observed quantity, a
 confidence interval is succintly described by two values: an error
+ confidence interval is succinctly described by two values: an error
range and a confidence. The error range describes an interval and
the confidence describes an estimated upper bound on the probability
that a "true" value is found within the extents defined by the error.
In the following example, a measurement result for a length is shown
as a nominal value with additional information on error range (0.0043
meters) and confidence (95%).
e.g. x = 1.00742 +/- 0.0043 meters at 95% confidence
This result indicates that the value of "x" is between 1.00312 and
1.01172 meters with 95% probability. No
other assertion is made: in particular, this does not assert that x
is 1.00742.
 This document uses the term _uncertainty_ to refer in general to the
 concept as well as more specifically to refer to the error increment.

Uncertainty and confidence for location estimates can be derived in a
number of ways. This document does not attempt to enumerate the many
methods for determining uncertainty. [ISO.GUM] and [NIST.TN1297]
provide a set of general guidelines for determining and manipulating
measurement uncertainty. This document applies that general guidance
for consumers of location information.
+ As a statistical measure, values determined for uncertainty are
+ determined based on information in the aggregate, across numerous
+ individual estimates. An individual estimate might be determined to
+ be "correct" - by using a survey to validate the result, for example
+ - without invalidating the statistical assertion.
+
+ This understanding of estimates in the statistical sense explains why
+ asserting a confidence of 100%, which might seem intuitively correct,
+ is rarely advisable.
+
2.1. Uncertainty as a Probability Distribution
The Probability Density Function (PDF) that is described by
uncertainty indicates the probability that the "true" value lies at
any one point. The shape of the probability distribution can vary
depending on the method that is used to determine the result. The
 two probability density functions most generally applicable most
 applicable to location information are considered in this document:
+ two probability density functions most generally applicable to
+ location information are considered in this document:
o The normal PDF (also referred to as a Gaussian PDF) is used where
a large number of small random factors contribute to errors. The
value used for the error range in a normal PDF is related to the
standard deviation of the distribution.
o A rectangular PDF is used where the errors are known to be
consistent across a limited range. A rectangular PDF can occur
where a single error source, such as a rounding error, is
significantly larger than other errors. A rectangular PDF is
@@ -313,133 +327,141 @@
be an erroneous use of this term.
3. Uncertainty in Location
A _location estimate_ is the result of location determination. A
location estimate is subject to uncertainty like any other
observation. However, unlike a simple measure of a one dimensional
property like length, a location estimate is specified in two or
three dimensions.
 Uncertainty in 2 or 3dimensional locations can be described using
 confidence intervals. The confidence interval for a location
+ Uncertainty in two or three dimensional locations can be described
+ using confidence intervals. The confidence interval for a location
estimate in two or three dimensional space is expressed as a subset
of that space. This document uses the term _region of uncertainty_
to refer to the area or volume that describes the confidence
interval.
Areas or volumes that describe regions of uncertainty can be formed
by the combination of two or three onedimensional ranges, or more
 complex shapes could be described.
+ complex shapes could be described (for example, the shapes in
+ [RFC5491]).
3.1. Targets as Points in Space
This document makes a simplifying assumption that the Target of the
PIDFLO occupies just a single point in space. While this is clearly
false in virtually all scenarios with any practical application, it
 is often a reasonable assumption to make.
+ is often a reasonable simplifying assumption to make.
 To a large extent, whether this simplication is valid depends on the
 size of the target relative to the size of the uncertainty region.
 When locating a personal device using contemporary location
+ To a large extent, whether this simplification is valid depends on
+ the size of the target relative to the size of the uncertainty
+ region. When locating a personal device using contemporary location
determination techniques, the space the device occupies relative to
the uncertainty is proportionally quite small. Even where that
device is used as a proxy for a person, the proportions change
little.
 This assumption is less useful as the Target of the PIDFLO becomes
 large relative to the uncertainty region. For instance, describing
 the location of a football stadium or small country would include a
 region of uncertainty that is infinitesimally larger than the Target
 itself. In these cases, much of the guidance in this document is not
 applicable. Indeed, as the accuracy of location determination
 technology improves, it could be that the advice this document
 contains becomes less relevant by the same measure.
+ This assumption is less useful as uncertainty becomes small relative
+ to the size of the Target of the PIDF-LO (or conversely, as the
+ Target becomes large relative to the uncertainty). For instance,
+ describing the location of a football stadium or small country would
+ include a region of uncertainty that is infinitesimally larger than
+ the Target itself. In these cases, much of the guidance in this
+ document is not applicable. Indeed, as the accuracy of location
+ determination technology improves, it could be that the advice this
+ document contains becomes less relevant by the same measure.
3.2. Representation of Uncertainty and Confidence in PIDFLO
A set of shapes suitable for the expression of uncertainty in
location estimates in the Presence Information Data Format - Location
Object (PIDF-LO) are described in [GeoShape]. These shapes are the
recommended form for the representation of uncertainty in PIDF-LO
[RFC4119] documents.
 The PIDFLO does not include an indication of confidence, but that
 confidence is 95%, by definition in [RFC5491]. Similarly, the PIDF
 LO format does not provide an indication of the shape of the PDF.
 Section 4 defines elements to convey this information.
+ The PIDFLO can contain uncertainty, but does not include an
+ indication of confidence. [RFC5491] defines a fixed value of 95%.
+ Similarly, the PIDFLO format does not provide an indication of the
+ shape of the PDF. Section 4 defines elements to convey this
+ information in PIDFLO.
Absence of uncertainty information in a PIDFLO document does not
indicate that there is no uncertainty in the location estimate.
Uncertainty might not have been calculated for the estimate, or it
may be withheld for privacy purposes.
If the Point shape is used, confidence and uncertainty are unknown; a
receiver can either assume a confidence of 0% or infinite
uncertainty. The same principle applies on the altitude axis for
twodimension shapes like the Circle.
3.3. Uncertainty and Confidence for Civic Addresses
 Civic addresses [RFC5139] inherently include uncertainty, based on
 the area of the most precise element that is specified. Uncertainty
 is effectively defined by the presence or absence of elements 
 elements that are not present are deemed to be uncertain.
+ Automatically determined civic addresses [RFC5139] inherently include
+ uncertainty, based on the area of the most precise element that is
+ specified. In this case, uncertainty is effectively described by the
+ presence or absence of elements - elements that are not present are
+ deemed to be uncertain.
To apply the concept of uncertainty to civic addresses, it is helpful
to unify the conceptual models of civic address with geodetic
 location information.

 Note: This view is one perspective on the process of geocoding 
 the translation of a civic address to a geodetic location.
+ location information. This is particularly useful when considering
+ civic addresses that are determined using reverse geocoding (that is,
+ the process of translating geodetic information into civic
+ addresses).
In the unified view, a civic address defines a series of (sometimes
nonorthogonal) spatial partitions. The first is the implicit
partition that identifies the surface of the earth and the space near
the surface. The second is the country. Each label that is included
in a civic address provides information about a different set of
 spatial partitions. Some partions require slight adjustments from a
 standard interpretation: for instance, a road includes all properties
 that adjoin the street. Each label might need to be interpreted with
 other values to provide context.
+ spatial partitions. Some partitions require slight adjustments from
+ a standard interpretation: for instance, a road includes all
+ properties that adjoin the street. Each label might need to be
+ interpreted with other values to provide context.
As a value at each level is interpreted, one or more spatial
partitions at that level are selected, and all other partitions of
that type are excluded. For nonorthogonal partitions, only the
portion of the partition that fits within the existing space is
selected. This is what distinguishes King Street in Sydney from King
Street in Melbourne. Each defined element selects a partition of
space. The resulting location is the intersection of all selected
spaces.
 The resulting spatial partition can be considered to represent a
 region of uncertainty. At no stage does this process select a point;
 although, as spaces get smaller this distinction might have no
 practical significance and an approximation if a point could be used.
+ The resulting spatial partition can be considered as a region of
+ uncertainty.
+
+ Note: This view is a potential perspective on the process of
+ geocoding - the translation of a civic address to a geodetic
+ location.
Uncertainty in civic addresses can be increased by removing elements.
 This doesn't necessarily improve confidence in the same way that
 arbitrarily increasing uncertainty in a geodetic location doesn't
 increase confidence.
+ This does not increase confidence unless additional information is
+ used. Similarly, arbitrarily increasing uncertainty in a geodetic
+ location does not increase confidence.
3.4. DHCP Location Configuration Information and Uncertainty
Location information is often measured in two or three dimensions;
expressions of uncertainty in one dimension only are rare. The
 "resolution" parameters in [RFC3825] provide an indication of
 uncertainty in one dimension.
+ "resolution" parameters in [RFC6225] provide an indication of how
+ many bits of a number are valid, which could be interpreted as an
+ expression of uncertainty in one dimension.
 [RFC3825] defines a means for representing uncertainty, but a value
+ [RFC6225] defines a means for representing uncertainty, but a value
for confidence is not specified. A default value of 95% confidence
 can be assumed for the combination of the uncertainty on each axis.
 That is, the confidence of the resultant rectangular polygon or prism
 is 95%.
+ is assumed for the combination of the uncertainty on each axis. This
+ is consistent with the transformation of those forms into the
+ uncertainty representations from [RFC5491]. That is, the confidence
+ of the resultant rectangular polygon or prism is assumed to be 95%.
4. Representation of Confidence in PIDFLO
On the whole, a fixed definition for confidence is preferable,
primarily because it ensures consistency between implementations.
Location generators that are aware of this constraint can generate
location information at the required confidence. Location recipients
are able to make sensible assumptions about the quality of the
information that they receive.
@@ -454,69 +476,74 @@
previously unavailable to recipients of location information.
Without this information, a location server or generator that has
access to location information with a confidence lower than 95% has
two options:
o The location server can scale regions of uncertainty in an attempt
to achieve 95% confidence. This scaling process significantly
degrades the quality of the information, because the location
server might not have the necessary information to scale
appropriately; the location server is forced to make assumptions
 that are likely result in either an overly conservative estimate
 with high uncertainty or a overestimate of confidence.
+ that are likely to result in either an overly conservative
+ estimate with high uncertainty or an overestimate of confidence.
o The location server can ignore the confidence entirely, which
results in giving the recipient a false impression of its quality.
Both of these choices degrade the quality of the information
provided.
The addition of a confidence element avoids this problem entirely if
a location recipient supports and understands the element. A
 recipient that does not understand, and hence ignores, the confidence
 element is in no worse a position than if the location server ignored
 confidence.
+ recipient that does not understand - and hence ignores - the
+ confidence element is in no worse a position than if the location
+ server ignored confidence.
4.1. The "confidence" Element
The confidence element MAY be added to the "location-info" element of
the Presence Information Data Format - Location Object (PIDF-LO)
[RFC4119] document. This element expresses the confidence in the
 associated location information as a percentage.
+ associated location information as a percentage. A special "unknown"
+ value is reserved to indicate that confidence is supported, but not
+ known to the Location Generator.
The confidence element optionally includes an attribute that
indicates the shape of the probability density function (PDF) of the
associated region of uncertainty. Three values are possible:
unknown, normal and rectangular.
Indicating a particular PDF only indicates that the distribution
approximately fits the given shape based on the methods used to
generate the location information. The PDF is normal if there are a
large number of small, independent sources of error; rectangular if
all points within the area have roughly equal probability of being
the actual location of the Target; otherwise, the PDF MUST either be
set to unknown or omitted.
 If a PIDFLO does not include the confidence element, confidence is
 95% [RFC5491]. A Point shape does not have uncertainty (or it has
 infinite uncertainty), so confidence is meaningless for a point;
 therefore, this element MUST be omitted if only a point is provided.
+ If a PIDFLO does not include the confidence element, the confidence
+ of the location estimate is 95%, as defined in [RFC5491].
+
+ A Point shape does not have uncertainty (or it has infinite
+ uncertainty), so confidence is meaningless for a point; therefore,
+ this element MUST be omitted if only a point is provided.
4.2. Generating Locations with Confidence
Location generators SHOULD attempt to ensure that confidence is equal
in each dimension when generating location information. This
restriction, while not always practical, allows for more accurate
scaling, if scaling is necessary.
 Confidence MUST NOT be included unless location information cannot be
 acquired with 95% confidence.
+ A confidence element MUST be included with all location information
+ that includes uncertainty (that is, all forms other than a point). A
+ special "unknown" value MAY be used if confidence is not known.
4.3. Consuming and Presenting Confidence
The inclusion of confidence that is anything other than 95% presents
a potentially difficult usability problem for applications that use
location information. Effectively communicating the probability that
a location is incorrect to a user can be difficult.
It is inadvisable to simply display locations of any confidence, or
to display confidence in a separate or nonobvious fashion. If
@@ -593,21 +620,21 @@
estimate to a point. Different methods each make a set of
assumptions about the properties of the PDF and the selected point;
no one method is more "correct" than any other. For any given region
of uncertainty, selecting an arbitrary point within the area could be
considered valid; however, given the aforementioned problems with
point locations, a more rigorous approach is appropriate.
Given a result with a known distribution, selecting the point within
the area that has the highest probability is a more rigorous method.
Alternatively, a point could be selected that minimizes the overall
 error; that is, it minimises the expected value of the difference
+ error; that is, it minimizes the expected value of the difference
between the selected point and the "true" value.
If a rectangular distribution is assumed, the centroid of the area or
volume minimizes the overall error. Minimizing the error for a
normal distribution is mathematically complex. Therefore, this
document opts to select the centroid of the region of uncertainty
when selecting a point.
5.1.1. Centroid Calculation
@@ -627,24 +654,21 @@
The centroid of the ArcBand shape is found along a line that bisects
the arc. The centroid can be found at the following distance from
the starting point of the arcband (assuming an arcband with an
inner radius of "r", outer radius "R", start angle "a", and opening
angle "o"):
d = 4 * sin(o/2) * (R*R + R*r + r*r) / (3*o*(R + r))
This point can be found along the line that bisects the arc; that is,
 the line at an angle of "a + (o/2)". Negative values are possible if
 the angle of opening is greater than 180 degrees; negative values
 indicate that the centroid is found along the angle "a + (o/
 2) + 180".
+ the line at an angle of "a + (o/2)".
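
The arc-band centroid formula above can be sketched in Python as follows. This is a minimal illustration rather than part of the draft: the function name is hypothetical, and both angles are assumed to be in radians (the "o" in the denominator only makes sense in radians), so angles carried in degrees would need converting first.

```python
import math


def arcband_centroid(r, R, a, o):
    """Return (distance, bearing) of the arc-band centroid.

    r, R: inner and outer radius in the same length unit.
    a, o: start angle and opening angle, in radians (an assumption
          of this sketch, not a statement about the wire format).
    """
    if not (0 <= r <= R and R > 0):
        raise ValueError("require 0 <= r <= R and R > 0")
    if not (0 < o <= 2 * math.pi):
        raise ValueError("opening angle must be in (0, 2*pi]")
    # d = 4 * sin(o/2) * (R*R + R*r + r*r) / (3*o*(R + r))
    d = 4 * math.sin(o / 2) * (R * R + R * r + r * r) / (3 * o * (R + r))
    # The centroid lies on the line that bisects the arc.
    bearing = a + o / 2
    return d, bearing
```

For opening angles up to a full circle, sin(o/2) is never negative, so the distance returned by this sketch is never negative.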
5.1.1.2. Polygon Centroid
Calculating a centroid for the Polygon and Prism shapes is more
complex. Polygons that are specified using geodetic coordinates are
not necessarily coplanar. For Polygons that are specified without an
altitude, choose a value for altitude before attempting this process;
an altitude of 0 is acceptable.
The method described in this section is simplified by assuming
@@ -827,24 +851,24 @@
"C[2d]" is the confidence of the twodimensional shape and "C[3d]" is
the confidence of the threedimensional shape. For example, a Sphere
with a confidence of 95% can be simplified to a Circle of equal
radius with confidence of 96.6%.
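
The Sphere-to-Circle example can be checked numerically. The conversion formula itself falls outside this hunk, so the sketch below assumes the relation C[2d] = C[3d] ^ (2/3), which follows from the per-axis relation C[x] = Co ^ (1/n) used in Section 5.4.2 and reproduces the 96.6% figure. The function name is hypothetical.

```python
def confidence_3d_to_2d(c3d):
    """Confidence of the 2D shape obtained by dropping one axis.

    Assumes independent axes with equal per-axis confidence, so the
    per-axis value is c3d ** (1/3) and two axes give c3d ** (2/3).
    c3d is a fraction in (0, 1], not a percentage.
    """
    if not (0.0 < c3d <= 1.0):
        raise ValueError("confidence must be in (0, 1]")
    return c3d ** (2.0 / 3.0)
```

Under these assumptions, a 95% Sphere yields a Circle with roughly 96.6% confidence.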
5.4. Increasing and Decreasing Uncertainty and Confidence
The combination of uncertainty and confidence provides a great deal of
information about the nature of the data that is being measured. If
 both uncertainty, confidence and PDF are known, certain information
 can be extrapolated. In particular, the uncertainty can be scaled to
 meet a desired confidence or the confidence for a particular region
 of uncertainty can be found.
+ uncertainty, confidence and PDF are known, certain information can be
+ extrapolated. In particular, the uncertainty can be scaled to meet a
+ desired confidence or the confidence for a particular region of
+ uncertainty can be found.
In general, confidence decreases as the region of uncertainty
decreases in size and confidence increases as the region of
uncertainty increases in size. However, this depends on the PDF;
expanding the region of uncertainty for a rectangular distribution
has no effect on confidence without additional information. If the
region of uncertainty is increased during the process of obfuscation
(see [I-D.thomson-geopriv-location-obscuring]), then the confidence
cannot be increased.
@@ -857,92 +881,97 @@
This section makes the simplifying assumption that location
information is symmetrically and evenly distributed in each
dimension. This is not necessarily true in practice. If better
information is available, alternative methods might produce better
results.
5.4.1. Rectangular Distributions
Uncertainty that follows a rectangular distribution can only be
 decreased in size. Since the PDF is constant over the region of
 uncertainty, the resulting confidence is determined by the following
 formula:
+ decreased in size. Increasing uncertainty has no value, since it has
+ no effect on confidence. Since the PDF is constant over the region
+ of uncertainty, the resulting confidence is determined by the
+ following formula:
Cr = Co * Ur / Uo
Where "Uo" and "Ur" are the sizes of the original and reduced regions
of uncertainty (either the area or the volume of the region); "Co"
and "Cr" are the confidence values associated with each region.
Information is lost by decreasing the region of uncertainty for a
rectangular distribution. Once reduced in size, the uncertainty
region cannot subsequently be increased in size.
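
A minimal sketch of the rectangular-distribution rule, Cr = Co * Ur / Uo. The function name and argument order are illustrative only; sizes can be areas or volumes as long as both use the same unit, and confidence is a fraction rather than a percentage.

```python
def reduce_rectangular_confidence(co, uo, ur):
    """Confidence after shrinking a rectangular-PDF uncertainty region.

    co: confidence of the original region, as a fraction in (0, 1].
    uo: size (area or volume) of the original region.
    ur: size of the reduced region; must not exceed uo, because a
        rectangular distribution can only be decreased in size.
    """
    if not (0.0 < co <= 1.0):
        raise ValueError("confidence must be in (0, 1]")
    if not (0.0 < ur <= uo):
        raise ValueError("reduced region must be non-empty and no larger than the original")
    return co * ur / uo
```

For instance, halving the area of a region held at 95% confidence leaves 47.5% confidence.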
5.4.2. Normal Distributions
Uncertainty and confidence can be both increased and decreased for a
 normal distribution. However, the process is more complicated.
+ normal distribution. This calculation depends on the number of
+ dimensions of the uncertainty region.
For a normal distribution, uncertainty and confidence are related to
the standard deviation of the function. The following function
 defines the relationship between standard deviation, uncertainty and
+ defines the relationship between standard deviation, uncertainty, and
confidence along a single axis:
S[x] = U[x] / ( sqrt(2) * erfinv(C[x]) )
 Where "S[x]" is the standard deviation, "U[x]" is the uncertainty and
 "C[x]" is the confidence along a single axis. "erfinv" is the
+ Where "S[x]" is the standard deviation, "U[x]" is the uncertainty,
+ and "C[x]" is the confidence along a single axis. "erfinv" is the
inverse error function.
Scaling a normal distribution in two dimensions requires several
assumptions. Firstly, it is assumed that the distribution along each
 axis is independent. Secondly, the confidence for each axis is the
 same. Therefore, the confidence along each axis can be assumed to
 be:
+ axis is independent. Secondly, the confidence for each axis is
+ assumed to be the same. Therefore, the confidence along each axis
+ can be assumed to be:
C[x] = Co ^ (1/n)
Where "C[x]" is the confidence along a single axis and "Co" is the
overall confidence and "n" is the number of dimensions in the
uncertainty.
Therefore, to find the uncertainty for each axis at a desired
confidence, "Cd", apply the following formula:
Ud[x] <= U[x] * (erfinv(Cd ^ (1/n)) / erfinv(Co ^ (1/n)))
For regular shapes, this formula can be applied as a scaling factor
in each dimension to reach a required confidence.
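
The normal-distribution formulas above can be sketched with the Python standard library alone. The library has no "erfinv", so this sketch derives it from the inverse normal CDF using the identity erfinv(y) = InvPhi((1 + y) / 2) / sqrt(2). Function names are hypothetical, confidence values are fractions strictly between 0 and 1, and the scaling formula is applied as an equality.

```python
import math
from statistics import NormalDist


def erfinv(y):
    """Inverse error function for 0 < y < 1, via the inverse normal CDF."""
    return NormalDist().inv_cdf((1.0 + y) / 2.0) / math.sqrt(2.0)


def standard_deviation(u, c):
    """S[x] = U[x] / (sqrt(2) * erfinv(C[x])) along a single axis."""
    return u / (math.sqrt(2.0) * erfinv(c))


def scale_uncertainty(u, co, cd, n):
    """Per-axis uncertainty at desired confidence cd.

    u:  current per-axis uncertainty.
    co: current overall confidence; cd: desired overall confidence.
    n:  number of dimensions of the uncertainty region (1, 2 or 3).
    Assumes independent axes with equal per-axis confidence.
    """
    return u * erfinv(cd ** (1.0 / n)) / erfinv(co ** (1.0 / n))
```

Raising the desired confidence above the original enlarges the per-axis uncertainty, and lowering it shrinks the uncertainty, which matches the behavior the text describes for normal distributions.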
5.5. Determining Whether a Location is Within a Given Region
 A number of applications require that a judgement be made about
+ A number of applications require that a judgment be made about
whether a Target is within a given region of interest. Given a
 location estimate with uncertainty, this judgement can be difficult.
+ location estimate with uncertainty, this judgment can be difficult.
A location estimate represents a probability distribution, and the
true location of the Target cannot be definitively known. Therefore,
 the judgement relies on determining the probability that the Target
 is within the region.
+ the judgment relies on determining the probability that the Target is
+ within the region.
The probability that the Target is within a particular region is
found by integrating the PDF over the region. For a normal
distribution, there are no analytical methods that can be used to
determine the integral of the two- or three-dimensional PDF over an
arbitrary region. The complexity of numerical methods is also too
great to be useful in many applications; for example, finding the
integral of the PDF in two or three dimensions across the overlap
between the uncertainty region and the target region. If the PDF is
 unknown, no determination can be made. When judging whether a
 location is within a given region, uncertainties using these PDFs can
 be assumed to be rectangular. If this assumption is made, the
 confidence should be scaled to 95%, if possible.
+ unknown, no determination can be made without a simplifying
+ assumption.
+
+ When judging whether a location is within a given region, this
+ document assumes that uncertainties are rectangular. This introduces
+ errors, but simplifies the calculations significantly. Prior to
+ applying this assumption, confidence should be scaled to 95%.
Note: The selection of confidence has a significant impact on the
final result. Only use a different confidence if an uncertainty
value for 95% confidence cannot be found.
Given the assumption of a rectangular distribution, the probability
that a Target is found within a given region is found by first
finding the area (or volume) of overlap between the uncertainty
region and the region of interest. This is multiplied by the
confidence of the location estimate to determine the probability.
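A minimal sketch of this calculation follows. The paragraph above does not spell out the normalization; dividing the overlap by the total area of the uncertainty region is assumed here because a rectangular distribution spreads probability evenly over that region, and it matches the arithmetic used in Section 6.2. The function name and parameters are hypothetical.

```python
def probability_in_region(overlap_area: float,
                          uncertainty_area: float,
                          confidence: float = 0.95) -> float:
    """Probability that the Target lies in the region of interest.

    Assumes a rectangular (uniform) distribution over the uncertainty
    region, so the share of the confidence assigned to the overlap is
    proportional to its area (or volume).
    """
    if uncertainty_area <= 0.0:
        raise ValueError("uncertainty region must have a positive area")
    if not 0.0 <= overlap_area <= uncertainty_area:
        raise ValueError("overlap cannot exceed the uncertainty region")
    return confidence * overlap_area / uncertainty_area


# Values from Section 6.2: a 4566.2 square meter region that lies
# entirely inside a 12600 square meter estimate at 95% confidence.
print(round(probability_in_region(4566.2, 12600.0), 2))
```

The same function applies to volumes when the uncertainty region is three-dimensional.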
@@ 1029,21 +1058,21 @@
contained within the smaller polygon. Where the entire area of the
larger polygon is of interest, geodesic interpolation is necessary.
6. Examples
This section presents some examples of how to apply the methods
described in Section 5.
6.1. Reduction to a Point or Circle
 Alice receives a location estimate from her LIS that contains a
+ Alice receives a location estimate from her LIS that contains an
ellipsoidal region of uncertainty. This information is provided at
19% confidence with a normal PDF. A PIDF-LO extract for this
information is shown in Figure 8.
34.407242 150.882518 34
7.7156
@@ 1091,21 +1120,21 @@
Figure 9
To convert this to a polygon, each point is firstly assigned an
altitude of zero and converted to ECEF coordinates (see Appendix A).
Then a normal vector for this polygon is found (see Appendix B). The
 results of each of these stages is shown in Figure 10. Note that the
+ result of each of these stages is shown in Figure 10. Note that the
numbers shown are all rounded; no rounding should be applied during
this process, since rounding would contribute significant errors.
Polygon in ECEF coordinate space
(repeated point omitted and transposed to fit):
[ 4.6470e+06 2.5530e+06 3.5333e+06 ]
[ 4.6470e+06 2.5531e+06 3.5332e+06 ]
pecef = [ 4.6470e+06 2.5531e+06 3.5332e+06 ]
[ 4.6469e+06 2.5531e+06 3.5333e+06 ]
[ 4.6469e+06 2.5531e+06 3.5334e+06 ]
@@ 1145,56 +1174,61 @@
ignoring the altitude since the original shape did not include
altitude.
To convert this to a circle, take the maximum distance in ECEF
coordinates from the center point to each of the points. This
results in a radius of 99.1 meters. Confidence is unchanged.
6.2. Increasing and Decreasing Confidence
Assume that confidence is known to be 19% for Alice's location
 information. This is typical value for a threedimensional ellipsoid
 uncertainty of normal distribution where the standard deviation is
 supplied in each dimension. The confidence associated with Alice's
 location estimate is quite low for many applications. Since the
 estimate is known to follow a normal distribution, the method in
 Section 5.4.2 can be used. Each axis can be scaled by:
+ information. This is a typical value for a three-dimensional
+ ellipsoid uncertainty of normal distribution where the standard
+ deviation is used directly for uncertainty in each dimension. The
+ confidence associated with Alice's location estimate is quite low for
+ many applications. Since the estimate is known to follow a normal
+ distribution, the method in Section 5.4.2 can be used. Each axis can
+ be scaled by:
scale = erfinv(0.95^(1/3)) / erfinv(0.19^(1/3)) = 2.9937
Ensuring that rounding always increases uncertainty, the location
estimate at 95% includes a semi-major axis of 23.1, a semi-minor axis
of 10 and a vertical axis of 86.
 Bob's location estimate covers an area of approximately 12600 square
 meters. If the estimate follows a rectangular distribution, the
 region of uncertainty can be reduced in size. To find the confidence
 that he is within the smaller area of the concert hall, given by the
 polygon [33.856473, 151.215257; 33.856322, 151.214973;
+ Bob's location estimate (from the previous example) covers an area of
+ approximately 12600 square meters. If the estimate follows a
+ rectangular distribution, the region of uncertainty can be reduced in
+ size. Here we find the confidence that Bob is within the smaller
+ area of the concert hall. For the concert hall, the polygon
+ [33.856473, 151.215257; 33.856322, 151.214973;
33.856424, 151.21471; 33.857248, 151.214753;
 33.857413, 151.214941; 33.857311, 151.215128]. To use this new
 region of uncertainty, find its area using the same translation
 method described in Section 5.1.1.2, which is 4566.2 square meters.
 The confidence associated with the smaller area is therefore 95% *
 4566.2 / 12600 = 34%.
+ 33.857413, 151.214941; 33.857311, 151.215128] is used. To use this
+ new region of uncertainty, find its area using the same translation
+ method described in Section 5.1.1.2, which produces 4566.2 square
+ meters. Given that the concert hall is entirely within Bob's
+ original location estimate, the confidence associated with the
+ smaller area is therefore 95% * 4566.2 / 12600 = 34%.
6.3. Matching Location Estimates to Regions of Interest
 Suppose than a circular area is defined centered at
+ Suppose that a circular area is defined centered at
[33.872754, 151.20683] with a radius of 1950 meters. To determine
 whether Bob is found within this area, we apply the method in
 Section 5.5. Using the converted Circle shape for Bob's location,
 the distance between these points is found to be 1915.26 meters. The
 area of overlap between Bob's location estimate and the region of
 interest is therefore 2209 square meters and the area of Bob's
 location estimate is 30853 square meters. This gives the probability
 that Bob is less than 1950 meters from the selected point as 67.8%.
+ whether Bob is found within this area, given that Bob is at
+ [34.407242, 150.882518] with an uncertainty radius of 7.7156 meters,
+ we apply the method in Section 5.5. Using the converted Circle shape
+ for Bob's location, the distance between these points is found to be
+ 1915.26 meters. The area of overlap between Bob's location estimate
+ and the region of interest is therefore 2209 square meters and the
+ area of Bob's location estimate is 30853 square meters. This gives
+ the estimated probability that Bob is less than 1950 meters from the
+ selected point as 67.8%.
Note that if 1920 meters were chosen for the distance from the
selected point, the area of overlap is only 16196 square meters and
the confidence is 49.8%. Therefore, it is marginally more likely
that Bob is outside the region of interest, despite the center point
of his location estimate being within the region.
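The percentages in this example can be approximated with the sketch below. It is an illustration under stated assumptions rather than the draft's own procedure: both regions are treated as circles on a plane, the uncertainty radius of 99.1 meters is taken from the Circle derived in Section 6.1, and the function names are hypothetical.

```python
from math import acos, pi, sqrt


def circle_overlap_area(r1: float, r2: float, d: float) -> float:
    """Area shared by two planar circles whose centers are d apart."""
    if d >= r1 + r2:
        return 0.0                          # the circles do not touch
    if d <= abs(r1 - r2):
        return pi * min(r1, r2) ** 2        # smaller circle is contained
    # Sum of the two circular sectors minus the kite they both cover.
    sector1 = r1 * r1 * acos((d * d + r1 * r1 - r2 * r2) / (2.0 * d * r1))
    sector2 = r2 * r2 * acos((d * d + r2 * r2 - r1 * r1) / (2.0 * d * r2))
    kite = 0.5 * sqrt((-d + r1 + r2) * (d + r1 - r2)
                      * (d - r1 + r2) * (d + r1 + r2))
    return sector1 + sector2 - kite


def probability_within(radius_of_interest: float,
                       uncertainty_radius: float,
                       distance: float,
                       confidence: float = 0.95) -> float:
    """Rectangular-distribution probability that the Target is in range."""
    overlap = circle_overlap_area(uncertainty_radius, radius_of_interest,
                                  distance)
    return confidence * overlap / (pi * uncertainty_radius ** 2)


p_1950 = probability_within(1950.0, 99.1, 1915.26)
p_1920 = probability_within(1920.0, 99.1, 1915.26)
print(f"{p_1950:.3f} {p_1920:.3f}")
```

A planar approximation is reasonable here because the distances involved are small compared to the radius of the Earth; larger regions would need the geodesic treatment discussed in Section 5.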
6.4. PIDF-LO With Confidence Example
The PIDF-LO document in Figure 11 includes a representation of
@@ 1256,23 +1290,27 @@
@@ 1281,22 +1319,22 @@
8.1. URN Sub-Namespace Registration for
urn:ietf:params:xml:ns:geopriv:conf
This section registers a new XML namespace,
"urn:ietf:params:xml:ns:geopriv:conf", as per the guidelines in
[RFC3688].
URI: urn:ietf:params:xml:ns:geopriv:conf
 Registrant Contact: IETF, GEOPRIV working group,
 (geopriv@ietf.org), Martin Thomson (martin.thomson@andrew.com).
+ Registrant Contact: IETF, GEOPRIV working group, (geopriv@ietf.org),
+ Martin Thomson (martin.thomson@gmail.com).
XML:
BEGIN
PIDF-LO Confidence Attribute
@@ 1312,21 +1350,21 @@
END
8.2. XML Schema Registration
This section registers an XML schema as per the guidelines in
[RFC3688].
URI: urn:ietf:params:xml:schema:geopriv:conf
Registrant Contact: IETF, GEOPRIV working group, (geopriv@ietf.org),
 Martin Thomson (martin.thomson@andrew.com).
+ Martin Thomson (martin.thomson@gmail.com).
Schema: The XML for this schema can be found as the entirety of
Section 7 of this document.
9. Security Considerations
This document describes methods for managing and manipulating
uncertainty in location. No specific security concerns arise from
most of the information provided.
@@ 1375,41 +1413,37 @@
measurement (GUM)", Guide 98:1995, 1995.
[NIST.TN1297]
Taylor, B. and C. Kuyatt, "Guidelines for Evaluating and
Expressing the Uncertainty of NIST Measurement Results",
Technical Note 1297, Sep 1994.
[RFC3693] Cuellar, J., Morris, J., Mulligan, D., Peterson, J., and
J. Polk, "Geopriv Requirements", RFC 3693, February 2004.
 [RFC3694] Danley, M., Mulligan, D., Morris, J., and J. Peterson,
 "Threat Analysis of the Geopriv Protocol", RFC 3694,
 February 2004.

 [RFC3825] Polk, J., Schnizlein, J., and M. Linsner, "Dynamic Host
 Configuration Protocol Option for Coordinatebased
 Location Configuration Information", RFC 3825, July 2004.

[RFC5139] Thomson, M. and J. Winterbottom, "Revised Civic Location
Format for Presence Information Data Format Location
Object (PIDF-LO)", RFC 5139, February 2008.
[RFC5222] Hardie, T., Newton, A., Schulzrinne, H., and H.
Tschofenig, "LoST: A Location-to-Service Translation
Protocol", RFC 5222, August 2008.
[RFC5491] Winterbottom, J., Thomson, M., and H. Tschofenig, "GEOPRIV
Presence Information Data Format Location Object (PIDF-LO)
Usage Clarification, Considerations, and Recommendations",
RFC 5491, March 2009.
+ [RFC6225] Polk, J., Linsner, M., Thomson, M., and B. Aboba, "Dynamic
+ Host Configuration Protocol Options for Coordinate-Based
+ Location Configuration Information", RFC 6225, July 2011.
+
[RFC6280] Barnes, R., Lepinski, M., Cooper, A., Morris, J.,
Tschofenig, H., and H. Schulzrinne, "An Architecture for
Location and Location Privacy in Internet Applications",
BCP 160, RFC 6280, July 2011.
[Sunday02]
Sunday, D., "Fast polygon area and Newell normal
computation", Journal of Graphics Tools JGT,
7(2):9-13, 2002.
@@ 1459,23 +1493,23 @@
methods introduce some error in latitude and altitude. A range of
techniques are described in [Convert]. A variant on the method
originally proposed by Bowring, which results in an acceptably small
error, is described by the following:
p = sqrt(X^2 + Y^2)
r = sqrt(X^2 + Y^2 + Z^2)
u = atan((1-f) * Z * (1 + e'^2 * (1-f) * R / r) / p)
 latitude = atan((Z + e'^2 * (1-f) * R * sin(u)^3) /
 (p - e^2 * R * cos(u)^3))
+ latitude = atan((Z + e'^2 * (1-f) * R * sin(u)^3)
+ / (p - e^2 * R * cos(u)^3))
longitude = atan(Y / X)
altitude = sqrt((p - R * cos(u))^2 + (Z - (1-f) * R * sin(u))^2)
If the point is near the poles, that is "p < 1", the value for
altitude that this method produces is unstable. A simpler method for
determining the altitude of a point near the poles is:
altitude = |Z| - R * (1 - f)
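The conversion can be sketched as below. Several choices are assumptions made for the sketch and are not taken from this section: the WGS84 constants, the definition of "e'^2" as e^2 / (1 - e^2), the use of "atan2" in place of "atan" so that a zero denominator is handled, and the re-derivation of "u" from the final latitude before the altitude is computed. The forward conversion is the standard one and is included only so that a round trip can be checked.

```python
from math import atan, atan2, cos, radians, sin, sqrt, tan

R = 6378137.0               # WGS84 semi-major axis in meters (assumed)
F = 1.0 / 298.257223563     # WGS84 flattening (assumed)
E2 = F * (2.0 - F)          # first eccentricity squared, e^2
EP2 = E2 / (1.0 - E2)       # second eccentricity squared, e'^2 (assumed)


def geodetic_to_ecef(lat, lng, alt):
    """Standard forward conversion; angles in radians, lengths in meters."""
    n = R / sqrt(1.0 - E2 * sin(lat) ** 2)
    return ((n + alt) * cos(lat) * cos(lng),
            (n + alt) * cos(lat) * sin(lng),
            (n * (1.0 - E2) + alt) * sin(lat))


def ecef_to_geodetic(x, y, z):
    """Bowring-style conversion following the formulas above."""
    p = sqrt(x * x + y * y)
    r = sqrt(x * x + y * y + z * z)
    u = atan2((1.0 - F) * z * (1.0 + EP2 * (1.0 - F) * R / r), p)
    lat = atan2(z + EP2 * (1.0 - F) * R * sin(u) ** 3,
                p - E2 * R * cos(u) ** 3)
    lng = atan2(y, x)
    if p < 1.0:
        # Near the poles the general altitude expression is unstable.
        return lat, lng, abs(z) - R * (1.0 - F)
    # Deviation from the text: refresh u from the final latitude so the
    # altitude is measured from the foot point on the ellipsoid.
    u = atan((1.0 - F) * tan(lat))
    # This expression yields a magnitude; it does not distinguish points
    # below the ellipsoid from points above it.
    alt = sqrt((p - R * cos(u)) ** 2 + (z - (1.0 - F) * R * sin(u)) ** 2)
    return lat, lng, alt


lat0, lng0, alt0 = radians(34.407242), radians(150.882518), 34.0
lat1, lng1, alt1 = ecef_to_geodetic(*geodetic_to_ecef(lat0, lng0, alt0))
```

The round trip at the end converts a sample point to ECEF and back; the recovered values can be compared with the inputs to judge the error of the method for a given application.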
@@ 1539,22 +1574,21 @@
Up = [ cos(lat) * cos(lng) ; cos(lat) * sin(lng) ; sin(lat) ]
For polygons that span less than half the globe, any point in the
polygon, including the centroid, can be selected to generate an
approximate up vector for comparison with the upward normal.
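A short sketch of this comparison is shown below. The function names are hypothetical, and treating a positive dot product as "the normal points upward" is an assumption made for illustration; angles are in radians.

```python
from math import cos, radians, sin


def up_vector(lat: float, lng: float):
    """Unit vector pointing away from the ellipsoid at (lat, lng)."""
    return (cos(lat) * cos(lng), cos(lat) * sin(lng), sin(lat))


def is_upward(normal, lat: float, lng: float) -> bool:
    """True if the normal points to the same side of the surface as Up."""
    up = up_vector(lat, lng)
    return sum(n * u for n, u in zip(normal, up)) > 0.0


# Any point in the polygon can supply the reference position.
sample_lat, sample_lng = radians(34.407242), radians(150.882518)
sample_normal = up_vector(sample_lat, sample_lng)
print(is_upward(sample_normal, sample_lat, sample_lng))
```

If the test fails for a computed polygon normal, negating the normal gives the upward one.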
Authors' Addresses
Martin Thomson
Mozilla
 Suite 300
 650 Castro Street
+ 331 E Evelyn Street
Mountain View, CA 94041
US
Email: martin.thomson@gmail.com
James Winterbottom
Unaffiliated
AU
Email: a.james.winterbottom@gmail.com