MOPS                                                        R. Krishna
Internet-Draft                             InterDigital Europe Limited
Intended status: Informational                               A. Rahman
Expires: April 28, 2022               InterDigital Communications, LLC
                                                      October 25, 2021
 Media Operations Use Case for an Augmented Reality Application on Edge
                        Computing Infrastructure
                     draft-ietf-mops-ar-use-case-03
Abstract
A use case describing transmission of an application on the Internet
that has several unique characteristics of Augmented Reality (AR)
applications is presented for the consideration of the Media
Operations (MOPS) Working Group.  One key requirement identified is
that the Adaptive-Bit-Rate (ABR) algorithms' current usage of
policies based on heuristics and models is inadequate for AR
applications running on the Edge Computing infrastructure.
Status of This Memo

This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF).  Note that other groups may also distribute
working documents as Internet-Drafts.  The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time.  It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on April 28, 2022.
Copyright Notice
Copyright (c) 2021 IETF Trust and the persons identified as the
document authors.  All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of
publication of this document.  Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document.  Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
2.  Conventions used in this document . . . . . . . . . . . . . .   3
3.  Use Case  . . . . . . . . . . . . . . . . . . . . . . . . . .   3
  3.1.  Processing of Scenes  . . . . . . . . . . . . . . . . . .   3
  3.2.  Generation of Images  . . . . . . . . . . . . . . . . . .   4
4.  Requirements  . . . . . . . . . . . . . . . . . . . . . . . .   4
5.  AR Network Traffic and Interaction with TCP . . . . . . . . .   6
6.  Informative References  . . . . . . . . . . . . . . . . . . .   7
Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  10
1.  Introduction
The MOPS draft, [I-D.ietf-mops-streaming-opcons], provides an
overview of operational networking issues that pertain to Quality of
Experience (QoE) in delivery of video and other high-bitrate media
over the Internet.  However, as it does not cover the increasingly
large number of applications with Augmented Reality (AR)
characteristics and their requirements on ABR algorithms, the
discussion in this draft complements the overview presented in that
draft.
deployments [ABR_1].
2.  Conventions used in this document
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
3.  Use Case
We now describe a use case that involves an application with AR
systems' characteristics.  Consider a group of tourists who are
being conducted on a tour around the historical site of the Tower of
London.  As they move around the site and within the historical
buildings, they can watch and listen to historical scenes in 3D that
are generated by the AR application and then overlaid by their AR
headsets onto their real-world view.  The headset then continuously
updates their view as they move around.
The AR application first processes the scene that the walking
tourist is watching in real-time and identifies objects that will be
targeted for overlay of high-resolution videos.  It then generates
high-resolution 3D images of historical scenes related to the
perspective of the tourist in real-time.  These generated video
images are then overlaid on the view of the real world as seen by
the tourist.
We now discuss this processing of scenes and generation of
high-resolution images in greater detail.
3.1.  Processing of Scenes
The task of processing a scene can be broken down into a pipeline of
three consecutive subtasks, namely tracking, followed by acquisition
of a model of the real world, and finally registration [AUGMENTED].
Tracking: This includes tracking of the three-dimensional
coordinates and six-dimensional pose (coordinates and orientation)
of objects in the real world [AUGMENTED].  The AR application that
runs on the mobile device needs to track the pose of the user's
head, eyes and the objects that are in view.  This requires tracking
natural features that are then used in the next stage of the
pipeline.
Acquisition of a model of the real world: The tracked natural
features are used to develop an annotated point-cloud-based model
that is then stored in a database.  To ensure that this database can
be scaled up, techniques such as combining client-side simultaneous
tracking and mapping with server-side localization are used
[SLAM_1], [SLAM_2], [SLAM_3], [SLAM_4].
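
As an illustration of the client/server split described above, the
following Python sketch shows one possible division of labor.  All
class, method, and field names are hypothetical, and the systems
described in [SLAM_1], [SLAM_2], [SLAM_3], and [SLAM_4] are far more
elaborate:

   # Sketch: split SLAM between the AR client and an edge server, per
   # the client-side tracking plus server-side localization approach
   # cited above.  All names and data shapes are illustrative.

   from dataclasses import dataclass, field

   @dataclass
   class Keyframe:
       frame_id: int
       features: list       # descriptors of tracked natural features

   @dataclass
   class PointCloudModel:
       points: dict = field(default_factory=dict)  # annotated points

   class ARClient:
       """Headset side: lightweight tracking and mapping."""
       def __init__(self):
           self.local_map = PointCloudModel()

       def track(self, frame_id, raw_features):
           # Track natural features frame to frame; keep a small
           # local map and emit only keyframes for the network.
           self.local_map.points[frame_id] = raw_features
           return Keyframe(frame_id, raw_features)

   class LocalizationServer:
       """Edge side: holds the scalable annotated point-cloud DB."""
       def __init__(self):
           self.database = PointCloudModel()

       def localize(self, keyframe):
           # Match keyframe features against the global model and
           # return a (mock) pose: x, y, z, roll, pitch, yaw.
           self.database.points[keyframe.frame_id] = keyframe.features
           return (0.0, 0.0, 0.0, 0.0, 0.0, 0.0)

   client, server = ARClient(), LocalizationServer()
   kf = client.track(frame_id=1, raw_features=["corner_a", "edge_b"])
   print("estimated pose:", server.localize(kf))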
Registration: The coordinate systems, brightness, and color of
virtual and real objects need to be aligned in a process called
registration [REG].  Once the natural features are tracked as
discussed above, virtual objects are geometrically aligned with
those features by geometric registration.  This is followed by
resolving occlusion that can occur between the virtual and the real
objects [OCCL_1], [OCCL_2].  The AR application also applies
photometric registration [PHOTO_REG] by aligning the brightness and
color between the virtual and real objects.  Additionally,
algorithms that calculate global illumination of both the virtual
and real objects [GLB_ILLUM_1], [GLB_ILLUM_2] are executed.  Various
algorithms to deal with artifacts generated by lens distortion
[LENS_DIST], blur [BLUR], noise [NOISE], etc. are also required.
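
The ordering of these registration steps can be summarized in a
short sketch.  Every function below is a stub that stands in for the
algorithms cited above; the names and data shapes are invented:

   # Sketch: the registration stages in the order described above.

   def geometric_registration(obj, tracked_pose):
       # Align the virtual object's coordinates with tracked features.
       obj["pose"] = tracked_pose
       return obj

   def resolve_occlusion(obj, real_depth):
       # Hide virtual geometry that sits behind real-world geometry.
       obj["visible"] = obj["pose"][2] < real_depth
       return obj

   def photometric_registration(obj, brightness, color):
       # Match brightness and color between virtual and real objects.
       obj["brightness"], obj["color"] = brightness, color
       return obj

   def compensate_artifacts(obj):
       # Placeholder for lens-distortion, blur, and noise handling.
       obj["artifact_corrected"] = True
       return obj

   scene = {"name": "historical_scene"}
   scene = geometric_registration(scene, tracked_pose=(1.0, 2.0, 3.0))
   scene = resolve_occlusion(scene, real_depth=5.0)
   scene = photometric_registration(scene, brightness=0.7, color="warm")
   scene = compensate_artifacts(scene)
   print(scene)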
3.2.  Generation of Images
The AR application must generate a high-quality video that has the
properties described in the previous step and overlay the video on
the AR device's display, a step called situated visualization.  This
entails dealing with registration errors that may arise, ensuring
that there is no visual interference [VIS_INTERFERE], and finally
maintaining temporal coherence by adapting to the movement of the
user's eyes and head.
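
A minimal sketch of the per-frame loop implied by this description
follows.  The registration-error metric, the two-pixel tolerance,
and the policy of dropping misaligned frames are assumptions made
for illustration:

   # Sketch: per-frame situated-visualization loop.

   MAX_REGISTRATION_ERROR_PX = 2.0   # assumed tolerance

   def render_frame(head_pose, eye_gaze, overlay):
       # A real renderer would rasterize the 3D overlay for this pose.
       return {"pose": head_pose, "gaze": eye_gaze, "overlay": overlay}

   def registration_error(frame):
       # Mock metric; real ones compare projected vs. tracked points.
       return 1.0

   def display_loop(pose_stream, overlay):
       for head_pose, eye_gaze in pose_stream:
           frame = render_frame(head_pose, eye_gaze, overlay)
           if registration_error(frame) > MAX_REGISTRATION_ERROR_PX:
               continue   # drop rather than show a misaligned frame
           yield frame    # temporal coherence: follow head/eye motion

   poses = [((0, 0, 0), (0, 0)), ((0, 1, 0), (0, 1))]
   for f in display_loop(poses, overlay="tower_scene_1550"):
       print("displayed frame at pose", f["pose"])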
4.  Requirements
The components of AR applications perform computationally intensive
tasks such as real-time generation and processing of high-quality
video content.  As a result, on AR devices such as AR glasses,
excessive heat is generated by the chip-sets that are involved in
the computation [DEV_HEAT_1], [DEV_HEAT_2].  Additionally, the
battery on such devices discharges quickly when running such
applications [BATT_DRAIN].
A solution to the heat dissipation and battery drainage problem is
to offload the processing and video generation tasks to the remote
cloud.  However, running such tasks on the cloud is not feasible as
the end-to-end delays must be on the order of a few milliseconds.
Additionally, such applications require high bandwidth and low
jitter to provide a high QoE to the user.  In order to achieve such
hard timing constraints, computationally intensive tasks can be
offloaded to Edge devices.
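
One way to reason about this placement decision is sketched below.
The latency budget, round-trip times, and processing-time estimates
are invented for illustration; a real system would measure these
quantities:

   # Sketch: pick the execution site with the most compute whose
   # round trip still fits the latency budget.  Numbers are invented.

   LATENCY_BUDGET_MS = 20.0          # assumed end-to-end budget

   SITES = {                         # assumed round-trip times (ms)
       "on_device": 0.0,
       "edge": 5.0,
       "cloud": 80.0,
   }

   def place_task(processing_ms):
       """processing_ms: estimated task runtime at each site."""
       feasible = [
           name for name, rtt in SITES.items()
           if rtt + processing_ms[name] <= LATENCY_BUDGET_MS
       ]
       # Prefer the site with the most compute when several qualify.
       for preferred in ("cloud", "edge", "on_device"):
           if preferred in feasible:
               return preferred
       return "on_device"  # degrade locally rather than miss deadlines

   # The cloud is fastest at the task itself, but its round trip
   # blows the budget, so the task lands on the edge.
   print(place_task({"on_device": 30.0, "edge": 8.0, "cloud": 6.0}))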
Another requirement for our use case and similar applications such
as 360 degree streaming is that the display on the AR/VR device
should
limited on the AR device.  The ABR algorithm must be able to handle
this situation.
o  Handling side effects of deciding a specific bit rate: For
   example, selecting a bit rate of a particular value might result
   in the ABR algorithm not changing to a different rate so as to
   ensure a non-fluctuating bit rate and the resultant smoothness of
   video quality.  The ABR algorithm must be able to handle this
   situation; the sketch following this list illustrates the
   trade-off.
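
As a purely illustrative aid, the following sketch shows how such a
smoothness-driven lock-in can arise in a simple rate picker.  The
rate ladder, utility function, and SWITCH_PENALTY weight are all
invented for this example and are not taken from any deployed ABR
algorithm:

   # Sketch: rate selection with an explicit smoothness penalty.

   RATES_KBPS = [1000, 2500, 5000, 8000]
   SWITCH_PENALTY = 0.3   # cost per unit of normalized rate change

   def utility(rate, throughput_estimate):
       # Higher rates help until they exceed what the network carries.
       return min(rate, throughput_estimate) / max(RATES_KBPS)

   def select_rate(current_rate, throughput_estimate):
       def score(rate):
           smoothness_cost = (SWITCH_PENALTY *
                              abs(rate - current_rate) / max(RATES_KBPS))
           return utility(rate, throughput_estimate) - smoothness_cost
       return max(RATES_KBPS, key=score)

   # With a large enough SWITCH_PENALTY the algorithm stays at
   # 2500 kbps even when 8000 kbps is sustainable -- the lock-in
   # side effect described above.
   print(select_rate(current_rate=2500, throughput_estimate=9000))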
5.  AR Network Traffic and Interaction with TCP
In addition to the requirements for ABR algorithms, there are other
operational issues that need to be considered for AR use cases such
as the one described above. In a study [AR_TRAFFIC] conducted to
characterize multi-user AR over cellular networks, the following
issues were identified:
o The uploading of data from an AR device to a remote server for
processing dominates the end-to-end latency.
o A lack of visual features in the grid environment can cause
increased latencies as the AR device uploads additional visual
data for processing to the remote server.
o  AR applications tend to have large bursts that are separated by
   significant time gaps.  As a result, the TCP congestion window
   enters slow start before the large bursts of data arrive,
   increasing the perceived user latency.  The study [AR_TRAFFIC]
   shows that segmentation latency at the RLC (Radio Link Control)
   layer of the 4G LTE (Long Term Evolution) RAN (Radio Access
   Network) impacts TCP's performance during slow start; a toy
   illustration of this effect follows this list.
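
The following toy model illustrates the burst-gap interaction.  It
assumes the slow-start-after-idle behavior of RFC 5681 (Section
4.1), under which a sender that has been idle longer than the
retransmission timeout (RTO) restarts from the restart window; the
RTO, window size, and gap durations below are invented:

   # Toy model of slow start after idle (see RFC 5681, Section 4.1):
   # when a connection is idle longer than the RTO, cwnd collapses to
   # the restart window, so each AR burst starts slowly.

   RTO_S = 1.0          # assumed retransmission timeout
   IW_SEGMENTS = 10     # assumed initial/restart window

   def cwnd_after_idle(cwnd, idle_s):
       return min(IW_SEGMENTS, cwnd) if idle_s > RTO_S else cwnd

   cwnd = IW_SEGMENTS
   for burst, idle_gap_s in enumerate([0.1, 5.0, 8.0], start=1):
       cwnd = cwnd_after_idle(cwnd, idle_gap_s)
       print(f"burst {burst}: starting cwnd = {cwnd} segments")
       cwnd *= 8        # pretend the burst grew cwnd for three RTTs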
6. Informative References
[ABR_1]   Mao, H., Netravali, R., and M. Alizadeh, "Neural Adaptive
          Video Streaming with Pensieve", In Proceedings of the
          Conference of the ACM Special Interest Group on Data
          Communication, pp. 197-210, 2017.
[ABR_2]   Yan, F., Ayers, H., Zhu, C., Fouladi, S., Hong, J., Zhang,
          K., Levis, P., and K. Winstein, "Learning in situ: a
          randomized experiment in video streaming", In 17th USENIX
          Symposium on Networked Systems Design and Implementation
          (NSDI 20), pp. 495-511, 2020.
[AR_TRAFFIC]
Apicharttrisorn, K., Balasubramanian, B., Chen, J.,
Sivaraj, R., Tsai, Y., Jana, R., Krishnamurthy, S., Tran,
T., and Y. Zhou, "Characterization of Multi-User Augmented
Reality over Cellular Networks", In 17th Annual IEEE
International Conference on Sensing, Communication, and
Networking (SECON), pp. 1-9. IEEE, 2020.
[AUGMENTED]
Schmalstieg, D. and T. Hollerer, "Augmented
Reality", Addison Wesley, 2016.
[BATT_DRAIN]
          Seneviratne, S., Hu, Y., Nguyen, T., Lan, G., Khalifa, S.,
          Thilakarathna, K., Hassan, M., and A. Seneviratne, "A
          survey of wearable devices and challenges", In IEEE
          Communication Surveys and Tutorials, 19(4), pp. 2573-2620,
          2017.
[BLUR]    Kan, P. and H. Kaufmann, "Physically-Based Depth of Field
          in Augmented Reality", In Eurographics (Short Papers),
          pp. 89-92, 2012.
          infrastructure, traffic and applications", John Wiley and
          Sons Inc., 2006.
[HEAVY_TAIL_2]
          Taleb, N., "The Statistical Consequences of Fat Tails",
          STEM Academic Press, 2020.
[I-D.ietf-mops-streaming-opcons]
          Holland, J., Begen, A., and S. Dawkins, "Operational
          Considerations for Streaming Media", draft-ietf-mops-
          streaming-opcons-07 (work in progress), September 2021.
[LENS_DIST]
          Fuhrmann, A. and D. Schmalstieg, "Practical calibration
          procedures for augmented reality", In Virtual Environments
          2000, pp. 3-12, Springer, Vienna, 2000.
[NOISE]   Fischer, J., Bartz, D., and W. Strasser, "Enhanced visual
          realism by incorporating camera image effects", In IEEE/ACM
          International Symposium on Mixed and Augmented Reality,
          pp. 205-208, 2006.
          lighting variations for augmented reality with moving
          cameras", In IEEE Transactions on Visualization and
          Computer Graphics, 18(4), pp. 573-580, 2012.
[PREDICT] Buker, T., Vincenzi, D., and J. Deaton, "The effect of
          apparent latency on simulator sickness while using a see-
          through helmet-mounted display: Reducing apparent latency
          with predictive compensation", In Human Factors 54.2,
          pp. 235-249, 2012.
[REG]     Holloway, R., "Registration error analysis for augmented
          reality", In Presence: Teleoperators and Virtual
          Environments 6.4, pp. 413-432, 1997.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
          Requirement Levels", BCP 14, RFC 2119,
          DOI 10.17487/RFC2119, March 1997,
          <https://www.rfc-editor.org/info/rfc2119>.
[SLAM_1]  Ventura, J., Arth, C., Reitmayr, G., and D. Schmalstieg,
          "A minimal solution to the generalized pose-and-scale
          problem", In Proceedings of the IEEE Conference on Computer
          Vision and Pattern Recognition, pp. 422-429, 2014.
Authors' Addresses

Renan Krishna
InterDigital Europe Limited
United Kingdom
Email: renan.krishna@interdigital.com
Akbar Rahman
InterDigital Communications, LLC
1000 Sherbrooke Street West
Montreal H3A 3G4
Canada
Email: Akbar.Rahman@InterDigital.com