Diff: draft-ietf-mops-ar-use-case-02.txt vs. draft-ietf-mops-ar-use-case-03.txt
MOPS                                                          R. Krishna
Internet-Draft                               InterDigital Europe Limited
Intended status: Informational                                 A. Rahman
Expires: April 28, 2022                InterDigital Communications, LLC
                                                        October 25, 2021

 Media Operations Use Case for an Augmented Reality Application on Edge
                        Computing Infrastructure
                      draft-ietf-mops-ar-use-case-03
Abstract

   A use case describing transmission of an application on the Internet
   that has several unique characteristics of Augmented Reality (AR)
   applications is presented for the consideration of the Media
   Operations (MOPS) Working Group.  One key requirement identified is
   that the Adaptive-Bit-Rate (ABR) algorithms' current usage of
   policies based on heuristics and models is inadequate for AR
   applications running on the Edge Computing infrastructure.
   [...]
   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."
   This Internet-Draft will expire on April 28, 2022.
Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   [...]
   described in the Simplified BSD License.
Table of Contents

   1.  Introduction
   2.  Conventions used in this document
   3.  Use Case
     3.1.  Processing of Scenes
     3.2.  Generation of Images
   4.  Requirements
   5.  AR Network Traffic and Interaction with TCP
   6.  Informative References
   Authors' Addresses
1.  Introduction
   The MOPS draft, [I-D.ietf-mops-streaming-opcons], provides an
   overview of operational networking issues that pertain to Quality of
   Experience (QoE) in delivery of video and other high-bitrate media
   over the Internet.  However, as it does not cover the increasingly
   large number of applications with Augmented Reality (AR)
   characteristics and their requirements on ABR algorithms, the
   discussion in this draft complements the overview presented in that
   [...]
   deployments [ABR_1].
2.  Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].
3.  Use Case
   We now describe a use case that involves an application with AR
   systems' characteristics.  Consider a group of tourists who are being
   conducted on a tour around the historical site of the Tower of
   London.  As they move around the site and within the historical
   buildings, they can watch and listen to historical scenes in 3D that
   are generated by the AR application and then overlaid by their AR
   headsets onto their real-world view.  The headset then continuously
   updates their view as they move around.
   The AR application first processes the scene that the walking tourist
   is watching in real-time and identifies objects that will be targeted
   for overlay of high resolution videos.  It then generates high
   resolution 3D images of historical scenes related to the perspective
   of the tourist in real-time.  These generated video images are then
   overlaid on the view of the real-world as seen by the tourist.

   We now discuss this processing of scenes and generation of high
   resolution images in greater detail.
3.1.  Processing of Scenes
   The task of processing a scene can be broken down into a pipeline of
   three consecutive subtasks: tracking, followed by acquisition of a
   model of the real world, and finally registration [AUGMENTED].

   Tracking: This includes tracking of the three-dimensional coordinates
   and six-dimensional pose (coordinates and orientation) of objects in
   the real world [AUGMENTED].  The AR application that runs on the
   mobile device needs to track the pose of the user's head, eyes and
   the objects that are in view.  This requires tracking natural
   features that are then used in the next stage of the pipeline.

   Acquisition of a model of the real world: The tracked natural
   features are used to develop an annotated point-cloud-based model
   that is then stored in a database.  To ensure that this database can
   be scaled up, techniques such as combining client-side simultaneous
   tracking and mapping with server-side localization are used [SLAM_1],
   [SLAM_2], [SLAM_3], [SLAM_4].

   Registration: The coordinate systems, brightness, and color of
   virtual and real objects need to be aligned in a process called
   registration [REG].  Once the natural features are tracked as
   discussed above, virtual objects are geometrically aligned with those
   features by geometric registration.  This is followed by resolving
   occlusion that can occur between the virtual and the real objects
   [OCCL_1], [OCCL_2].  The AR application also applies photometric
   registration [PHOTO_REG] by aligning the brightness and color between
   the virtual and real objects.  Additionally, algorithms that
   calculate global illumination of both the virtual and real objects
   [GLB_ILLUM_1], [GLB_ILLUM_2] are executed.  Various algorithms to
   deal with artifacts generated by lens distortion [LENS_DIST], blur
   [BLUR], noise [NOISE], etc. are also required.
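   As a rough illustration, the three subtasks above can be arranged as
   a pipeline.  The following Python sketch uses hypothetical names and
   trivially simplified stand-ins for each stage; it is not drawn from
   any AR framework.

```python
from dataclasses import dataclass

@dataclass
class Pose:
    # Six-dimensional pose: 3-D position plus orientation.
    x: float = 0.0
    y: float = 0.0
    z: float = 0.0
    roll: float = 0.0
    pitch: float = 0.0
    yaw: float = 0.0

def track(frame_points):
    """Stage 1 (tracking): detect natural features and estimate a pose.
    A real system would run a feature detector; here each input point
    simply becomes a feature."""
    features = [{"id": i, "pos": p} for i, p in enumerate(frame_points)]
    head_pose = Pose()  # placeholder estimate
    return features, head_pose

def acquire_model(features, point_cloud_db):
    """Stage 2 (model acquisition): fold tracked features into an
    annotated point-cloud model (a plain dict standing in for the
    scalable client/server SLAM database)."""
    for f in features:
        point_cloud_db.setdefault(f["id"], []).append(f["pos"])
    return point_cloud_db

def register(features, virtual_objects):
    """Stage 3 (registration): geometrically anchor each virtual object
    to a tracked feature; photometric alignment is omitted."""
    return [{"object": v, "anchor": f["pos"]}
            for v, f in zip(virtual_objects, features)]

def process_scene(frame_points, point_cloud_db, virtual_objects):
    features, _pose = track(frame_points)
    acquire_model(features, point_cloud_db)
    return register(features, virtual_objects)
```

   Each call feeds the next stage, mirroring the tracking, model
   acquisition, and registration order described above.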
3.2.  Generation of Images
   The AR application must generate a high-quality video that has the
   properties described in the previous step and overlay the video on
   the AR device's display, a step called situated visualization.  This
   entails dealing with registration errors that may arise, ensuring
   that there is no visual interference [VIS_INTERFERE], and finally
   maintaining temporal coherence by adapting to the movement of the
   user's eyes and head.
4.  Requirements
   The components of AR applications perform tasks such as real-time
   generation and processing of high-quality video content that are
   computationally intensive.  As a result, on AR devices such as AR
   glasses, excessive heat is generated by the chip-sets involved in the
   computation [DEV_HEAT_1], [DEV_HEAT_2].  Additionally, the battery on
   such devices discharges quickly when running such applications
   [BATT_DRAIN].
   A solution to the heat dissipation and battery drainage problem is to
   offload the processing and video generation tasks to the remote
   cloud.  However, running such tasks on the cloud is not feasible, as
   the end-to-end delays must be within the order of a few milliseconds.
   Additionally, such applications require high bandwidth and low jitter
   to provide a high QoE to the user.  In order to achieve such hard
   timing constraints, computationally intensive tasks can instead be
   offloaded to Edge devices.
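   A back-of-the-envelope latency budget illustrates why the Edge,
   rather than a distant cloud, can meet such constraints.  All numbers
   below are illustrative assumptions, not measurements.

```python
def offload_feasible(uplink_ms, processing_ms, downlink_ms, budget_ms=20.0):
    """Check whether offloading a rendering task fits the end-to-end
    delay budget (uplink + remote processing + downlink)."""
    return uplink_ms + processing_ms + downlink_ms <= budget_ms

# A distant cloud adds tens of milliseconds of propagation delay each
# way, while a nearby Edge node keeps the round trip to a few ms.
cloud_ok = offload_feasible(uplink_ms=40.0, processing_ms=5.0, downlink_ms=40.0)
edge_ok = offload_feasible(uplink_ms=4.0, processing_ms=5.0, downlink_ms=4.0)
```

   Under these assumed numbers only the Edge placement stays within the
   budget, which is the motivation for Edge offloading stated above.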
   Another requirement for our use case and similar applications such as
   360 degree streaming is that the display on the AR/VR device should
   [...]
   limited on the AR device.  The ABR algorithm must be able to
   handle this situation.
   o  Handling side effects of deciding a specific bit rate: For
      example, selecting a bit rate of a particular value might result
      in the ABR algorithm not changing to a different rate, so as to
      ensure a non-fluctuating bit rate and the resultant smoothness of
      video quality.  The ABR algorithm must be able to handle this
      situation.
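   The trade-off in the last bullet can be made concrete with a toy rate
   selector that penalizes quality switches.  The utility function and
   its weights are hypothetical, not taken from any deployed ABR
   algorithm.

```python
def select_bitrate(throughput_kbps, ladder_kbps, last_kbps,
                   switch_penalty=0.35):
    """Pick the highest sustainable rung of the bit-rate ladder while
    discounting any rung that forces a quality switch."""
    def utility(rate):
        if rate > throughput_kbps:       # would stall: never select
            return float("-inf")
        score = rate / max(ladder_kbps)  # normalized quality term
        if last_kbps is not None and rate != last_kbps:
            score -= switch_penalty      # smoothness term: switch cost
        return score
    return max(ladder_kbps, key=utility)

ladder = [1000, 2500, 5000, 8000]
sticky = select_bitrate(5200, ladder, last_kbps=2500)   # stays at 2500
upgrade = select_bitrate(8200, ladder, last_kbps=2500)  # jumps to 8000
```

   A modest throughput gain does not overcome the switch penalty, so the
   rate stays put; a large gain does, which is exactly the side effect
   the bullet asks the ABR algorithm to handle.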
5.  AR Network Traffic and Interaction with TCP
   In addition to the requirements for ABR algorithms, there are other
   operational issues that need to be considered for AR use cases such
   as the one described above.  In a study [AR_TRAFFIC] conducted to
   characterize multi-user AR over cellular networks, the following
   issues were identified:

   o  The uploading of data from an AR device to a remote server for
      processing dominates the end-to-end latency.

   o  A lack of visual features in the grid environment can cause
      increased latencies, as the AR device uploads additional visual
      data for processing to the remote server.

   o  AR applications tend to have large bursts that are separated by
      significant time gaps.  As a result, the TCP congestion window
      enters slow start before the large bursts of data arrive,
      increasing the perceived user latency.  The study [AR_TRAFFIC]
      shows that segmentation latency at the RLC (Radio Link Control)
      layer of the 4G LTE (Long Term Evolution) RAN (Radio Access
      Network) impacts TCP's performance during slow start.
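   The third bullet can be sketched with a toy model of congestion-
   window restart after an idle gap, in the spirit of RFC 5681 slow
   start.  The segment counts and window sizes below are illustrative
   assumptions.

```python
def rtts_to_deliver(burst_segments, init_cwnd=10, ssthresh=64):
    """Count round trips needed to deliver a burst when the congestion
    window has collapsed back to its initial value after an idle
    period: exponential slow-start growth up to ssthresh, then linear
    congestion-avoidance growth."""
    cwnd, sent, rtts = init_cwnd, 0, 0
    while sent < burst_segments:
        sent += cwnd          # segments delivered this round trip
        rtts += 1
        cwnd = cwnd * 2 if cwnd < ssthresh else cwnd + 1
    return rtts

# After an idle gap the window restarts small, so a 200-segment burst
# needs several round trips; a warm connection could send it in one.
cold = rtts_to_deliver(200)                  # restarts in slow start
warm = rtts_to_deliver(200, init_cwnd=200)   # window already open
```

   Each extra round trip adds one RTT of delay to the burst, which is
   how the repeated slow-start phases inflate the perceived user
   latency for bursty AR traffic.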
6.  Informative References
   [ABR_1]    Mao, H., Netravali, R., and M. Alizadeh, "Neural Adaptive
              Video Streaming with Pensieve", In Proceedings of the
              Conference of the ACM Special Interest Group on Data
              Communication, pp. 197-210, 2017.

   [ABR_2]    Yan, F., Ayers, H., Zhu, C., Fouladi, S., Hong, J., Zhang,
              K., Levis, P., and K. Winstein, "Learning in situ: a
              randomized experiment in video streaming", In 17th USENIX
              Symposium on Networked Systems Design and Implementation
              (NSDI 20), pp. 495-511, 2020.

   [AR_TRAFFIC]
              Apicharttrisorn, K., Balasubramanian, B., Chen, J.,
              Sivaraj, R., Tsai, Y., Jana, R., Krishnamurthy, S., Tran,
              T., and Y. Zhou, "Characterization of Multi-User Augmented
              Reality over Cellular Networks", In 17th Annual IEEE
              International Conference on Sensing, Communication, and
              Networking (SECON), pp. 1-9, IEEE, 2020.

   [AUGMENTED]
              Schmalstieg, D. and T. Hollerer, "Augmented Reality",
              Addison Wesley, 2016.

   [BATT_DRAIN]
              Seneviratne, S., Hu, Y., Nguyen, T., Lan, G., Khalifa, S.,
              Thilakarathna, K., Hassan, M., and A. Seneviratne, "A
              survey of wearable devices and challenges", In IEEE
              Communication Surveys and Tutorials, 19(4), pp. 2573-2620,
              2017.

   [BLUR]     Kan, P. and H. Kaufmann, "Physically-Based Depth of Field
              in Augmented Reality", In Eurographics (Short Papers),
              pp. 89-92, 2012.
   [...]
              infrastructure, traffic and applications", John Wiley and
              Sons Inc., 2006.

   [HEAVY_TAIL_2]
              Taleb, N., "The Statistical Consequences of Fat Tails",
              STEM Academic Press, 2020.
   [I-D.ietf-mops-streaming-opcons]
              Holland, J., Begen, A., and S. Dawkins, "Operational
              Considerations for Streaming Media", draft-ietf-mops-
              streaming-opcons-07 (work in progress), September 2021.

   [LENS_DIST]
              Fuhrmann, A. and D. Schmalstieg, "Practical calibration
              procedures for augmented reality", In Virtual Environments
              2000, pp. 3-12, Springer, Vienna, 2000.

   [NOISE]    Fischer, J., Bartz, D., and W. Strasser, "Enhanced visual
              realism by incorporating camera image effects", In
              IEEE/ACM International Symposium on Mixed and Augmented
              Reality, pp. 205-208, 2006.
   [...]
              lighting variations for augmented reality with moving
              cameras", In IEEE Transactions on Visualization and
              Computer Graphics, 18(4), pp. 573-580, 2012.

   [PREDICT]  Buker, T., Vincenzi, D., and J. Deaton, "The effect of
              apparent latency on simulator sickness while using a see-
              through helmet-mounted display: Reducing apparent latency
              with predictive compensation", In Human Factors 54.2,
              pp. 235-249, 2012.

   [REG]      Holloway, R., "Registration error analysis for augmented
              reality", In Presence: Teleoperators and Virtual
              Environments 6.4, pp. 413-432, 1997.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/info/rfc2119>.

   [SLAM_1]   Ventura, J., Arth, C., Reitmayr, G., and D. Schmalstieg,
              "A minimal solution to the generalized pose-and-scale
              problem", In Proceedings of the IEEE Conference on
              Computer Vision and Pattern Recognition, pp. 422-429,
              2014.
   [...]
   United Kingdom

   Email: renan.krishna@interdigital.com


   Akbar Rahman
   InterDigital Communications, LLC
   1000 Sherbrooke Street West
   Montreal  H3A 3G4
   Canada

   Email: Akbar.Rahman@InterDigital.com