draft-ietf-opsawg-ntf-05.txt   draft-ietf-opsawg-ntf-06.txt 
OPSAWG H. Song OPSAWG H. Song
Internet-Draft Futurewei Internet-Draft Futurewei
Intended status: Informational F. Qin Intended status: Informational F. Qin
Expires: April 12, 2021 China Mobile Expires: July 25, 2021 China Mobile
P. Martinez-Julia P. Martinez-Julia
NICT NICT
L. Ciavaglia L. Ciavaglia
Nokia Nokia
A. Wang A. Wang
China Telecom China Telecom
October 9, 2020 January 21, 2021
Network Telemetry Framework Network Telemetry Framework
draft-ietf-opsawg-ntf-05 draft-ietf-opsawg-ntf-06
Abstract Abstract
Network telemetry is the technology for gaining network insight and Network telemetry is a technology for gaining network insight and
facilitating efficient and automated network management. It engages facilitating efficient and automated network management. It
various techniques for remote data collection, correlation, and encompasses various techniques for remote data generation,
consumption. This document provides an architectural framework for collection, correlation, and consumption. This document describes an
network telemetry, motivated by the network operation challenges and architectural framework for network telemetry, motivated by
requirements. As evidenced by some key characteristics and industry challenges that are encountered as part of the operation of networks
practices, network telemetry covers technologies and protocols beyond and by the requirements that ensue. Network telemetry, as
the conventional network Operations, Administration, and Management necessitated by best industry practices, covers technologies and
(OAM). It promises better flexibility, scalability, accuracy, protocols that extend beyond conventional network Operations,
coverage, and performance and allows automated control loops to suit Administration, and Management (OAM). The presented network
both today's and tomorrow's network operation. This document telemetry framework promises better flexibility, scalability,
clarifies the terminologies and classifies the modules and components accuracy, coverage, and performance. In addition, it facilitates the
of a network telemetry system from several different perspectives. implementation of automated control loops to address both today's and
The framework and taxonomy help to set a common ground for the tomorrow's network operational needs. This document clarifies the
collection of related work and provide guidance for related technique terminologies and classifies the modules and components of a network
and standard developments. telemetry system from several different perspectives. The framework
and taxonomy help to set a common ground for the collection of
related work and provide guidance for related technique and standard
developments.
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on April 12, 2021. This Internet-Draft will expire on July 25, 2021.
Copyright Notice Copyright Notice
Copyright (c) 2020 IETF Trust and the persons identified as the Copyright (c) 2021 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of (https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Background . . . . . . . . . . . . . . . . . . . . . . . . . 4 2. Background . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.1. Telemetry Data Coverage . . . . . . . . . . . . . . . . . 5 2.1. Telemetry Data Coverage . . . . . . . . . . . . . . . . . 5
2.2. Use Cases . . . . . . . . . . . . . . . . . . . . . . . . 5 2.2. Use Cases . . . . . . . . . . . . . . . . . . . . . . . . 5
2.3. Challenges . . . . . . . . . . . . . . . . . . . . . . . 6 2.3. Challenges . . . . . . . . . . . . . . . . . . . . . . . 7
2.4. Glossary . . . . . . . . . . . . . . . . . . . . . . . . 8 2.4. Glossary . . . . . . . . . . . . . . . . . . . . . . . . 8
2.5. Network Telemetry . . . . . . . . . . . . . . . . . . . . 9 2.5. Network Telemetry . . . . . . . . . . . . . . . . . . . . 10
3. The Necessity of a Network Telemetry Framework . . . . . . . 11 3. The Necessity of a Network Telemetry Framework . . . . . . . 12
4. Network Telemetry Framework . . . . . . . . . . . . . . . . . 13 4. Network Telemetry Framework . . . . . . . . . . . . . . . . . 13
4.1. Top Level Modules . . . . . . . . . . . . . . . . . . . . 13 4.1. Top Level Modules . . . . . . . . . . . . . . . . . . . . 13
4.1.1. Management Plane Telemetry . . . . . . . . . . . . . 16 4.1.1. Management Plane Telemetry . . . . . . . . . . . . . 17
4.1.2. Control Plane Telemetry . . . . . . . . . . . . . . . 16 4.1.2. Control Plane Telemetry . . . . . . . . . . . . . . . 17
4.1.3. Data Plane Telemetry . . . . . . . . . . . . . . . . 17 4.1.3. Forwarding Plane Telemetry . . . . . . . . . . . . . 18
4.1.4. External Data Telemetry . . . . . . . . . . . . . . . 19 4.1.4. External Data Telemetry . . . . . . . . . . . . . . . 20
4.2. Second Level Function Components . . . . . . . . . . . . 19 4.2. Second Level Function Components . . . . . . . . . . . . 20
4.3. Data Acquiring Mechanism and Type Abstraction . . . . . . 21 4.3. Data Acquiring Mechanism and Type Abstraction . . . . . . 22
4.4. Existing Works Mapped in the Framework . . . . . . . . . 23 4.4. Existing Works Mapped in the Framework . . . . . . . . . 24
5. Evolution of Network Telemetry . . . . . . . . . . . . . . . 24 5. Evolution of Network Telemetry . . . . . . . . . . . . . . . 26
6. Security Considerations . . . . . . . . . . . . . . . . . . . 25 6. Security Considerations . . . . . . . . . . . . . . . . . . . 26
7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 26 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 27
8. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 26 8. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 28
9. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 26 9. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 28
10. Informative References . . . . . . . . . . . . . . . . . . . 26 10. Informative References . . . . . . . . . . . . . . . . . . . 28
Appendix A. A Survey on Existing Network Telemetry Techniques . 30 Appendix A. A Survey on Existing Network Telemetry Techniques . 32
A.1. Management Plane Telemetry . . . . . . . . . . . . . . . 30 A.1. Management Plane Telemetry . . . . . . . . . . . . . . . 32
A.1.1. Push Extensions for NETCONF . . . . . . . . . . . . . 30 A.1.1. Push Extensions for NETCONF . . . . . . . . . . . . . 32
A.1.2. gRPC Network Management Interface . . . . . . . . . . 31 A.1.2. gRPC Network Management Interface . . . . . . . . . . 33
A.2. Control Plane Telemetry . . . . . . . . . . . . . . . . . 31 A.2. Control Plane Telemetry . . . . . . . . . . . . . . . . . 33
A.2.1. BGP Monitoring Protocol . . . . . . . . . . . . . . . 31 A.2.1. BGP Monitoring Protocol . . . . . . . . . . . . . . . 33
A.3. Data Plane Telemetry . . . . . . . . . . . . . . . . . . 32 A.3. Data Plane Telemetry . . . . . . . . . . . . . . . . . . 34
A.3.1. The Alternate Marking technology . . . . . . . . . . 32 A.3.1. The Alternate Marking technology . . . . . . . . . . 34
A.3.2. Dynamic Network Probe . . . . . . . . . . . . . . . . 33 A.3.2. Dynamic Network Probe . . . . . . . . . . . . . . . . 35
A.3.3. IP Flow Information Export (IPFIX) protocol . . . . . 33 A.3.3. IP Flow Information Export (IPFIX) protocol . . . . . 35
A.3.4. In-Situ OAM . . . . . . . . . . . . . . . . . . . . . 34 A.3.4. In-Situ OAM . . . . . . . . . . . . . . . . . . . . . 35
A.3.5. Postcard Based Telemetry . . . . . . . . . . . . . . 34 A.3.5. Postcard Based Telemetry . . . . . . . . . . . . . . 36
A.4. External Data and Event Telemetry . . . . . . . . . . . . 34 A.4. External Data and Event Telemetry . . . . . . . . . . . . 36
A.4.1. Sources of External Events . . . . . . . . . . . . . 34 A.4.1. Sources of External Events . . . . . . . . . . . . . 36
A.4.2. Connectors and Interfaces . . . . . . . . . . . . . . 36 A.4.2. Connectors and Interfaces . . . . . . . . . . . . . . 37
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 36 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 38
1. Introduction 1. Introduction
Network visibility is the ability of management tools to see the Network visibility is the ability of management tools to see the
state and behavior of a network. It is essential for successful state and behavior of a network, which is essential for successful
network operation. Network telemetry is the process of measuring, network operation. Network Telemetry revolves around network data
correlating, recording, and distributing information about the that can help provide insights about the current state of the
behavior of a network. Network telemetry has been considered as an network, including network devices, forwarding, control, and
ideal means to gain sufficient network visibility with better management planes, and that can be generated and obtained through a
flexibility, scalability, accuracy, coverage, and performance than variety of techniques, including but not limited to network
some conventional network Operations, Administration, and Management instrumentation and measurements, and that can be processed for
(OAM) techniques. purposes ranging from service assurance to network security using a
wide variety of techniques including machine learning, data analysis,
and correlation. In this document, Network Telemetry refers to both
the data itself (i.e., "Network Telemetry Data"), and the techniques
and processes used to generate, export, collect, and consume that
data for use by potentially automated management applications.
Network telemetry extends beyond the conventional network Operations,
Administration, and Management (OAM) techniques and expects to
support better flexibility, scalability, accuracy, coverage, and
performance.
However, the term network telemetry lacks a solid and unambiguous However, the term network telemetry lacks a solid and unambiguous
definition. The scope and coverage of it cause confusion and definition. The scope and coverage of it cause confusion and
misunderstandings. It is beneficial to clarify the concept and misunderstandings. It is beneficial to clarify the concept and
provide a clear architectural framework for network telemetry, so we provide a clear architectural framework for network telemetry, so we
can articulate the technical field, and better align the related can articulate the technical field, and better align the related
techniques and standard works. techniques and standard works.
To fulfill such an undertaking, we first discuss some key To fulfill such an undertaking, we first discuss some key
characteristics of network telemetry which set a clear distinction characteristics of network telemetry which set a clear distinction
from the conventional network OAM and show that some conventional OAM from the conventional network OAM and show that some conventional OAM
technologies can be considered a subset of the network telemetry technologies can be considered a subset of the network telemetry
technologies. We then provide an architectural framework for network technologies. We then provide an architectural framework for network
telemetry by partitioning a network telemetry system into four telemetry which includes four modules, each concerned with a
modules each with the same building components and data abstracts. different category of telemetry data and corresponding procedures.
We show how the network telemetry framework can benefit the current All the modules are internally structured in the same way, including
and future network operations. Based on the distinction of modules components that allow data sources to be configured with regard to what
and function components, we can map the existing and emerging data to generate and how to make that available to client
techniques and protocols into the framework. The framework can also applications, components that instrument the underlying data sources,
simplify the tasks for designing, maintaining, and understanding a and components that perform the actual rendering, encoding, and
network telemetry system. Finally, we outline the evolution stages exporting of the generated data. We show how the network telemetry
of the network telemetry system and discuss the potential security framework can benefit the current and future network operations.
concerns. Based on the distinction of modules and function components, we can
map the existing and emerging techniques and protocols into the
framework. The framework can also simplify the tasks for designing,
maintaining, and understanding a network telemetry system. Finally,
we outline the evolution stages of the network telemetry system and
discuss the potential security concerns.
The purpose of the framework and taxonomy is to set a common ground The purpose of the framework and taxonomy is to set a common ground
for the collection of related work and provide guidance for future for the collection of related work and provide guidance for future
technique and standard developments. To the best of our knowledge, technique and standard developments. To the best of our knowledge,
this document is the first such effort for network telemetry in this document is the first such effort for network telemetry in
industry standards organizations. industry standards organizations.
2. Background 2. Background
The term "big data" is used to describe the extremely large volume of The term "big data" is used to describe the extremely large volume of
data sets that can be analyzed computationally to reveal patterns, data sets that can be analyzed computationally to reveal patterns,
trends, and associations. Network is undoubtedly a source of big trends, and associations. Networks are undoubtedly a source of big
data because of its scale and all the traffic goes through it. It is data because of their scale and the volume of network traffic they
easy to see that network OAM can benefit from network big data. forward. It is easy to see that network operations can benefit from
network big data.
Today one can access advanced big data analytics capability through a Today one can access advanced big data analytics capability through a
plethora of commercial and open source platforms (e.g., Apache plethora of commercial and open source platforms (e.g., Apache
Hadoop), tools (e.g., Apache Spark), and techniques (e.g., machine Hadoop), tools (e.g., Apache Spark), and techniques (e.g., machine
learning). Thanks to the advance of computing and storage learning). Thanks to the advance of computing and storage
technologies, network big data analytics gives network operators an technologies, network big data analytics gives network operators an
opportunity to gain network insights and move towards network opportunity to gain network insights and move towards network
autonomy. Some operators start to explore the application of autonomy. Some operators start to explore the application of
Artificial Intelligence (AI) to make sense of network data. Software Artificial Intelligence (AI) to make sense of network data. Software
tools can use the network data to detect and react to network faults, tools can use the network data to detect and react to network faults,
skipping to change at page 4, line 50 skipping to change at page 5, line 19
However, while the data processing capability is improved and However, while the data processing capability is improved and
applications are hungry for more data, the networks lag behind in applications are hungry for more data, the networks lag behind in
extracting and translating network data into useful and actionable extracting and translating network data into useful and actionable
information in efficient ways. The system bottleneck is shifting information in efficient ways. The system bottleneck is shifting
from data consumption to data supply. Both the number of network from data consumption to data supply. Both the number of network
nodes and the traffic bandwidth keep increasing at a fast pace. The nodes and the traffic bandwidth keep increasing at a fast pace. The
network configuration and policy change at smaller time slots than network configuration and policy change at smaller time slots than
before. More subtle events and fine-grained data through all network before. More subtle events and fine-grained data through all network
planes need to be captured and exported in real time. In a nutshell, planes need to be captured and exported in real time. In a nutshell,
it is a challenge to get enough high-quality data out of network it is a challenge to get enough high-quality data out of the network
efficiently, timely, and flexibly. Therefore, we need to examine the in a manner that is efficient, timely, and flexible. Therefore, we
existing network technologies and protocols, and identify any need to survey the existing technologies and protocols and identify
potential technique and standard gaps based on the real network and any potential gaps.
device architectures.
In the remaining of this section, first we clarify the scope of In the remainder of this section, first we clarify the scope of
network data (i.e., telemetry data) concerned in the context. Then, network data (i.e., telemetry data) concerned in the context. Then,
we discuss several key use cases for today's and future network we discuss several key use cases for today's and future network
operations. Next, we show why the current network OAM techniques and operations. Next, we show why the current network OAM techniques and
protocols are insufficient for these use cases. The discussion protocols are insufficient for these use cases. The discussion
underlines the need for new methods, techniques, and protocols which underlines the need for new methods, techniques, and protocols which
we assign under an umbrella term - network telemetry. we assign under the umbrella term - Network Telemetry.
2.1. Telemetry Data Coverage 2.1. Telemetry Data Coverage
Any information that can be extracted from networks (including data Any information that can be extracted from networks (including data
plane, control plane, and management plane) and used to gain plane, control plane, and management plane) and used to gain
visibility or as basis for actions is considered telemetry data. It visibility or as basis for actions is considered telemetry data. It
includes statistics, event records and logs, snapshots of state, includes statistics, event records and logs, snapshots of state,
configuration data, etc. It also covers the outputs of any active configuration data, etc. It also covers the outputs of any active
and passive measurements [RFC7799]. In particular, raw data can be and passive measurements [RFC7799]. In particular, raw data can be
processed in network before sending to a data consumer. Such processed in-network before being sent to a data consumer. Such
processed data are also telemetry data in the context. A processed data is also considered telemetry data. A classification
classification of the telemetry data form is provided in Section 4. of telemetry data is provided in Section 4.
2.2. Use Cases 2.2. Use Cases
These use cases are essential for network operations. While the list The following set of use cases is essential for network operations.
is by no means exhaustive, it is enough to highlight the requirements While the list is by no means exhaustive, it is enough to highlight
for data velocity, variety, volume, and veracity in networks. the requirements for data velocity, variety, volume, and veracity in
networks.
Security: Network intrusion detection and prevention need monitor Security: Network intrusion detection and prevention systems need to
network traffic and activities, and act upon anomalies. Given the monitor network traffic and activities and act upon anomalies.
more and more sophisticated attack vectors and higher and higher Given increasingly sophisticated attack vectors coupled with
tolls due to security breach, new tools and techniques need to be increasingly severe consequences of security breaches, new tools
developed, relying on wider and deeper visibility in networks. and techniques need to be developed, relying on wider and deeper
visibility in networks.
Policy and Intent Compliance: Network policies are the rules that Policy and Intent Compliance: Network policies are the rules that
constrain the services for network access, provide service constrain the services for network access, provide service
differentiation, or enforce specific treatment on the traffic. differentiation, or enforce specific treatment on the traffic.
For example, a service function chain is a policy that requires For example, a service function chain is a policy that requires
the selected flows to pass through a set of ordered network the selected flows to pass through a set of ordered network
functions. Intent, as defined in functions. Intent, as defined in
[I-D.irtf-nmrg-ibn-concepts-definitions], is a set of operational [I-D.irtf-nmrg-ibn-concepts-definitions], is a set of operational
goals that a network should meet and outcomes that a network is goals that a network should meet and outcomes that a network is
supposed to deliver, defined in a declarative manner without supposed to deliver, defined in a declarative manner without
skipping to change at page 6, line 15 skipping to change at page 6, line 35
needs to be reported immediately. needs to be reported immediately.
SLA Compliance: A Service-Level Agreement (SLA) defines the level of SLA Compliance: A Service-Level Agreement (SLA) defines the level of
service a user expects from a network operator, which includes the service a user expects from a network operator, which includes the
metrics for the service measurement and remedy/penalty procedures metrics for the service measurement and remedy/penalty procedures
when the service level misses the agreement. Users need to check when the service level misses the agreement. Users need to check
if they get the service as promised and network operators need to if they get the service as promised and network operators need to
evaluate how they can deliver the services that can meet the SLA evaluate how they can deliver the services that can meet the SLA
based on real-time network measurement. based on real-time network measurement.
Root Cause Analysis: Any network failure can be the cause or effect Root Cause Analysis: Any network failure can be the effect of a
of a sequence of chained events. Troubleshooting and recovery sequence of chained events. Troubleshooting and recovery require
require quick identification of the root cause of any observable quick identification of the root cause of any observable issues.
issues. However, the root cause is not always straightforward to However, the root cause is not always straightforward to identify,
identify, especially when the failure is sporadic and the related especially when the failure is sporadic and the number of event
and unrelated events are overwhelming and interleaved. While messages, both related and unrelated to the same cause, is
machine learning technologies can be used for root cause analysis, overwhelming. While machine learning technologies can be used for
it is up to the network to sense and provide the relevant data. root cause analysis, it is up to the network to sense and provide the
relevant data.
Network Optimization: This covers all short-term and long-term Network Optimization: This covers all short-term and long-term
network optimization techniques, including load balancing, Traffic network optimization techniques, including load balancing, Traffic
Engineering (TE), and network planning. Network operators are Engineering (TE), and network planning. Network operators are
motivated to optimize their network utilization and differentiate motivated to optimize their network utilization and differentiate
services for better Return On Investment (ROI) or lower Capital services for better Return On Investment (ROI) or lower Capital
Expenditures (CAPEX). The first step is to know the real-time Expenditures (CAPEX). The first step is to know the real-time
network conditions before applying policies for traffic network conditions before applying policies for traffic
manipulation. In some cases, micro-bursts need to be detected in manipulation. In some cases, micro-bursts need to be detected in
a very short time-frame so that fine-grained traffic control can a very short time-frame so that fine-grained traffic control can
be applied to avoid network congestion. The long-term network be applied to avoid network congestion. Long-term planning of
capacity planning and topology augmentation rely on the network capacity and topology requires analysis of real-world
accumulated data of network operations. network telemetry data that is obtained over long periods of time.
Event Tracking and Prediction: The visibility of traffic path and Event Tracking and Prediction: The visibility of traffic path and
performance is critical for services and applications that rely on performance is critical for services and applications that rely on
healthy network operation. Numerous related network events are of healthy network operation. Numerous related network events are of
interest to network operators. For example, network operators interest to network operators. For example, network operators
want to learn where and why packets are dropped for an application want to learn where and why packets are dropped for an application
flow. They also want to be warned of issues in advance so flow. They also want to be warned of issues in advance so
proactive actions can be taken to avoid catastrophic consequences. proactive actions can be taken to avoid catastrophic consequences.
2.3. Challenges 2.3. Challenges
For a long time, network operators have relied upon SNMP [RFC3416], For a long time, network operators have relied upon SNMP [RFC3416],
Command-Line Interface (CLI), or Syslog to monitor the network. Some Command-Line Interface (CLI), or Syslog to monitor the network. Some
other OAM techniques as described in [RFC7276] are also used to other OAM techniques as described in [RFC7276] are also used to
facilitate network troubleshooting. these conventional techniques facilitate network troubleshooting. These conventional techniques
are not sufficient to support the above use cases for the following are not sufficient to support the above use cases for the following
reasons, which explains why new standards and techniques keep reasons:
emerging and the needs remain high:
o Most use cases need to continuously monitor the network and o Most use cases need to continuously monitor the network and
dynamically refine the data collection in real-time. The poll- dynamically refine the data collection in real-time. The poll-
based low-frequency data collection is ill-suited for these based low-frequency data collection is ill-suited for these
applications. Subscription-based streaming data directly pushed applications. Subscription-based streaming data directly pushed
from the data source (e.g., the forwarding chip) is preferred to from the data source (e.g., the forwarding chip) is preferred to
provide enough data quantity and precision at scale. provide enough data quantity and precision at scale.
o Comprehensive data is needed from packet processing engine to o Comprehensive data is needed from packet processing engine to
traffic manager, from line cards to main control board, from user traffic manager, from line cards to main control board, from user
skipping to change at page 8, line 7 skipping to change at page 8, line 25
precision which are beyond the capability of the existing precision which are beyond the capability of the existing
techniques. techniques.
o The conventional passive measurement techniques can either consume o The conventional passive measurement techniques can either consume
excessive network resources and render excessive redundant data, excessive network resources and render excessive redundant data,
or lead to inaccurate results; on the other hand, the conventional or lead to inaccurate results; on the other hand, the conventional
active measurement techniques can interfere with the user traffic active measurement techniques can interfere with the user traffic
and their results are indirect. Techniques that can collect and their results are indirect. Techniques that can collect
direct and on-demand data from user traffic are more favorable. direct and on-demand data from user traffic are more favorable.
These challenges were addressed by newer standards and techniques
(e.g., IPFIX/Netflow, PSAMP, IOAM, and YANG-Push) and more are
emerging. These standards and techniques need to be recognized and
accommodated in a new framework.
2.4. Glossary 2.4. Glossary
Before further discussion, we list some key terminology and acronyms Before further discussion, we list some key terminology and acronyms
used in this document. We make an intended differentiation between used in this document. We make an intended differentiation between
network telemetry and network OAM. However, it should be understood the terms of network telemetry and OAM. However, it should be
that there is not a hard-line distinction between the two concepts. understood that there is not a hard-line distinction between the two
Rather, some OAM techniques are in the scope of network telemetry. concepts. Rather, network telemetry is considered as the extension
of OAM. It covers all the existing OAM protocols but puts more
emphasis on the newer and emerging techniques and protocols
concerning all aspects of network data from acquisition to
consumption.
AI: Artificial Intelligence. In the network domain, AI refers to the AI: Artificial Intelligence. In the network domain, AI refers to the
machine-learning based technologies for automated network machine-learning based technologies for automated network
operation and other tasks. operation and other tasks.
AM: Alternate Marking, a flow performance measurement method, AM: Alternate Marking, a flow performance measurement method,
specified in [RFC8321]. specified in [RFC8321].
BMP: BGP Monitoring Protocol, specified in [RFC7854]. BMP: BGP Monitoring Protocol, specified in [RFC7854].
skipping to change at page 8, line 46 skipping to change at page 9, line 24
IPFIX: IP Flow Information Export Protocol, specified in [RFC7011]. IPFIX: IP Flow Information Export Protocol, specified in [RFC7011].
IOAM: In-situ OAM, a dataplane on-path telemetry technique. IOAM: In-situ OAM, a dataplane on-path telemetry technique.
NETCONF: Network Configuration Protocol, specified in [RFC6241]. NETCONF: Network Configuration Protocol, specified in [RFC6241].
NetFlow: A Cisco protocol for flow record collecting, described in NetFlow: A Cisco protocol for flow record collecting, described in
[RFC3954]. [RFC3954].
Network Telemetry: Acquiring and processing network data remotely Network Telemetry: The process and instrumentation for acquiring and
for network monitoring and operation. A general term for a large utilizing network data remotely for network monitoring and
set of network visibility techniques and protocols, with the operation. A general term for a large set of network visibility
characteristics defined in this document. Network telemetry techniques and protocols, concerning aspects like data generation,
collection, correlation, and consumption. Network telemetry
addresses the current network operation issues and enables smooth addresses the current network operation issues and enables smooth
evolution toward future intent-driven autonomous networks. evolution toward future intent-driven autonomous networks.
NMS: Network Management System, referring to applications that allow NMS: Network Management System, referring to applications that allow
network administrators to manage a network's software and hardware network administrators to manage a network.
components. It usually records data from a network's remote
points to carry out central reporting to a system administrator.
OAM: Operations, Administration, and Maintenance. A group of OAM: Operations, Administration, and Maintenance. A group of
network management functions that provide network fault network management functions that provide network fault
indication, fault localization, performance information, and data indication, fault localization, performance information, and data
and diagnosis functions. Most conventional network monitoring and diagnosis functions. Most conventional network monitoring
techniques and protocols belong to network OAM. techniques and protocols belong to network OAM.
PBT: Postcard-Based Telemetry, a dataplane on-path telemetry PBT: Postcard-Based Telemetry, a dataplane on-path telemetry
technique. technique.
skipping to change at page 9, line 30 skipping to change at page 10, line 7
[RFC2578]. [RFC2578].
SNMP: Simple Network Management Protocol. Versions 1 and 2 are SNMP: Simple Network Management Protocol. Versions 1 and 2 are
specified in [RFC1157] and [RFC3416], respectively. specified in [RFC1157] and [RFC3416], respectively.
YANG: The abbreviation of "Yet Another Next Generation". YANG is a YANG: The abbreviation of "Yet Another Next Generation". YANG is a
data modeling language for the definition of data sent over data modeling language for the definition of data sent over
network management protocols such as the NETCONF and RESTCONF. network management protocols such as the NETCONF and RESTCONF.
YANG is defined in [RFC6020]. YANG is defined in [RFC6020].
YANG ECN A YANG model for Event-Condition-Action policies, defined YANG ECA A YANG model for Event-Condition-Action policies, defined
in [I-D.wwx-netmod-event-yang]. in [I-D.wwx-netmod-event-yang].
YANG FSM: A YANG model that describes events, operations, and finite YANG FSM: A YANG model that describes events, operations, and finite
state machine of YANG-defined network elements. state machine of YANG-defined network elements.
YANG PUSH: A method to subscribe to data pushed from remote YANG YANG PUSH: A method to subscribe to data pushed from remote YANG
datastores on network devices. Details are specified in [RFC8641] datastores on network devices. Details are specified in [RFC8641]
and [RFC8639]. and [RFC8639].
2.5. Network Telemetry 2.5. Network Telemetry
Network telemetry has emerged as a mainstream technical term to refer Network telemetry has emerged as a mainstream technical term to refer
to the newer data collection and consumption techniques, to the network data collection and consumption techniques. Several
distinguishing itself in some notable ways from the conventional network telemetry techniques and protocols (e.g., IPFIX [RFC7011] and
network OAM. Several such techniques have been widely deployed. The gRPC [grpc]) have been widely deployed. Network telemetry allows
representative techniques and protocols include IPFIX [RFC7011] and separate entities to acquire data from network devices so that data
gRPC [grpc]. Network telemetry allows separate entities to acquire can be visualized and analyzed to support network monitoring and
data from network devices so that data can be visualized and analyzed operation. Network telemetry covers the conventional network OAM and
to support network monitoring and operation. Network telemetry has a wider scope. It is expected that network telemetry can provide
overlaps with the conventional network OAM and has a wider scope than the necessary network insight for autonomous networks and address the
it. It is expected that network telemetry can provide the necessary shortcomings of conventional OAM techniques.
network insight for autonomous networks and address the shortcomings
of conventional OAM techniques.
One difference between the network telemetry and the conventional Network telemetry usually assumes machines as data consumers rather
network OAM is that in general the network telemetry assumes machines than human operators. Hence, the network telemetry can directly
as data consumers rather than human operators. Hence, the network trigger the automated network operation, while in contrast some
telemetry can directly trigger the automated network operation, while conventional OAM tools are designed and used to help human operators
the conventional OAM tools usually help human operators to monitor to monitor and diagnose the networks and guide manual network
and diagnose the networks and guide manual network operations. The operations. Such a proposition leads to very different techniques.
difference leads to very different techniques.
Although the network telemetry techniques are just emerging and Although new network telemetry techniques are emerging and subject to
subject to continuous evolution, several characteristics of network continuous evolution, several characteristics of network telemetry
telemetry have been well accepted. Note that network telemetry is have been well accepted. Note that network telemetry is intended to
intended to be an umbrella term covering a wide spectrum of be an umbrella term covering a wide spectrum of techniques, so the
techniques, so the following characteristics are not expected to be following characteristics are not expected to be held by every
held by every specific technique. specific technique.
o Push and Streaming: Instead of polling data from network devices, o Push and Streaming: Instead of polling data from network devices,
the telemetry collector subscribes to the streaming data pushed telemetry collectors subscribe to streaming data pushed from data
from data sources in network devices. sources in network devices.
o Volume and Velocity: The telemetry data is intended to be consumed o Volume and Velocity: The telemetry data is intended to be consumed
by machines rather than by human beings. Therefore, the data by machines rather than by human beings. Therefore, the data
volume is huge and the processing is often in realtime. volume is huge and the processing is often in realtime.
o Normalization and Unification: Telemetry aims to address the o Normalization and Unification: Telemetry aims to address the
overall network automation needs. The piecemeal solutions offered overall network automation needs. Efforts are made to normalize
by the conventional OAM approach are no longer suitable. Efforts the data representation and unify the protocols, so as to simplify
need to be made to normalize the data representation and unify the data analysis and its integration with automation solutions.
protocols.
o Model-based: The telemetry data is modeled in advance which allows o Model-based: The telemetry data is modeled in advance which allows
applications to configure and consume data with ease. applications to configure and consume data with ease.
o Data Fusion: The data for a single application can come from o Data Fusion: The data for a single application can come from
multiple data sources (e.g., cross-domain, cross-device, and multiple data sources (e.g., cross-domain, cross-device, and
cross-layer) and needs to be correlated to take effect. cross-layer) and needs to be correlated to take effect.
o Dynamic and Interactive: Since the network telemetry is meant to be o Dynamic and Interactive: Since the network telemetry is meant to be
used in a closed control loop for network automation, it needs to used in a closed control loop for network automation, it needs to
skipping to change at page 11, line 37 skipping to change at page 12, line 8
data collection approaches, the new hybrid approach allows data collection approaches, the new hybrid approach allows
direct data collection for any target flow on its entire forwarding direct data collection for any target flow on its entire forwarding
path [I-D.song-opsawg-ifit-framework]. path [I-D.song-opsawg-ifit-framework].
It is worth noting that a network telemetry system should not be It is worth noting that a network telemetry system should not be
intrusive to normal network operations and should avoid the pitfall of intrusive to normal network operations and should avoid the pitfall of
the "observer effect". That is, it should not change the network the "observer effect". That is, it should not change the network
behavior or affect the forwarding performance. Otherwise, the whole behavior or affect the forwarding performance. Otherwise, the whole
purpose of network telemetry is defeated. purpose of network telemetry is defeated.
Although in many cases a network telemetry system involves a remote Although in many cases a system for network telemetry involves a
data collecting, processing, and reacting entity, it is important to remote data collecting and consuming entity, it is important to
understand that network telemetry does not imply the necessity of understand that there are no inherent assumptions about how a system
such an entity. Telemetry data producers and consumers can work in should be architected. Telemetry data producers and consumers can
distributed or peer-to-peer fashions instead. In such cases, a work in distributed or peer-to-peer fashions rather than assuming a
network node can be the direct consumer of telemetry data from other centralized data consuming entity. In such cases, a network node can
nodes. be the direct consumer of telemetry data from other nodes.
3. The Necessity of a Network Telemetry Framework 3. The Necessity of a Network Telemetry Framework
Network data analytics and machine-learning technologies are applied Network data analytics and machine-learning technologies are applied
for network operation automation, relying on abundant and coherent for network operation automation, relying on abundant and coherent
data from networks. The single-sourced and static data acquisition data from networks. Data acquisition that is limited to a single
cannot meet the data requirements. The scattered standards and source and static in nature will in many cases not be sufficient to
diverse techniques are hard to be integrated. It is desirable to meet an application's telemetry data needs. As a result, multiple
have a framework that classifies and organizes different telemetry data sources, involving a variety of techniques and standards, will
data sources and types, defines different components of a network need to be integrated. It is desirable to have a framework that
telemetry system and their interactions, and helps coordinate and classifies and organizes different telemetry data sources and types,
integrate multiple telemetry approaches from different layers. This defines different components of a network telemetry system and their
allows flexible combinations for different applications, while interactions, and helps coordinate and integrate multiple telemetry
normalizing and simplifying interfaces. In detail, such a framework approaches across layers. This allows flexible combinations of data
would benefit application development for the following reasons: for different applications, while normalizing and simplifying
interfaces. In detail, such a framework would benefit application
development for the following reasons:
o The future autonomous networks will require a holistic view on o Future networks, autonomous or otherwise, depend on holistic and
network visibility. All the use cases and applications need to be comprehensive network visibility. All the use cases and
supported uniformly and coherently under a single intelligent applications are better supported uniformly and coherently
agent. Therefore, the protocols and mechanisms should be under a single intelligent agent. Therefore, the protocols and
consolidated into a minimum yet comprehensive set. A telemetry mechanisms should be consolidated into a minimum yet comprehensive
framework can help to normalize the technique developments. set. A telemetry framework can help to normalize the technique
developments.
o Network visibility presents multiple viewpoints. For example, the o Network visibility presents multiple viewpoints. For example, the
device viewpoint takes the network infrastructure as the device viewpoint takes the network infrastructure as the
monitoring object from which the network topology and device monitoring object from which the network topology and device
status can be acquired; the traffic viewpoint takes the flows or status can be acquired; the traffic viewpoint takes the flows or
packets as the monitoring object from which the traffic quality packets as the monitoring object from which the traffic quality
and path can be acquired. An application may need to switch its and path can be acquired. An application may need to switch its
viewpoint during operation. It may also need to correlate a viewpoint during operation. It may also need to correlate a
service and its impact on network experience to acquire the service and its impact on network experience to acquire the
comprehensive information. comprehensive information.
o Applications require network telemetry to be elastic in order to o Applications require network telemetry to be elastic in order to
efficiently use the network resource and reduce the performance make efficient use of network resources and reduce the impact of
impact. Routine network monitoring covers the entire network with processing related to network telemetry on network performance.
low data sampling rate. When issues arise or trends emerge, the For example, routine network monitoring should cover the entire
telemetry data source can be modified and the data rate can be network with a low data sampling rate. Only when issues arise or
boosted. critical trends emerge should telemetry data sources be modified
and telemetry data rates boosted as needed.
o Efficient data fusion is critical for applications to reduce the o Efficient data fusion is critical for applications to reduce the
overall quantity of data and improve the accuracy of analysis. overall quantity of data and improve the accuracy of analysis.
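
The elasticity described in the list above can be illustrated with a minimal sketch. It is not part of the framework: the class name ElasticSampler, the loss-ratio trigger, and all numeric values are assumptions chosen only for illustration. The sketch keeps a low baseline sampling rate during routine monitoring and boosts the rate only while an observed metric indicates an issue.

```python
from dataclasses import dataclass


@dataclass
class ElasticSampler:
    """Hypothetical controller that picks a telemetry sampling rate.

    All default values below are illustrative assumptions, not
    recommendations taken from the framework.
    """
    baseline_rate: float = 0.01   # fraction of packets sampled routinely
    boosted_rate: float = 0.5     # fraction sampled while an issue is active
    threshold: float = 0.02       # loss ratio treated as "an issue arises"

    def rate_for(self, loss_ratio: float) -> float:
        # Boost only while the observed loss ratio exceeds the threshold;
        # otherwise fall back to the low routine rate.
        if loss_ratio > self.threshold:
            return self.boosted_rate
        return self.baseline_rate


sampler = ElasticSampler()
# Observed loss ratios over four monitoring intervals (made-up values).
observed = (0.0, 0.01, 0.05, 0.0)
rates = [sampler.rate_for(loss) for loss in observed]
```

In this sketch the rate returns to the baseline as soon as the metric recovers, so the extra telemetry load is limited to the intervals in which it is needed.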
A telemetry framework collects together all of the telemetry-related A telemetry framework collects together all of the telemetry-related
work from different sources and working groups within the IETF. This work from different sources and working groups within the IETF. This
makes it possible to assemble a comprehensive network telemetry makes it possible to assemble a comprehensive network telemetry
system and to avoid repetitious or redundant work. The framework system and to avoid repetitious or redundant work. The framework
should cover the concepts and components from the standardization should cover the concepts and components from the standardization
perspective. This document clarifies the layered modules on which perspective. This document describes the modules which make up a
the telemetry is exerted and decomposes the telemetry system into a network telemetry framework and decomposes the telemetry system into
set of distinct components that the existing and future work can a set of distinct components that existing and future work can easily
easily map to. map to.
4. Network Telemetry Framework 4. Network Telemetry Framework
The top level network telemetry framework partitions the network The top level network telemetry framework partitions the network
telemetry into four modules based on the telemetry data object source telemetry into four modules based on the telemetry data object source
and represents their relationship. The next level framework reveals and represents their relationship. At the next level, the framework
that each module replicates the same architecture comprising the same decomposes each module into separate components. Each of the modules
set of components. Throughout the framework, the same set of follows the same underlying structure, with one component dedicated
to the configuration of data subscriptions and data sources, a second
component dedicated to encoding and exporting data, and a third
component instrumenting the generation of telemetry related to the
underlying resources. Throughout the framework, the same set of
abstract data acquiring mechanisms and data types are applied. The abstract data acquiring mechanisms and data types are applied. The
two-level architecture with the uniform data abstraction helps two-level architecture with the uniform data abstraction helps
accurately pinpoint a protocol or technique to its position in a accurately pinpoint a protocol or technique to its position in a
network telemetry system or disaggregate a network telemetry system network telemetry system or disaggregate a network telemetry system
into manageable parts. into manageable parts.
4.1. Top Level Modules 4.1. Top Level Modules
Telemetry can be applied on the forwarding plane, the control plane, Telemetry can be applied on the forwarding plane, the control plane,
and the management plane in a network, as well as other sources out and the management plane in a network, as well as other sources out
skipping to change at page 14, line 12 skipping to change at page 14, line 37
Figure 1: Modules in Layer Category of NTF Figure 1: Modules in Layer Category of NTF
The rationale of this partition lies in the different telemetry data The rationale of this partition lies in the different telemetry data
objects which result in different data source and export locations. objects which result in different data source and export locations.
Such differences have profound implications on in-network data Such differences have profound implications on in-network data
programming and processing capability, data encoding and transport programming and processing capability, data encoding and transport
protocol, and data bandwidth and latency. protocol, and data bandwidth and latency.
We summarize the major differences of the four modules in the We summarize the major differences of the four modules in the
following table. They are compared from six aspects: data object, following table. They are compared from six aspects:
data export location, data model, data encoding, telemetry protocol,
and transport method. Data object is the target and source of each o Data Object
module. Because the data source varies, the data export location
varies. For example, the forwarding plane data are mainly from the o Data Export Location
fast path (e.g., forwarding chips) while the control plane data are
mainly from the slow path (e.g., main control CPU). For convenience o Data Model
and efficiency, it is preferred to export the data from locations
near the source. Because each data export location has different o Data Encoding
capability, the proper data model, encoding, and transport method
cannot be kept the same. For example, the forwarding chip has high o Telemetry Protocol
throughput but limited capacity for processing complex data and
maintaining states, while the main control CPU is capable of complex o Transport Method
data and state processing, but has limited bandwidth for high
throughput data. As a result, the suitable telemetry protocol for Data object is the target and source of each module. Because the
each module can be different. Some representative techniques are data source varies, the data export location varies. For example,
shown in the corresponding table blocks to highlight the technical the forwarding plane data are mainly from the fast path (e.g.,
diversity of these modules. The key point is that one cannot expect forwarding chips) while the control plane data are mainly from the
to use a universal protocol to cover all the network telemetry slow path (e.g., main control CPU). For convenience and efficiency,
requirements. it is preferred to export the data from locations near the source.
Because each data export location has different capability, the
proper data model, encoding, and transport method cannot be kept the
same. For example, the forwarding chip has high throughput but
limited capacity for processing complex data and maintaining states,
while the main control CPU is capable of complex data and state
processing, but has limited bandwidth for high throughput data. As a
result, the suitable telemetry protocol for each module can be
different. Some representative techniques are shown in the
corresponding table blocks to highlight the technical diversity of
these modules. Note that the selected techniques just reflect the
de-facto state of the art and are not exhaustive. The key point is
that one cannot expect to use a universal protocol to cover all the
network telemetry requirements.
+---------+--------------+--------------+--------------+-----------+ +---------+--------------+--------------+--------------+-----------+
| Module | Control | Management | Forwarding | External | | Module | Control | Management | Forwarding | External |
| | Plane | Plane | Plane | Data | | | Plane | Plane | Plane | Data |
+---------+--------------+--------------+--------------+-----------+ +---------+--------------+--------------+--------------+-----------+
|Object | control | config. & | flow & packet| terminal, | |Object | control | config. & | flow & packet| terminal, |
| | protocol & | operation | QoS, traffic | social & | | | protocol & | operation | QoS, traffic | social & |
| | signaling, | state, MIB | stat., buffer| environ- | | | signaling, | state, MIB | stat., buffer| environ- |
| | RIB, ACL | | & queue stat.| mental | | | RIB, ACL | | & queue stat.| mental |
+---------+--------------+--------------+--------------+-----------+ +---------+--------------+--------------+--------------+-----------+
skipping to change at page 16, line 10 skipping to change at page 17, line 10
the control plane telemetry. the control plane telemetry.
The requirements and challenges for each module are summarized as The requirements and challenges for each module are summarized as
follows. follows.
4.1.1. Management Plane Telemetry 4.1.1. Management Plane Telemetry
The management plane of network elements interacts with the Network The management plane of network elements interacts with the Network
Management System (NMS), and provides information such as performance Management System (NMS), and provides information such as performance
data, network logging data, network warning and defects data, and data, network logging data, network warning and defects data, and
network statistics and state data. Some legacy protocols, such as network statistics and state data. The management plane includes
SNMP and Syslog, are widely used for the management plane. However, many protocols, including some that are considered "legacy", such as
these protocols are insufficient to meet the requirements of the SNMP and syslog. Regardless of the protocol, management plane telemetry
future automated network operation applications. must address the following requirements:
New management plane telemetry protocols should consider the
following requirements:
Convenient Data Subscription: An application should have the freedom Convenient Data Subscription: An application should have the freedom
to choose the data export means such as the data types and the to choose the data export means such as the data types and the
export frequency. export frequency.
Structured Data: For automatic network operation, machines will Structured Data: For automatic network operation, machines will
replace humans for network data comprehension. The schema replace humans for network data comprehension. The schema
languages such as YANG can efficiently describe structured data languages such as YANG can efficiently describe structured data
and normalize data encoding and transformation. and normalize data encoding and transformation.
High Speed Data Transport: In order to retain the information, a High Speed Data Transport: In order to keep up with the velocity of
server needs to send a large amount of data at high frequency. information, a server needs to be able to send large amounts of
Compact encoding formats are needed to compress the data and data at high frequency. Compact encoding formats are needed to
improve the data transport efficiency. The subscription mode, by compress the data and improve the data transport efficiency. The
replacing the query mode, reduces the interactions between clients subscription mode, by replacing the query mode, reduces the
and servers and helps to improve the server's efficiency. interactions between clients and servers and helps to improve the
server's efficiency.
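
The claim that the subscription mode reduces client-server interactions can be made concrete with a small in-memory sketch. It does not model any real protocol: the TelemetryServer class, its methods, and the path string are hypothetical names introduced for illustration only. The sketch counts the requests a server must handle when a client polls for every update versus when it subscribes once and receives pushed updates.

```python
from typing import Any, Callable, Dict, List, Tuple

Callback = Callable[[str, Any], None]


class TelemetryServer:
    """Hypothetical in-memory server supporting query and subscription modes."""

    def __init__(self) -> None:
        self.state: Dict[str, Any] = {}
        self.subscribers: Dict[str, List[Callback]] = {}
        self.requests_handled = 0  # client-initiated requests only

    def get(self, path: str) -> Any:
        # Query mode: every read is a separate client request.
        self.requests_handled += 1
        return self.state.get(path)

    def subscribe(self, path: str, callback: Callback) -> None:
        # Subscription mode: a single request registers interest.
        self.requests_handled += 1
        self.subscribers.setdefault(path, []).append(callback)

    def update(self, path: str, value: Any) -> None:
        # A local state change is pushed to all subscribers of the path.
        self.state[path] = value
        for callback in self.subscribers.get(path, []):
            callback(path, value)


def run_demo(values: List[int]) -> Tuple[int, int, List[int], List[int]]:
    path = "interfaces/eth0/rx-octets"  # illustrative path, not a real model

    # Query mode: the client polls once after each update.
    polled_server = TelemetryServer()
    polled: List[int] = []
    for value in values:
        polled_server.update(path, value)
        polled.append(polled_server.get(path))

    # Subscription mode: the client subscribes once, then only receives.
    pushed_server = TelemetryServer()
    pushed: List[int] = []
    pushed_server.subscribe(path, lambda _path, value: pushed.append(value))
    for value in values:
        pushed_server.update(path, value)

    return (polled_server.requests_handled,
            pushed_server.requests_handled, polled, pushed)


query_requests, subscription_requests, polled, pushed = run_demo([100, 250, 400])
```

Both modes deliver the same data in this sketch, but the number of requests in query mode grows with the number of updates, while the subscription mode needs only the initial request.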
4.1.2. Control Plane Telemetry 4.1.2. Control Plane Telemetry
The control plane telemetry refers to the health condition monitoring The control plane telemetry refers to the health condition monitoring
of different network control protocols covering Layer 2 to Layer 7. of different network control protocols covering Layer 2 to Layer 7.
Keeping track of the running status of these protocols is beneficial Keeping track of the running status of these protocols is beneficial
for detecting, localizing, and even predicting various network for detecting, localizing, and even predicting various network
issues, as well as network optimization, in real-time and in fine issues, as well as network optimization, in real-time and in fine
granularity. granularity.
skipping to change at page 17, line 20 skipping to change at page 18, line 20
and network optimization. and network optimization.
An example of the control plane telemetry is the BGP monitoring An example of the control plane telemetry is the BGP monitoring
protocol (BMP), which is currently used to monitor BGP routes and protocol (BMP), which is currently used to monitor BGP routes and
enables rich applications, such as BGP peer analysis, AS analysis, enables rich applications, such as BGP peer analysis, AS analysis,
prefix analysis, security analysis, and so on. However, the prefix analysis, security analysis, and so on. However, the
monitoring of other layers, protocols and the cross-layer, cross- monitoring of other layers, protocols and the cross-layer, cross-
protocol KPI correlations are still in their infancy (e.g., the IGP protocol KPI correlations are still in their infancy (e.g., the IGP
monitoring is missing), which require further research. monitoring is missing), which require further research.
4.1.3. Data Plane Telemetry 4.1.3. Forwarding Plane Telemetry
An effective data plane telemetry system relies on the data that the An effective forwarding plane telemetry system relies on the data
network device can expose. The data's quality, quantity, and that the network device can expose. The quality, quantity, and
timeliness must meet some stringent requirements. This raises some timeliness of data must meet some stringent requirements. This
challenges to the network data plane devices where the first hand raises some challenges to the network data plane devices where the
data originate. first hand data originate.
o A data plane device's main function is user traffic processing and o A data plane device's main function is user traffic processing and
forwarding. While supporting network visibility is important, the forwarding. While supporting network visibility is important, the
telemetry is just an auxiliary function, and it should not impede telemetry is just an auxiliary function, and it should not impede
normal traffic processing and forwarding (i.e., the performance is normal traffic processing and forwarding (i.e., the performance is
not lowered and the behavior is not altered due to the telemetry not lowered and the behavior is not altered due to the telemetry
functions). functions).
o The network operation applications require end-to-end visibility o Network operation applications require end-to-end visibility
from various sources, which results in a huge volume of data. across various sources, which can result in a huge volume of data.
However, the sheer data quantity should not stress the network However, the sheer data quantity should not exhaust the network
bandwidth, regardless of the data delivery approach (i.e., through bandwidth, regardless of the data delivery approach (i.e., whether
in-band or out-of-band channels). through in-band or out-of-band channels).
o The data plane devices must provide timely data with the minimum o The data plane devices must provide timely data with the minimum
possible delay. Long processing, transport, storage, and analysis possible delay. Long processing, transport, storage, and analysis
delay can impact the effectiveness of the control loop and even delay can impact the effectiveness of the control loop and even
render the data useless. render the data useless.
o The data should be structured and labeled, and easy for o The data should be structured and labeled, and easy for
applications to parse and consume. At the same time, the data applications to parse and consume. At the same time, the data
types needed by applications can vary significantly. The data types needed by applications can vary significantly. The data
plane devices need to provide enough flexibility and plane devices need to provide enough flexibility and
programmability to support the precise data provision for programmability to support the precise data provision for
applications. applications.
o The data plane telemetry should support incremental deployment and o The data plane telemetry should support incremental deployment and
work even though some devices are unaware of the system. This work even though some devices are unaware of the system. This
challenge is highly relevant to the standards and legacy networks. challenge is highly relevant to the standards and legacy networks.
The data plane programmability is essential to support network Although not specific to the forwarding plane, these challenges are
telemetry. Newer data plane forwarding chips are equipped with more difficult for the forwarding plane because of its limited
advanced telemetry features and provide flexibility to support resources and flexibility. The data plane programmability is
customized telemetry functions. essential to support network telemetry. Newer data plane forwarding
chips are equipped with advanced telemetry features and provide
flexibility to support customized telemetry functions.
4.1.3.1. Technique Taxonomy 4.1.3.1. Technique Taxonomy
There can be multiple possible dimensions to classify the data plane There can be multiple possible dimensions to classify the forwarding
telemetry techniques. plane telemetry techniques.
Active, Passive, and Hybrid: The active and passive methods (as well Active, Passive, and Hybrid: Active and passive methods (as well as
as the hybrid types) are well documented in [RFC7799]. The the hybrid types) are well documented in [RFC7799]. Passive
passive methods include TCPDUMP, IPFIX [RFC7011], sflow, and methods include TCPDUMP, IPFIX [RFC7011], sflow, and traffic
traffic mirror. These methods usually have low data coverage. mirroring. These methods usually have low data coverage. The
The bandwidth cost is very high in order to improve the data bandwidth cost is very high in order to improve the data coverage.
coverage. On the other hand, the active methods include Ping, On the other hand, active methods include Ping, OWAMP [RFC4656],
Traceroute, OWAMP [RFC4656], TWAMP [RFC5357], and Cisco's SLA TWAMP [RFC5357], and Cisco's SLA Protocol [RFC6812]. These
Protocol [RFC6812]. These methods are intrusive and only provide methods are intrusive and only provide indirect network
indirect network measurement results. The hybrid methods, measurement results. Hybrid methods, including in-situ OAM
including in-situ OAM [I-D.ietf-ippm-ioam-data], IPFPM [RFC8321], [I-D.ietf-ippm-ioam-data], IPFPM [RFC8321], and Multipoint
and Multipoint Alternate Marking Alternate Marking [I-D.fioccola-ippm-multipoint-alt-mark], provide
[I-D.fioccola-ippm-multipoint-alt-mark], provide a well-balanced a well-balanced and more flexible approach. However, these
and more flexible approach. However, these methods are also more methods are also more complex to implement.
complex to implement.
In-Band and Out-of-Band: The telemetry data, before being exported In-Band and Out-of-Band: The telemetry data, before being exported
to some collector, can be carried in user packets. Such methods to some collector, can be carried in user packets. Such methods
are considered in-band (e.g., in-situ OAM are considered in-band (e.g., in-situ OAM
[I-D.ietf-ippm-ioam-data]). If the telemetry data is directly [I-D.ietf-ippm-ioam-data]). If the telemetry data is directly
exported to some collector without modifying the user packets, exported to some collector without modifying the user packets,
such methods are considered out-of-band (e.g., postcard-based such methods are considered out-of-band (e.g., postcard-based
INT). It is possible to have hybrid methods. For example, only INT). It is possible to have hybrid methods. For example, only
the telemetry instruction or partial data is carried by user the telemetry instruction or partial data is carried by user
packets (e.g., IPFPM [RFC8321]). packets (e.g., IPFPM [RFC8321]).
E2E and In-Network: Some E2E methods start from and end at the E2E and In-Network: Some E2E methods start from and end at the
network end hosts (e.g., Ping). The other methods work in network end hosts (e.g., Ping). The other methods work in
networks and are transparent to end hosts. However, if needed, networks and are transparent to end hosts. However, if needed,
the in-network methods can be easily extended into end hosts. in-network methods can be easily extended into end hosts.
Information Type: Depending on the telemetry objective, the methods Information Type: Depending on the telemetry objective, the methods
can be flow-based (e.g., in-situ OAM [I-D.ietf-ippm-ioam-data]), can be flow-based (e.g., in-situ OAM [I-D.ietf-ippm-ioam-data]),
path-based (e.g., Traceroute), and node-based (e.g., IPFIX path-based (e.g., Traceroute), and node-based (e.g., IPFIX
[RFC7011]). The various data objects can be packet, flow record, [RFC7011]). The various data objects can be packet, flow record,
measurement, states, and signal. measurement, states, and signal.
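The taxonomy dimensions above can be sketched as a small lookup structure. This is only an illustrative sketch: the `Technique` class, its field names, and the `select` helper are assumptions introduced here, and a field is left as `None` wherever the text above does not classify a technique along that dimension.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Technique:
    """One telemetry technique, classified along the dimensions in the text.

    All field names are hypothetical; None means the text does not
    classify the technique along that dimension.
    """
    name: str
    method: Optional[str] = None     # "active", "passive", or "hybrid"
    carriage: Optional[str] = None   # "in-band", "out-of-band", or "hybrid"
    scope: Optional[str] = None      # "e2e" or "in-network"
    info_type: Optional[str] = None  # "flow", "path", or "node"


# Only classifications stated in the taxonomy text are filled in.
TECHNIQUES = [
    Technique("IOAM", method="hybrid", carriage="in-band", info_type="flow"),
    Technique("IPFIX", method="passive", info_type="node"),
    Technique("Ping", method="active", scope="e2e"),
    Technique("Traceroute", info_type="path"),
    Technique("IPFPM", method="hybrid", carriage="hybrid"),
    Technique("postcard-based INT", carriage="out-of-band"),
]


def select(**criteria: str) -> List[str]:
    """Return the names of techniques matching every given dimension."""
    return [
        t.name
        for t in TECHNIQUES
        if all(getattr(t, key) == value for key, value in criteria.items())
    ]
```

For instance, `select(method="hybrid")` lists the hybrid methods named in the text, and `select(info_type="path")` lists the path-based ones.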
4.1.4. External Data Telemetry 4.1.4. External Data Telemetry
skipping to change at page 20, line 6 skipping to change at page 21, line 6
possibilities of current and future network systems, as reflected in possibilities of current and future network systems, as reflected in
the incorporation of cognitive capabilities to new hardware and the incorporation of cognitive capabilities to new hardware and
software (virtual) elements. software (virtual) elements.
4.2. Second Level Function Components 4.2. Second Level Function Components
Reflecting the best current practice, the telemetry module at each Reflecting the best current practice, the telemetry module at each
plane is further partitioned into five distinct components: plane is further partitioned into five distinct components:
Data Query, Analysis, and Storage: This component works at the Data Query, Analysis, and Storage: This component works at the
application layer. On the one hand, it is responsible for issuing application layer. It is a part of the network management system
data requirements. The data of interest can be modeled data at the receiver side. On the one hand, it is responsible for
through configuration or custom data through programming. The issuing data requirements. The data of interest can be modeled
data requirements can be queries for one-shot data or data through configuration or custom data through programming.
The data requirements can be queries for one-shot data or
subscriptions for events or streaming data. On the other hand, it subscriptions for events or streaming data. On the other hand, it
receives, stores, and processes the returned data from network receives, stores, and processes the returned data from network
devices. Data analysis can be interactive to initiate further devices. Data analysis can be interactive to initiate further
data queries. This component can reside in either network devices data queries. This component can reside in either network devices
or remote controllers. or remote controllers. It can be centralized and distributed, and
involve one or more instances.
Data Configuration and Subscription: This component deploys data Data Configuration and Subscription: This component deploys data
queries on devices. It determines the protocol and channel for queries on devices. It determines the protocol and channel for
applications to acquire desired data. This component is also applications to acquire desired data. This component is also
responsible for configuring the desired data that might not be responsible for configuring the desired data that might not be
directly available from data sources. The subscription data can directly available from data sources. The subscription data can
be described by models, templates, or programs. be described by models, templates, or programs.
Data Encoding and Export: This component determines how telemetry Data Encoding and Export: This component determines how telemetry
data are delivered to the data analysis and storage component. data are delivered to the data analysis and storage component.
skipping to change at page 21, line 5 skipping to change at page 22, line 5
data sources. This may involve in-network computing and data sources. This may involve in-network computing and
processing on either the fast path or the slow path in network processing on either the fast path or the slow path in network
devices. devices.
Data Object and Source: This component determines the monitoring Data Object and Source: This component determines the monitoring
object and original data source. The data source usually just object and original data source. The data source usually just
provides raw data which needs further processing. A data source provides raw data which needs further processing. A data source
can be considered a probe. A probe can be statically installed or can be considered a probe. A probe can be statically installed or
dynamically installed. dynamically installed.
+----------------------------------------+ +----------------------------------------+
| | +----------------------------------------+ |
| Data Query, Analysis, & Storage | | | |
| | | Data Query, Analysis, & Storage | |
| | +
+-------+++ -----------------------------+ +-------+++ -----------------------------+
||| ^^^ ||| ^^^
||| ||| ||| |||
||V ||| ||V |||
+--+V--------------------+++------------+ +--+V--------------------+++------------+
+-----V---------------------+------------+ | +-----V---------------------+------------+ |
+---------------------+-------+----------+ | | +---------------------+-------+----------+ | |
| Data Configuration | | | | | Data Configuration | | | |
| & Subscription | Data Encoding | | | | & Subscription | Data Encoding | | |
| (model, template, | & Export | | | | (model, template, | & Export | | |
skipping to change at page 22, line 8 skipping to change at page 23, line 10
In contrast, query is used when a querier expects immediate and one- In contrast, query is used when a querier expects immediate and one-
off feedback from network devices. The queried data may be directly off feedback from network devices. The queried data may be directly
extracted from some specific data source, or synthesized and extracted from some specific data source, or synthesized and
processed from raw data. Query suits interactive network processed from raw data. Query suits interactive network
telemetry applications. telemetry applications.
There are four types of data from network devices: There are four types of data from network devices:
Simple Data: The data that are steadily available from some data Simple Data: The data that are steadily available from some data
store or static probes in network devices. Such data can be store or static probes in network devices. Such data can be
specified by YANG model. specified by YANG model.
Complex Data: The data need to be synthesized or processed in Complex Data: The data need to be synthesized or processed in
network from raw data from one or more network devices. The data network from raw data from one or more network devices. The data
processing function can be statically or dynamically loaded into processing function can be statically or dynamically loaded into
network devices. network devices.
Event-triggered Data: The data are conditionally acquired based on Event-triggered Data: The data are conditionally acquired based on
the occurrence of some events. It can be actively pushed through the occurrence of some events. It can be actively pushed through
subscription or passively polled through query. There are many subscription or passively polled through query. There are many
ways to model events, including using Finite State Machine (FSM) ways to model events, including using Finite State Machine (FSM)
or Event Condition Action (ECN) [I-D.wwx-netmod-event-yang]. or Event Condition Action (ECA) [I-D.wwx-netmod-event-yang].
Streaming Data: The data are continuously generated. It can be time Streaming Data: The data are continuously generated. It can be time
series or the dump of databases. The streaming data reflect series or the dump of databases. The streaming data reflect
realtime network states and metrics and require large bandwidth realtime network states and metrics and require large bandwidth
and processing power. The streaming data are always actively and processing power. The streaming data are always actively
pushed to the subscribers. pushed to the subscribers.
The above data types are not mutually exclusive. Rather, they often The above data types are not mutually exclusive. Rather, they often
overlap. For example, event-triggered data can be simple or complex, overlap. For example, event-triggered data can be simple or complex,
and streaming data can be simple, complex, or triggered by events. and streaming data can be simple, complex, or triggered by events.
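Because the four data types overlap, they can be modeled as combinable flags rather than as exclusive categories. The following is a minimal sketch under that assumption; `DataType` and `delivery_modes` are hypothetical names, and the function encodes only the delivery rules stated above (streaming data are always pushed; event-triggered data can be pushed or polled).

```python
from enum import Flag, auto
from typing import Optional, Set


class DataType(Flag):
    """The four data types; members can be OR-ed because they overlap."""
    SIMPLE = auto()
    COMPLEX = auto()
    EVENT_TRIGGERED = auto()
    STREAMING = auto()


def delivery_modes(kind: DataType) -> Optional[Set[str]]:
    """Return the delivery modes the text allows for this combination.

    Returns None when the text states no delivery rule for the type.
    """
    if DataType.STREAMING in kind:
        # Streaming data are always actively pushed to subscribers.
        return {"push"}
    if DataType.EVENT_TRIGGERED in kind:
        # Pushed through subscription or polled through query.
        return {"push", "poll"}
    return None
```

A value such as `DataType.STREAMING | DataType.COMPLEX | DataType.EVENT_TRIGGERED` then represents streaming data that are complex and triggered by events.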
skipping to change at page 24, line 10 skipping to change at page 25, line 34
Figure 5: Existing Work Mapping I Figure 5: Existing Work Mapping I
The second table is based on the telemetry modules and components. The second table is based on the telemetry modules and components.
+-------------+-----------------+---------------+--------------+ +-------------+-----------------+---------------+--------------+
| | Management | Control | Forwarding | | | Management | Control | Forwarding |
| | Plane | Plane | Plane | | | Plane | Plane | Plane |
+-------------+-----------------+---------------+--------------+ +-------------+-----------------+---------------+--------------+
| data config.| gRPC, NETCONF, | NETCONF/YANG | NETCONF/YANG,| | data config.| gRPC, NETCONF, | NETCONF/YANG | NETCONF/YANG,|
| & subscribe | SMIv2,YANG PUSH | | YANG FSM | | & subscribe | SMIv2,YANG PUSH | YANG PUSH | YANG PUSH |
+-------------+-----------------+---------------+--------------+ +-------------+-----------------+---------------+--------------+
| data gen. & | DNP, | DNP, | IOAM, | | data gen. & | DNP, | DNP, | IOAM, PSAMP |
| process | YANG | YANG | PBT, IPFPM, | | process | YANG | YANG | PBT, IPFPM, |
| | | | DNP | | | | | DNP |
+-------------+-----------------+---------------+--------------+ +-------------+-----------------+---------------+--------------+
| data | gRPC, NETCONF | BMP, NETCONF | IPFIX | | data | gRPC, NETCONF | BMP, NETCONF | IPFIX |
| export | YANG PUSH | | | | export | YANG PUSH | | |
+-------------+-----------------+---------------+--------------+ +-------------+-----------------+---------------+--------------+
Figure 6: Existing Work Mapping II Figure 6: Existing Work Mapping II
5. Evolution of Network Telemetry 5. Evolution of Network Telemetry
skipping to change at page 24, line 34 skipping to change at page 26, line 17
Network telemetry is a fast evolving technical area. As the network Network telemetry is a fast evolving technical area. As the network
moves towards the automated operation, network telemetry undergoes moves towards the automated operation, network telemetry undergoes
several stages of evolution. Each stage is built upon the techniques several stages of evolution. Each stage is built upon the techniques
enabled by previous stages. enabled by previous stages.
Stage 0 - Static Telemetry: The telemetry data source and type are Stage 0 - Static Telemetry: The telemetry data source and type are
determined at design time. The network operator can only determined at design time. The network operator can only
configure how to use it with limited flexibility. configure how to use it with limited flexibility.
Stage 1 - Dynamic Telemetry: The custom telemetry data can be Stage 1 - Dynamic Telemetry: The custom telemetry data can be
dynamically programmed or configured at runtime, allowing a dynamically programmed or configured at runtime without
tradeoff among resource, performance, flexibility, and coverage. interrupting the network operation, allowing a tradeoff among
DNP is an effort towards this direction. resource, performance, flexibility, and coverage. DNP is an
effort towards this direction.
Stage 2 - Interactive Telemetry: The network operator can Stage 2 - Interactive Telemetry: The network operator can
continuously customize the telemetry data in real time to reflect continuously customize and fine tune the telemetry data in real
the network operation's visibility requirements. At this stage, time to reflect the network operation's visibility requirements.
some tasks can be automated, although ultimately human operators Compared with Stage 1, the changes are frequent based on the real-
will still need to sit in the middle to make decisions. time feedback. At this stage, some tasks can be automated, but
human operators still need to sit in the middle to make decisions.
Stage 3 - Closed-loop Telemetry: Human operators are completely Stage 3 - Closed-loop Telemetry: The telemetry is free from the
excluded from the control loop. The intelligent network operation interference of human operators, except for generating the
engine automatically issues the telemetry data requests, analyzes reports. The intelligent network operation engine automatically
the data, and updates the network operations in closed control issues the telemetry data requests, analyzes the data, and updates
loops. the network operations in closed control loops.
Most of the existing technologies belong to stage 0 and stage 1. Most of the existing technologies belong to stage 0 and stage 1.
Individual stage 2 and stage 3 applications are also possible now. Individual stage 2 and stage 3 applications are also possible now.
However, the future autonomic networks may need a comprehensive However, the future autonomic networks may need a comprehensive
operation management system which relies on stage 2 and stage 3 operation management system which relies on stage 2 and stage 3
telemetry to cover all the network operation tasks. A well-defined telemetry to cover all the network operation tasks. A well-defined
network telemetry framework is the first step towards this direction. network telemetry framework is the first step towards this direction.
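The ordering of the stages, with each stage built on the previous ones, can be expressed as an ordered enumeration. This is a sketch only; the `Stage` type and the helper functions are hypothetical names that restate properties given in the stage descriptions above.

```python
from enum import IntEnum
from typing import List


class Stage(IntEnum):
    """Evolution stages of network telemetry, in order."""
    STATIC = 0
    DYNAMIC = 1
    INTERACTIVE = 2
    CLOSED_LOOP = 3


def runtime_programmable(stage: Stage) -> bool:
    """From stage 1 on, telemetry data can be programmed at runtime."""
    return stage >= Stage.DYNAMIC


def human_makes_decisions(stage: Stage) -> bool:
    """Only closed-loop telemetry removes the operator from the loop."""
    return stage < Stage.CLOSED_LOOP


def builds_on(stage: Stage) -> List[Stage]:
    """Earlier stages whose techniques this stage relies on."""
    return [s for s in Stage if s < stage]
```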
6. Security Considerations 6. Security Considerations
The complexity of network telemetry raises significant security The complexity of network telemetry raises significant security
implications. For example, telemetry data can be manipulated to implications. For example, telemetry data can be manipulated to
exhaust various network resources at each plane as well as the data exhaust various network resources at each plane as well as the data
consumer; falsified or tampered data can mislead the decision making consumer; falsified or tampered data can mislead the decision making
and paralyze networks; wrong configuration and programming for and paralyze networks; wrong configuration and programming for
telemetry is equally harmful. telemetry is equally harmful.
Given that this document has proposed a framework for network Given that this document has proposed a framework for network
telemetry and the telemetry mechanisms discussed are distinct (in telemetry and the telemetry mechanisms discussed are more extensive
both message frequency and traffic amount) from the conventional (in both message frequency and traffic amount) than the conventional
network OAM concepts, we must also reflect that various new security network OAM concepts, we must also reflect that various new security
considerations may also arise. A number of techniques already exist considerations may also arise. A number of techniques already exist
for securing the forwarding plane, the control plane, and the for securing the forwarding plane, the control plane, and the
management plane in a network, but it is important to consider if any management plane in a network, but it is important to consider if any
new threat vectors are now being enabled via the use of network new threat vectors are now being enabled via the use of network
telemetry procedures and mechanisms. telemetry procedures and mechanisms.
Security considerations for networks that use telemetry methods may Security considerations for networks that use telemetry methods may
include: include:
skipping to change at page 25, line 45 skipping to change at page 27, line 28
telemetry capabilities; telemetry capabilities;
o Protocol transport used for telemetry data and inherent security o Protocol transport used for telemetry data and inherent security
capabilities; capabilities;
o Telemetry data stores, storage encryption and methods of access; o Telemetry data stores, storage encryption and methods of access;
o Tracking telemetry events and any abnormalities that might o Tracking telemetry events and any abnormalities that might
identify malicious attacks using telemetry interfaces. identify malicious attacks using telemetry interfaces.
o Authentication and signing of telemetry data to make data more
trustworthy.
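As one possible illustration of authenticating telemetry data, a record can carry a keyed message authentication code that the collector checks before trusting it. This is a sketch under stated assumptions: it uses HMAC-SHA256 with a pre-shared key, which provides authentication and integrity but is not a public-key signature, and the record layout and function names are hypothetical.

```python
import hashlib
import hmac
import json


def sign_record(record: dict, key: bytes) -> dict:
    """Serialize a telemetry record deterministically and attach an HMAC tag."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    tag = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"payload": payload, "hmac_sha256": tag}


def verify_record(signed: dict, key: bytes) -> bool:
    """Recompute the tag over the payload and compare in constant time."""
    expected = hmac.new(
        key, signed["payload"].encode("utf-8"), hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, signed["hmac_sha256"])
```

A collector would discard any record for which `verify_record` returns `False`, which covers both tampered payloads and records produced with a different key.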
Some of the security considerations highlighted above may be Some of the security considerations highlighted above may be
minimized or negated with policy management of network telemetry. In minimized or negated with policy management of network telemetry. In
a network telemetry deployment it would be advantageous to separate a network telemetry deployment it would be advantageous to separate
telemetry capabilities into different classes of policies, i.e., Role telemetry capabilities into different classes of policies, i.e., Role
Based Access Control and Event-Condition-Action policies. Also, Based Access Control and Event-Condition-Action policies. Also,
potential conflicts between network telemetry mechanisms must be potential conflicts between network telemetry mechanisms must be
detected accurately and resolved quickly to avoid unnecessary network detected accurately and resolved quickly to avoid unnecessary network
telemetry traffic propagation escalating into an unintended or telemetry traffic propagation escalating into an unintended or
intended denial of service attack. intended denial of service attack.
Further study of the security issues will be required, and it is Further study of the security issues will be required, and it is
expected that the security mechanisms and protocols are devloped and expected that the security mechanisms and protocols are developed and
deployed along with a network telemetry system. deployed along with a network telemetry system.
7. IANA Considerations 7. IANA Considerations
This document includes no request to IANA. This document includes no request to IANA.
8. Contributors 8. Contributors
The other contributors of this document are listed as follows. The other contributors of this document are listed as follows.
skipping to change at page 27, line 14 skipping to change at page 29, line 8
[I-D.ietf-grow-bmp-adj-rib-out] [I-D.ietf-grow-bmp-adj-rib-out]
Evens, T., Bayraktar, S., Lucente, P., Mi, K., and S. Evens, T., Bayraktar, S., Lucente, P., Mi, K., and S.
Zhuang, "Support for Adj-RIB-Out in BGP Monitoring Zhuang, "Support for Adj-RIB-Out in BGP Monitoring
Protocol (BMP)", draft-ietf-grow-bmp-adj-rib-out-07 (work Protocol (BMP)", draft-ietf-grow-bmp-adj-rib-out-07 (work
in progress), August 2019. in progress), August 2019.
[I-D.ietf-grow-bmp-local-rib] [I-D.ietf-grow-bmp-local-rib]
Evens, T., Bayraktar, S., Bhardwaj, M., and P. Lucente, Evens, T., Bayraktar, S., Bhardwaj, M., and P. Lucente,
"Support for Local RIB in BGP Monitoring Protocol (BMP)", "Support for Local RIB in BGP Monitoring Protocol (BMP)",
draft-ietf-grow-bmp-local-rib-07 (work in progress), May draft-ietf-grow-bmp-local-rib-08 (work in progress),
2020. November 2020.
[I-D.ietf-ippm-ioam-data] [I-D.ietf-ippm-ioam-data]
Brockners, F., Bhandari, S., and T. Mizrahi, "Data Fields Brockners, F., Bhandari, S., and T. Mizrahi, "Data Fields
for In-situ OAM", draft-ietf-ippm-ioam-data-10 (work in for In-situ OAM", draft-ietf-ippm-ioam-data-11 (work in
progress), July 2020. progress), November 2020.
[I-D.ietf-netconf-distributed-notif] [I-D.ietf-netconf-distributed-notif]
Zhou, T., Zheng, G., Voit, E., Graf, T., and P. Francois, Zhou, T., Zheng, G., Voit, E., Graf, T., and P. Francois,
"Subscription to Distributed Notifications", draft-ietf- "Subscription to Distributed Notifications", draft-ietf-
netconf-distributed-notif-00 (work in progress), October netconf-distributed-notif-01 (work in progress), November
2020. 2020.
[I-D.ietf-netconf-udp-notif] [I-D.ietf-netconf-udp-notif]
Zheng, G., Zhou, T., Graf, T., Francois, P., and P. Zheng, G., Zhou, T., Graf, T., Francois, P., and P.
Lucente, "UDP-based Transport for Configured Lucente, "UDP-based Transport for Configured
Subscriptions", draft-ietf-netconf-udp-notif-00 (work in Subscriptions", draft-ietf-netconf-udp-notif-01 (work in
progress), October 2020. progress), November 2020.
[I-D.irtf-nmrg-ibn-concepts-definitions] [I-D.irtf-nmrg-ibn-concepts-definitions]
Clemm, A., Ciavaglia, L., Granville, L., and J. Tantsura, Clemm, A., Ciavaglia, L., Granville, L., and J. Tantsura,
"Intent-Based Networking - Concepts and Definitions", "Intent-Based Networking - Concepts and Definitions",
draft-irtf-nmrg-ibn-concepts-definitions-02 (work in draft-irtf-nmrg-ibn-concepts-definitions-02 (work in
progress), September 2020. progress), September 2020.
[I-D.kumar-rtgwg-grpc-protocol] [I-D.kumar-rtgwg-grpc-protocol]
Kumar, A., Kolhe, J., Ghemawat, S., and L. Ryan, "gRPC Kumar, A., Kolhe, J., Ghemawat, S., and L. Ryan, "gRPC
Protocol", draft-kumar-rtgwg-grpc-protocol-00 (work in Protocol", draft-kumar-rtgwg-grpc-protocol-00 (work in
skipping to change at page 28, line 12 skipping to change at page 30, line 6
(gNMI)", draft-openconfig-rtgwg-gnmi-spec-01 (work in (gNMI)", draft-openconfig-rtgwg-gnmi-spec-01 (work in
progress), March 2018. progress), March 2018.
[I-D.pedro-nmrg-anticipated-adaptation] [I-D.pedro-nmrg-anticipated-adaptation]
Martinez-Julia, P., "Exploiting External Event Detectors Martinez-Julia, P., "Exploiting External Event Detectors
to Anticipate Resource Requirements for the Elastic to Anticipate Resource Requirements for the Elastic
Adaptation of SDN/NFV Systems", draft-pedro-nmrg- Adaptation of SDN/NFV Systems", draft-pedro-nmrg-
anticipated-adaptation-02 (work in progress), June 2018. anticipated-adaptation-02 (work in progress), June 2018.
[I-D.song-ippm-postcard-based-telemetry] [I-D.song-ippm-postcard-based-telemetry]
Song, H., Zhou, T., Li, Z., Shin, J., and K. Lee, Song, H., Zhou, T., Li, Z., Mirsky, G., Shin, J., and K.
"Postcard-based On-Path Flow Data Telemetry", draft-song- Lee, "Postcard-based On-Path Flow Data Telemetry using
ippm-postcard-based-telemetry-07 (work in progress), April Packet Marking", draft-song-ippm-postcard-based-
2020. telemetry-08 (work in progress), October 2020.
[I-D.song-opsawg-dnp4iq] [I-D.song-opsawg-dnp4iq]
Song, H. and J. Gong, "Requirements for Interactive Query Song, H. and J. Gong, "Requirements for Interactive Query
with Dynamic Network Probes", draft-song-opsawg-dnp4iq-01 with Dynamic Network Probes", draft-song-opsawg-dnp4iq-01
(work in progress), June 2017. (work in progress), June 2017.
[I-D.song-opsawg-ifit-framework] [I-D.song-opsawg-ifit-framework]
Song, H., Qin, F., Chen, H., Jin, J., and J. Shin, "In- Song, H., Qin, F., Chen, H., Jin, J., and J. Shin, "In-
situ Flow Information Telemetry", draft-song-opsawg-ifit- situ Flow Information Telemetry", draft-song-opsawg-ifit-
framework-13 (work in progress), October 2020. framework-13 (work in progress), October 2020.
[I-D.wwx-netmod-event-yang] [I-D.wwx-netmod-event-yang]
Bierman, A., WU, Q., Bryskin, I., Birkholz, H., Liu, X., WU, Q., Bryskin, I., Birkholz, H., Liu, X., and B. Claise,
and B. Claise, "A YANG Data model for ECA Policy "A YANG Data model for ECA Policy Management", draft-wwx-
Management", draft-wwx-netmod-event-yang-09 (work in netmod-event-yang-10 (work in progress), November 2020.
progress), July 2020.
[RFC1157] Case, J., Fedor, M., Schoffstall, M., and J. Davin, [RFC1157] Case, J., Fedor, M., Schoffstall, M., and J. Davin,
"Simple Network Management Protocol (SNMP)", RFC 1157, "Simple Network Management Protocol (SNMP)", RFC 1157,
DOI 10.17487/RFC1157, May 1990, DOI 10.17487/RFC1157, May 1990,
<https://www.rfc-editor.org/info/rfc1157>. <https://www.rfc-editor.org/info/rfc1157>.
[RFC2578] McCloghrie, K., Ed., Perkins, D., Ed., and J. [RFC2578] McCloghrie, K., Ed., Perkins, D., Ed., and J.
Schoenwaelder, Ed., "Structure of Management Information Schoenwaelder, Ed., "Structure of Management Information
Version 2 (SMIv2)", STD 58, RFC 2578, Version 2 (SMIv2)", STD 58, RFC 2578,
DOI 10.17487/RFC2578, April 1999, DOI 10.17487/RFC2578, April 1999,
skipping to change at page 30, line 52 skipping to change at page 32, line 47
In this non-normative appendix, we provide an overview of some In this non-normative appendix, we provide an overview of some
existing techniques and standard proposals for each network telemetry existing techniques and standard proposals for each network telemetry
module. module.
A.1. Management Plane Telemetry A.1. Management Plane Telemetry
A.1.1. Push Extensions for NETCONF A.1.1. Push Extensions for NETCONF
NETCONF [RFC6241] is one popular network management protocol, which NETCONF [RFC6241] is one popular network management protocol, which
is also recommended by IETF. Although it can be used for data is also recommended by IETF. Although it can be used for data
collection, NETCONF is good at configurations. YANG Push collection, NETCONF is good at configurations. YANG Push [RFC8641]
[RFC8639] extends NETCONF and enables subscriber applications to
request a continuous, customized stream of updates from a YANG
datastore. Providing such visibility into changes made upon YANG
configuration and operational objects enables new capabilities based
on the remote mirroring of configuration and operational state.
[RFC8641][RFC8639] extends NETCONF and enables subscriber Moreover, distributed data collection mechanism
applications to request a continuous, customized stream of updates
from a YANG datastore. Providing such visibility into changes made
upon YANG configuration and operational objects enables new
capabilities based on the remote mirroring of configuration and
operational state. Moreover, distributed data collection mechanism
[I-D.ietf-netconf-distributed-notif] via UDP based publication [I-D.ietf-netconf-distributed-notif] via UDP based publication
channel [I-D.ietf-netconf-udp-notif] provides enhanced efficiency for channel [I-D.ietf-netconf-udp-notif] provides enhanced efficiency for
the NETCONF based telemetry. the NETCONF based telemetry.
A.1.2. gRPC Network Management Interface A.1.2. gRPC Network Management Interface
gRPC Network Management Interface (gNMI) gRPC Network Management Interface (gNMI)
[I-D.openconfig-rtgwg-gnmi-spec] is a network management protocol [I-D.openconfig-rtgwg-gnmi-spec] is a network management protocol
based on the gRPC [I-D.kumar-rtgwg-grpc-protocol] RPC (Remote based on the gRPC [I-D.kumar-rtgwg-grpc-protocol] RPC (Remote
Procedure Call) framework. With a single gRPC service definition, Procedure Call) framework. With a single gRPC service definition,
 End of changes. 70 change blocks. 
298 lines changed or deleted 347 lines changed or added
