draft-ietf-opsawg-ntf-08.txt   draft-ietf-opsawg-ntf-09.txt 
OPSAWG H. Song OPSAWG H. Song
Internet-Draft Futurewei Internet-Draft Futurewei
Intended status: Informational F. Qin Intended status: Informational F. Qin
Expires: 10 April 2022 China Mobile Expires: 16 April 2022 China Mobile
P. Martinez-Julia P. Martinez-Julia
NICT NICT
L. Ciavaglia L. Ciavaglia
Nokia Nokia
A. Wang A. Wang
China Telecom China Telecom
7 October 2021 13 October 2021
Network Telemetry Framework Network Telemetry Framework
draft-ietf-opsawg-ntf-08 draft-ietf-opsawg-ntf-09
Abstract Abstract
Network telemetry is a technology for gaining network insight and Network telemetry is a technology for gaining network insight and
facilitating efficient and automated network management. It facilitating efficient and automated network management. It
encompasses various techniques for remote data generation, encompasses various techniques for remote data generation,
collection, correlation, and consumption. This document describes an collection, correlation, and consumption. This document describes an
architectural framework for network telemetry, motivated by architectural framework for network telemetry, motivated by
challenges that are encountered as part of the operation of networks challenges that are encountered as part of the operation of networks
and by the requirements that ensue. This document clarifies the and by the requirements that ensue. This document clarifies the
terminologies and classifies the modules and components of a network terminologies and classifies the modules and components of a network
telemetry system from several different perspectives. The framework telemetry system from different perspectives. The framework and
and taxonomy help to set a common ground for the collection of taxonomy help to set a common ground for the collection of related
related work and provide guidance for related technique and standard work and provide guidance for related technique and standard
developments. developments.
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on 10 April 2022. This Internet-Draft will expire on 16 April 2022.
Copyright Notice Copyright Notice
Copyright (c) 2021 IETF Trust and the persons identified as the Copyright (c) 2021 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/ Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document. license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights Please review these documents carefully, as they describe your rights
skipping to change at page 3, line 48 skipping to change at page 3, line 48
techniques and standard works. techniques and standard works.
To fulfill such an undertaking, we first discuss some key To fulfill such an undertaking, we first discuss some key
characteristics of network telemetry which set a clear distinction characteristics of network telemetry which set a clear distinction
from the conventional network OAM and show that some conventional OAM from the conventional network OAM and show that some conventional OAM
technologies can be considered a subset of the network telemetry technologies can be considered a subset of the network telemetry
technologies. We then provide an architectural framework for network technologies. We then provide an architectural framework for network
telemetry which includes four modules, each concerned with a telemetry which includes four modules, each concerned with a
different category of telemetry data and corresponding procedures. different category of telemetry data and corresponding procedures.
All the modules are internally structured in the same way, including All the modules are internally structured in the same way, including
components that allow to configure data sources with regards to what components that allow to configure data sources in regard to what
data to generate and how to make that available to client data to generate and how to make that available to client
applications, components that instrument the underlying data sources, applications, components that instrument the underlying data sources,
and components that perform the actual rendering, encoding, and and components that perform the actual rendering, encoding, and
exporting of the generated data. We show how the network telemetry exporting of the generated data. We show how the network telemetry
framework can benefit the current and future network operations. framework can benefit the current and future network operations.
Based on the distinction of modules and function components, we can Based on the distinction of modules and function components, we can
map the existing and emerging techniques and protocols into the map the existing and emerging techniques and protocols into the
framework. The framework can also simplify the tasks for designing, framework. The framework can also simplify the tasks for designing,
maintaining, and understanding a network telemetry system. Finally, maintaining, and understanding a network telemetry system. Finally,
we outline the evolution stages of the network telemetry system and we outline the evolution stages of the network telemetry system and
skipping to change at page 4, line 49 skipping to change at page 4, line 49
DPI: Deep Packet Inspection, referring to the techniques that DPI: Deep Packet Inspection, referring to the techniques that
examine packets beyond the L3/L4 headers. examine packets beyond the L3/L4 headers.
gNMI: gRPC Network Management Interface, a network management gNMI: gRPC Network Management Interface, a network management
protocol from OpenConfig Operator Working Group, mainly protocol from OpenConfig Operator Working Group, mainly
contributed by Google. See [gnmi] for details. contributed by Google. See [gnmi] for details.
GPB: Google Protocol Buffer, an extensible mechanism for serializing GPB: Google Protocol Buffer, an extensible mechanism for serializing
structured data. structured data.
gRPC: gRPC Remote Procedure Call, a open source high performance RPC gRPC: gRPC Remote Procedure Call, an open source high performance
framework that gNMI is based on. See [grpc] for details. RPC framework that gNMI is based on. See [grpc] for details.
IPFIX: IP Flow Information Export Protocol, specified in [RFC7011]. IPFIX: IP Flow Information Export Protocol, specified in [RFC7011].
IOAM: In-situ OAM, a dataplane on-path telemetry technique. IOAM: In-situ OAM, a dataplane on-path telemetry technique.
JSON: An open standard file format and data interchange format that JSON: An open standard file format and data interchange format that
uses human-readable text to store and transmit data objects. uses human-readable text to store and transmit data objects.
MIB: Management Information Base, a database used for managing the MIB: Management Information Base, a database used for managing the
entities in a network. entities in a network.
skipping to change at page 8, line 6 skipping to change at page 8, line 6
While the list is by no means exhaustive, it is enough to highlight While the list is by no means exhaustive, it is enough to highlight
the requirements for data velocity, variety, volume, and veracity in the requirements for data velocity, variety, volume, and veracity in
networks. networks.
* Security: Network intrusion detection and prevention systems need * Security: Network intrusion detection and prevention systems need
to monitor network traffic and activities and act upon anomalies. to monitor network traffic and activities and act upon anomalies.
Given increasingly sophisticated attack vectors coupled with Given increasingly sophisticated attack vectors coupled with
increasingly severe consequences of security breaches, new tools increasingly severe consequences of security breaches, new tools
and techniques need to be developed, relying on wider and deeper and techniques need to be developed, relying on wider and deeper
visibility into networks. The ultimate goal is to achieve the visibility into networks. The ultimate goal is to achieve the
ideal security with no or minimal human intervention. ideal security with no, or only minimal, human intervention.
* Policy and Intent Compliance: Network policies are the rules that * Policy and Intent Compliance: Network policies are the rules that
constrain the services for network access, provide service constrain the services for network access, provide service
differentiation, or enforce specific treatment on the traffic. differentiation, or enforce specific treatment on the traffic.
For example, a service function chain is a policy that requires For example, a service function chain is a policy that requires
the selected flows to pass through a set of ordered network the selected flows to pass through a set of ordered network
functions. Intent, as defined in functions. Intent, as defined in
[I-D.irtf-nmrg-ibn-concepts-definitions], is a set of operational [I-D.irtf-nmrg-ibn-concepts-definitions], is a set of operational
goals that a network should meet and outcomes that a network is goals that a network should meet and outcomes that a network is
supposed to deliver, defined in a declarative manner without supposed to deliver, defined in a declarative manner without
specifying how to achieve or implement them. An intent requires a specifying how to achieve or implement them. An intent requires a
complex translation and mapping process before being applied on complex translation and mapping process before being applied on
networks. While a policy or an intent is enforced, the compliance networks. While a policy or intent is enforced, the compliance
needs to be verified and monitored continuously relying on needs to be verified and monitored continuously by relying on
visibility that is provided through network telemetry data, any visibility that is provided through network telemetry data. Any
violation needs to be reported immediately, and updates need to be violation must be notified immediately, potentially resulting in
applied to ensure the intent remains in force. updates to how the policy or intent is applied in the network to
ensure that it remains in force, or otherwise alerting the network
administrator to the policy or intent violation.
* SLA Compliance: A Service-Level Agreement (SLA) defines the level * SLA Compliance: A Service-Level Agreement (SLA) defines the level
of service a user expects from a network operator, which includes of service a user expects from a network operator, which includes
the metrics for the service measurement and remedy/penalty the metrics for the service measurement and remedy/penalty
procedures when the service level misses the agreement. Users procedures when the service level misses the agreement. Users
need to check if they get the service as promised and network need to check if they get the service as promised and network
operators need to evaluate how they can deliver the services that operators need to evaluate how they can deliver the services that
can meet the SLA based on realtime network telemetry data, can meet the SLA based on realtime network telemetry data,
including data from network measurements. including data from network measurements.
* Root Cause Analysis: Any network failure can be the effect of a * Root Cause Analysis: Any network failure can be the effect of a
sequence of chained events. Troubleshooting and recovery require sequence of chained events. Troubleshooting and recovery require
quick identification of the root cause of any observable issues. quick identification of the root cause of any observable issues.
However, the root cause is not always straightforward to identify, However, the root cause is not always straightforward to identify,
especially when the failure is sporadic and the number of event especially when the failure is sporadic and the number of event
messages, both related and unrelated to the same cause, is messages, both related and unrelated to the same cause, is
overwhelming. While machine learning technologies can be used for overwhelming. While machine learning technologies can be used for
root cause analysis, it is up to the network to sense and provide the root cause analysis, it is up to the network to sense and provide the
relevant diagnostic data which are either actively fed into or relevant diagnostic data which are either actively fed into, or
passively retrieved by machine learning applications. passively retrieved by, machine learning applications.
* Network Optimization: This covers all short-term and long-term * Network Optimization: This covers all short-term and long-term
network optimization techniques, including load balancing, Traffic network optimization techniques, including load balancing, Traffic
Engineering (TE), and network planning. Network operators are Engineering (TE), and network planning. Network operators are
motivated to optimize their network utilization and differentiate motivated to optimize their network utilization and differentiate
services for better Return On Investment (ROI) or lower Capital services for better Return On Investment (ROI) or lower Capital
Expenditures (CAPEX). The first step is to know the real-time Expenditures (CAPEX). The first step is to know the real-time
network conditions before applying policies for traffic network conditions before applying policies for traffic
manipulation. In some cases, micro-bursts need to be detected in manipulation. In some cases, micro-bursts need to be detected in
a very short time-frame so that fine-grained traffic control can a very short time-frame so that fine-grained traffic control can
skipping to change at page 10, line 15 skipping to change at page 10, line 9
* Many application scenarios need to correlate network-wide data * Many application scenarios need to correlate network-wide data
from multiple sources (i.e., from distributed network devices, from multiple sources (i.e., from distributed network devices,
different components of a network device, or different network different components of a network device, or different network
planes). A piecemeal solution is often lacking the capability to planes). A piecemeal solution is often lacking the capability to
consolidate the data from multiple sources. The composition of a consolidate the data from multiple sources. The composition of a
complete solution, as partly proposed by Autonomic Resource complete solution, as partly proposed by Autonomic Resource
Control Architecture (ARCA) Control Architecture (ARCA)
[I-D.pedro-nmrg-anticipated-adaptation], will be empowered and [I-D.pedro-nmrg-anticipated-adaptation], will be empowered and
guided by a comprehensive framework. guided by a comprehensive framework.
* Some of the conventional OAM techniques (e.g., CLI and Syslog) * Some conventional OAM techniques (e.g., CLI and Syslog) lack a
lack a formal data model. The unstructured data hinder the tool formal data model. The unstructured data hinder the tool
automation and application extensibility. Standardized data automation and application extensibility. Standardized data
models are essential to support the programmable networks. models are essential to support the programmable networks.
* Although some conventional OAM techniques support data push (e.g., * Although some conventional OAM techniques support data push (e.g.,
SNMP Trap [RFC2981][RFC3877], Syslog, and sFlow), the pushed data SNMP Trap [RFC2981][RFC3877], Syslog, and sFlow), the pushed data
are limited to only predefined management plane warnings (e.g., are limited to only predefined management plane warnings (e.g.,
SNMP Trap) or sampled user packets (e.g., sFlow). Network SNMP Trap) or sampled user packets (e.g., sFlow). Network
operators require the data with arbitrary source, granularity, and operators require the data with arbitrary source, granularity, and
precision which are beyond the capability of the existing precision which are beyond the capability of the existing
techniques. techniques.
skipping to change at page 14, line 5 skipping to change at page 14, line 5
make efficient use of network resources and reduce the impact of make efficient use of network resources and reduce the impact of
processing related to network telemetry on network performance. processing related to network telemetry on network performance.
For example, routine network monitoring should cover the entire For example, routine network monitoring should cover the entire
network with a low data sampling rate. Only when issues arise or network with a low data sampling rate. Only when issues arise or
critical trends emerge should telemetry data source be modified critical trends emerge should telemetry data source be modified
and telemetry data rates boosted as needed. and telemetry data rates boosted as needed.
* Efficient data fusion is critical for applications to reduce the * Efficient data fusion is critical for applications to reduce the
overall quantity of data and improve the accuracy of analysis. overall quantity of data and improve the accuracy of analysis.
A telemetry framework collects together all of the telemetry-related A telemetry framework collects together all the telemetry-related
works from different sources and working groups within IETF. This works from different sources and working groups within IETF. This
makes it possible to assemble a comprehensive network telemetry makes it possible to assemble a comprehensive network telemetry
system and to avoid repetitious or redundant work. The framework system and to avoid repetitious or redundant work. The framework
should cover the concepts and components from the standardization should cover the concepts and components from the standardization
perspective. This document describes the modules which make up a perspective. This document describes the modules which make up a
network telemetry framework and decomposes the telemetry system into network telemetry framework and decomposes the telemetry system into
a set of distinct components that existing and future work can easily a set of distinct components that existing and future work can easily
map to. map to.
4. Network Telemetry Framework 4. Network Telemetry Framework
The top level network telemetry framework partitions the network The top level network telemetry framework partitions the network
telemetry into four modules based on the telemetry data object source telemetry into four modules based on the telemetry data object source
and represents their relationship. At the next level, the framework and represents their relationship. At the next level, the framework
decomposes each module into separate components. Each of the modules decomposes each module into separate components. Each of the modules
follows the same underlying structure, with one component dedicated follows the same underlying structure, with one component dedicated
to the configuration of data subscriptions and data sources, a second to the configuration of data subscriptions and data sources, a second
component dedicated to encoding and exporting data, and a third component dedicated to encoding and exporting data, and a third
component instrumenting the generation of telemetry related to the component instrumenting the generation of telemetry related to the
underlying resources. Throughout the framework, the same set of underlying resources. Throughout the framework, the same set of
abstract data acquiring mechanisms and data types (Section 4.3)are abstract data acquiring mechanisms and data types (Section 4.3) are
applied. The two-level architecture with the uniform data applied. The two-level architecture with the uniform data
abstraction helps accurately pinpoint a protocol or technique to its abstraction helps accurately pinpoint a protocol or technique to its
position in a network telemetry system or disaggregate a network position in a network telemetry system or disaggregate a network
telemetry system into manageable parts. telemetry system into manageable parts.
4.1. Top Level Modules 4.1. Top Level Modules
Telemetry can be applied on the forwarding plane, the control plane, Telemetry can be applied on the forwarding plane, the control plane,
and the management plane in a network, as well as other sources out and the management plane in a network, as well as other sources out
of the network, as shown in Figure 1. Therefore, we categorize the of the network, as shown in Figure 1. Therefore, we categorize the
skipping to change at page 16, line 21 skipping to change at page 16, line 21
Because the locations that can export data have different Because the locations that can export data have different
capabilities, different choices of data model, encoding, and capabilities, different choices of data model, encoding, and
transport method are made to balance the performance and cost. For transport method are made to balance the performance and cost. For
example, the forwarding chip has high throughput but limited capacity example, the forwarding chip has high throughput but limited capacity
for processing complex data and maintaining states, while the main for processing complex data and maintaining states, while the main
control CPU is capable of complex data and state processing, but has control CPU is capable of complex data and state processing, but has
limited bandwidth for high throughput data. As a result, the limited bandwidth for high throughput data. As a result, the
suitable telemetry protocol for each module can be different. Some suitable telemetry protocol for each module can be different. Some
representative techniques are shown in the corresponding table blocks representative techniques are shown in the corresponding table blocks
to highlight the technical diversity of these modules. Note that the to highlight the technical diversity of these modules. Note that the
selected techniques just reflect the de-facto state of the art and selected techniques just reflect the de facto state of the art and
are not exhaustive. The key point is that one cannot expect to use a are not exhaustive. The key point is that one cannot expect to use a
universal protocol to cover all the network telemetry requirements. universal protocol to cover all the network telemetry requirements.
+---------+--------------+--------------+---------------+-----------+ +---------+--------------+--------------+---------------+-----------+
| Module | Control | Management | Forwarding | External | | Module | Management | Control | Forwarding | External |
| | Plane | Plane | Plane | Data | | | Plane | Plane | Plane | Data |
+---------+--------------+--------------+---------------+-----------+ +---------+--------------+--------------+---------------+-----------+
|Object | control | config. & | flow & packet | terminal, | |Object | config. & | control | flow & packet | terminal, |
| | protocol & | operation | QoS, traffic | social & | | | operation | protocol & | QoS, traffic | social & |
| | signaling, | state | stat., buffer | environ- | | | state | signaling, | stat., buffer | environ- |
| | RIB, ACL | | & queue stat.,| mental | | | | RIB | & queue stat.,| mental |
| | | | ACL, FIB | | | | | | ACL, FIB | |
+---------+--------------+--------------+---------------+-----------+ +---------+--------------+--------------+---------------+-----------+
|Export | main control | main control | fwding chip | various | |Export | main control | main control | fwding chip | various |
|Location | CPU, | CPU | or linecard | | |Location | CPU | CPU, | or linecard | |
| | linecard CPU | | CPU; main | | | | | linecard CPU | CPU; main | |
| | or fwding | | control CPU | | | | | or forwarding| control CPU | |
| | chip | | unlikely | | | | | chip | unlikely | |
+---------+--------------+--------------+---------------+-----------+ +---------+--------------+--------------+---------------+-----------+
|Data | YANG, | YANG, MIB, | template, | YANG, | |Data | YANG, MIB, | YANG, | template, | YANG, |
|Model | custom | syslog, | YANG, | custom | |Model | syslog | custom | YANG, | custom |
| | | | custom | | | | | | custom | |
+---------+--------------+--------------+---------------+-----------+ +---------+--------------+--------------+---------------+-----------+
|Data | GPB, JSON, | GPB, JSON, | plain | GPB, JSON | |Data | GPB, JSON, | GPB, JSON, | plain | GPB, JSON |
|Encoding | XML, plain | XML | | XML, plain| |Encoding | XML | XML, plain | | XML, plain|
+---------+--------------+--------------+---------------+-----------+ +---------+--------------+--------------+---------------+-----------+
|Protocol | gRPC,NETCONF,| gRPC,NETCONF,| IPFIX, mirror,| gRPC | |Protocol | gRPC,NETCONF,| gRPC,NETCONF,| IPFIX, mirror,| gRPC |
| | IPFIX,mirror | | gRPC, NETFLOW | | | | | IPFIX, mirror| gRPC, NETFLOW | |
+---------+--------------+--------------+---------------+-----------+ +---------+--------------+--------------+---------------+-----------+
|Transport| HTTP, TCP, | HTTP, TCP | UDP | HTTP,TCP | |Transport| HTTP, TCP | HTTP, TCP, | UDP | HTTP,TCP |
| | UDP | | | UDP | | | | UDP | | UDP |
+---------+--------------+--------------+---------------+-----------+ +---------+--------------+--------------+---------------+-----------+
Figure 2: Comparison of the Data Object Modules Figure 2: Comparison of the Data Object Modules
Note that the interaction with the applications that consume network Note that the interaction with the applications that consume network
telemetry data can be indirect. Some in-device data transfer is telemetry data can be indirect. Some in-device data transfer is
possible. For example, in the management plane telemetry, the possible. For example, in the management plane telemetry, the
management plane will need to acquire data from the data plane. Some management plane will need to acquire data from the data plane. Some
of the operational states can only be derived from data plane data operational states can only be derived from data plane data sources
sources such as the interface status and statistics. As another such as the interface status and statistics. As another example,
example, obtaining control plane telemetry data may require the obtaining control plane telemetry data may require the ability to
ability to access the Forwarding Information Base (FIB) of the data access the Forwarding Information Base (FIB) of the data plane.
plane.
On the other hand, an application may involve more than one plane and On the other hand, an application may involve more than one plane and
interact with multiple planes simultaneously. For example, an SLA interact with multiple planes simultaneously. For example, an SLA
compliance application may require both the data plane telemetry and compliance application may require both the data plane telemetry and
the control plane telemetry. the control plane telemetry.
The requirements and challenges for each module are summarized as The requirements and challenges for each module are summarized as
follows (note that the requirements may pertain across all telemetry follows (note that the requirements may pertain across all telemetry
modules; however, we emphasize those that are most pronounced for a modules; however, we emphasize those that are most pronounced for a
particular plane). particular plane).
skipping to change at page 18, line 21 skipping to change at page 18, line 21
The management plane of network elements interacts with the Network The management plane of network elements interacts with the Network
Management System (NMS), and provides information such as performance Management System (NMS), and provides information such as performance
data, network logging data, network warning and defects data, and data, network logging data, network warning and defects data, and
network statistics and state data. The management plane includes network statistics and state data. The management plane includes
many protocols, including some that are considered "legacy", such as many protocols, including some that are considered "legacy", such as
SNMP and syslog. Regardless of the protocol, management plane telemetry SNMP and syslog. Regardless of the protocol, management plane telemetry
must address the following requirements: must address the following requirements:
* Convenient Data Subscription: An application should have the * Convenient Data Subscription: An application should have the
freedom to choose the data export means such as the data types (as freedom to choose which data is exported (see section 4.3) and the
described in Figure 4) and the export means and frequency (e.g., means and frequency of how that data is exported (e.g., on-change
on-change or periodic subscription). or periodic subscription).
* Structured Data: For automatic network operation, machines will * Structured Data: For automatic network operation, machines will
replace humans for network data comprehension. Data modeling replace humans for network data comprehension. Data modeling
languages, such as YANG, can efficiently describe structured data languages, such as YANG, can efficiently describe structured data
and normalize data encoding and transformation. and normalize data encoding and transformation.
* High Speed Data Transport: In order to keep up with the velocity * High Speed Data Transport: In order to keep up with the velocity
of information, a server needs to be able to send large amounts of of information, a server needs to be able to send large amounts of
data at high frequency. Compact encoding formats or data data at high frequency. Compact encoding formats or data
compression schemes are needed to compress the data and improve compression schemes are needed to reduce the quantity of data and
the data transport efficiency. The subscription mode, by improve the data transport efficiency. The subscription mode, by
replacing the query mode, reduces the interactions between clients replacing the query mode, reduces the interactions between clients
and servers and helps to improve the server's efficiency. and servers and helps to improve the server's efficiency.
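The two subscription modes named above (on-change and periodic) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not part of either draft revision: the names TelemetryServer and Subscription, the data paths, and the callback signature are hypothetical and do not correspond to any specific telemetry protocol or library.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional

_UNSET = object()  # sentinel: nothing has been pushed to this subscriber yet


@dataclass
class Subscription:
    """Hypothetical subscription record; all field names are illustrative."""
    path: str                                     # data path chosen by the application
    mode: str                                     # "on-change" or "periodic"
    callback: Callable[[str, Any, float], None]   # receives (path, value, time)
    interval: float = 0.0                         # seconds; used in periodic mode only
    last_sent: Optional[float] = None
    last_value: Any = _UNSET


class TelemetryServer:
    """Toy data source that pushes data instead of waiting to be queried."""

    def __init__(self) -> None:
        self.values: Dict[str, Any] = {}
        self.subs: List[Subscription] = []

    def subscribe(self, sub: Subscription) -> None:
        if sub.mode not in ("on-change", "periodic"):
            raise ValueError(f"unknown mode: {sub.mode}")
        if sub.mode == "periodic" and sub.interval <= 0:
            raise ValueError("periodic subscriptions need a positive interval")
        self.subs.append(sub)

    def _push(self, sub: Subscription, now: float) -> None:
        value = self.values[sub.path]
        sub.callback(sub.path, value, now)
        sub.last_sent = now
        sub.last_value = value

    def update(self, path: str, value: Any, now: float) -> None:
        """Store a new value and notify on-change subscribers if it differs
        from the value they last received."""
        self.values[path] = value
        for sub in self.subs:
            if (sub.path == path and sub.mode == "on-change"
                    and sub.last_value != value):
                self._push(sub, now)

    def tick(self, now: float) -> None:
        """Export the current value to periodic subscribers whose interval
        has elapsed (or that have never been served)."""
        for sub in self.subs:
            if sub.mode != "periodic" or sub.path not in self.values:
                continue
            if sub.last_sent is None or now - sub.last_sent >= sub.interval:
                self._push(sub, now)
```

In this sketch the client registers once and the server decides when to send, which is the reduction in client-server interactions that the subscription mode offers over the query mode. Time is passed in explicitly so that the behavior is deterministic and easy to test.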
4.1.2. Control Plane Telemetry 4.1.2. Control Plane Telemetry
The control plane telemetry refers to the health condition monitoring The control plane telemetry refers to the health condition monitoring
of different network control protocols at all layers of the protocol of different network control protocols at all layers of the protocol
stack. Keeping track of the operational status of these protocols is stack. Keeping track of the operational status of these protocols is
beneficial for detecting, localizing, and even predicting various beneficial for detecting, localizing, and even predicting various
network issues, as well as network optimization, in real-time and network issues, as well as network optimization, in real-time and
skipping to change at page 19, line 22 skipping to change at page 19, line 22
common issue behind these methods is that they only measure the common issue behind these methods is that they only measure the
KPIs instead of reflecting the actual running status of these KPIs instead of reflecting the actual running status of these
protocols, making them less effective or efficient for control protocols, making them less effective or efficient for control
plane troubleshooting and network optimization. plane troubleshooting and network optimization.
* An example of the control plane telemetry is the BGP monitoring * An example of the control plane telemetry is the BGP monitoring
protocol (BMP), which is currently used for monitoring the BGP routes protocol (BMP), which is currently used for monitoring the BGP routes
and enables rich applications, such as BGP peer analysis, AS and enables rich applications, such as BGP peer analysis, AS
analysis, prefix analysis, and security analysis. However, the analysis, prefix analysis, and security analysis. However, the
monitoring of other layers, protocols and the cross-layer, cross- monitoring of other layers, protocols and the cross-layer, cross-
protocol KPI correlations are still in their infancy (e.g., the protocol KPI correlations are still in their infancy (e.g., IGP
IGP monitoring is not as exensive as BMP), which require further monitoring is not as extensive as BMP), which require further
research. research.
4.1.3. Forwarding Plane Telemetry 4.1.3. Forwarding Plane Telemetry
An effective forwarding plane telemetry system relies on the data An effective forwarding plane telemetry system relies on the data
that the network device can expose. The quality, quantity, and that the network device can expose. The quality, quantity, and
timeliness of data must meet some stringent requirements. This timeliness of data must meet some stringent requirements. This
raises some challenges to the network data plane devices where the raises some challenges to the network data plane devices where the
first hand data originates. first-hand data originates.
* A data plane device's main function is user traffic processing and * A data plane device's main function is user traffic processing and
forwarding. While supporting network visibility is important, the forwarding. While supporting network visibility is important, the
telemetry is just an auxiliary function, and it should strive to telemetry is just an auxiliary function, and it should strive to
not impede normal traffic processing and forwarding (i.e., the not impede normal traffic processing and forwarding (i.e., the
forwarding behavior should not be altered and the tradeoff between forwarding behavior should not be altered and the trade-off
forwarding and telemtry should be well balanced). between forwarding performance and telemetry should be well-
balanced).
* Network operation applications require end-to-end visibility * Network operation applications require end-to-end visibility
across various sources, which can result in a huge volume of data. across various sources, which can result in a huge volume of data.
However, the sheer quantity of data must not exhaust the network However, the sheer quantity of data must not exhaust the network
bandwidth, regardless of the data delivery approach (i.e., whether bandwidth, regardless of the data delivery approach (i.e., whether
through in-band or out-of-band channels). through in-band or out-of-band channels).
* The data plane devices must provide timely data with the minimum * The data plane devices must provide timely data with the minimum
possible delay. Long processing, transport, storage, and analysis possible delay. Long processing, transport, storage, and analysis
delay can impact the effectiveness of the control loop and even delay can impact the effectiveness of the control loop and even
skipping to change at page 20, line 46 skipping to change at page 20, line 46
[I-D.ietf-ippm-ioam-data], Alternate-Marking (AM) [RFC8321], and [I-D.ietf-ippm-ioam-data], Alternate-Marking (AM) [RFC8321], and
Multipoint Alternate Marking [I-D.ietf-ippm-multipoint-alt-mark], Multipoint Alternate Marking [I-D.ietf-ippm-multipoint-alt-mark],
provide a well-balanced and more flexible approach. However, provide a well-balanced and more flexible approach. However,
these methods are also more complex to implement. these methods are also more complex to implement.
* In-Band and Out-of-Band: Telemetry data carried in user packets * In-Band and Out-of-Band: Telemetry data carried in user packets
before being exported to a data collector is considered in-band before being exported to a data collector is considered in-band
(e.g., in-situ OAM [I-D.ietf-ippm-ioam-data]). Telemetry data (e.g., in-situ OAM [I-D.ietf-ippm-ioam-data]). Telemetry data
that is directly exported to a data collector without modifying that is directly exported to a data collector without modifying
user packets is considered out-of-band (e.g., the postcard-based user packets is considered out-of-band (e.g., the postcard-based
approach described in Appendix). It is also possible to have approach described in Appendix A.3.5). It is also possible to
hybrid methods, where only the telemetry instruction or partial have hybrid methods, where only the telemetry instruction or
data is carried by user packets (e.g., AM [RFC8321]). partial data is carried by user packets (e.g., AM [RFC8321]).
* End-to-End and In-Network: End-to-End methods start from, and end * End-to-End and In-Network: End-to-End methods start from, and end
at, the network end hosts (e.g., Ping). In-Network methods work at, the network end hosts (e.g., Ping). In-Network methods work
in networks and are transparent to end hosts. However, if needed, in networks and are transparent to end hosts. However, if needed,
In-Network methods can be easily extended into end hosts. In-Network methods can be easily extended into end hosts.
* Data Subject: Depending on the telemetry objective, the methods * Data Subject: Depending on the telemetry objective, the methods
can be flow-based (e.g., in-situ OAM [I-D.ietf-ippm-ioam-data]), can be flow-based (e.g., in-situ OAM [I-D.ietf-ippm-ioam-data]),
path-based (e.g., Traceroute), and node-based (e.g., IPFIX path-based (e.g., Traceroute), and node-based (e.g., IPFIX
[RFC7011]). The various data objects can be packet, flow record, [RFC7011]). The various data objects can be packet, flow record,
skipping to change at page 21, line 32 skipping to change at page 21, line 32
[I-D.pedro-nmrg-anticipated-adaptation], provides a strategic and [I-D.pedro-nmrg-anticipated-adaptation], provides a strategic and
functional advantage to management operations. functional advantage to management operations.
As with other sources of telemetry information, the data and events As with other sources of telemetry information, the data and events
must meet strict requirements, especially in terms of timeliness, must meet strict requirements, especially in terms of timeliness,
which is essential to properly incorporate external event information which is essential to properly incorporate external event information
into network management applications. The specific challenges are into network management applications. The specific challenges are
described as follows: described as follows:
* The role of the external event detector can be played by multiple * The role of the external event detector can be played by multiple
elements, including hardware (e.g. physical sensors, such as elements, including hardware (e.g., physical sensors, such as
seismometers) and software (e.g. Big Data sources that analyze seismometers) and software (e.g., Big Data sources that analyze
streams of information, such as Twitter messages). Thus, the streams of information, such as Twitter messages). Thus, the
transmitted data must support different shapes but, at the same transmitted data must support different shapes but, at the same
time, follow a common but extensible schema. time, follow a common but extensible schema.
* Since the main function of the external event detectors is to * Since the main function of the external event detectors is to
perform the notifications, their timeliness is assumed. However, perform the notifications, their timeliness is assumed. However,
once messages have been dispatched, they must be quickly collected once messages have been dispatched, they must be quickly collected
and inserted into the control plane with variable priority, which and inserted into the control plane with variable priority, which
is higher for important sources and events and lower for secondary is higher for important sources and events and lower for secondary
ones. ones.
skipping to change at page 22, line 13 skipping to change at page 22, line 13
be easily mapped to current data models, such as in terms of YANG. be easily mapped to current data models, such as in terms of YANG.
Organizing both internal and external telemetry information together Organizing both internal and external telemetry information together
will be key for the general exploitation of the management will be key for the general exploitation of the management
possibilities of current and future network systems, as reflected in possibilities of current and future network systems, as reflected in
the incorporation of cognitive capabilities to new hardware and the incorporation of cognitive capabilities to new hardware and
software (virtual) elements. software (virtual) elements.
4.2. Second Level Function Components 4.2. Second Level Function Components
The telemetry module as each plane can be further partitioned into The telemetry module at each plane can be further partitioned into
five distinct conceptual components: five distinct conceptual components:
* Data Query, Analysis, and Storage: This component works at the * Data Query, Analysis, and Storage: This component works at the
application layer. It is normally a part of the network application layer. It is normally a part of the network
management system at the receiver side. On the one hand, it is management system at the receiver side. On the one hand, it is
responsible for issuing data requirements. The data of interest responsible for issuing data requirements. The data of interest
can be modeled data through configuration or custom data through can be modeled data through configuration or custom data through
programming. The data requirements can be queries for one-shot programming. The data requirements can be queries for one-shot
data or subscriptions for events or streaming data. On the other data or subscriptions for events or streaming data. On the other
hand, it receives, stores, and processes the returned data from hand, it receives, stores, and processes the returned data from
skipping to change at page 22, line 48 skipping to change at page 22, line 48
access control. The data encoding and the transport protocol may access control. The data encoding and the transport protocol may
vary due to the data export location. vary due to the data export location.
* Data Generation and Processing: The requested data needs to be * Data Generation and Processing: The requested data needs to be
captured, filtered, processed, and formatted in network devices captured, filtered, processed, and formatted in network devices
from raw data sources. This may involve in-network computing and from raw data sources. This may involve in-network computing and
processing on either the fast path or the slow path in network processing on either the fast path or the slow path in network
devices. devices.
* Data Object and Source: This component determines the monitoring * Data Object and Source: This component determines the monitoring
objects and original data sources provisioned in device. A data objects and original data sources provisioned in the device. A
source usually just provides raw data which needs further data source usually just provides raw data which needs further
processing. Each data source can be considered a probe. Some processing. Each data source can be considered a probe. Some
data sources can be dynamically installed, while others will be data sources can be dynamically installed, while others will be
more static. more static.
+----------------------------------------+ +----------------------------------------+
+----------------------------------------+ | +----------------------------------------+ |
| | | | | |
| Data Query, Analysis, & Storage | | | Data Query, Analysis, & Storage | |
| | + | | +
+-------+++ -----------------------------+ +-------+++ -----------------------------+
skipping to change at page 24, line 21 skipping to change at page 24, line 21
* Simple Data: The data that are steadily available from some * Simple Data: The data that are steadily available from some
datastore or static probes in network devices. datastore or static probes in network devices.
* Derived Data: The data that need to be synthesized or processed in * Derived Data: The data that need to be synthesized or processed in
the network from raw data from one or more network devices. The data the network from raw data from one or more network devices. The data
processing function can be statically or dynamically loaded into processing function can be statically or dynamically loaded into
network devices. network devices.
* Event-triggered Data: The data are conditionally acquired based on * Event-triggered Data: The data are conditionally acquired based on
the occurrence of some events. For example, a network interface the occurrence of some events. An example of event-triggered data
changing its operational state from up to down can be a trigger could be an interface changing operational state between up and
event. Such data can be actively pushed through subscription or down. Such data can be actively pushed through subscription or
passively polled through query. There are many ways to model passively polled through query. There are many ways to model
events, including using Finite State Machine (FSM) or Event events, including using Finite State Machine (FSM) or Event
Condition Action (ECA) [I-D.wwx-netmod-event-yang]. Condition Action (ECA) [I-D.wwx-netmod-event-yang].
* Streaming Data: The data are continuously generated. It can be * Streaming Data: The data are continuously generated. It can be
time series or the dump of databases. For example, an interface time series or the dump of databases. For example, an interface
packet counter is exported every second. The streaming data packet counter is exported every second. The streaming data
reflect realtime network states and metrics and require large reflect realtime network states and metrics and require large
bandwidth and processing power. The streaming data are always bandwidth and processing power. The streaming data are always
actively pushed to the subscribers. actively pushed to the subscribers.
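The difference between event-triggered data and streaming data can be sketched as follows. This is a minimal illustration, assuming a hypothetical in-memory model. The names EcaRule, on_event, and stream_counter, and the event name "oper-state-change", are invented for this sketch and do not correspond to any YANG model or protocol referenced by the draft.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterator, List

Event = Dict[str, str]


@dataclass
class EcaRule:
    """Hypothetical Event-Condition-Action rule (illustrative only)."""
    event: str                          # event name the rule listens for
    condition: Callable[[Event], bool]  # predicate evaluated on the event data
    action: Callable[[Event], None]     # what to do when the condition holds


def on_event(rules: List[EcaRule], name: str, data: Event) -> None:
    """Event-triggered data: act only when a matching event occurs."""
    for rule in rules:
        if rule.event == name and rule.condition(data):
            rule.action(data)


def stream_counter(samples: List[int]) -> Iterator[dict]:
    """Streaming data: every sample is pushed, regardless of events."""
    for seq, value in enumerate(samples):
        yield {"seq": seq, "packets": value}


# Push a notification only when an interface goes down.
pushed: List[Event] = []
rules = [EcaRule("oper-state-change", lambda e: e["new"] == "down", pushed.append)]

on_event(rules, "oper-state-change", {"if": "eth0", "old": "up", "new": "down"})
on_event(rules, "oper-state-change", {"if": "eth1", "old": "down", "new": "up"})
```

In this sketch only the first state change satisfies the condition and is pushed, whereas stream_counter emits one record per sample unconditionally.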
skipping to change at page 26, line 4 skipping to change at page 26, line 4
+-------------+-----------------+---------------+--------------+ +-------------+-----------------+---------------+--------------+
| data config.| gNMI, NETCONF, | gNMI, NETCONF,| NETCONF, | | data config.| gNMI, NETCONF, | gNMI, NETCONF,| NETCONF, |
| & subscribe | SNMP, YANG-Push | YANG-Push | YANG-Push | | & subscribe | SNMP, YANG-Push | YANG-Push | YANG-Push |
+-------------+-----------------+---------------+--------------+ +-------------+-----------------+---------------+--------------+
| data gen. & | MIB, | YANG | IOAM, PSAMP | | data gen. & | MIB, | YANG | IOAM, PSAMP |
| process | YANG | | PBT, AM, | | process | YANG | | PBT, AM, |
+-------------+-----------------+---------------+--------------+ +-------------+-----------------+---------------+--------------+
| data encode.| gRPC, HTTP, TCP | BMP, TCP | IPFIX, UDP | | data encode.| gRPC, HTTP, TCP | BMP, TCP | IPFIX, UDP |
| & export | | | | | & export | | | |
+-------------+-----------------+---------------+--------------+ +-------------+-----------------+---------------+--------------+
Figure 5: Existing Work Mapping II Figure 5: Existing Work Mapping
5. Evolution of Network Telemetry Applications 5. Evolution of Network Telemetry Applications
Network telemetry is an evolving technical area. As the network Network telemetry is an evolving technical area. As the network
moves towards the automated operation, network telemetry applications moves towards the automated operation, network telemetry applications
undergo several stages of evolution which add new layers of undergo several stages of evolution which add new layers of
requirements to the underlying network telemetry techniques. Each requirements to the underlying network telemetry techniques. Each
stage is built upon the techniques adopted by the previous stages stage is built upon the techniques adopted by the previous stages
plus some new requirements. plus some new requirements.
Stage 0 - Static Telemetry: The telemetry data source and type are Stage 0 - Static Telemetry: The telemetry data source and type are
determined at design time. The network operator can only determined at design time. The network operator can only
configure how to use it with limited flexibility. configure how to use it with limited flexibility.
Stage 1 - Dynamic Telemetry: The custom telemetry data can be Stage 1 - Dynamic Telemetry: The custom telemetry data can be
dynamically programmed or configured at runtime without dynamically programmed or configured at runtime without
interrupting the network operation, allowing a tradeoff among interrupting the network operation, allowing a trade-off among
resource, performance, flexibility, and coverage. resource, performance, flexibility, and coverage.
Stage 2 - Interactive Telemetry: The network operator can Stage 2 - Interactive Telemetry: The network operator can
continuously customize and fine tune the telemetry data in real continuously customize and fine tune the telemetry data in real
time to reflect the network operation's visibility requirements. time to reflect the network operation's visibility requirements.
Compared with Stage 1, the changes are frequent based on the real- Compared with Stage 1, the changes are frequent based on the real-
time feedback. At this stage, some tasks can be automated, but time feedback. At this stage, some tasks can be automated, but
human operators still need to sit in the middle to make decisions. human operators still need to sit in the middle to make decisions.
Stage 3 - Closed-loop Telemetry: The telemetry is free from the Stage 3 - Closed-loop Telemetry: The telemetry is free from the
skipping to change at page 26, line 49 skipping to change at page 26, line 49
future autonomic networks may need a comprehensive operation future autonomic networks may need a comprehensive operation
management system which works at stage 2 and stage 3 to cover all the management system which works at stage 2 and stage 3 to cover all the
network operation tasks. A well-defined network telemetry framework network operation tasks. A well-defined network telemetry framework
is the first step towards this direction. is the first step towards this direction.
6. Security Considerations 6. Security Considerations
The complexity of network telemetry raises significant security The complexity of network telemetry raises significant security
implications. For example, telemetry data can be manipulated to implications. For example, telemetry data can be manipulated to
exhaust various network resources at each plane as well as the data exhaust various network resources at each plane as well as the data
consumer; falsified or tampered data can mislead the decision making consumer; falsified or tampered data can mislead the decision-making
and paralyze networks; wrong configuration and programming for and paralyze networks; wrong configuration and programming for
telemetry is equally harmful. The telemetry data is highly telemetry is equally harmful. The telemetry data is highly
sensitive, which exposes a lot of information about the network and sensitive, which exposes a lot of information about the network and
its configuration. Some of that information can make designing its configuration. Some of that information can make designing
attacks against the network much easier (e.g., exact details of what attacks against the network much easier (e.g., exact details of what
software and patches have been installed), and allows an attacker to software and patches have been installed), and allows an attacker to
determine whether a device may be subject to unprotected security determine whether a device may be subject to unprotected security
vulnerability. vulnerabilities.
Given that this document has proposed a framework for network Given that this document has proposed a framework for network
telemetry and the telemetry mechanisms discussed are more extensive telemetry and the telemetry mechanisms discussed are more extensive
(in both message frequency and traffic amount) than the conventional (in both message frequency and traffic amount) than the conventional
network OAM concepts, we must also reflect that various new security network OAM concepts, we must also reflect that various new security
considerations may also arise. A number of techniques already exist considerations may also arise. A number of techniques already exist
for securing the forwarding plane, the control plane, and the for securing the forwarding plane, the control plane, and the
management plane in a network, but it is important to consider if any management plane in a network, but it is important to consider if any
new threat vectors are now being enabled via the use of network new threat vectors are now being enabled via the use of network
telemetry procedures and mechanisms. telemetry procedures and mechanisms.
skipping to change at page 28, line 5 skipping to change at page 28, line 5
identify malicious attacks using telemetry interfaces. identify malicious attacks using telemetry interfaces.
* Authentication and signing of telemetry data to make data more * Authentication and signing of telemetry data to make data more
trustworthy. trustworthy.
* Segregating the telemetry data traffic from the data traffic * Segregating the telemetry data traffic from the data traffic
carried over the network (e.g., historically management access and carried over the network (e.g., historically management access and
management data may be carried via an independent management management data may be carried via an independent management
network). network).
Some of the security considerations highlighted above may be Some security considerations highlighted above may be minimized or
minimized or negated with policy management of network telemetry. In negated with policy management of network telemetry. In a network
a network telemetry deployment it would be advantageous to separate telemetry deployment it would be advantageous to separate telemetry
telemetry capabilities into different classes of policies, i.e., Role capabilities into different classes of policies, i.e., Role Based
Based Access Control and Event-Condition-Action policies. Also, Access Control and Event-Condition-Action policies. Also, potential
potential conflicts between network telemetry mechanisms must be conflicts between network telemetry mechanisms must be detected
detected accurately and resolved quickly to avoid unnecessary network accurately and resolved quickly to avoid unnecessary network
telemetry traffic propagation escalating into an unintended or telemetry traffic propagation escalating into an unintended or
intended denial of service attack. intended denial of service attack.
Further study of the security issues will be required, and it is Further study of the security issues will be required, and it is
expected that the secuirty mechanisms and protocols are developed and expected that the security mechanisms and protocols are developed and
deployed along with a network telemetry system. deployed along with a network telemetry system.
In addition to security, privacy is also an important issue. Network In addition to security, privacy is also an important issue. Network
telemetry is meant to improve network operation, which can ultimately telemetry is meant to improve network operation, which can ultimately
benefit end users' quality of experience. The network operators must benefit end users' quality of experience. The network operators must
be held accountable and strive for a balance between managing the be held accountable and strive for a balance between managing the
network and maintaining the user privacy of that network. network and maintaining the user privacy of that network.
7. IANA Considerations 7. IANA Considerations
skipping to change at page 29, line 33 skipping to change at page 29, line 33
Evens, T., Bayraktar, S., Bhardwaj, M., and P. Lucente, Evens, T., Bayraktar, S., Bhardwaj, M., and P. Lucente,
"Support for Local RIB in BGP Monitoring Protocol (BMP)", "Support for Local RIB in BGP Monitoring Protocol (BMP)",
Work in Progress, Internet-Draft, draft-ietf-grow-bmp- Work in Progress, Internet-Draft, draft-ietf-grow-bmp-
local-rib-13, 31 August 2021, local-rib-13, 31 August 2021,
<https://www.ietf.org/archive/id/draft-ietf-grow-bmp- <https://www.ietf.org/archive/id/draft-ietf-grow-bmp-
local-rib-13.txt>. local-rib-13.txt>.
[I-D.ietf-ippm-ioam-data] [I-D.ietf-ippm-ioam-data]
Brockners, F., Bhandari, S., and T. Mizrahi, "Data Fields Brockners, F., Bhandari, S., and T. Mizrahi, "Data Fields
for In-situ OAM", Work in Progress, Internet-Draft, draft- for In-situ OAM", Work in Progress, Internet-Draft, draft-
ietf-ippm-ioam-data-14, 24 June 2021, ietf-ippm-ioam-data-15, 3 October 2021,
<https://www.ietf.org/archive/id/draft-ietf-ippm-ioam- <https://www.ietf.org/archive/id/draft-ietf-ippm-ioam-
data-14.txt>. data-15.txt>.
[I-D.ietf-ippm-multipoint-alt-mark] [I-D.ietf-ippm-multipoint-alt-mark]
Fioccola, G., Cociglio, M., Sapio, A., and R. Sisto, Fioccola, G., Cociglio, M., Sapio, A., and R. Sisto,
"Multipoint Alternate-Marking Method for Passive and "Multipoint Alternate-Marking Method for Passive and
Hybrid Performance Monitoring", Work in Progress, Hybrid Performance Monitoring", Work in Progress,
Internet-Draft, draft-ietf-ippm-multipoint-alt-mark-09, 23 Internet-Draft, draft-ietf-ippm-multipoint-alt-mark-09, 23
March 2020, <https://www.ietf.org/archive/id/draft-ietf- March 2020, <https://www.ietf.org/archive/id/draft-ietf-
ippm-multipoint-alt-mark-09.txt>. ippm-multipoint-alt-mark-09.txt>.
[I-D.ietf-netconf-distributed-notif] [I-D.ietf-netconf-distributed-notif]
skipping to change at page 34, line 26 skipping to change at page 34, line 26
channel [I-D.ietf-netconf-udp-notif] provides enhanced efficiency for channel [I-D.ietf-netconf-udp-notif] provides enhanced efficiency for
the NETCONF based telemetry. the NETCONF based telemetry.
A.1.2. gRPC Network Management Interface A.1.2. gRPC Network Management Interface
gRPC Network Management Interface (gNMI) gRPC Network Management Interface (gNMI)
[I-D.openconfig-rtgwg-gnmi-spec] is a network management protocol [I-D.openconfig-rtgwg-gnmi-spec] is a network management protocol
based on the gRPC [I-D.kumar-rtgwg-grpc-protocol] RPC (Remote based on the gRPC [I-D.kumar-rtgwg-grpc-protocol] RPC (Remote
Procedure Call) framework. With a single gRPC service definition, Procedure Call) framework. With a single gRPC service definition,
both configuration and telemetry can be covered. gRPC is an HTTP/2 both configuration and telemetry can be covered. gRPC is an HTTP/2
[RFC7540] based open source micro service communication framework. [RFC7540] based open-source micro-service communication framework.
It provides a number of capabilities which are well-suited for It provides a number of capabilities which are well-suited for
network telemetry, including: network telemetry, including:
* Full-duplex streaming transport model combined with a binary * Full-duplex streaming transport model combined with a binary
encoding mechanism provides good telemetry efficiency. encoding mechanism provides good telemetry efficiency.
* gRPC provides higher-level features consistency across platforms * gRPC provides higher-level features consistency across platforms
that common HTTP/2 libraries typically do not. This that common HTTP/2 libraries typically do not. This
characteristic is especially valuable for the fact that telemetry characteristic is especially valuable for the fact that telemetry
data collectors normally reside on a large variety of platforms. data collectors normally reside on a large variety of platforms.
skipping to change at page 35, line 5 skipping to change at page 35, line 5
BGP Monitoring Protocol (BMP) [RFC7854] is used to monitor BGP BGP Monitoring Protocol (BMP) [RFC7854] is used to monitor BGP
sessions and is intended to provide a convenient interface for sessions and is intended to provide a convenient interface for
obtaining route views. obtaining route views.
The BGP routing information is collected from the monitored device(s) The BGP routing information is collected from the monitored device(s)
to the BMP monitoring station by setting up the BMP TCP session. The to the BMP monitoring station by setting up the BMP TCP session. The
BGP peers are monitored by the BMP Peer Up and Peer Down BGP peers are monitored by the BMP Peer Up and Peer Down
Notifications. The BGP routes (including Adjacency_RIB_In [RFC7854], Notifications. The BGP routes (including Adjacency_RIB_In [RFC7854],
Adjacency_RIB_out [I-D.ietf-grow-bmp-adj-rib-out], and Local_Rib Adjacency_RIB_out [I-D.ietf-grow-bmp-adj-rib-out], and Local_Rib
[I-D.ietf-grow-bmp-local-rib] are encapsulated in the BMP Route [I-D.ietf-grow-bmp-local-rib]) are encapsulated in the BMP Route
Monitoring Message and the BMP Route Mirroring Message, providing Monitoring Message and the BMP Route Mirroring Message, providing
both an initial table dump and real-time route updates. In addition, both an initial table dump and real-time route updates. In addition,
BGP statistics are reported through the BMP Stats Report Message, BGP statistics are reported through the BMP Stats Report Message,
which could be either timer triggered or event-driven. Future BMP which could be either timer triggered or event-driven. Future BMP
extensions could further enrich BGP monitoring applications. extensions could further enrich BGP monitoring applications.
A.3. Data Plane Telemetry A.3. Data Plane Telemetry
A.3.1. The Alternate Marking (AM) technology A.3.1. The Alternate Marking (AM) technology
skipping to change at page 35, line 35 skipping to change at page 35, line 35
the packet loss calculation. The same idea can be applied to delay the packet loss calculation. The same idea can be applied to delay
measurement by selecting ad hoc packets with a marking bit dedicated measurement by selecting ad hoc packets with a marking bit dedicated
for delay measurements. for delay measurements.
The Alternate Marking method needs two counters per marking period for The Alternate Marking method needs two counters per marking period for
each monitored flow. For instance, by considering n measurement each monitored flow. For instance, by considering n measurement
points and m monitored flows, the order of magnitude of the packet points and m monitored flows, the order of magnitude of the packet
counters for each time interval is n*m*2 (1 per color). counters for each time interval is n*m*2 (1 per color).
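As a rough illustration of the counter arithmetic above, the following Python sketch computes the number of packet counters needed per time interval. The function name and the example values are hypothetical and are not taken from the draft.

```python
def am_counters_per_interval(measurement_points: int,
                             monitored_flows: int,
                             colors: int = 2) -> int:
    """Return n * m * colors, i.e. one counter per color, per flow,
    per measurement point, for a single time interval."""
    if measurement_points < 0 or monitored_flows < 0 or colors < 0:
        raise ValueError("arguments must be non-negative")
    return measurement_points * monitored_flows * colors


if __name__ == "__main__":
    # Example values (assumed): 10 measurement points, 1000 monitored flows.
    print(am_counters_per_interval(10, 1000))
```

The count grows linearly in both n and m, which is the scaling concern that motivates the discussion that follows.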
Since networks offer rich sets of network performance measurement Since networks offer rich sets of network performance measurement
data (e.g packet counters), traditional approaches run into data (e.g., packet counters), traditional approaches run into
limitations. The bottleneck is the generation and export of the data limitations. The bottleneck is the generation and export of the data
and the amount of data that can be reasonably collected from the and the amount of data that can be reasonably collected from the
network. In addition, management tasks related to determining and network. In addition, management tasks related to determining and
configuring which data to generate lead to significant deployment configuring which data to generate lead to significant deployment
challenges. challenges.
The Multipoint Alternate Marking approach, described in The Multipoint Alternate Marking approach, described in
[I-D.ietf-ippm-multipoint-alt-mark], aims to resolve this issue and [I-D.ietf-ippm-multipoint-alt-mark], aims to resolve this issue and
make the performance monitoring more flexible in case a detailed make the performance monitoring more flexible in case a detailed
analysis is not needed. analysis is not needed.
skipping to change at page 38, line 22 skipping to change at page 38, line 22
management and match it to the connectors and/or interfaces required management and match it to the connectors and/or interfaces required
to connect them. to connect them.
Categories of external event sources that may be of interest to Categories of external event sources that may be of interest to
network management include: network management include:
* Smart objects and sensors. With the consolidation of the Internet * Smart objects and sensors. With the consolidation of the Internet
of Things (IoT) any network system will have many smart objects of Things (IoT) any network system will have many smart objects
attached to its physical surroundings and logical operation attached to its physical surroundings and logical operation
environments. Most of these objects will be essentially based on environments. Most of these objects will be essentially based on
sensors of many kinds (e.g. temperature, humidity, presence) and sensors of many kinds (e.g., temperature, humidity, presence) and
the information they provide can be very useful for the management the information they provide can be very useful for the management
of the network, even when they are not specifically deployed for of the network, even when they are not specifically deployed for
such purpose. Elements of this source type will usually provide a such purpose. Elements of this source type will usually provide a
specific protocol for interaction, especially one of those specific protocol for interaction, especially one of those
protocols related to IoT, such as the Constrained Application protocols related to IoT, such as the Constrained Application
Protocol (CoAP). Protocol (CoAP).
* Online news reporters. Several online news services have the * Online news reporters. Several online news services have the
ability to provide enormous quantity of information about ability to provide enormous quantity of information about
different events occurring in the world. Some of those events can different events occurring in the world. Some of those events can
skipping to change at page 38, line 51 skipping to change at page 38, line 51
be part of both the ontology and information model of the be part of both the ontology and information model of the
telemetry framework. telemetry framework.
* Global event analyzers. The advance of Big Data analyzers * Global event analyzers. The advance of Big Data analyzers
provides a huge amount of information and, more interestingly, the provides a huge amount of information and, more interestingly, the
identification of events detected by analyzing many data streams identification of events detected by analyzing many data streams
from different origins. In contrast with the other types of from different origins. In contrast with the other types of
sources, which are focused on specific events, the detectors of sources, which are focused on specific events, the detectors of
this source type will detect generic events. For example, a this source type will detect generic events. For example, a
sports event takes place and some unexpected movement makes it sports event takes place and some unexpected movement makes it
highly interesting and many people connects to sites that are fascinating and many people connect to sites that are reporting on
reporting on the event. The underlying networks supporting the the event. The underlying networks supporting the services that
services that cover the event can be affected by such situation so cover the event can be affected by such situation so their
their management solutions should be aware of it. In contrast management solutions should be aware of it. In contrast with the
with the other source types, a new information model, format, and other source types, a new information model, format, and reporting
reporting protocol is required to integrate the detectors of this protocol is required to integrate the detectors of this type with
type with the management solution. the management solution.
Additional types of detector types can be added to the system but Additional types of detector types can be added to the system, but
they will be generally the result of composing the properties offered they will be generally the result of composing the properties offered
by these main classes. by these main classes.
A.4.2. Connectors and Interfaces A.4.2. Connectors and Interfaces
For allowing external event detectors to be properly integrated with For allowing external event detectors to be properly integrated with
other management solutions, both elements must expose interfaces and other management solutions, both elements must expose interfaces and
protocols that are subject to their particular objective. Since protocols that are subject to their particular objective. Since
external event detectors will be focused on providing their external event detectors will be focused on providing their
information to their main consumers, which generally will not be information to their main consumers, which generally will not be
 End of changes. 46 change blocks. 
90 lines changed or deleted 92 lines changed or added
