draft-ietf-ccamp-alarm-module-02.txt   draft-ietf-ccamp-alarm-module-03.txt 
Network Working Group S. Vallin Network Working Group S. Vallin
Internet-Draft Stefan Vallin AB Internet-Draft Stefan Vallin AB
Intended status: Standards Track M. Bjorklund Intended status: Standards Track M. Bjorklund
Expires: February 9, 2019 Cisco Expires: March 24, 2019 Cisco
August 8, 2018 September 20, 2018
YANG Alarm Module YANG Alarm Module
draft-ietf-ccamp-alarm-module-02 draft-ietf-ccamp-alarm-module-03
Abstract Abstract
This document defines a YANG module for alarm management. It This document defines a YANG module for alarm management. It
includes functions for alarm list management, alarm shelving and includes functions for alarm list management, alarm shelving and
notifications to inform management systems. There are also RPCs to notifications to inform management systems. There are also RPCs to
manage the operator state of an alarm and administrative alarm manage the operator state of an alarm and administrative alarm
procedures. The module carefully maps to relevant alarm standards. procedures. The module carefully maps to relevant alarm standards.
Status of This Memo Status of This Memo
skipping to change at page 1, line 35 skipping to change at page 1, line 35
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on February 9, 2019. This Internet-Draft will expire on March 24, 2019.
Copyright Notice Copyright Notice
Copyright (c) 2018 IETF Trust and the persons identified as the Copyright (c) 2018 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 2, line 18 skipping to change at page 2, line 18
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1. Terminology and Notation . . . . . . . . . . . . . . . . 3 1.1. Terminology and Notation . . . . . . . . . . . . . . . . 3
2. Objectives . . . . . . . . . . . . . . . . . . . . . . . . . 4 2. Objectives . . . . . . . . . . . . . . . . . . . . . . . . . 4
3. Alarm Module Concepts . . . . . . . . . . . . . . . . . . . . 5 3. Alarm Module Concepts . . . . . . . . . . . . . . . . . . . . 5
3.1. Alarm Definition . . . . . . . . . . . . . . . . . . . . 5 3.1. Alarm Definition . . . . . . . . . . . . . . . . . . . . 5
3.2. Alarm Type . . . . . . . . . . . . . . . . . . . . . . . 5 3.2. Alarm Type . . . . . . . . . . . . . . . . . . . . . . . 5
3.3. Identifying the Alarming Resource . . . . . . . . . . . . 7 3.3. Identifying the Alarming Resource . . . . . . . . . . . . 7
3.4. Identifying Alarm Instances . . . . . . . . . . . . . . . 8 3.4. Identifying Alarm Instances . . . . . . . . . . . . . . . 8
3.5. Alarm Life-Cycle . . . . . . . . . . . . . . . . . . . . 8 3.5. Alarm Life-Cycle . . . . . . . . . . . . . . . . . . . . 8
3.5.1. Resource Alarm Life-Cycle . . . . . . . . . . . . . . 8 3.5.1. Resource Alarm Life-Cycle . . . . . . . . . . . . . . 9
3.5.2. Operator Alarm Life-cycle . . . . . . . . . . . . . . 9 3.5.2. Operator Alarm Life-cycle . . . . . . . . . . . . . . 10
3.5.3. Administrative Alarm Life-Cycle . . . . . . . . . . . 10 3.5.3. Administrative Alarm Life-Cycle . . . . . . . . . . . 10
3.6. Root Cause, Impacted Resources and Related Alarms . . . . 10 3.6. Root Cause, Impacted Resources and Related Alarms . . . . 10
3.7. Alarm Shelving . . . . . . . . . . . . . . . . . . . . . 11 3.7. Alarm Shelving . . . . . . . . . . . . . . . . . . . . . 11
3.8. Alarm Profiles . . . . . . . . . . . . . . . . . . . . . 11 3.8. Alarm Profiles . . . . . . . . . . . . . . . . . . . . . 11
4. Alarm Data Model . . . . . . . . . . . . . . . . . . . . . . 11 4. Alarm Data Model . . . . . . . . . . . . . . . . . . . . . . 12
4.1. Alarm Control . . . . . . . . . . . . . . . . . . . . . . 12 4.1. Alarm Control . . . . . . . . . . . . . . . . . . . . . . 13
4.1.1. Alarm Shelving . . . . . . . . . . . . . . . . . . . 12 4.1.1. Alarm Shelving . . . . . . . . . . . . . . . . . . . 13
4.2. Alarm Inventory . . . . . . . . . . . . . . . . . . . . . 12 4.2. Alarm Inventory . . . . . . . . . . . . . . . . . . . . . 13
4.3. Alarm Summary . . . . . . . . . . . . . . . . . . . . . . 13 4.3. Alarm Summary . . . . . . . . . . . . . . . . . . . . . . 14
4.4. The Alarm List . . . . . . . . . . . . . . . . . . . . . 13 4.4. The Alarm List . . . . . . . . . . . . . . . . . . . . . 15
4.5. The Shelved Alarms List . . . . . . . . . . . . . . . . . 13 4.5. The Shelved Alarms List . . . . . . . . . . . . . . . . . 17
4.6. Alarm Profiles . . . . . . . . . . . . . . . . . . . . . 14 4.6. Alarm Profiles . . . . . . . . . . . . . . . . . . . . . 17
4.7. RPCs and Actions . . . . . . . . . . . . . . . . . . . . 14 4.7. RPCs and Actions . . . . . . . . . . . . . . . . . . . . 17
4.8. Notifications . . . . . . . . . . . . . . . . . . . . . . 14 4.8. Notifications . . . . . . . . . . . . . . . . . . . . . . 17
5. Alarm YANG Module . . . . . . . . . . . . . . . . . . . . . . 14 5. Alarm YANG Module . . . . . . . . . . . . . . . . . . . . . . 18
6. X.733 Extensions . . . . . . . . . . . . . . . . . . . . . . 44 6. X.733 Extensions . . . . . . . . . . . . . . . . . . . . . . 47
7. The X.733 Mapping Module . . . . . . . . . . . . . . . . . . 44 7. The X.733 Mapping Module . . . . . . . . . . . . . . . . . . 48
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 55 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 59
9. Security Considerations . . . . . . . . . . . . . . . . . . . 56 9. Security Considerations . . . . . . . . . . . . . . . . . . . 59
10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 57 10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 60
11. References . . . . . . . . . . . . . . . . . . . . . . . . . 57 11. References . . . . . . . . . . . . . . . . . . . . . . . . . 60
11.1. Normative References . . . . . . . . . . . . . . . . . . 57 11.1. Normative References . . . . . . . . . . . . . . . . . . 60
11.2. Informative References . . . . . . . . . . . . . . . . . 58 11.2. Informative References . . . . . . . . . . . . . . . . . 61
Appendix A. Vendor-specific Alarm-Types Example . . . . . . . . 59 Appendix A. Vendor-specific Alarm-Types Example . . . . . . . . 62
Appendix B. Alarm Inventory Example . . . . . . . . . . . . . . 60 Appendix B. Alarm Inventory Example . . . . . . . . . . . . . . 63
Appendix C. Alarm List Example . . . . . . . . . . . . . . . . . 61 Appendix C. Alarm List Example . . . . . . . . . . . . . . . . . 64
Appendix D. Alarm Shelving Example . . . . . . . . . . . . . . . 62 Appendix D. Alarm Shelving Example . . . . . . . . . . . . . . . 65
Appendix E. X.733 Mapping Example . . . . . . . . . . . . . . . 63 Appendix E. X.733 Mapping Example . . . . . . . . . . . . . . . 66
Appendix F. Background and Usability Requirements . . . . . . . 64 Appendix F. Background and Usability Requirements . . . . . . . 67
F.1. Alarm Concepts . . . . . . . . . . . . . . . . . . . . . 64 F.1. Alarm Concepts . . . . . . . . . . . . . . . . . . . . . 67
F.1.1. Alarm type . . . . . . . . . . . . . . . . . . . . . 64 F.1.1. Alarm type . . . . . . . . . . . . . . . . . . . . . 67
F.2. Usability Requirements . . . . . . . . . . . . . . . . . 65 F.2. Relationships to other alarm standards . . . . . . . . . 68
F.2.1. Alarm definition . . . . . . . . . . . . . . . . . . 68
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 68 F.2.2. Data model . . . . . . . . . . . . . . . . . . . . . 70
F.3. Usability Requirements . . . . . . . . . . . . . . . . . 72
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 75
1. Introduction 1. Introduction
This document defines a YANG [RFC7950] module for alarm management. This document defines a YANG [RFC7950] module for alarm management.
The purpose is to define a standardised alarm interface for network The purpose is to define a standardized alarm interface for network
devices that can be easily integrated into management applications. devices that can be easily integrated into management applications.
The model is also applicable as a northbound alarm interface in the The model is also applicable as a northbound alarm interface in the
management applications. management applications.
Alarm monitoring is a fundamental part of monitoring the network. Alarm monitoring is a fundamental part of monitoring the network.
Raw alarms from devices do not always tell the status of the network Raw alarms from devices do not always tell the status of the network
services or necessarily point to the root cause. However, being able services or necessarily point to the root cause. However, being able
to feed alarms to the alarm management application in a standardised to feed alarms to the alarm management application in a standardized
format is a starting point for performing higher level network format is a starting point for performing higher level network
assurance tasks. assurance tasks.
The design of the module is based on experience from using and The design of the module is based on experience from using and
implementing available alarm standards from ITU [X.733], 3GPP implementing available alarm standards from ITU [X.733], 3GPP
[ALARMIRP] and ANSI [ISA182]. [ALARMIRP] and ANSI [ISA182].
1.1. Terminology and Notation 1.1. Terminology and Notation
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
skipping to change at page 4, line 20 skipping to change at page 4, line 20
for example: an interface, a process. for example: an interface, a process.
o Alarm Instance: The alarm state for a specific resource and alarm o Alarm Instance: The alarm state for a specific resource and alarm
type. For example (GigabitEthernet0/15, link-alarm). An entry in type. For example (GigabitEthernet0/15, link-alarm). An entry in
the alarm list. the alarm list.
o Alarm Inventory: A list of all possible alarm types on a system. o Alarm Inventory: A list of all possible alarm types on a system.
o Alarm Shelving: Blocking alarms according to specific criteria. o Alarm Shelving: Blocking alarms according to specific criteria.
o Corrective Action: An action taken by an operator or automation
routine in order to minimize the impact of the alarm or resolving
the root cause.
o Management System: The alarm management application that consumes o Management System: The alarm management application that consumes
the alarms, i.e., acts as a client. the alarms, i.e., acts as a client.
o System: The system that implements this YANG alarm module, i.e., o System: The system that implements this YANG alarm module, i.e.,
acts as a server. This corresponds to a network device or a acts as a server. This corresponds to a network device or a
management application that provides a north-bound alarm management application that provides a north-bound alarm
interface. interface.
Tree diagrams used in this document follow the notation defined in Tree diagrams used in this document follow the notation defined in
[RFC8340]. [RFC8340].
skipping to change at page 5, line 37 skipping to change at page 5, line 42
There are two main things to remember from this definition: There are two main things to remember from this definition:
1. the definition focuses on leaving out events and logging 1. the definition focuses on leaving out events and logging
information in general. Alarms should only be used for undesired information in general. Alarms should only be used for undesired
states that require action. states that require action.
2. the definition also focuses on alarms as a state on a resource, not 2. the definition also focuses on alarms as a state on a resource, not
the notifications that report the state changes. the notifications that report the state changes.
See Appendix F for more motivation and consequences around this See Appendix F for more motivation and consequences around this
definition. definition as well as how it relates to other alarm standards.
3.2. Alarm Type 3.2. Alarm Type
This document defines an alarm type with an alarm type id and an This document defines an alarm type with an alarm type id and an
alarm type qualifier. alarm type qualifier.
The alarm type id is modeled as a YANG identity. With YANG The alarm type id is modeled as a YANG identity. With YANG
identities, new alarm types can be defined in a distributed fashion. identities, new alarm types can be defined in a distributed fashion.
YANG identities are hierarchical, which means that a hierarchy of YANG identities are hierarchical, which means that a hierarchy of
alarm types can be defined. alarm types can be defined.
skipping to change at page 6, line 14 skipping to change at page 6, line 17
The use of YANG identities means that all possible alarms are The use of YANG identities means that all possible alarms are
identified at design time. This explicit declaration of alarm types identified at design time. This explicit declaration of alarm types
makes it easier to allow for alarm qualification reviews and makes it easier to allow for alarm qualification reviews and
preparation of alarm actions and documentation. preparation of alarm actions and documentation.
There are occasions where the alarm types are not known at design There are occasions where the alarm types are not known at design
time. For example, a system with digital inputs that allows users to time. For example, a system with digital inputs that allows users to
connect detectors (e.g., smoke detector) to the inputs. In this connect detectors (e.g., smoke detector) to the inputs. In this
case it is a configuration action that says that certain connectors case it is a configuration action that says that certain connectors
are fire alarms for example. A potential drawback of this is that are fire alarms for example.
there is a big risk that alarm operators will receive alarm types as
a surprise, they do not know how to resolve the problem since a
defined alarm procedure does not necessarily exist. To avoid this
risk the system MUST publish all possible alarm types in the alarm
inventory, see Section 4.2.
In order to allow for dynamic addition of alarm types the alarm In order to allow for dynamic addition of alarm types the alarm
module also allows for further qualification of the identity based module allows for further qualification of the identity based alarm
alarm type using a string. type using a string. A potential drawback of this is that there is a
big risk that alarm operators will receive alarm types as a surprise,
they do not know how to resolve the problem since a defined alarm
procedure does not necessarily exist. To avoid this risk the system
MUST publish all possible alarm types in the alarm inventory, see
Section 4.2.
A vendor or standard can then define their own alarm-type hierarchy. A vendor or standard organization can define their own alarm-type
The example below shows a hierarchy based on X.733 event types: hierarchy. The example below shows a hierarchy based on X.733 event
types:
import ietf-alarms { import ietf-alarms {
prefix al; prefix al;
} }
identity vendor-alarms { identity vendor-alarms {
base al:alarm-type; base al:alarm-type;
} }
identity communications-alarm { identity communications-alarm {
base vendor-alarms; base vendor-alarms;
} }
skipping to change at page 7, line 51 skipping to change at page 8, line 5
A server SHOULD strive to minimize the number of dynamically defined A server SHOULD strive to minimize the number of dynamically defined
alarm types. alarm types.
3.3. Identifying the Alarming Resource 3.3. Identifying the Alarming Resource
It is of vital importance to be able to refer to the alarming It is of vital importance to be able to refer to the alarming
resource. This reference must be as fine-grained as possible. If resource. This reference must be as fine-grained as possible. If
the alarming resource exists in the data tree then an instance- the alarming resource exists in the data tree then an instance-
identifier MUST be used with the full path to the object. identifier MUST be used with the full path to the object.
When the module is used in a controller/orchestrator/manager the
original device resource identification can be modified to include
the device in the path. The details depend on how devices are
identified, and are out of scope for this specification.
Example:
The original device alarm might identify the resource as
"/dev:interfaces/dev:interface[dev:name='FastEthernet1/0']".
The resource identification in the manager could look something
like: "/mgr:devices/mgr:device[mgr:name='xyz123']/dev:interfaces/
dev:interface[dev:name='FastEthernet1/0']"
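The path rewriting in the example above can be sketched as follows. This is only an illustration: the function name `to_manager_path` and the default prefix are hypothetical, the `mgr:` names are taken from the example rather than from any defined module, and the sketch assumes device names contain no single quotes.

```python
def to_manager_path(device_name: str, device_path: str,
                    prefix: str = "/mgr:devices/mgr:device") -> str:
    """Prefix a device-local instance-identifier with the manager's device entry.

    The prefix and the 'mgr:name' key are assumptions copied from the example
    in the text; a real manager defines its own device list and key.
    """
    if not device_path.startswith("/"):
        raise ValueError("expected an absolute instance-identifier")
    # Insert the device key predicate, then append the original path unchanged.
    return f"{prefix}[mgr:name='{device_name}']{device_path}"


device_path = "/dev:interfaces/dev:interface[dev:name='FastEthernet1/0']"
manager_path = to_manager_path("xyz123", device_path)
```

Because the original path is kept as a suffix, a manager written this way can recover the device-local resource by stripping the prefix again.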
This module also allows for alternate naming of the alarming resource This module also allows for alternate naming of the alarming resource
if it is not available in the data tree. if it is not available in the data tree.
3.4. Identifying Alarm Instances 3.4. Identifying Alarm Instances
A primary goal of this alarm module is to remove any ambiguity in how A primary goal of this alarm module is to remove any ambiguity in how
alarm notifications are mapped to an update of an alarm instance. alarm notifications are mapped to an update of an alarm instance.
X.733 and especially 3GPP were not really clear on this point. This X.733 and especially 3GPP were not really clear on this point. This
YANG alarm module states that the tuple (resource, alarm type YANG alarm module states that the tuple (resource, alarm type
identifier, alarm type qualifier) corresponds to a single alarm identifier, alarm type qualifier) corresponds to a single alarm
skipping to change at page 12, line 5 skipping to change at page 12, line 16
The fundamental parts of the data model are the "alarm-list" with The fundamental parts of the data model are the "alarm-list" with
associated notifications and the "alarm-inventory" list of all associated notifications and the "alarm-inventory" list of all
possible alarm types. These MUST be implemented by a system. The possible alarm types. These MUST be implemented by a system. The
rest of the data model is made conditional on the YANG features rest of the data model is made conditional on the YANG features
"operator-actions", "alarm-shelving", "alarm-history", "alarm- "operator-actions", "alarm-shelving", "alarm-history", "alarm-
summary", "alarm-profile", and "severity-assignment". summary", "alarm-profile", and "severity-assignment".
The data model has the following overall structure: The data model has the following overall structure:
+--rw control
| +--rw max-alarm-status-changes? union
| +--rw (notify-status-changes)?
| | ...
| +--rw alarm-shelving {alarm-shelving}?
| ...
+--ro alarm-inventory
| +--ro alarm-type* [alarm-type-id alarm-type-qualifier]
| ...
+--ro summary {alarm-summary}?
| +--ro alarm-summary* [severity]
| | ...
| +--ro shelves-active? empty {alarm-shelving}?
+--ro alarm-list
| +--ro number-of-alarms? yang:gauge32
| +--ro last-changed? yang:date-and-time
| +--ro alarm* [resource alarm-type-id alarm-type-qualifier]
| ...
+--ro shelved-alarms {alarm-shelving}?
| +--ro number-of-shelved-alarms? yang:gauge32
| +--ro alarm-shelf-last-changed? yang:date-and-time
| +--ro shelved-alarm*
| [resource alarm-type-id alarm-type-qualifier]
| ...
+--rw alarm-profile*
[alarm-type-id alarm-type-qualifier-match resource]
{alarm-profile}?
+--rw alarm-type-id al:alarm-type-id
+--rw alarm-type-qualifier-match string
+--rw resource al:resource-match
+--rw description string
+--rw alarm-severity-assignment-profile
{severity-assignment}?
...
4.1. Alarm Control 4.1. Alarm Control
The "/alarms/control/notify-status-changes" choice controls if The "/alarms/control/notify-status-changes" choice controls if
notifications are sent for all state changes, only raise and clear, notifications are sent for all state changes, only raise and clear,
or only notifications more severe than a configured level. This or only notifications more severe than a configured level. This
feature in combination with alarm shelving corresponds to the ITU feature in combination with alarm shelving corresponds to the ITU
Alarm Report Control functionality. Alarm Report Control functionality.
Every alarm has a list of status changes; this is a circular list. Every alarm has a list of status changes; this is a circular list.
The length of this list is controlled by "/alarms/control/max-alarm- The length of this list is controlled by "/alarms/control/max-alarm-
status-changes". status-changes".
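A bounded, circular status-change list can be sketched with a fixed-length queue. This is a minimal illustration, not the module's instrumentation: it assumes that "circular" means the oldest entry is dropped once the configured length is exceeded, and it uses `None` as a hypothetical stand-in for an unbounded setting.

```python
from collections import deque


def make_status_change_list(max_changes=None):
    """Return a list-like buffer holding at most max_changes entries.

    max_changes mirrors "max-alarm-status-changes"; None (an assumption of
    this sketch) means the list is allowed to grow without bound.
    """
    return deque(maxlen=max_changes)


history = make_status_change_list(3)
for severity in ["minor", "major", "critical", "cleared"]:
    history.append(severity)  # the oldest entry is discarded when full

retained = list(history)  # ['major', 'critical', 'cleared']
```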
4.1.1. Alarm Shelving 4.1.1. Alarm Shelving
The shelving control tree is shown below: The shelving control tree is shown below:
+--rw control
+--rw alarm-shelving {alarm-shelving}?
+--rw shelf* [name]
+--rw name string
+--rw resource* resource-match
+--rw alarm-type-id? alarm-type-id
+--rw alarm-type-qualifier-match? string
+--rw description? string
Shelved alarms are shown in a dedicated shelved alarm list. The Shelved alarms are shown in a dedicated shelved alarm list. The
instrumentation MUST move shelved alarms from the alarm list instrumentation MUST move shelved alarms from the alarm list
(/alarms/alarm-list) to the shelved alarm list (/alarms/shelved- (/alarms/alarm-list) to the shelved alarm list (/alarms/shelved-
alarms/). Shelved alarms do not generate any notifications. When alarms/). Shelved alarms do not generate any notifications. When
the shelving criteria are removed or changed the alarm list MUST be the shelving criteria are removed or changed the alarm list MUST be
updated to the correct actual state of the alarms. updated to the correct actual state of the alarms.
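The movement of alarms between the two lists can be sketched as a partition that is recomputed whenever the shelf configuration changes. The matching rules below are simplifying assumptions, not the module's definitions: resources are compared by exact string, the alarm type id by exact name (no identity derivation), the qualifier by a Python regular expression, and an absent criterion matches everything. The names `Shelf` and `apply_shelves` are hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# (resource, alarm-type-id, alarm-type-qualifier)
AlarmKey = Tuple[str, str, str]


@dataclass
class Shelf:
    name: str
    resources: List[str] = field(default_factory=list)  # empty: any resource
    alarm_type_id: Optional[str] = None                  # None: any alarm type
    qualifier_match: Optional[str] = None                # None: any qualifier

    def matches(self, key: AlarmKey) -> bool:
        resource, type_id, qualifier = key
        if self.resources and resource not in self.resources:
            return False
        if self.alarm_type_id is not None and type_id != self.alarm_type_id:
            return False
        if (self.qualifier_match is not None
                and not re.fullmatch(self.qualifier_match, qualifier)):
            return False
        return True


def apply_shelves(alarms: Dict[AlarmKey, str],
                  shelves: List[Shelf]) -> Tuple[Dict[AlarmKey, str], Dict[AlarmKey, str]]:
    """Split all known alarms into (alarm list, shelved alarm list)."""
    alarm_list: Dict[AlarmKey, str] = {}
    shelved: Dict[AlarmKey, str] = {}
    for key, state in alarms.items():
        target = shelved if any(s.matches(key) for s in shelves) else alarm_list
        target[key] = state
    return alarm_list, shelved


alarms = {
    ("eth0", "link-alarm", ""): "major",
    ("eth1", "link-alarm", ""): "minor",
}
active, shelved = apply_shelves(alarms, [Shelf(name="maintenance", resources=["eth1"])])
```

Recomputing the partition from the full set of alarms, rather than moving entries incrementally, is one simple way to ensure that removing or changing a shelf returns the affected alarms to the alarm list with their current state.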
Shelving and unshelving can only be performed by editing the shelf Shelving and unshelving can only be performed by editing the shelf
configuration. It cannot be performed on individual alarms. The configuration. It cannot be performed on individual alarms. The
server will add an operator state indicating that the alarm was server will add an operator state indicating that the alarm was
skipping to change at page 13, line 16 skipping to change at page 14, line 23
the alarm type qualifier MUST populate this list. the alarm type qualifier MUST populate this list.
The optional leaf-list "resource" in the alarm inventory enables the The optional leaf-list "resource" in the alarm inventory enables the
system to publish for which resources a given alarm type may appear. system to publish for which resources a given alarm type may appear.
A server MUST implement the alarm inventory in order to enable A server MUST implement the alarm inventory in order to enable
controlled alarm procedures in the client. controlled alarm procedures in the client.
The alarm inventory tree is shown below: The alarm inventory tree is shown below:
+--ro alarm-inventory
+--ro alarm-type* [alarm-type-id alarm-type-qualifier]
+--ro alarm-type-id alarm-type-id
+--ro alarm-type-qualifier alarm-type-qualifier
+--ro resource* resource-match
+--ro has-clear boolean
+--ro severity-levels* severity
+--ro description string
4.3. Alarm Summary 4.3. Alarm Summary
The alarm summary list summarises alarms per severity; how many The alarm summary list summarizes alarms per severity; how many
cleared, cleared and closed, and closed. It also gives an indication cleared, cleared and closed, and closed. It also gives an indication
if there are shelved alarms. if there are shelved alarms.
The alarm summary tree is shown below: The alarm summary tree is shown below:
+--ro summary {alarm-summary}?
+--ro alarm-summary* [severity]
| +--ro severity severity
| +--ro total? yang:gauge32
| +--ro cleared? yang:gauge32
| +--ro cleared-not-closed? yang:gauge32
| | {operator-actions}?
| +--ro cleared-closed? yang:gauge32
| | {operator-actions}?
| +--ro not-cleared-closed? yang:gauge32
| | {operator-actions}?
| +--ro not-cleared-not-closed? yang:gauge32
| {operator-actions}?
+--ro shelves-active? empty {alarm-shelving}?
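The per-severity counters in the tree above can be sketched as a simple aggregation. This illustration covers only "total" and "cleared"; the counters that depend on the "operator-actions" feature are left out, and the function name `summarize` and its input shape are assumptions of this sketch.

```python
from collections import Counter


def summarize(alarms):
    """Count alarms per severity.

    alarms: iterable of (perceived_severity, is_cleared) pairs, a simplified
    stand-in for the entries of the alarm list.
    """
    total, cleared = Counter(), Counter()
    for severity, is_cleared in alarms:
        total[severity] += 1
        if is_cleared:
            cleared[severity] += 1
    # One summary entry per severity that has at least one alarm.
    return {sev: {"total": total[sev], "cleared": cleared[sev]} for sev in total}


summary = summarize([("major", False), ("major", True), ("minor", False)])
```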
4.4. The Alarm List 4.4. The Alarm List
The alarm list (/alarms/alarm-list) is a function from (resource, The alarm list (/alarms/alarm-list) is a function from (resource,
alarm type, alarm type qualifier) to the current alarm state. alarm type, alarm type qualifier) to the current composite alarm
state. The composite state includes states for the resource life-
cycle such as severity, clearance flag and operator states such as
acknowledgment.
+--ro alarm-list
+--ro number-of-alarms? yang:gauge32
+--ro last-changed? yang:date-and-time
+--ro alarm* [resource alarm-type-id alarm-type-qualifier]
+--ro resource resource
+--ro alarm-type-id alarm-type-id
+--ro alarm-type-qualifier alarm-type-qualifier
+--ro alt-resource* resource
+--ro related-alarm*
| [resource alarm-type-id alarm-type-qualifier]
| +--ro resource
| | -> /alarms/alarm-list/alarm/resource
| +--ro alarm-type-id leafref
| +--ro alarm-type-qualifier leafref
+--ro impacted-resource* resource
+--ro root-cause-resource* resource
+--ro time-created yang:date-and-time
+--ro is-cleared boolean
+--ro last-changed yang:date-and-time
+--ro perceived-severity severity
+--ro alarm-text alarm-text
+--ro status-change* [time] {alarm-history}?
| +--ro time yang:date-and-time
| +--ro perceived-severity severity-with-clear
| +--ro alarm-text alarm-text
+--ro operator-state-change* [time] {operator-actions}?
| +--ro time yang:date-and-time
| +--ro operator string
| +--ro state operator-state
| +--ro text? string
+---x set-operator-state {operator-actions}?
| +---w input
| +---w state writable-operator-state
| +---w text? string
+---n operator-action {operator-actions}?
+-- time yang:date-and-time
+-- operator string
+-- state operator-state
+-- text? string
Every alarm has three important states, the resource clearance state Every alarm has three important states, the resource clearance state
"is-cleared", the severity "perceived-severity" and the operator "is-cleared", the severity "perceived-severity" and the operator
state available in the operator state change list. state available in the operator state change list.
In order to see the alarm history the resource state changes are In order to see the alarm history the resource state changes are
available in the "status-change" list and the operator history is available in the "status-change" list and the operator history is
available in the "operator-state-change" list. available in the "operator-state-change" list.
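Treating the alarm list as a function from (resource, alarm type, alarm type qualifier) to state can be sketched with a dictionary keyed by that tuple. The sketch below is an illustration under stated assumptions: timestamps and operator states are omitted, the string "cleared" is used as a hypothetical marker for a clear, and rejecting a clear for an unknown alarm is a simplification of this example rather than a rule from the module.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# (resource, alarm-type-id, alarm-type-qualifier)
AlarmKey = Tuple[str, str, str]


@dataclass
class AlarmState:
    is_cleared: bool
    perceived_severity: str
    alarm_text: str
    status_changes: List[Tuple[str, str]] = field(default_factory=list)


def report(alarm_list: Dict[AlarmKey, AlarmState], key: AlarmKey,
           severity: str, text: str) -> AlarmState:
    """Apply one state change; the same key always updates the same entry."""
    cleared = severity == "cleared"
    state = alarm_list.get(key)
    if state is None:
        if cleared:
            raise ValueError("clear reported for an alarm that was never raised")
        state = AlarmState(False, severity, text)
        alarm_list[key] = state
    else:
        state.is_cleared = cleared
        if not cleared:
            state.perceived_severity = severity  # keep last severity on clear
        state.alarm_text = text
    state.status_changes.append((severity, text))
    return state


alarm_list: Dict[AlarmKey, AlarmState] = {}
key = ("GigabitEthernet0/15", "link-alarm", "")
report(alarm_list, key, "major", "link down")
report(alarm_list, key, "cleared", "link up")
```

After both calls the list still holds a single entry for the key, with the clearance flag set and two entries in its status-change history.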
4.5. The Shelved Alarms List 4.5. The Shelved Alarms List
skipping to change at page 14, line 14 skipping to change at page 17, line 20
4.6. Alarm Profiles 4.6. Alarm Profiles
Alarm profiles (/alarms/alarm-profile/) is a list of configurable Alarm profiles (/alarms/alarm-profile/) is a list of configurable
alarm types. The list supports configurable alarm severity levels in alarm types. The list supports configurable alarm severity levels in
the container "alarm-severity-assignment-profile". If an alarm the container "alarm-severity-assignment-profile". If an alarm
matches the configured alarm type it MUST use the configured severity matches the configured alarm type it MUST use the configured severity
level(s) instead of the system default. This configuration MUST also level(s) instead of the system default. This configuration MUST also
be represented in the alarm inventory. be represented in the alarm inventory.
+--rw alarm-profile*
[alarm-type-id alarm-type-qualifier-match resource]
{alarm-profile}?
+--rw alarm-type-id al:alarm-type-id
+--rw alarm-type-qualifier-match string
+--rw resource al:resource-match
+--rw description string
+--rw alarm-severity-assignment-profile
{severity-assignment}?
+--rw severity-levels* al:severity
4.7. RPCs and Actions 4.7. RPCs and Actions
The alarm module supports RPCs and actions to manage the alarms: The alarm module supports RPCs and actions to manage the alarms:
"purge-alarms" (rpc): delete alarms according to specific "purge-alarms" (rpc): delete alarms according to specific
criteria, for example all cleared alarms older than a specific criteria, for example all cleared alarms older than a specific
date. date.
"compress-alarms" (rpc): compress the status-change list for the "compress-alarms" (rpc): compress the status-change list for the
alarms. alarms.
skipping to change at page 14, line 45 skipping to change at page 18, line 16
operator state on an alarm, like acknowledge. operator state on an alarm, like acknowledge.
If the alarm inventory is changed, for example a new card type is If the alarm inventory is changed, for example a new card type is
inserted, a notification will tell the management application that inserted, a notification will tell the management application that
new alarm types are available. new alarm types are available.
5. Alarm YANG Module 5. Alarm YANG Module
This YANG module references [RFC6991]. This YANG module references [RFC6991].
<CODE BEGINS> file "ietf-alarms@2018-08-08.yang" <CODE BEGINS> file "ietf-alarms@2018-09-20.yang"
module ietf-alarms { module ietf-alarms {
yang-version 1.1; yang-version 1.1;
namespace "urn:ietf:params:xml:ns:yang:ietf-alarms"; namespace "urn:ietf:params:xml:ns:yang:ietf-alarms";
prefix al; prefix al;
import ietf-yang-types { import ietf-yang-types {
prefix yang; prefix yang;
reference "RFC 6991: Common YANG Data Types."; reference "RFC 6991: Common YANG Data Types.";
} }
skipping to change at page 17, line 27 skipping to change at page 20, line 43
The key words 'MUST', 'MUST NOT', 'REQUIRED', 'SHALL', 'SHALL The key words 'MUST', 'MUST NOT', 'REQUIRED', 'SHALL', 'SHALL
NOT', 'SHOULD', 'SHOULD NOT', 'RECOMMENDED', 'MAY', and NOT', 'SHOULD', 'SHOULD NOT', 'RECOMMENDED', 'MAY', and
'OPTIONAL' in the module text are to be interpreted as described 'OPTIONAL' in the module text are to be interpreted as described
in RFC 2119 (https://tools.ietf.org/html/rfc2119). in RFC 2119 (https://tools.ietf.org/html/rfc2119).
This version of this YANG module is part of RFC XXXX This version of this YANG module is part of RFC XXXX
(https://tools.ietf.org/html/rfcXXXX); see the RFC itself for (https://tools.ietf.org/html/rfcXXXX); see the RFC itself for
full legal notices."; full legal notices.";
revision 2018-08-08 { revision 2018-09-20 {
description description
"Initial revision."; "Initial revision.";
reference "RFC XXXX: YANG Alarm Module"; reference "RFC XXXX: YANG Alarm Module";
} }
/* /*
* Features * Features
*/ */
feature operator-actions { feature operator-actions {
skipping to change at page 44, line 42 skipping to change at page 48, line 12
mapping provided by the system is in conflict with other management mapping provided by the system is in conflict with other management
systems or not considered correct. systems or not considered correct.
Note that the IETF Alarm Module term 'resource' is synonymous to the Note that the IETF Alarm Module term 'resource' is synonymous to the
ITU term 'managed object'. ITU term 'managed object'.
7. The X.733 Mapping Module 7. The X.733 Mapping Module
This YANG module references [X.733] and [X.736]. This YANG module references [X.733] and [X.736].
<CODE BEGINS> file "ietf-alarms-x733@2018-08-08.yang" <CODE BEGINS> file "ietf-alarms-x733@2018-09-20.yang"
module ietf-alarms-x733 { module ietf-alarms-x733 {
yang-version 1.1; yang-version 1.1;
namespace "urn:ietf:params:xml:ns:yang:ietf-alarms-x733"; namespace "urn:ietf:params:xml:ns:yang:ietf-alarms-x733";
prefix x733; prefix x733;
import ietf-alarms { import ietf-alarms {
prefix al; prefix al;
} }
import ietf-yang-types { import ietf-yang-types {
prefix yang; prefix yang;
skipping to change at page 45, line 49 skipping to change at page 49, line 19
The module uses an integer and a corresponding string for The module uses an integer and a corresponding string for
probable cause instead of a globally defined enumeration, in probable cause instead of a globally defined enumeration, in
order to be able to manage conflicting enumeration definitions. order to be able to manage conflicting enumeration definitions.
A single globally defined enumeration is challenging to A single globally defined enumeration is challenging to
maintain."; maintain.";
reference reference
"ITU Recommendation X.733: Information Technology "ITU Recommendation X.733: Information Technology
- Open Systems Interconnection - Open Systems Interconnection
- System Management: Alarm Reporting Function"; - System Management: Alarm Reporting Function";
revision 2018-08-08 { revision 2018-09-20 {
description description
"Initial revision."; "Initial revision.";
reference "RFC XXXX: YANG Alarm Module"; reference "RFC XXXX: YANG Alarm Module";
} }
/* /*
* Features * Features
*/ */
feature configure-x733-mapping { feature configure-x733-mapping {
description description
"The system supports configurable X733 mapping from "The system supports configurable X733 mapping from
skipping to change at page 58, line 46 skipping to change at page 62, line 17
"The semantics of alarm definitions: enabling systematic "The semantics of alarm definitions: enabling systematic
reasoning about alarms. International Journal of Network reasoning about alarms. International Journal of Network
Management, Volume 22, Issue 3, John Wiley and Sons, Ltd, Management, Volume 22, Issue 3, John Wiley and Sons, Ltd,
http://dx.doi.org/10.1002/nem.800", March 2012. http://dx.doi.org/10.1002/nem.800", March 2012.
[EEMUA] EEMUA Publication No. 191 Engineering Equipment and [EEMUA] EEMUA Publication No. 191 Engineering Equipment and
Materials Users Association, London, 2 edition., "Alarm Materials Users Association, London, 2 edition., "Alarm
Systems: A Guide to Design, Management and Procurement.", Systems: A Guide to Design, Management and Procurement.",
2007. 2007.
[G.7710] ITU-T, "SERIES G: TRANSMISSION SYSTEMS AND MEDIA, DIGITAL
SYSTEMS AND NETWORKS Data over Transport - Generic aspects
- Transport network control aspects. Common equipment
management function requirements", 2012.
[ISA182] International Society of Automation,ISA, "ANSI/ISA- [ISA182] International Society of Automation,ISA, "ANSI/ISA-
18.2-2009 Management of Alarm Systems for the Process 18.2-2009 Management of Alarm Systems for the Process
Industries", 2009. Industries", 2009.
[RFC3877] Chisholm, S. and D. Romascanu, "Alarm Management [RFC3877] Chisholm, S. and D. Romascanu, "Alarm Management
Information Base (MIB)", RFC 3877, DOI 10.17487/RFC3877, Information Base (MIB)", RFC 3877, DOI 10.17487/RFC3877,
September 2004, <http://www.rfc-editor.org/info/rfc3877>. September 2004, <http://www.rfc-editor.org/info/rfc3877>.
[RFC8340] Bjorklund, M. and L. Berger, Ed., "YANG Tree Diagrams", [RFC8340] Bjorklund, M. and L. Berger, Ed., "YANG Tree Diagrams",
BCP 215, RFC 8340, DOI 10.17487/RFC8340, March 2018, BCP 215, RFC 8340, DOI 10.17487/RFC8340, March 2018,
skipping to change at page 64, line 13 skipping to change at page 67, line 13
</alarms> </alarms>
Appendix F. Background and Usability Requirements Appendix F. Background and Usability Requirements
This section gives background information regarding design choices in This section gives background information regarding design choices in
the alarm module. It also defines usability requirements for alarms. the alarm module. It also defines usability requirements for alarms.
Alarm usability is important for an alarm interface. A data-model Alarm usability is important for an alarm interface. A data-model
will help in defining the format but if the actual alarms are of low will help in defining the format but if the actual alarms are of low
value we have not gained the goal of alarm management. value we have not gained the goal of alarm management.
The telecommunication domain has standardised an alarm interface in The telecommunication domain has standardized an alarm interface in
ITU-T X.733 [X.733]. This continued in mobile networks within the ITU-T X.733 [X.733]. This continued in mobile networks within the
3GPP organisation [ALARMIRP]. Although SNMP is the dominant 3GPP organization [ALARMIRP]. Although SNMP is the dominant
mechanism for monitoring devices, IETF did not early on standardise mechanism for monitoring devices, IETF did not early on standardize
an alarm MIB. Instead, management systems interpreted the enterprise an alarm MIB. Instead, management systems interpreted the enterprise
specific traps per MIB and device to build an alarm list. When specific traps per MIB and device to build an alarm list. When
finally The Alarm MIB [RFC3877] was published, it had to address the finally The Alarm MIB [RFC3877] was published, it had to address the
existence of enterprise traps and map these into alarms. This existence of enterprise traps and map these into alarms. This
requirement led to a MIB that is not always easy to use. requirement led to a MIB that is not always easy to use.
F.1. Alarm Concepts F.1. Alarm Concepts
There are two misconceptions regarding alarms and alarm interfaces There are two misconceptions regarding alarms and alarm interfaces
that are important to sort out. The first problem is that alarms are that are important to sort out. The first problem is that alarms are
mixed with events in general. Alarms MUST correspond to an mixed with events in general. Alarms MUST correspond to an
undesirable state that needs corrective action. Many implementations undesirable state that needs corrective action. Many implementations
of alarm interfaces do not adhere to this principle and just send of alarm interfaces do not adhere to this principle and just send
events in general. In order to qualify as an alarm, there must exist events in general. In order to qualify as an alarm, there must exist
a corrective action. If that is not true, it is an event that can go a corrective action. If that is not true, it is an event that can go
into logs. into logs.
The other misconception is that the term "alarm" refers to the
notification itself. Rather, an alarm is a state of a resource in
the system. The alarm notifications report state changes of the
alarm, such as alarm raise and alarm clear.
"One of the most important principles of alarm management is that an "One of the most important principles of alarm management is that an
alarm requires an action. This means that if the operator does not alarm requires an action. This means that if the operator does not
need to respond to an alarm (because unacceptable consequences do not need to respond to an alarm (because unacceptable consequences do not
occur), then it is not an alarm. Following this cardinal rule will occur), then it is not an alarm. Following this cardinal rule will
help eliminate many potential alarm management issues." [ISA182] help eliminate many potential alarm management issues." [ISA182]
The other misconception is that the term "alarm" refers to the
notification itself. Rather, an alarm is a state of a resource in
the system. The alarm notifications report state changes of the
alarm, such as alarm raise and alarm clear.
F.1.1. Alarm type F.1.1. Alarm type
Since every alarm has a corresponding corrective action, a vendor can Since every alarm has a corresponding corrective action, a vendor can
prepare a list of available alarms and their corrective actions. prepare a list of available alarms and their corrective actions.
We use the term "alarm type" to refer to every possible alarm that We use the term "alarm type" to refer to every possible alarm that
could be active in the system. could be active in the system.
Alarm types are also fundamental in order to provide a state-based Alarm types are also fundamental in order to provide a state-based
alarm list. The alarm list correlates alarm state changes for the alarm list. The alarm list correlates alarm state changes for the
same alarm type and the same resource into one alarm. same alarm type and the same resource into one alarm.
skipping to change at page 65, line 19 skipping to change at page 68, line 19
Different alarm interfaces use different mechanisms to define alarm Different alarm interfaces use different mechanisms to define alarm
types, ranging from simple error numbers to more advanced mechanisms types, ranging from simple error numbers to more advanced mechanisms
like the X.733 triplet of event type, probable cause and specific like the X.733 triplet of event type, probable cause and specific
problem. problem.
A common misunderstanding is that individual alarm notifications are A common misunderstanding is that individual alarm notifications are
alarm types. This is not correct; e.g., "link-up" and "link-down" alarm types. This is not correct; e.g., "link-up" and "link-down"
are two notifications reporting different states for the same alarm are two notifications reporting different states for the same alarm
type, "link-alarm". type, "link-alarm".
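The distinction between notifications and alarm types can be sketched
in Python. This is only an illustration and not part of the YANG
module: the notification names come from the example above, while the
NOTIFICATION_MAP table, the apply_notification helper and the resource
name "eth0" are hypothetical.

```python
# Hypothetical mapping: notification name -> (alarm type, cleared?).
# Both notifications report on the same alarm type, "link-alarm".
NOTIFICATION_MAP = {
    "link-down": ("link-alarm", False),
    "link-up": ("link-alarm", True),
}


def apply_notification(alarm_list, resource, notification):
    """Fold one notification into a state-based alarm list."""
    alarm_type, is_cleared = NOTIFICATION_MAP[notification]
    # One entry per (resource, alarm type); later notifications
    # update the existing entry instead of adding a new one.
    alarm_list[(resource, alarm_type)] = {"is_cleared": is_cleared}
    return alarm_list


alarms = {}
apply_notification(alarms, "eth0", "link-down")
apply_notification(alarms, "eth0", "link-up")
```

After both notifications the list holds a single alarm for "eth0",
whose state is cleared, rather than two unrelated entries.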
F.2. Usability Requirements F.2. Relationships to other alarm standards
Common alarm problems and the cause of the problems are summarised in This section briefly describes how this alarm module relates to other
Table 1. This summary is adopted to networking based on the ISA relevant alarm standards. It covers the definition of the concept of
an alarm and the data models of the referenced alarm standards.
F.2.1. Alarm definition
The table below summarizes relevant definitions of the term "alarm".
+------------+---------------------------+--------------------------+
| Standard   | Definition                | Comment                  |
+------------+---------------------------+--------------------------+
| X.733      | error: A deviation of a   | The X.733 alarm          |
| [X.733]    | system from normal        | definition is focused on |
|            | operation. fault: The     | the notification as such |
|            | physical or algorithmic   | and not the state. It    |
|            | cause of a malfunction.   | also uses the basic      |
|            | Faults manifest           | criteria of deviation    |
|            | themselves as errors.     | from normal condition.   |
|            | alarm: A notification, of | There is no requirement  |
|            | the form defined by this  | for an operator action.  |
|            | function, of a specific   |                          |
|            | event. An alarm may or    |                          |
|            | may not represent an      |                          |
|            | error.                    |                          |
|            |                           |                          |
| G.7710     | Alarms are indications    | The G.7710 definition is |
| [G.7710]   | that are automatically    | close to the original    |
|            | generated by an NE as a   | X.733 definition.        |
|            | result of the declaration |                          |
|            | of a failure.             |                          |
|            |                           |                          |
| Alarm MIB  | Alarm: Persistent         | RFC 3877 defines alarm   |
| [RFC3877]  | indication of a fault.    | referring back to "a     |
|            | Fault: Lasting error or   | deviation from normal    |
|            | warning condition.        | operation". This is      |
|            | Error: A deviation of a   | problematic, since this  |
|            | system from normal        | might not require an     |
|            | operation.                | operator action. The     |
|            |                           | alarm MIB is state       |
|            |                           | oriented rather than     |
|            |                           | notification oriented,   |
|            |                           | an alarm is a "lasting   |
|            |                           | condition", not a        |
|            |                           | discrete notification    |
|            |                           | reporting about a        |
|            |                           | condition state change.  |
|            |                           |                          |
| ISA        | Alarm: An audible and/or  | The ISA standard adds an |
| [ISA182]   | visible means of          | important requirement to |
|            | indicating to the         | the "deviation from      |
|            | operator an equipment     | normal condition state"; |
|            | malfunction, process      | requiring a response.    |
|            | deviation or abnormal     |                          |
|            | condition requiring a     |                          |
|            | response.                 |                          |
|            |                           |                          |
| EEMUA      | An alarm is an event to   | This is the foundation   |
| [EEMUA]    | which an operator must    | for the definition of    |
|            | knowingly react, respond, | alarm in this document.  |
|            | and acknowledge - not     | It focuses on the core   |
|            | simply acknowledge and    | criteria that an action  |
|            | ignore.                   | is really needed.        |
|            |                           |                          |
| 3GPP Alarm | 3GPP v15: An alarm        | The latest 3GPP Alarm    |
| IRP        | signifies an undesired    | IRP version uses         |
| [ALARMIRP] | condition of a resource   | literally the same alarm |
|            | (e.g. network element,    | definition as this alarm |
|            | link) for which an        | module. It is worth      |
|            | operator action is        | noting that earlier      |
|            | required. It emphasizes a | versions used a          |
|            | key requirement that      | definition not requiring |
|            | operators [...] should    | an operator action and   |
|            | not be informed about an  | the more broad           |
|            | undesired condition       | definition of deviation  |
|            | unless it requires        | from normal condition.   |
|            | operator action. 3GPP     | The earlier version also |
|            | v12: alarm: abnormal      | defined an alarm as a    |
|            | network entity condition, | special case of "event". |
|            | which categorizes an      |                          |
|            | event as a fault. fault:  |                          |
|            | a deviation of a system   |                          |
|            | from normal operation,    |                          |
|            | which may result in the   |                          |
|            | loss of operational       |                          |
|            | capabilities [...]        |                          |
+------------+---------------------------+--------------------------+
Table 1: Definition of alarm in standards
The definition of alarm has evolved from a focus on events reporting
a deviation from normal operation towards an undesired *state* which
*requires an operator action*.
F.2.2. Data model
This section describes how this YANG alarm module relates to other
standard data models. Note that only other data models for alarm
interfaces are covered, not other standards such as SDO-specific
alarms.
F.2.2.1. X.733
X.733 has acted as a base for several alarm data models over the
years. The YANG alarm module differs in the following ways:
X.733 models the alarm list as a list of notifications. The YANG
alarm module defines the alarm list as the current alarm states
for the resources, which is generated from the state change
reporting notifications.
In X.733 an alarm can have the severity level clear. In the YANG
alarm module "clear" is not a severity level; it is a separate
state of the alarm. For example, an alarm can have the states
(major, cleared) or (minor, not cleared).
X.733 uses a flat globally defined enumerated "probable cause" to
identify alarm types. This alarm module uses a hierarchical YANG
identity, alarm-type. This enables delegation of alarm types
within organizations. It also lets management reason about
"abstract" alarm-types corresponding to base identities, see
Section 3.2.
The YANG alarm module does not include the majority of the X.733
alarm attributes. Rather, these are defined in an augmenting
module for use when "strict" X.733 compliance is needed.
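Two of the differences above can be sketched in Python. This is an
illustration under stated assumptions, not a rendering of the YANG
module: the BASES table, the identity names in it, the derived_from
helper and the AlarmState class are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical identity hierarchy: identity -> its base identity.
BASES = {
    "communications-alarm": "alarm-type-id",
    "link-alarm": "communications-alarm",
}


def derived_from(identity, base):
    """True if `identity` derives, directly or indirectly, from `base`."""
    current = BASES.get(identity)
    while current is not None:
        if current == base:
            return True
        current = BASES.get(current)
    return False


@dataclass
class AlarmState:
    severity: str     # for example "minor" or "major", never "clear"
    is_cleared: bool  # kept separate from the severity


# A cleared alarm still remembers its last severity: (major, cleared).
state = AlarmState(severity="major", is_cleared=True)
```

A manager could use a check like derived_from to select all alarms
below an "abstract" base type, which a flat enumeration does not
support directly.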
F.2.2.2. RFC3877, the Alarm MIB
The MIB in RFC3877 takes a different approach. Rather than defining a
concrete data-model for alarms, it defines a model to map existing
SNMP managed-objects and notifications into alarm states and alarm
notifications. This was necessary since MIBs were already defined
with both managed objects and notifications indicating alarms, for
example linkUp and linkDown notifications in combination with
ifAdminStatus and ifOperStatus. So RFC3877 cannot really be compared
to the alarm YANG module in that sense.
The Alarm MIB maps existing MIB definitions into alarms in the
alarmModelTable. The upside of that is that an SNMP manager can at
runtime read the possible alarm types. This corresponds to the
alarm inventory in the alarm YANG module.
F.2.2.3. 3GPP Alarm IRP
The 3GPP Alarm IRP is an evolution of X.733. The main differences
between the alarm YANG module and 3GPP are:
3GPP keeps the majority of the X.733 attributes; the alarm YANG
module does not.
3GPP introduced overlapping and possibly conflicting keys for
alarms, alarmId and (managed object, event type, probable cause,
specific problem). (See Annex C in [X.733] Example 3). In the
YANG alarm module the key for identifying an alarm instance is
clearly defined by (resource, alarm-type, alarm-type-qualifier).
See also Section 3.4 for more information.
The alarm YANG module clearly separates the resource/
instrumentation life cycle from the operator life cycle. 3GPP
allows operators to set the alarm severity to clear. This is not
allowed by this module; rather, an operator closes an alarm, which
does not affect the severity.
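The alarm key and the two separate life cycles can be sketched in
Python. This is a minimal illustration, not the module's definition:
the dictionary layout, the raise_alarm and operator_close helpers, the
operator_state field and the example values are hypothetical.

```python
# Alarm list keyed by (resource, alarm-type, alarm-type-qualifier).
alarm_list = {}


def raise_alarm(resource, alarm_type, qualifier, severity):
    """Instrumentation life cycle: set severity and cleared state."""
    key = (resource, alarm_type, qualifier)
    entry = alarm_list.setdefault(key, {"operator_state": "none"})
    entry["severity"] = severity
    entry["is_cleared"] = False
    return key


def operator_close(key):
    """Operator life cycle: touches only the operator state."""
    alarm_list[key]["operator_state"] = "closed"


key = raise_alarm("eth0", "link-alarm", "", "minor")
raise_alarm("eth0", "link-alarm", "", "major")  # same alarm, new severity
operator_close(key)
```

Both raise_alarm calls resolve to the same key, so the list holds one
alarm, and closing it leaves the severity and cleared state as the
instrumentation last reported them.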
F.2.2.4. G.7710
G.7710 is different from the previously referenced alarm standards.
It does not define a data-model for alarm reporting. It defines
common equipment management function requirements including alarm
instrumentation. The scope is transport networks.
The requirements in G.7710 correspond to features in the alarm YANG
module in the following way:
Alarm Severity Assignment Profile (ASAP): the alarm profile
"/alarms/alarm-profile/".
Alarm Reporting Control (ARC): alarm shelving "/alarms/control/
alarm-shelving/" and the ability to control alarm notifications
"/alarms/control/notify-status-changes".
F.3. Usability Requirements
Common alarm problems and their causes are summarized in
Table 2. This summary is adapted to networking based on the ISA
[ISA182] and EEMUA [EEMUA] standards. [ISA182] and EEMUA [EEMUA] standards.
+------------------+--------------------------------+---------------+ +------------------+--------------------------------+---------------+
| Problem | Cause | How this | | Problem | Cause | How this |
| | | module | | | | module |
| | | address the | | | | address the |
| | | cause | | | | cause |
+------------------+--------------------------------+---------------+ +------------------+--------------------------------+---------------+
| Alarms are | "Nuisance" alarms (chattering | Strict | | Alarms are | "Nuisance" alarms (chattering | Strict |
| generated but | alarms and fleeting alarms), | definition of | | generated but | alarms and fleeting alarms), | definition of |
| they are ignored | faulty hardware, redundant | alarms | | they are ignored | faulty hardware, redundant | alarms |
| by the operator. | alarms, cascading alarms, | requiring | | by the operator. | alarms, cascading alarms, | requiring |
| | incorrect alarm settings, | corrective | | | incorrect alarm settings, | corrective |
| | alarms have not been | response. | | | alarms have not been | response. |
| | rationalised, the alarms | Alarm | | | rationalized, the alarms | Alarm |
| | represent log information | requirements | | | represent log information | requirements |
| | rather than true alarms. | in Table 2. | | | rather than true alarms. | in Table 3. |
| | | | | | | |
| When alarms | Insufficient alarm response | The alarm | | When alarms | Insufficient alarm response | The alarm |
| occur, operators | procedures and not well | inventory | | occur, operators | procedures and not well | inventory |
| do not know how | defined alarm types. | lists all | | do not know how | defined alarm types. | lists all |
| to respond. | | alarm types | | to respond. | | alarm types |
| | | and | | | | and |
| | | corrective | | | | corrective |
| | | actions. | | | | actions. |
| | | Alarm | | | | Alarm |
| | | requirements | | | | requirements |
| | | in Table 2. | | | | in Table 3. |
| | | | | | | |
| The alarm | Nuisance alarms, stale alarms, | The alarm | | The alarm | Nuisance alarms, stale alarms, | The alarm |
| display is full | alarms from equipment not in | definition | | display is full | alarms from equipment not in | definition |
| of alarms, even | service. | and alarm | | of alarms, even | service. | and alarm |
| when there is | | shelving. | | when there is | | shelving. |
| nothing wrong. | | | | nothing wrong. | | |
| | | | | | | |
| During a | Incorrect prioritization of | State-based | | During a | Incorrect prioritization of | State-based |
| failure, | alarms. Not using advanced | alarm model, | | failure, | alarms. Not using advanced | alarm model, |
| operators are | alarm techniques (e.g. state- | alarm rate | | operators are | alarm techniques (e.g. state- | alarm rate |
| flooded with so | based alarming). | requirements | | flooded with so | based alarming). | requirements |
| many alarms that | | in Table 3 | | many alarms that | | in Table 4 |
| they do not know | | and Table 4 | | they do not know | | and Table 5 |
| which ones are | | | | which ones are | | |
| the most | | | | the most | | |
| important. | | | | important. | | |
+------------------+--------------------------------+---------------+ +------------------+--------------------------------+---------------+
Table 1: Alarm Problems and Causes Table 2: Alarm Problems and Causes
Based upon the above problems EEMUA gives the following definition of Based upon the above problems EEMUA gives the following definition of
a good alarm: a good alarm:
+----------------+--------------------------------------------------+ +----------------+--------------------------------------------------+
| Characteristic | Explanation | | Characteristic | Explanation |
+----------------+--------------------------------------------------+ +----------------+--------------------------------------------------+
| Relevant | Not spurious or of low operational value. | | Relevant | Not spurious or of low operational value. |
| | | | | |
| Unique | Not duplicating another alarm. | | Unique | Not duplicating another alarm. |
| | | | | |
| Timely | Not long before any response is needed or too | | Timely | Not long before any response is needed or too |
| | late to do anything. | | | late to do anything. |
| | | | | |
| Prioritised | Indicating the importance that the operator | | Prioritized | Indicating the importance that the operator |
| | deals with the problem. | | | deals with the problem. |
| | | | | |
| Understandable | Having a message which is clear and easy to | | Understandable | Having a message which is clear and easy to |
| | understand. | | | understand. |
| | | | | |
| Diagnostic | Identifying the problem that has occurred. | | Diagnostic | Identifying the problem that has occurred. |
| | | | | |
| Advisory | Indicative of the action to be taken. | | Advisory | Indicative of the action to be taken. |
| | | | | |
| Focusing | Drawing attention to the most important issues. | | Focusing | Drawing attention to the most important issues. |
+----------------+--------------------------------------------------+ +----------------+--------------------------------------------------+
Table 2: Definition of a Good Alarm Table 3: Definition of a Good Alarm
Vendors SHOULD rationalise all alarms according to above. Another Vendors SHOULD rationalize all alarms according to above. Another
crucial requirement is acceptable alarm notification rates. Vendors crucial requirement is acceptable alarm notification rates. Vendors
SHOULD make sure that they do not exceed the recommendations from SHOULD make sure that they do not exceed the recommendations from
EEMUA below: EEMUA below:
+-----------------------------------+-------------------------------+ +-----------------------------------+-------------------------------+
| Long Term Alarm Rate in Steady | Acceptability | | Long Term Alarm Rate in Steady | Acceptability |
| Operation | | | Operation | |
+-----------------------------------+-------------------------------+ +-----------------------------------+-------------------------------+
| More than one per minute | Very likely to be | | More than one per minute | Very likely to be |
| | unacceptable. | | | unacceptable. |
| | | | | |
| One per 2 minutes | Likely to be over-demanding. | | One per 2 minutes | Likely to be over-demanding. |
| | | | | |
| One per 5 minutes | Manageable. | | One per 5 minutes | Manageable. |
| | | | | |
| Less than one per 10 minutes | Very likely to be acceptable. | | Less than one per 10 minutes | Very likely to be acceptable. |
+-----------------------------------+-------------------------------+ +-----------------------------------+-------------------------------+
Table 3: Acceptable Alarm Rates, Steady State Table 4: Acceptable Alarm Rates, Steady State
+----------------------------+--------------------------------------+ +----------------------------+--------------------------------------+
| Number of alarms displayed | Acceptability | | Number of alarms displayed | Acceptability |
| in 10 minutes following a | | | in 10 minutes following a | |
| major network problem | | | major network problem | |
+----------------------------+--------------------------------------+ +----------------------------+--------------------------------------+
| More than 100 | Definitely excessive and very likely | | More than 100 | Definitely excessive and very likely |
| | to lead to the operator to abandon | | | to lead to the operator to abandon |
| | the use of the alarm system. | | | the use of the alarm system. |
| | | | | |
| 20-100 | Hard to cope with. | | 20-100 | Hard to cope with. |
| | | | | |
| Under 10 | Should be manageable - but may be | | Under 10 | Should be manageable - but may be |
| | difficult if several of the alarms | | | difficult if several of the alarms |
| | require a complex operator response. | | | require a complex operator response. |
+----------------------------+--------------------------------------+ +----------------------------+--------------------------------------+
Table 4: Acceptable Alarm Rates, Burst Table 5: Acceptable Alarm Rates, Burst
The numbers in Table 3 and Table 4 are the sum of all alarms for a The numbers in Table 4 and Table 5 are the sum of all alarms for a
network being managed from one alarm console. So every individual network being managed from one alarm console. So every individual
system or NMS contributes to these numbers. system or NMS contributes to these numbers.
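A management system could benchmark its measured alarm counts against
the two rate tables along the following lines. This Python sketch is
only an illustration: the function names are hypothetical, and because
the tables list isolated points, the boundaries between rows (an
interval of 1 to 5 minutes, 5 to 10 minutes, and a burst of 10 to 19
alarms) are assumptions of this sketch, not EEMUA guidance.

```python
def classify_steady_rate(alarm_count, minutes):
    """Classify a long-term rate by the average interval between alarms."""
    interval = minutes / alarm_count if alarm_count else float("inf")
    if interval < 1:    # more than one alarm per minute
        return "very likely to be unacceptable"
    if interval > 10:   # less than one alarm per 10 minutes
        return "very likely to be acceptable"
    if interval < 5:    # assumption: rows in between are interpolated
        return "likely to be over-demanding"
    return "manageable"


def classify_burst(alarm_count):
    """Classify alarms displayed in the 10 minutes after a major problem."""
    if alarm_count > 100:
        return "definitely excessive"
    if alarm_count >= 20:
        return "hard to cope with"
    if alarm_count < 10:
        return "should be manageable"
    return "not covered by the table"  # 10 to 19 alarms is unspecified
```

The counts passed in are the sum over everything shown on one alarm
console, so each system or NMS would need to budget for only a share
of these numbers.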
Vendors SHOULD make sure that the following rules are used in Vendors SHOULD make sure that the following rules are used in
designing the alarm interface: designing the alarm interface:
1. Rationalize the alarms in the system to ensure that every alarm 1. Rationalize the alarms in the system to ensure that every alarm
is necessary, has a purpose, and follows the cardinal rule - that is necessary, has a purpose, and follows the cardinal rule - that
it requires an operator response. Adheres to the rules of it requires an operator response. Adheres to the rules of
Table 2 Table 3
2. Audit the quality of the alarms. Talk with the operators about 2. Audit the quality of the alarms. Talk with the operators about
how well the alarm information support them. Do they know what how well the alarm information support them. Do they know what
to do in the event of an alarm? Are they able to quickly to do in the event of an alarm? Are they able to quickly
diagnose the problem and determine the corrective action? Does diagnose the problem and determine the corrective action? Does
the alarm text adhere to the requirements in Table 2? the alarm text adhere to the requirements in Table 3?
3. Analyze and benchmark the performance of the system and compare 3. Analyze and benchmark the performance of the system and compare
it to the recommended metrics in Table 3 and Table 4. Start by it to the recommended metrics in Table 4 and Table 5. Start by
identifying nuisance alarms, standing alarms at normal state and identifying nuisance alarms, standing alarms at normal state and
startup. startup.
Authors' Addresses Authors' Addresses
Stefan Vallin Stefan Vallin
Stefan Vallin AB Stefan Vallin AB
Email: stefan@wallan.se Email: stefan@wallan.se
Martin Bjorklund Martin Bjorklund
 End of changes. 46 change blocks. 
82 lines changed or deleted 411 lines changed or added
