draft-ietf-forces-netlink-01.txt   draft-ietf-forces-netlink-02.txt 
ForCES Working Group Jamal Hadi Salim ForCES Working Group Jamal Hadi Salim
Internet Draft Znyx Networks Internet Draft Znyx Networks
Hormuzd Khosravi Hormuzd Khosravi
Intel Intel
Andi Kleen Andi Kleen
Suse Suse
Alexey Kuznetsov Alexey Kuznetsov
INR/Swsoft INR/Swsoft
November 2001 March 2002
Netlink as an IP services protocol Netlink as an IP services protocol
draft-ietf-forces-netlink-02.txt
draft-ietf-forces-netlink-01.txt
Status of this Memo Status of this Memo
This document is an Internet-Draft and is in full conformance with This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026. Internet-Drafts are working all provisions of Section 10 of RFC2026. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas, documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups. Note that other groups may also distribute and its working groups. Note that other groups may also distribute
working documents as Internet-Drafts. working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
skipping to change at page 2, line 5 skipping to change at page 2, line 5
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
this document are to be interpreted as described in [RFC-2119]. this document are to be interpreted as described in [RFC-2119].
1. Abstract 1. Abstract
This document describes Linux Netlink, which is used in Linux both This document describes Linux Netlink, which is used in Linux both
as an inter-kernel messaging system as well as between kernel and as an inter-kernel messaging system as well as between kernel and
jhs_hk_ak_ank draft-forces-netlink-01.txt jhs_hk_ak_ank draft-forces-netlink-02.txt
user-space. This document is intended as informational in the user-space. This document is intended as informational in the
context of prior art for the ForCES IETF working context of prior art for the ForCES IETF working
group. The focus of this document is to describe netlink from a group. The focus of this document is to describe netlink from a
context of a protocol between a Forwording Engine Component (FEC) context of a protocol between a Forwarding Engine Component (FEC)
and a Control Plane Component(CPC) that define an IP service. and a Control Plane Component(CPC) that define an IP service.
The document ignores the ability of netlink as an inter-kernel The document ignores the ability of netlink as an inter-kernel
messaging system, as an inter-process communication scheme (IPC) or messaging system, as an inter-process communication scheme (IPC) or
its use in configuring other non-network as well as network but its use in configuring other non-network as well as network but
non-IP services (such as decnet etc). non-IP services (such as decnet etc).
2. Introduction 2. Introduction
The concept of IP Service control-forwarding separation was first The concept of IP Service control-forwarding separation was first
skipping to change at page 3, line 5 skipping to change at page 3, line 5
trol for a different IP service being executed by a FE component. trol for a different IP service being executed by a FE component.
This means that there might be several CPCs on a physical CP if it This means that there might be several CPCs on a physical CP if it
is controlling several IP services. In essence, the cohesion is controlling several IP services. In essence, the cohesion
between a CP component and a FE component is the service abstrac- between a CP component and a FE component is the service abstrac-
tion. tion.
In the diagram below we show a simple FE<->CP setup to provide an In the diagram below we show a simple FE<->CP setup to provide an
example of the classical IPv4 service with an extension to do some example of the classical IPv4 service with an extension to do some
basic QoS egress scheduling and how it fits in this described basic QoS egress scheduling and how it fits in this described
model. model.
Control Plane (CP) Control Plane (CP)
.------------------------------------ .------------------------------------
| /^^^^^ /^^^^^ | | /^^^^^\ /^^^^^\ |
| | | | COPS |- | | | | | COPS |-. |
| | ospfd | | PEP | | | | ospfd | | PEP | | |
| / _____/ | | | | / \_____/ | |
/------_____/ | | | /--------\_____/ | | |
| | | | | | | | | | | |
| |_____________________|___|_________| | |______________________|___|_________|
| | | | | | | |
****************************************** ******************************************
Forwarding ************* Netlink layer ************ Forwarding ************* Netlink layer ************
Engine (FE) ***************************************** Engine (FE) *****************************************
.-------------|-----------|------------|---|----------- .-------------|-----------|------------|---|-----------
| IPv4 forwading | / | | IPv4 forwading | / |
| FE Service / / | | FE Service / / |
| Component / / | | Component / / |
| ---------------/---------------/--------- | | ---------------/---------------/--------- |
| | | / | | | | | / | |
packet | | --------|-- ----|----- | packet packet | | --------|-- ----|----- | packet
in | | | IPV4 | | Egress | | out in | | | IPV4 | | Egress | | out -->---> |------>|---->|Forwading |----->| QoS |--->| ---->|---->
-->--->|------>|---->|Forwading |----->| QoS |--->| ---->|---->
| | | | | Scheduler| | | | | | | | Scheduler| | |
| | ----------- ---------- | | | | ----------- ---------- | |
| | | | | | | |
| --------------------------------------- | | --------------------------------------- |
| | | |
------------------------------------------------------- -------------------------------------------------------
2.1.1. Control Plane Components (CPCs) 2.1.1. Control Plane Components (CPCs)
Control plane components would encompass signalling protocols with Control plane components would encompass signalling protocols with
diversity ranging from dynamic routing protocols such as OSPF diversity ranging from dynamic routing protocols such as OSPF
[RFC2328] to tag distribution protocols such as CR-LDP [RFC3036]. [RFC2328] to tag distribution protocols such as CR-LDP [RFC3036].
Classical Management protocols and activities also fall under this Classical Management protocols and activities also fall under this
category. These include SNMP [RFC1157], COPS [RFC2748] or propri- category. These include SNMP [RFC1157], COPS [RFC2748] or propri-
etary CLI/GUI configuration mechanisms. etary CLI/GUI configuration mechanisms.
The purpose of the control plane is to provide an execution envi- The purpose of the control plane is to provide an execution envi-
ronment for the above mentioned activities with the ultimate goal ronment for the above mentioned activities with the ultimate goal
being to configure and manage the second NE component: the FE. The being to configure and manage the second NE component: the FE. The
result of the configuration would define the way packets traversing result of the configuration would define the way packets traversing
the FE are treated. the FE are treated.
In the above diagram, ospfd and COPS are distinct CPCs. In the above diagram, ospfd and COPS are distinct CPCs.
2.1.2. Forwarding Engine Components 2.1.2. Forwarding Engine Components
skipping to change at page 5, line 5 skipping to change at page 5, line 5
arrives at the NE to the moment it departs. In essence an IP ser- arrives at the NE to the moment it departs. In essence an IP ser-
vice in this context is a Per-Hop Behavior. A service control/sig- vice in this context is a Per-Hop Behavior. A service control/sig-
naling protocol/management-application (CP components running on naling protocol/management-application (CP components running on
NEs defining the end to end path) unifies the end to end view of NEs defining the end to end path) unifies the end to end view of
the IP service. As noted above, these CP components then define the the IP service. As noted above, these CP components then define the
behavior of the FE (and therefore the NE) to a described packet. behavior of the FE (and therefore the NE) to a described packet.
A simple example of an IP service is the classical IPv4 Forwarding. A simple example of an IP service is the classical IPv4 Forwarding.
In this case, control components such as routing protocols(OSPF, In this case, control components such as routing protocols(OSPF,
RIP etc) and proprietary CLI/GUI configurations modify the FE's RIP etc) and proprietary CLI/GUI configurations modify the FE's
forwarding tables in order to offer the simple service of forward- forwarding tables in order to offer the simple service of forward-
ing packets to the next hop. Traditionally, NEs offering this sim- ing packets to the next hop. Traditionally, NEs offering this sim-
ple service are known as routers. ple service are known as routers.
Over the years it has become important to add additional services to Over the years it has become important to add additional services to
the routers to meet emerging requirements. More complex services the routers to meet emerging requirements. More complex services
extending classical forwarding were added and standardized. These extending classical forwarding were added and standardized. These
newer services might go beyond the layer 3 contents of the packet newer services might go beyond the layer 3 contents of the packet
skipping to change at page 5, line 32 skipping to change at page 5, line 32
One extreme definition of an IP service is something a service One extreme definition of an IP service is something a service
provider would be able to charge for. provider would be able to charge for.
3. Netlink Architecture 3. Netlink Architecture
IP services components control is defined by using templates. IP services components control is defined by using templates.
The FEC and CPC participate to deliver the IP service by The FEC and CPC participate to deliver the IP service by
communicating using these templates. The FEC might continuously get communicating using these templates. The FEC might continuously get
updates from the control plane component on how to operate the ser- updates from the control plane component on how to operate the ser-
vice (example for V4 forwarding route additions or deletions). vice (example for V4 forwarding, route additions or deletions).
The interaction between the FEC and the CPC, in the netlink con- The interaction between the FEC and the CPC, in the netlink con-
text, would define a protocol. Netlink provides the mechanism for text, would define a protocol. Netlink provides the mechanism for
the CPC(residing in user space) and FEC(residing in kernel space) the CPC(residing in user space) and FEC(residing in kernel space)
to define their own protocol definition. Kernel space and user to have their own protocol definition. Kernel space and user space
space just mean different protection domains direct where direct just mean different protection domains. Therefore a wire protocol
memory access is not allowed inbetween. Therefore a wire protocol
is needed to communicate. The wire protocol would normally be is needed to communicate. The wire protocol would normally be
provided by some privileged service that is able to copy between provided by some privileged service that is able to copy between
multiple protection domains. We will call this service netlink multiple protection domains. We will refer to this service as the
service. Netlink service could also be mapped to a different netlink service. Netlink service could also be encapsulated to a
transport layer if the CPC should be running on a different node different transport layer if the CPC executes on a different node
than the CPC. The FEC and CPC, using netlink mechanisms, may than the FEC. The FEC and CPC, using netlink mechanisms, may
choose to define a reliable protocol between each other, for exam- choose to define a reliable protocol between each other. By
ple. By default netlink provides an unreliable communication. default, however, netlink provides an unreliable communication.
Note that the FEC and CPC can both live in the same memory protec- Note that the FEC and CPC can both live in the same memory protec-
tion domain and use the connect() system call to create a path to tion domain and use the connect() system call to create a path to
the peer and talk to each other. We will not discuss this further the peer and talk to each other. We will not discuss this further
other than to say it is available as a mechanism. Through out this
other than to say it is available as a mechanism. Through out this document we will refer interchangebly to the FEC to mean kernel-
document we will refer interchangbly to the FEC to mean kernel- space and the CPC to mean user-space. This is not meant, however,
space and the CPC to mean user-space. to restrict the two components to these protection domains or to
the same compute node.
Note: Netlink allows participation in IP services by both service Note: Netlink allows participation in IP services by both service
components. components.
3.1. Netlink Logical model 3.1. Netlink Logical model
In the diagram below we show a simple FEC<->CPC logical relation- In the diagram below we show a simple FEC<->CPC logical relation-
ship. We use the example of IPV4 forwarding FEC (NETLINK_ROUTE, ship. We use the example of IPV4 forwarding FEC (NETLINK_ROUTE,
which is discussed further below) as an example. which is discussed further below) as an example.
skipping to change at page 7, line 5 skipping to change at page 7, line 5
| --------------------------------------- | | --------------------------------------- |
| | | |
----------------------------------------------------- -----------------------------------------------------
Netlink logically models FECs and CPCs in the form of nodes inter- Netlink logically models FECs and CPCs in the form of nodes inter-
connected to each other via a broadcast wire. connected to each other via a broadcast wire.
The wire is specific to a service. The example above shows the The wire is specific to a service. The example above shows the
broadcast wire belonging to the extended IPV4 forwarding service. broadcast wire belonging to the extended IPV4 forwarding service.
Nodes connect to the wire and register to receive specific mes- Nodes connect to the wire and register to receive specific mes-
sages. CPCs may connect to multiple wires if it helps them to con- sages. CPCs may connect to multiple wires if it helps them to con-
trol the service better. All nodes(CPCs and FECs) dump packets on trol the service better. All nodes(CPCs and FECs) dump packets on
the broadcast wire. Packets could be discarded by the wire if the broadcast wire. Packets could be discarded by the wire if
malformed or not specifically formatted for the wire. Dropped packets malformed or not specifically formatted for the wire. Dropped packets
are not seen by any of the nodes. The netlink service MAY signal are not seen by any of the nodes. The netlink service MAY signal
an error to the originator if it detects a malformed netlink an error to the originator if it detects a malformed netlink
packet. packet.
Packets sent on the wire could be broadcast, multicast or unicast. Packets sent on the wire could be broadcast, multicast or unicast.
FECs or CPCs pick specific messages of interest for processing or FECs or CPCs register for and pick specific messages of interest
just monitoring purposes. for processing or just monitoring purposes.
3.2. The message format 3.2. The message format
There are three levels to a netlink message: The general netlink There are three levels to a netlink message: The general netlink
message header, the IP service specific template, the IP service message header, the IP service specific template, the IP service
specific data. specific data.
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
skipping to change at page 7, line 42 skipping to change at page 7, line 42
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| | | |
| IP Service Template | | IP Service Template |
| | | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| | | |
| IP Service specific data in TLVs | | IP Service specific data in TLVs |
| | | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The netlink message is used to communicate between the FEC and CPC
for parameterization of the FECs, asynchronous event notification
of FEC events to the CPCs, and statistics querying/gathering
(typically by the CPC). The Netlink message header is generic for all
services whereas the IP Service Template header is specific to a
service. Each IP Service then carries parameterization
data (CPC->FEC direction) or response (FEC->CPC direction). These
are in TLV format and unique just to the service.
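As an illustrative sketch only (it is not part of either draft revision), the three levels of a netlink message could be assembled as follows. The TLV layout used here (16-bit length, 16-bit type, value padded to a 4-byte boundary) follows the Linux rtattr convention and is an assumption relative to the text above; the helper names pack_tlv and build_message are hypothetical.

```python
import struct

# General netlink header: Length, Type, Flags, Sequence Number, Process PID.
NLMSG_HDR = struct.Struct("=IHHII")
ALIGNTO = 4  # assumed alignment, as used by Linux netlink


def align(n: int) -> int:
    """Round n up to the next multiple of ALIGNTO."""
    return (n + ALIGNTO - 1) & ~(ALIGNTO - 1)


def pack_tlv(attr_type: int, value: bytes) -> bytes:
    """Encode one attribute as length/type/value, zero-padded to ALIGNTO."""
    length = 4 + len(value)  # the length field covers the 4-byte TLV header
    padding = b"\0" * (align(length) - length)
    return struct.pack("=HH", length, attr_type) + value + padding


def build_message(msg_type, flags, seq, pid, template: bytes, tlvs) -> bytes:
    """Concatenate netlink header, IP service template and TLV data."""
    body = template + b"\0" * (align(len(template)) - len(template))
    body += b"".join(pack_tlv(t, v) for t, v in tlvs)
    total = NLMSG_HDR.size + len(body)
    return NLMSG_HDR.pack(total, msg_type, flags, seq, pid) + body
```

The Length field of the general header covers all three levels, so a receiver can skip a message without understanding its service template.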
3.3. Protocol Model 3.3. Protocol Model
This section expands on how netlink provides the mechanism for ser- This section expands on how netlink provides the mechanism for ser-
vice oriented FEC and CPC interaction. vice oriented FEC and CPC interaction.
3.3.1. Service Addressing 3.3.1. Service Addressing
Access is provided by first connecting to the service on the FE. Access is provided by first connecting to the service on the FE.
This is done by making a socket() system call to the PF_NETLINK This is done by making a socket() system call to the PF_NETLINK
domain. Each FEC is identified by a protocol number. One may open domain. Each FEC is identified by a protocol number. One may open
either SOCK_RAW or SOCK_DGRAM type sockets although netlink does not either SOCK_RAW or SOCK_DGRAM type sockets although netlink does not
distinguish the two. The socket connection provides the basis for distinguish the two. The socket connection provides the basis for
the FE<->CP addressing. the FE<->CP addressing.
Connecting to a service is followed (at any point during the life Connecting to a service is followed (at any point during the life
of the connection) by issuing either a service specific command of the connection) by issuing either a service specific command
mostly for configuration purposes (from the CPC to the FEC) or sub- mostly for configuration purposes (from the CPC to the FEC) or sub-
scribing/unsubscribing to service(s') events. scribing/unsubscribing to service(s') events, or statistics collec-
tion.
3.3.1.1. Sample Service Hierarchy 3.3.1.1. Sample Service Hierarchy
In the diagram below we show a simple IP service, foo, and the In the diagram below we show a simple IP service, foo, and the
interaction it has between CP and FE components for the ser- interaction it has between CP and FE components for the ser-
vice(labels 1-3). vice(labels 1-3).
We introduce the diagram below to demonstrate CP<->FE addressing. We introduce the diagram below to demonstrate CP<->FE addressing.
In this section we illustrate only the addressing semantics. In In this section we illustrate only the addressing semantics. In
section 4, the diagram is referenced again to define the protocol section 4, the diagram is referenced again to define the protocol
interaction between srevice foo's CPC and FEC (labels 4-10). interaction between service foo's CPC and FEC (labels 4-10).
CP CP
[--------------------------------------------------------. [--------------------------------------------------------.
| .-----. | | .-----. |
| | . -------. | | | . -------. |
| | CLI | / | | | CLI | / |
| | | | CP protocol | | | | | CP protocol |
| /->> -. | component | <-. | | /->> -. | component | <-. |
| __ _/ | | For | | | | __ _/ | | For | | |
| | | IP service | ^ | | | | IP service | ^ |
skipping to change at page 10, line 5 skipping to change at page 10, line 5
above in the diagram. above in the diagram.
1) Connect to IP service foo through a socket connect. A typical con- 1) Connect to IP service foo through a socket connect. A typical con-
nection would be via a call to: socket(AF_NETLINK, SOCK_RAW, nection would be via a call to: socket(AF_NETLINK, SOCK_RAW,
NETLINK_FOO) NETLINK_FOO)
2) Bind to listen to specific async events for service foo 2) Bind to listen to specific async events for service foo
3) Bind to listen to specific async FE events 3) Bind to listen to specific async FE events
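Steps 1 to 3 above could look roughly like the following sketch. Since NETLINK_FOO is a placeholder service, the example substitutes NETLINK_ROUTE. The multicast group bits are the Linux RTMGRP_* values and are an assumption relative to this text, as are the helper names group_mask and open_service. The socket calls only work on a Linux host.

```python
import socket

NETLINK_ROUTE = 0          # stands in for the hypothetical NETLINK_FOO
RTMGRP_LINK = 0x1          # Linux group bit: link (interface) events
RTMGRP_IPV4_ROUTE = 0x40   # Linux group bit: IPv4 route events


def group_mask(*groups: int) -> int:
    """OR together the multicast group bits to subscribe to."""
    mask = 0
    for group in groups:
        mask |= group
    return mask


def open_service(protocol: int = NETLINK_ROUTE, groups: int = 0) -> socket.socket:
    """Step 1: create the socket. Steps 2-3: bind with an event group mask."""
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, protocol)
    # In Python a netlink address is the pair (pid, groups);
    # a pid of 0 lets the kernel choose a unique value.
    sock.bind((0, groups))
    return sock
```

A CPC interested in both link and route events would call open_service(groups=group_mask(RTMGRP_LINK, RTMGRP_IPV4_ROUTE)) and then read event messages from the returned socket.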
3.3.2. Netlink message header 3.3.2. Netlink message header
Netlink messages consist of a byte stream with one or multiple Netlink messages consist of a byte stream with one or multiple
Netlink headers and associated payload. If the payload is too big Netlink headers and associated payload. If the payload is too big
to fit into a single message it can be split over multiple netlink to fit into a single message it can be split over multiple netlink
messages. This is called a multipart message. For multipart mes- messages. This is called a multipart message. For multipart mes-
sages the first and all following headers have the NLM_F_MULTI sages the first and all following headers have the NLM_F_MULTI
netlink header netlink header flag set, except for the last header which has the
flag set, except for the last header which has the netlink header netlink header type NLMSG_DONE.
type NLMSG_DONE.
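A receiver of such a byte stream might walk it as sketched below, collecting the parts of a multipart message until NLMSG_DONE is seen. This is a minimal illustration, not a definitive implementation: the numeric values of NLMSG_DONE and NLM_F_MULTI are taken from Linux <linux/netlink.h>, the 4-byte message alignment is the Linux convention, and the function names are hypothetical.

```python
import struct

# Length, Type, Flags, Sequence Number, Process PID (16 bytes in total).
NLMSG_HDR = struct.Struct("=IHHII")
NLMSG_DONE = 3     # value from Linux <linux/netlink.h>
NLM_F_MULTI = 0x2  # value from Linux <linux/netlink.h>


def iter_messages(buf: bytes):
    """Yield (type, flags, seq, pid, payload) for each message in buf."""
    offset = 0
    while offset + NLMSG_HDR.size <= len(buf):
        length, mtype, flags, seq, pid = NLMSG_HDR.unpack_from(buf, offset)
        if length < NLMSG_HDR.size or offset + length > len(buf):
            break  # truncated or malformed message: stop parsing
        yield mtype, flags, seq, pid, buf[offset + NLMSG_HDR.size:offset + length]
        offset += (length + 3) & ~3  # next message starts on a 4-byte boundary


def collect_multipart(buf: bytes):
    """Return (payloads, done): the multipart payloads found in buf and
    whether the terminating NLMSG_DONE message was seen."""
    parts = []
    for mtype, flags, _seq, _pid, payload in iter_messages(buf):
        if mtype == NLMSG_DONE:
            return parts, True
        if flags & NLM_F_MULTI:
            parts.append(payload)
    return parts, False
```

When done is False the caller would read more data from the socket and continue collecting, since a multipart message may span several reads.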
The netlink message header is shown below. The netlink message header is shown below.
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
0 1 2 3 0 1 2 3
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Length | | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type | Flags | | Type | Flags |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sequence Number | | Sequence Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Process PID | | Process PID |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The fields in the header are: The fields in the header are:
Length: 32 bits Length: 32 bits
The length of the message in bytes including the header. The length of the message in bytes including the header.
Type: 16 bits Type: 16 bits
This field describes the message content. This field describes the message content.
It can be one of the standard message types: It can be one of the standard message types:
NLMSG_NOOP message is ignored NLMSG_NOOP message is ignored
NLMSG_ERROR the message signals an error and the payload NLMSG_ERROR the message signals an error and the payload
contains a nlmsgerr structure. This can be looked contains a nlmsgerr structure. This can be looked
skipping to change at page 12, line 4 skipping to change at page 11, line 54
because it has the potential to interrupt because it has the potential to interrupt
service in the FE for a longer time. service in the FE for a longer time.
Convenience macros for flag bits: Convenience macros for flag bits:
NLM_F_DUMP This is NLM_F_ROOT or'ed with NLM_F_MATCH NLM_F_DUMP This is NLM_F_ROOT or'ed with NLM_F_MATCH
Additional flag bits for NEW requests Additional flag bits for NEW requests
NLM_F_REPLACE Replace existing matching config object with NLM_F_REPLACE Replace existing matching config object with
this request. this request.
NLM_F_EXCL Don't replace the config object if it already NLM_F_EXCL Don't replace the config object if it already
exists.
exists.
NLM_F_CREATE Create config object if it doesn't already NLM_F_CREATE Create config object if it doesn't already
exist. exist.
NLM_F_APPEND Add to the end of the object list. NLM_F_APPEND Add to the end of the object list.
For those familiar with BSDish use of such operations in route For those familiar with BSDish use of such operations in route
sockets, the equivalent translations are: sockets, the equivalent translations are:
- BSD ADD operation equates to NLM_F_CREATE or-ed - BSD ADD operation equates to NLM_F_CREATE or-ed
with NLM_F_EXCL with NLM_F_EXCL
- BSD CHANGE operation equates to NLM_F_REPLACE - BSD CHANGE operation equates to NLM_F_REPLACE
skipping to change at page 12, line 35 skipping to change at page 12, line 34
Process PID: 32 bits Process PID: 32 bits
The PID of the process sending the message. The PID is used by the The PID of the process sending the message. The PID is used by the
kernel to multiplex to the correct sockets. A PID of zero is used kernel to multiplex to the correct sockets. A PID of zero is used
when sending messages to user space from the kernel. The netlink service when sending messages to user space from the kernel. The netlink service
fills in an appropriate value when zero. fills in an appropriate value when zero.
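The header fields and flag combinations described in this section can be sketched as follows. The numeric flag values are taken from Linux <linux/netlink.h> and are an assumption relative to this text; BSD_ADD, BSD_CHANGE, pack_header and unpack_header are hypothetical names introduced only for illustration.

```python
import struct

# Length (32), Type (16), Flags (16), Sequence Number (32), Process PID (32).
NLMSG_HDR = struct.Struct("=IHHII")

# Flag values as defined in Linux <linux/netlink.h>.
NLM_F_REQUEST = 0x001
NLM_F_ACK = 0x004
NLM_F_ROOT = 0x100
NLM_F_MATCH = 0x200
NLM_F_DUMP = NLM_F_ROOT | NLM_F_MATCH  # convenience macro for GET requests
NLM_F_REPLACE = 0x100
NLM_F_EXCL = 0x200
NLM_F_CREATE = 0x400

# Equivalents of the BSD route socket operations listed above.
BSD_ADD = NLM_F_CREATE | NLM_F_EXCL
BSD_CHANGE = NLM_F_REPLACE


def pack_header(payload_len: int, msg_type: int, flags: int, seq: int, pid: int = 0) -> bytes:
    """Build a header whose Length field covers header plus payload."""
    return NLMSG_HDR.pack(NLMSG_HDR.size + payload_len, msg_type, flags, seq, pid)


def unpack_header(buf: bytes) -> dict:
    """Decode the first 16 bytes of buf into named header fields."""
    length, msg_type, flags, seq, pid = NLMSG_HDR.unpack_from(buf, 0)
    return {"length": length, "type": msg_type, "flags": flags, "seq": seq, "pid": pid}
```

Note that the GET flags and the NEW flags share bit positions (for example NLM_F_ROOT and NLM_F_REPLACE), so the meaning of a flag depends on the message type it accompanies.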
3.3.2.1. Mechanisms for creating protocols 3.3.2.1. Mechanisms for creating protocols
One could create a reliable protocol between an FEC and a CPC by One could create a reliable protocol between an FEC and a CPC by
using the combination of sequence numbers, ACKs and retransmit using the combination of sequence numbers, ACKs and retransmit
timers. Both sequence numbers and sequence numbers are provided by timers. Both sequence numbers and ACKs are provided by netlink.
netlink. Timers are provided by Linux. Timers are provided by Linux.
One could create a heartbeat protocol between the FEC and CPC by One could create a heartbeat protocol between the FEC and CPC by
using the ECHO flags and the NLMSG_NOOP message. using the ECHO flags and the NLMSG_NOOP message.
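One possible shape for the reliable protocol described above is a table of unacknowledged requests keyed by sequence number, with a retransmit deadline per entry. The class below is a hypothetical sketch under that assumption; neither draft revision prescribes it, and the timeout and retry policy are arbitrary.

```python
import time


class PendingRequests:
    """Track unacknowledged requests by sequence number."""

    def __init__(self, timeout: float = 1.0, max_retries: int = 3):
        self.timeout = timeout
        self.max_retries = max_retries
        self.next_seq = 1
        self.pending = {}  # seq -> [message, deadline, retries_left]

    def send(self, message: bytes, now: float = None) -> int:
        """Register a message as sent and return its sequence number."""
        now = time.monotonic() if now is None else now
        seq = self.next_seq
        self.next_seq += 1
        self.pending[seq] = [message, now + self.timeout, self.max_retries]
        return seq

    def ack(self, seq: int) -> bool:
        """Handle an ACK; return False if seq was not outstanding."""
        return self.pending.pop(seq, None) is not None

    def due(self, now: float = None) -> list:
        """Return (seq, message) pairs to retransmit; entries that have
        exhausted their retries are dropped instead."""
        now = time.monotonic() if now is None else now
        resend = []
        for seq in list(self.pending):
            entry = self.pending[seq]
            if entry[1] > now:
                continue  # deadline not reached yet
            if entry[2] == 0:
                del self.pending[seq]  # give up on this request
                continue
            entry[1] = now + self.timeout
            entry[2] -= 1
            resend.append((seq, entry[0]))
        return resend
```

The sender would place the returned sequence number in the netlink header, request an ACK, call ack() when the matching ACK arrives, and call due() from a periodic timer.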
3.3.2.2. The ACK netlink message 3.3.2.2. The ACK netlink message
This message is actually used to denote both an ACK and a NACK. This message is actually used to denote both an ACK and a NACK.
Typically the direction is from kernel to user space (in response Typically the direction is from kernel to user space (in response
to an ACK request message that is sent). However, user space should to an ACK request message). However, user space should be able to
be able to send ACKs back to kernel space when requested. This is send ACKs back to kernel space when requested. This is IP service
IP service specific. specific.
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
0 1 2 3 0 1 2 3
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Netlink message header | | Netlink message header |
| type = NLMSG_ERROR | | type = NLMSG_ERROR |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| error code | | error code |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
skipping to change at page 13, line 33 skipping to change at page 13, line 33
header that can be used to compare against (sent sequence numbers header that can be used to compare against (sent sequence numbers
etc). etc).
A non-zero error message is equivalent to a Negative ACK (NACK). A non-zero error message is equivalent to a Negative ACK (NACK).
In such a situation, the netlink data that was sent down to the In such a situation, the netlink data that was sent down to the
kernel is returned appended to the original netlink message header. kernel is returned appended to the original netlink message header.
An error code printable via perror() is also set (not in the An error code printable via perror() is also set (not in the
message header, but rather in the executing environment state message header, but rather in the executing environment state
variable). variable).
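A receiver could tell an ACK from a NACK along the lines of the sketch below. It assumes the payload layout shown in the diagram above (a 32-bit error code followed by the original netlink header) and the Linux value of NLMSG_ERROR. Linux conventionally reports the error code as a negated errno value, which is an assumption relative to this text; the function names are hypothetical.

```python
import os
import struct

NLMSG_HDR = struct.Struct("=IHHII")  # Length, Type, Flags, Sequence, PID
NLMSG_ERROR = 2                      # value from Linux <linux/netlink.h>


def parse_ack(buf: bytes):
    """Return (original_seq, error_code) if buf holds an NLMSG_ERROR
    message, otherwise None."""
    if len(buf) < 2 * NLMSG_HDR.size + 4:
        return None  # too short to hold header + error code + original header
    _length, mtype, _flags, _seq, _pid = NLMSG_HDR.unpack_from(buf, 0)
    if mtype != NLMSG_ERROR:
        return None
    (error,) = struct.unpack_from("=i", buf, NLMSG_HDR.size)
    orig = NLMSG_HDR.unpack_from(buf, NLMSG_HDR.size + 4)
    return orig[3], error  # orig[3] is the original sequence number


def describe(error: int) -> str:
    """Zero means ACK; any non-zero code is a NACK."""
    if error == 0:
        return "ACK"
    return "NACK: " + os.strerror(abs(error))
```

The returned sequence number is the one from the original request header, which lets the sender match the ACK or NACK against its outstanding requests.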
3.3.3. FE services' templates 3.3.3. FE System services' templates
These are services that are offered by the system for general use These are services that are offered by the system for general use
by other services. They include ability to configure and listen to by other services. They include ability to configure, gather
changes in resource management. IP address management, link events statistics and listen to changes in shared resources. IP address
etc fit here. We separate them into this section here for logical management, link events etc fit here. We separate them into this
purposes despite the fact that they are accessed via the section here for logical purposes despite the fact that they are
NETLINK_ROUTE FEC. The reason that they exist within NETLINK_ROUTE accessed via the NETLINK_ROUTE FEC. The reason that they exist
is due to historical cruft based on the fact that BSD 4.4 rather within NETLINK_ROUTE is due to historical cruft based on the fact
narrowly focussed Route Sockets implemented them as part of the that BSD 4.4 rather narrowly focussed Route Sockets implemented
IPV4 forwarding sockets. them as part of the IPV4 forwarding sockets.
3.3.3.1. Network Interface Service Module 3.3.3.1. Network Interface Service Module
This service provides the ability to create, remove or get informa- This service provides the ability to create, remove or get informa-
tion about a specific network interface. The network interface tion about a specific network interface. The network interface
could be either pohysical or virtual and is network protocol inde- could be either physical or virtual and is network protocol inde-
pendent (example an x.25 interface can be defined via this mes- pendent (example an x.25 interface can be defined via this mes-
sage). The Interface service message template is shown below. sage). The Interface service message template is shown below.
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
0 1 2 3 0 1 2 3
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Family | Padding | Device Type | | Family | Padding | Device Type |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Interface Index | | Interface Index |
skipping to change at page 15, line 5 skipping to change at page 15, line 5
IFF_PROMISC Interface is in promiscuous mode. IFF_PROMISC Interface is in promiscuous mode.
IFF_NOTRAILERS Avoid use of trailers. IFF_NOTRAILERS Avoid use of trailers.
IFF_ALLMULTI Receive all multicast packets. IFF_ALLMULTI Receive all multicast packets.
IFF_MASTER Master of a load balancing bundle. IFF_MASTER Master of a load balancing bundle.
IFF_SLAVE Slave of a load balancing bundle. IFF_SLAVE Slave of a load balancing bundle.
IFF_MULTICAST Supports multicast IFF_MULTICAST Supports multicast
IFF_PORTSEL Is able to select media type via ifmap. IFF_PORTSEL Is able to select media type via ifmap.
IFF_AUTOMEDIA Auto media selection active. IFF_AUTOMEDIA Auto media selection active.
IFF_DYNAMIC Interface Address is not permanent. IFF_DYNAMIC Interface Address is not permanent.
Change Mask: Reserved for future use. Must be set to 0xFFFFFFFF. Change Mask: Reserved for future use. Must be set to 0xFFFFFFFF.
Applicable attributes: Applicable attributes:
attribute description attribute description
....................................................... .......................................................
IFLA_UNSPEC - unspecified. IFLA_UNSPEC - unspecified.
IFLA_ADDRESS hardware address interface L2 address IFLA_ADDRESS hardware address interface L2 address
IFLA_BROADCAST hardware address L2 broadcast IFLA_BROADCAST hardware address L2 broadcast
address. address.
IFLA_IFNAME ascii string Device name. IFLA_IFNAME ascii string device name.
IFLA_MTU MTU of the device. IFLA_MTU MTU of the device.
IFLA_LINK Link type. IFLA_LINK Link type.
IFLA_QDISC ascii string defining Queueing disci- IFLA_QDISC ascii string defining Queueing
pline. discipline.
IFLA_STATS Interface Statistics. IFLA_STATS Interface Statistics.
Netlink message types specific to this service: RTM_NEWLINK, Netlink message types specific to this service: RTM_NEWLINK,
RTM_DELLINK, RTM_GETLINK RTM_DELLINK, RTM_GETLINK
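As an illustration of the interface service template, the following minimal Python sketch packs and unpacks the fixed header: family, padding, device type, interface index, device flags and change mask. The helper names pack_ifinfomsg and unpack_ifinfomsg are hypothetical. The numeric constants are assumed to match the Linux headers, and the header is assumed to be in host byte order.

```python
import struct

# Constants assumed to match the Linux headers.
AF_UNSPEC = 0
ARPHRD_ETHER = 1            # Ethernet device type
IFF_UP = 0x1
IFF_PROMISC = 0x100
RTM_NEWLINK, RTM_DELLINK, RTM_GETLINK = 16, 17, 18

# Family (8 bits), padding (8), device type (16), interface index (32),
# device flags (32), change mask (32); host byte order, no extra padding.
IFINFOMSG = struct.Struct("=BxHiII")


def pack_ifinfomsg(family, dev_type, index, flags, change=0xFFFFFFFF):
    """Pack the interface service header (hypothetical helper)."""
    return IFINFOMSG.pack(family, dev_type, index, flags, change)


def unpack_ifinfomsg(data):
    """Return (family, dev_type, index, flags, change) from a header."""
    return IFINFOMSG.unpack_from(data)


# Header describing an Ethernet interface with index 2 that is up and
# in promiscuous mode.
header = pack_ifinfomsg(AF_UNSPEC, ARPHRD_ETHER, 2, IFF_UP | IFF_PROMISC)
```

The change mask defaults to 0xFFFFFFFF, as the field description requires. A complete request would prepend the netlink message header and append any IFLA_* attributes, which this sketch leaves out.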
3.3.3.2. IP Address Service module 3.3.3.2. IP Address Service module
This service provides the ability to add, remove or receive information This service provides the ability to add, remove or receive information
about an IP address associated with an interface. The Address provi- about an IP address associated with an interface. The Address provi-
sioning service message template is shown below. sioning service message template is shown below.
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
0 1 2 3 0 1 2 3
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Family | Length | Flags | Scope | | Family | Length | Flags | Scope |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Interface Index | | Interface Index |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Family: AF_INET for IPV4 or AF_INET6 for IPV6. Length: the Family: AF_INET for IPV4 or AF_INET6 for IPV6.
length of the address mask Flags: IFA_F_SECONDARY for secondary Length: the length of the address mask
address (old alias interface), Flags: IFA_F_SECONDARY for secondary address (alias interface),
IFA_F_PERMANENT for a permanent address set by the user as IFA_F_PERMANENT for a permanent address set by the user as
opposed to dynamic addresses. opposed to dynamic addresses.
other flags include: other flags include:
IFA_F_DEPRECATED which defines deprecated (IPV6) address IFA_F_DEPRECATED which defines deprecated (IPV6) address
IFA_F_TENTATIVE which defines tentative (IPV6) address IFA_F_TENTATIVE which defines tentative (IPV6) address
Scope: the address scope Scope: the address scope
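The address service header can be sketched in the same way. This minimal Python example packs the five fields in the order shown in the template. The helper name pack_ifaddrmsg is hypothetical, and the flag and scope values are assumed to match the Linux headers.

```python
import socket
import struct

# Flag and scope values assumed to match the Linux headers.
IFA_F_SECONDARY = 0x01
IFA_F_DEPRECATED = 0x20
IFA_F_TENTATIVE = 0x40
IFA_F_PERMANENT = 0x80
RT_SCOPE_UNIVERSE = 0

# Family, mask length, flags and scope (8 bits each), then a 32-bit
# interface index; host byte order.
IFADDRMSG = struct.Struct("=BBBBI")


def pack_ifaddrmsg(family, prefix_len, flags, scope, index):
    """Pack the address service header (hypothetical helper)."""
    return IFADDRMSG.pack(family, prefix_len, flags, scope, index)


# A permanent IPv4 address with a 24-bit mask on interface index 2.
header = pack_ifaddrmsg(socket.AF_INET, 24, IFA_F_PERMANENT,
                        RT_SCOPE_UNIVERSE, 2)
```

The address itself is not part of this header; it would travel in an IFA_ADDRESS or IFA_LOCAL attribute following it.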
Applicable attributes: Applicable attributes:
attribute description attribute description
....................................................... .......................................................
IFA_UNSPEC - unspecified. IFA_UNSPEC - unspecified.
IFA_ADDRESS raw protocol address of interface IFA_ADDRESS raw protocol address of interface
IFA_LOCAL raw protocol local address IFA_LOCAL raw protocol local address
IFA_LABEL ascii string name of the interface IFA_LABEL ascii string name of the interface
referred to. referred to.
IFA_BROADCAST raw protocol broadcast address. IFA_BROADCAST raw protocol broadcast address.
skipping to change at page 17, line 5 skipping to change at page 17, line 5
7) receive response to 6) via channel on 2) 7) receive response to 6) via channel on 2)
9) register the protocol specific packets you would like the FE to 9) register the protocol specific packets you would like the FE to
forward to you forward to you
10) send specific service foo commands and receive responses for them 10) send specific service foo commands and receive responses for them
if needed if needed
4.1. Interacting with other IP services 4.1. Interacting with other IP services
The last diagram shows another control component configuring the The last diagram shows another control component configuring the
same service. In this case, it is a proprietary Command Line same service. In this case, it is a proprietary Command Line
Interface (CLI). The CLI may or may not be using the netlink protocol Interface (CLI). The CLI may or may not be using the netlink protocol
to communicate with the foo component. If the CLI issues commands to communicate with the foo component. If the CLI issues commands
that affect the policy of the FEC for service "foo", then that affect the policy of the FEC for service "foo", then
the "foo" CPC is notified. It could then make algorithmic decisions the "foo" CPC is notified. It could then make algorithmic decisions
based on this input (for example, if a policy that foo installed was based on this input (for example, if a policy that foo installed was
deleted, there might be a need to propagate this to all the peers of deleted, there might be a need to propagate this to all the peers of
service "foo"). service "foo").
5. Currently Defined netlink IP services 5. Currently Defined netlink IP services
Although there are many other IP services defined which are using Although there are many other IP services defined which are using
netlink, we will only mention those integrated into the kernel netlink, we will only mention those integrated into the kernel
today (kernel version 2.4.6). These are: today (kernel version 2.4.6). These are:
NETLINK_ROUTE,NETLINK_FIREWALL,NETLINK_ARPD,NETLINK_ROUTE6,NETLINK_IP6_FW NETLINK_ROUTE,NETLINK_FIREWALL,NETLINK_ARPD,NETLINK_ROUTE6,
NETLINK_TAPBASE,NETLINK_SKIP,NETLINK_USERSOCK. NETLINK_IP6_FW
5.1. IP Service NETLINK_ROUTE 5.1. IP Service NETLINK_ROUTE
This service allows CPCs to modify the IPv4 routing table in the This service allows CPCs to modify the IPv4 routing table in the
Forwarding Engine. It can also be used by CPCs to receive routing Forwarding Engine. It can also be used by CPCs to receive routing
updates. updates as well as collecting statistics.
5.1.1. Network Route Service Module 5.1.1. Network Route Service Module
This service provides the ability to create, remove or receive informa- This service provides the ability to create, remove or receive informa-
tion about a network route. The service message template is shown tion about a network route. The service message template is shown
below. below.
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
0 1 2 3 0 1 2 3
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Family | Src length | Dest length | TOS | | Family | Src length | Dest length | TOS |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Table ID | Protocol | Scope | Type | | Table ID | Protocol | Scope | Type |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Flags | | Flags |
skipping to change at page 19, line 4 skipping to change at page 18, line 38
RT_TABLE_UNSPEC an unspecified routing table RT_TABLE_UNSPEC an unspecified routing table
RT_TABLE_DEFAULT the default table RT_TABLE_DEFAULT the default table
RT_TABLE_MAIN the main table RT_TABLE_MAIN the main table
RT_TABLE_LOCAL the local table RT_TABLE_LOCAL the local table
The user may assign arbitrary values between The user may assign arbitrary values between
RT_TABLE_UNSPEC and RT_TABLE_DEFAULT. RT_TABLE_UNSPEC and RT_TABLE_DEFAULT.
Protocol: identifies what/who added the route. Described further Protocol: identifies what/who added the route. Described further
below. below.
protocol Route origin. protocol Route origin.
.............................................. ..............................................
RTPROT_UNSPEC unknown RTPROT_UNSPEC unknown
RTPROT_REDIRECT by an ICMP redirect RTPROT_REDIRECT by an ICMP redirect
(currently unused) (currently unused)
RTPROT_KERNEL by the kernel RTPROT_KERNEL by the kernel
RTPROT_BOOT during boot RTPROT_BOOT during boot
RTPROT_STATIC by the administrator RTPROT_STATIC by the administrator
Values larger than RTPROT_STATIC are not Values larger than RTPROT_STATIC are not interpreted by the
interpreted by the kernel; they are just for user kernel; they are just for user information. They may be used to tag
information. They may be used to tag the source of the source of routing information or to distinguish between multiple
routing information or to distinguish between routing daemons. See <linux/rtnetlink.h> for the routing daemon
multiple routing daemons. See <linux/rtnetlink.h> for identifiers which are already assigned.
the routing daemon identifiers which are already
assigned.
Scope: Route scope (distance to destination). Scope: Route scope (distance to destination).
RT_SCOPE_UNIVERSE global route RT_SCOPE_UNIVERSE global route
RT_SCOPE_SITE interior route in the RT_SCOPE_SITE interior route in the
local autonomous system local autonomous system
RT_SCOPE_LINK route on this link RT_SCOPE_LINK route on this link
RT_SCOPE_HOST route on the local host RT_SCOPE_HOST route on the local host
RT_SCOPE_NOWHERE destination doesn't exist RT_SCOPE_NOWHERE destination doesn't exist
The values between RT_SCOPE_UNIVERSE and The values between RT_SCOPE_UNIVERSE and RT_SCOPE_SITE are avail-
RT_SCOPE_SITE are available to the user. able to the user.
Type: The type of route. Type: The type of route.
Route type description Route type description
------------------------------------------------- -------------------------------------------------
RTN_UNSPEC unknown route RTN_UNSPEC unknown route
RTN_UNICAST a gateway or direct route RTN_UNICAST a gateway or direct route
RTN_LOCAL a local interface route RTN_LOCAL a local interface route
RTN_BROADCAST a local broadcast route RTN_BROADCAST a local broadcast route
(sent as a broadcast) (sent as a broadcast)
RTN_ANYCAST a local broadcast route RTN_ANYCAST a local broadcast route
(sent as a unicast) (sent as a unicast)
RTN_MULTICAST a multicast route RTN_MULTICAST a multicast route
skipping to change at page 21, line 5 skipping to change at page 20, line 5
Flags: further qualify the route. Flags: further qualify the route.
RTM_F_NOTIFY if the route changes, notify the RTM_F_NOTIFY if the route changes, notify the
user via rtnetlink user via rtnetlink
RTM_F_CLONED route is cloned from another route RTM_F_CLONED route is cloned from another route
RTM_F_EQUALIZE a multicast equalizer (not yet RTM_F_EQUALIZE a multicast equalizer (not yet
implemented) implemented)
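The route service header described above can be sketched as follows. This minimal Python example follows the field order of struct rtmsg in <linux/rtnetlink.h>, which places the destination length before the source length; that ordering, and the numeric constants, are assumptions to verify against the header on the target kernel. The helper name pack_rtmsg is hypothetical.

```python
import socket
import struct

# Values assumed to match <linux/rtnetlink.h>.
RT_TABLE_MAIN = 254
RTPROT_STATIC = 4
RT_SCOPE_UNIVERSE = 0
RTN_UNICAST = 1
RTM_F_NOTIFY = 0x100

# Eight 8-bit fields (family, destination length, source length, TOS,
# table ID, protocol, scope, type) followed by 32-bit flags.
RTMSG = struct.Struct("=BBBBBBBBI")


def pack_rtmsg(family, dst_len, src_len, tos, table, protocol, scope,
               route_type, flags=0):
    """Pack the route service header (hypothetical helper)."""
    return RTMSG.pack(family, dst_len, src_len, tos, table, protocol,
                      scope, route_type, flags)


# A static unicast IPv4 route for a /24 destination in the main table.
header = pack_rtmsg(socket.AF_INET, 24, 0, 0, RT_TABLE_MAIN,
                    RTPROT_STATIC, RT_SCOPE_UNIVERSE, RTN_UNICAST)
```

The destination prefix and output interface would follow this header as RTA_DST and RTA_OIF attributes.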
Attributes applicable to this service: Attributes applicable to this service:
Attribute description Attribute description
----------------------------------------------- -----------------------------------------------
RTA_UNSPEC ignored. RTA_UNSPEC ignored.
RTA_DST protocol address for route RTA_DST protocol address for route
destination address. destination address.
RTA_SRC protocol address for route source RTA_SRC protocol address for route source
address. address.
RTA_IIF Input interface index. RTA_IIF Input interface index.
RTA_OIF Output interface index. RTA_OIF Output interface index.
skipping to change at page 21, line 46 skipping to change at page 21, line 5
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
0 1 2 3 0 1 2 3
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Family | Padding | Padding | | Family | Padding | Padding |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Interface Index | | Interface Index |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| State | Flags | Type | | State | Flags | Type |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Family: Address Family Interface Index: The unique interface index
State: is a bitmask of the following states:
Family: Address Family
Interface Index: The unique interface index
State: is a bitmask of the following states:
NUD_INCOMPLETE a currently resolving cache entry NUD_INCOMPLETE a currently resolving cache entry
NUD_REACHABLE a confirmed working cache entry NUD_REACHABLE a confirmed working cache entry
NUD_STALE an expired cache entry NUD_STALE an expired cache entry
NUD_DELAY an entry waiting for a timer NUD_DELAY an entry waiting for a timer
NUD_PROBE a cache entry that is currently NUD_PROBE a cache entry that is currently
reprobed reprobed
NUD_FAILED an invalid cache entry NUD_FAILED an invalid cache entry
NUD_NOARP a device with no destination cache NUD_NOARP a device with no destination cache
NUD_PERMANENT a static entry NUD_PERMANENT a static entry
Flags: one of: Flags: one of:
NTF_PROXY a proxy arp entry NTF_PROXY a proxy arp entry
NTF_ROUTER an IPv6 router NTF_ROUTER an IPv6 router
Attributes applicable to this service: Attributes applicable to this service:
Attribute$ description Attributes description
------------------------------------ ------------------------------------
NDA_UNSPEC unknown type NDA_UNSPEC unknown type
NDA_DST a neighbour cache network NDA_DST a neighbour cache network
layer destination address layer destination address
NDA_LLADDR a neighbour cache link layer NDA_LLADDR a neighbour cache link layer
address address
NDA_CACHEINFO cache statistics. NDA_CACHEINFO cache statistics.
Describe the NDA_CACHEINFO nda_cacheinfo header later --JHS Describe the NDA_CACHEINFO nda_cacheinfo header later --JHS
additional netlink message types applicable to this service: additional netlink message types applicable to this service:
RTM_NEWNEIGH, RTM_DELNEIGH, RTM_GETNEIGH RTM_NEWNEIGH, RTM_DELNEIGH, RTM_GETNEIGH
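Because State is a bitmask, a receiver has to test each bit separately. The minimal Python sketch below unpacks the neighbour header and lists the state names that are set. The helper names decode_state and unpack_ndmsg are hypothetical, and the bit values are assumed to match the Linux headers.

```python
import socket
import struct

# State bits assumed to match the Linux headers.
NUD_STATES = {
    0x01: "NUD_INCOMPLETE",
    0x02: "NUD_REACHABLE",
    0x04: "NUD_STALE",
    0x08: "NUD_DELAY",
    0x10: "NUD_PROBE",
    0x20: "NUD_FAILED",
    0x40: "NUD_NOARP",
    0x80: "NUD_PERMANENT",
}
NTF_PROXY = 0x08
NTF_ROUTER = 0x80

# Family (8 bits), 24 bits of padding, interface index (32),
# state (16), flags (8), type (8); host byte order.
NDMSG = struct.Struct("=B3xiHBB")


def decode_state(state):
    """Return the names of all state bits set in the bitmask."""
    return [name for bit, name in sorted(NUD_STATES.items()) if state & bit]


def unpack_ndmsg(data):
    """Unpack a neighbour header into a dict (hypothetical helper)."""
    family, index, state, flags, nd_type = NDMSG.unpack_from(data)
    return {
        "family": family,
        "index": index,
        "states": decode_state(state),
        "is_proxy": bool(flags & NTF_PROXY),
        "is_router": bool(flags & NTF_ROUTER),
        "type": nd_type,
    }


# A reachable, static IPv4 neighbour on interface index 3.
sample = NDMSG.pack(socket.AF_INET, 3, 0x02 | 0x80, 0, 1)
entry = unpack_ndmsg(sample)
```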
5.1.3. Traffic Control Service 5.1.3. Traffic Control Service
This service provides the ability to add, remove or get a queueing This service provides the ability to provision, query or listen to
discipline. The service message template is shown below. events under the auspices of traffic control. These include queueing
disciplines (schedulers and queue treatment algorithms, e.g. a priority-based
scheduler or the RED algorithm) and classifiers. The Linux Traffic Control
Service is very flexible and allows for hierarchical cascading of the
different blocks for traffic sharing. The service message template which
makes this possible is shown below. Each specific component of
the model has unique attributes which describe it best. There are common
attributes as well, which are described below.
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
0 1 2 3 0 1 2 3
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Family | Padding | Padding | | Family | Padding | Padding |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Interface Index | | Interface Index |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Qdisc handle | | Qdisc handle |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Parent Qdisc | | Parent Qdisc |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| TCM Info | | TCM Info |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Family: Address Family
Interface Index: The unique interface index
Qdisc handle: unique identifier for instance of queueing discipline.
Typically this is split into major:minor of 16 bits each. The major
number would also be the major number of the parent of this instance.
Parent Qdisc: This is used in hierarchical layering of queueing
disciplines.
If this value and the Qdisc handle are the same and equal to TC_H_ROOT,
then the defined qdisc is the topmost layer, known as the root qdisc.
TCM Info: This is set by the FE to 1 typically except when the qdisc
instance is in use, in which case it is set to imply a reference count.
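The major:minor split of the Qdisc handle can be illustrated with a little bit arithmetic. The minimal Python sketch below builds and splits 32-bit handles and packs the traffic control header. The helper names are hypothetical, TC_H_ROOT is assumed to be 0xFFFFFFFF as in the Linux headers, and is_root_qdisc simply encodes the rule as stated in the text above.

```python
import struct

TC_H_ROOT = 0xFFFFFFFF   # assumed to match the Linux headers

# Family (8 bits), 24 bits of padding, interface index (32),
# Qdisc handle (32), Parent Qdisc (32), TCM Info (32); host byte order.
TCMSG = struct.Struct("=B3xiIII")


def make_handle(major, minor):
    """Combine 16-bit major and minor numbers into a 32-bit handle."""
    return ((major & 0xFFFF) << 16) | (minor & 0xFFFF)


def split_handle(handle):
    """Split a 32-bit handle into its (major, minor) parts."""
    return (handle >> 16) & 0xFFFF, handle & 0xFFFF


def is_root_qdisc(handle, parent):
    """Root test as worded in the text: both values equal TC_H_ROOT."""
    return handle == parent == TC_H_ROOT


def pack_tcmsg(family, index, handle, parent, info=0):
    """Pack the traffic control header (hypothetical helper)."""
    return TCMSG.pack(family, index, handle, parent, info)


# A child instance 1:10 whose parent is 1:0 on interface index 2.
header = pack_tcmsg(0, 2, make_handle(1, 10), make_handle(1, 0))
```

Keeping the major number in the upper 16 bits means that a child and its parent share the same upper half of the handle, which is what the description of the Qdisc handle field states.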
Attributes applicable to this service:
Attribute description
------------------------------------
TCA_KIND canonical name of FE component
TCA_STATS generic usage statistics of FEC
TCA_RATE rate estimator being attached to
FEC. Takes snapshots of stats to
compute rate
TCA_XSTATS specific statistics of FEC
TCA_OPTIONS nested FEC-specific attributes
[should we define all FEC-specific attributes? Seems like a lot of work
-- jhs]
[We still need to talk about classes and filters; later -- jhs]
5.2. IP Service NETLINK_FIREWALL 5.2. IP Service NETLINK_FIREWALL
This service allows CPCs to receive packets sent by the IPv4 fire- This service allows CPCs to receive packets sent by the IPv4 fire-
wall service in the FE. wall service in the FE.
Two types of messages exist that can be sent from CPC to FEC. These Two types of messages exist that can be sent from CPC to FEC. These
are: Mode messages and Verdict messages. The formats are described are: Mode messages and Verdict messages. The formats are described
below. below.
The Verdict message format is as follows The Verdict message format is as follows
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
0 1 2 3 0 1 2 3
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Value | | Value |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Packet ID | | Packet ID |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data Length | | Data Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Payload ... | | Payload ... |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
An ipq_packet_msg packet type is sent from the FEC to the CPC. The An ipq_packet_msg packet type is sent from the FEC to the CPC. The
format is described below ==> We need to complete this later format is described below ==> We need to complete this later
5.3. IP Service NETLINK_ARPD 5.3. IP Service NETLINK_ARPD
This service is used by CPCs for managing the ARP table in FE. This service is used by CPCs for managing the ARP table in FE.
5.4. IP Service NETLINK_ROUTE6 5.4. IP Service NETLINK_ROUTE6
This service allows CPCs to modify the IPv6 routing table in the This service allows CPCs to modify the IPv6 routing table in the
FE. It can also be used by CPCs to receive routing updates. FE. It can also be used by CPCs to receive routing updates.
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
0 1 2 3 0 1 2 3
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| IPv6 dst addr | | IPv6 dst addr |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| IPv6 dst addr | | IPv6 dst addr |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| IPv6 dst addr | | IPv6 dst addr |
skipping to change at page 26, line 5 skipping to change at page 26, line 5
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Flags | | Flags |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Interface Index | | Interface Index |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
5.5. IP Service NETLINK_IP6_FW 5.5. IP Service NETLINK_IP6_FW
This service allows CPCs to receive packets that failed the IPv6 This service allows CPCs to receive packets that failed the IPv6
firewall checks by that module in the FE. firewall checks by that module in the FE.
5.6. IP Service NETLINK_TAPBASE
This service allows CPCs to simulate an ethernet driver belonging
to the FE.
//are the instances of the ethertap device. Ethertap //is a
pseudo network tunnel device that allows an //ethernet driver to
be simulated from user space.
5.7. IP Service NETLINK_SKIP
This service is reserved for ENskip (?).
5.8. IP Service NETLINK_USERSOCK
This service is reserved for future Control Plane to FE protocols.
6. Security Considerations 6. Security Considerations
Netlink lives in a trusted environment of a single host separated Netlink lives in a trusted environment of a single host separated
by kernel and user space. Linux capabilities ensure that only by kernel and user space. Linux capabilities ensure that only
someone with CAP_NET_ADMIN capability (typically root user) is someone with CAP_NET_ADMIN capability (typically root user) is
allowed to open sockets. allowed to open sockets.
7. References 7. References
[RFC1633] R. Braden, D. Clark, and S. Shenker, "Integrated [RFC1633] R. Braden, D. Clark, and S. Shenker, "Integrated
Services in the Internet Architecture: an Overview", RFC 1633, Services in the Internet Architecture: an Overview", RFC 1633,
ISI, MIT, and PARC, June 1994. ISI, MIT, and PARC, June 1994.
[RFC1812] F. Baker, "Requirements for IP Version 4 [RFC1812] F. Baker, "Requirements for IP Version 4
Routers", RFC 1812, June 1995. Routers", RFC 1812, June 1995.
[RFC2475] M. Carlson, W. Weiss, S. Blake, Z. Wang, D. [RFC2475] M. Carlson, W. Weiss, S. Blake, Z. Wang, D.
Black, and E. Davies, "An Architecture for Differentiated Black, and E. Davies, "An Architecture for Differentiated
Services", RFC 2475, December 1998. Services", RFC 2475, December 1998.
[RFC2748] J. Boyle, R. Cohen, D. Durham, S. Herzog, R. [RFC2748] J. Boyle, R. Cohen, D. Durham, S. Herzog, R.
skipping to change at page 27, line 29 skipping to change at page 27, line 5
[RFC2328] J. Moy, "OSPF Version 2", RFC 2328, April 1998. [RFC2328] J. Moy, "OSPF Version 2", RFC 2328, April 1998.
[RFC1157] J.D. Case, M. Fedor, M.L. Schoffstall, C. Davin, [RFC1157] J.D. Case, M. Fedor, M.L. Schoffstall, C. Davin,
"Simple Network Management Protocol (SNMP)", RFC 1157, May "Simple Network Management Protocol (SNMP)", RFC 1157, May
1990. 1990.
[RFC3036] L. Andersson, P. Doolan, N. Feldman, A. Fredette, [RFC3036] L. Andersson, P. Doolan, N. Feldman, A. Fredette,
B. Thomas "LDP Specification", RFC 3036, January 2001. B. Thomas "LDP Specification", RFC 3036, January 2001.
[stevens] G.R Wright, W. Richard Stevens. "TCP/IP Illus- [stevens] G.R Wright, W. Richard Stevens. "TCP/IP Illus-
trated Volume 2, Chapter 20", June 1995 trated Volume 2, Chapter 20", June 1995
8. Acknowledgements 8. Acknowledgements
1) Andi Kleen for man pages on netlink and rtnetlink. 1) Andi Kleen for man pages on netlink and rtnetlink.
2) Alexey Kuznetsov is credited for extending netlink to the IP ser- 2) Alexey Kuznetsov is credited for extending netlink to the IP ser-
vice delivery model. The original netlink character device was vice delivery model. The original netlink character device was
written by Alan Cox. written by Alan Cox.
9. Author's Address: 9. Author's Address:
Jamal Hadi Salim Jamal Hadi Salim
Znyx Networks Znyx Networks
Ottawa, Ontario Ottawa, Ontario
Canada Canada
hadi@znyx.com hadi@znyx.com
Hormuzd M Khosravi Hormuzd M Khosravi
Intel Intel
skipping to change at line 1093 skipping to change at page 27, line 39
Hillsboro OR 97124-5961 Hillsboro OR 97124-5961
USA USA
1 503 264 0334 1 503 264 0334
hormuzd.m.khosravi@intel.com hormuzd.m.khosravi@intel.com
Andi Kleen Andi Kleen
SuSE SuSE
Stahlgruberring 28 Stahlgruberring 28
81829 Muenchen 81829 Muenchen
Germany Germany
Alexey Kuznetsov
INR/Swsoft
Moscow
Russia
 End of changes. 

This html diff was produced by rfcdiff 1.23, available from http://www.levkowetz.com/ietf/tools/rfcdiff/