
Architecture diagram

The diagram below indicates the principal components of NIST Switch and their interdependencies.

[Figure: NIST Switch architecture diagram (switcharch.pstex)]

Here is a summary of the architectural components:

Basic label handling module:

Creates, interprets and processes the labels (or label stacks) for all incoming and outgoing packets. On outgoing packets, it determines the appropriate label (if any) to apply based on the packet's destination (IP address and possibly UDP/TCP port), route (for explicitly routed packets), and class of service. For incoming packets, it uses the packet's label to determine how the packet should be handled (forwarded or received locally). If the incoming packet is to be forwarded, it determines what changes (if any) should be made to the packet's label.
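The incoming-packet path described above can be sketched in a few lines. This is an illustration only, not the NIST Switch kernel code: the Packet class, the ILM table, the action names ("swap", "pop", "push", "local") and the interface names are all hypothetical, chosen just to show the decision between local delivery and forwarding and the label change applied on the way out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Packet:
    labels: List[int]  # label stack; the top of the stack is the last element
    payload: bytes


# Hypothetical incoming-label map: label -> (action, new_label, out_interface).
ILM = {
    17: ("swap", 42, "eth1"),   # replace the top label and forward
    18: ("pop", None, "eth2"),  # remove the top label and forward
    19: ("push", 77, "eth1"),   # add another label on top and forward
    20: ("local", None, None),  # deliver to this host
}


def handle_incoming(pkt, ilm):
    """Decide what to do with pkt; rewrites pkt.labels in place.

    Returns a (disposition, out_interface) pair.
    """
    if not pkt.labels:
        return ("ip", None)      # unlabeled: ordinary IP handling
    entry = ilm.get(pkt.labels[-1])
    if entry is None:
        return ("drop", None)    # unknown label
    action, new_label, out_if = entry
    if action == "local":
        pkt.labels.pop()
        return ("local", None)
    if action == "swap":
        pkt.labels[-1] = new_label
    elif action == "pop":
        pkt.labels.pop()
    elif action == "push":
        pkt.labels.append(new_label)
    return ("forward", out_if)
```

For example, a packet arriving with stack [5, 17] leaves on eth1 with stack [5, 42] under the table above.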

Label table:

Label information is kept in a kernel table, indexed both by label and by destination/route/service information.
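As a rough user-space model of such a doubly indexed table, the sketch below keeps two dictionaries in step, one keyed by label and one keyed by a destination/route/service triple. The names Fec and LabelTable and the one-label-per-entry simplification are assumptions made for illustration; the actual kernel structure is not specified here.

```python
from collections import namedtuple

# Hypothetical key for the second index: destination, explicit route, service class.
Fec = namedtuple("Fec", ["dest", "route", "service"])


class LabelTable:
    """Toy table that can be searched by label or by Fec."""

    def __init__(self):
        self._by_label = {}
        self._by_fec = {}

    def add(self, label, fec):
        # Drop any stale entry under either key so the two indexes stay consistent.
        self.remove_label(label)
        self.remove_fec(fec)
        self._by_label[label] = fec
        self._by_fec[fec] = label

    def remove_label(self, label):
        fec = self._by_label.pop(label, None)
        if fec is not None:
            self._by_fec.pop(fec, None)

    def remove_fec(self, fec):
        label = self._by_fec.pop(fec, None)
        if label is not None:
            self._by_label.pop(label, None)

    def lookup_label(self, label):
        """Incoming direction: what does this label stand for?"""
        return self._by_label.get(label)

    def lookup_fec(self, fec):
        """Outgoing direction: which label applies to this traffic?"""
        return self._by_fec.get(fec)
```

The incoming path would use lookup_label, while the outgoing path would use lookup_fec with the packet's destination, route and class of service.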

Ethernet (and ATM) label handling:

Handles incorporating and interpreting label information in the (extended) Ethernet (or, later, ATM) packet header. Both "MAC" (draft-srinivasan-mpls-lans-label-00.txt) and "shim" (draft-rosen-tag-stack-03.txt) versions of Ethernet labels are supported. Either type is recognized on input from any interface; the type used on output is configurable on a per-interface basis. For Ethernet, we will implement generic label support routines and provide complete sample implementations for a limited set of specific card types. (Current plans are for DEC Tulip-based cards: SMC 9332, possibly Znyx ZX346; and also 3Com "Vortex/Boomerang" series cards (3c590, 3c90x).)

In Ethernet, we will at first accommodate label stacks by allowing oversized (> 1500 byte) packets. (Both of the card types listed above can handle these.) Later, we may (and probably should) implement fragmentation or packet dropping, again configurable per interface or controlled by MTU.
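To make the size issue concrete, the sketch below encodes a "shim" label stack and checks whether the labeled packet still fits in a standard 1500-byte Ethernet payload. It assumes the 32-bit entry layout later standardized in RFC 3032 (20-bit label, 3-bit class of service, 1-bit bottom-of-stack flag, 8-bit TTL); the draft cited above may differ in detail, and the function names are invented for this example.

```python
import struct

ETH_PAYLOAD_MAX = 1500  # standard Ethernet payload limit, in bytes
SHIM_BYTES = 4          # assumed size of one label stack entry


def encode_stack(labels, cos=0, ttl=64):
    """Encode labels (labels[0] is the top of the stack) as shim entries."""
    out = b""
    for i, label in enumerate(labels):
        if not 0 <= label < (1 << 20):
            raise ValueError("label does not fit in 20 bits")
        bottom = 1 if i == len(labels) - 1 else 0
        word = (label << 12) | ((cos & 0x7) << 9) | (bottom << 8) | (ttl & 0xFF)
        out += struct.pack("!I", word)  # network byte order
    return out


def decode_stack(data):
    """Return (labels, bytes_consumed), reading up to the bottom-of-stack flag."""
    labels = []
    offset = 0
    while True:
        (word,) = struct.unpack_from("!I", data, offset)
        labels.append(word >> 12)
        offset += SHIM_BYTES
        if word & 0x100:  # bottom-of-stack bit set
            break
    return labels, offset


def exceeds_standard_payload(ip_len, stack_depth):
    """True if the IP packet plus its label stack needs an oversized frame."""
    return ip_len + stack_depth * SHIM_BYTES > ETH_PAYLOAD_MAX
```

Under these assumptions each label costs 4 bytes, so a full-size 1500-byte IP packet needs an oversized frame as soon as a single label is applied, while a 1492-byte packet can carry a two-label stack within the standard limit.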

For ATM, the interesting combination seems to be implementing VC pools and VCIDs as described in draft-demizu-mpls-vcpool-00.txt and draft-demizu-mpls-vcid-01.txt, using VC merge to increase the level of aggregation possible. Given the complexity of ATM implementations and the relative lack of maturity of the Linux code, we will defer any substantial work with ATM until a later date.

IPv4 and IPv6 support

Our initial target will be IPv4, with IPv6 to be supported at a later date. The actual dependencies on IPv4 of the label handling code proper are relatively small, since label handling at the lowest levels really is protocol-independent. However, the interaction with all the ancillary support from RSVP, routing, etc., makes it inadvisable to move to IPv6 until the IPv6 versions of those components are reasonably stable.

Label-conscious queueing

To provide support for quality of service guarantees, we need queueing algorithms which classify and handle packets appropriately. Code for these exists for pure IP implementations (ALTQ for FreeBSD, new 2.1.9x code for Linux). We will adapt these as necessary to include support for a label-based environment. We will also explore the use of a variety of traffic shaping mechanisms, including explicit congestion notification.
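The basic idea of classifying by label rather than by IP header fields can be illustrated as follows. Strict priority scheduling is used here only as a simple stand-in for the richer disciplines the packages above provide; the class LabelQueues and its label-to-class mapping are hypothetical.

```python
from collections import deque


class LabelQueues:
    """Classify packets by label into per-class queues; class 0 is served first."""

    def __init__(self, label_to_class, num_classes):
        self.label_to_class = label_to_class
        self.queues = [deque() for _ in range(num_classes)]

    def enqueue(self, label, packet):
        # Unknown or missing labels fall into the last (lowest-priority) class.
        cls = self.label_to_class.get(label, len(self.queues) - 1)
        self.queues[cls].append(packet)

    def dequeue(self):
        # Strict priority: always drain the lowest-numbered non-empty queue.
        for q in self.queues:
            if q:
                return q.popleft()
        return None
```

The point of the sketch is that classification becomes a single lookup on the label, with no need to re-examine addresses or ports at each hop.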

RSVP label support

RSVP interacts with labels in two basic ways. As a QoS signalling protocol, it distributes requests for the allocation of resources for particular flows, which we will typically want to associate with labels. In addition, as suggested in draft-davie-mpls-rsvp-01.txt, RSVP can serve as a distribution protocol for labels. We implement both types of RSVP label support.

Differentiated services label support

Diffserv can be viewed as an attempt to provide fairly direct input into traffic management facilities. We need to devise reasonable methods for ensuring the proper correspondence between diffserv requests and applied labels, and for the proper interaction between diffserv flows and other labeled flows. As diffserv currently has no defined distribution mechanisms, we must also implement label distribution for it.

Labels and routing

Label-based QoS routing is one of our fundamental research areas. Our basic approach is in certain ways similar to the segmented routing used in routing FPGA designs. It may be summarized as follows: In general, a label designates a path or tree segment ("stick") in a network with (possibly) certain QoS characteristics. These "sticks" may be preallocated or (particularly for RSVP-created paths) made as needed. Routing at the entry gateway then consists of selecting the appropriate "bundle" of one or more of these sticks, and applying the corresponding label stack to the packets. Then, through the rest of the net, the labels dictate both the route taken and the traffic handling characteristics applied. Desirable routes are those with a small bundle size, and hence a small label stack (preferably a single label).
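One simple way to picture bundle selection is as a shortest-path search in which each stick counts as one hop. The sketch below is a deliberately reduced model and not the project's routing algorithm: every stick is treated as a plain path segment from a start node to an end node, tree segments and QoS characteristics are ignored, and the function name select_bundle and the node names are invented for the example.

```python
from collections import deque


def select_bundle(sticks, src, dst):
    """Return the fewest stick labels, in traversal order, leading from src to dst.

    sticks maps label -> (start_node, end_node). Returns [] when src == dst
    and None when no combination of sticks reaches dst.
    """
    if src == dst:
        return []
    by_start = {}
    for label, (start, end) in sticks.items():
        by_start.setdefault(start, []).append((label, end))

    # Breadth-first search: the first time dst is reached, the bundle is minimal.
    prev = {src: None}  # node -> (previous node, label used to get here)
    queue = deque([src])
    while queue:
        node = queue.popleft()
        for label, nxt in by_start.get(node, []):
            if nxt in prev:
                continue
            prev[nxt] = (node, label)
            if nxt == dst:
                bundle = []
                cur = dst
                while prev[cur] is not None:
                    before, used = prev[cur]
                    bundle.append(used)
                    cur = before
                bundle.reverse()
                return bundle
            queue.append(nxt)
    return None
```

With sticks {100: A to B, 101: B to D, 102: A to D}, a packet entering at A for D would get the single-label bundle [102] rather than the two-label bundle [100, 101], matching the stated preference for small label stacks.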

User/daemon label support

We provide a set of user-level APIs which allow reading and updating the kernel label database. These are used by the LDP daemon to get, distribute, and update label information.

Label distribution

We support two methods of label distribution: through the RSVP label object, and through a simple user-level LDP daemon. The LDP daemon communicates with its peers through (a subset of) the protocol described in draft-gray-mpls-generic-ldp-spec-00.txt and draft-feldman-ldp-spec-00.txt.

Label routing daemon

The label routing daemon is based on the Multi-threaded Routing Toolkit (MRT) from Merit/U of Michigan. (There is also an effort starting to create a public domain replacement for gated which may bear watching.) With interior routes largely handled through labels, perhaps the most interesting aspect of the routing daemon is how to handle the border, i.e., what to advertise, and how, via existing conventional routing protocols to non-label-switching routers.

SMP support

Recent router designs borrow heavily from the late, lamented "massively parallel minisupercomputer" industry. To maximize its usefulness as example code, NIST Switch is aimed toward a modest version of this environment, which I'm calling "SMP-almost-ready." The idea is to design things to maximize the amount of simultaneous access allowed, and to put in hooks for locking where we know they'll be needed. When a multiple-processor system (dual only for now) becomes available, we will do at least some confirmatory testing to check the fundamental soundness of our SMP design.

Mark Carson