
Frequently Asked Questions about NIST Net

General information

Q. What is NIST Net, and what is it for?

NIST Net is a network emulation package that runs on Linux. NIST Net allows a single Linux box set up as a router to emulate a wide variety of network conditions, such as packet loss, duplication, delay and jitter, bandwidth limitations, and network congestion. This allows testing of network-adaptive protocols and applications (or those which aren't, but maybe should be) in a lab setting.

Q. What is "emulation," and how does it differ from "simulation"?

As the terms are used here, "emulation" is basically testing code inserted into a "live" implementation, allowing the live implementation to emulate (imitate) the performance characteristics of other networks. "Simulation" is a totally synthetic test environment, without a live component.

Q. What do I need to run NIST Net?

  1. A Linux installation (2.0.xx, 2.2.xx or 2.4.xx). I personally only tend to install it on Slackware distributions, but it does work for other distributions as well. The current version of NIST Net will install on at least 2.0.27 - 2.0.39, 2.2.5 - 2.2.18 and 2.4.0 - 2.4.2, and probably most other versions as well.
  2. One or more network interfaces. Typically, NIST Net is installed on a box with two Ethernet cards which is routing between two subnets. This allows it to munge all traffic flowing between the two networks. It can also be set up on an end node, to munge traffic into that node. There is no dependence on the interface type; loopback, token ring or PPP all work as well as Ethernet.
  3. An X11R6 installation, for the user interface. The interface is built on the 3D Athena widget set, but any of the drop-in replacements should work as well. I personally use the NeXT-like libneXtaw widget set, as seen in the included screen shots.

Q. What are the machine requirements for NIST Net?

Essentially, NIST Net needs enough kernel memory to store any delayed packets, and enough processor speed such that the additional overhead it introduces doesn't skew its delay properties too noticeably. (Currently, NIST Net does not account for its own overhead in computing delay factors, under the assumption this is negligible.)

As a couple of data points, NIST Net has been run successfully on a 25/50 MHz 486 with 16M of memory doing emulation on 10Mb Ethernet, and on a 200MHz Pentium with 32M of memory doing emulation on 100Mb Ethernet. Measured per-packet overhead for the first configuration was around 28 microseconds, and for the second, around 5-7 microseconds. Both values are well under the usual minimum inter-packet times on these networks, so should not have any (inherent) adverse effect on packet handling. (The emulator reports average observed overhead through the HITIOCTL_GLOBALSTATS ioctl.)

The overhead for the new alpha is currently a bit higher, as are its memory requirements. I will be working on these aspects over the next few weeks.

Q. Will NIST Net be ported to other architectures, or operating systems other than Linux? How about other widget sets?

The code has only been tested on i386-type systems, but the current version does have some (totally untested) code for Alpha and Sparc processors. (Note: the code hasn't been updated for a little while and needs some touching up.)

There don't seem to be any compelling reasons to port it to other operating systems or widget sets at this time. When used as a router, NIST Net emulation can affect any IP traffic from any source to any destination, regardless of their operating systems.

If you want something similar that runs on SunOS, NIST Net was partially inspired by (and retains some of the interfaces of) the hitbox emulator used by USC in testing TCP/Vegas.

I understand there is a similar package for FreeBSD systems called dummynet, available at http://www.iet.unipi.it/~luigi/ip_dummynet/.

If for some reason you want something similar that runs on a Microsoft Windows operating system, I understand there are commercial packages along these lines. I do not have any further details about them, though.

Q. When is the next release coming out?

2.0.10 was "officially" released March 14, 2001. The next release is due out probably in May.


Installation information

Q. Why are kernel patches required to install NIST Net? Couldn't it all have been done as a kernel module?

It could all have been done as a module, though doing so was somewhat non-trivial. Starting with version 2.0a.7, NIST Net is entirely modular; that is, no kernel patches are required!

Q. I installed an old version of NIST Net that required kernel patches. The new one doesn't. What should I do?

You have three options:
  1. Do nothing. If your old version was 2.0.5 or later, the kernel-based fast timer is only started when called, so the leftover patches will cause no interference. With version 2.0.4 or earlier, the fast timer will still run, increasing the interrupt load, but otherwise having no great effect.
  2. Recompile the kernel without the patches. You can simply configure the NIST Net changes off, or install a clean kernel source tree. (We tend to change kernels with some frequency here anyway, so for us this is the usual method.)
  3. Use the Unpatch.Kernel script that comes with the distribution. It will attempt to determine which version of the patches you have and undo them. Again, you'll have to recompile the kernel when done.

Q. What is the Load.Nistnet script for?

It loads the compiled and installed nistnet module, first attempting to "save" the system RTC (real-time clock) interrupt information if possible. Hence, running Load.Nistnet is the usual way of getting NIST Net going. If you don't want to try to save the system RTC interrupt information for some reason (say, because doing so causes the system to crash...), just do insmod nistnet rather than Load.Nistnet.

Q. I was able to load the nistnet module, but when I try to load mungemod or spymod, I get

./mungemod.o: unresolved symbol addmunge
./mungemod.o: unresolved symbol rmmunge

Version 2.0.10 of NIST Net should fix this problem completely, so first try upgrading to it. If it still fails, consider the following:

There are two possibilities. The simple one is that the nistnet module is no longer loaded; it is a prerequisite for loading mungemod or spymod. (Sometimes a system cleanup daemon may remove it, assuming it is not in use.)

The complex one is that you've run into a module versioning bug. The drastic solution is to recompile the kernel with module versioning turned off ("Set version information on all symbols for modules" under "Loadable module support").

The less drastic solution is to remake the interface version files, to ensure they're all up-to-date. (They should all be updated automatically, but then there are lots of other things that should work properly all the time, aren't there?) Go to /usr/src/linux/include/linux/modules and remove everything there. Then remake the kernel and modules (make, make modules). This shouldn't take too long (assuming you've compiled the kernel before), since you haven't actually changed anything.

Q. I tried compiling NIST Net, but I got complaints about no /usr/src/linux/.config file, or maybe about no <asm/spinlock.h>.

Oh, well, I should have realized this. The problem is, even if you don't need to create a new kernel to install NIST Net any more, you still need to have the kernel configuration defined so NIST Net can be compiled properly. So follow this procedure:
  1. On most Red Hat systems, there's a set of config files corresponding to the various kernel options in /usr/src/linux/configs. For example, on a nearby Red Hat 7 box, we have
    kernel-2.2.16-i386-BOOT.config  kernel-2.2.16-i586.config
    kernel-2.2.16-i386-smp.config   kernel-2.2.16-i686-enterprise.config
    kernel-2.2.16-i386.config       kernel-2.2.16-i686-smp.config
    kernel-2.2.16-i586-smp.config   kernel-2.2.16-i686.config
    
  2. To determine which one is actually being used, type uname -r. On the box here, this gives 2.2.16-22enterprise, so the appropriate config file is kernel-2.2.16-i686-enterprise.config.
  3. Now do the following:
    cd /usr/src/linux/configs
    cp kernel-2.2.16-i686-enterprise.config ../.config
      (replace this with whichever config file is appropriate)
    cd /usr/src/linux
    make menuconfig
      (look over the selections if you wish, though you probably don't need to
      change anything; when you exit, say "yes" to saving the kernel configuration)
    make dep
      (creates all the dependencies and so on corresponding to your configuration)
  4. If you don't have a /usr/src/linux/configs directory, you'll have to create a configuration file as best you can. Just skip to the make menuconfig part above. For other packages, you may want to put some effort into setting up a good configuration, but for NIST Net, the default values are almost certainly good enough, so you can just exit out and save.
  5. You should now be able to compile and install NIST Net.

Q. Well, when I tried compiling NIST Net, I got complaints about no xmkmf or maybe no Imake.tmpl or <X11/Shell.h>.

You need to install the "X development kit" or "toolset" or whatever it's called for your version of Linux.


"Complaints" about NIST Net, or why does it work this way?

Q. Why does NIST Net emulation only affect incoming traffic and not outgoing traffic?

Basically because it was easier (i.e., required less disruption of the kernel code) to do it that way. When NIST Net is used on a router, catching packets at receive time suffices to affect all traffic. This isn't true on an end node, of course, but hopefully the provided capabilities will be sufficient.

There are vague plans to redo the packet interception code to allow handling outgoing traffic too. Don't expect this anytime soon, though.

Q. Why does NIST Net implement DRD (Derivative Random Drop) instead of RED (Random Early Detection)? Isn't RED "better" even if more complicated?

For the purposes of the emulator, any congestion-dependent packet dropping mechanism is really sufficient. The main problem with DRD in a router implementation is that an instantaneous traffic burst could lead to a large number of near-simultaneous drops from multiple TCP connections, and hence to correlation of their subsequent restarts. But since this emulator can treat any source/destination pair separately, there's no need to end up with correlated drops.

Of course, if the goal is actually to test RED or some variant thereof, it can be implemented as an "add-on" packet munger. And if you happen to want correlated drops, NIST Net does offer this as well.

Q. Why didn't they implement a faster timer in ordinary Linux? Are there bugs/problems with your approach?

Yes, there is a potential problem, which explains why the faster timer wasn't implemented in ordinary Linux. The problem is that some poorly behaved Linux device driver code can turn off interrupts for an indefinite period. If this period is more than one timer tick, it's possible for timer interrupts to be "lost," so the timer tick count will be off. For the ordinary Linux timer tick interval of 1/100 of a second, this normally isn't much of a problem. With the code here running a clock at 8192 Hz (a timer tick interval of 122 usec), it becomes more likely. How can this be dealt with? Here are several approaches:
  1. Don't worry; be happy. The main effect of missed ticks from NIST Net's point of view is that some packet delays are a bit longer than they would be ideally. By and large, I haven't found this to be a major problem. (NIST Net isn't really a "hard real-time" system.)
  2. On the machine with NIST Net, don't have a lot of device activity while NIST Net is running.
  3. The biggest offender usually is the IDE controller. Try using hdparm to get it to unmask interrupts sooner. I usually use the following (put into rc.local, once for each IDE drive):

    hdparm -m16 -u1 /dev/hda

    See the hdparm manual entry first for various caveats on its use. If your system can use these settings, do so; they will actually improve its performance and responsiveness. If it can't handle them, though, the "symptoms" may include massive disk corruption, so a little caution is indicated. (This shouldn't be a problem with any recent systems, though.)

  4. With Pentium-class systems, timer problems are much less of an issue, since the Pentium has a fairly accurate cycle counter which can be used to keep the clock in sync. (By the way, the counter is only really useful when APM (power management) is not enabled. I STRONGLY recommend not enabling APM on any machine running NIST Net, since going into sleep mode pretty much wrecks a machine's usefulness as a router...)


Technical points

Q. [This one was an actual asked question!] I set a straight bandwidth limitation of 8000 bytes/second. For ping packets 1472 or below in size, I see no delays. When the size reaches 1473, though, the delay suddenly jumps to around 190 ms. What's going on?

This is a consequence of the way IP behaves, and the way NIST Net implements bandwidth limitation. When you set a maximum allowed bandwidth of 8000 bytes/second, then as long as the bandwidth utilization is below that level, NIST Net won't do any delaying. That's what you see with packet sizes 1472 and below - since ping sends one packet per second, its bandwidth utilization is well below 8000 bytes/second.

So why are there delays for packet sizes 1473 (and above)? Well, including the IP header, the actual packet size is 1501 bytes. On most LANs (most networks, in fact), the maximum allowed IP packet size (MTU) is 1500. So when you try to send a larger packet, IP will fragment it into two packets. If you trace the traffic (with hitbox -S src dest), you'll see that for each ping, two packets are sent in a row, of sizes 1500 and 46 bytes. (46 bytes is the minimum ping packet size.) When the first packet arrives, NIST Net notes that 1500 bytes have been sent through that connection; to keep the instantaneous bandwidth utilization below 8000 bytes/second, it will then delay the second packet for 1500/8000 of a second (187.5 ms). This delays the reassembly of the packet fragments at the receiving end by the same amount.

NIST Net delays the packet because it looks at instantaneous bandwidth utilization, i.e., it's (roughly) emulating a network where at no time can you send more than 1 byte per 1/8000 of a second. So the second packet is delayed, even though the long term average utilization is only 1546 bytes/second. (By the way, you should see the delays stay about the same value for ping packets of size up to 2952; at 2953, the packet gets fragmented into three pieces, and the delay times will double.)

Now one quirk of the implementation is that it only takes bandwidth utilization by the previous packets into account, not the current one. So when you're only sending packets every second, like here, the first one essentially gets a free ride. I had thought about taking the current packet into account, but with sustained traffic this will tend to overcount bandwidth utilization and delay packets too much.

Some people haven't been happy with this quirk, so it is now a configuration option. If you look at the Config file in the top-level NIST Net directory, it indicates three possible ways of doing bandwidth delay. Check the comments there for more details.
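
To make the arithmetic concrete, here is a minimal sketch (with illustrative names, not the actual NIST Net source) of the default instantaneous-bandwidth bookkeeping described above, applied to the fragmented 1473-byte ping:

    #include <stdio.h>

    /* Delay (in microseconds) charged to a packet, given how many bytes of
     * earlier traffic are already counted against the bandwidth budget.
     * Only previous packets are counted -- the current one rides free. */
    static long bandwidth_delay_usec(long backlog_bytes, long bandwidth_Bps)
    {
        return backlog_bytes * 1000000L / bandwidth_Bps;
    }

    int main(void)
    {
        /* A 1473-byte ping becomes a 1500-byte and a 46-byte IP packet. */
        printf("fragment 1: %ld usec\n", bandwidth_delay_usec(0, 8000));
        printf("fragment 2: %ld usec\n", bandwidth_delay_usec(1500, 8000));
        return 0;
    }

Compiled and run, this prints 0 microseconds for the first fragment and 187500 (187.5 ms) for the second, which matches the behavior described above.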

Q. When I put new entries into the NIST Net user interface, it seems to hang for a long period, then finally comes back. What's going on?

Almost certainly this is due to a problem with domain name resolution on your system. In an attempt to normalize the appearance of names, when you enter a new name into the user interface, it does two DNS lookups:
  1. A forward lookup of name to IP address, to find the address it will furnish to the kernel emulator.
  2. A reverse lookup of that IP address to name, to find the "standard" form of the name it will then display.
Usually, it's the latter that gives problems. Try nslookup host and nslookup IP address for the entry you're adding. If one of them fails, you can either fix your DNS server, or as a quick hack add the problem host/IP address to /etc/hosts on the NIST Net machine.

This one seems to be a fairly common problem, especially affecting people who are not in a position to fix their DNS servers. So, in the latest versions, I have added a timeout around the DNS lookups. If they don't succeed within a fairly short period of time, they are aborted. The code sets this time period to 5 seconds; this should usually be fine, but if your DNS server is extraordinarily slow, this may be too short. If so, fix the alarm() calls in nistnet/lib/alarmingdns.c to use a longer period.
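
For the curious, the general idea is the classic alarm()-around-a-blocking-call pattern. The sketch below only illustrates that pattern; the names are made up, and the real code lives in nistnet/lib/alarmingdns.c:

    #include <netdb.h>
    #include <setjmp.h>
    #include <signal.h>
    #include <stddef.h>
    #include <unistd.h>

    static sigjmp_buf dns_timeout;

    static void on_alarm(int sig)
    {
        (void)sig;
        siglongjmp(dns_timeout, 1);
    }

    /* Forward lookup with a timeout: returns NULL on failure or timeout. */
    static struct hostent *lookup_with_timeout(const char *name, unsigned seconds)
    {
        struct hostent *h = NULL;

        signal(SIGALRM, on_alarm);
        if (sigsetjmp(dns_timeout, 1) == 0) {
            alarm(seconds);
            h = gethostbyname(name);
        }
        alarm(0);                 /* cancel any pending alarm */
        return h;
    }

The point is simply that a slow lookup gets cut short by the alarm instead of hanging the interface; the reverse (address-to-name) lookup is handled the same way.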

Q. [Vaguely based on yet another actual asked question] What's going on with the units for the various delay/drop/etc. parameters? And how do random delay and DRD actually work?

Bear with me; this is a slightly long one.

1. First of all, I finally unified all the disparate measurement units into one list:
    Quantity                  Units
    Delay times               milliseconds (floating point)
    Bandwidth                 bytes/second (integer)
    Drop/dup probabilities    percentage of packets dropped or duplicated
                              (i.e., 100 x the fraction; floating point)
One thing to note here is that bandwidths are in bytes/second, not bits/second! So, if you want to do 56000 baud, it's (approximately) 7000 bytes/second.

2. The random delay stuff was done in a way to make it quick to implement, though a little clumsy to explain. I use a random number to do a lookup in a distribution table, generating a "number of standard deviations" value (multiplied by a scaling factor of 8192 to make it integral). The delay value is then:

specified (mean) delay + (# of std dev) * (size of std dev) / scale

This gives random values which have the specified mean and standard deviation, and which match the specified distribution. It's perhaps a slightly cheesy method, but seems good enough for this purpose.

Of course, there's more to a distribution than just its shape. Successive delays, drops and so on tend to be strongly correlated with each other. For this reason, the new version of NIST Net allows specifying a (linear) correlation factor for these events. This is a number between -1 and 1, where -1 indicates complete anticorrelation; 0 indicates no correlation; and +1 indicates complete correlation. (Realistic values tend to be around .1 to .8.) The actual delay applied will then be

(1 - correlation) * (calculated delay) + (correlation) * (previous delay)

Now here I will have to admit I am oversimplifying. If you want to understand better what NIST Net is really doing here, check the README files in the math directory that comes with the NIST Net distribution.
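
In code, the two formulas above work out to something like the following sketch (illustrative names only, not the actual NIST Net source; again, see the math directory for the full story):

    #include <stdlib.h>

    #define TABLESIZE   4096
    #define TABLEFACTOR 8192

    /* Synthesize one delay value from an inverse-CDF lookup table (see the
     * distribution-table question below), with the given mean, standard
     * deviation, and correlation with the previous delay. */
    static double random_delay(const short table[TABLESIZE], double mean,
                               double sigma, double correlation, double previous)
    {
        /* The table entry is "number of standard deviations" times TABLEFACTOR. */
        int devs = table[rand() % TABLESIZE];
        double calculated = mean + (double)devs * sigma / TABLEFACTOR;

        /* Blend with the previous delay according to the correlation factor. */
        return (1.0 - correlation) * calculated + correlation * previous;
    }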

3. One other slightly confusing note is that while I specify all times in microseconds, internally, they're rounded off to the nearest "minijiffy" (minor timer tick), which by default is set to 1/7600 sec, around 131 microseconds. (The weird value is because it needs to be an integral divisor of the frequency of the 8253 timer chip. Otherwise, the machine's clock will start drifting off due to roundoff errors. Here at NIST we have to have precise clocks!)
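
As a concrete illustration (hypothetical code, not the actual driver source), converting a requested time in microseconds to minijiffies at the default rate of 7600 per second looks roughly like this:

    #define MINIJIFFY_HZ 7600

    /* Round a delay in microseconds to the nearest minijiffy. */
    static long usec_to_minijiffies(long usec)
    {
        return (long)(((long long)usec * MINIJIFFY_HZ + 500000) / 1000000);
    }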

4. Along the same lines, internally all parameters are integers, so the percentages get converted to fractions of 2^16. What I do is generate a random number between 0 and 2^16; if it's less than the converted value, the packet is dropped or duplicated.
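
A sketch of that test (again with illustrative names, not the NIST Net source):

    #include <stdlib.h>

    /* Decide whether a packet is "hit" (dropped or duplicated) for a given
     * percentage: the percentage is scaled to a fraction of 2^16 and compared
     * against a random 16-bit value. */
    static int hit_percentage(double percent)
    {
        unsigned int threshold = (unsigned int)(percent / 100.0 * 65536.0);
        unsigned int r = (unsigned int)rand() & 0xffffu;

        return r < threshold;
    }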

5. The DRD parameters are the minimum and maximum queue lengths for the DRD algorithm. More precisely, if the number of packets queued is less than the minimum specified, DRD won't drop any packets. When the minimum is reached, DRD starts randomly dropping 10% of the incoming packets. This percentage ramps up with an increase in queue length, reaching 95% when the maximum is reached. (You can actually have more packets queued than the "maximum," but with 95% of all new packets being dropped, you tend not to get very much above the maximum.)

6. The ECN (explicit congestion notification) parameter must be a value between the minimum and maximum queue lengths (or 0, which means congestion notification will not be used for this connection). When this is set, if a packet arrives which is marked with the ECN_CAPABLE bit (currently bit 1) and the queue length is between the minimum and ECN parameter, then NIST Net will mark the packet with the ECN_NOTED bit (currently bit 0) rather than drop it. Not all packets will be so marked, but only those that would otherwise have been (randomly) dropped. If the queue length rises above the ECN parameter, then NIST Net will drop a packet whether or not it is marked as ECN_CAPABLE.

My reading of the ECN proposals is that the DRD ECN parameter should be set equal to the DRD maximum. An ECN-capable router should only drop ECN-enabled packets when it is in a condition of "distress," which is what exceeding the DRD maximum should mean. Of course, if you want to experiment with ECN, you can set the values however you wish!
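
Putting points 5 and 6 together, the decision logic amounts to something like the sketch below. This is an illustration only, with made-up names; in particular, I have assumed a linear ramp between the 10% and 95% drop rates, which is my reading of the behavior described above.

    /* Returns 0 = pass the packet, 1 = mark it with ECN_NOTED, 2 = drop it.
     * random01 is a uniform random value in [0, 1). */
    static int drd_decide(int qlen, int drdmin, int drdmax, int ecnlimit,
                          int ecn_capable, double random01)
    {
        double p;

        if (qlen < drdmin)
            return 0;                   /* below the minimum: never drop */

        if (qlen >= drdmax)
            p = 0.95;                   /* at or above the maximum: 95% */
        else                            /* assumed linear ramp from 10% to 95% */
            p = 0.10 + 0.85 * (qlen - drdmin) / (double)(drdmax - drdmin);

        if (random01 >= p)
            return 0;                   /* this packet escapes */

        /* The packet would have been dropped; mark it instead if ECN applies. */
        if (ecnlimit > 0 && ecn_capable && qlen <= ecnlimit)
            return 1;
        return 2;
    }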

Q. I want to create my own delay distribution table. How should it be set up to match the distribution I want?

The distribution used is governed by a table of short integers you can load. The code generates a uniformly distributed "random" number between 0 and the size of the table (4096 is what I used, which should be more than good enough for any use I can imagine). This is used as an index in looking up the table value, which is then divided by the table "factor" to get a "number of standard deviations" value. In my tables, the factor was always 8192, and hence the "number of standard deviations" always between -4 and 4 (MAXSHORT/8192). Again, as mentioned above, the generated delay time is then
(average) delay + (number of standard deviations) * (delay sigma)

The table used for "synthesizing" the distribution amounts to a scaled, translated inverse of the cumulative distribution function.

Here's how to think about it: Let F() be the cumulative distribution function for a probability distribution X. We'll assume we've scaled things so that X has mean 0 and standard deviation 1, though that's not so important here. Then:

F(x) = P(X <= x) = the integral of f from -infinity to x,
where f is the probability density function.

F is monotonically increasing, so it has an inverse function G, defined on the interval from 0 to 1. Here, G(t) = the x such that P(X <= x) = t. (In general, G may have singularities if X has point masses, i.e., points x such that P(X = x) > 0.)

Now we create a tabular representation of G as follows: Choose some table size N, and for the ith entry, put in G(i/N). Let's call this table T.

The claim now is that I can create a (discrete) random variable Y whose distribution has approximately the same "shape" as X, simply by letting Y = T(U), where U is a discrete uniform random variable with range 1 to N. To see this, it's enough to show that Y's cumulative distribution function (call it H) is a discrete approximation to F. But

H(x) = P(Y <= x)
     = (number of entries in T that are <= x) / N   -- since Y is chosen uniformly from T
     = i/N, where i is the largest integer such that G(i/N) <= x
     = i/N, where i is the largest integer such that i/N <= F(x)
       -- since G and F are inverse functions (and F is increasing)
     = floor(N*F(x))/N,

as desired.

How can we create this table in practice? In some cases, F may have a simple expression which allows evaluating its inverse directly. The Pareto distribution is one example of this. In other cases, and especially for matching an experimentally observed distribution, it's easiest simply to create a table for F and "invert" it. Here, we give a concrete example, namely how the new "experimental" distribution was created. Note: starting with version 1.4, tools to do all the operations described here are provided.

  1. Collect enough data points to characterize the distribution. Here, I collected 25,000 "ping" roundtrip times to a "distant" point (www.ntt.co.jp). That's far more data than is really necessary, but it was fairly painless to collect it, so...
  2. Normalize the data so that it has mean 0 and standard deviation 1.
  3. Determine the cumulative distribution. The code I wrote creates a table covering the range -4 to +4, with granularity .00002. Obviously, this is absurdly over-precise, but since it's a one-time only computation, I figured it hardly mattered.
  4. Invert the table: for each table entry F(x) = y, make the y*TABLESIZE (here, 4096) entry be x*TABLEFACTOR (here, 8192). This creates a table for the ("normalized") inverse of size TABLESIZE, covering its domain 0 to 1 with granularity 1/TABLESIZE. Note that even with the excessive granularity used in creating the table for F, it's possible not all the entries in the table for G will be filled in. So, make a pass through the inverse's table, filling in any missing entries by linear interpolation.
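
As a rough sketch of step 4 (this is not the provided tool; the names and sizes are only illustrative, and the real tools interpolate linearly rather than using the crude gap-filling shown here):

    #define TABLESIZE   4096
    #define TABLEFACTOR 8192
    #define CDFSIZE     400000          /* x from -4 toward +4 in steps of .00002 */

    /* cdf[i] is assumed to hold F(x) for x = -4 + i*0.00002; inverse[] receives
     * the TABLESIZE-entry lookup table used by the emulator. */
    static void invert_cdf(const double cdf[CDFSIZE], short inverse[TABLESIZE])
    {
        static int filled[TABLESIZE];
        int i, j;
        short prev = 0;

        for (j = 0; j < TABLESIZE; j++)
            filled[j] = 0;

        /* For each tabulated F(x) = y, set entry y*TABLESIZE to x*TABLEFACTOR. */
        for (i = 0; i < CDFSIZE; i++) {
            double x = -4.0 + i * 0.00002;
            int slot = (int)(cdf[i] * TABLESIZE);

            if (slot < 0)
                slot = 0;
            if (slot >= TABLESIZE)
                slot = TABLESIZE - 1;
            inverse[slot] = (short)(x * TABLEFACTOR);
            filled[slot] = 1;
        }

        /* Fill any gaps left by the inversion (crudely, by carrying the
         * previous value forward). */
        for (j = 0; j < TABLESIZE; j++) {
            if (filled[j])
                prev = inverse[j];
            else
                inverse[j] = prev;
        }
    }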

To load a new distribution table, you can of course recompile the NIST Net driver. However, the driver actually supports loading new tables at runtime. Simply write the table, which must be an array of 4096 short words (8192 bytes), to the device driver:

#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    int readfd, writefd;
    char buffer[8192];

    /* Assume binary.table is your previously prepared table of 4096 short ints */
    readfd = open("binary.table", O_RDONLY);
    read(readfd, buffer, 8192);
    writefd = open("/dev/hitbox", O_WRONLY);
    write(writefd, buffer, 8192);
    close(readfd);
    close(writefd);
}
When updating the table by this means, you must use a table size of 4096 and a table factor of 8192. (More precisely, these must be the same as whatever table is currently loaded.)

I had previously advised just using cat to write the table to the device. This falls into the category of "things which seem like they so obviously will work that they don't require testing, and of course in practice will fail." The problem is that cat will try to break the write into pieces of 4096 bytes, which will fail. So you have to use a little program, as indicated above.

After writing the above, it occurred to me (and I actually tested it this time) that dd will do the job in writing the table. Just use this line:

dd if=binary.table of=/dev/hitbox bs=8192 count=1

Q. [Vaguely based on yet another actual asked question] When I use fixed delays on a very fast network (like gigabit), I get some packet reordering. What's going on?

Here, there's some good news and some bad news. Most likely, what you've witnessed is the lack of stability in the radix sort used for Linux timers (which I incorporated into the fast timer). This is fixed in version 2.0.8, so first try upgrading to it (or a later version).

If you're still seeing the problem, this means that it is taking longer than one timer tick to process all the packets scheduled for that tick. I can't see how this could happen, unless you're going from a much faster network to a much slower one (like, forwarding gigabit Ethernet to 10 Mbit Ethernet), and have no flow control. Of course, in that case you'd be having a few other network problems anyway...

For unrelated reasons, I added another lock to packet processing in version 2.0.10, so this problem should be gone for good now.


Trivial points

Q. Tell me all that good legal stuff about copyrights and warranties.

As a U.S. government publication, so to speak, NIST Net is not copyrighted. You can do whatever you want with it, including employing its code in whole or in part in any other package or product. You need not credit me or NIST (though not doing so would be a bit rude).

As is usual for code provided on this basis, there is absolutely no warranty of any sort. We are interested in receiving any reports of problems or requests for improvements and will try to help, but can't make any specific promises!

I had thought the above was fairly explicit, but apparently not explicit enough. What the lack of copyright means is that you have a non-exclusive right to use this code in any fashion you wish, including incorporating it into a product without any further compensation or permission required. "Non-exclusive" just means that other people can do it as well, so just because you used NIST Net in your product doesn't mean somebody else can't.

Please note these remarks only apply to the code I originated; some code (like the fast timer) is based directly on existing Linux kernel code and hence carries exactly the same copyright restrictions as Linux does.

Q. What's the name of this thing, anyway? Is it Nistnet, NISTNet, or what?

Well, you can call it what you want, but the "official" name is "NIST Net" with the capitalization and spacing shown. The "Net" part could be construed as an acronym for "Network emulation tool," though in point of fact, we came up with the name and only later tried to fit an acronym to it. That's why I didn't go all caps on it. (I tend to have a cordial dislike of acronyms in any case.)

New! Q. Where can I get a big/small/whatever sized NIST Net logo?

Right here!
Comments? Questions? Let us know at nistnet-dev@antd.nist.gov.