Network Test: Latency and Jitter White Paper

A stockbroker fires up a high-frequency trading application. A football fan streams game highlights onto a smartphone. A neurosurgeon loads a brain scan onto a tablet. These users have something in common: They all expect content delivered when they want it. But throw in even a little unexpected delay on the network, and the result could be lost revenue, dissatisfied customers, or a matter of life or death. That’s what makes latency a critical metric when it comes to network performance assessment.

Time-based performance measurement is a complex topic. Some common terms have multiple definitions, with subtle differences between official and colloquial usage. Other concepts require an understanding of network device and/or test instrument architecture. And many latency measurements, especially in the area of QoS testing, require not only great accuracy and precision but also specific capabilities on the part of the test instrument.

This whitepaper explores latency measurement and covers the following three key areas in detail:

  • Terminology for latency, jitter, and related topics
  • Hands-on examples of latency measurement at work
  • Guidelines and latency-related questions to consider when assessing network performance test equipment

About Time: The Role of Latency and Jitter in Network Performance Assessment
By David Newman, Network Test
February 2011

ABOUT TIME: LATENCY AND RELATED TIME-BASED METRICS

Table of Contents

Executive Summary
Part I: Defining Key Terms
  Latency
  Jitter
  The Myth of Zero Jitter
  Negative Latency
  Timestamp Resolution
  Ethernet Clocking and PPM
  Speed Mismatches
  Latency Over Time
  Latency Distribution and Histograms
Part II: Applied Latency Testing
  Example 1: Absolute Accuracy and Precision
  Example 2: Mixed Traffic Classes
  Example 3: Drilling Down on Delay
Part III: Latency Questions for Your Test Equipment Vendor
About Network Test
Disclaimer

Executive Summary

A stockbroker fires up a high-frequency trading application. A football fan streams game highlights onto a smartphone. A neurosurgeon loads a brain scan onto a tablet. These users have something in common: They all expect content delivered when they want it. But throw in even a little unexpected delay on the network, and the result could be lost revenue, dissatisfied customers, or a matter of life or death. That’s what makes latency a critical metric when it comes to network performance assessment.

Throughput may grab more attention, but it describes only the maximum limits of a system. Latency (along with its close relative, jitter) applies to all traffic, all the time. Even on a lightly loaded network, added latency and jitter can have a significant impact on application performance, especially for converged networks carrying voice and video traffic.

Time-based performance measurement is a complex topic. Some common terms have multiple definitions, with subtle differences between official and colloquial usage. Other concepts require an understanding of network device and/or test instrument architecture. And many latency measurements, especially in the area of QoS testing, require not only great accuracy and precision but also specific capabilities on the part of the test instrument.

Spirent Communications commissioned Network Test to prepare a white paper exploring latency measurement.
This paper is organized into three parts. Part I discusses terminology for latency, jitter, and related topics. Part II presents hands-on examples of latency measurement at work, using Spirent TestCenter as the test instrument. Part III presents a list of latency-related questions to consider when assessing network performance test equipment.

Part I: Defining Key Terms

Terminology is important in network device benchmarking, especially since some definitions differ from those in common usage. For example, RFC 1242, a foundation document for network device performance benchmarking, defines “throughput” as a zero-loss condition. In contrast, many engineers use the term simply to mean the rate at which traffic passes through a system, without regard to frame loss. Given that two definitions can produce very different results, it’s important to understand exactly what’s meant when referring to a given metric.

This section discusses definitions for latency, jitter, and related concepts that apply to time-based measurements.

Latency

Even within the benchmarking world, one metric may have multiple meanings. Latency is one such term. RFC 1242 offers different definitions depending on whether you’re testing a “store and forward” or “bit forwarding” device.

Store-and-forward devices are more common. These devices cache an entire incoming frame before deciding how to process it. For store-and-forward devices, RFC 1242 defines latency as follows:

    The time interval starting when the last bit of the input frame reaches the input port and ending when the first bit of the output frame is seen on the output port.

In other words, store-and-forward latency is a “last in, first out” (LIFO) measurement, in that the clock starts when the device under test (DUT) receives the last bit of a frame and stops when the device begins forwarding the frame.
You may sometimes hear test engineers refer to these markers as “time 0 and time 1.” Regardless of their names, these “in” and “out” markers are extremely important when it comes to precision and accuracy [1].

[1] Accuracy and precision are not at all the same thing. Accuracy refers to the correctness of a measurement, while precision refers to a measurement’s variation around some reference point. A clock that varies 1 nanosecond per year but is off by 24 hours is extremely precise but not very accurate.

In contrast, “bit forwarding” devices begin processing a frame as soon as the DUT receives the frame header, without waiting for the rest of the packet. These devices are more widely known as “cut-through” switches because they begin to “cut through” the switch fabric to forward a frame as soon as possible. RFC 1242 offers a different latency definition for these devices:

    The time interval starting when the end of the first bit of the input frame reaches the input port and ending when the start of the first bit of the output frame is seen on the output port.

Cut-through latency is a “first in, first out” (FIFO) metric. Here, the measurements are at the beginning of the frame, both on the ingress and egress sides of the DUT.

Regardless of which technique is used, RFC 2544 (the methodology companion document to RFC 1242) recommends that latency be measured at, and only at, the throughput rate.

Another testing specification, RFC 4689, defines a “last in, last out” (LILO) measurement technique, but calls the resulting metric “forwarding delay” rather than “latency.” Forwarding delay can be very useful in that it can be measured at any intended load, not just the throughput rate. Section 3.2.4 of RFC 4689 has an interesting discussion that compares LILO measurement with other techniques.

The final permutation, “first in, last out” (FILO), is not defined in the RFCs.
Although this set of markers is not widely used in device benchmarking, this method is useful in measuring frame insertion time – the time needed to place an entire frame on the medium – when verifying that a test instrument is correctly calibrated. In theory, Ethernet frame insertion time measurements should match a simple formula:

    I = f / m

where I is the frame insertion time in seconds; f is frame length in bits; and m is media rate in bits per second. So, for example, to calculate frame insertion time for a 64-byte frame on 10G Ethernet, we first convert bytes to bits and then plug the values into the formula:

    I = 512 / 10,000,000,000 = 0.0000000512 s = 51.2 ns

Thus, a FILO latency measurement for 64-byte frames on 10G Ethernet can never be smaller than 51.2 ns, since it takes at least that long for the frame to cross the medium (plus cable propagation time). In practice, observed DUT latencies are far higher; however, FILO latency is still useful when assessing timestamp resolution, discussed below [2].

[2] The time between any two frames is slightly higher, since that measurement must include 12 additional bytes for the interframe gap and 8 additional bytes for the Ethernet preamble.

There’s no one “right answer” as to whether you should use LIFO, FIFO or LILO latency; each has its place. It’s important to understand what each metric represents, and where each metric is or isn’t appropriate. Obviously, a test instrument that lets you choose the measurement technique is also a prerequisite.

Graphical comparisons are helpful in understanding the different methods of latency measurement (see Figures 1 to 4).
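
As a rough illustration of the insertion-time formula and the four marker pairs described above, the following Python sketch derives LIFO, FIFO, LILO, and FILO values from four bit-level timestamps. The class, field, and function names are illustrative assumptions; they do not come from the RFCs or from any test instrument, and the 500-ns forwarding time is made up for the example.

```python
from dataclasses import dataclass


def frame_insertion_time_ns(frame_bytes: int, media_rate_bps: float) -> float:
    """I = f / m, with f in bits and m in bits per second; result in ns."""
    return frame_bytes * 8 / media_rate_bps * 1e9


@dataclass
class FrameMarkers:
    """Hypothetical timestamps (in ns) for one frame crossing a DUT."""
    first_bit_in: float
    last_bit_in: float
    first_bit_out: float
    last_bit_out: float

    def lifo(self) -> float:
        # Store-and-forward latency (RFC 1242): last bit in, first bit out.
        return self.first_bit_out - self.last_bit_in

    def fifo(self) -> float:
        # Bit-forwarding (cut-through) latency (RFC 1242): first in, first out.
        return self.first_bit_out - self.first_bit_in

    def lilo(self) -> float:
        # Forwarding delay (RFC 4689): last bit in, last bit out.
        return self.last_bit_out - self.last_bit_in

    def filo(self) -> float:
        # First bit in, last bit out; includes the frame insertion time.
        return self.last_bit_out - self.first_bit_in


insertion = frame_insertion_time_ns(64, 10e9)
print(f"64-byte frame on 10G Ethernet: {insertion:.1f} ns")

# Assumed example: the DUT emits the first bit 500 ns after the first bit arrives.
m = FrameMarkers(0.0, insertion, 500.0, 500.0 + insertion)
for name, value in (("LIFO", m.lifo()), ("FIFO", m.fifo()),
                    ("LILO", m.lilo()), ("FILO", m.filo())):
    print(f"{name}: {value:.1f} ns")
```

For a frame of fixed length, FIFO and LILO agree, while LIFO and FILO differ from them by one frame insertion time in opposite directions.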
FIGURE 1: STORE-AND-FORWARD (RFC 1242), LIFO
FIGURE 2: BIT FORWARDING, AKA CUT-THROUGH (RFC 1242): FIFO
FIGURE 3: FORWARDING DELAY (RFC 4689): LILO
FIGURE 4: FRAME INSERTION TIME: FILO

Jitter

RFC 4689 also defines jitter, which is commonly understood as “variation in latency” or “distribution of latency across multiple packets or streams.” The more precise definition given in RFC 4689 is:

    The absolute value of the difference between the Forwarding Delay of two consecutive received packets belonging to the same stream.

It’s necessary to express jitter as an absolute value because the difference in forwarding delay between any two packets may be a negative number. The mathematical expression of jitter is:

    | D(i) – D(i-1) |

where D is forwarding delay and i is the order in which frames were received.

Two related terms used by Spirent TestCenter and other test instruments are delay variation and inter-arrival jitter. The Spirent TestCenter documentation describes delay variation as a measurement very similar to RFC 4689, involving the time difference between two arriving frames in the same stream. Although this is not an absolute value, as in the RFC definition, its absolute value is used in calculating accumulated jitter over the entire test duration. Inter-arrival jitter describes variation from the expected arrival time of each frame. It is computed based on frame rate.

The Myth of Zero Jitter

Jitter has one other meaning that describes variance between measurements. As discussed in RFC 3393, jitter can refer to the difference between the arrival time for a given signal and a reference clock signal. Some variation between these two events, however small, is inevitable. This type of jitter is unavoidable in all test instruments, even those measuring asynchronous technologies such as Ethernet. Jitter can be characterized and minimized through the use of more precise clocking sources, but never eliminated.
There will always be a point where a more precise clock source delivers only a small reduction in jitter but an exponential increase in cost. There is always a tradeoff between precision and cost, with even the most expensive clocking equipment still involving some nonzero amount of variance in timing measurements. This has some very real consequences for latency measurement – including the possibility of negative numbers, as we’ll discuss in the next section.

Negative Latency

Negative latency doesn’t mean a frame arrived before it was transmitted, and it doesn’t (necessarily) mean the test instrument is broken. There are two situations, both valid, where a latency measurement can be negative.

The first situation can occur when a test instrument measures very short delays. If latency is close to zero, then jitter in the test instrument’s clock source will cause measurements to oscillate around zero as a midpoint – with half the numbers slightly greater than and half slightly less than zero. The resulting negative measurements can be confusing, especially for nontechnical audiences. Even experienced test engineers may erroneously conclude there is a problem with the test instrument.

The test instrument may simply present all latency exactly as measured, both positive and negative. Another option is to compensate for the negative numbers by adding a small positive offset in the test instrument’s firmware. This “positive bias” is the default in multiple test instruments, including Spirent TestCenter. Positive bias is used to avoid confusion over negative results and to maintain backward compatibility for comparisons with existing test results.

The second situation where negative latency can occur is when using the LIFO method on a FIFO device, such as a length of cable. A cable cannot buffer a frame, as a store-and-forward DUT would.
As a result, the “last in” part of LIFO may occur after the “first out” part occurs – and the result is a negative number. Moreover, the magnitude of this negative latency increases linearly with frame length (see Figure 5). RFC 1242 section 3.8 discusses the possibility of negative latency in comparing LIFO and FIFO methods.

FIGURE 5: LATENCY MEASUREMENT METHODS COMPARED

Starting in version 3.60, Spirent TestCenter offers testers a choice of three methods of dealing with negative latency. The first is to use the default positive bias setting, where the instrument will add enough delay to compensate for negative latency of small frames.

The second is to apply a “latency compensation” of zero – essentially, replacing the positive bias with a zero bias. This increases the possibility of negative latency, but also presents the most accurate picture of latency as measured on the medium.

The final method allows users to configure positive or negative latency compensations of up to 1 usec on transmitting and receiving ports. This can be useful when factoring for long cables or media converters, both of which add significant delay that should not be attributed to the DUT.

Timestamp Resolution

Timestamp resolution is a key concept for any time-based measurement. Simply put, it refers to the precision of the test instrument. In the case of 10G Ethernet modules for Spirent TestCenter, the timestamp resolution is ±10 ns. So, for example, the actual delay on the medium for a latency measurement of 660 ns may be anywhere in the range of 650 to 670 ns.

Timestamp resolution is an important consideration when assessing test equipment. It helps testers determine whether latency results are meaningful; for instance, a difference of 15 ns between two measurements is not meaningful if the test instrument has a resolution of 20 ns. Timestamp resolution also determines whether a test instrument’s precision is adequate for a given transport.
Timestamp resolution must be smaller than the minimum frame insertion time for a given medium. At 10G Ethernet rates, the shortest frame insertion time is 51.2 ns, so a timestamp resolution of 10 ns is more than enough. It is also sufficient for 40G Ethernet, where minimum insertion time falls to 12.8 ns. The forthcoming 100G Ethernet specification will require a finer timestamp resolution of 5.12 ns or less.

Spirent TestCenter’s current 10-ns resolution is determined in firmware. Spirent TestCenter hardware is designed so timestamp resolutions of future firmware releases can be as small as 2.5 ns, which will be sufficient for 100G Ethernet and any previous version, all on the same platform.

Ethernet Clocking and PPM

The IEEE 802.3 specification defines Ethernet as an asynchronous technology, with each interface having its own clock source. This differs from synchronous WAN technologies, where all interfaces use a common clock source. To accommodate timing variations among various transceivers, sections 22 and 36 of the IEEE 802.3 specification state that a transmitter must run at a given frequency, ±100 parts per million (which at 10G Ethernet rates equates to ±1,488 fps with minimal-sized 64-byte frames). Over time, differences between transmit and receive rates can lead to buffering and increased delay.

In production networks, it’s a virtual certainty that different Ethernet interfaces will run at slightly different rates. In lab settings, it may be desirable to have all test interfaces run at exactly the same rate; for example, a fabric designer may wish to set all transmitting interfaces to the maximum allowable rate – exactly +100 ppm – to verify a fabric will forward all traffic without frame loss. A test instrument should offer both capabilities – slightly different clock rates on each interface, or uniform clock rates on all interfaces.
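
The arithmetic behind the timestamp-resolution and PPM discussion can be checked with a short Python sketch. It assumes minimum-sized 64-byte frames plus the 8-byte preamble and 12-byte interframe gap; the function names are illustrative only.

```python
PREAMBLE_BYTES = 8         # Ethernet preamble
INTERFRAME_GAP_BYTES = 12  # minimum interframe gap


def min_insertion_time_ns(rate_bps: float, frame_bytes: int = 64) -> float:
    """Insertion time of the shortest frame, in nanoseconds."""
    return frame_bytes * 8 / rate_bps * 1e9


def max_frame_rate_fps(rate_bps: float, frame_bytes: int = 64) -> float:
    """Line-rate frames per second, counting preamble and interframe gap."""
    bits_on_wire = (frame_bytes + PREAMBLE_BYTES + INTERFRAME_GAP_BYTES) * 8
    return rate_bps / bits_on_wire


def ppm_offset_fps(rate_bps: float, ppm: float, frame_bytes: int = 64) -> float:
    """Change in frame rate caused by a clock offset of `ppm` parts per million."""
    return max_frame_rate_fps(rate_bps, frame_bytes) * ppm / 1e6


RESOLUTION_NS = 10.0  # resolution quoted in the text for the 10G modules

for name, rate in (("10G", 10e9), ("40G", 40e9), ("100G", 100e9)):
    t = min_insertion_time_ns(rate)
    adequate = RESOLUTION_NS < t
    print(f"{name}: minimum insertion time {t:.2f} ns, "
          f"10-ns resolution adequate: {adequate}")

print(f"100 ppm at 10G, 64-byte frames: {ppm_offset_fps(10e9, 100):.0f} fps")
```

The same helper can be reused for other frame lengths or media rates by changing the arguments.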
Spirent TestCenter allows PPM adjustments on individual interfaces (see Figure 6). If “Internal” clocking or a 0-ppm setting is used, the test port will use Ethernet’s nominal line rate.

FIGURE 6: PPM CONTROL IN SPIRENT TESTCENTER

There is one unscrupulous use of PPM settings: Some testers have been known to reduce the PPM setting so that a slightly slow DUT will forward all traffic at the newly redefined line rate without loss, and also show reduced latency. This is a form of cheating, and should be avoided. Essentially, it redefines line rate downward simply to accommodate a slow DUT. Network equipment manufacturers (NEMs) cannot know in advance what exact rates their customers’ Ethernet devices will use, and thus should always assume a PPM default of 0 and/or randomly selected PPM values between -100 and +100. The latter represents an interesting test of Ethernet conformance.

Speed Mismatches

A device or system with interfaces operating at different speeds can introduce congestion, and this in turn has implications for latency measurement. If, for example, one or more 10G Ethernet interfaces offer traffic to a gigabit Ethernet interface at any rate above 10 percent of line rate, egress buffers will overflow and frame loss will result. Latency measurements in such conditions may be a reflection of buffer capacity more than anything else [3].

When designing a latency or jitter test for a mixed-speed environment, it’s important to consider whether congestion will occur, and if so what effect the DUT’s buffers have on test results. In QoS testing, it’s common for testers to deliberately induce congestion. We cover this test case in Part II of this document.

Latency Over Time

Modern test instruments can measure the latency of every single frame received, even over long durations. Typically, results are then summarized by presenting minimum, average, and maximum latency and jitter numbers.
This is an effective data reduction technique, but it may not necessarily tell the entire story when it comes to characterizing device or network delay.

In many situations, it is critical to understand when elevated latency or jitter occurred. In QoS testing, congestion may cause buffers to overflow in the first few milliseconds of a test. In other situations, a slight speed mismatch between two devices or networks may cause buffers to fill over time, with corresponding elevated latency. And bursts of traffic can introduce periodic, seemingly random increases in latency and jitter. For all these reasons, it can be very useful to monitor latency over time.

Spirent TestCenter offers multiple ways of tracking changes in latency and jitter. One is short-term average latency, in which average latency is calculated every second and displayed in real time over the test duration. We will use short-term average latency to drill down on test results in Part II of this document.

For even finer-grained latency and jitter tracking, it can be useful to examine traffic at the frame level. Spirent TestCenter embeds a special signature field in every frame transmitted; among other fields, the signature field includes a timestamp and sequence number. Spirent offers a modified version of Wireshark, the open-source protocol analyzer, to decode signature fields and display timing information for every frame. By examining captured frames, it’s possible to discern even the smallest changes in latency and jitter over the test duration.
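
To make these reductions concrete, the following Python sketch computes a min/average/max summary, per-frame jitter as defined earlier (the absolute difference in delay between consecutive frames of a stream), and a per-second short-term average. The record layout and sample values are made up for illustration and do not reflect any instrument's capture or export format.

```python
from collections import defaultdict
from statistics import mean


def jitter_series(delays):
    """|D(i) - D(i-1)| for consecutive frames of one stream."""
    return [abs(b - a) for a, b in zip(delays, delays[1:])]


def summarize(values):
    """Classic data reduction: (minimum, average, maximum)."""
    return min(values), mean(values), max(values)


def short_term_average(records, interval_s=1.0):
    """Average latency per interval; records are (rx_time_s, latency_us) pairs."""
    buckets = defaultdict(list)
    for rx_time_s, latency_us in records:
        buckets[int(rx_time_s // interval_s)].append(latency_us)
    return {idx: mean(vals) for idx, vals in sorted(buckets.items())}


# Made-up sample: (receive time in seconds, latency in microseconds)
records = [(0.1, 4.0), (0.6, 5.0), (1.2, 30.0), (1.7, 32.0), (2.3, 4.0)]
delays = [latency for _, latency in records]

print("min/avg/max latency:", summarize(delays))
print("per-frame jitter:", jitter_series(delays))
print("short-term average:", short_term_average(records))
```

In this sample the overall average hides the fact that all of the elevated latency falls within a single one-second interval, which is exactly what the per-interval view exposes.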
[3] This section and the QoS testing example in Part II use the colloquial rather than the benchmarking definition of “latency.” Since RFC 2544 recommends that latency be measured at the throughput rate, and since throughput is a zero-loss condition, any latency measurement in the presence of frame loss technically should be called “forwarding delay” or just “delay.”

Latency Distribution and Histograms

Modern test instruments measure the latency and jitter of every single frame. That’s an impressive capability, but the usual data reduction methods – such as minimum, average, and maximum latency – can be overly simplistic. Knowing the maximum latency was X is useful, but that single number says nothing about whether X represents one outlier frame, or one billion frames.

Histograms, a concept from statistics, can help here. A histogram is simply a graph of the frequency distribution of a data set. It takes all the points from a given data set and divides them into multiple classes, then presents the classes in chart form (see Figure 7).

FIGURE 7: LATENCY HISTOGRAM

By presenting latency distribution in graphical form, histograms make it easy to determine how latency varies around some average or median value. We’ll discuss a hands-on example using histograms in Part II of this document.

Part II: Applied Latency Testing

It’s time to put the various concepts discussed in Part I to work. In this section, we’ll conduct three sets of tests that will determine the precision and accuracy of the test instrument; measure latency for multiple traffic classes; and record short-term average latency and latency distribution using histograms. We’ll briefly introduce each set of tests and discuss why they’re important in the context of latency measurement.
The test instrument and software for these examples will be Spirent TestCenter version 3.60 using HyperMetrics CV 10G Ethernet modules.

Example 1: Absolute Accuracy and Precision

A common benchmarking practice, especially for any new tool, is to “test the tester” and ensure its measurements are accurate and precise. As noted, accuracy and precision are different concepts, with the former referring to the correctness of a measurement and the latter referring to the degree of variance in that measurement.

It’s important to determine both accuracy and precision for every test instrument. Accuracy is a bedrock requirement: If the test instrument records erroneous time measurements (or, worse, variable and erroneous measurements), then anything it says about a device or system under test cannot be trusted. Precision also should be verified: Any variance in measurements should fall within the timestamp resolution declared for a given test instrument.

In this test, we will determine accuracy and precision by measuring a device with known, constant latency: A 1-meter fiber-optic cable. The expected result is that FIFO latency should register about 5 ns [4]. The test instrument also should be able to determine if FILO latency expands linearly with frame size.

We will use FIFO measurement to determine latency for the cable itself and FILO measurement to determine propagation times for different frame sizes. We use FIFO because we’re measuring the time for a signal to propagate from one end of a cable to the other; that is a first-in, first-out proposition. We use FILO for different frame lengths because here we’re interested in the time it takes to insert and then remove an entire frame from the medium [5].

For both tests, we use Spirent TestCenter’s RFC 2544 wizard to measure latency for a unidirectional stream, and specify multiple frame lengths. Let’s begin with the FIFO latency measurement.
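
Before running such a test, the expected values can be estimated in advance. The Python sketch below assumes roughly 5 ns of propagation delay per meter of cable, a 10G Ethernet media rate, and the 10-ns timestamp resolution mentioned earlier; the frame lengths are the common RFC 2544 sizes, chosen here as an assumption because the paper does not list the lengths used. All names are illustrative.

```python
PROPAGATION_NS_PER_M = 5.0  # approximate per-meter figure assumed here
RATE_BPS = 10e9             # 10G Ethernet
RESOLUTION_NS = 10.0        # timestamp resolution quoted for the 10G modules


def expected_fifo_ns(cable_m: float) -> float:
    """Cable-only delay: first bit in to first bit out."""
    return cable_m * PROPAGATION_NS_PER_M


def expected_filo_ns(cable_m: float, frame_bytes: int) -> float:
    """Propagation delay plus the time to insert the whole frame."""
    insertion_ns = frame_bytes * 8 / RATE_BPS * 1e9
    return expected_fifo_ns(cable_m) + insertion_ns


def within_resolution(measured_ns: float, expected_ns: float,
                      resolution_ns: float = RESOLUTION_NS) -> bool:
    """True if a measurement agrees with theory to within the resolution."""
    return abs(measured_ns - expected_ns) <= resolution_ns


# Assumed frame lengths (common RFC 2544 sizes), not taken from the paper.
FRAME_SIZES = (64, 128, 256, 512, 1024, 1280, 1518)

for size in FRAME_SIZES:
    print(f"{size:>5} bytes: FIFO ~{expected_fifo_ns(1):.1f} ns, "
          f"FILO ~{expected_filo_ns(1, size):.1f} ns")
```

A measured value can then be compared against these estimates with `within_resolution`, which mirrors the requirement that any variance fall within the declared timestamp resolution.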
[4] Propagation time through copper or fiber cabling is around 5 nanoseconds/meter, or roughly two-thirds the speed of light. Thus, it will take approximately 5 ns for a signal to cross a 1-meter cable, 50 ns to cross a 10-meter cable, and 500 ns to cross a 100-meter cable.

[5] Note that these tests use a piece of cabling; if we were measuring a DUT, we would use LIFO or FIFO measurement depending on DUT architecture.

Significantly, we must ensure that Spirent TestCenter does not itself add any latency compensation in these tests. The times involved for a signal to cross relatively short runs of fiber will be less than the compensation factor Spirent TestCenter adds by default, so we should instead use a “zero bias” setting. To do this, we enable latency compensation (using the “PHY options” tab in the Settings node), add 0 ns of delay on a test port (in the advanced tab in the port node; see Figure 8), and then use the port copy wizard to apply this setting to other test ports.

FIGURE 8: REMOVING LATENCY COMPENSATION

The latency measurements from this test match the expected results almost exactly (see Figure 9). In all cases, regardless of frame length, average latency for a 1-meter cable is about 3 to 4 ns, close to the expected 5 ns measurement and well within the 10-ns timestamp resolution of the test instrument.

The results also show perfect scaling of FILO latency as frame lengths increase. The FILO numbers are proportional to frame length, and match the expected theoretical frame insertion times (within the timestamp resolution of the test instrument). These results validate both the accuracy and precision of the test instrument.

FIGURE 9: FIFO AND FILO LATENCY, 1-METER CABLE

Example 2: Mixed Traffic Classes

Latency is a key metric in QoS testing, especially where protection of mission-critical traffic is concerned.
A common task is assessing whether a DUT keeps latency low and consistent for delay-sensitive applications such as voice, video, and some trading applications used in the financial services industry.

By definition, QoS involves giving preferential treatment to one traffic group at the expense of one or more others. Thus, it’s important to be able to track latency for each traffic group to determine whether the DUT correctly prioritizes mission-critical traffic.

In this example, we’ll set up a four-port test in which each of three 10G Ethernet ports offers traffic to a fourth port (this is known as a partial mesh topology as defined in RFC 2285). All transmitter ports will offer two classes of traffic: A constant bit rate (CBR) stream of small VoIP frames, and a variable bit rate (VBR) stream of best-effort background traffic made up of large frames.

We use three ingress ports transmitting to one egress port to create congestion on the egress port (see Figure 10). We have deliberately created an overload pattern because we want to see how the DUT will protect CBR traffic when congestion exists. Absent any QoS configuration on the DUT, some frames will be dropped, since this is an overload condition. Further, short CBR frames may be queued behind longer best-effort frames, so latency will be higher for those frames that do get forwarded.

FIGURE 10: OVERLOAD PATTERN

With Spirent TestCenter, we can define different rates and burst sizes for different sets of streams, even on the same transmitting port, by using priority-based scheduling. Priority-based scheduling can be configured through the Traffic Wizard or on individual ports (see Figures 11 and 12).
FIGURE 11: PRIORITY BASED SCHEDULING WITH TRAFFIC WIZARD
FIGURE 12: PRIORITY BASED SCHEDULING IN PORT NODE

In this example, we will set a priority for each stream block (with 0 being the highest priority); a burst size (with 1 indicating a constant stream, and any other number indicating the number of frames in the burst); and a start delay (the interval after test start that this particular stream block should begin transmitting). For CBR traffic, we set a priority of 0 and a burst size of 1 with no delay. For VBR traffic, we set a priority of 1; a burst size of 1,000 frames; and delay of 300 bytes (this is the length of the 200-byte CBR traffic, plus 100 bytes for padding).

As a baseline, we begin this QoS test with a “before” picture to illustrate the effects of congestion without traffic prioritization enabled on the DUT. Here, we offer CBR and VBR traffic classes to a DUT with no prioritization enabled, and measure latency for each traffic group.

The results show delay is both high and variable for all traffic (see Figure 13). Note the average delay of more than 500 usec, which is very high for a 10G Ethernet DUT. Note also the wide variation between minimum and maximum delay, and further that delay is uniformly high for both CBR and VBR traffic. In this congested scenario, performance for CBR traffic is no better than for best-effort VBR traffic. Essentially, the delay measurements are simply a reflection of the buffer depth of the DUT.

FIGURE 13: DELAY FOR MULTIPLE TRAFFIC GROUPS WITH QOS DISABLED

Then we configure the DUT to apply traffic-shaping to VBR traffic, such that the DUT will only forward VBR traffic at 1 Gbit/s. With this restriction in place, delay for both CBR and VBR traffic is vastly improved (see Figure 14). Note here that average delay is still lower for CBR traffic, so there is no “head-of-line blocking” issue, where shorter CBR frames get queued behind longer VBR frames.
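
The observation that delay in the QoS-disabled baseline mostly reflects DUT buffer depth can be made concrete with a back-of-the-envelope calculation. This Python sketch assumes a single first-in, first-out egress queue that drains at the 10G line rate, which is a simplification and not a description of any particular DUT; the function names are illustrative.

```python
def queued_bytes(delay_s: float, drain_rate_bps: float) -> float:
    """Bytes ahead of a frame in a FIFO queue draining at drain_rate_bps."""
    return delay_s * drain_rate_bps / 8


def queuing_delay_us(buffer_bytes: float, drain_rate_bps: float) -> float:
    """Time, in microseconds, to drain buffer_bytes at drain_rate_bps."""
    return buffer_bytes * 8 / drain_rate_bps * 1e6


# A 500-usec delay on a 10G egress port implies this much queued data.
depth = queued_bytes(500e-6, 10e9)
print(f"implied queue depth: {depth:,.0f} bytes")
print(f"delay for that depth: {queuing_delay_us(depth, 10e9):.0f} usec")
```

Under these assumptions, an average delay of about 500 usec corresponds to several hundred kilobytes of frames waiting in the egress queue.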
FIGURE 14: DELAY FOR MULTIPLE TRAFFIC GROUPS WITH QOS ENABLED

From a measurement perspective, the key features are the ability to define multiple traffic groups per test port, each transmitting at a different rate, and then track delay for each group and individual streams. This makes it possible to see at a glance whether prioritization and traffic-shaping techniques are working as expected.

Example 3: Drilling Down on Delay

Sometimes post-test latency results may raise more questions than they answer. For example, if maximum latency is unexpectedly high, it may be helpful to know when during a given test iteration the high delay occurred: Was it just at the beginning of the test, at the end, or consistently over the entire test duration? And when maximum latency is much higher than average latency, is this the result of a few outlier frames, or does the DUT introduce high delay for significant amounts of traffic?

These and other questions can be answered with two analysis tools: Short-term average latency reporting and latency histograms. These tools can show, in real time if desired, how latency is distributed across time and across all frames received during a test.

FIGURE 15: SHORT-TERM AVERAGE LATENCY SELECTION

Spirent TestCenter tracks and displays short-term average latency in real time during a test. Updated once per second, the short-term average latency display shows delay for all streams belonging to a traffic group, or for all streams in the test. To enable the short-term average latency display, select “Change Result View” in one of the results windows, choose Streams, and then pick either Traffic Group Results or Detailed Stream Results (see Figure 15). The display will then include a column in results view titled “short-term average latency.”

As in the previous QoS example, we’ve defined two traffic groups, VBR and CBR, and selected the short-term average latency display for traffic groups.
We also have not configured any QoS policy on the DUT, so it’s very possible that VBR traffic will affect rates and latency of CBR traffic.

When streams in the bursty VBR group are active, as in this example, short-term average latency for CBR traffic is substantially higher than the average over the entire test duration (see Figure 16). We can see in real time that latency for CBR traffic jumps from 4 usec to nearly 31 usec when a burst of VBR traffic is active, nearly an eightfold increase in delay.

FIGURE 16: SHORT-TERM AVERAGE LATENCY, MULTIPLE TRAFFIC GROUPS

Once VBR traffic stops bursting, short-term average latency for CBR traffic is much lower, and also much closer to the average over the entire test duration (see Figure 17). This type of real-time reporting makes it easy for test engineers to determine at a glance how different traffic classes (or other events, such as DUT configuration changes) can affect latency.

FIGURE 17: SHORT-TERM AVERAGE LATENCY, SINGLE TRAFFIC GROUP

Latency histograms offer an even deeper view by showing the distribution of delay across all frames received. In Spirent TestCenter, histograms are configured on a per-port basis. Once histogram bucket values have been defined on one port, they can be copied to all other ports using the port copy wizard.

Let’s return to the QoS test in Example 2, where delay was very high with no prioritization enabled on the DUT (see Figure 13 above). In this case there was a wide spread between minimum and maximum delay values, which in turn raises questions about latency distribution. Were the high maximum values the result of just a few frames, or was delay uniformly high for all traffic? We can find out by defining histograms (latency buckets) that take the minimum and maximum values into account.
In setting up latency buckets, we always want the first and last buckets to be empty; that means there are no frames with delay outside the minimum and maximum range we’ve defined. How we distribute values between minimum and maximum is up to the tester. For a large spread between minimum and maximum values, an exponential or log scale is useful; for smaller spreads, a linear scale may be more appropriate. At Network Test, we typically use a log scale at first and then reapply a linear scale if needed.

In Spirent TestCenter, histograms are defined in the Traffic Analyzer node of each port (see Figure 18). Once histogram bucket values are defined for a given port, they can be applied to other ports using the port copy wizard. For Spirent’s HyperMetrics 10G Ethernet CV modules, the bucket values are entered in 10-ns units, reflecting the test instrument’s 10-ns timestamp resolution. Here, we use a log scale to represent values between the minimum and maximum delays we’ve observed.

FIGURE 18: HISTOGRAM DEFINITION

Next, we can change the real-time results view to stream block results or detailed stream results, and click on the histograms tab (see Figure 19). Note that the column headings reflect the histogram bucket values we configured in the previous step. Frame counters for each stream will increment as the test is run.

FIGURE 19: REAL-TIME HISTOGRAM RESULTS

In this case, note that relatively few frames (thousands) have delay of 370 usec or less, while the vast majority of frames (millions) have delay of between 370 and 574 usec. (The units used in this display are in 10-ns increments.) From this we can conclude that the high maximum delays observed are not an anomaly. The histograms show very clearly that a preponderance of frames were delayed by at least 370 usec.
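The bucket-planning approach described above (log-scale limits, values expressed in 10-ns units, empty first and last buckets) can be sketched in Python. This is a minimal illustration under stated assumptions, not the Spirent TestCenter configuration interface: the function names, the choice of inclusive upper limits, and the latency samples are all hypothetical.

```python
import bisect

def log_bucket_limits(low_ns, high_ns, n_limits, unit_ns=10):
    """Geometrically spaced upper bucket limits, expressed in unit_ns ticks."""
    ratio = (high_ns / low_ns) ** (1.0 / (n_limits - 1))
    return [round(low_ns * ratio ** i / unit_ns) for i in range(n_limits)]

def count_frames(latencies_ns, limits, unit_ns=10):
    """Count frames per bucket; bucket i holds latencies up to and including limits[i].

    The final bucket is an overflow bucket for anything above the last limit.
    """
    counts = [0] * (len(limits) + 1)
    for latency_ns in latencies_ns:
        ticks = latency_ns / unit_ns
        counts[bisect.bisect_left(limits, ticks)] += 1
    return counts

# Limits chosen to bracket the observed range: 1 usec up to 1 msec.
limits = log_bucket_limits(1_000, 1_000_000, n_limits=4)

# Made-up latency samples in nanoseconds.
latencies_ns = [5_000, 50_000, 370_000, 400_000, 574_000]
counts = count_frames(latencies_ns, limits)
print(limits)
print(counts)
```

If the first or last count is nonzero, the chosen range does not cover every frame, and the limits should be widened before drawing conclusions about the distribution.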
The same numbers are also available in the results database, viewable in Results Reporter (see Figure 20). This allows testers to analyze histograms post-test, and also permits export of the data to CSV or Excel files.

FIGURE 20: HISTOGRAMS IN RESULTS REPORTER

The ability to export histogram data (or virtually any of hundreds of other metrics in Results Reporter) makes it very easy to represent results in graphical form. A Microsoft Excel chart with histograms for CBR and VBR traffic makes it very clear that latency distribution is not uniform between minimum and maximum values; in fact, almost all measurements fall within a single bucket (see Figure 21).

FIGURE 21: LATENCY HISTOGRAMS EXPORTED FOR CHARTING

In uncongested devices, a bell-curve latency distribution is more common, with the greatest number of latency measurements in a center bucket and somewhat fewer measurements in buckets to the left and right of center. Spirent TestCenter has predefined profiles for latency histograms, including a center-weighted pattern for bell-curve distributions. This can be configured by choosing “center” distribution under predefined modes (see Figure 18 again).

Part III: Latency Questions for Your Test Equipment Vendor

Measurement of time-related metrics such as latency is a complex matter requiring great accuracy, precision, and standards-compliant test equipment. In selecting test equipment, here are a few questions to ask prospective suppliers:

  • Can I configure a test to measure latency and delay goal posts using any combination of first- and last- techniques (FIFO, FILO, LIFO and LILO)? Are these techniques available to me for every type of test I might run (throughput, latency, latency over time, and so on)?
  • Can I configure different traffic groups to generate streams at different rates from the same test interface, and then measure latency on a per-group and per-stream basis? How many individual streams and traffic groups can I set up per test interface?
  • Will I lose the ability to record other measurements such as frames in sequence or inter-arrival time if I enable advanced types of latency measurement?
  • What is the timestamp resolution for each type of interface used in the test instrument? Is the timestamp resolution sufficiently fine-grained for all technologies I may assess, up to and including 100G Ethernet?
  • Does the test instrument provide a choice between asynchronous clocking, as found in production Ethernet networks, and synchronous clocking required in some controlled lab settings?
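To put the timestamp-resolution question in concrete terms, the short Python sketch below compares a timestamp tick with the time a minimum-size Ethernet frame occupies on the wire. The frame overheads are the standard 8-byte preamble and 12-byte minimum interframe gap; the 10-ns tick is the figure this paper cites for the 10G modules, and applying it to a 100G link here is purely a hypothetical comparison, not a statement about any product. The function names are invented for the example.

```python
PREAMBLE_BYTES = 8   # preamble plus start-of-frame delimiter
MIN_IFG_BYTES = 12   # minimum interframe gap

def frame_slot_ns(frame_bytes, line_rate_bps):
    """Wire time in nanoseconds for one frame plus preamble and minimum gap."""
    bits = (frame_bytes + PREAMBLE_BYTES + MIN_IFG_BYTES) * 8
    return bits * 1e9 / line_rate_bps

def ticks_per_slot(frame_bytes, line_rate_bps, resolution_ns):
    """How many timestamp ticks fit into one frame slot."""
    return frame_slot_ns(frame_bytes, line_rate_bps) / resolution_ns

for name, rate_bps in (("10G", 10e9), ("100G", 100e9)):
    slot = frame_slot_ns(64, rate_bps)
    ticks = ticks_per_slot(64, rate_bps, resolution_ns=10.0)
    print(f"{name}: 64-byte frame slot = {slot:.2f} ns, 10-ns ticks per slot = {ticks:.3f}")
```

A 64-byte frame slot lasts 67.2 ns at 10 Gbit/s but only 6.72 ns at 100 Gbit/s, so a 10-ns tick would be longer than the frame slot itself at the higher rate. That is why resolution needs to be checked per interface type.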