NTP Simulator

The programs and data in this distribution are designed to evaluate the
performance of the engineered algorithms used by the Network Time
Protocol (NTP) Version 4 and, in particular, to aid in establishing
optimum architecture constants for these algorithms. The file ntpsim.c
is a portable C program designed to faithfully simulate the clock
filtering, selection and discipline algorithms. It compiles and runs in
Unix and Windows environments with generic C or C++ compilers. The
program reads history files in two formats created using the "filegen"
facility of the xntpd daemon for Unix and Windows/NT. It then simulates
the behavior of the NTP algorithms and produces traces and summary
statistics as directed. The compressed tar archive ntpsim.tar.Z
containing this distribution, as well as the compressed tar archive
allan.tar.Z describing how to determine computer clock stability, is
available at the location(s) given in the NTP web page
www.eecis.udel.edu/~ntp.

The program operates in one of four modes, as selected by a command-line
option (command-line options are not available when compiled for
Windows). The default mode 0 uses the "rawstats" file produced by xntpd
from the four timestamps determined at the transmit and receive times of
the outbound and return messages at each NTP measurement round. The
timestamps are determined before processing by the engineered algorithms
defined in the specification. Mode 1 uses the "loopstats" file produced
by the daemon from the final corrections used to adjust the local clock.
The formats of the rawstats and loopstats files are described in the
comments in the program text.
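
As an illustration of what the four timestamps of a measurement round
provide, the following minimal sketch computes the clock offset and
roundtrip delay using the standard NTP on-wire formulas. The structure
and function names are hypothetical and are not taken from ntpsim.c;
whether the simulator organizes the calculation this way is an
assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Four timestamps of one NTP measurement round, in seconds.
 * Names are illustrative only, not taken from ntpsim.c. */
struct ntp_round {
    double t1;  /* transmit time of the outbound message (local clock) */
    double t2;  /* receive time of the outbound message (peer clock) */
    double t3;  /* transmit time of the return message (peer clock) */
    double t4;  /* receive time of the return message (local clock) */
};

/* Offset of the peer clock relative to the local clock. */
double ntp_offset(const struct ntp_round *r)
{
    assert(r != NULL);
    return ((r->t2 - r->t1) + (r->t3 - r->t4)) / 2.0;
}

/* Roundtrip delay, excluding the time the peer held the message. */
double ntp_delay(const struct ntp_round *r)
{
    assert(r != NULL);
    return (r->t4 - r->t1) - (r->t3 - r->t2);
}
```

For example, timestamps 10.0, 10.5, 10.75 and 11.0 give an offset of
0.125 s and a delay of 0.75 s.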

Modes 2 and 3 use synthetic data to generate random phase and frequency
variations characteristic of typical configurations. Mode 2 is used to
produce the actual simulation, while mode 3 is used to generate files
which are later processed by a Matlab program in the allan.tar.Z
distribution to verify the generators faithfully replicate the actual
statistics. The generators can also be used in modes 0 and 1 to
introduce synthetic phase and frequency variations in addition to the
data computed from the data files. The phase variations are modelled as
a Gaussian distribution with specified standard deviation, which is
intended to simulate typical network paths and operating system
latencies. The frequency variations are modelled as a random walk using
a Gaussian distribution with specified standard deviation. This is
intended to simulate typical computer clock oscillators, which are
generally not stabilized in any real way.

Example rawstats and loopstats files are included for testing and
evaluation. The rawstats data were collected over a ten-day period
involving NTP primary server pogo.udel.edu and about two dozen NTP
primary servers located in the US, Europe, Asia and South America. At
the present time, they probably represent the most extreme cases of
dispersive network delays and congestion on existing Internet paths.
Since these data involve only primary servers, which are controlled by
external means, frequency errors should be very small. These data are
most useful in evaluating phase-lock loop (PLL) clock discipline
schemes.

The loopstats data were collected over about 28 days using a SPARC IPC
with free-running local clock compared to a precision pulse-per-second
(PPS) signal using the ppsclock line discipline. These data demonstrate
clock oscillator variations due to temperature changes, etc., but have
very low phase variations. These data are most useful in evaluating
frequency-lock loop (FLL) clock discipline schemes.

In both the rawstats and loopstats cases, the initial phase and
frequency error can be specified with the -T and -F command-line
options, respectively. In the case of rawstats data, the selection of
which peers to use in the simulation is determined by two command-line
options. The -l option suppresses peers on the same IP network as the
host generating the rawstats data. The -r option adds an IP address to a
restriction list. As each rawstats sample is processed, the source
address is compared to each entry in the list. If the source address is
not in the list, the sample is discarded. If no -r options are present,
all peers are used, except those excluded by the -l option. In any case,
updates from a local discipline source, such as a pulse-per-second (PPS)
signal, are suppressed.
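
The restriction-list rule described above can be sketched as follows.
The names and the fixed list capacity are hypothetical, addresses are
compared as dotted-quad strings for simplicity, and the -l option and
the suppression of local discipline sources are left out.

```c
#include <assert.h>
#include <string.h>

#define MAX_RESTRICT 32     /* illustrative capacity, not from ntpsim.c */

/* Addresses collected from -r options (illustrative structure). */
struct restrict_list {
    const char *addr[MAX_RESTRICT];
    int count;
};

/* Add one address; return 0 on success, -1 if the list is full. */
int restrict_add(struct restrict_list *rl, const char *addr)
{
    assert(rl != NULL && addr != NULL);
    if (rl->count >= MAX_RESTRICT)
        return -1;
    rl->addr[rl->count++] = addr;
    return 0;
}

/* Return 1 if a sample from src should be used, 0 if discarded.
 * An empty list means no -r options were given, so all peers pass. */
int sample_allowed(const struct restrict_list *rl, const char *src)
{
    int i;

    assert(rl != NULL && src != NULL);
    if (rl->count == 0)
        return 1;
    for (i = 0; i < rl->count; i++) {
        if (strcmp(rl->addr[i], src) == 0)
            return 1;
    }
    return 0;
}
```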

The program produces output in three formats. The default format
includes variables useful for processing by various statistics and
plotting packages, such as S and Matlab. The alternate debug format
consists of a trace which gives details of the various simulated events,
as well as the values of state variables at each local clock update. A
sample of a typical trace is as follows:

7 22059.554 192.5.41.40 3 ff   88 1024  0.562  0.076  0.667  1.454  10
7 22995.567 192.5.41.40 7 source outlyer 0.014439 0.002824 0.000624
7 22995.567 18.145.0.30 3 new clock source 0.000624 0.000568 0.000624
7 22995.567 18.145.0.30 3 ff  936 1024  0.531  0.076  1.193  1.467  20
7 23868.680 frequency 0.005
7 24019.620 18.145.0.30 3 spike -0.000584 0.023787 0.001193
7 24403.747 frequency -0.032
7 25043.573 18.145.0.30 3 ff 2048 1024  0.652  0.050  5.838  1.466  30

The first number on each line is the day number following day 0 which
begins the trace, while the second is the seconds and fraction past
midnight of that day. For all but frequency changes, the third field is
the IP address of the currently selected peer, while the fourth is the
number of peers surviving the selection and clustering algorithms. As
per specification, these are combined in order to generate the actual
clock update. For frequency changes, the value following the "frequency"
string is in parts-per-million (PPM). The above trace shows a source
change due to the current source being discarded by the clustering
algorithm, followed by a switch to a new source. Later a spike was
detected in an update, which was then discarded. As the example rawstats
file contains a relatively large number of peers, most with large
dispersive delays, the example data shows a rather large number of these
events.

If the remainder of the line consists of an alphabetic string followed by
other data, the line is one of many informative messages about events
internal to the simulator. The best way to decode these is to grep the
program source and read the comments in the text. The remaining fields
represent the actual clock update. The first field following the number
of survivors is the reachability register for the given peer in hex
format followed by the actual interval since the previous update
followed by the poll interval determined by the local clock discipline.

Four numbers follow the poll interval. The first three show the local
clock error, current frequency estimate and total (filter plus select)
dispersion, all in milliseconds. The fourth shows the standard deviation
of the first-order frequency differences, originally intended as an aid
in the local clock algorithm. The last number is the poll-update
counter, which is used by the algorithm which determines the poll
interval. See the program text for an explanation of how that algorithm
works.
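
For post-processing, a clock update line of the debug trace can be
picked apart with sscanf(), as in the following sketch. The structure
and function names are hypothetical, and the field layout is inferred
only from the sample trace above. Event and frequency lines do not
match the full pattern and are rejected.

```c
#include <assert.h>
#include <stdio.h>

/* Fields of one clock update line (names are illustrative). */
struct clock_update {
    int day;                /* day number since the start of the trace */
    double sec;             /* seconds past midnight of that day */
    char peer[46];          /* address of the selected peer */
    int survivors;          /* peers surviving selection and clustering */
    unsigned int reach;     /* reachability register (hex in the trace) */
    int interval;           /* seconds since the previous update */
    int poll;               /* poll interval in seconds */
    double offset;          /* local clock error */
    double freq;            /* current frequency estimate */
    double disp;            /* total dispersion */
    double freq_sd;         /* std. dev. of frequency differences */
    int poll_count;         /* poll-update counter */
};

/* Return 1 if line is a clock update and was parsed, else 0. */
int parse_update(const char *line, struct clock_update *u)
{
    int n;

    assert(line != NULL && u != NULL);
    n = sscanf(line, "%d %lf %45s %d %x %d %d %lf %lf %lf %lf %d",
               &u->day, &u->sec, u->peer, &u->survivors, &u->reach,
               &u->interval, &u->poll, &u->offset, &u->freq,
               &u->disp, &u->freq_sd, &u->poll_count);
    return n == 12;
}

/* Seconds elapsed since midnight of day 0 of the trace. */
double trace_time(const struct clock_update *u)
{
    assert(u != NULL);
    return u->day * 86400.0 + u->sec;
}
```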

When the data file is completely processed, the program produces summary
statistics similar to:

ID IP Src          IP Dst      Samples       Mean     StdDev        Max
   Local Clock                    1277      0.258      1.444      4.850
 0 192.43.244.18   128.4.1.20     1497     -5.338      8.105     31.780
 1 129.132.2.21    128.4.1.20     1473      2.750     80.690   1110.294
 2 192.36.143.150  128.4.1.20     1560      0.056      4.625     44.778
 3 131.107.1.10    128.4.1.20     1503     47.949    129.488    321.024
 4 18.145.0.30     128.4.1.20     1557      0.050      3.735     23.666
 5 128.252.19.1    128.4.1.20     1536      1.061      4.450     32.398
 6 204.123.2.5     128.4.1.20     1535     -7.566      9.098     50.951
 7 192.5.5.245     128.4.1.20     1531      0.362      8.194     64.464
 8 128.115.14.97   128.4.1.20     1469    -38.083     24.929     81.723
 9 192.67.12.101   128.4.1.20     1464     24.400     64.070    414.240
10 128.250.36.2    128.4.1.20     1505    -17.308    765.031  14874.870
11 133.100.9.2     128.4.1.20     1362     -2.671      5.744     36.539
12 192.5.41.40     128.4.1.20     1520      0.019      4.643     28.403
13 204.34.198.41   128.4.1.20     1400     -9.291      7.309     41.659
14 131.188.2.75    128.4.1.20     1488     17.104     16.494     72.309
15 129.20.128.2    128.4.1.20     1521     13.000     23.308     65.790
16 193.204.114.1   128.4.1.20     1515     19.761     29.983     86.775
17 132.163.135.130 128.4.1.20     1525    -10.856      8.153     34.198
18 146.83.8.200    128.4.1.20     1366    -18.543     34.829    120.607

The mean, standard deviation and maximum are all in milliseconds. The
first line after the header line represents the actual local clock, with
the mean relative to the actual time. In other words, if this were a
real scenario and the local clock was controlled by the given peers
using the same algorithms, the local clock would have a mean error of
0.258 ms relative to the actual time. The remaining lines represent the
individual peer data as collected and displayed with the ntpq program in
real life. Obviously, some of these critters are doubtful as providers
of precision time, but these are real data for the real Internet where
congestion is a fact of life. The results invite the conclusion that the
algorithms are doing an excellent job under very demanding conditions.

When automatic poll-interval adjustment is in effect, the summary
information includes a table showing for each value of poll interval the
number of polls at that interval and the total time (seconds) spent at
that value. A summary line shows the number of clock updates in
phase-lock mode, the number of clock updates in frequency-lock mode, the
number of spikes discarded and the number of step phase-change
adjustments.

The third output format is produced only in mode 3. It is designed for
processing by Matlab programs which construct an Allan deviation plot
used to verify the random generators operate as intended. These
programs, along with data files and documentation, are in the
allan.tar.Z distribution mentioned at the beginning of this note.

Command-Line Format

There is one optional argument, which is the filename to use for input.

The options, which work only in Unix, are as follows:

-d             Select debug output format.

-D             Set maximum number of days in simulation run, with
               default 30.

-f<filename>   Read input from specified file.

-F<frequency>  Set the initial frequency offset, with default 0.

-i<mode>       Select input mode
               0 rawstats
               1 loopstats
               2 synthetic phase and frequency variations
               3 data for Allan deviation

-l             Suppress peers on the same LAN as the host generating the
               rawstats data.

-m<interval>   Set minimum poll interval specified in log2 units from 4
               (16 s) to 14 (16384 s), with default 6 (64 s).

-M<interval>   Set maximum poll interval specified in log2 units from 4
               (16 s) to 14 (16384 s), with default 10 (1024 s).

-p<parameter>  Set the phase noise parameter, represented as the
               standard deviation of a Gaussian distribution, with
               default 0.

-P<interval>   Set maximum phase-lock mode interval specified in log2
               units from 4 (16 s) to 14 (16384 s), with default 11
               (2048 s).

-r<address>    Add an IP address to the restriction list.

-T<time>       Set the initial time offset, with default 0.

-w<parameter>  Set the frequency noise parameter, represented as the
               standard deviation of a random-walk Gaussian
               distribution, with default 0.
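
The log2 units used by the -m, -M and -P options convert to seconds as
a power of two. The following sketch shows the conversion with the
range check implied by the limits above, along with a helper that
keeps a poll interval between a minimum and a maximum. The function
and macro names are hypothetical and are not taken from ntpsim.c.

```c
#include <assert.h>

#define POLL_LOG2_MIN 4     /* 16 s */
#define POLL_LOG2_MAX 14    /* 16384 s */

/* Convert an interval in log2 units to seconds.
 * Returns -1 if the value is outside the range 4 to 14. */
long poll_seconds(int log2_interval)
{
    if (log2_interval < POLL_LOG2_MIN || log2_interval > POLL_LOG2_MAX)
        return -1L;
    return 1L << log2_interval;
}

/* Keep a log2 poll interval within [lo, hi], as with -m and -M. */
int poll_clamp(int log2_interval, int lo, int hi)
{
    assert(lo <= hi);
    if (log2_interval < lo)
        return lo;
    if (log2_interval > hi)
        return hi;
    return log2_interval;
}
```

For example, the default minimum of 6 corresponds to 64 s and the
default maximum of 10 corresponds to 1024 s.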

Files included

README         this file
ntpsim.c       program source
loopstats      example loopstats file
rawstats       example rawstats file

Dave Mills
University of Delaware
mills@udel.edu
www.eecis.udel.edu/~mills
25 January 1997
