The School for Sysadmins Who Can’t Timesync Good and Wanna Learn To Do Other Stuff Good Too, part 2 – how NTP works

(Part 1 covered the background and rationale.  Part 3 is about installation and configuration.)

What is NTP?

NTP (Network Time Protocol) is an Internet standard for time synchronisation covered by multiple RFCs.  “NTP is [arguably] the longest running, continuously operating, ubiquitously available protocol in the Internet” [Mills].  It has been operating since 1985, which is several years before Tim Berners-Lee invented the WWW.  The current version is NTPv4, described in RFC5905, which also covers SNTP (Simple NTP), a more limited version designed mostly for clients.

Whilst there are several different implementations of NTP, I’ll be focusing on the reference implementation from the Network Time Foundation, because that’s what I’m most familiar with, and because it has the most online reference material available.

How Linux keeps time

Linux and other Unix-like kernels maintain a system clock which is set at system boot time from a hardware real time clock (RTC), and is maintained by regular interrupts from a timing circuit, usually a crystal oscillator.

The kernel clock is maintained in UTC; time is counted as the number of seconds since midnight, 1 January 1970 UTC (the Unix epoch).  Applications can read the system clock via time(2), gettimeofday(2), and clock_gettime(2), which offer second, microsecond, and nanosecond resolution respectively.
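
If you’d like to see those interfaces in action, here’s a minimal C sketch (my own illustration, nothing more) that reads the system clock via all three calls:

/* clock_demo.c - read the system clock via the three interfaces
 * mentioned above.  Build with: cc -o clock_demo clock_demo.c */
#include <stdio.h>
#include <time.h>
#include <sys/time.h>

int main(void)
{
    /* time(2): whole seconds since the Unix epoch */
    time_t t = time(NULL);
    printf("time():          %ld s\n", (long)t);

    /* gettimeofday(2): microsecond resolution */
    struct timeval tv;
    gettimeofday(&tv, NULL);
    printf("gettimeofday():  %ld.%06ld s\n", (long)tv.tv_sec, (long)tv.tv_usec);

    /* clock_gettime(2): nanosecond resolution */
    struct timespec ts;
    clock_gettime(CLOCK_REALTIME, &ts);
    printf("clock_gettime(): %ld.%09ld s\n", (long)ts.tv_sec, (long)ts.tv_nsec);

    return 0;
}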

System calls are available to set the time if it needs to change (called “stepping” the clock), but the more commonly-used technique is to ask the kernel to adjust the system clock gradually via the adjtime(3) library function or adjtimex(2) system call (called “slewing” the clock).  Slewing ensures that the clock counter continues to increase rather than jumping suddenly (even if the clock needs to be adjusted backwards), by making slight changes in the length of seconds on the system clock.  If the clock needs to go forwards, the seconds are shortened (sped up) slightly until true time is reached; if the clock needs to go backwards, the seconds are lengthened (slowed down) slightly until true time catches up.  (There are other interesting timing functions supported by the Linux kernel; see the documentation for more.)
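
To make adjtimex(2) a little more concrete, here’s a small C sketch (again my own illustration, not code from ntpd) that performs a read-only query of the kernel clock discipline – safe to run unprivileged – with the slew request itself shown only in a comment, since it requires root and actually changes your clock:

/* timex_demo.c - query (and, commented out, slew) the Linux kernel
 * clock via adjtimex(2).  Build with: cc -o timex_demo timex_demo.c */
#include <stdio.h>
#include <sys/timex.h>

int main(void)
{
    struct timex tx = { 0 };        /* modes == 0 means "read only" */

    int state = adjtimex(&tx);      /* returns the current clock state */
    if (state == -1) {
        perror("adjtimex");
        return 1;
    }

    printf("clock state:     %d (0 == TIME_OK)\n", state);
    /* tx.freq is the frequency correction in ppm, scaled by 2^16 */
    printf("freq correction: %.3f ppm\n", tx.freq / 65536.0);
    printf("estimated error: %ld us\n", tx.esterror);

    /* To slew the clock (root only), you would instead set:
     *     tx.modes  = ADJ_OFFSET_SINGLESHOT;
     *     tx.offset = 500000;    <- gradually slew forwards by 0.5 s
     * and call adjtimex(&tx) again. */
    return 0;
}

(ntpd itself normally drives this same kernel interface, but uses the finer-grained frequency and phase-lock modes rather than one-shot slews.)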

Because oscillators are imperfect, system time is always out from UTC by some amount.  Better quality hardware stays within a very small offset from the true time (unnoticeable by humans), while cheap hardware can be out by quite significant amounts – a clock that drifts by 50 parts per million, for example, gains or loses about 4.3 seconds per day.  Clock accuracy is also affected by other factors such as temperature, humidity, and even system load.  NTP is designed to receive timing information from external sources and use clock slewing (or stepping, where necessary) to keep the system clock as close as possible to true UTC time.

How NTP works

The notion of one true time is central to how NTP operates, and the protocol includes numerous checks and balances designed to keep your system zeroing in on that one true time.  (For a more detailed and authoritative explanation, see Mills’ “Notes on setting up a NTP subnet”.)

Polling

The primary means NTP uses to determine the correct time is simply to ask for it!  An NTP server polls other NTP servers (on UDP port 123) or other time sources (more on this below) for their current time, measures how long the request takes to get there and back, and analyses the results to determine which sources represent the true time.  The polling process is very efficient and can support huge numbers of clients with a minimum of bandwidth.
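
The arithmetic behind each poll is worth seeing.  The client stamps the request as it leaves (call it t1); the server stamps it on arrival (t2) and stamps its reply on departure (t3); the client stamps the reply on arrival (t4).  Assuming the network path is symmetric, the round-trip delay and the clock offset fall straight out of those four timestamps.  Here’s a tiny C sketch with made-up numbers (the variable names are mine, not the reference implementation’s):

/* ntp_math.c - the offset/delay calculation at the heart of each
 * NTP poll, using the four standard timestamps:
 *   t1 = client transmit, t2 = server receive,
 *   t3 = server transmit, t4 = client receive.
 * Build with: cc -o ntp_math ntp_math.c */
#include <stdio.h>

int main(void)
{
    /* Example timestamps in seconds (made-up values: the server's
     * clock is 0.150 s ahead, network delay ~25 ms each way). */
    double t1 = 1000.000;   /* client sends request */
    double t2 = 1000.175;   /* server receives it   */
    double t3 = 1000.176;   /* server sends reply   */
    double t4 = 1000.051;   /* client receives reply */

    /* Round-trip delay: total time elapsed at the client, minus
     * the time the server spent processing the request. */
    double delay = (t4 - t1) - (t3 - t2);

    /* Offset: how far the local clock is behind (+) or ahead (-)
     * of the server, assuming symmetric network paths. */
    double offset = ((t2 - t1) + (t3 - t4)) / 2.0;

    printf("delay  = %.3f s\n", delay);   /* prints 0.050 */
    printf("offset = %.3f s\n", offset);  /* prints 0.150 */
    return 0;
}

Note that an error creeps in to the extent that the outbound and return paths are not equally fast, which is one reason nearby, low-latency servers give better results.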

An NTP poll happens at intervals ranging from 8 seconds to 36 hours (going up in powers of two), with 64 seconds to 1024 seconds being the default range.  (ntpd expresses these as power-of-two exponents: the defaults are minpoll 6 and maxpoll 10, i.e. 2^6 = 64 and 2^10 = 1024 seconds, within hard limits of 3 and 17.)  The NTP daemon will automatically adjust its polling interval for each source based on the previous responses it has received.  On most systems with a reliable clock and reliable time sources, poll times will settle on the maximum within a few hours of the NTP daemon being started.  Here’s an example from one of my systems:

$ ntpq -pn
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+172.22.254.1    172.22.254.53    2 u  255 1024  177    0.527    0.082   2.488
*172.22.254.53   .NMEA.           1 u   37   64  376    0.598    0.150   2.196
-192.189.54.17   130.95.179.80    2 u 1067 1024  377   44.964   -1.948   0.764
+192.189.54.33   130.95.179.80    2 u  101 1024  377   32.703   -1.666   8.223
+129.127.40.3    130.95.179.80    2 u  953 1024  377   55.609   -0.120   6.276
-2001:4478:fe00: 216.218.254.202  2 u   76 1024  377   35.971    4.814   1.848
-2001:67c:1560:8 17.253.34.125    2 u 1017 1024  377  376.041   -3.303   4.412
+162.213.34.249  17.253.34.253    2 u 1004 1024  377  325.680    1.469  38.157

The 6th column is the poll time, which is 1024 seconds for all but one of its peers.  (More on how to interpret the output of ntpq will come in a later post.)

Strata

So if your system gets its time from another system on the network, where does that system get its time?  NTP time is ultimately sourced from highly accurate external references like atomic clocks, some of which use the caesium atom (the basis of the definition of the standard second) as their reference.  Such time sources are expensive, so other sources are used as well, such as radio clocks, stable oscillators, or (perhaps most commonly) the GPS satellite system (which itself relies on atomic clocks).  These sources are collectively referred to as reference clocks.

In the NTP network, a reference clock is stratum 0 – that is, an authoritative source of time.  An NTP server which uses a stratum 0 clock as its time source is stratum 1.  Stratum 2 servers get their time from stratum 1 servers; stratum 3 servers get their time from stratum 2 servers, and so on.  In practice it’s rare to see servers higher than stratum 4 or 5 on the Internet [Mills] [Minar].

Stratum 1 servers are connected to their stratum 0 sources via local hardware such as a serial port or an expansion card.  The reason we have additional strata beyond stratum 1 is to ensure that there are enough servers to cope with the load from all the clients.  As far as possible, network delay (latency) between strata should be kept to a minimum.

Algorithms

NTP uses a number of different algorithms to ensure that the time it receives is accurate. [Mills]  Knowing how these algorithms work at a basic level can help us avoid configuration mistakes later, so we’ll look at them here briefly:

  1. filtering – The poll results from each time source are filtered in order to produce the most accurate results. [Mills]
  2. selection (a.k.a. intersection) – The results from all sources are compared to determine which ones can potentially represent the true time, and those which cannot (called falsetickers or falsechimers) are discarded from further calculations. [Mills]  (There’s a simplified sketch of this step just after this list.)
  3. clustering – The surviving time sources from the selection algorithm are combined using statistical techniques. [Mills]
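
To give a feel for the selection step, here’s a deliberately toy C sketch of the underlying idea.  (The real algorithm, which is based on Marzullo’s algorithm, is considerably more sophisticated; this is my own much-simplified illustration.)  Each source’s offset and error bound define an interval that should contain the true time; we look for the region where the most intervals overlap, and sources whose intervals miss that region are the falsetickers:

/* select_demo.c - a toy version of NTP's selection (intersection)
 * step.  Each source reports an offset and an error bound, giving
 * an interval [offset - error, offset + error] that should contain
 * the true time.  Build with: cc -o select_demo select_demo.c */
#include <stdio.h>
#include <stdlib.h>

struct edge { double x; int type; };   /* +1 = interval start, -1 = end */

static int cmp(const void *a, const void *b)
{
    const struct edge *ea = a, *eb = b;
    if (ea->x != eb->x) return ea->x < eb->x ? -1 : 1;
    return eb->type - ea->type;        /* starts sort before ends */
}

int main(void)
{
    /* Made-up offsets/errors (seconds): source 4 is a falseticker. */
    double offset[] = { 0.010, 0.012, 0.011, 0.500 };
    double error[]  = { 0.005, 0.004, 0.006, 0.003 };
    int n = 4, i, count = 0, best = 0;
    double best_lo = 0;

    struct edge *e = malloc(2 * n * sizeof *e);
    if (!e) return 1;
    for (i = 0; i < n; i++) {
        e[2*i]   = (struct edge){ offset[i] - error[i], +1 };
        e[2*i+1] = (struct edge){ offset[i] + error[i], -1 };
    }
    qsort(e, 2 * n, sizeof *e, cmp);

    /* Sweep left to right, tracking how many intervals cover each
     * region; remember the region with the deepest overlap. */
    for (i = 0; i < 2 * n; i++) {
        count += e[i].type;
        if (count > best) { best = count; best_lo = e[i].x; }
    }

    printf("%d of %d sources agree around offset %.3f s\n", best, n, best_lo);
    free(e);
    return 0;
}

Run against the made-up numbers in the sketch, it reports that 3 of 4 sources agree around an offset of 0.008 seconds, correctly excluding the falseticker.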

Read on in part 3 – installation and configuration, where we’ll explore how to install and configure NTP on an Ubuntu Linux 16.04 system.

The School for Sysadmins Who Can’t Timesync Good and Wanna Learn To Do Other Stuff Good Too, part 1 – the problem with NTP

(With apologies to Derek Zoolander and Justin Steven.  And to whoever had to touch the HP-UX NTP setup at Queensland Police after I left. And to anyone who prefers the American spelling “synchronization”.)

(This is the first of a series on NTP.  Part 2 is an overview of how NTP works.)

The problem with NTP

In my experience, Network Time Protocol (NTP) is one of the least well-understood of the fundamental Internet application-layer protocols, and very few IT professionals operate it effectively.  Part of the reason for this is that the documentation for NTP is highly technical and assumes a certain level of background knowledge.

I first encountered NTP more than 20 years ago, and my first efforts with it were an unmitigated disaster due to my ignorance of how the protocol was designed to function.  Since then virtually every IT environment I’ve encountered has had a less-than-optimal NTP setup.

I am still far from an expert on NTP, but I’ve learned quite a lot about operating it since my early days.  I hope this series of posts will help you develop a working knowledge of NTP faster and get the basics of NTP configuration right in your environment.

Why learn NTP?

Why bother learning this rather obscure corner of Internet lore?  I mean, the Internet mostly works, despite this alleged widespread lack of expertise in time sync, right?

Here are some of the reasons you might want to learn more about NTP:

  1. You run Ceph, MongoDB, Kerberos, or a similar distributed system, and you want it to actually work.
  2. You want your logs to match up across multiple systems, potentially on multiple continents.
  3. You like learning about new things and tinkering with embedded systems.
  4. You think bandwidth-efficient, high-precision time synchronisation is just a fun, nerdy problem.
  5. You think this is cool:

    A scenario where the latter behavior [the PPS driver disciplining the local clock in the absence of external sources] can be most useful is a planetary orbiter fleet, for instance in the vicinity of Mars, where contact between orbiters and Earth [is possible] only one or two times per Sol (Mars day). These orbiters have a precise timing reference based on an Ultra Stable Oscillator (USO) with accuracy in the order of a Cesium oscillator. A PPS signal is derived from the USO and can be disciplined from Earth on rare occasion or from another orbiter via NTP. In the above scenario the PPS signal disciplines the spacecraft clock between NTP updates.

    (Personally, they had me at “planetary orbiter fleet”. 🙂 )

Caveats

In this series, I’ll describe a few best practices for setting up NTP in a standard 64-bit Ubuntu Linux 16.04 LTS environment.  Bear in mind this quite limited scope; this advice will not apply in all circumstances and intentionally ignores the less common use cases.  Further caveats:

    1. I have no looks.
    2. I am not an expert.   My descriptions of the algorithms are based on the documentation and operational experience.  I’m not a member of the NTP project; I’ve never submitted a patch; I’ve never compiled ntpd from source (I hate reading & writing C/C++).
    3. I’ve only worked with the reference implementation of NTP, and only on Linux, with only one reference clock driver (NMEA), and a limited range of configuration options.
    4. I will be glossing over a lot of detail.  Sometimes it’s because I don’t think it’s necessary in order to work with NTP successfully; sometimes it’s because I haven’t looked into that particular corner and so I don’t understand it; sometimes it’s because I have looked into that particular corner and I still don’t understand it. 🙂  But mostly it’s because I’m attempting to keep this series accessible for those who are newcomers.  If you’re an experienced NTP operator, you probably won’t find much of interest (if anything) until later in the series.
    5. We won’t cover much history or theory of time sync in this series.  If you’d like to know a little more about that, check out Julien Goodwin‘s previous LCA & SLUG talks:

Read on in part 2 – how NTP works.

Email message size limits

Background

Prompted by a request from staff at a client’s head office, a couple of days ago I posed this question to a couple of the mailing lists I’m on: what is your size limit on individual email messages?

I was blown away by the speed, quantity, and quality of the responses I received from the AusNOG and SAGE-AU communities.  Within an hour I had some hard data and a useful recommendation to take to my client.

Results

I’ve published the statistics and the raw figures to separate sheets in the same Google Docs workbook; a few explanatory comments about the results are necessary:

  • A number of responses indicated two values, often broken down by receive/send or internal/external criteria (with the latter being the smaller).  This is indicated as “Tier 1” vs. “Tier 2” in the raw results.  I’ve used the “Tier 1” figure to calculate the results.
  • Answers which were ambiguous or indicated no limit were not included in the calculations, nor was one answer of 5 GB, since it skewed the results unrealistically.

Statistics[1]

  • Number of responses: 64
  • Number of numerically quantifiable responses: 57
  • Mean: 30.105 MB
  • Median: 25 MB
  • Mode: 20 MB
  • Standard Deviation: 20.929 MB

Bottom line

I’d say that anyone using something in the range of 10 – 50 MB could consider themselves reasonably “normal”; both those figures are within one standard deviation of the mean (30.105 ± 20.929 MB gives a range of roughly 9.2 to 51 MB).

Commentary

Here are some of the more interesting comments I received, along with the size each respondent indicated.  In most cases these are direct quotes, but I’ve edited them for spelling, clarity, and punctuation where necessary.  I’ve highlighted two responses that I found striking, given their closeness to the actual results.  (I also suggest reading the AusNOG discussion – both threads – as some excellent points were made.)

  • 8 MB – If people need to send more than that, email is the wrong answer.
  • 15 MB – We’ve found in the past that increasing above 15 MB resulted in a large number of bounce-backs, with organizations rejecting messages that were too large.  The biggest issue we have is explaining this to our customers and them believing it, mainly because they don’t understand that a simple 8 MB JPEG can blow out to 20–25 MB because of MIME encoding etc.  We try our best to advise them of this, but we do get quite a lot of arguing and feedback requesting we increase it anyway.  However, slowly they’re realizing: when their large messages start bouncing back, they ask us to set the limit back to what it was before.
  • 25 MB – I imagine a general consensus will be a 25 MB upper limit due to Google Apps.
  • 25 MB – Most of my clients have gone Google Apps.
  • 30 MB – Our general view is that if a limitation is lower than what a customer gets on Gmail (which is currently 25 MB) and related free services, then you will need to support at least that limit.  A limit of 30 MB doesn’t have to be in place long before users actively notice that it is higher than what is typical elsewhere, and start talking about how good their system is.  Non-technical higher-ups will struggle with paying for a business service that offers less than their personal accounts.
  • 30 MB – Microsoft did a risk assessment for us and noted that having large message sizes and large mailbox sizes (10 GB to 60 GB) is a high risk.
  • 40 MB – … and we still get complaints.
  • 50 MB – People still run into [our limit.  We] had ‘someone’s IT guy’ tell us the ‘industry standard’ was 10 MB.  I expect you’re getting a wide range of answers, and that there really isn’t an ‘industry standard’.
  • 50 MB – Unfortunately, I still get called every time an email bounces due to remote size limits.
  • 100 MB – We didn’t see any notable impact because of this change [to 100 MB]: no delays, additional load, or problems caused by the larger emails.  (Note: these clients had 20, 50, 100 Mbps or faster Internet pipes.)
  • 5 MB (proposed) – I’m actually looking at reducing email size limits to force users into using technologies designed for file sharing and governance – SharePoint, SkyDrive Pro, etc.  Reducing limits to 5 MB has all sorts of flow-on effects, not even talking about freeing up link bandwidth, Exchange store sizes, etc.  I’ve found that email enables poor habits.  Emailing a 10 MB doc to the user two rooms down via a hosted Exchange?  It floods the link twice, plus stores the attachment in your local OST, the recipient’s local OST, and two copies in the Exchange store.  Now modify it and send it back.  Ouch.
  • 20 MB (estimate) – If I had to pick a single size that’s used, it would probably be 20 MB – but there’s no end of variation.  10 MB is common, although mainly for historic reasons, and the number of people with such a low limit is dropping.  25 MB and even 50 MB aren’t uncommon.  100 MB is rare, but out there – mainly in situations where mail is being sent to a specific recipient and they have also upped their [overall] limit.  I’ve even seen one company who wanted their limit set to 1 GB…
  • unlimited/10 MB – I cannot express enough the frustration in a customer saying they want to send a bigger email and wanting us to up our limit; explaining the Internet is just too hard a task sometimes.  In one specific case it was an 11 MB email, and the customer’s response was “It’s only an extra 1 MB – can you just let it through this once?”, so I pointed him to an SMTP server with no limit on it; the next day he was forwarding a bounce from the receiving end, who had blocked him based on size.

Decision

For those who are interested in the decision: my client and I were both previously part of the “10 MB is the industry standard” camp, but we found the argument about Gmail compatibility compelling, and have decided to increase to 25 MB, much to the delight of the staff member pushing for the change.

Notes

  1. Disclaimer: I am not a statistician; this is not a scientifically- or statistically-valid survey; all online polls are inherently bogus due to the respondents self-selecting; I have no idea whether this sample is statistically significant or valid; I did not attempt to authenticate or validate the responses in any way; YMMV; no warranties expressed or implied, etc.



When (Windows) software updates go awry

One of my clients had some very interesting Internet traffic statistics last week.  We came in Thursday morning and found that overnight we had downloaded over 700 GB of data from our ISP (UQ SchoolsNet).

[Figure: traffic graph from last week]

When we looked through our proxy server logs we found that 538 GB of the total came from a single PC attempting to download a single URL for Adobe Acrobat Reader 9.2 updates.  Fortunately, we’re on an Internet plan which is capped rather than charged for excess traffic, and more fortunately still, our ISP hosts an Akamai mirror, which is where the URL for the updates resolved to.  So, no harm done.

What this reinforced to me was that allowing direct access to the Internet by PCs is rather irresponsible, both from a bandwidth utilization perspective and a cost perspective.  (And that doesn’t even take into account what legal ramifications there might have been if it had been a BitTorrent client rather than a misconfigured/buggy software update client.)



Spam insights from Project Honeypot

Project Honeypot just published a report of their experience in processing 1 billion spam messages.  Highlights for the impatient:

  • For the past 5 years, spam “bots have grown at a compound annual growth rate of more than 378%. In other words, the number of bots has nearly quadrupled every year.”
  • The top 5 countries which host bots are: China (11.4%), Brazil (9.2%), United States (7.5%), Turkey (6.3%), and Germany (6.0%).
  • Top 5 countries with the best ratio of security professionals to spam sources: Finland, Canada, Belgium, Australia (yay!), and the Netherlands.
  • The corresponding bottom 5: China, Azerbaijan, South Korea, Colombia, and Macedonia.
  • Top 5 spam-harvesting countries: United States, Spain, the Netherlands, United Arab Emirates, and Hong Kong.
  • Fraud is rising as a cause for spamming:

    On the other hand “Fraud” spammers — those committing phishing or so-called “419” advanced fee scams — tend to send to and discard harvested addresses almost immediately. The increased average speed of spammers appears to be mostly attributable to the rise in spam as a vehicle for fraud rather than an increasing efficiency among traditional product spammers.

    As an anecdote to reinforce this, on one site I administer, I set up a dedicated subdomain which was purely designed to catch spam.  I placed some addresses in that domain on a web page, and within 1 day they had been harvested and 1 spam had been sent to each email address.  No email to that subdomain has been seen since.

Check out Project Honeypot’s full analysis.

Source: libertysys.com.au