Distributions

The NTP pool system

September 21, 2016

This article was contributed by Tom Yates

NTP, the Network Time Protocol, quietly and without much fuss performs the critical internet function of knowing the correct time. Using it, a computer with imperfect communications links may join a distributed community of servers, each of which is either directly attached to a reliable clock, or is trying to best synchronize its clock to one or more better-synchronized members of the community. The NTP pool system has arisen as a method of providing such a community to the internet; it works well, but is not without its challenges.

NTP is quite complex in design; see these slides [PDF] for some details. Local and remote timestamps, taken on transmission and reception over a pair of packet exchanges, are compared iteratively to estimate and correct for network latencies, though the inconstant nature of these latencies produces a floor below which errors cannot, in practice, be corrected. As the protocol's author notes [PDF], a huff-and-puff algorithm corrects for large outliers and asymmetric delays, and a popcorn spike suppressor clips noise spikes.
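
As a rough illustration of the timestamp comparison described above, the following Python sketch applies the standard NTP on-wire calculation to the four timestamps from a single exchange. The `Exchange` class and `offset_and_delay` function are names invented here for illustration, and the sketch leaves out the filtering (huff-and-puff, popcorn spike suppression) that a real implementation layers on top.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Exchange:
    """Four timestamps, in seconds, from one client/server packet exchange."""
    t1: float  # client clock when the request was sent
    t2: float  # server clock when the request arrived
    t3: float  # server clock when the reply was sent
    t4: float  # client clock when the reply arrived


def offset_and_delay(x: Exchange) -> Tuple[float, float]:
    """Return (estimated client clock offset, round-trip network delay)."""
    # Offset: average of the apparent clock differences in each direction.
    offset = ((x.t2 - x.t1) + (x.t3 - x.t4)) / 2
    # Delay: total elapsed time minus the time the server spent processing.
    delay = (x.t4 - x.t1) - (x.t3 - x.t2)
    return offset, delay
```

With a client clock running half a second slow and 20ms of latency in each direction, `offset_and_delay(Exchange(100.0, 100.52, 100.521, 100.041))` gives an offset of about 0.5 seconds and a round-trip delay of about 0.04 seconds. The offset estimate assumes that the outbound and return latencies are equal; any asymmetry shows up as error, bounded by half the round-trip delay, which is one way to see where the floor comes from.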

NTP is fairly simple, however, in operation. Timestamps are exchanged via UDP, with the service being assigned port number 123. The protocol is hierarchical, with servers being classified into strata according to how many server hops lie between them and a reliable clock. The reliable clocks themselves are at the theoretical stratum of 0 (no server participating in the protocol should ever advertise itself as being at stratum 0). An NTP server whose current time comes from a directly attached reliable clock is at stratum 1, the highest stratum that should actually be seen on the internet. Servers that take their current time from a stratum 1 server are at stratum 2, and so on down to stratum 16, which equates to "not synchronized".
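
To make the operational side concrete, here is a minimal client sketch in Python. It is not how ntpd or any other real client works; it sends one 48-byte request to UDP port 123 and reads the stratum and transmit timestamp out of the reply, using the NTP packet header layout. The function names are invented for illustration, and a single unfiltered query like this is only suitable for experimentation.

```python
import socket
import struct

NTP_PORT = 123               # UDP port assigned to NTP
NTP_UNIX_DELTA = 2208988800  # seconds from 1900-01-01 (NTP epoch) to 1970-01-01


def build_request() -> bytes:
    """Build a 48-byte client request: leap=0, version=3, mode=3 (client)."""
    first_byte = (0 << 6) | (3 << 3) | 3
    return bytes([first_byte]) + bytes(47)


def parse_reply(packet: bytes):
    """Return (stratum, transmit time in Unix seconds) from a server reply."""
    if len(packet) < 48:
        raise ValueError("NTP packet too short")
    stratum = packet[1]  # second byte of the header
    # Transmit timestamp: 32-bit seconds plus 32-bit fraction, big-endian.
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return stratum, seconds - NTP_UNIX_DELTA + fraction / 2**32


def query(host: str, timeout: float = 2.0):
    """Send one request over UDP and parse the reply (needs network access)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_request(), (host, NTP_PORT))
        packet, _ = sock.recvfrom(512)
    return parse_reply(packet)
```

Calling `query()` with the hostname of a public NTP server returns that server's advertised stratum along with its idea of the current time.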

The protocol has been in continuous use since 1985, making it one of the oldest internet protocols still in use. Back then, the business of finding higher-stratum servers to bind to was done by consulting the list of public NTP servers, which was maintained by the protocol's author, David Mills, on his University of Delaware homepage. You picked a couple of likely-looking servers whose access policies allowed you to use them, dropped their admins an email if they'd asked for it, and put them in your ntp.conf file. (I am old enough to be quietly nostalgic for the days when a three-line text file was enough to configure any reliable service.)

This informal approach worked well while there were enough high-stratum servers willing to support the rest of the internet population. That stopped being the case in 2003, when Mills posted to the comp.protocols.time.ntp newsgroup that he'd removed from the public lists the servers of a national time standards laboratory, at their request, following repeated violations of their declared access policy.

Pooling our efforts

Following some discussion, Adrian von Bidder proposed, then implemented, an approach by which clients would configure the same small number of generic time sources by hostname. The original suggestions were left, right, and center.time.fortytwo.ch, the left, right, and center having no meaning other than to enable multiple servers to be configured without repeating a fully-qualified domain name. Through the use of round-robin DNS, a relatively large number of servers would combine to provide the underlying service. Within 24 hours, a refinement was suggested: incorporating geographical (continent or country) information in the hostnames to assist clients that wished to use servers likely to be fewer network hops away. Later that day, the suggestion was made to change the domain name being used to pool.ntp.org.

At this point, the skeleton of today's NTP pool service, which many Linux distributions come ready to use out of the box, was clearly visible. Modern pool hostnames, which look like 0.pool.ntp.org, 1.europe.pool.ntp.org, and 2.ar.pool.ntp.org, can clearly be seen foreshadowed in the previous discussion (again, the 0, 1, and 2 are meaningless placeholders to allow multiple servers to be drawn from the same zone without repeating a hostname). Later additions included a backend monitoring system, which continuously checks all pool servers to verify that they advertise good time and removes faulty servers from the pool for as long as they do not. The project also developed a DNS server that allowed for the weighting of servers in creating responses to queries of the pool zone, so that servers that had better internet connections could be returned more often than those with poorer connections.
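
The combination of monitoring and weighted DNS responses can be sketched in a few lines of Python. This is purely illustrative: the pool's actual DNS server and its selection algorithm are not described here, so the `PoolServer` fields, the `pick_answers` function, and the weight values are all assumptions, and the addresses are taken from the ranges reserved for documentation.

```python
import random
from dataclasses import dataclass


@dataclass
class PoolServer:
    address: str
    weight: int    # hypothetical: larger means a better-connected server
    healthy: bool  # hypothetical: maintained by the monitoring system


def pick_answers(servers, count=4, rng=random):
    """Choose up to `count` distinct healthy servers, favoring higher weights."""
    candidates = [s for s in servers if s.healthy and s.weight > 0]
    chosen = []
    while candidates and len(chosen) < count:
        weights = [s.weight for s in candidates]
        pick = rng.choices(candidates, weights=weights)[0]
        chosen.append(pick.address)
        candidates.remove(pick)  # never return the same server twice
    return chosen


zone = [
    PoolServer("192.0.2.1", 1000, True),
    PoolServer("192.0.2.2", 10, True),
    PoolServer("192.0.2.3", 500, False),  # failed monitoring, never returned
    PoolServer("192.0.2.4", 100, True),
]
```

Repeated calls to `pick_answers(zone, count=2)` return the first server far more often than the second, and the third not at all until it is marked healthy again.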

In 2005, von Bidder relinquished the reins of the project. They were capably taken up by Ask Bjørn Hansen, who remains the project's lead developer (currently assisted by, amongst others, Guillaume Filion, Arnold Schekkerman, and John Winters). One major change since then was the addition in 2006 of the vendor pool; projects and commercial vendors that wish to ship systems preconfigured to use the NTP pool as a time source are strictly forbidden from using simple pool hostnames. Instead, they must ask the project for a zone dedicated to them (e.g. debian.pool.ntp.org, centos.pool.ntp.org, or linksys.pool.ntp.org), which they are allowed to configure into software and hardware they ship. The main technical reason for this is to enable a quick response when a vendor ships devices that, accidentally or otherwise, start to abuse the pool servers; the project notes that an accompanying procedural benefit is to have a process in place to talk to people who are going to use the pool at larger scale, to make sure they do so responsibly. Free-software organizations needing pool zones are requested to make a reference in their configuration file or documentation encouraging people to join the pool; the following example comes from the CentOS 7 config file, as installed:

    # Use public servers from the pool.ntp.org project.
    # Please consider joining the pool (http://www.pool.ntp.org/join.html).
    server 0.centos.pool.ntp.org iburst
    server 1.centos.pool.ntp.org iburst
    server 2.centos.pool.ntp.org iburst
    server 3.centos.pool.ntp.org iburst

Commercial or closed-source organizations are asked to make a financial contribution to the project, and possibly to add a few servers to the pool as well.

How are we doing for time?

One could argue that the public NTP service was saved from being a victim of its own success by two factors: firstly, the increasing affordability of GPS-driven time sources, which allow a stratum 1 server to be built without the need for an immediately-adjacent cesium beam clock or an expensive radio time signal receiver, and secondly, the creation of the NTP pool system. At the time of writing, the pool contains about 3,700 server addresses, of which about 30% are IPv6 addresses. Slightly more than two-thirds of the pool servers are in Europe, and most of the rest are in North America; this information can be found in real time, and in much more detail, on the pool's website. The most common stratum for a pool server is 2, which contains about 70% of the pool's servers; less than 10% are at stratum 1.

Because the system is distributed, there is no easy way to know how many clients are served by the pool system. One estimate in 2011 was five to fifteen million, and that number can only have increased since. It's pretty clear that running a pool server is a remarkably efficient way of helping people: each server helps, on average, well over a thousand end-users know the right time. Or, to look at it another way, because most pool servers are at stratum 2, their redistribution of a stratum-1 server's time reduces the load on the server of anyone kind enough to run a public stratum-1 server by a factor of a thousand or more.
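
The "well over a thousand" figure can be checked with some quick arithmetic, shown here in Python. Note the assumptions: the client range is the 2011 estimate quoted above, while the server count is the current figure of about 3,700 addresses (and a single machine may account for more than one address), so the result is only an order-of-magnitude guide.

```python
POOL_SERVER_ADDRESSES = 3700  # "about 3,700 server addresses"
CLIENTS_LOW = 5_000_000       # low end of the 2011 estimate
CLIENTS_HIGH = 15_000_000     # high end of the 2011 estimate


def clients_per_server(clients: int, servers: int) -> float:
    """Average number of clients served by each pool server address."""
    return clients / servers


low = clients_per_server(CLIENTS_LOW, POOL_SERVER_ADDRESSES)
high = clients_per_server(CLIENTS_HIGH, POOL_SERVER_ADDRESSES)
print(f"{low:.0f} to {high:.0f} clients per server address")
# prints "1351 to 4054 clients per server address"
```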

But the pool system has its problems. The DNS infrastructure is in need of assistance. The system that monitors the quality of potential pool servers is located on the west coast of the US, and though its failure wouldn't result in the immediate unavailability of the pool's NTP servers, it is a single point of failure and could do with being parallelized to other sites around the world. Translation assistance is needed. Although it has been amazingly reliable for the last decade, Hansen notes that the system as a whole has several more single points of failure that he'd like to eliminate.

And more than anything else, the system needs more NTP servers, particularly in places other than Europe and North America. At the time of writing, 158 country zones have no servers at all. If a client queries an empty zone, the query generally falls back to the continental zone. This creates a disincentive to be one of the first few server operators in a zone, because all the queries descend on your server, but the alternative is that some countries remain without local time service. A proposal has been made to populate otherwise-empty or lightly-populated zones with randomly-selected volunteers from the global zone, to try to minimize the impact of being the first server in a country, but this can only go so far: China, a country not noted for unfettered links to the global internet and thus in particular need of local servers, has only eight servers in its zone. Until the world is well-supplied with servers, more are needed everywhere.

We noted above that the most common stratum of a pool server is 2. The average LWN reader will very quickly infer that a candidate pool server does not need a directly-attached clock. According to the project, all you need is a static IP address and a reliable internet connection. Assistance is offered to potential pool server operators in getting things configured correctly. Once this is done, your author, who has run a pool server for over five years, can attest that running one is remarkably painless and trouble-free. It imposes a small load on the underlying hardware (I estimate it adds about 0.2 to the server's load average, and around 20GB of network traffic a month), and the pool allows you to select a lower connectivity level if you prefer a lighter load than this. If you want a way to help a lot of people do something important with minimal effort, running an NTP pool server is worth a look.

Brief items

Distribution quote of the week

Debian's bug system is a tool we use to improve the distribution, not a user support channel. We should not retain bugs that do not help us achieve that. It would be great if it could also be a user support channel, but this is just unachievable for a volunteer-maintained distribution like Debian, and we should avoid creating the impression that we promise to do this.
-- Russ Allbery

Updated Debian 8: 8.6 released

Debian 8.6 has been released. "This update mainly adds corrections for security problems to the stable release, along with a few adjustments for serious problems. Security advisories were already published separately and are referenced where available."

Newsletters and articles of interest

Distribution newsletters

Catanzaro: GNOME 3.22 core apps

Michael Catanzaro lays down the rules for which GNOME applications distributions should package if they want to claim to provide a "pure GNOME experience." "Selecting the right set of default applications is critical to achieving a quality user experience. Installing redundant or overly technical applications by default can leave users confused and frustrated with the distribution. Historically, distributions have selected wildly different sets of default applications. There’s nothing inherently wrong with this, but it’s clear that some distributions have done a much better job of this than others."

Page editor: Rebecca Sobol


Copyright © 2016, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds