| From: |
| David Woodhouse <dwmw2-AT-infradead.org> |
| To: |
| Richard Cochran <richardcochran-AT-gmail.com>, Wen Gu <guwen-AT-linux.alibaba.com>, David Woodhouse <dwmw2-AT-infradead.org>, Andrew Lunn <andrew+netdev-AT-lunn.ch>, "David S. Miller" <davem-AT-davemloft.net>, Eric Dumazet <edumazet-AT-google.com>, Jakub Kicinski <kuba-AT-kernel.org>, Paolo Abeni <pabeni-AT-redhat.com>, John Stultz <jstultz-AT-google.com>, Thomas Gleixner <tglx-AT-kernel.org>, Stephen Boyd <sboyd-AT-kernel.org>, Anna-Maria Behnsen <anna-maria-AT-linutronix.de>, Frederic Weisbecker <frederic-AT-kernel.org>, Shuah Khan <shuah-AT-kernel.org>, Peter Zijlstra <peterz-AT-infradead.org>, Thomas Weißschuh <thomas.weissschuh-AT-linutronix.de>, Arnd Bergmann <arnd-AT-arndb.de>, Miroslav Lichvar <mlichvar-AT-redhat.com>, Julien Ridoux <ridouxj-AT-amazon.com>, Ryan Luu <rluu-AT-amazon.com>, linux-kernel-AT-vger.kernel.org |
| Subject: |
| [PATCH v4 0/7] timekeeping/ntp: Fix drift tracking precision |
| Date: |
| Mon, 25 May 2026 14:54:32 +0100 |
| Message-ID: |
| <20260525135904.126282-1-dwmw2@infradead.org> |
This is the bugfix subset of the RFC series last posted at
https://lore.kernel.org/all/20260520135207.37826-1-dwmw2@...
These patches stand alone and fix several long-standing precision issues
in the kernel's NTP drift tracking. The feed-forward clock discipline,
which was the reason I was *looking*, is dropped for now.
Patch 1: MAINTAINERS update
Add Miroslav Lichvar as timekeeping reviewer.
Patches 2-3: Timekeeping bugfixes
2. Remove stale xtime_remainder from ntp_error accumulation.
3. Account for monotonicity adjustment in ntp_error.
These fix systematic drift in ntp_error tracking that caused the
timekeeping dithering to fight against NTP corrections rather than
assist them.
Patch 4: Independent bugfix
4. Guard against divide-by-zero during clocksource recalibration.
Prevents an oops on KVM guests with clocksource=tsc when the TSC
frequency is recalibrated during boot.
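As a rough illustration of the kind of guard patch 4 describes, here is a
userspace sketch; it is not the patch itself. The helper name
safe_error_per_cycle() and the assumption that the divisor is a per-interval
cycle count are made up for the example. The quantity that can actually
reach zero in timekeeping_adjust during recalibration may be different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not kernel code: spread an accumulated error over
 * the cycles of one interval. If the interval is zero (assumed here to be
 * possible while the clocksource is being recalibrated), skip the division
 * and report that no adjustment could be computed.
 */
static bool safe_error_per_cycle(int64_t error, uint32_t cycle_interval,
                                 int64_t *per_cycle)
{
    assert(per_cycle != NULL);

    if (cycle_interval == 0) {
        *per_cycle = 0;   /* leave the caller with a harmless value */
        return false;     /* caller should skip this adjustment round */
    }

    *per_cycle = error / (int64_t)cycle_interval;
    return true;
}
```

A caller would check the return value and simply leave the multiplier
unchanged for that tick when it is false, rather than oopsing.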
Patches 5-7: NTP rework — eliminate tick_length_base
5. Drive time_offset skew via per-tick ntp_error transfer instead of
tick_length inflation, with mult adjustment for dithering bandwidth.
6. Convert adjtime() to use time_offset directly instead of inflating
tick_length, removing the rounding loss that prevented convergence.
7. Remove tick_length_base entirely — tick_length is now always the
NTP-disciplined value with no per-tick inflation.
This eliminates the dual-accounting between tick_length and
tick_length_base that was a source of rounding errors and made the
code harder to reason about.
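The per-tick transfer idea in patches 5-7 can be sketched in userspace as
follows. This is only a model under stated assumptions, not the code from
the series: struct ntp_sketch, skew_tick() and SKEW_DIVISOR are hypothetical
names, the slice size is arbitrary, and both fields are assumed to be in the
same unit. The point it shows is that each tick moves a slice of time_offset
into ntp_error without touching tick_length, so the sum of the two is
conserved and nothing is lost to rounding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKEW_DIVISOR 16   /* hypothetical: fraction of the offset moved per tick */

/* Hypothetical model of the two accumulators; not the kernel's layout. */
struct ntp_sketch {
    int64_t ntp_error;    /* error the timekeeper still has to dither away */
    int64_t time_offset;  /* skew that has been requested but not yet applied */
};

/*
 * Move one slice of time_offset into ntp_error and return the slice.
 * When the remaining offset is too small to divide, transfer all of it so
 * the offset reaches exactly zero instead of stalling on a rounding residue.
 */
static int64_t skew_tick(struct ntp_sketch *s)
{
    int64_t delta;

    assert(s != NULL);

    delta = s->time_offset / SKEW_DIVISOR;
    if (delta == 0)
        delta = s->time_offset;

    s->time_offset -= delta;
    s->ntp_error += delta;
    return delta;
}
```

Because every unit subtracted from time_offset is added to ntp_error in the
same step, there is only one set of books to keep consistent, which is the
property the removal of tick_length_base is aiming for.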
The net effect is cleaner accounting in the timekeeping code: the
dual tick_length/tick_length_base bookkeeping is eliminated, and the
ntp_error tracking no longer accumulates systematic bias from stale
remainders or unaccounted monotonicity adjustments.
I've tested this with added debugging tracking the absolute values of
the three time references:
• (A): xtime, reported as CLOCK_REALTIME.
• (B): xtime+ntp_error, the time NTP *wants* to report right now.
• (C): xtime+ntp_error+time_offset, the time NTP wants to skew to.
With the fixes, the ntp_error and time_offset deltas remain entirely
consistent and the kernel perfectly skews to and tracks the reference
setting that it's asked to (although the actual *setting* is now
absent from this series; it's just using the existing tick_length
and time_offset parameters).
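For readers who want to reproduce that kind of check, the three references
(A), (B) and (C) listed above can be derived as in the sketch below. It is
not the debugging code used for the testing; struct time_refs and
compute_refs() are hypothetical names, and the sketch assumes all three
inputs have already been converted to the same unit (plain nanoseconds
here), which glosses over whatever scaling the kernel applies internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the three references described above. */
struct time_refs {
    int64_t a;  /* (A) xtime, as reported via CLOCK_REALTIME */
    int64_t b;  /* (B) xtime + ntp_error: what NTP wants to report now */
    int64_t c;  /* (C) xtime + ntp_error + time_offset: the skew target */
};

/* Compute (A), (B) and (C) from inputs that share one unit. */
static void compute_refs(int64_t xtime, int64_t ntp_error,
                         int64_t time_offset, struct time_refs *out)
{
    assert(out != NULL);

    out->a = xtime;
    out->b = xtime + ntp_error;
    out->c = out->b + time_offset;
}
```

Logging these on each tick makes two things easy to see: (B) minus (A) is
the outstanding ntp_error, and (C) minus (B) is the remaining time_offset.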
Changes since RFC v3:
• Dropped patches 8-10 (feed-forward infrastructure, ptp_vmclock
integration, vmclock_host); those will be reposted separately
once Thomas's timekeeping branch lands.
• No code changes to patches 1-7 vs the RFC.
Changes since RFC v2:
• Renamed "clawback" to "monotonicity adjustment" throughout (patch 2).
• Dropped the exponential tail clamping (v2 patch 3).
• Convert adjtime() to use time_offset to deliver skew too, and remove
the separate 'tick_length_base' as adjusting tick_length directly is
no longer used to skew the clock. The skew_delta does essentially the
same thing, but makes it easier to get the accounting right.
• The timekeeping_set_reference() API (patch 8) now takes tk_core.lock
and computes the phase offset internally, eliminating the race window
that existed in v2 between setting the reference and the tick code
consuming it.
• vmclock_host (patch 10) is no longer marked WIP — it has a selftest
and proper locking (but is still RFC).
• Added MAINTAINERS entry for Miroslav Lichvar as timekeeping reviewer
(patch 1).
David Woodhouse (7):
  MAINTAINERS: Add Miroslav as timekeeping reviewer
  timekeeping: Remove xtime_remainder from ntp_error accumulation
  timekeeping: Account for monotonicity adjustment in ntp_error
  timekeeping: Guard against divide-by-zero in timekeeping_adjust
  timekeeping: Drive time_offset skew via per-tick ntp_error transfer
  ntp: Convert adjtime() to use time_offset instead of tick_length inflation
  ntp: Remove tick_length_base, use tick_length directly

 MAINTAINERS                         |  1 +
 include/linux/timekeeper_internal.h |  4 +-
 kernel/time/ntp.c                   | 99 ++++++++++++++++++++++++++-----------
 kernel/time/ntp_internal.h          |  3 ++
 kernel/time/timekeeping.c           | 41 ++++++++++++---