Leading items
Welcome to the LWN.net Weekly Edition for August 5, 2021
This edition contains the following feature content:
- A GPSD time warp: a lurking bug in GPSD threatens to derail NTP servers in October.
- Hole-punching races against page-cache filling: the long path toward a fix for a subtle kernel filesystem bug.
- Strict memcpy() bounds checking for the kernel: a proposal to enable bounds checking for functions like memcpy(); it's not as easy as it seems.
- Kernel topics on the radar: brief updates on memory folios, a proposed new isolation mode, and a lightweight threading mechanism from Google.
- New features in Neovim 0.5: a discussion of the changes in the latest version of this Vim editor fork.
This week's edition also includes these inner pages:
- Brief items: Brief news items from throughout the community.
- Announcements: Newsletters, conferences, security updates, patches, and more.
Please enjoy this week's edition, and, as always, thank you for supporting LWN.net.
A GPSD time warp
The GPSD project provides a daemon for communicating with various GPS devices in order to retrieve the location information that those sensors provide. But the GPS satellites also provide highly accurate time information that GPSD can extract for use by Network Time Protocol (NTP) servers. A bug in the GPSD code will cause time to go backward in October, though, which may well cause some havoc if affected NTP servers do not get an update before then.
At some level, the root cause of the problem is the GPS week-number rollover that occurs because only ten bits were used to represent week numbers in the original GPS protocol. Ten bits overflows after 1023, so only 19.6 (and change) years can be represented. Since the GPS epoch starts at the beginning of 1980, there have already been two rollover events (in 1999 and 2019); there is not supposed to be another until 2038, but a bug in some sanity checking code in GPSD will cause it to subtract 1024 from the week number on October 24, 2021. The effect will be a return to March 2002, which is not what anyone wants—or expects.
The problem was reported by Stephen Williams on July 21. It affects GPSD versions 3.20‑3.22, which is all of the releases since the last day of 2019. The upcoming 3.23 release—due as soon as August 4—will fix the problem, but it needs to be installed on all of the relevant servers. There are concerns that if the word does not get out to NTP server administrators, there could be a rather unpleasant October surprise.
The code in question was quoted in the bug report. In the gpsd_gpstime_resolv() function, the wrong value for a constant is used:
    /* sanity check week number, GPS epoch, against leap seconds
     * Does not work well with regressions because the leap_sconds
     * could be from the receiver, or from BUILD_LEAPSECONDS. */
    if (0 < session->context->leap_seconds &&
        19 > session->context->leap_seconds &&
        2180 < week) {

        /* assume leap second = 19 by 31 Dec 2022
         * so week > 2180 is way in the future, do not allow it */
        week -= 1024;
        GPSD_LOG(LOG_WARN, &session->context->errout,
                 "GPS week confusion. Adjusted week %u for leap %d\n",
                 week, session->context->leap_seconds);
    }
The code may be a little hard to read because the comparisons are written in the reverse of the usual order; perhaps it is Yoda notation, though it seems strange to apply that to non-equality comparisons. In any case, the week number, which is calculated elsewhere with rollovers accounted for, is compared against 2180, which is not "way in the future" as the comment claims, but corresponds to October 24, 2021 instead. The test was evidently meant to prevent some spurious regression-test failures, which is what the first comment is talking about.
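The arithmetic behind those dates is easy to check. Here is a minimal sketch in C that converts a GPS week number to the calendar date on which it begins (it ignores the small GPS-UTC leap-second offset, which does not affect the date):

    #include <stdio.h>
    #include <time.h>

    #define GPS_EPOCH     315964800L          /* 1980-01-06 00:00:00 UTC, as Unix time */
    #define SECS_PER_WEEK (7L * 24 * 60 * 60)

    static void print_week_start(unsigned int week)
    {
        time_t t = GPS_EPOCH + (time_t)week * SECS_PER_WEEK;
        char buf[16];

        strftime(buf, sizeof(buf), "%Y-%m-%d", gmtime(&t));
        printf("GPS week %4u begins on %s\n", week, buf);
    }

    int main(void)
    {
        print_week_start(2181);         /* the first week caught by "2180 < week" */
        print_week_start(2181 - 1024);  /* where the buggy adjustment lands */
        return 0;
    }

The first call prints 2021-10-24; the second prints 2002-03-10, which is the March 2002 time warp described above.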
GPSD maintainer Gary E. Miller acknowledged the problem, noting that he meant to use the week number for December 31, 2022 but made an error in calculating it, thus 2180. The code effectively also "predicts" another leap second being added by the end of 2022, but, as Williams pointed out, that may not be a valid assumption. Beyond that, it is possible that a negative leap second may be coming relatively soon, but the code is not written with that in mind.
Miller said that up until 2020, "leap seconds had been very predicable", but that recent findings about an increase in the earth's rotational speed have changed that—raising the possibility of a negative leap second. The code in question was aimed at the regression tests, however, not the path for handling live GPS messages, which was another part of the problem.
On July 24, Miller committed a fix that removed the errant test from the live path. But the fix will only appear in the 3.23 release; it will not be backported to previous releases—at least by the GPSD project. While distributions may do so, he is not convinced that it will make things better:
gpsd does not have enough volunteers to maintain "branches". Some distros try to cherry pick, but usually make things worse. This bug was announced on gpsd-dev and gpsd-users email lists. So the packagers for several distros already saw it. What they do is what they do.
3.23 will be released before a week has gone by.
[...] The fact that distros do not pick up gpsd updates, or upstream their patches, is a very sore spot with me.
Williams found the bug in a fork of GPSD 3.19 that he is maintaining. Some changes that were made for 3.20 were backported to that fork; testing that he did on that code showed the problem. But Miller believes that distributions and others should be running more recent versions, and that they should upgrade to 3.23 when it is available, because each new release fixes security-related bugs. That is, of course, somewhat similar to the position of other projects, the Linux kernel in particular, as Miller noted: "I [am] gonna fall back on Greg K_H's dictum: All users must update."
The question of problems with negative leap seconds was also discussed. With Miller's fix applied, there is no known problem of that sort, and even with the earlier (broken) code, a negative leap second would not have changed anything, Williams said. He just happened to notice that the code in question was not expecting the possibility of a negative leap second. No one has yet found any problem should a negative leap second occur, but it is something that could use more testing.
It seems rather short-sighted of the GPS protocol designers to "bake in" a 20-year rollover; as Miller put it: "GPS, by design, is a 1024 week time warp waiting to happen."
The more recent CNAV protocol (which is not present in all GPS satellites yet) upgrades the week number to 13 bits, which results in a plausibly safer 157-year rollover, though the first overflow of that is only 116 years from now in 2137. It seems probable that there will be other navigation (and time) technologies by then—or that another couple of bits can be squeezed in somewhere.
The upshot is that anyone relying on GPSD for the correct time after mid-October will want to be running a version without this bug. The Time Warp is fun in movies, but it is rather less so for the systems that dole out time on the internet. NTP servers and the like that use GPSD must upgrade—or at least avoid versions 3.20‑3.22.
[Thanks to David A. Wheeler for giving us a heads-up about this issue.]
Hole-punching races against page-cache filling
Filesystem developers tend to disagree with each other about many things, but they are nearly unanimous in their dislike for the truncate() system call, which chops data off the end of a file. Implementing truncate() tends to be full of traps for the unwary — the kind of traps that can lead to lost data. But it turns out that a similar operation, called "hole punching", may be worse. This operation has been subject to difficult-to-hit but real race conditions in many filesystems for years; this patch set from Jan Kara may finally be at a point where it can fill the hole in hole punching.
Hole punching, as its name suggests, is the act of creating a hole in the middle of a file; it is performed using the FALLOC_FL_PUNCH_HOLE option to the fallocate() system call. The caller provides an offset and a length; the kernel then erases the given number of bytes in the file, starting at the provided offset. The associated blocks on the underlying storage device are freed for other uses. The length of the file does not change, though; this operation creates a hole that, if read, will return zeroes. It is, essentially, an efficient way of writing zeroes to the specified range within the file.
Note that neither the offset nor the length need be page-aligned. The kernel will write zeroes to the partial pages at the beginning and end of the hole, should they exist; this edge work is essentially just a couple of write() calls. The efficiency gains of hole punching, though, come from its ability to simply drop entire pages from the file without writing anything; that, naturally, is also where the challenges lie.
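From user space, the whole operation is a single system call. A minimal sketch (note that FALLOC_FL_PUNCH_HOLE must be combined with FALLOC_FL_KEEP_SIZE, since the length of the file is unchanged):

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdio.h>

    /* Punch a hole of len bytes at offset; subsequent reads of that
     * range return zeroes and the underlying blocks are freed. */
    static int punch_hole(int fd, off_t offset, off_t len)
    {
        if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                      offset, len)) {
            perror("fallocate");
            return -1;
        }
        return 0;
    }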
To implement (the full-page part of) hole punching, a filesystem must do (at least) two things: remove the associated pages from the page cache, and free the blocks on the storage device. A failure to do either could leave the old data visible in the file, which is something that user space has just made an explicit request to prevent. But, even if both tasks are properly carried out, there is another way in which things can go wrong. The problem is that, in current kernels, there is nothing that ties those two operations together into an atomic change, meaning that something else can happen between one and the other.
Specifically, a filesystem can clear the relevant pages out of the page cache in the usual way, but then race against another task that is trying to access the same file. Should that other task access one or more of the hole-punched pages in the file, they can be reinstated in the page cache before the filesystem has done the work of cleaning up the blocks on disk, leaving stale information in the page cache that may get written back out at some future time. That could lead to any of a number of unpleasant things, including the old data persisting, exposure of unrelated data, or corruption of the filesystem. That can punch a hole in the user's trust in the system overall.
This race is clearly difficult to hit, or there would have been a stream of corruption reports since hole punching was added in 2.6.38, just over ten years ago. But it is a real race that will surely bite somebody sooner or later; it needs to be fixed. Doing that properly has required ten versions of Kara's patch set (at last count) since early 2021.
The solution is conceptually simple: filesystems must take a lock that prevents hole punching and the instantiation of page-cache entries from happening at the same time. But the words "simple" and "locking" are rarely found together in the filesystem realm. In this case, the locks normally used to serialize operations on page-cache pages cannot be used, since the point is that the pages should be absent. Other existing locks run into locking-order issues. So Kara had to add a new lock (a reader-writer semaphore, specifically) to the address_space structure that describes a mapping between the page cache and a file. This lock, called invalidate_lock, prevents operations that instantiate page-cache pages (readers, in the sense of this lock) from racing with those that invalidate pages and underlying storage (writers).
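The resulting usage pattern looks roughly like this (a sketch using the helper names that accompany the new lock in later versions of the work; the details vary from one filesystem to the next):

    /* Invalidation side: hole punching takes the lock exclusively, so
     * that removing the page-cache pages and freeing the on-disk
     * blocks appear as a single atomic operation. */
    filemap_invalidate_lock(inode->i_mapping);
    truncate_pagecache_range(inode, start, end);
    /* ... free the underlying blocks ... */
    filemap_invalidate_unlock(inode->i_mapping);

    /* Instantiation side: paths that fill the page cache (readahead,
     * fault handling) take the lock shared; they exclude invalidation
     * but not each other. */
    filemap_invalidate_lock_shared(inode->i_mapping);
    /* ... read pages into the page cache ... */
    filemap_invalidate_unlock_shared(inode->i_mapping);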
The way filesystems use this lock varies a bit, depending on their internal architecture, but the end result is the same: the race is closed in almost all of the filesystems that support hole punching in the first place. There are a couple of exceptions, specifically the GFS2 and OCFS2 cluster filesystems, where everything is more complex and the maintainers need to be involved; fixes for those filesystems are still under development.
This work was deemed ready to go and was pushed to Linus Torvalds for the 5.14 merge window, but Torvalds was not impressed: "There is no way I'll merge something this broken" was his response. He was unhappy with the use of the new lock, which was being acquired even in situations where the page(s) in question already exist in the page cache and do not need to be instantiated. Finding pages in the page cache is one of the most performance-critical functions in the kernel, so adding unnecessary overhead there is highly unwelcome. Fixing that required another iteration of the patch set — and another development cycle waiting for the merge window to open again.
By all appearances, this work is now ready to go for 5.15; once that happens, this particular obscure race will have been closed. Even though the problem is evidently hard to hit, it would not be surprising to see this work backported to older kernels once a sufficient level of confidence in its stability has been reached. That will help to ensure that hole-punched files remain whole.
Strict memcpy() bounds checking for the kernel
The C programming language is famously prone to memory-safety problems that lead to buffer overflows and a seemingly endless stream of security vulnerabilities. But, even in C, it is possible to improve the situation in many cases. One of those is the memcpy() family of functions, which are used to efficiently copy or overwrite blocks of memory; with a bit of help from the compiler, those functions can be prevented from writing past the end of the destination object they are passed. Enforcing that condition in the kernel is harder than one might expect, though, as this massive patch set from Kees Cook shows.
Buffer overflows never seem to go away, and they are a constant source of bugs and security problems in the kernel. That said, hardening techniques have become good enough that many types of stack-based overflows can be detected and defended against (by killing the system if nothing else). It is hard to overwrite the stack without running over boundaries (which may contain a canary value) in ways that make the problem evident. Heap-based data lacks such boundaries, though, making overflows in the heap space harder to detect; as a result, attackers tend to find such vulnerabilities attractive.
Fortifying the source
The kernel's FORTIFY_SOURCE configuration option turns on a range of checks for functions that are commonly implicated in memory problems in the heap area (and beyond). The strcpy() family of functions, for example, is fairly thoroughly checked when this option is turned on. There are also checks for memcpy() and friends; consider the fortified version of memset() from include/linux/fortify-string.h which, in current kernels, looks like this:
    __FORTIFY_INLINE void *memset(void *p, int c, __kernel_size_t size)
    {
        size_t p_size = __builtin_object_size(p, 0);

        if (__builtin_constant_p(size) && p_size < size)
            __write_overflow();
        if (p_size < size)
            fortify_panic(__func__);
        return __underlying_memset(p, c, size);
    }
This version asks the compiler for the size of the destination object (p). If the passed-in size is known at compile time (the __builtin_constant_p() test is true), then the test can be made right away, causing compilation to fail if an overflow is detected; otherwise the second if test performs the check at run time. Note that the run-time test will be optimized out by the compiler in cases where the size is known to be within bounds.
So it would seem that the kernel already has bounds checking for these functions, but there's a catch. The second argument to __builtin_object_size() describes which object is of interest. This comes into play when, for example, the object of interest is embedded within a structure. If that second argument is zero (as in the example above), the returned size is the number of bytes to the end of the containing structure; setting that argument to one, instead, gives only the size of the immediate object itself. See the GCC documentation for more information on __builtin_object_size().
The end result is that the version of memset() shown above will catch attempts to write beyond the end of a structure, but will not catch overflows that overwrite structure fields after the intended destination. That leaves a lot of interesting fields for an attacker to step on if they can find a way to influence the size passed into those functions. One might think that the obvious thing to do is to change the second argument to __builtin_object_size() to one, thus checking against the correct size, but this is the kernel and life is not so simple.
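The difference is easy to see with a small example (not from the kernel):

    struct message {
        char header[8];
        char body[8];
    } msg;

    __builtin_object_size(msg.header, 0);  /* 16: bytes to the end of msg */
    __builtin_object_size(msg.header, 1);  /*  8: just msg.header itself  */

With the mode-0 check used today, memset(msg.header, 0, 16) sails through and silently clobbers body; a mode-1 check would reject it.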
Setting or copying data across multiple structure fields is, as it turns out, a fairly common action in the kernel, and those actions would trigger more strict tests in the memory functions. The result of enabling the strict tests would be an unbuildable, unusable kernel; that would certainly be secure, but users would still be unimpressed. Users can be a little funny that way.
memset_after()
One common use case for copying across fields is the "write zeroes from here to the end of the structure" operation. Consider, for example, this code in the AR9170 wireless network driver:
    memset(&txinfo->status.ack_signal, 0,
           sizeof(struct ieee80211_tx_info) -
           offsetof(struct ieee80211_tx_info, status.ack_signal));
This code shows a case of clearing to the end of the structure; it also shows just how awkward such code can be. That sort of length arithmetic is easy to get wrong, and it's subject to disruption if the layout of the structure changes for any reason. Indeed, the line of code before the above reads:
BUILD_BUG_ON(offsetof(struct ieee80211_tx_info, status.ack_signal) != 20);
This test will cause a build failure if the offset of the first field to overwrite is not as expected, but will not catch any changes made after that field. Structure members added after ack_signal will be overwritten by this memset() call — a fact that may not be obvious at the time.
To clarify this sort of code and to avoid false positives from stricter checks on memset(), the patch set introduces a new macro for this operation:
memset_after(object, value, member);
It will cause every byte of object located after member to be set to value. This macro can replace the above code with:
memset_after(&txinfo->status, 0, rates);
(The ack_signal field, being the first to be zeroed, is immediately after rates in this structure). Numerous such cases have been fixed in Cook's patch set.
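The macro itself can be built from offsetofend(); something like the following sketch captures the idea (the version in the patch set may differ in detail):

    #define memset_after(obj, v, member)                               \
    ({                                                                 \
        u8 *__ptr = (u8 *)(obj);                                       \
        memset(__ptr + offsetofend(typeof(*(obj)), member), (v),       \
               sizeof(*(obj)) - offsetofend(typeof(*(obj)), member));  \
    })

Because the size is derived from the type itself, adding a field to the structure automatically extends the region being cleared; there is no offset arithmetic to go stale.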
Grouped structure fields
There is a more complicated case, though, in which a range of fields within a structure is overwritten in a single call. A number of approaches have been used within the kernel to try to do such copies safely; one of those is the same sort of offsetof() arithmetic seen in the case above. But there are others. Deep within the sk_buff structure used to represent network packets is this field:
__u32 headers_start[0];
A full 120 lines later is another zero-length array called headers_end. Those arrays clearly cannot hold any data of interest; instead, they are used with the same sort of offset arithmetic to copy a whole set of packet headers in a single operation. Here, too, there is a set of build-time checks to ensure that, at least, all of the relevant header fields are located between the two markers.
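The copy itself then uses the distance between the markers as its length; simplified from the sk_buff code, it looks like this:

    /* Copy all of the header fields lying between the two zero-length
     * markers in a single operation. */
    memcpy(&new->headers_start, &old->headers_start,
           offsetof(struct sk_buff, headers_end) -
           offsetof(struct sk_buff, headers_start));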
Some developers simply add up the lengths of the fields to be written and use the result as the length for the memory operation. Yet another approach is to define a nested structure to hold the set of fields to be copied. This variant is safer, but it complicates the use of those fields (which must be accessed by way of the intermediate structure) and tends to lead to pollution of the namespace with macros added to minimize those complications.
In summary, kernel developers have come up with a number of ways of handling cross-field memory operations, but none of them are particularly satisfying. Cook's patch set brings a new solution (co-authored with Keith Packard) in the form of the struct_group() macro. Taking the example from that patch, consider a structure like this:
    struct foo {
        int one;
        int two;
        int three;
        int four;
    };
Imagine further that the developer wants to copy over fields two and three with a single memcpy() call. This could be formalized by declaring the structure this way:
    struct foo {
        int one;
        struct_group(thing,
            int two;
            int three;
        );
        int four;
    };
This macro has the effect of creating a nested structure called thing, which can be used with functions like memcpy() with the strict bounds checks enabled. The individual fields can still be referred to as two and three, though, without the need to name the nested structure, and without any macro ugliness. This is accomplished this way:
    #define struct_group_attr(NAME, ATTRS, MEMBERS) \
        union { \
            struct { MEMBERS } ATTRS; \
            struct { MEMBERS } ATTRS NAME; \
        }

    #define struct_group(NAME, MEMBERS) \
        struct_group_attr(NAME, /* no attrs */, MEMBERS)
This macro defines an intermediate structure to hold the grouped fields — twice; one is anonymous while the other has the given NAME. The duplicated structures are then overlaid on top of each other within an anonymous union. This bit of trickery makes it possible to use the field names directly while also providing the name for the structure as a whole, which can be used with the memory functions.
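With the group in place, a cross-field copy becomes an operation on an object whose size the compiler knows exactly; a brief sketch:

    struct foo src, dst;

    /* The destination is a real object of known size, so the strict
     * bounds check sees exactly two ints and no more. */
    memcpy(&dst.thing, &src.thing, sizeof(dst.thing));

    /* The members remain directly accessible via the anonymous copy. */
    dst.two = src.three;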
Toward a harder kernel
Much of the patch set is devoted to defining these groups within structures throughout the kernel, then using the groups for memory operations. With that done, it becomes possible to enable the stricter bounds checks for those operations — sort of. The remaining problem is that cross-field operations are genuinely hard to find in the code; there is no pattern that can easily be grepped for. Chances are thus good that there are other occurrences in the kernel that have not been found yet; as Cook noted halfway through the patch series, there are over 25,000 memcpy() calls in the kernel. Crashing the system in response to an unfixed (but correct) cross-field operation would be seen as rude at best, so warnings will have to be issued instead for the indefinite future.
There should come a time, though, when reports of warnings fall off and the community will feel confident enough to halt the system when an out-of-bounds copy is detected. The value of doing so could be significant. Quoting the just-linked patch:
With this it's also possible to compare the places where the known 11 memcpy() flaw overflows happened against the resulting list of potential new bounds checks, as a measure of potential efficacy of the tightened mitigation. Much to my surprise, horror, and delight, all 11 flaws would have been detected by the newly added run-time bounds checks, making this a distinctly clear mitigation improvement.
This mitigation seems worth having, but first the patches must find their way into the mainline kernel. Security-related work often has a rough path into the kernel, though the situation has gotten better over the years. In this case, at least, one frequent complaint (impact on performance) should not be an issue; the cost of an extra length check in the cases where the answer isn't known at compile time is tiny. But the patch set is still large and wide-ranging; chances are that there will be some discussions to get through before it can be merged. The completion of that process should herald the end of another class of unpleasant security bugs.
Kernel topics on the radar
The kernel-development community is a busy place, with thousands of emails flying by every day and many different projects under development at any given time. Much of that work ends up inspiring articles at LWN, but there is no way to ever cover all of it, or even all of the most interesting parts. What follows is a first attempt at what may become a semi-regular LWN feature: a quick look at some of the work that your editor is tracking that may or may not show up as the topic of a full article in the future. The first set of topics includes memory folios, task isolation, and a lightweight threading framework from Google.
Memory folios
The memory folios work was covered here in March; this patch set by Matthew Wilcox adds the concept of a "folio" as a page that is guaranteed not to be a tail page within a compound page. By guaranteeing that a folio is either a singleton page or the head of a compound page, this work enables the creation of an API that adds some useful structure to memory management, saves some memory, and slightly improves performance for some workloads.
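In code terms, the idea is to make the head-page guarantee visible in the type system. Using names from the version of the work that was eventually merged (illustrative only):

    /* page_folio() always resolves to the head page, so a folio can
     * never refer to a tail page. */
    struct folio *folio = page_folio(page);

    /* Interfaces declared in terms of struct folio * therefore need
     * no compound_head() fixups on their arguments. */
    folio_get(folio);
    folio_put(folio);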
While the memory-management community is still not fully sold on this concept (it looks like a lot of change for a small benefit to some developers), it looks increasingly likely that it will be merged in the near future. Or, at least, the merging process will start; one does not swallow a 138-part (at last count) memory-management patch series in a single step. In mid-July, Wilcox presented his plan, which involves getting the first 89 patches merged for 5.15; the rest of the series would be merged during the following two development cycles. Nobody seems to be contesting that schedule at this point.
Later in July, though, Wilcox stumbled across the inevitable Phoronix benchmarking article which purported to show an 80% performance improvement for PostgreSQL with the folio patches applied to the kernel. He said that the result was "plausibly real" and suggested that, perhaps, the merging of folios should be accelerated. Other developers responded more skeptically, though. PostgreSQL developer Andres Freund looked at how the results were generated and concluded that the test "doesn't end up measuring something particularly interesting". His own test showed a 7% improvement, though, which is (as he noted) still a nice improvement.
The end result is that the case for folios seems to be getting stronger, and the merging process still appears to be set to begin in 5.15.
Retrying task isolation
Last year, the development community discussed a task-isolation mode that would allow latency-sensitive applications to run on a CPU with no interruptions from the kernel. That work never ended up being merged, but the interest in this mode clearly still exists, as can be seen in this patch set from Marcelo Tosatti. It takes a simpler approach to the problem — initially, at least.
This patch set focuses, in particular, on kernel interruptions that can happen even when a CPU is running in the "nohz" mode without a clock tick. Specifically, Tosatti is looking at the "vmstat" code that performs housekeeping for the memory-management subsystem. Some of this work is done in a separate thread (via a workqueue) that is normally disabled while a CPU is running in the nohz mode. There are situations, though, that can cause this thread to be rescheduled on a nohz CPU, ending the application's exclusive use of that processor.
Tosatti's patch set adds a set of new prctl() commands to address this problem. The PR_ISOL_SET command sets the "isolation parameters", which can be either PR_ISOL_MODE_NONE or PR_ISOL_MODE_NORMAL; the latter asks the kernel to eliminate interruptions. Those parameters do not take effect, though, until the task actually enters the isolation mode, which can be done with the PR_ISOL_ENTER command. The kernel's response to entering the isolation mode will be to perform any deferred vmstat work immediately so that the kernel will not decide to do it at an inconvenient time later. The deferred-work cleanup will happen at the end of any system call made while isolation mode is active; since those system calls are the likely source of any deferred work in the first place, that should keep the decks clear while the application code is running.
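In rough outline, an application would use the interface something like this (a sketch only; the series was still evolving, so take the argument layout as illustrative rather than definitive):

    /* Set the isolation parameters, then enter the mode; deferred
     * vmstat work is flushed on entry and after each system call made
     * while the mode is active. */
    prctl(PR_ISOL_SET, PR_ISOL_MODE_NORMAL, 0, 0, 0);
    prctl(PR_ISOL_ENTER, 0, 0, 0, 0);

    run_latency_sensitive_loop();   /* ideally, no interruptions here */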
The evident intent is to make this facility more general, guaranteeing that any deferred work would be executed right away. That led others (including Nicolás Sáenz) to question the use of a single mode to control what will eventually be a number of different kernel operations. Splitting out the various behaviors would, he said, be a way to move any policy decisions to user space. After some back-and-forth, Tosatti agreed to a modified interface that would give user space explicit control over each potential isolation feature. A patch set implementing this API was posted on July 30; it adds a new operation (PR_ISOL_FEAT) to query the set of actions that can be quiesced while the isolation mode is active.
Bonus fact: newer members of our community may not be aware that, 20 years ago, Tosatti was known as Marcelo the Wonder Penguin.
User-managed concurrency groups
In May of this year, Peter Oskolkov posted a patch set for a mechanism called "user-managed concurrency groups", or UMCG. This work is evidently a version of a scheduling framework known as "Google Fibers", which is naturally one of the most ungoogleable terms imaginable. This patch set has suffered from a desultory attempt to explain what it is actually supposed to implement, but the basic picture is becoming more clear over time.
UMCG is meant to be a lightweight, user-space-controlled, M:N threading mechanism; this document, posted after some prodding, describes its core concepts. A user-space process can set up one or more concurrency groups to manage its work. Within each group, there will be one or more "server" threads; the plan seems to be that applications would set up one server thread for each available CPU. There will also be any number of "worker" threads that carry out the jobs that the application needs done. At any given time, each server thread can be running one worker. User space will control which worker threads are running at any time by attaching them to servers; notifications for events like workers blocking on I/O allow the servers to be kept busy.
In the August 1 version of the patch set, there are two system calls defined to manage this mechanism. A call to umcg_ctl() will register a thread as a UMCG task, in either the server or the worker mode; it can also perform unregistration. umcg_wait() is the main scheduling mechanism; a worker thread can use it to pause execution, for example. But a server thread can also use umcg_wait() to wake a specific worker thread or to force a context switch from one worker thread to another; the call will normally block for as long as the worker continues to run. Once umcg_wait() returns, the server thread can select a new worker to execute next.
Or so it seems; there is little documentation for how these system calls are really meant to be used and no sample code at all. The most recent version of the series did, finally, include a description of the system calls, something that had been entirely absent in previous versions. Perhaps as a result, this work has seen relatively little review activity so far. Oskolkov seems to be focused on how the in-kernel functionality is implemented, but reviewers are going to want to take a long and hard look at the user-space API, which would have to be supported indefinitely if this subsystem were to be merged. UMCG looks like interesting and potentially useful work, but this kind of core-kernel change is hard to merge in the best of conditions; the absence of information on what is being proposed has made that process harder so far.
New features in Neovim 0.5
Neovim 0.5, the fifth major version of the Neovim editor, which descends from the venerable vi editor by way of Vim, was released on July 2. This release is the culmination of almost two years of work, and it comes with some major features that aim to modernize the editing experience significantly. Highlights include native support for the Language Server Protocol (LSP), which enables advanced editing features for a wide variety of languages, improvements to its Lua APIs for configuration and plugins, and better syntax highlighting using Tree-sitter. Overall, the 0.5 release is a solid upgrade for the editor; the improvements should please the existing fan base and potentially draw in new users and contributors to the project.
The Neovim project was started by Thiago Padilha in 2014 shortly after his patch to introduce multi-threading capabilities to Vim was rejected without much in the way of feedback. This event was the major trigger that led Padilha to create this fork, with the explicit aim of improving the usability, maintainability, and extensibility of Vim while facilitating a more open and welcoming environment.
A built-in LSP client
The Language Server Protocol is an open-source specification that standardizes programming language features across different source code editors and integrated development environments (IDEs). It facilitates communication between code-editing tools (clients) and locally running language servers to provide language-specific smarts such as auto-completion, find-and-replace, go-to-definition, diagnostics, and refactoring assistance.
Prior to the development of LSP, the work of providing support for a programming language had to be implemented for each IDE or text editor, either directly in the code, or through its extension system, which led to varying levels of support across language and editor combinations. The LSP standard enables the decoupling of language services from the editor into a self-contained piece so that language communities can concentrate on building a single server that has a deep understanding of a language. Other tools can then provide advanced capabilities for any programming language simply by integrating with the existing language servers.
While it was already possible to use LSP in Neovim with the help of third-party plugins, the 0.5 release adds native LSP support to Neovim for the first time. The introduction of LSP in Neovim allows the editor to act as a client, informing a language server about user actions (such as executing a "go-to-definition" command); the server answers the request with the appropriate information, which could be the location of the definition for the symbol under the cursor. That will allow the editor to navigate to the specified location in the file or project.
The interface provided by the Neovim LSP client is a general one, so it does not support all of the features that are available in third-party LSP plugins (e.g. auto-completion). It was built to be extensible, though, so it includes a Lua framework that allows plugins to add features not currently supported in the Neovim core. Setting up individual language servers for the editor can be done using the nvim-lspconfig plugin, which helps with the launching and initialization of language servers that are currently installed on the system. Note that language servers are not provided by Neovim or nvim-lspconfig; they must be installed separately. There is a long list of LSP servers supported by the nvim-lspconfig plugin.
Lua integration
Initial support for the Lua programming language in Neovim landed in the 0.2.1 release in 2017. It has seen continued development and deeper integration in the editor since then, most notably with the addition of a Neovim standard library for Lua in the 0.4 release in 2019. The Neovim developers expect Lua to become a first-class scripting language in the editor, thus providing an alternative to VimL, which is the scripting language inherited from Vim. Neovim 0.5 takes big strides toward the realization of this goal by improving the Lua API and adding init.lua as an alternative to init.vim for configuring the editor.
A good explanation of the rationale behind the decision to embed Lua in Neovim can be found in a video of a talk by Justin M. Keyes, a lead maintainer for the project. In summary, Lua is a more approachable language than VimL due to its simplicity and ease of embedding. It is also an order of magnitude faster than VimL. Neovim supports Lua 5.1, which was released in 2006, rather than more recent versions of Lua, such as 5.3 or 5.4 (released 2015 and 2020 respectively), mostly due to LuaJIT, which only supports Lua 5.1. The motivation for maintaining compatibility with LuaJIT stems from its significant performance advantages over the standard Lua compiler.
Adding Lua to Neovim has made it easier to extend the capabilities of the editor and contribute to its core code, especially for users who have been put off by VimL, which is not a language that is used outside of Vim. Since Lua is also heavily used for scripting video games and for extending other programs written in a variety of languages (C, C++, Java, etc.), there is an abundance of resources available for learning the language, along with examples that show how to use it to interact with APIs from other languages. This wealth of information on Lua makes it possible for new plugin authors and aspiring Neovim contributors to get up to speed with the language quickly.
The Lua support in Neovim has made it the preferred language for exposing newer Neovim features, such as the LSP client; these APIs can only be used from Lua, since VimL cannot interact with them directly. However, VimL support in Neovim is not going anywhere, and the Neovim developers do not anticipate any reason to deprecate it, so migrating an existing init.vim configuration to init.lua, or porting a VimL plugin to Lua for its own sake, is completely optional at this time. The only caveat is that using these newer APIs (such as LSP or Tree-sitter) from an init.vim configuration or a VimL plugin requires embedding Lua snippets within the existing VimL code.
Although deeper Lua integration is seen as one of the main achievements of the 0.5 release, not all of the reactions toward the push to supplant VimL in the editor core have been positive. There is some concern that the emphasis on Lua APIs, and Lua-only plugins, will lead to a split in the plugin community where an increasing number of plugins will be Neovim-only (as opposed to supporting both Vim and Neovim). Also, an improved and not entirely backward-compatible version of VimL (currently referred to as Vim9) is under active development by Vim creator Bram Moolenaar and other Vim contributors. It is not entirely clear whether the Neovim maintainers plan to support Vim9, since they are more invested in Lua. At the time of this writing, there are already several Lua plugins that work only in Neovim, and a handful of Vim9 plugins that work only in Vim. It is therefore easy to speculate that the ecosystems for both projects may diverge significantly in the near future as there are currently no plans to bring a similar level of Lua integration into Vim.
Tree-sitter
Tree-sitter is a new parsing system that aims to replace the limited, regular-expression-based code-analysis capabilities that are prevalent in current developer tools. It is a high-performance parser generator that can build parsers to create an incremental syntax tree for a source file, and can efficiently update the syntax tree in real time as the file is being edited. In Neovim 0.5, support for Tree-sitter has been added to the editor core, although it is currently classed as experimental due to some known bugs along with performance issues for large files. The expectation is that it will become stable in the next major release (0.6), which should be expected in a year or two judging from past releases.
Using Tree-sitter in Neovim makes it possible for the editor to understand the code in a source file as a tree of programming language constructs (such as variables, functions, types, keywords, etc.), and use that information to handle those constructs consistently. When a Tree-sitter parser is installed and enabled for a specific language, the editor's syntax highlighting will be based on the syntax trees it provides; this results in improvements to the use of color to outline the structure of the code more clearly. In particular, object fields, function names, keywords, types, and variables will be highlighted more consistently throughout the file.
Tree-sitter is also able to do incremental parsing, which keeps the syntax tree up to date as the code is being edited. This puts an end to the practice of re-parsing an entire file from scratch in order to update its syntax highlighting after a change is made, which is currently the case with regular-expression-based highlighting systems. That leads to significant speed improvements.
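Tree-sitter exposes this capability through its C library: after an edit, the application adjusts the old syntax tree to account for the changed byte ranges and hands it back to the parser, which reuses the unchanged subtrees. A sketch (assuming a separately built tree_sitter_c() grammar; error handling omitted):

    #include <string.h>
    #include <tree_sitter/api.h>

    const TSLanguage *tree_sitter_c(void);  /* from a grammar library */

    void reparse(const char *old_src, const char *new_src,
                 const TSInputEdit *edit)
    {
        TSParser *parser = ts_parser_new();
        ts_parser_set_language(parser, tree_sitter_c());

        TSTree *tree = ts_parser_parse_string(parser, NULL,
                                              old_src, strlen(old_src));

        /* Describe where the edit happened, then reparse; unchanged
         * subtrees are reused rather than rebuilt. */
        ts_tree_edit(tree, edit);
        TSTree *new_tree = ts_parser_parse_string(parser, tree,
                                                  new_src, strlen(new_src));

        ts_tree_delete(new_tree);
        ts_tree_delete(tree);
        ts_parser_delete(parser);
    }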
Tree-sitter has been lauded for its improved syntax-highlighting capabilities, but it also enables the definition of language-aware text objects better suited to editing code than what is provided by default in the editor. The nvim-treesitter-textobjects module allows the creation of text objects for constructs like classes, functions, parameters, conditionals, and more, which can be manipulated just as easily as words or sentences. Several examples of the Tree-sitter-based highlighting can be seen in the gallery for the nvim-treesitter repository.
Wrapping up
The features above make up the bulk of this release, but Neovim 0.5 also includes improvements and bug fixes to the user interface, as well as smaller features such as support for remote plugins written in Perl 5.22+ on Unix platforms. It is also worth mentioning that around 1000 Vim patches were merged in this release, updating various aspects of the editor. The full list of changes, fixes and refinements can be seen in the release notes linked above.
The Neovim project uses GitHub issues to track all feature and bug requests, so a list of closed issues for the 0.5 milestone is available for a further exploration of the changes that made it into this release. The planning for subsequent releases is detailed on the project's roadmap page, while priorities are tracked through GitHub milestones. Contributions from the community are welcome, of course, and the project maintainers may be reached via Gitter, Matrix, or the #neovim room on irc.libera.chat.
Page editor: Jonathan Corbet