A vulnerability in Git
A potentially nasty vulnerability in the Git distributed revision-control system was disclosed on March 9. There are enough qualifiers in the description of the vulnerability that it may appear to be fairly narrowly focused—and it is. That may make it less worrisome, but it is not entirely clear. As with most vulnerabilities, it all depends on how the software is being used and the environment in which it is running.
The vulnerability (CVE-2021-21300) could lead to code execution on the local system when cloning from a repository crafted to exploit it. It requires that some kind of Git filter be installed. Filters are used to manipulate files in between the filesystem and the Git repository; "smudge" filters are used when pulling blobs (binary objects) out of the repository to store in the working directory, while "clean" filters can change files as they are being committed into the repository. Which of those types is needed will depend on the type of transformation being performed. Git Large File Storage (LFS) is a commonly used extension (with both smudge and clean filters), which is installed by default with Git on Windows.
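To make the two filter directions concrete, here is a minimal sketch of a clean/smudge pair in Python. It is loosely inspired by the pointer-file idea behind large-file extensions, but it is not how Git LFS is implemented: the `STORE` dictionary, the pointer format, and the function names are all hypothetical, and a real filter would be a separate program that Git runs according to its configuration.

```python
import hashlib

# Hypothetical in-memory stand-in for a remote large-file store.
# Nothing here is part of Git or Git LFS.
STORE = {}


def clean(data: bytes) -> bytes:
    """Working directory -> repository: swap the content for a small pointer."""
    digest = hashlib.sha256(data).hexdigest()
    STORE[digest] = data
    return f"pointer sha256:{digest}\n".encode()


def smudge(pointer: bytes) -> bytes:
    """Repository -> working directory: swap the pointer back for the content."""
    digest = pointer.decode().strip().split(":", 1)[1]
    # In a real setup this lookup could be a slow network fetch, which is
    # the kind of work a filter might ask Git to delay.
    return STORE[digest]


pointer = clean(b"large file contents")
restored = smudge(pointer)
```

In this toy model, the repository would only ever hold the short pointer, while the working directory gets the full content back at checkout time.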
Filters are able to delay the normal processing of Git operations so that long-running filtering can be completed in the background. For example, Git LFS may need to copy a large file across the network in order to satisfy a checkout operation. But the delay feature changes the normal order in which files and directories are processed by Git. That, in turn, means that information cached by the tool may no longer be valid when it is relied upon, which is exactly where the vulnerability lies.
In order to reduce the number of lstat() calls that are made, Git maintains a cache called the "lstat cache". If a path collision (i.e. two files with the same path and name) occurs as the files are being checked out, for example if two files with names that differ only in their case are being checked out into a case-insensitive filesystem, that cache could be left in an invalid state. That does not typically lead to a problem because the checkouts proceed in a known order so the cache is not actually needed at the point where it is invalid.
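As a rough illustration of what a case-based path collision looks like, the following Python sketch groups names that become identical after case folding. It is only an approximation: real case-insensitive filesystems apply their own folding rules, and the function name `find_case_collisions` is made up for this example.

```python
from collections import defaultdict


def find_case_collisions(paths):
    """Return groups of paths that differ only in case."""
    groups = defaultdict(list)
    for path in paths:
        # str.casefold() approximates what a case-insensitive
        # filesystem does when it compares names.
        groups[path.casefold()].append(path)
    return [names for names in groups.values() if len(names) > 1]


collisions = find_case_collisions(["Makefile", "makefile", "README"])
print(collisions)
```

On a case-sensitive filesystem, both "Makefile" and "makefile" can exist side by side; on a case-insensitive one, the second checkout lands on the same path as the first.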
However, if certain parts of the checkout are delayed by the filters, all bets are off. When the cache is consulted, the type of the files in the cached path may have changed; if that change was crafted by an attacker, unpleasantness is sure to occur. The patch fixing the vulnerability described the problem this way:
But, there are some users of the checkout machinery that do not always follow the index order. In particular: checkout-index writes the paths in the same order that they appear on the CLI (or stdin); and the delayed checkout feature -- used when a long-running filter process replies with "status=delayed" -- postpones the checkout of some entries, thus modifying the checkout order. When we have to check out an out-of-order entry and the lstat() cache is invalid (due to a previous path collision), checkout_entry() may end up using the invalid data and [trusting] that the leading components are real directories when, in reality, they are not. In the best case scenario, where the directory was replaced by a regular file, the user will get an error: "fatal: unable to create file 'foo/bar': Not a directory". But if the directory was replaced by a symlink [symbolic link], checkout could actually end up following the symlink and writing the file at a wrong place, even outside the repository. Since delayed checkout is affected by this bug, it could be used by an attacker to write arbitrary files during the clone of a maliciously crafted repository.
Several paths to a fix were considered, including disabling the cache for unordered checkouts or sorting the file names so that they are always processed in the same order. Both of those had performance impacts and there was a concern that other code paths could someday lead to unordered processing, thus reviving the bug. Instead, the cache is simply invalidated whenever a remove-directory operation is performed.
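The interaction between the stale cache and an out-of-order write, and the effect of invalidating on directory removal, can be sketched with a toy model in Python. This is not Git's code: the `ToyCheckout` class, its methods, and the simplified "filesystem" dictionary are all hypothetical, and the cache here only remembers which names were once verified as real directories.

```python
class ToyCheckout:
    """Toy model of a checkout on a case-insensitive filesystem."""

    def __init__(self, invalidate_on_rmdir):
        self.fs = {}                  # folded name -> "dir" or "symlink"
        self.verified_dirs = set()    # stand-in for the lstat cache
        self.invalidate_on_rmdir = invalidate_on_rmdir

    @staticmethod
    def _key(name):
        # Names that differ only in case map to the same entry.
        return name.lower()

    def mkdir(self, name):
        self.fs[self._key(name)] = "dir"
        self.verified_dirs.add(self._key(name))

    def replace_dir_with_symlink(self, name):
        key = self._key(name)
        del self.fs[key]              # the remove-directory step
        if self.invalidate_on_rmdir:
            self.verified_dirs.clear()
        self.fs[key] = "symlink"

    def write_file(self, directory):
        key = self._key(directory)
        if key not in self.verified_dirs:
            # Cache miss: actually look at what is on disk.
            if self.fs.get(key) != "dir":
                return "refused"
            self.verified_dirs.add(key)
        # Cache hit: the leading component is trusted without a check.
        if self.fs[key] == "symlink":
            return "followed symlink"
        return "written"


def run(invalidate_on_rmdir):
    checkout = ToyCheckout(invalidate_on_rmdir)
    checkout.mkdir("A")                     # directory "A" is checked out
    checkout.replace_dir_with_symlink("a")  # colliding symlink "a" replaces it
    return checkout.write_file("A")         # delayed entry arrives out of order


print(run(invalidate_on_rmdir=False))
print(run(invalidate_on_rmdir=True))
```

Without invalidation, the delayed write trusts the cached answer and goes through the symbolic link; with it, the write re-checks the path and is refused.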
As noted, symbolic links play a role in the ability to exploit the vulnerability. While highly useful, symbolic links have also historically been used to wreak havoc in various ways. They often feature in race condition exploits (e.g. for temporary files) and the like. Not all systems support symbolic links, though Unix-derived systems (Linux, macOS) generally do; these days, Windows administrators can also create symbolic links.
So it is a combination of several different features and situations that leads to an exploitable system, including the existence of an attacker-crafted repository that users need to be convinced to clone. Even for Windows systems, where both Git LFS and case-insensitive filesystems are the norm, exploits are seemingly not at all common, and perhaps even non-existent. This has the look of a problem discovered via code inspection or testing that was subsequently reported and fixed quickly, without even time for a catchy name, logo, and web site. If any systems have been exploited, it seems most probable that the attacks were highly targeted and may not have been discovered (yet).
While Linux-native filesystems are not usually case-insensitive, they can be. Beyond that, though, Linux can make use of native filesystem formats for Windows and macOS that have such functionality. In addition, the test cases provided with the fix show another way to cause the problem: via Unicode normalization. The test case uses two different Unicode representations for "ä" (U+0061 U+0308, "\141\314\210", and U+00e4, "\303\244") to ensure that no files are written to the wrong place. So it may be less likely that Linux systems are affected by the bug, but they are not immune.
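The two representations of "ä" mentioned above can be checked with a few lines of Python. The helper name `octal_escapes` is made up for this sketch; `unicodedata.normalize()` is from the standard library.

```python
import unicodedata

decomposed = "a\u0308"    # U+0061 followed by combining diaeresis U+0308
precomposed = "\u00e4"    # U+00E4, the single-code-point form


def octal_escapes(text):
    """Render the UTF-8 bytes of text as backslash-octal escapes."""
    return "".join(f"\\{byte:03o}" for byte in text.encode("utf-8"))


print(octal_escapes(decomposed))   # \141\314\210
print(octal_escapes(precomposed))  # \303\244

# The strings differ byte for byte, but become equal once normalized.
print(decomposed == precomposed)                                 # False
print(unicodedata.normalize("NFC", decomposed) == precomposed)   # True
```

A filesystem that normalizes names would treat both strings as the same path, which gives a path collision without any involvement of case.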
