Symbolic links in "sticky" directories
Security problems that exploit badly written programs by placing symbolic links in /tmp are legion. This kind of flaw has existed in applications going back to the dawn of UNIX time, and new ones get introduced regularly. So a recent effort to change the kernel to avoid these kinds of problems would seem, at first glance anyway, to be welcome. But some kernel hackers are not convinced that the core kernel should be fixing badly written applications.
These /tmp symlink races are in a class of security vulnerabilities known as time-of-check-to-time-of-use (TOCTTOU) bugs. For /tmp files, typically a buggy application will check to see if a particular filename exists and/or if the file has a particular set of characteristics; if the file passes that test, the program uses it. An attacker exploits this by racing to put a symbolic link or different file in /tmp between the time of the check and the open or create. That allows the attacker to bypass whatever the checks are supposed to enforce.
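For illustration, here is a minimal sketch of the vulnerable check-then-use pattern in C; the program and filename are invented, but the shape is typical of the buggy code in question:

```c
/* Sketch of the vulnerable check-then-use pattern; the program
 * and the file name are invented for illustration. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    const char *path = "/tmp/myapp.log";    /* predictable name */

    /* Time of check: the file does not exist, so it looks safe. */
    if (access(path, F_OK) == 0) {
        fprintf(stderr, "%s already exists\n", path);
        return 1;
    }

    /* Time of use: if an attacker created a symlink at "path"
     * between the two calls, open() follows it, and a privileged
     * program ends up writing through the link. */
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    write(fd, "hello\n", 6);
    close(fd);
    return 0;
}
```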
For programs with normal privilege levels, these attacks can cause a variety of problems, but don't lead to system compromise. But for setuid programs, an attacker can use the elevated privileges to overwrite arbitrary files in ways that can lead to all manner of ugliness, including complete compromise via privilege escalation. There are various guides that describe how to avoid writing code with this kind of vulnerability, but the flaw still gets reported frequently.
Ubuntu security team member Kees Cook proposed changing the kernel to avoid the problem, not by removing the race, but by stopping programs from following the symlinks that get created. "Proper" fixes in applications will completely avoid the race by creating random filenames that get opened with O_CREAT|O_EXCL. But, since these problems keep cropping up after multiple decades of warnings, perhaps another approach is in order. Cook adapted code from the Openwall and grsecurity kernels that did just that.
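The race-free idiom is short; mkstemp() is the usual way to get a random name that is created and opened with O_CREAT|O_EXCL in a single atomic step (a minimal sketch, with an invented name prefix):

```c
/* Sketch of the race-free idiom: mkstemp() picks a random name and
 * opens it with O_CREAT|O_EXCL atomically, so a pre-planted symlink
 * causes a failure instead of being followed.  The "myapp-" prefix
 * is invented for illustration. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    char path[] = "/tmp/myapp-XXXXXX";   /* Xs replaced by mkstemp() */

    int fd = mkstemp(path);
    if (fd < 0) {
        perror("mkstemp");
        return 1;
    }
    printf("using %s\n", path);
    write(fd, "hello\n", 6);
    close(fd);
    unlink(path);   /* done with the temporary file */
    return 0;
}
```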
Since the problems occur in shared directories (like /tmp and /var/tmp) which are world-writable, but with the "sticky bit" turned on so that users can only delete their own files, the patch restricts the kinds of symlinks that can be followed in sticky directories. In order for a symlink in a sticky directory to be followed, it must either be owned by the follower, or the directory and symlink must have the same owner. Since shared temporary directories are typically owned by root, and random attackers cannot create symlinks owned by root, this would eliminate the problems caused by /tmp file symlink races.
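The rule itself fits in a few lines. The actual patch implements it in the kernel's link-following code, so the userspace model below, names included, is only an illustration of the logic:

```c
/* Userspace model of the rule the patch enforces; the real code
 * lives in the kernel's path-walking machinery, and all names here
 * are hypothetical. */
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* May a process running as "uid" follow symlink "lnk" found in
 * directory "dir"? */
static bool sticky_symlink_ok(const struct stat *dir,
                              const struct stat *lnk, uid_t uid)
{
    /* Only world-writable sticky directories are restricted. */
    if (!(dir->st_mode & S_ISVTX) || !(dir->st_mode & S_IWOTH))
        return true;
    if (lnk->st_uid == uid)            /* follower owns the link */
        return true;
    if (lnk->st_uid == dir->st_uid)    /* link and directory share owner */
        return true;
    return false;
}

int main(int argc, char **argv)
{
    struct stat dir, lnk;

    if (argc != 2 || lstat("/tmp", &dir) != 0 || lstat(argv[1], &lnk) != 0)
        return 1;
    printf("follow %s: %s\n", argv[1],
           sticky_symlink_ok(&dir, &lnk, getuid()) ? "yes" : "no");
    return 0;
}
```

Because /tmp is owned by root, the second ownership test keeps root-created symlinks usable by everyone, while a symlink planted by an attacker is followed only by the attacker.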
The first version of the patch elicited a few suggestions, and an ACK by Serge Hallyn, but no complaints. Cook obviously did a fair amount of research into the problem and anticipated some objections from earlier linux-kernel discussions, which he linked to in the post. He also linked to a list of 243 CVE entries that mention /tmp—not all are symlink races, but many of them are. When Cook revised and reposted the patch, though, a few complaints cropped up.
For one thing, Cook had anticipated that VFS developers would object to putting his test into that code, so he put it into the capabilities checks (cap_inode_follow_link()) instead. That didn't sit well with Eric Biederman, who objected that the capabilities code was the wrong place for the check.
Alan Cox agreed that it should go into SELinux or some specialized Linux security module (LSM). He also suggested that giving each user their own /tmp mountpoint would solve the problem as well, without requiring any kernel changes: "Give your users their own /tmp. No kernel mods, no misbehaviours, no weirdomatic path walking hackery. No kernel patch needed that I can see."
But Cook and others are not convinced that there are any legitimate applications that require the ability to follow these kinds of symlinks. Given that following them has been a source of serious security holes, why not just fix it once and for all in the kernel? One could argue that changing the behavior would violate the POSIX standard—one of the objections Cook anticipated—but that argument may be a bit weak. Ted Ts'o believes that POSIX doesn't really apply because the sticky bit isn't in the standard:
Per-user /tmp directories might solve the problem, but come with an administrative burden of their own. Eric Paris notes that while it might be a better solution, it doesn't come for free.
Ts'o agrees: "I do have a slight preference against per-user /tmp mostly because it gets confusing for administrators, and because it could be used by rootkits to hide themselves in ways that would be hard for most system administrators to find."

Based on that and other comments, Cook revised the patches again, moving the test into VFS rather than trying to come in through the security subsystem.
In addition, he changed the code so that the new behavior defaulted "off" to address one of the bigger objections. Version 3 of the patch was posted on June 1, and has so far only seen comments from Al Viro, who doesn't seem convinced of the need for the change, but was nevertheless discussing implementation details.
It may be that Viro and other filesystem developers (Christoph Hellwig, for example, did not seem particularly in favor of the change) will oppose it. It is, at some level, a band-aid to protect poorly written applications, but it also provides a measure of protection that some would like to have. As Cook pointed out, the Ubuntu kernel already has this protection, but he would like to see it extended to all kernel users. Whether that happens remains to be seen.
| Index entries for this article | |
|---|---|
| Kernel | Security |
| Security | Linux kernel |
| Security | Race conditions |
| Security | Vulnerabilities/Temporary files |
Posted Jun 3, 2010 10:35 UTC (Thu) by nix (subscriber, #2304)

> But some kernel hackers are not convinced that the core kernel should be fixing badly written applications.

I haven't seen anyone mentioning this, which is good, because anyone who did would be a fool. The 'badly written' code in question works perfectly well and securely everywhere except in writable-by-attacker directories (which pretty much means sticky-bitted ones in practice), so what we really have here is a Unix API which works fine everywhere except in /tmp, and appears to work even there. Are we surprised that developers don't notice this? It's an API that requires extra effort to use correctly, and whose failure is invisible until a malicious attacker exploits it: of course they get it wrong!

Also, as you pointed out, fixing every application (including all the binary-only ones) or educating every single developer is impractical: it hasn't happened in decades and we couldn't get everyone to upgrade even if we did fix it. At least if the kernel was blocking this we'd only need to get people to upgrade to that kernel once, and then this class of problems would be history.
(I just checked at my workplace, where fifty-odd developers write Unix financial server apps. Not one of them knew what TOCTTOU races were. A single one had heard of symlink attacks in /tmp, but wasn't clear how you avoided them. More than half of them had written code that writes to /tmp for one reason or another. If we are trying to fix this by educating developers, we are failing.)
Posted Jun 3, 2010 10:52 UTC (Thu) by djm (subscriber, #11651)
Did you miss the entire wakelocks/suspend blockers discussion? (I do agree with you though)
Posted Jun 3, 2010 13:47 UTC (Thu) by dwheeler (guest, #1216)
We've been waiting for several decades for these magical developers who never, ever, ever make a mistake. We will wait forever. After all, information on how to write secure programs on Linux is widely available; see my Secure Programming for Linux and Unix HOWTO, for example. It's time to change our systems so that the vulnerabilities cannot happen in the first place. Making the system invulnerable to symlink attacks via /tmp, making it harder to exploit via buffer overflows, and fixing Unix/Linux filenames are all part of that.
Posted Jun 3, 2010 19:17 UTC (Thu) by wahern (subscriber, #37304)
In fact, TOCTTOU is just a subset of atomicity bugs. Thus, anyone who doesn't know how to avoid symlink attacks is almost certainly committing a litany of other errors. In any multitasking environment one shouldn't try to memorize when and where to worry about atomicity gotchas, but when and where one doesn't need to worry. In other words, developers don't need to remember about "symlink attacks", because unless they can affirmatively exclude the possibility of concurrent operations, it must be assumed, and they should endeavor to prove that their code is appropriately safe.
IMHO, a better approach to this particular problem would be to add a check to valgrind for unsafe sticky-bit-directory operations (e.g. not using O_EXCL with O_CREAT, using a fixed name, etc.). Sadly, too few developers use valgrind, which is an even more pressing issue.
The enduring solution to the problem is something called "open source"--allowing careful users to analyze correctness--and something called "free software"--allowing conscientious users to fix any problem and share their solutions with others without encumbrance. Add to this mix tools like valgrind which can--or can be modified to--detect these problems, and you have a reliable and flexible solution for not just this but many other problems.
I'm as pragmatic as the next guy, but papering over these kinds of mistakes is just a bad idea.
Posted Jun 4, 2010 5:11 UTC (Fri) by dwheeler (guest, #1216)
"Preventing all possible TOCTTOU bugs" isn't necessary or even relevant for this patch to be useful. You don't have to prevent imperfect developers from making EVERY bug for bug prevention to be useful. Simply find the more common kinds of errors that CAN be prevented, and try to prevent them. After a while there will be diminishing returns, but we aren't there today.
I haven't seen anyone argue that they NEED this particular symlink functionality, other than to implement attacks.
> IMHO, a better approach to this particular problem would be to add a check to valgrind for unsafe sticky-bit-directory operations (e.g. not using O_EXCL with O_CREAT, using a fixed name, etc.). Sadly, too few developers use valgrind, which is an even more pressing issue.
Of course, that "better approach" fails for the reason you gave: It is possible (and likely) that a developer will not use valgrind.
It also fails because valgrind only detects problems if your tests happen to use that functionality. If it's not in the test suite that you use with valgrind, you still miss the problem even IF you use valgrind.
Valgrind definitely has its uses, but it's nowhere near as effective in this case.
> I'm as pragmatic as the next guy, but papering over these kinds of mistakes is just a bad idea.
This is not "papering over". This is "limiting damage" or "preventing disaster". Preventing the opening of symlinks in this case doesn't prevent all problems, of course, and in fact, the software still won't work quite as originally intended. But it greatly reduces damage, giving developers (1) warning that they have a problem, and (2) time to fix it.
Obligatory car analogy: This is like a seat belt in a car. Clearly, it's better not to run your car into a wall or another car. But several decades have shown conclusively that we can't prevent all accidents. We should *certainly* try to reduce accidents further when we can, but we also need to reduce the damage caused when accidents happen.
I suggest taking a look at "Normal Accident theory". Increasingly, people are realizing that accidents are NORMAL in complicated systems. We should prevent them where we can, but where we can't, we need to reduce their consequences.
Posted Jun 4, 2010 5:27 UTC (Fri) by dwheeler (guest, #1216)

I suggest taking a look at "Security Enhancements in Red Hat Enterprise Linux" by Ulrich Drepper. He describes a set of changes to ELF layouts and various restrictions that end up greatly reducing the vulnerabilities of systems even when programs have bugs (as they always do). "Disruptions are still possible, but the severity of the attacks is significantly reduce[d]".
Posted Jun 4, 2010 13:24 UTC (Fri) by jschrod (subscriber, #1646)
Sorry, but I don't consider it "as pragmatic as the next guy" to argue against stopping the most common class of problems by pointing out that there are multiple similar problem classes that actually happen much less often. (Empirical data: the symlink attack is the most common atomicity problem in the CVE database.)
Posted Jun 8, 2010 20:42 UTC (Tue) by nix (subscriber, #2304)
I find it hard to believe that this is uncommon.
Posted Jun 8, 2010 23:46 UTC (Tue) by jschrod (subscriber, #1646)
The only place where I see them meet concurrency and all its associated problems is in Java web applications with session state. And then they mark all methods of the respective session bean classes as "synchronized", without ever analyzing if it's needed or if it's sufficient. Well, most of them wouldn't know how to analyze it in the first place; cargo cult programming at its best.
And that's not limited to specific customers; I make the same observation in finance (not the equity departments, though), automotive, and telco companies.
Posted Jun 3, 2010 10:43 UTC (Thu) by mikachu (guest, #5333)
> Since shared temporary directories are typically owned by root, and random attackers cannot create symlinks owned by root, this would eliminate the problems caused by /tmp file symlink races.

This is a false statement, unless I'm missing something subtle:

    # mkdir tmp; chmod 777 tmp; chmod +t tmp; cd tmp
    # ln -s /etc/shadow rootapprovedlink
    $ ln rootapprovedlink omghax
    # echo hello > omghax

(Note that the third step is performed by a non-root user.) The symlink that gets hardlinked can be anywhere on the same partition as /tmp, so the attack is somewhat mitigated if /tmp is its own filesystem.
Posted Jun 3, 2010 16:01 UTC (Thu) by clugstj (subscriber, #4020)
The disagreement seems to me to be the usual one: one group wants the problem fixed NOW (the proposed solution is good enough), the other wants it fixed CORRECTLY (the proposed solution is more offensive than the problem, or could result in other as-yet-unforeseen problems, or could prevent the implementation of a better solution).