Hardening the "file" utility for Debian

By Jake Edge
August 14, 2019

The file command would seem to be an ideal candidate for sandboxing; it routinely handles untrusted input. But an effort to add seccomp() filtering to file for Debian has run aground. The upstream file project has added support for sandboxing via seccomp() but it does not play well with other parts of the Debian world, package building in particular. This situation provides further evidence that seccomp() filtering is brittle and difficult to use.

The discussion began with a post to the debian-devel mailing list where Christoph Biedl announced that he had enabled the file sandbox feature for the unstable repository. He was asking that other Debian developers keep an eye out for problems. He noted that the feature has some drawbacks:

This however comes with a price: Some features are no longer available. For example, inspecting the content of compressed files (disabled by default, command-line parameters -z and -Z) is now supported for a few compressions only: gzip (and friends, see libz), bzip2, lzma, xz. Decompressing other formats requires invocation of external programs which will lead to a program abort (SIGSYS).

In addition, he had already encountered problems with file running in environments with non-standard libraries that were loaded using the LD_PRELOAD environment variable. Those libraries can (and do) make system calls that the regular file binary does not make; the system calls were disallowed by the seccomp() filter.

Building a Debian package often uses FakeRoot (or fakeroot) to run commands in a way that makes it appear that they have root privileges for filesystem operations, without actually granting any extra privileges. That is done so that tarballs and the like can be created containing files with owners other than the user ID running the Debian packaging tools, for example. Fakeroot maintains a mapping of the "changes" made to owners, groups, and permissions for files so that it can report those to other tools that access them. It does so by interposing a library ahead of the GNU C library (glibc) to intercept file operations.
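As a rough illustration of that bookkeeping, the following Python sketch records pretended ownership changes and overlays them on the real stat() results. It is a toy model, not fakeroot's code: the class name FakeOwnership is hypothetical, and keying the table by device and inode number is an assumption made for this sketch.

```python
import os


class FakeOwnership:
    """Toy model of fakeroot-style bookkeeping for file ownership."""

    def __init__(self):
        # (device, inode) -> (uid, gid); a real tool would track more state
        self._overrides = {}

    def chown(self, path, uid, gid):
        # Record the change instead of calling os.chown(), which would
        # fail with EPERM for an unprivileged user.
        st = os.stat(path)
        self._overrides[(st.st_dev, st.st_ino)] = (uid, gid)

    def stat_owner(self, path):
        # Report the pretended owner if one was recorded, else the real one.
        st = os.stat(path)
        return self._overrides.get((st.st_dev, st.st_ino), (st.st_uid, st.st_gid))
```

A packaging tool running under such a layer would "chown" a file to root and later see root reported as the owner, while the file on disk stays untouched.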

In order to do its job, fakeroot spawns a daemon (faked) that is used to maintain the state of the changes that programs make inside of the fakeroot. The libfakeroot library that is loaded with LD_PRELOAD will then communicate with the daemon via either System V (sysv) interprocess communication (IPC) calls or by using TCP/IP. Biedl referred to a bug report in his message, where Helmut Grohne had reported a problem with running file inside a fakeroot. The msgget() system call was the cause in that case; Biedl changed the Debian file whitelist to specifically allow that call before his announcement:

Also, when running in LD_PRELOAD environments, that extra library may use blacklisted syscalls. One example is fakeroot which caused breakage in debhelper (#931985, already fixed). In both cases you should see a log message in the kernel log then.

There is a workaround for such situations which is disabling seccomp, command line parameter --no-sandbox.

As it turns out, though, his fix was specific to the sysv IPC mechanism; in order to make it work with TCP/IP, more whitelisting of system calls will be needed, as Grohne pointed out. Furthermore, blocking mechanisms like IPC and networking is just what the filter should be doing; those are the kinds of calls you don't want to make if file is compromised, he said. Instead of playing whack-a-mole with system calls, he suggested checking for the presence of LD_PRELOAD libraries and turning off the sandbox for those cases.
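A check along the lines Grohne suggested might look like the Python sketch below. The function names are hypothetical. The sketch also looks at /etc/ld.so.preload, the system-wide preload list read by the dynamic loader, which goes beyond what Grohne proposed.

```python
import os


def preload_sources(environ=None, preload_file="/etc/ld.so.preload"):
    """Return the preload mechanisms that appear to be active."""
    environ = os.environ if environ is None else environ
    found = []
    if environ.get("LD_PRELOAD", "").strip():
        found.append("LD_PRELOAD")
    try:
        with open(preload_file) as fh:
            # Ignore blank lines and '#' comments in the preload list.
            if any(line.split("#", 1)[0].strip() for line in fh):
                found.append(preload_file)
    except OSError:
        pass  # no readable system-wide preload list
    return found


def should_enable_sandbox(environ=None, preload_file="/etc/ld.so.preload"):
    """Enable the sandbox only when nothing is being injected."""
    return not preload_sources(environ, preload_file)
```

Note that this is exactly the behavior Biedl worried about: the decision is made silently, so a caller would probably want to log the result of preload_sources() when the sandbox is skipped.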

That idea did not sit entirely well with Biedl, who was concerned with "silently disabling this security feature in a production system". He thought that perhaps disabling the filter for build environments might be a way forward. Meanwhile, on debian-devel, several people thanked Biedl for enabling the filter, seeing it as a good step toward helping to secure the system. Russ Allbery said:

I would love to see more places where seccomp is at least present, if optional and off by default, since it provides an option to use the program more securely and accept that it breaks a lot of features. A great example would be ghostscript -- I would love to be able to prevent it from executing programs, writing to arbitrary files, and many other things that, strictly speaking, are part of the PostScript spec and therefore upstream wants to support in the more general case. Everyone who cares about this already has to pass in -dSAFER, so we're already dealing with the complexity cost of this being optional.

But Biedl eventually had to deliver some bad news in the thread. He disabled the system-call filtering in file because of the problems it caused:

Several issues popped up in the last days as a result of that change, and in spite of some band-aiding to current implementation of seccomp in the file program creates way more trouble than I am willing to ignore. So, sadly, I've reverted seccomp support for the time being to avoid further disruption of the bullseye development.

However, he did point out that Grohne had suggested some ideas for ways to make the sandboxing of file more workable. In the bug report, Grohne said:

The issue at hand is that file sets up its sandbox early and thus has to allow a ton of system calls (including open and write) even in the full sandbox. You can easily append to ~/.bashrc and escape the next time a user logs in. I'm arguing that this sandbox is somewhat useless, because it is way too weak. If file were opening the input file and the magic database prior to applying the sandbox, it could support a much stricter sandbox. In principle, it could do with just write (for communicating the result) and _exit. You can implement that using prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT, 0, 0, 0), which is available for more than 10 years now. The code for loading (not parsing) the input file and the magic database is relatively harmless and confining it is what breaks fakeroot. The parsing code doesn't need syscalls, so it should be unaffected by most LD_PRELOAD hacks.

Of course, getting there is essentially rewriting the seccomp feature in file. You cannot easily bolt it onto file in the way it currently is.
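The split Grohne describes can be sketched in Python as two phases: one that needs system calls to load the data, and one that only looks at bytes already in memory. This is a structural illustration only. The function names and the four-entry MAGIC table are hypothetical stand-ins, and the sketch does not apply any seccomp mode itself.

```python
# Hypothetical, heavily abbreviated stand-in for the magic database.
MAGIC = [
    (b"\x7fELF", "ELF"),
    (b"\x89PNG\r\n\x1a\n", "PNG image data"),
    (b"%PDF-", "PDF document"),
    (b"\x1f\x8b", "gzip compressed data"),
]


def load_input(path, limit=1 << 20):
    """Loading phase: needs open() and read(), so it runs unconfined."""
    with open(path, "rb") as fh:
        return fh.read(limit)


def identify(data, magic=MAGIC):
    """Parsing phase: touches only in-memory bytes, makes no system calls."""
    for signature, description in magic:
        if data.startswith(signature):
            return description
    return "data"


def classify(path):
    data = load_input(path)
    # A real implementation would enter the strict sandbox at this point,
    # before any of the untrusted bytes are interpreted.
    return identify(data)
```

Because identify() never opens anything, an LD_PRELOAD library that intercepts file operations has nothing to intercept in the confined phase.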

Reworking the sandbox that way is something that will need to be worked out with the upstream project, and Biedl said that he plans to do so. There were several suggestions on how to approach the problem in the mailing list thread as well. Colin Watson commiserated with Biedl, reporting on the problems he encountered when adding seccomp() filtering:

I ran into a ton of annoying problems like that when I added seccomp filtering to man-db (the idea there being to limit what damage might be done by potential bugs in groff and friends). The worst difficulties are from third-party programs that some people have installed: there are a couple of apparently fairly popular Linux antivirus tools that work by installing an LD_PRELOAD wrapper that talks to a private daemon using a Unix-domain socket and/or a System V message queue; there's a VPN that does something similar even though it really has no business existing at this level or interfering with processes that have nothing to do with networking; and there's the "snoopy" package in Debian that logs messages to /dev/log on execve.

At the moment my compromise solution is to reluctantly open up the minimum possible set of syscalls I could find that stopped people sending me bug reports that were in fact caused by something injected from outside my software, and to limit most of that to only those cases where I've detected the relevant LD_PRELOAD wrappers as being present.

The fragility of the seccomp() solution extends to glibc and kernel versions, as Vincent Bernat pointed out. Those kinds of problems could be detected through automated testing, Philipp Kern suggested. Biedl said that it is something he is working on.

In file, we have a strong candidate for hardening, as it parses and handles file data that often has unknown origins—textbook untrusted input, in other words. But actually using seccomp() filtering to reduce its attack surface has not been successful for Debian. In truth, hardening programs that are often used in conjunction with LD_PRELOAD is always going to be difficult to impossible. But even just changing the version of glibc (which can potentially change the system calls it makes) or which kernel the tool is running on can invalidate the carefully crafted whitelist.

The OpenBSD pledge() system call provides a different path. Developers can specify which system calls are allowed, but only in broad categories like stdio (file operations, mostly), inet (IPv4 and IPv6 calls), or proc (process calls, such as fork(), but not including execve(), which is governed by the exec category). By not tying the filtering directly to individual system calls, some of the problems that Linux seccomp() users have encountered can be avoided. It also doesn't hurt that the OpenBSD user space is released in lockstep with the kernel.

For its file utility, OpenBSD systematically reduces the privileges that the tool has with multiple pledge() calls. It starts by disallowing all but a handful of categories after processing the command-line arguments. It then forks a process that executes the child() function, which reduces privileges further, eventually to only have stdio and recvfd. The child reads messages from the parent, each of which includes a file descriptor for a file to be tested. In that way, the code that is most at risk for compromise is only able to perform fairly minimal operations.
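The descriptor-passing half of that design can be sketched in Python with socket.send_fds() and socket.recv_fds() (available in Python 3.9 and later). The helper names are hypothetical, and the sketch leaves out the fork and the pledge() calls, which have no Python or Linux equivalent here; it only shows that the receiving side never needs to open a path itself.

```python
import os
import socket


def send_file_for_testing(sock, path):
    """Parent side: open the file and pass only its descriptor along."""
    fd = os.open(path, os.O_RDONLY)
    try:
        name = os.fsencode(os.path.basename(path))
        socket.send_fds(sock, [name], [fd])
    finally:
        os.close(fd)  # the receiver gets its own duplicate


def receive_and_sniff(sock, nbytes=16):
    """Child side: receive a descriptor and read the first few bytes."""
    msg, fds, _flags, _addr = socket.recv_fds(sock, 1024, 1)
    fd = fds[0]
    try:
        return os.fsdecode(msg), os.read(fd, nbytes)
    finally:
        os.close(fd)
```

In a privilege-separated tool, the two functions would run in different processes connected by socket.socketpair(), with the receiving process having dropped everything except the ability to read, write, and accept descriptors.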

For Linux, it may well be that seccomp() filtering just isn't suitable for retrofitting onto existing projects. Completely separating the "worrisome" code (file-format parsing for file, for example) from the unavoidable code (e.g. opening files) may provide a path, but also probably means the existing code will have to be rewritten or at least majorly thrashed. The calls that LD_PRELOAD libraries are targeting for interception will likely be in that unavoidable part. Perhaps that could even lead hardened subprocesses to simply use the older, simpler seccomp() mode, as suggested by Grohne. That seems preferable to playing a never-ending game of whack-a-mole.


Hardening the "file" utility for Debian

Posted Aug 14, 2019 19:19 UTC (Wed) by clugstj (subscriber, #4020) [Link] (20 responses)

At this point, wouldn't it be easier to rewrite "file" from scratch with security in mind instead of trying to use "seccomp()"?

Hardening the "file" utility for Debian

Posted Aug 14, 2019 19:23 UTC (Wed) by josh (subscriber, #17465) [Link] (2 responses)

The biggest problem here isn't the file utility, it's the expectation people have that they can LD_PRELOAD a random library that intercepts function calls and makes different syscalls while maintaining a security sandbox.

Hardening the "file" utility for Debian

Posted Aug 14, 2019 21:40 UTC (Wed) by clugstj (subscriber, #4020) [Link]

No, the problem is that one group of people want functionality (using LD_PRELOAD to do kewl things) and another group want to use "seccomp()" for security. These two "wants" don't play nice together.

Hardening the "file" utility for Debian

Posted Aug 15, 2019 11:31 UTC (Thu) by pbonzini (subscriber, #60935) [Link]

But it seems to me that the same issues would happen with pledge and OpenBSD has fixed them. The article itself hints that you could even use seccomp v1 if the architecture of file is changed to split restricted code into a separate process.

Hardening the "file" utility for Debian

Posted Aug 14, 2019 19:30 UTC (Wed) by juliank (guest, #45896) [Link] (8 responses)

You can't rewrite file with security in mind w/o using seccomp. The point is that there is likely some bug somewhere in the parsing code that can be abused to execute programs or talk to network. seccomp is the only way to prevent that (well, you could switch network and mount namespaces into like empty namespaces, but um, sounds like similarly broken and more work, and not working as user depending on your distro).

Hardening the "file" utility for Debian

Posted Aug 14, 2019 19:53 UTC (Wed) by mathstuf (subscriber, #69389) [Link]

Well, choosing a language other than C is sure to help here. Assuming you have more advanced facilities like parser combinators or PEG grammars actually reading the untrusted code, the lack of open-coded fiddly bits (pointer increment, off-by-one loops, </<= mixups, etc.) is certainly a far better step in the right direction.

Hardening the "file" utility for Debian

Posted Aug 14, 2019 20:09 UTC (Wed) by epa (subscriber, #39769) [Link] (6 responses)

You could write the parsing code in a safe language. Then, if there isn't a call to exec() literally appearing in the source code, there's no way the code can be tricked into calling exec() by overwriting the stack due to a missing bounds check, integer overflow or whatever. There are safe dialects of C which are probably compatible enough for the parsing code to work.

Hardening the "file" utility for Debian

Posted Aug 14, 2019 20:26 UTC (Wed) by juliank (guest, #45896) [Link] (5 responses)

I don't think there are safe languages. Runtime bugs add a lot of unsafety; having another layer of protection is important.

Hardening the "file" utility for Debian

Posted Aug 14, 2019 23:14 UTC (Wed) by roc (subscriber, #30627) [Link] (2 responses)

In practice, if you write the parsing code in Rust or Go and avoid doing something exceptionally stupid like using Rust's "unsafe" keyword, your code will not be exploitable. For evidence, take a look at https://github.com/rust-fuzz/trophy-case and observe how few security bugs there are, and how they involved explicit use of "unsafe".

You can argue it still wouldn't be "safe" for some meaning of the word, but seccomp filters aren't "safe" in those terms either.

Having said that, extra layers of protection are still good and grappling with the issues in this post is still important. In particular, if `file` was written in Rust but users' systems inject C code into it via LD_PRELOAD, then savvy attackers would target that C code. Witness the security vulnerabilities introduced by AV filters over the years.

Hardening the "file" utility for Debian

Posted Aug 18, 2019 3:02 UTC (Sun) by k8to (guest, #15413) [Link] (1 responses)

Your code will not be exploitable by memory overruns, use after free, etc type problems. There are other potential attacks on software.

Hardening the "file" utility for Debian

Posted Aug 18, 2019 5:59 UTC (Sun) by dvdeug (guest, #10998) [Link]

Okay? You can give up on worrying about potential attacks on software, but it seems bizarre to worry about potential attacks on software and ignore the ability to ignore memory overruns, use after free, etc. type problems.

Hardening the "file" utility for Debian

Posted Aug 15, 2019 13:01 UTC (Thu) by epa (subscriber, #39769) [Link]

There may not be any safe languages but there are certainly dangerous ones.

While there are bugs in the Java or .NET runtimes, or other language runtimes, getting an exploit through one of those is usually much harder than the swarm of exploits a C program will contain unless written with exceptional discipline by a highly skilled programmer.

But actually I wasn't really suggesting one of these heavyweight managed languages that pulls along a runtime environment. Rust doesn't have a runtime, for example. The Cyclone programming language is a safer dialect of C which also doesn't have any special run time requirements.

Hardening the "file" utility for Debian

Posted Aug 17, 2019 11:21 UTC (Sat) by dvdeug (guest, #10998) [Link]

There are a wide variety of languages wherein buffer overflows and similar tricks can not run arbitrary code. A Scheme program, for example, can crash due to being out of memory, but will never allow arbitrary code execution. Do Scheme interpreters and compilers, and the libraries they use, have bugs? Sure, but it's like driving a rusted-out car across country instead of flying because "there's no such thing as a safe vehicle".

Hardening the "file" utility for Debian

Posted Aug 14, 2019 20:05 UTC (Wed) by Deleted user 129183 (guest, #129183) [Link] (7 responses)

> At this point, wouldn't it be easier to rewrite "file" from scratch with security in mind instead of trying to use "seccomp()"?

Well, I guess that’s exactly the reason that OpenBSD has their own, seemingly NIH, implementation of it…

Hardening the "file" utility for Debian

Posted Aug 14, 2019 22:55 UTC (Wed) by wahern (subscriber, #37304) [Link] (6 responses)

OpenBSD had a syscall filtering mechanism, systrace, years before seccomp. Unlike the ptrace-based version of systrace on Linux, BSD systrace was incorporated into the kernel. The problem was that nobody used it--it was too low-level. So after several years systrace was ripped out. pledge and unveil is what came about after people chewed on the problem for a few more years.

The original contributors of seccomp were quite familiar with systrace. seccomp only permits filtering scalars because it was through systrace that it was proven how easy it was to bypass string path filtering unless the kernel copied the string. (Later releases of systrace came with a huge warning that string path filtering wasn't secure.) As I recall, seccomp was originally intended for sandboxing Chrome NaCl, which by design only required read and write from the sandboxed process. seccomp was the minimal amount necessary to put into the kernel to make NaCl work. I don't think Google ever intended seccomp to grow into a general purpose syscall filtering or sandboxing mechanism as it was already obvious from the history of systrace that the low-level semantics don't lend themselves to more sophisticated use cases.

So, no, pledge isn't NIH. seccomp is basically a *worse* systrace, and everybody knew systrace was a dead end.

Another alternative is Capsicum. OpenBSD rejected this for the same reasons the file(1) maintainer hasn't yet refactored their codebase: Capsicum, like the original seccomp, is premised on using a multi-process privilege separation model, which requires a lot of work, *especially* for preexisting codebases. Capsicum is a great model, but it doesn't prove to be a viable solution for preexisting code.

Hardening the "file" utility for Debian

Posted Aug 14, 2019 23:46 UTC (Wed) by roc (subscriber, #30627) [Link] (3 responses)

You're mixing up the original seccomp with seccomp-bpf. The original seccomp was designed by Andrea Arcangeli, not Google, and only allowed read/write on open file descriptors. Later Google added seccomp-bpf to reduce exposed kernel attack surface from sandboxed Chrome processes, not just NaCl but also Web content processes.

It's very important to keep in mind that implementing a sandbox with just seccomp is usually a terrible idea. In Linux, kernel namespaces are the best way to construct a sandbox. Then you apply a seccomp filter as an extra layer of defense, to reduce the kernel API attack surface exposed to sandboxed code. This is what Chrome and Firefox do. OpenBSD of course doesn't have kernel namespaces. Comparing pledge() to seccomp-bpf for constructing sandboxes is really a mistake, you should compare pledge() to kernel namespaces (with or without an additional seccomp-bpf layer).

> I don't think Google ever intended seccomp to grow into a general purpose syscall filtering or sandboxing mechanism

Perhaps Andrea Arcangeli didn't, but Google certainly did, otherwise their decision to use BPF to express arbitrary predicates is unfathomable.

Personally I'm pretty glad they did. We use seccomp-bpf for selective syscall interception in rr in a way that a dedicated sandbox API like pledge() would never have supported. That feature is critical for low overhead in rr recording.

Dead-end or not, seccomp-bpf is working in practice for Firefox, Chrome, rr, and others.

Hardening the "file" utility for Debian

Posted Aug 15, 2019 0:34 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link] (2 responses)

> OpenBSD of course doesn't have kernel namespaces.

They do have unveil()-based file "namespaces".

Hardening the "file" utility for Debian

Posted Aug 15, 2019 1:52 UTC (Thu) by roc (subscriber, #30627) [Link] (1 responses)

unveil() lets you whitelist filesystem paths. I think it's confusing to call that "namespaces". chroot() is more namespace-like.

Hardening the "file" utility for Debian

Posted Aug 15, 2019 2:29 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link]

That's why I put "namespaces" in scare quotes, because in practice it functions similarly to the unshare()-then-bind-mount trick that systemd and other software use on Linux.

Hardening the "file" utility for Debian

Posted Aug 15, 2019 14:41 UTC (Thu) by Deleted user 129183 (guest, #129183) [Link] (1 responses)

> So, no, pledge isn't NIH.

I was talking about the OpenBSD reimplementation of `file`, not about `pledge`.

Hardening the "file" utility for Debian

Posted Aug 16, 2019 0:52 UTC (Fri) by flussence (guest, #85566) [Link]

I think it's incorrect to say OpenBSD's the one suffering from a Not Invented Here problem here.

Hardening the "file" utility for Debian

Posted Aug 14, 2019 19:38 UTC (Wed) by juliank (guest, #45896) [Link] (8 responses)

nss is also a huge issue for seccomp()-ing, as your libc can now suddenly call different syscalls. We saw that when we added seccomp to apt's downloading processes, and eventually disabled it.

Hardening the "file" utility for Debian

Posted Aug 14, 2019 19:44 UTC (Wed) by josh (subscriber, #17465) [Link] (7 responses)

What would it take to, on Debian, move all NSS name resolution out-of-process into a daemon that loads NSS modules, and have the only in-process name resolution be to call that out-of-process daemon?

(There are complications here, such as chroots and similar, but this would be a reasonable default configuration.)

Hardening the "file" utility for Debian

Posted Aug 14, 2019 20:28 UTC (Wed) by juliank (guest, #45896) [Link]

Hmm, I don't know, I think that's an upstream libc/systemd question. I could imagine a systemd-nssd that provides nss stuff over dbus.

Hardening the "file" utility for Debian

Posted Aug 14, 2019 20:52 UTC (Wed) by dezgeg (subscriber, #92243) [Link] (3 responses)

The solution already exists in glibc: nscd

Hardening the "file" utility for Debian

Posted Aug 14, 2019 21:28 UTC (Wed) by mezcalero (subscriber, #45103) [Link] (2 responses)

nscd is not what you appear to think it is: the nscd client in glibc has a very short time-out, after which it falls back to traditional, non-nscd client-side NSS. It is thus not suitable as a sandboxing solution; it is only and exclusively a cache for speeding things up, following the theory that it makes no sense to block for a long time on a daemon whose purpose is to make lookups take less time. The short time-out is also an effective method to make deadlocks due to local IPC less penalizing.

Hardening the "file" utility for Debian

Posted Aug 14, 2019 22:39 UTC (Wed) by dezgeg (subscriber, #92243) [Link] (1 responses)

Seriously? That is... terrible. Is this actually documented somewhere? Who knows how many setups are (not necessarily intentionally) relying on lookups always going through nscd, not just for sandboxes that might lack the nss libraries but for example not having the 32-bit equivalents of the nss libraries installed. For one, the NixOS distribution relies on nscd for all other nss queries except for the ones included in glibc since there is no global /lib.

Having the NSS libraries loaded into the caller's address space just needs to die. Just this week I had to debug an issue with a (proprietary, but not relevant to discussion) software distributed as binaries with all the libraries bundled. And this broke since some NSS module from the system (with new glibc) needed to be loaded but was using some symbols that didn't exist in the bundled libc. Of course, installing nscd was the solution.

Hardening the "file" utility for Debian

Posted Aug 14, 2019 22:43 UTC (Wed) by juliank (guest, #45896) [Link]

Wow that's terrible

Hardening the "file" utility for Debian

Posted Aug 27, 2019 7:57 UTC (Tue) by cortana (subscriber, #24596) [Link] (1 responses)

See also sssd. Although I don't think it can make use of arbitrary NSS modules itself, rather it just provides a daemon that knows how to talk to IPA, AD, generic LDAP and so on.

Hardening the "file" utility for Debian

Posted Aug 27, 2019 13:25 UTC (Tue) by Jonno (subscriber, #49613) [Link]

> Although I don't think it can make use of arbitrary NSS modules itself,

Actually it can. By using the "proxy" id_provider sssd will use a specified nss library as a backend.

Unfortunately the sssd nss service does not support all NSS databases, so using sssd is not a complete solution (sssd_nss can provide passwd, shadow, group, netgroup and services; but not hosts, networks, protocols, ethers, or rpc).

Hardening the "file" utility for Debian

Posted Aug 14, 2019 20:41 UTC (Wed) by zblaxell (subscriber, #26385) [Link] (4 responses)

Why not just unset LD_PRELOAD before calling file? Unless I missed something, file doesn't need to participate in fakeroot stat mangling (or NSS for that matter--file doesn't translate uids to names or interact with hostnames or URLs). Detect LD_PRELOAD in file's main function, unset it, and re-exec file so seccomp works properly.

Allowing LD_PRELOAD to propagate from a low-privileged context to a high-privileged one is obviously a bad idea--which is why it's disabled for setuid programs. Why does allowing LD_PRELOAD to propagate from a high-privileged context to a low-privileged one seem like a good idea? Sounds like it's just asking for exactly the kind of problems listed above. If you're going to sandbox a process, that should include deleting most of its environment variables, so that you can predict what environment it's going to run in. That conflicts with the fundamental ideas behind fakeroot and NSS, but...well, maybe they weren't particularly good ideas anyway.

fakeroot is a fun hack and all, but maybe there are better ways to solve the underlying problem? e.g. run packaging tasks on a FUSE filesystem that acts like the user is doing everything as root, and patch the two or three packages that still do euid == 0 checks during build. That approach is going to work for things that aren't running on top of the C library, too.

NSS is a real annoyance: you try to map a UID to a name, and suddenly your thread blocks for network IO, or crashes, or gets RCE vulnerabilities, because someone configured NSS to do something dumb, and unprivileged users don't get an easy way to turn it off. There should be an easy way to turn NSS off per process--users can already LD_PRELOAD in a sane implementation of getpwnam(), getpwuid(), and so on, so it does not introduce any new vulnerabilities to have an environment variable (ignored for setuid programs) that lets users override nsswitch.conf more conveniently.
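The first suggestion in this comment, clearing the environment and re-executing, could be sketched as follows in Python. The KEEP allowlist is hypothetical (which variables file really needs is an assumption), as are the function names. The approach does nothing about libraries listed in /etc/ld.so.preload.

```python
import os

# Hypothetical allowlist of variables that survive the re-exec.
KEEP = ("PATH", "LANG", "LC_ALL", "LC_CTYPE", "LC_MESSAGES",
        "MAGIC", "POSIXLY_CORRECT", "TZ")


def scrubbed_environment(environ, keep=KEEP):
    """Return a copy of environ containing only allowlisted variables."""
    return {k: v for k, v in environ.items() if k in keep}


def reexec_if_preloaded(program, argv, environ):
    """Re-execute program with a clean environment if LD_PRELOAD is set.

    Returns None when no re-exec is needed; otherwise it does not return,
    because os.execve() replaces the current process image.
    """
    if "LD_PRELOAD" not in environ:
        return None
    os.execve(program, argv, scrubbed_environment(environ))
```

The scrubbed environment never contains LD_PRELOAD, so the re-executed program takes the early-return path and the check cannot loop.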

Hardening the "file" utility for Debian

Posted Aug 15, 2019 9:58 UTC (Thu) by cjwatson (subscriber, #7322) [Link]

Unsetting LD_PRELOAD doesn't even solve all the preloading problems: for better or worse, people are apparently using antivirus tools and VPNs that inject themselves using /etc/ld.so.preload. Convincing ld.so to enter "secure mode" would help, but as far as I know all the methods for doing that involve being privileged in some way.

Hardening the "file" utility for Debian

Posted Aug 15, 2019 17:01 UTC (Thu) by rwmj (subscriber, #5474) [Link] (2 responses)

I'm also inclined to think fakeroot is the root cause of the problem here, rather than file or seccomp. Other distros manage to package broadly the same set of packages as Debian and they don't use fakeroot. Instead the package builder uses a combination of DESTDIR and added metadata marking the desired ownership and permission on installed files.

Hardening the "file" utility for Debian

Posted Aug 15, 2019 18:14 UTC (Thu) by mathstuf (subscriber, #69389) [Link]

Indeed. How do Go programs (which don't call libc) expect to be affected by `fakeroot`?

Hardening the "file" utility for Debian

Posted Aug 16, 2019 7:06 UTC (Fri) by smcv (subscriber, #53363) [Link]

Yes, fakeroot is the problem here, and Debian is moving away from it. Packages with the "Rules-Requires-Root: no" field are built without fakeroot or similar tricks, which works for packages where every file in the .deb is owned by root:root (including those that use dpkg-statoverride to chown a file during installation, like dbus). This is opt-in because it's potentially a backwards-incompatible change.

My understanding is that for the minority of packages that contain files with different ownership (for example audit and shadow), there are plans for some sort of declarative file-ownership metadata (analogous to how RPM does it), but that isn't available yet.

Hardening the "file" utility for Debian

Posted Aug 14, 2019 21:44 UTC (Wed) by mezcalero (subscriber, #45103) [Link] (5 responses)

Well, I think the lesson to learn here is that LD_PRELOAD is a crutch, a hacker tool and should not be used for clean codepaths. It's not just incompatible with seccomp style stuff, but totally incompatible with anything involving suid binaries or fcaps or such. I am not sure why anyone would bother with making seccomp work with LD_PRELOAD if not even suid works with it...

While I generally do agree that maintaining seccomp policies is nastier than people might think, I also think it's manageable if you are careful. Specifically, seccomp policies that trigger SIGSYS are a really bad idea, as are syscall blacklists. If you stick to whitelists and stick to returning EPERM for unlisted syscalls you should mostly be fine, as most code that runs in environments it doesn't know well (i.e. NSS module code, library code, or even LD_PRELOAD hacks) tends to be written carefully enough to handle EPERM in a graceful way. Moreover, new syscalls added to the kernel this way also return EPERM and most code using such new syscalls tends to have fallbacks in place anyway to support slightly older kernels, and these codepaths are triggered then. In addition, in a world of SELinux and AppArmor, apps are vaguely prepared for getting EPERM/EACCES from various places already, thus getting them from some syscalls is fine too.

In systemd we started out with blacklisting and our logic defaults to triggering SIGSYS. Today we know we probably should not even have bothered with blacklisting at all, nor with SIGSYS because it's unmanageable, but we didn't know that when we first added support. It appears the folks who started this work in the 'file' tool made the same mistakes...

(Oh, and grouping syscalls is kinda important too: ideally libseccomp would even do that on its own. Policies shouldn't need to spell all 4 syscalls for sending a datagram individually nor the 7 syscalls for changing ownership of a file. In systemd we defined our own grouping to make this manageable, but this sounds like a concept to have in libseccomp itself. With such grouping you get the coarseness that pledge() provides.)

Lennart
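The two ideas in this comment, a whitelist that answers with EPERM and named groups of related system calls, can be modeled in a few lines of Python. This is a model of the policy logic only, not a seccomp filter: nothing here is installed in the kernel, the Policy class is hypothetical, and the group names and membership lists are illustrative rather than taken from systemd or libseccomp.

```python
import errno

# Illustrative groups; real groupings would be longer and per-architecture.
SYSCALL_GROUPS = {
    "stdio": {"read", "write", "close", "fstat", "lseek", "exit_group"},
    "chown": {"chown", "fchown", "lchown", "fchownat",
              "chown32", "fchown32", "lchown32"},
    "datagram": {"send", "sendto", "sendmsg", "sendmmsg"},
}


class Policy:
    """Whitelist policy: anything not listed gets an errno, not a signal."""

    def __init__(self, groups=(), extra=(), default_errno=errno.EPERM):
        self.allowed = set(extra)
        for name in groups:
            self.allowed |= SYSCALL_GROUPS[name]
        self.default_errno = default_errno

    def check(self, syscall):
        """Return 0 if allowed, otherwise the errno the caller would see."""
        return 0 if syscall in self.allowed else self.default_errno
```

A policy built as Policy(groups=["stdio", "chown"]) allows every ownership-changing call at once, and a system call that did not exist when the policy was written simply falls through to the default errno.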

Hardening the "file" utility for Debian

Posted Aug 14, 2019 22:46 UTC (Wed) by juliank (guest, #45896) [Link] (1 responses)

My idea was to return ENOSYS for blocked syscalls. This should have one advantage that it does not break when libc migrates to a newer syscall, as it silently falls back to the old one.

Grouping is a bit tricky maybe. It's likely that there are different bugs in different variants of the same syscall, so it might make sense to only allow the latest one.
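The fallback behavior that the ENOSYS idea relies on can be illustrated with a small Python helper. The name call_with_fallback is hypothetical, and this shows the calling-side pattern only, under the assumption that a blocked system call surfaces as an OSError carrying ENOSYS.

```python
import errno


def call_with_fallback(preferred, fallback, *args):
    """Try the newer call; use the older one only if the kernel (or a
    filter) reports that the newer call does not exist."""
    try:
        return preferred(*args)
    except OSError as exc:
        if exc.errno != errno.ENOSYS:
            raise  # EPERM, EACCES, etc. are real errors, not "missing"
        return fallback(*args)
```

Code written this way keeps working when a filter answers ENOSYS for a call it does not know, whereas an EPERM answer propagates as an error unless the caller treats it as a fallback trigger too.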

Hardening the "file" utility for Debian

Posted Aug 15, 2019 17:01 UTC (Thu) by luto (guest, #39314) [Link]

One could use SIGSYS but catch the SIGSYS, log it, and emulate -ENOSYS.

I once wrote a library for this, but I wrote it as a patch to libseccomp that was a bit out of place. I should just release it standalone.

Hardening the "file" utility for Debian

Posted Aug 14, 2019 23:01 UTC (Wed) by roc (subscriber, #30627) [Link] (2 responses)

Hear hear! LD_PRELOAD is simply not a reliable tool for intercepting syscalls in a production system. Using LD_PRELOAD for syscall interception completely fails if the application does raw syscalls in its own code, and composition is terrible so trying to compose multiple LD_PRELOAD interceptors together invariably fails.

It would be nice to have a composable, fast, reliable user-space syscall interception mechanism. LD_PRELOAD isn't it.

Hardening the "file" utility for Debian

Posted Aug 19, 2019 14:59 UTC (Mon) by bpearlmutter (subscriber, #14693) [Link] (1 responses)

The fakeroot-ng program uses ptrace instead of library hacking, and might meet all your desiderata.

Hardening the "file" utility for Debian

Posted Aug 24, 2019 12:41 UTC (Sat) by xilun (guest, #50638) [Link]

Can ptrace have multiple users targeting a process now? If not I don't see how it is composable. And it is even not debuggable...

Hardening the "file" utility for Debian

Posted Aug 14, 2019 22:47 UTC (Wed) by jamesmorris (subscriber, #82698) [Link]

seccomp is useful for reducing the attack surface of the kernel (i.e. restrict access to only required syscalls), but it's not intended as a least privilege mechanism or sandboxing on its own. seccomp operates at the wrong abstraction level, as evidenced by comments about having to specify 4 syscalls for one type of operation. The LSM API with higher level policies is a better fit for least priv, as you can utilize all security relevant information at an operation-focused granularity for policy enforcement.

Hardening the "file" utility for Debian

Posted Aug 15, 2019 10:42 UTC (Thu) by rbranco (subscriber, #129813) [Link] (2 responses)

echo FROM scratch | podman build -t scratch -f - .

alias ldd='podman run --rm -v /:/:ro --net=none --env-file <(env) scratch ldd'
alias file='podman run --rm -v /:/:ro --net=none --env-file <(env) scratch file'

Hardening the "file" utility for Debian

Posted Aug 22, 2019 6:24 UTC (Thu) by Siosm (subscriber, #86882) [Link] (1 responses)

Nice one! Here is a similar one with Bubblewrap:

alias file='bwrap --unshare-all --ro-bind / / --dev /dev --tmpfs /tmp --tmpfs /proc --tmpfs /sys -- file'
alias ldd='bwrap --unshare-all --ro-bind / / --dev /dev --tmpfs /tmp --tmpfs /proc --tmpfs /sys -- ldd'

Hardening the "file" utility for Debian

Posted Aug 22, 2019 8:55 UTC (Thu) by rbranco (subscriber, #129813) [Link]

My trick doesn't work anymore, and I don't know why: I just can't mount the root filesystem with podman. Also, the -v /:/:ro is problematic because recursive bind mounts don't mount the other filesystems under / read-only. Code would have to detect the filesystems (ideally from /etc/fstab, skipping pseudo-filesystems and network filesystems) and mount them all read-only as volumes using the bind-nonrecursive option.

Hardening the "file" utility for Debian

Posted Aug 15, 2019 14:59 UTC (Thu) by joey (guest, #328) [Link]

There are quite a few programs besides `file` that use libmagic. It seems likely that some of them are easier to exploit than `file` because they accept untrusted input from the network and pass it directly to libmagic.

Hardening libmagic will benefit all of them, while hardening `file` does not.