LWN: Comments on "Hardening the "file" utility for Debian"

Hardening the "file" utility for Debian

Jonno — Tue, 27 Aug 2019 13:25:03 +0000

> Although I don't think it can make use of arbitrary NSS modules itself,

Actually it can. By using the "proxy" id_provider sssd will use a specified nss library as a backend.

Unfortunately the sssd nss service does not support all NSS databases, so using sssd is not a complete solution (sssd_nss can provide passwd, shadow, group, netgroup and services; but not hosts, networks, protocols, ethers, or rpc).

Hardening the "file" utility for Debian

cortana — Tue, 27 Aug 2019 07:57:05 +0000

See also sssd. Although I don't think it can make use of arbitrary NSS modules itself, rather it just provides a daemon that knows how to talk to IPA, AD, generic LDAP and so on.

Hardening the "file" utility for Debian

xilun — Sat, 24 Aug 2019 12:41:08 +0000

Can ptrace have multiple users targeting a process now? If not, I don't see how it is composable. And it is not even debuggable...

Hardening the "file" utility for Debian

rbranco — Thu, 22 Aug 2019 08:55:25 +0000

My trick doesn't work anymore, and I don't know why: I just can't mount the root filesystem with podman any longer. Also, -v /:/:ro is problematic because a recursive bind mount doesn't make the other filesystems mounted under / read-only. Code would have to detect the filesystems (ideally from /etc/fstab, skipping pseudo-filesystems and network filesystems) and mount each of them read-only as a volume using the bind-nonrecursive option.
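A minimal sketch of that detection step, assuming fstab-format input and a podman version whose --mount option accepts bind-nonrecursive. The function name fstab_to_ro_mounts is hypothetical, and the lists of skipped filesystem types are illustrative rather than complete:

```shell
# Hypothetical helper: read fstab-format lines on stdin and print one
# podman --mount option per local, non-pseudo filesystem.
# Mount points containing escaped spaces (\040) are not handled.
fstab_to_ro_mounts() {
    awk '
        /^[ \t]*#/ { next }    # skip comments
        NF < 3     { next }    # skip blank or malformed lines
        # pseudo-filesystems and swap (illustrative list)
        $3 ~ /^(proc|sysfs|tmpfs|devtmpfs|devpts|cgroup2|swap)$/ { next }
        # network filesystems (illustrative list)
        $3 ~ /^(nfs|nfs4|cifs|smb3|9p|fuse\.sshfs)$/ { next }
        # entries without a real mount point
        $2 !~ /^\// { next }
        { printf "--mount type=bind,source=%s,destination=%s,ro,bind-nonrecursive\n", $2, $2 }
    '
}
```

The output could then be spliced into the command line, for example podman run --rm $(fstab_to_ro_mounts < /etc/fstab) ..., relying on the shell's word splitting.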

Hardening the "file" utility for Debian

Siosm — Thu, 22 Aug 2019 06:24:07 +0000

Nice one! Here is a similar one with Bubblewrap:


alias file='bwrap --unshare-all --ro-bind / / --dev /dev --tmpfs /tmp --tmpfs /proc --tmpfs /sys -- file'


alias ldd='bwrap --unshare-all --ro-bind / / --dev /dev --tmpfs /tmp --tmpfs /proc --tmpfs /sys -- ldd'

Hardening the "file" utility for Debian

bpearlmutter — Mon, 19 Aug 2019 14:59:52 +0000

The fakeroot-ng program uses ptrace instead of library hacking, and might meet all your desiderata.

Hardening the "file" utility for Debian

dvdeug — Sun, 18 Aug 2019 05:59:33 +0000

Okay? You can give up on worrying about potential attacks on software, but it seems bizarre to worry about potential attacks on software and ignore the ability to rule out memory overruns, use-after-free, and similar problems.

Hardening the "file" utility for Debian

k8to — Sun, 18 Aug 2019 03:02:15 +0000

Your code will not be exploitable through memory overruns, use-after-free, and similar problems. There are other potential attacks on software.

Hardening the "file" utility for Debian

dvdeug — Sat, 17 Aug 2019 11:21:19 +0000

There are a wide variety of languages wherein buffer overflows and similar tricks cannot run arbitrary code. A Scheme program, for example, can crash due to being out of memory, but will never allow arbitrary code execution. Do Scheme interpreters and compilers, and the libraries they use, have bugs? Sure, but it's like driving a rusted-out car across country instead of flying because "there's no such thing as a safe vehicle".

Hardening the "file" utility for Debian

smcv — Fri, 16 Aug 2019 07:06:28 +0000

Yes, fakeroot is the problem here, and Debian is moving away from it. Packages with the "Rules-Requires-Root: no" field are built without fakeroot or similar tricks, which works for packages where every file in the .deb is owned by root:root (including those that use dpkg-statoverride to chown a file during installation, like dbus). This is opt-in because it's potentially a backwards-incompatible change.
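As a small illustration, a package opts in by declaring the field in debian/control. The helper below is only a sketch for spotting that opt-in; the name builds_without_fakeroot is hypothetical, and it simply greps the whole file rather than parsing the source stanza:

```shell
# Hypothetical helper: succeed if the given debian/control file declares
# "Rules-Requires-Root: no", i.e. the package has opted in to building
# without fakeroot. Other values (such as binary-targets) do not match.
builds_without_fakeroot() {
    grep -Eiq '^Rules-Requires-Root:[[:space:]]*no[[:space:]]*$' "$1"
}

# Usage (from an unpacked source tree):
#   builds_without_fakeroot debian/control && echo "no fakeroot needed"
```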

My understanding is that for the minority of packages that contain files with different ownership (for example audit and shadow), there are plans for some sort of declarative file-ownership metadata (analogous to how RPM does it), but that isn't available yet.

Hardening the "file" utility for Debian

flussence — Fri, 16 Aug 2019 00:52:02 +0000

I think it's incorrect to say OpenBSD's the one suffering from a Not Invented Here problem here.

Hardening the "file" utility for Debian

mathstuf — Thu, 15 Aug 2019 18:14:03 +0000

Indeed. How do Go programs (which don't call libc) expect to be affected by `fakeroot`?

Hardening the "file" utility for Debian

rwmj — Thu, 15 Aug 2019 17:01:48 +0000

I'm also inclined to think fakeroot is the root cause of the problem here, rather than file or seccomp. Other distros manage to package broadly the same set of packages as Debian and they don't use fakeroot. Instead the package builder uses a combination of DESTDIR and added metadata marking the desired ownership and permission on installed files.

Hardening the "file" utility for Debian

luto — Thu, 15 Aug 2019 17:01:16 +0000

One could use SIGSYS but catch the SIGSYS, log it, and emulate -ENOSYS.

I once wrote a library for this, but I wrote it as a patch to libseccomp that was a bit out of place. I should just release it standalone.

Hardening the "file" utility for Debian

joey — Thu, 15 Aug 2019 14:59:43 +0000

There are quite a few programs besides `file` that use libmagic. It seems likely that some of them are easier to exploit than `file` because they accept untrusted input from the network and pass it directly to libmagic.

Hardening libmagic will benefit all of them, while hardening `file` does not.

Hardening the "file" utility for Debian

Deleted user 129183 — Thu, 15 Aug 2019 14:41:26 +0000

> So, no, pledge isn't NIH.

I was talking about OpenBSD's reimplementation of `file`, not about `pledge`.

Hardening the "file" utility for Debian

epa — Thu, 15 Aug 2019 13:01:17 +0000

There may not be any safe languages but there are certainly dangerous ones.

While there are bugs in the Java or .NET runtimes, or other language runtimes, getting an exploit through one of those is usually much harder than the swarm of exploits a C program will contain unless written with exceptional discipline by a highly skilled programmer.

But actually I wasn't really suggesting one of these heavyweight managed languages that pulls along a runtime environment. Rust doesn't have a runtime, for example. The Cyclone programming language is a safer dialect of C which also doesn't have any special run time requirements.

Hardening the "file" utility for Debian

pbonzini — Thu, 15 Aug 2019 11:31:21 +0000

But it seems to me that the same issues would happen with pledge and OpenBSD has fixed them. The article itself hints that you could even use seccomp v1 if the architecture of file is changed to split restricted code into a separate process.

Hardening the "file" utility for Debian

rbranco — Thu, 15 Aug 2019 10:42:15 +0000

echo FROM scratch | podman build -t scratch -f - .

alias ldd='podman run --rm -v /:/:ro --net=none --env-file <(env) scratch ldd'
alias file='podman run --rm -v /:/:ro --net=none --env-file <(env) scratch file'

Hardening the "file" utility for Debian

cjwatson — Thu, 15 Aug 2019 09:58:02 +0000

Unsetting LD_PRELOAD doesn't even solve all the preloading problems: for better or worse, people are apparently using antivirus tools and VPNs that inject themselves using /etc/ld.so.preload. Convincing ld.so to enter "secure mode" would help, but as far as I know all the methods for doing that involve being privileged in some way.
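To make the two injection paths concrete, here is a rough sketch that reports both of them. The function name list_preload_sources is hypothetical, and treating each non-empty line of the file as one entry is a simplification of how ld.so parses it:

```shell
# Hypothetical helper: list the preload sources that would affect newly
# started processes. The optional argument overrides the path of the
# system-wide file, which is mainly useful for testing.
list_preload_sources() {
    preload_file=${1:-/etc/ld.so.preload}
    # Per-process source: the environment variable.
    if [ -n "${LD_PRELOAD:-}" ]; then
        printf 'env: %s\n' "$LD_PRELOAD"
    fi
    # System-wide source: not affected by unsetting LD_PRELOAD.
    if [ -s "$preload_file" ]; then
        # Drop comments and blank lines, then print one entry per line.
        sed -e 's/#.*//' -e '/^[[:space:]]*$/d' "$preload_file" |
            while read -r lib; do printf 'file: %s\n' "$lib"; done
    fi
}
```

Any "file:" line in the output is something an unprivileged wrapper cannot switch off.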

Hardening the "file" utility for Debian

Cyberax — Thu, 15 Aug 2019 02:29:56 +0000

That's why I put "namespaces" in scare quotes, because in practice it functions similarly to the unshare()-then-bind-mount trick that systemd and other software use on Linux.

Hardening the "file" utility for Debian

roc — Thu, 15 Aug 2019 01:52:48 +0000

unveil() lets you whitelist filesystem paths. I think it's confusing to call that "namespaces". chroot() is more namespace-like.

Hardening the "file" utility for Debian

Cyberax — Thu, 15 Aug 2019 00:34:50 +0000

> OpenBSD of course doesn't have kernel namespaces.
They do have unveil()-based file "namespaces".

Hardening the "file" utility for Debian

roc — Wed, 14 Aug 2019 23:46:19 +0000

You're mixing up the original seccomp with seccomp-bpf. The original seccomp was designed by Andrea Arcangeli, not Google, and only allowed read/write on open file descriptors. Later Google added seccomp-bpf to reduce exposed kernel attack surface from sandboxed Chrome processes, not just NaCl but also Web content processes.

It's very important to keep in mind that implementing a sandbox with just seccomp is usually a terrible idea. In Linux, kernel namespaces are the best way to construct a sandbox. Then you apply a seccomp filter as an extra layer of defense, to reduce the kernel API attack surface exposed to sandboxed code. This is what Chrome and Firefox do. OpenBSD of course doesn't have kernel namespaces. Comparing pledge() to seccomp-bpf for constructing sandboxes is really a mistake; you should compare pledge() to kernel namespaces (with or without an additional seccomp-bpf layer).

> I don't think Google ever intended seccomp to grow into a general purpose syscall filtering or sandboxing mechanism

Perhaps Andrea Arcangeli didn't, but Google certainly did, otherwise their decision to use BPF to express arbitrary predicates is unfathomable.

Personally I'm pretty glad they did. We use seccomp-bpf for selective syscall interception in rr in a way that a dedicated sandbox API like pledge() would never have supported. That feature is critical for low overhead in rr recording.

Dead-end or not, seccomp-bpf is working in practice for Firefox, Chrome, rr, and others.

Hardening the "file" utility for Debian

roc — Wed, 14 Aug 2019 23:14:19 +0000

In practice, if you write the parsing code in Rust or Go and avoid doing something exceptionally stupid like using Rust's "unsafe" keyword, your code will not be exploitable. For evidence, take a look at https://github.com/rust-fuzz/trophy-case and observe how few security bugs there are, and how they involved explicit use of "unsafe".

You can argue it still wouldn't be "safe" for some meaning of the word, but seccomp filters aren't "safe" in those terms either.

Having said that, extra layers of protection are still good and grappling with the issues in this post is still important. In particular, if `file` was written in Rust but users' systems inject C code into it via LD_PRELOAD, then savvy attackers would target that C code. Witness the security vulnerabilities introduced by AV filters over the years.

Hardening the "file" utility for Debian

roc — Wed, 14 Aug 2019 23:01:40 +0000

Hear hear! LD_PRELOAD is simply not a reliable tool for intercepting syscalls in a production system. Using LD_PRELOAD for syscall interception completely fails if the application does raw syscalls in its own code, and composition is terrible so trying to compose multiple LD_PRELOAD interceptors together invariably fails.

It would be nice to have a composable, fast, reliable user-space syscall interception mechanism. LD_PRELOAD isn't it.

Hardening the "file" utility for Debian

wahern — Wed, 14 Aug 2019 22:55:50 +0000

OpenBSD had a syscall filtering mechanism, systrace, years before seccomp. Unlike the ptrace-based version of systrace on Linux, BSD systrace was incorporated into the kernel. The problem was that nobody used it: it was too low-level. So after several years systrace was ripped out. pledge and unveil are what came about after people chewed on the problem for a few more years.

The original contributors of seccomp were quite familiar with systrace. seccomp only permits filtering scalars because it was through systrace that it was proven how easy it was to bypass string path filtering unless the kernel copied the string. (Later releases of systrace came with a huge warning that string path filtering wasn't secure.) As I recall, seccomp was originally intended for sandboxing Chrome NaCl, which by design only required read and write from the sandboxed process. seccomp was the minimal amount necessary to put into the kernel to make NaCl work. I don't think Google ever intended seccomp to grow into a general purpose syscall filtering or sandboxing mechanism as it was already obvious from the history of systrace that the low-level semantics don't lend themselves to more sophisticated use cases.

So, no, pledge isn't NIH. seccomp is basically a *worse* systrace, and everybody knew systrace was a dead end.

Another alternative is Capsicum. OpenBSD rejected this for the same reasons the file(1) maintainer hasn't yet refactored their codebase: Capsicum, like the original seccomp, is premised on using a multi-process privilege separation model, which requires a lot of work, *especially* for preexisting codebases. Capsicum is a great model, but it doesn't prove a viable solution for preexisting code.

Hardening the "file" utility for Debian

jamesmorris — Wed, 14 Aug 2019 22:47:53 +0000

seccomp is useful for reducing the attack surface of the kernel (i.e. restrict access to only required syscalls), but it's not intended as a least privilege mechanism or sandboxing on its own. seccomp operates at the wrong abstraction level, as evidenced by comments about having to specify 4 syscalls for one type of operation. The LSM API with higher level policies is a better fit for least priv, as you can utilize all security relevant information at an operation-focused granularity for policy enforcement.

Hardening the "file" utility for Debian

juliank — Wed, 14 Aug 2019 22:46:40 +0000

My idea was to return ENOSYS for blocked syscalls. This has the advantage that it does not break when libc migrates to a newer syscall, as libc silently falls back to the old one.

Grouping is a bit tricky maybe. It's likely that there are different bugs in different variants of the same syscall, so it might make sense to only allow the latest one.

Hardening the "file" utility for Debian

juliank — Wed, 14 Aug 2019 22:43:19 +0000

Wow that's terrible

Hardening the "file" utility for Debian

dezgeg — Wed, 14 Aug 2019 22:39:17 +0000

Seriously? That is... terrible. Is this actually documented somewhere? Who knows how many setups are (not necessarily intentionally) relying on lookups always going through nscd, not just for sandboxes that might lack the nss libraries but for example not having the 32-bit equivalents of the nss libraries installed. For one, the NixOS distribution relies on nscd for all other nss queries except for the ones included in glibc since there is no global /lib.

Having the NSS libraries loaded into the caller's address space just needs to die. Just this week I had to debug an issue with a (proprietary, but not relevant to discussion) software distributed as binaries with all the libraries bundled. And this broke since some NSS module from the system (with new glibc) needed to be loaded but was using some symbols that didn't exist in the bundled libc. Of course, installing nscd was the solution.

Hardening the "file" utility for Debian

mezcalero — Wed, 14 Aug 2019 21:44:46 +0000

Well, I think the lesson to learn here is that LD_PRELOAD is a crutch, a hacker tool and should not be used for clean codepaths. It's not just incompatible with seccomp style stuff, but totally incompatible with anything involving suid binaries or fcaps or such. I am not sure why anyone would bother with making seccomp work with LD_PRELOAD if not even suid works with it...

While I generally do agree that maintaining seccomp policies is nastier than people might think, I also think it's manageable if you are careful. Specifically, seccomp policies that trigger SIGSYS are a really bad idea, as are syscall blacklists. If you stick to whitelists and stick to returning EPERM for unlisted syscalls you should mostly be fine, as most code that runs in environments it doesn't know well (i.e. NSS module code, library code, or even LD_PRELOAD hacks) tends to be written carefully enough to handle EPERM in a graceful way. Moreover, new syscalls added to the kernel this way also return EPERM and most code using such new syscalls tends to have fallbacks in place anyway to support slightly older kernels, and these codepaths are triggered then. In addition, in a world of SELinux and AppArmor, apps are vaguely prepared to get EPERM/EACCES from various places already, thus getting them from some syscalls is fine too.

In systemd we started out with blacklisting and our logic defaults to triggering SIGSYS. Today we know we probably should not even have bothered with blacklisting at all, nor with SIGSYS because it's unmanageable, but we didn't know that when we first added support. It appears the folks who started this work in the 'file' tool made the same mistakes...

(Oh, and grouping syscalls is kinda important too: ideally libseccomp would even do that on its own. Policies shouldn't need to spell all 4 syscalls for sending a datagram individually nor the 7 syscalls for changing ownership of a file. In systemd we defined our own grouping to make this manageable, but this sounds like a concept to have in libseccomp itself. With such grouping you get the coarseness that pledge() provides.)
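For illustration, a whitelist built from systemd's syscall groups, with EPERM for everything else, might look roughly like this when applied to file via systemd-run. This is an untested sketch: the function name sandboxed_file and the SYSTEMD_RUN override variable are hypothetical, and the chosen groups are a guess at what file needs, not a vetted policy.

```shell
# Hypothetical wrapper: run file(1) as a transient user unit under an
# allow-list seccomp policy. Syscalls outside the listed groups fail with
# EPERM instead of killing the process with SIGSYS.
# SYSTEMD_RUN is a hypothetical override hook (e.g. set it to "echo" to
# inspect the generated command line without running anything).
sandboxed_file() {
    "${SYSTEMD_RUN:-systemd-run}" --user --pipe --wait --quiet \
        -p "SystemCallFilter=@default @basic-io @file-system" \
        -p SystemCallErrorNumber=EPERM \
        file "$@"
}

# Usage:
#   sandboxed_file /path/to/untrusted.bin
```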

Lennart

Hardening the "file" utility for Debian

clugstj — Wed, 14 Aug 2019 21:40:40 +0000

No, the problem is that one group of people want functionality (using LD_PRELOAD to do kewl things) and another group want to use "seccomp()" for security. These two "wants" don't play nice together.

Hardening the "file" utility for Debian

mezcalero — Wed, 14 Aug 2019 21:28:56 +0000

nscd is not what you appear to think it is: the nscd client in glibc has a very short timeout, after which it falls back to traditional, non-nscd, client-side NSS. It is thus not suitable as a sandboxing solution; it is only and exclusively a cache for speeding things up, following the theory that it makes no sense to block for long on a daemon whose whole purpose is to make lookups faster. The short timeout is also an effective way to make deadlocks due to local IPC less penalizing.

Hardening the "file" utility for Debian

dezgeg — Wed, 14 Aug 2019 20:52:29 +0000

The solution already exists in glibc: nscd

Hardening the "file" utility for Debian

zblaxell — Wed, 14 Aug 2019 20:41:50 +0000

Why not just unset LD_PRELOAD before calling file? Unless I missed
something, file doesn't need to participate in fakeroot stat mangling
(or NSS for that matter--file doesn't translate uids to names or
interact with hostnames or URLs). Detect LD_PRELOAD in file's
main function, unset it, and re-exec file so seccomp works properly.
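A rough sketch of the same idea done from a shell wrapper rather than inside file's main function (that substitution is an assumption of this example, as is the helper name run_without_preload):

```shell
# Hypothetical helper: run a command with LD_PRELOAD removed from its
# environment. The subshell keeps the caller's own environment untouched,
# and exec replaces the subshell with the target program.
run_without_preload() {
    (
        unset LD_PRELOAD
        exec "$@"
    )
}

# Usage:
#   run_without_preload file /path/to/untrusted.bin
```

This only clears the environment variable; libraries listed in /etc/ld.so.preload are still loaded.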

Allowing LD_PRELOAD to propagate from a low-privileged context to a
high-privileged one is obviously a bad idea--which is why it's disabled
for setuid programs. Why does allowing LD_PRELOAD to propagate from a
high-privileged context to a low-privileged one seem like a good idea?
Sounds like it's just asking for exactly the kind of problems listed
above. If you're going to sandbox a process, that should include
deleting most of its environment variables, so that you can predict what
environment it's going to run in. That conflicts with the fundamental
ideas behind fakeroot and NSS, but...well, maybe they weren't particularly
good ideas anyway.

fakeroot is a fun hack and all, but maybe there are better ways to solve
the underlying problem? e.g. run packaging tasks on a FUSE filesystem
that acts like the user is doing everything as root, and patch the two
or three packages that still do euid == 0 checks during build. That
approach is going to work for things that aren't running on top of the
C library, too.

NSS is a real annoyance: you try to map a UID to a name, and
suddenly your thread blocks for network IO, or crashes, or gets RCE
vulnerabilities, because someone configured NSS to do something dumb, and
unprivileged users don't get an easy way to turn it off. There should be
an easy way to turn NSS off per process--users can already LD_PRELOAD in
a sane implementation of getpwnam(), getpwuid(), and so on, so it does
not introduce any new vulnerabilities to have an environment variable
(ignored for setuid programs) that lets users override nsswitch.conf
more conveniently.

Hardening the "file" utility for Debian

juliank — Wed, 14 Aug 2019 20:28:14 +0000

Hmm, I don't know, I think that's an upstream libc/systemd question. I could imagine a systemd-nssd that provides nss stuff over dbus.

Hardening the "file" utility for Debian

juliank — Wed, 14 Aug 2019 20:26:56 +0000

I don't think there are safe languages. Runtime bugs add a lot of unsafety; having another layer of protection is important.

Hardening the "file" utility for Debian

epa — Wed, 14 Aug 2019 20:09:32 +0000

You could write the parsing code in a safe language. Then, if there isn't a call to exec() literally appearing in the source code, there's no way the code can be tricked into calling exec() by overwriting the stack due to a missing bounds check, integer overflow or whatever. There are safe dialects of C which are probably compatible enough for the parsing code to work.

Hardening the "file" utility for Debian

Deleted user 129183 — Wed, 14 Aug 2019 20:05:26 +0000

> At this point, wouldn't it be easier to rewrite "file" from scratch with security in mind instead of trying to use "seccomp()"?

Well, I guess that’s exactly the reason that OpenBSD has their own, seemingly NIH, implementation of it…