
Garrett: Linux Container Security

Matthew Garrett considers the security of Linux containers on his blog. The attack surface of containers is likely to always be larger than that of hypervisors, but that difference may not matter in practice; getting there, however, is going to take some work:

I suspect containers can be made sufficiently secure that the attack surface size doesn't matter. But who's going to do that work? As mentioned, modern container deployment tools make use of a number of kernel security features. But there's been something of a dearth of contributions from the companies who sell container-based services. Meaningful work here would include things like:
  • Strong auditing and aggressive fuzzing of containers under realistic configurations
  • Support for meaningful nesting of Linux Security Modules in namespaces
  • Introspection of container state and (more difficult) the host OS itself in order to identify compromises
These aren't easy jobs, but they're important, and I'm hoping that the lack of obvious development in areas like this is merely a symptom of the youth of the technology rather than a lack of meaningful desire to make things better. But until things improve, it's going to be far too easy to write containers off as a "convenient, cheap, secure: choose two" tradeoff. That's not a winning strategy.



Garrett: Linux Container Security

Posted Oct 23, 2014 21:47 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link] (34 responses)

Nah, containers are not secure against intruders trying to break out. Kernel attack surface is just too large.

And seccomp doesn't help a lot, because you actually _need_ most of the syscalls.

Garrett: Linux Container Security

Posted Oct 23, 2014 22:07 UTC (Thu) by luto (guest, #39314) [Link] (33 responses)

I disagree on the latter point.

The Sandstorm sandbox (namespaces + seccomp) blocks lots of syscalls [1], and I think that most recent public kernel bugs would have been blocked by it. And a lot of common, useful applications and daemons work unmodified inside it.

(Disclosure: I wrote the seccomp part. It's far from perfect, and the plan is to eventually convert it from a blacklist to a whitelist. Nonetheless, I think it's pretty good.)

[1] https://github.com/sandstorm-io/sandstorm/blob/master/src...
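
For readers unfamiliar with seccomp filters, here is a rough illustration of the blacklist approach described above. This is not Sandstorm's actual filter (see the link for that); it is a minimal libseccomp-based sketch, and the particular syscalls blocked here are illustrative choices only.

    /* Sketch of a seccomp syscall blacklist: allow everything by default,
     * then knock out syscalls a sandboxed web app should never need.
     * Build with: cc blacklist.c -lseccomp */
    #include <errno.h>
    #include <seccomp.h>
    #include <stdio.h>

    int main(void)
    {
        scmp_filter_ctx ctx = seccomp_init(SCMP_ACT_ALLOW);   /* default: allow */
        if (!ctx)
            return 1;

        int blocked[] = {
            SCMP_SYS(ptrace),     SCMP_SYS(quotactl),
            SCMP_SYS(kexec_load), SCMP_SYS(add_key),
            SCMP_SYS(keyctl),     SCMP_SYS(init_module),
        };
        for (unsigned i = 0; i < sizeof(blocked) / sizeof(blocked[0]); i++)
            if (seccomp_rule_add(ctx, SCMP_ACT_ERRNO(ENOSYS), blocked[i], 0) < 0)
                goto fail;

        if (seccomp_load(ctx) < 0)
            goto fail;
        seccomp_release(ctx);

        /* From here on, the listed syscalls fail with ENOSYS for this process
         * and its children; the sandboxed application would be exec'd next. */
        printf("filter loaded\n");
        return 0;
    fail:
        seccomp_release(ctx);
        return 1;
    }

Converting such a filter from a blacklist (default allow) to a whitelist, as mentioned above, means flipping the default action to an error or kill and enumerating only the calls the application actually needs.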

Garrett: Linux Container Security

Posted Oct 23, 2014 22:11 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link] (26 responses)

So you're blocking quotactl, ptrace and other interesting syscalls. They might very well be required for containerized software (especially ptrace).

And my software uses AF_KEY sockets explicitly to set up peer-to-peer IPSec links, for example.
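
For context, the AF_KEY (PF_KEY v2, RFC 2367) interface mentioned here is opened as an ordinary socket, which is exactly the kind of call a tight syscall or socket-family filter tends to break. A minimal sketch (not the poster's actual code; opening the socket requires CAP_NET_ADMIN):

    /* Minimal sketch: open a PF_KEY v2 socket for managing IPsec SAs. */
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/socket.h>
    #include <linux/pfkeyv2.h>

    int main(void)
    {
        int fd = socket(PF_KEY, SOCK_RAW, PF_KEY_V2);   /* needs CAP_NET_ADMIN */
        if (fd < 0) {
            /* A seccomp filter that restricts socket() families, or a
             * container without CAP_NET_ADMIN, lands here. */
            perror("socket(PF_KEY)");
            return 1;
        }
        /* ... SADB_ADD / SADB_X_SPDADD messages would be sent here ... */
        close(fd);
        return 0;
    }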

Garrett: Linux Container Security

Posted Oct 23, 2014 23:28 UTC (Thu) by luto (guest, #39314) [Link] (25 responses)

ptrace is mostly incompatible with seccomp. This is even documented. Someone might fix it some day if there's demand.

And IPSec from a container? Can you not set that up when you create the container?

The Sandstorm containers aren't meant to run admin tools, and allowing them would unavoidably increase the attack surface a great deal. Interesting = interesting to the attackers.

Garrett: Linux Container Security

Posted Oct 23, 2014 23:34 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link] (24 responses)

We're using IPSec to create p2p links, specific to each application.

My point is that if you want to run arbitrary software in containers, then it's not even remotely secure. Hypervisors are much better in this respect - their attack surface is greatly diminished.

Garrett: Linux Container Security

Posted Oct 24, 2014 0:10 UTC (Fri) by PaXTeam (guest, #24616) [Link] (14 responses)

containers vs hypervisors is apples vs oranges. try networked boxes vs hypervisors.

Garrett: Linux Container Security

Posted Oct 25, 2014 11:41 UTC (Sat) by deepfire (guest, #26138) [Link] (13 responses)

I'm sorry, could you expand on this?

If we are to take security as the only and absolute scale of comparison, which is what Cyberax seems to be doing -- containers do indeed seem much less secure than VMMs.

Why don't you think we can and should compare them this way?

After all, for some people security is a major consideration.

Garrett: Linux Container Security

Posted Oct 25, 2014 12:37 UTC (Sat) by PaXTeam (guest, #24616) [Link] (12 responses)

you can compare anything to anything but that doesn't mean that every instance is meaningful. a single kernel solution to compartmentalize userland applications is very different from simulating a network of entire physical boxes (each running its own single kernel). the same way we don't compare bicycles to airplanes on grounds that both have wheels and can get you from point A to B for some sets of A and B. so for running arbitrary software in secure ways the proper comparison of hypervisor based solutions is to a network of physical boxes because that's what they were meant to replace to begin with. whether extra security comes with such a solution is very much in debate (personally i don't believe that millions of gates and many thousands if not millions of lines of code will ever be as secure as copper wire). now you can try to drag containers into the picture but that doesn't change the fundamental fact that the single kernel is the security boundary, not the containers. i know proponents like to think and advertise otherwise but that's wishful thinking, i think anyone following the security issues of kernels has no choice but to agree to that. of course if you don't care about the 'arbitrary' part above, then you can play games by giving up on general properties of single kernel solutions (sandboxes, access control policies, etc) and try to push the security boundary down to containers but then we can't (in a meaningful way) compare such cases to general solutions based on hypervisors or networks.

PS: this just in: http://www.openwall.com/lists/oss-security/2014/10/24/9

Garrett: Linux Container Security

Posted Oct 25, 2014 14:52 UTC (Sat) by Cyberax (✭ supporter ✭, #52523) [Link] (11 responses)

Yup. And I totally don't like it when containers are billed as a way to securely run arbitrary applications.

A way to isolate well-behaving applications? Yes. A way to provide a consistent environment for packaged applications? Yes. But not more.

Hypervisors, on the other hand, can be reasonably secure for running arbitrary stuff.

Garrett: Linux Container Security

Posted Oct 25, 2014 19:22 UTC (Sat) by dlang (guest, #313) [Link] (10 responses)

Well, containers are more secure than just running the applications in a single system image without containers, and they have far less overhead than running the same applications split across VMs.

Now, you can argue that they aren't "secure enough", but the same argument can be made about running in VMs compared to running on dedicated hardware.

If the security between applications is really critical, then you should be running them on different physical hardware on different physical switches, all of which are completely under your control.

But very few people are running applications that are that critical. Even for applications that I would consider that critical, others consider the costs of such isolation to be higher than the benefits.

So the question boils down to how do you define "secure enough" for any particular purpose?

Garrett: Linux Container Security

Posted Oct 28, 2014 14:45 UTC (Tue) by pbonzini (subscriber, #60935) [Link] (4 responses)

The difference is that hypervisors do reduce the ring0 attack surface to that of the hypervisor. Even with KVM, which uses Linux as the kernel, I do not remember any case in which guests could exploit vulnerabilities _outside_ KVM without first breaking out of KVM and/or QEMU.

Garrett: Linux Container Security

Posted Oct 28, 2014 15:50 UTC (Tue) by drag (guest, #31333) [Link] (1 responses)

Well, that is sort of like saying that you've never known any Java vulnerability to affect an OS without it first breaking out of the Java virtual machine.

Really, a Java virtual machine and an x86_64 virtual machine have a huge number of things in common, on multiple levels.

As the capabilities and performance of virtual machines increase, so does their attack surface. I feel that containers will eventually improve their isolation, while people improve the flexibility and performance of hypervisors, to the point where the security differences are pretty minimal.

Garrett: Linux Container Security

Posted Oct 28, 2014 16:06 UTC (Tue) by pbonzini (subscriber, #60935) [Link]

No, not exactly. You could relatively easily exploit a network stack vulnerability from within Java (via sockets), for example. It's much harder to exploit a network stack vulnerability in the host from within a guest.

The portion of the host kernel that is accessible to a Java program is much larger than the portion that is accessible to a KVM guest, even if you include the userspace parts (QEMU) in the equation. This is because the Java program is exposed to a high-level POSIX-y interface, while a guest roughly speaking only sees Ethernet and SCSI.

Of course you're right that new capabilities increase the attack surface. But for hypervisors most of those capabilities are in userspace, and are pretty well confined. For containers the kernel is your only protection.

Garrett: Linux Container Security

Posted Oct 28, 2014 21:21 UTC (Tue) by dlang (guest, #313) [Link] (1 responses)

I don't disagree with the reduced attack surface, which is why I readily say that hypervisors are probably always going to be more secure than containers.

The question is whether this extra security is worth the performance drawbacks of hypervisors: the processing overhead of the second kernel, and the decreased efficiency of resource allocation and management because there is no single place from which the entire system can be analyzed.

Security is not the prime goal of any computer system. Functionality is the prime goal, and security requirements are part of the overall system requirements. For some people containers are going to be a better trade-off for them than hypervisors.

Garrett: Linux Container Security

Posted Oct 28, 2014 23:48 UTC (Tue) by pbonzini (subscriber, #60935) [Link]

I totally agree that you should define what the threat is, before touting "security" as an advantage of virtualization (or equivalently as an advantage of IaaS vs. PaaS). In general, there are good reasons why you can handwave hypervisor security away much more easily than container security, but hypervisor vulnerabilities do happen all the time.

That said, performance is not the prime goal of any computer system either. Performance requirements have to be evaluated together with system requirements, and the performance drawbacks aren't huge for virtualization of most workloads. Virtualization will probably be less efficient but the performance might really not differ from containers that much---i.e. fewer tasks will fit in a host, but you should be able to run most of them as fast as in a container.

So, security could swing the balance towards virtualization, and performance could do the same in favor of containers, but ease of use and management could turn out to be the most important factor.

Garrett: Linux Container Security

Posted Nov 1, 2014 18:35 UTC (Sat) by ThinkRob (guest, #64513) [Link] (4 responses)

> Well, containers are more secure than just running the applications in a single system image without containers, and they have far less overhead than running the same applications split across VMs

If there's one thing I've learned about computer security, it's that there's no such thing as "more secure" when you're talking about trying to restrict an attacker who has the ability to control some or all aspects of execution.

The best example of how the idea of "more secure" can come back to bite you is the Xbox security system. It had lots of obstacles that it threw in the path of attackers, with the idea that the more of them there were, the "more secure" it would be. Only it wasn't. It slowed down the attackers some, maybe, but it didn't actually stop them. And in the end it wasn't any more secure than something with no security at all.

Garrett: Linux Container Security

Posted Nov 1, 2014 19:03 UTC (Sat) by deepfire (guest, #26138) [Link] (3 responses)

Absolutely. I so wish more people understood this.

Security is only available to the extent of the completeness of our understanding of the attack surface.

Ergo, it is ridiculous to claim that containers (as generally used) significantly improve it.

Garrett: Linux Container Security

Posted Nov 2, 2014 2:41 UTC (Sun) by raven667 (subscriber, #5198) [Link] (2 responses)

I think the confusion is that there are different schools of thought about this. Security is not usefully described as a binary value, because anything complex enough to be useful can't possibly be perfectly secure. Since absolute security is non-existent in practice, we have to instead talk about degrees of risk and approaches that make compromise more difficult, without falling into the trap of believing that any particular technique "fixes" security. Processes, permissions, containers, VMs, dedicated machines, and airgaps all fall on a continuum of effort and risk, and you have to pick what is appropriate for your workload. So containers with processes and permissions have more isolation than just processes and permissions on a system, but not as much as a hypervisor and VMs.

Garrett: Linux Container Security

Posted Nov 2, 2014 11:26 UTC (Sun) by deepfire (guest, #26138) [Link] (1 responses)

> So containers with processes and permissions have more isolation
> than just processes and permissions on a system

That's the point where we disagree. In principle you are correct, but I believe that quantitatively it doesn't amount to much.

Let me explain.

First, I agree that containers+seccomp are a useful tool to reduce the kernel attack surface. But that's not how they are used. And I don't really believe that's how they are going to be used, to any significant degree. That could be spun off into a separate discussion, if we so desire.

Now we are reduced to plain containers, with a worse-than-regular kernel attack surface -- since the in-kernel containerization implementation _increases_ the complexity even further.

What do they protect us from now? Userspace-to-root escalation? How useful is that? Useful, for sure.

Garrett: Linux Container Security

Posted Nov 2, 2014 12:36 UTC (Sun) by dlang (guest, #313) [Link]

but if you view containers not as something to affect the kernel attack surface, but instead to isolate applications and ease management, then there is a clear difference between having them and just running the apps.

you may want to take the approach that anyone who gets a local account of any type now has full root privileges and the ability to do anything to any application, container or VM, but that's an extreme position.

Yes containers can be used in a way that provides less benefit than if they were used in other ways, but that doesn't make them worthless.

Garrett: Linux Container Security

Posted Oct 24, 2014 6:47 UTC (Fri) by kentonv (subscriber, #92073) [Link] (8 responses)

The trick with Sandstorm is that it doesn't want to run arbitrary software. It just wants to run web servers. It turns out most web servers don't need most system calls, /proc, or any devices except null, zero, and urandom.

Point is, it is possible for containers to be (reasonably) secure. It just depends on your use case. If you want a machine you can shell into and be root, then, yeah, containers aren't secure.
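
As an illustration of how small that device surface is, a container setup only has to create three character devices. This is a minimal sketch under assumptions of my own (the "rootfs" path is hypothetical, not Sandstorm's layout; creating device nodes requires CAP_MKNOD):

    /* Create the only device nodes a typical sandboxed web server needs. */
    #include <stdio.h>
    #include <sys/stat.h>
    #include <sys/sysmacros.h>

    static int make_dev(const char *path, unsigned maj, unsigned min)
    {
        /* character device, world readable/writable */
        if (mknod(path, S_IFCHR | 0666, makedev(maj, min)) < 0) {
            perror(path);
            return -1;
        }
        return 0;
    }

    int main(void)
    {
        make_dev("rootfs/dev/null",    1, 3);
        make_dev("rootfs/dev/zero",    1, 5);
        make_dev("rootfs/dev/urandom", 1, 9);
        return 0;
    }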

Garrett: Linux Container Security

Posted Oct 24, 2014 8:48 UTC (Fri) by Arach (guest, #58847) [Link] (7 responses)

Where "reasonably" means "we participate in the cargo cult of "securing" Linux by doing something instead of nothing, so we could sell you some false sense of security, conveniently and nicely". One of the recent local privesc vulns was in the futex subsystem, how about that?

It's only reasonable when developers give security a decent priority, which is not the case with Linux, where they consistently ignore the facts about what works and what doesn't.

Garrett: Linux Container Security

Posted Oct 24, 2014 9:09 UTC (Fri) by kentonv (subscriber, #92073) [Link] (6 responses)

Of course there will occasionally be vulnerabilities like the futex one which seccomp can't do anything about. But hypervisors sometimes have vulnerabilities too. Security is not a true or false thing; it's a game of risk management. The vast majority of recent kernel vulnerabilities have not been exploitable in Sandstorm.

Garrett: Linux Container Security

Posted Oct 24, 2014 21:45 UTC (Fri) by Arach (guest, #58847) [Link] (4 responses)

> But hypervisors sometimes have vulnerabilities too.

Strawman.

> Security is not a true or false thing

Strawman.

> it's a game of risk management

Who does this management? Where are any results of any security risk assessment?

> The vast majority of recent kernel vulnerabilities have not been exploitable in Sandstorm.

Which vulnerabilities? https://sandstorm.io - where are the details?

Garrett: Linux Container Security

Posted Oct 24, 2014 22:33 UTC (Fri) by luto (guest, #39314) [Link] (3 responses)

The list of blocked vulnerabilities includes:

  • Basically anything in a driver (almost no device nodes are allowed in).
  • Anything in ptrace (e.g. CVE-2014-4699).
  • All of the vulnerabilities that require creating a user namespace (e.g. CVE-2014-7970 and CVE-2014-7975)
  • Anything that requires setuid binaries (for example, the entire family of write(2)-checking-current_cred bugs).

This is, admittedly, biased. It's hard to get a real list because no one seems to maintain a database of known Linux kernel vulnerabilities.
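
For the user-namespace class in the list above, the blocking rule can be expressed with seccomp argument filters. This is a minimal libseccomp sketch of the idea, not Sandstorm's actual rules; the flags-as-first-argument convention for clone() assumed here is the x86-64 one:

    /* Deny creation of user namespaces while leaving ordinary clone()/fork()
     * and thread creation alone. */
    #define _GNU_SOURCE
    #include <errno.h>
    #include <sched.h>
    #include <seccomp.h>

    static int deny_user_namespaces(scmp_filter_ctx ctx)
    {
        int rc = seccomp_rule_add(ctx, SCMP_ACT_ERRNO(EPERM), SCMP_SYS(unshare), 1,
                                  SCMP_A0(SCMP_CMP_MASKED_EQ,
                                          CLONE_NEWUSER, CLONE_NEWUSER));
        if (rc < 0)
            return rc;
        return seccomp_rule_add(ctx, SCMP_ACT_ERRNO(EPERM), SCMP_SYS(clone), 1,
                                SCMP_A0(SCMP_CMP_MASKED_EQ,
                                        CLONE_NEWUSER, CLONE_NEWUSER));
    }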

Garrett: Linux Container Security

Posted Oct 24, 2014 22:44 UTC (Fri) by kentonv (subscriber, #92073) [Link]

> All of the vulnerabilities that require creating a user namespace (e.g. CVE-2014-7970 and CVE-2014-7975)

And CVE-2014-5206 and CVE-2014-5207. :)

Garrett: Linux Container Security

Posted Oct 25, 2014 0:36 UTC (Sat) by PaXTeam (guest, #24616) [Link] (1 responses)

> [...] because no one seems to maintain a database of known Linux kernel vulnerabilities.

for some definition of 'known' perhaps this can be of help: https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=linux

Garrett: Linux Container Security

Posted Oct 25, 2014 16:36 UTC (Sat) by luto (guest, #39314) [Link]

Fair enough.

In a Sandstorm sandbox backed by ext4, it looks like these vulnerabilities would be accessible:

  • CVE-2014-0131 (maybe; this might require vhost-net)
  • CVE-2014-1438 (only in unusual circumstances)
  • CVE-2014-2851 (configuration-dependent; usually inaccessible; unlikely to allow privilege escalation)
  • CVE-2014-3122 (not entirely sure; DoS only I think)
  • CVE-2014-3917 (DoS / minor information disclosure only; requires an unlikely configuration on the host to be accessible at all)

At this point, I got bored. Note that, of the particularly nasty ones this year, CVE-2014-0196 (the tty bug) would not have been exploitable inside a Sandstorm sandbox.

I ignored all the KVM emulator issues. Those might be exploitable regardless of anything Sandstorm (or any other guest OS or sandbox) could plausibly do to mitigate them, but they only work inside KVM.

Garrett: Linux Container Security

Posted Oct 26, 2014 20:01 UTC (Sun) by thestinger (guest, #91827) [Link]

> Of course there will occasionally be vulnerabilities like the futex one which seccomp can't do anything about.

The futex vulnerability can be walled off by a good system call filter because it requires flags that aren't usually necessary. A whitelist operating solely at the system call level is too coarse. For example, it will end up permitting *all* `ioctl` operations just because ioctl was once used to set FIOCLEX on a file descriptor (nginx does this).

The futex vulnerability was a Chromium sandbox bypass because they build the rules by hand and miss most opportunities to use parameter filters. It would have been prevented if the rules were created by profiling the necessary calls and flag parameters.

It's not a panacea, but seccomp can cut down the kernel attack surface quite a bit when it's tightly integrated into an application. It's much less useful externally because it has to permit everything needed by the dynamic linker and application start-up. It also can't take advantage of security boundaries within the application.
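
As a concrete example of the parameter filtering described above, the requeue-PI futex commands involved in this year's futex bug (CVE-2014-3153) can be rejected while leaving ordinary FUTEX_WAIT/FUTEX_WAKE untouched. This is a minimal libseccomp sketch, not the Chromium or Sandstorm rules; the 0x7f mask (which strips FUTEX_PRIVATE_FLAG and FUTEX_CLOCK_REALTIME) and the EPERM action are choices of this illustration:

    /* Reject the requeue-PI futex commands in both their private and
     * shared variants; the futex op is the second syscall argument. */
    #include <errno.h>
    #include <linux/futex.h>
    #include <seccomp.h>

    static int deny_requeue_pi(scmp_filter_ctx ctx)
    {
        int rc = seccomp_rule_add(ctx, SCMP_ACT_ERRNO(EPERM), SCMP_SYS(futex), 1,
                                  SCMP_A1(SCMP_CMP_MASKED_EQ, 0x7f,
                                          FUTEX_CMP_REQUEUE_PI));
        if (rc < 0)
            return rc;
        return seccomp_rule_add(ctx, SCMP_ACT_ERRNO(EPERM), SCMP_SYS(futex), 1,
                                SCMP_A1(SCMP_CMP_MASKED_EQ, 0x7f,
                                        FUTEX_WAIT_REQUEUE_PI));
    }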

Garrett: Linux Container Security

Posted Oct 24, 2014 9:08 UTC (Fri) by Lionel_Debroux (subscriber, #30014) [Link] (5 responses)

seccomp requires voluntary effort by programmers. The computer industry being what it is (like mostly everything else, it's usually driven by lower cost and shorter development times), security remains an afterthought, and a crushing majority of real-world programs don't use seccomp...

Another way to block most recent public kernel exploits is PaX's good old set of features, including but not limited to UDEREF. UDEREF-enabled kernels do not blissfully dereference arbitrary operations structures from user-space, and execute arbitrary user-space functions as a result.
But sadly, mainline and most distros shun those features (which reduce performance, but the fact that speed and security are usually contradictory isn't new) and the distribution of PaX/grsecurity...

Garrett: Linux Container Security

Posted Oct 24, 2014 14:58 UTC (Fri) by luto (guest, #39314) [Link] (2 responses)

IIUC UDEREF is extremely expensive on 64-bit machines. On Broadwell, though, kernel.org kernels do more or less the same thing at little cost.

I expect that this will make exploits harder to write, but not impossible, and we'll see a variety of new techniques to exploit kernel bugs. At least NULL pointer dereferences will finally stop being exploitable.

Hmm. Has anyone ever tried using PCID to do something like UDEREF at lower cost?

Garrett: Linux Container Security

Posted Oct 24, 2014 15:24 UTC (Fri) by Lionel_Debroux (subscriber, #30014) [Link]

For now, Broadwell only represents a tiny fraction of computers, and x86(_64) chips no longer represent a majority in sales figures AFAIK, so I don't think Broadwell's belated capabilities are very relevant for security.
With Intel spewing brand-new (well... recycled) 32-bit x86 processors, fast segmentation-based protection will remain useful for the foreseeable future.

Even on a processor with SMAP support, techniques to exploit kernel bugs won't necessarily be new... among other useful changes, PaX/grsecurity contains protections against a variety of techniques which work on mainline :)

AFAICS, modern versions of PaX do use PCID and INVPCID, even outside #if defined(CONFIG_X86_64) && defined(CONFIG_PAX_MEMORY_UDEREF) / #endif blocks.

Garrett: Linux Container Security

Posted Oct 24, 2014 15:29 UTC (Fri) by PaXTeam (guest, #24616) [Link]

> IIUC UDEREF is extremely expensive on 64-bit machines.

define extremely ;). a recent real life use case (smtp+clamav at 100Mbps) saw an impact of <16% on opteron but their kernel also included two other performance killer features, SANITIZE and STACKLEAK.

> On Broadwell, though, kernel.org kernels do more or less the same thing at little cost.

SMAP is strictly less than UDEREF but it can be massaged into something more useful with some extra work.

> I expect that this will make exploits harder to write, but not impossible, and we'll see a variety of new techniques to exploit kernel bugs.

this has been the case for a decade now on PaX and the only ways to exploit such kernel bugs are ret2libc (which is dead too already) and data-only attacks (which are a tough problem to solve for the kernel).

> At least NULL pointer dereferences will finally stop being exploitable.

UDEREF is about much more than just mere NULL derefs (even vanilla closed that route down for non-privileged processes a few years ago).

> Hmm. Has anyone ever tried using PCID to do something like UDEREF at lower cost?

how about PaX/UDEREF? for over a year now ;).

Garrett: Linux Container Security

Posted Oct 25, 2014 1:01 UTC (Sat) by wahern (subscriber, #37304) [Link] (1 responses)

"seccomp requires voluntary effort by programmers."

This is true, but it's still a better state of affairs than things like SELinux, which rely on a voluntary effort by people who don't even develop the software, and in many cases don't even develop software at all. And they can only really beat around the bush, because not only are object- and privilege-based access controls like that far removed from what the actual application software is doing, they're also far removed from what the kernel is actually doing. Exploits thrive by mining those gaps for bugs.

Furthermore, the kernel/userspace boundary is not the only privilege boundary that matters. It just happens to be a very fruitful one, and one that Linux in particular has not historically been very good at, given the insane amount of drivers, features, and optimizations (basically, many lines of really complex and often impenetrable code) in the kernel.

Seccomp isn't a substitute for approaches like UDEREF. But neither does UDEREF substitute for approaches like seccomp.

Garrett: Linux Container Security

Posted Oct 25, 2014 7:55 UTC (Sat) by Lionel_Debroux (subscriber, #30014) [Link]

Agreed, UDEREF and seccomp (and why not SELinux, despite all its warts) are complementary for defense in depth.

Garrett: Linux Container Security

Posted Oct 24, 2014 8:41 UTC (Fri) by ortalo (guest, #4654) [Link] (5 responses)

I am starting to hate this "fuzzing" buzzword now. It gets promoted as a dedicated security mechanism while it simply looks and tastes like good old black-box testing (which, by the way, is not so good at all, whether aggressive or not, and has pretty strong limitations).
At least Matthew does not promote it in isolation; but the auditing is certainly the most important part for security improvement. The rest is just testing, really - and one of those buzzwords that contribute to the security circus sucking up useful resources.

While re-reading my comment, I note too that it's really funny that (heavy-duty) black-box testing gets asked for simultaneously with auditing.
If auditing is possible, it means we have more information on the internals, most probably the source code; in that case, why not do white-box testing? It's much more efficient and better researched (statistical testing, boundary-condition testing, etc.). In my opinion, that fuzzing thing really lacks any good thinking and cannot compensate with recent tools and increased computing power.

Garrett: Linux Container Security

Posted Oct 24, 2014 13:03 UTC (Fri) by welinder (guest, #4699) [Link] (3 responses)

Fuzzing is a *very* useful tool for finding problems in the kernel or a user-space application for at least two reasons:

1. It's a 100-monkey test team. You simply won't get that many real, live testers. Mind you, the 100 monkeys aren't particularly smart, but it can become a 10,000-monkey test team if it has to.

2. It will explore test areas that humans won't think of, including areas you tested for the previous version prior to the changes that Cannot Possibly cause trouble here.

Obviously it is intellectually inferior to proving the kernel correct, but so is any kind of visual-inspection auditing, no matter how good the auditor. A proof, however, won't happen anytime soon: (1) the kernel has bugs, and (2) the kernel is too big for current methods.

If you just meant that using fuzzing as the sole testing tool is insufficient, I agree. That would be reckless.
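
For readers who have never seen one, the "monkey test team" amounts to something like the deliberately naive sketch below: throw random system call numbers and arguments at the kernel and watch for crashes. Real kernel fuzzers such as trinity are far smarter about argument types and coverage; run anything like this only in a disposable VM, since random syscalls can destroy the calling user's data.

    /* Toy syscall fuzzer: random syscall numbers, random arguments. */
    #define _GNU_SOURCE
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #include <unistd.h>

    int main(void)
    {
        srand((unsigned)time(NULL));
        for (long i = 0; i < 1000000; i++) {
            long nr = rand() % 450;          /* arbitrary syscall-number range */
            syscall(nr, rand(), rand(), rand(), rand(), rand(), rand());
            if (i % 100000 == 0)
                fprintf(stderr, "iteration %ld, last syscall %ld\n", i, nr);
        }
        return 0;
    }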

Garrett: Linux Container Security

Posted Oct 27, 2014 18:33 UTC (Mon) by Wol (subscriber, #4433) [Link] (2 responses)

> Fuzzing is a *very* useful tool for finding problems in the kernel or a user-space application for at least two reasons:

3. We have a blind spot for logic errors in our own code. Fuzzing will gaily stomp all over that blind spot.

> Obviously it is intellectually inferior to proving the kernel correct, but so is any kind of visual-inspection auditing, no matter how good the auditor. A proof, however, won't happen anytime soon: (1) the kernel has bugs, and (2) the kernel is too big for current methods.

3. The kernel interacts HEAVILY with the real world. There's no point proving the kernel is internally mathematically correct, if the external world feeds it "impossible" input. (Which is why Linus sometimes chooses to make the kernel panic rather than handle an error - if the chances of an error in the error-handling are greater than the chances of the error itself, then the safest option is to panic (or something like that).)

Cheers,
Wol

Garrett: Linux Container Security

Posted Oct 29, 2014 4:58 UTC (Wed) by thoughtpolice (subscriber, #87455) [Link] (1 responses)

> if the external world feeds it "impossible" input.

I don't think you know much about software verification. "Impossible input" is often exactly what leads to security vulnerabilities or buggy behavior, and this can be formalized and fixed by having a definitive program semantics which you have proven to hold (possibly over some universally quantified input, like input from a user). We've done this for other things too: a compiler may crash, loop forever, or improperly parse/reject certain code, which is a kind of "impossible input" leading to a bug. But we have compilers which have been shown to correctly parse all valid programs according to a formal grammar, to always halt, and to never crash. This compiler is called CompCert.

Memory safety proofs exist the same way. seL4 for example has broad proofs including things like "tasks may not interfere with the address space of any other task", including the kernel (meaning any programs you write attempting to do this won't work, and furthermore, any later kernel extensions which break this property will break the proofs). This general proof can be strengthened to apply only to certain tasks, or only tasks which have not agreed to share memory, and so on.

Systems like seL4 are designed for much different environments anyway, but real kernels hosting arbitrary applications with correctness proofs is happening *today*, not tomorrow or next year.

Naturally this sort of thing is never going to happen to Linux, however. In the meantime, while the kernel developers continuously add millions of lines of code and CVEs, grsecurity/PaX are (bar none) the best mitigation tools there are, and mitigation technology like them will continue to have a place, given the fairly pitiful attitude towards security in the software industry at large, it seems.

Garrett: Linux Container Security

Posted Oct 29, 2014 6:00 UTC (Wed) by cebewee (guest, #94775) [Link]

That's fine for the interactions of the kernel with the software side of the world -- here the kernel always has the (worst-case) option of killing the process, dropping the connection, or similar, if these start to behave in a broken way.

However, when the hardware starts to behave in impossible ways, there might not be a sane way to react, and stopping the system becomes an acceptable reaction.


Garrett: Linux Container Security

Posted Oct 24, 2014 13:36 UTC (Fri) by tterribe (guest, #66972) [Link]

No one test methodology finds all bugs. The best approach is to use lots of different methodologies in combination, where the strengths of one will cover for the weaknesses of another. Generally, each time you add a new methodology it will uncover a new set of bugs, until it hits diminishing returns and things stabilize. Given the level of effort required to audit something, I'd much rather have it fuzzed first, to eliminate many of the trivial bugs and make it easier to reason about the system.

See Greg Maxwell's presentation on testing the Opus codec for a lot more detail on the things you can do: https://www.ietf.org/proceedings/82/slides/codec-4.pdf

"It's going to take some work"

Posted Oct 24, 2014 19:33 UTC (Fri) by david.a.wheeler (subscriber, #72896) [Link] (7 responses)

Containers are definitely useful.

But "It's going to take some work" is an important part of today's story. Today, if you want strong isolation between components, using separate machines is by far the best bet, followed by hypervisors, followed by containers. From a security point-of-view, the more sharing you do, the more risk you take on (see "least common mechanism" from Saltzer and Schroeder). In particular, containers have a bigger attack surface, and there hasn't been as much work to examine them.

But security is a continuum. Separate machines are more expensive than using hypervisors, and containers should be cheaper still.

I hope that the work will happen to make containers more secure. We shall see.

"It's going to take some work"

Posted Oct 24, 2014 20:31 UTC (Fri) by dlang (guest, #313) [Link] (6 responses)

> Today, if you want strong isolation between components, using separate machines is by far the best bet, followed by hypervisors, followed by containers

This will probably always be the case, the only question is how large are the gaps between them.

If you want performance, the picture changes

separate machines have the lowest overhead (if your app can use the entire machine)

containers are next (or ahead of separate machines if your app isn't going to keep the machine busy all the time and other apps can benefit)

Hypervisors (and the need to run separate kernels) are always going to lag behind, simply because they require separate kernels, and the resource sharing between the different kernels is never going to be as good as with containers, where one kernel can see and adjust all resource usage.

So which is better for you is always going to depend on the balance you need between increased isolation and decreased coordination.

"It's going to take some work"

Posted Oct 25, 2014 11:28 UTC (Sat) by ibukanov (subscriber, #3942) [Link] (5 responses)

This is a good point: containers allow you to get better overall security for the same price. When accounting for the unknown risk of hypervisor bugs, I may not be able to afford to rent several hypervisor instances on a shared host to split a web app into isolated components. However, I could afford a low-end physical server and run my app in containers.

"It's going to take some work"

Posted Oct 25, 2014 11:49 UTC (Sat) by deepfire (guest, #26138) [Link] (4 responses)

"containers allow to get better overall security for the same price"

Erm, sorry?

Can you provide the model which you seem to use to quantify security?

Are you sure this model is widely applicable?

"It's going to take some work"

Posted Oct 25, 2014 20:09 UTC (Sat) by ibukanov (subscriber, #3942) [Link] (3 responses)

Splitting the application into N containers running on a dedicated box, versus having, say, N/3 virtual boxes on a hypervisor that also runs other, unknown VPSes, eliminates the risk associated with hypervisor bugs. Depending on how one accounts for such risk, it can be cheaper overall to go with containers.

"It's going to take some work"

Posted Oct 25, 2014 20:32 UTC (Sat) by dlang (guest, #313) [Link] (2 responses)

what is the advantage of having three copies of your app running on one box rather than just one copy of your app?

Unless you have scaling problems in your app where you bottleneck, running one copy should be better.

"It's going to take some work"

Posted Oct 25, 2014 20:52 UTC (Sat) by ibukanov (subscriber, #3942) [Link] (1 responses)

I can trivially put, say, the webserver, the app engine, and the database into separate containers with pretty much zero performance impact, while improving the isolation significantly. I can even split the app engine into several containers, improving security further. With VMs the performance impact is not zero, the setup requires more effort to develop and maintain, and some configuration options that are possible with containers are just not available with VMs.
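
To make "putting a component into a container" concrete, the kernel-level operation is little more than creating a set of namespaces for the process. This is a minimal sketch of that step only; it omits the separate root filesystem, cgroups, capability drops and seccomp filter that real container tools add, and it needs root (or a user namespace) to create the namespaces:

    /* Run a command in fresh PID, mount, UTS, IPC and network namespaces. */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <signal.h>
    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    static int child(void *arg)
    {
        char **argv = arg;
        execvp(argv[0], argv);      /* e.g. the database or app-engine binary */
        perror("execvp");
        return 127;
    }

    int main(int argc, char **argv)
    {
        static char stack[1024 * 1024];
        if (argc < 2) {
            fprintf(stderr, "usage: %s command [args...]\n", argv[0]);
            return 1;
        }
        int flags = CLONE_NEWPID | CLONE_NEWNS | CLONE_NEWUTS |
                    CLONE_NEWIPC | CLONE_NEWNET | SIGCHLD;
        pid_t pid = clone(child, stack + sizeof(stack), flags, &argv[1]);
        if (pid < 0) {
            perror("clone");
            return 1;
        }
        waitpid(pid, NULL, 0);
        return 0;
    }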

"It's going to take some work"

Posted Oct 25, 2014 20:58 UTC (Sat) by dlang (guest, #313) [Link]

Ok, I would not have considered that to be the 'same app' :-) I view that as three apps that are part of the same service.

I only consider it the 'same app' if it's the same code with the same configuration (modulo communications settings).

I agree that your config produces security gains at virtually no cost compared to running them all in the same system, and that running them in VMs adds significant overhead and reduces your ability to share resources between them effectively.

Garrett: Linux Container Security

Posted Oct 25, 2014 1:13 UTC (Sat) by wahern (subscriber, #37304) [Link] (11 responses)

"I suspect containers can be made sufficiently secure that the attack surface size doesn't matter."

I think this is an example of what some call the "Accounting Fallacy": http://fare.tunes.org/liberty/economic_reasoning.html#SEC...

Basically, the author envisions a future in which everybody has aggressively shaken the bugs out of containers, making containers more secure than hypervisors. He arrives at the conclusion that "surface size doesn't matter" because in this future containers are both more secure and have a greater code surface.

The fallacy is that if you also spent effort shaking the bugs out of hypervisors, the future could easily be one where hypervisors are still more secure relative to containers. Thus containers wouldn't be the counter-example to the argument that code surface matters. All that future would prove is that auditing matters. Big shocker.

In fact, precisely because hypervisors have less code surface one would think they would be easier to audit, fuzz, and indeed refactor to be even smaller.

Garrett: Linux Container Security

Posted Oct 25, 2014 1:23 UTC (Sat) by wahern (subscriber, #37304) [Link]

Just to be clear: if we want to be honest, let's all admit that containers are here to stay because they're simply more convenient and more performant. Full stop. Like almost everything in IT these days, security doesn't figure into the equation at this point in the collective decision-making process.

Therefore, it would be prudent to invest heavily in auditing containers. But I would submit that it would be more prudent to improve auditing of the kernel. Because at the end of the day that's the relevant scope. Containers use and rely upon the very parts of the kernel that are most vulnerable and most often exploited anyhow.

This idea that we can, a priori, focus our attention on narrow details in an attempt to be more efficient about fixing security issues is just a really bad problem. It's not unlike Security Theater. In Security Theater we delude ourselves into thinking we can divine the future, then put all our eggs into fixing the imagined problems. When really, except for fixing those particular issues that have manifestly caused problems, we need to specialize in being security generalists.

Garrett: Linux Container Security

Posted Oct 25, 2014 1:40 UTC (Sat) by mjg59 (subscriber, #23239) [Link] (9 responses)

My assumption is that hypervisors will continue to be more secure than containers, but that a sufficiently concerted effort may get the *absolute* number of exploitable bugs sufficiently low that the real-world frequency of hypervisor vulnerabilities is not significantly smaller than for containers.

Garrett: Linux Container Security

Posted Oct 25, 2014 2:19 UTC (Sat) by dlang (guest, #313) [Link] (4 responses)

It's also important to realize that a lot of people who use 'virtualization' aren't doing it for the security, they are doing it for the ease of management. And for those people, the slight decrease in container security vs hypervisor security doesn't matter.

If the bad guys break into your webserver VM and it has access to your database, it doesn't matter that they can't easily hack your DNS server that happens to be running on 2 boxes out of 1000.

Security is not the purpose of having a computer network, serving your customers is the purpose of having a computer network, and security needs to be a cost/benefit evaluation to decide when you have "enough" security for your network.

On the banking networks that I help run, the security needs to be much tighter than on a network serving wordpress sites.

Garrett: Linux Container Security

Posted Oct 25, 2014 2:50 UTC (Sat) by Cyberax (✭ supporter ✭, #52523) [Link] (3 responses)

The problem is, if an adversary breaks into a machine through an unofficial Wordpress blog describing recent cafeteria menu additions, then they can potentially compromise the mission-critical company DNS server running on the same hardware node.

Garrett: Linux Container Security

Posted Oct 25, 2014 2:57 UTC (Sat) by dlang (guest, #313) [Link] (2 responses)

and the answer to that is to keep your production and corporate resources on separate hardware.

If you have people running "unofficial" stuff on your production systems, no security strategy in the world is going to save you because the users are going to ignore it.

Garrett: Linux Container Security

Posted Oct 25, 2014 4:12 UTC (Sat) by raven667 (subscriber, #5198) [Link] (1 responses)

Having different VM clusters for hosts at different security levels is a fool-proof way to ward off this kind of vulnerability, and the VM systems I ran years ago did this out of an abundance of caution. I think that today we expect and rely much more heavily on the hypervisor to separate VMs, especially when you are just renting shared hosting time on someone else's hardware. _Any_ amount of multi-tenancy makes the hypervisor much more security-critical; _your_ critical server might be on the same host as someone else's Wordpress blog.

Garrett: Linux Container Security

Posted Oct 28, 2014 17:14 UTC (Tue) by NightMonkey (subscriber, #23051) [Link]

Be cautious in thinking that *anything* involving computers and networks is "fool-proof". Security Theater's curtains open with that word as an introduction. :)

Garrett: Linux Container Security

Posted Oct 25, 2014 12:27 UTC (Sat) by deepfire (guest, #26138) [Link] (3 responses)

The question is, _of course_, what this "sufficiently concerted" effort amounts to.

It is at the heart of the aforementioned accounting fallacy, after all.

The amount of effort you speak of appears truly titanic to me. Yes, the difference in attack surfaces really is that huge -- let's not forget about it.

In fact, it's so bad that it enters the realm of the sensible to completely abolish the classic-OS-kernels-provide-isolation model.

Garrett: Linux Container Security

Posted Oct 25, 2014 12:28 UTC (Sat) by deepfire (guest, #26138) [Link] (2 responses)

Let me clarify -- the model of classic-OS-kernels-provide-isolation-for-security.

Garrett: Linux Container Security

Posted Oct 26, 2014 23:52 UTC (Sun) by NightMonkey (subscriber, #23051) [Link] (1 responses)

Or, perhaps, it is time to ask what was so wrong with it that we think the new models are better? In my concrete experience, the problem with 'old' security stances was that no one was willing to really investigate and understand the tools we already had available. This present set of movements seems to just blame the old because it is old, and presumes the new is always better.

Sincerely,
G. O. Mai Laun

Garrett: Linux Container Security

Posted Nov 1, 2014 20:19 UTC (Sat) by deepfire (guest, #26138) [Link]

I posit that the huge attack surface is the problem of the classic model.

If you can't fit the whole attack surface in your head, you are bound to have bugs -- and it will _always_ be a losing battle.

Garrett: Linux Container Security

Posted Oct 29, 2014 8:47 UTC (Wed) by christersolskogen (guest, #70295) [Link]

The main problem with linux containers is not that they are not secure. It's that they are not secure by default.

