Updates in container isolation
Posted May 17, 2018 1:22 UTC (Thu) by roc (subscriber, #30627)
Parent article: Updates in container isolation
This doesn't sound right. The kernel running inside the container is inside the sandbox (the virtual machine interface implemented by the hypervisor), therefore it cannot add to the attack surface.
Posted May 17, 2018 1:36 UTC (Thu)
by anarcat (subscriber, #66354)
[Link] (2 responses)
Posted May 17, 2018 2:23 UTC (Thu)
by thinxer (guest, #121772)
[Link] (1 responses)
Posted May 17, 2018 14:00 UTC (Thu)
by anarcat (subscriber, #66354)
[Link]
Still, the way Xen is designed just feels a little backwards to me, as the first layer is not actually the hypervisor itself but a (compatible) kernel that talks with the hypervisor. And yes, that *does* provide an *extra* layer of security at the cost of performance. But Xen's design also means that the privileged supervisor domain it requires (dom0) is now part of the attack surface as well, and I seem to recall that being used as an attack vector in the past, though I could be mistaken there. I think this is where my analogy came from, but I must admit I cannot substantiate this any further, and I am forced to recognize that the attack surfaces are comparable with those of other hypervisors like gVisor, at least conceptually.
Posted May 17, 2018 23:34 UTC (Thu)
by roc (subscriber, #30627)
[Link] (3 responses)
I think the security argument for gVisor-KVM is that if you have a KVM escape that escapes into the hypervisor's user-level, not the host kernel itself, then you're still in a very restrictive sandbox around the gVisor kernel. Whereas with Kata you'd be in QEMU which probably needs a much less restricted sandbox.
Although one interesting question is, do gVisor-KVM guest processes run at ring 0 or ring 3? If it's ring 3 somehow, then that would be an additional security layer for gVisor, but worse for performance.
I can see advantages for gVisor in terms of memory and storage usage, because the guest can share the host file system rather than mounting its own on a virtual block device.
Posted May 20, 2018 6:31 UTC (Sun)
by prattmic (subscriber, #101817)
[Link]
Posted May 21, 2018 1:39 UTC (Mon)
by bergwolf (guest, #55931)
[Link] (1 responses)
It's ring 3, and each syscall has to vmexit. Bad news for syscall-intensive applications.
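A rough way to see this cost for yourself, as a sketch only: time a trivial syscall in a tight loop, then run the same script on the host and inside the sandbox and compare. The function name and iteration count below are made up for illustration, and the figure includes Python interpreter overhead, so only the difference between the two environments means anything.

```python
import os
import time


def mean_syscall_ns(iterations=200_000):
    """Return the mean wall-clock nanoseconds per os.getppid() call.

    getppid() is used because it does almost no work in the kernel,
    so the time is dominated by the cost of entering and leaving it.
    """
    start = time.perf_counter_ns()
    for _ in range(iterations):
        os.getppid()
    elapsed = time.perf_counter_ns() - start
    return elapsed / iterations


if __name__ == "__main__":
    # Run once on the host and once inside the sandbox, then compare.
    print(f"{mean_syscall_ns():.0f} ns per getppid() call")
```

If every syscall really does cost a vmexit, the sandboxed number should be noticeably higher than the host number; how much higher will depend on the hardware and platform.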
Posted Jun 1, 2018 8:47 UTC (Fri)
by ZhuYanhai (guest, #44977)
[Link]
And the sentry runs in ring 3 on the ptrace platform, which is designed for development and debugging purposes only.