
Xen again

By Jonathan Corbet
June 3, 2009
Your editor is widely known for his invariably correct and infallible predictions. So, certainly, he would never have said something like this:

Mistakes may have been made in Xen's history, but it is a project which remains alive, and which has clear reasons to exist. Your editor predicts that the Dom0 code will find little opposition at the opening of the 2.6.30 merge window.

OK, anybody needing any further evidence of your editor's ability to foresee the future need only look at his investment portfolio...or, shall we say, the smoldering remains thereof. Needless to say, Xen Dom0 support did not get through the 2.6.30 merge window, and it's not looking very good for 2.6.31 either.

Dom0, remember, is the hypervisor portion of the Xen system; it's the One Ring which binds all the others. Unlike the DomU support (used for ordinary guests), Dom0 remains outside of the mainline kernel. So anybody who ships it must patch it in separately; for a patch as large and intrusive as Dom0, that is not a pleasant task. It is a necessary one, though; Xen has a lot of users. As expressed by Xen hacker Jeremy Fitzhardinge:

Xen is very widely used. There are at least 500k servers running Xen in commercial user sites (and untold numbers of smaller sites and personal users), running millions of virtual guest domains. If you browse the net at all widely, you're likely to be using a Xen-based server; all of Amazon runs on Xen, for example. Mozilla and Debian are hosted on Xen systems.

Xen developers and users would all like to see that code merged into the mainline. A number of otherwise uninvolved kernel developers have also argued in favor of merging this code. So one might well wonder why there is still opposition.

One problem is a fundamental disagreement with the Xen design, which calls for a separate user-space hypervisor component. To some developers, it looks like an unfortunate mishmash of code in the mainline kernel, in Xen-specific kernel code, and in user space - with, of course, a set-in-concrete user-space ABI in the middle. Many developers are more comfortable with the fully in-kernel hypervisor approach taken by KVM. Thomas Gleixner is especially worried about the possible results of merging the Xen Dom0 code for this reason (among several others):

Aside of that it can also hinder the development of a properly designed hypervisor in Linux: 'why bother with that new stuff, it might be cleaner and nicer, but we have this Xen dom0 stuff already?'.

Steven Rostedt, who has worked on Xen in the past, also dislikes the hypervisor design and the effects it has on kernel development:

The major difference between KVM and Xen is that KVM _is_ part of Linux. Xen is not. The reason that this matters is that if we need to make a change to the way Linux works we can simply make KVM handle the change. That is, you could think of it as Dom0 and the hypervisor would always be in sync.

If we were to break an interface with Dom0 for Xen then we would have a bunch of people crying foul about us breaking a defined API. One of Thomas's complaints (and a valid one) is that once Linux supports an external API it must always keep it compatible. This will hamper new development in Linux if the APIs are scattered throughout the kernel without much thought.

Steven suggests merging the Xen hypervisor into the mainline so that it's all part of Linux, and to make the hypervisor ABI an internal, changeable interface. Some other developers - generally those most hostile to merging Dom0 in its current form - supported this idea. It's certainly not the first time that this sort of idea has been raised. But, despite many calls to bring some of the "plumbing layer" into the kernel proper, that has yet to happen; it seems unlikely that something as large as Xen would be the first user-space component to break through that barrier - even if the Xen developers were amenable to that approach.

The hypervisor design would probably not be an insurmountable obstacle to merging by itself. But there are other complaints. The maintainers of the x86 architecture dislike the changes made to their code by the Dom0 patches. By their reckoning, there are far too many "if (xen)..." conditionals and too many #ifdefs. They would very much like to see the Xen code cleaned up and made less intrusive into the core x86 code. Linus supports them on this point:

The fact is (and this is a _fact_): Xen is a total mess from a development standpoint. I talked about this in private with Jeremy. Xen pollutes the architecture code in ways that NO OTHER subsystem does. And I have never EVER seen the Xen developers really acknowledge that and try to fix it.

The Xen cause was also not helped by some performance numbers posted by Ingo Molnar. If you choose the right benchmark, it seems, you can show that the paravirt_ops layer imposes a 1% overhead on kernel performance. Paravirt_ops is the code which abstracts low-level machine operations; it can enable the same kernel to run either on "bare metal" or virtualized under a hypervisor. It adds a layer of indirect function calls where, before, inline code was used. Those function calls come at a cost which has now been quantified by Ingo (but one should note that Rusty Russell has shown that, with the right benchmark, a number of other common configuration options have a much higher cost).
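
For illustration only, here is a minimal user-space C sketch of the pattern involved; the names and the ops structure are made up and are not the kernel's actual pv_ops definitions. The idea: an operation that bare metal would execute as a single inline instruction instead goes through an indirect call via an ops table, which a hypervisor backend could repoint at boot. The per-call-site cost of that indirection is what the benchmarks measure.

    /* Sketch only -- not the kernel's real paravirt code; x86-64 assumed. */
    #include <stdio.h>

    struct pv_ops_example {
        unsigned long (*read_timestamp)(void);  /* stand-in for a low-level op */
    };

    /* "Native" backend: the inline instruction, as bare metal would run it. */
    static unsigned long native_read_timestamp(void)
    {
        unsigned int lo, hi;
        asm volatile("rdtsc" : "=a" (lo), "=d" (hi));
        return ((unsigned long)hi << 32) | lo;
    }

    /* A hypervisor backend would install its own implementation here instead. */
    static struct pv_ops_example pv_ops = { .read_timestamp = native_read_timestamp };

    int main(void)
    {
        /* Every call site now pays for an indirect call instead of inline code. */
        printf("tsc = %lu\n", pv_ops.read_timestamp());
        return 0;
    }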

The problem here is not that Xen users have a slower kernel; the real issue is that any kernel which might ever be run under Xen must be built with paravirt_ops enabled. There are few things which make distributors' lives more miserable than forcing them to build, ship, and support another kernel configuration. So most distributor kernels run with paravirt_ops enabled; that means that all users, regardless of whether they have any interest in Xen, pay the price. In some cases, that cost is too high; Nick Piggin said:

FWIW, we had to disable paravirt in our default SLES11 kernel. (admittedly this was before some of the recent improvements were made). But there are only so many 1% performance regressions you can introduce before customers won't upgrade (or vendors won't publish benchmarks with the new software).

Ingo is strongly critical of the perceived cost of paravirt_ops, but he also proposes a solution:

Note what _is_ acceptable and what _is_ doable is to be a bit more inventive when dumping this optional, currently-high-overhead paravirt feature on us. My message to Xen folks is: use dynamic patching, fix your hypervisor and just use plain old-fashioned _restraint_ and common sense when engineering things, and for heaven's sake, _care_ about the native kernel's performance because in the long run it's your bread and butter too.

He goes on to say that merging Dom0 now would only make things worse; it would give the Xen developers less incentive to fix the problems while, simultaneously, making it harder for distributors to disable paravirt_ops in their kernels.

And that, perhaps, leads to the fundamental disconnect in this discussion. There are two distinctive lines of thought with regard to when code with known problems should be merged:

  • Some developers point out that code which is in the mainline benefits from the attention of a much wider pool of developers and improves much more quickly. It is easy to find examples of code which, after languishing for years out of the mainline, improved quickly after being merged. This is the reasoning behind the -staging tree and the general policy toward merging drivers sooner rather than later.

  • Some developers - sometimes, amusingly, the same developers - say, instead, that the best time to get fundamental problems fixed is before merging. This is undoubtedly true for user-space ABI issues; those often cannot be fixed at all after they have been shipped in a stable kernel. But holding code out of the mainline is also a powerful lever which subsystem maintainers can employ to motivate developers to fix problems. Once the code is merged, that particular tool is no longer available.

Both of these themes run through the Xen discussion. There is no doubt that the Xen Dom0 code would see more eyeballs - and patches - after being merged. So some developers think that the right thing to do is to merge this much-requested feature, then fix it up afterward. Chris Mason put it this way:

The idea that we should take code that is heavily used is important. The best place to fix xen is in the kernel. It always has been, and keeping it out is just making it harder on everyone involved.

But the stronger voice looks to be the one saying that the problems need to be fixed first. The deciding factors seem to be (1) the user-space ABI, and (2) the intrusion into the core x86 code; those issues make Xen different from yet another driver or filesystem. That, in turn, suggests that the Dom0 code is not destined for the mainline anytime soon. Instead, the Xen developers will be expected to go back and fix a list of problems - a lot of work with an uncertain result at the end.

Index entries for this article
Kernel: Virtualization/Xen
Kernel: Xen



Xen again

Posted Jun 3, 2009 15:56 UTC (Wed) by sf_alpha (guest, #40328) [Link] (1 responses)

I agree that Xen should be merged when it is ready. Merging the dom0 code into the kernel now may not bring much benefit to Xen.

The Xen user base is large enough to keep an eye on active Xen development, as long as the Xen team keeps tracking the latest kernel (in git). Anyone can grab, test, report on, and fix Xen issues while still being able to use the latest kernel features and drivers.

It is not too hard to get the Xen dom0 patches from git for a stable kernel.

Distributions can ship more than one kernel to support Xen (and they usually do for PAE).

The last time I tested Xen dom0 (xen-tip/next), there were still some missing features compared to the stable Xen kernel.

But ... KVM is not a replacement for Xen. At the least, KVM requires processor support to work.

Xen again

Posted Jun 4, 2009 2:10 UTC (Thu) by jengelh (guest, #33263) [Link]

> But ... KVM is not a replacement for Xen. At the least, KVM requires processor support to work.

But the effort to get Xen running these days is high, compared to less paravirtualization-heavy hypervisors like VirtualBox/VMware/etc., from this user's point of view. Even UML requires less work ;)

Xen again: which ABI

Posted Jun 3, 2009 16:40 UTC (Wed) by arjan (subscriber, #36785) [Link] (2 responses)

The main concern is the hypervisor <-> kernel ABI, not a kernel <-> userspace ABI. The hypervisor isn't a userspace component; it's more the other way around: the Linux kernel is the "userspace" of Xen.

Xen again: which ABI

Posted Jun 3, 2009 17:56 UTC (Wed) by aigarius (subscriber, #7329) [Link] (1 responses)

Yes, but for a kernel developer anything outside of Linux (whether higher in the food chain or lower) is 'userspace'. Even the BIOS. When people say that the Kernel<->Userspace interface is set in stone, they really mean Kernel<->Anything-that-is-not-the-kernel.

Xen again: which ABI

Posted Jun 3, 2009 23:39 UTC (Wed) by mjg59 (subscriber, #23239) [Link]

That's not the case. We've broken the Linux <-> BIOS interface several times in ACPI, though admittedly every time it's been to make ourselves look more like Windows.

Let's step back a bit

Posted Jun 3, 2009 18:06 UTC (Wed) by BrucePerens (guest, #2510) [Link] (40 responses)

I've not been following virtualization too closely.

What does Xen do that KVM doesn't? What is missing from both that a "proper" virtualization system for Linux would provide?

Let's step back a bit

Posted Jun 3, 2009 19:03 UTC (Wed) by Thue (guest, #14277) [Link] (30 responses)

Unlike KVM, XEN does not require hardware virtualization support.

Let's step back a bit

Posted Jun 3, 2009 19:21 UTC (Wed) by BrucePerens (guest, #2510) [Link] (18 responses)

OK. But DomU is already in the kernel, and isn't that part already coded to not require hardware virtualization support?

So, the important part of Xen, in that it provides something that KVM doesn't have, is already in the kernel. KVM has a hypervisor already in the kernel. The Xen hypervisor is inelegant.

So, is it possible to make the KVM hypervisor support Dom0?

Let's step back a bit

Posted Jun 3, 2009 20:09 UTC (Wed) by nevets (subscriber, #11875) [Link] (5 responses)

KVM only works on hardware that has virtualization support. Of my 12 boxes, I have three that do that. One is crap, the other is OK, and the third is my latest laptop.

KVM developers have no interest in (nor have they designed KVM for) working with paravirtualization (the thing an OS needs to run on hardware without virtualization support). Although, I do believe KVM can make use of virtio, but that's another story.

We have enough in the kernel to support a DomU. That is, a true guest.

But Dom0 is a special guest with Xen. The Xen hypervisor passes off the work of drivers to Dom0. But this interface between Dom0 and the hypervisor is a bit more intrusive than the interface needed by DomU (which already exists).

The issue is that once we add this Dom0 interface, we will forever need to support it. Because any changes we make will break Xen. This is why I suggested having Linux host the Xen source code. Then we can freely change the Dom0<->hypervisor interface without worrying about breaking an external ABI.

Note, my suggestion is not about Xen being inside Linux. It would still be a microkernel loaded first. But the vmlinuz image would be a single one. First we load the Xen hypervisor, and then we load Dom0. This would couple the two tightly, and the user would not need to worry about incompatibilities.

Let's step back a bit

Posted Jun 4, 2009 8:42 UTC (Thu) by rwmj (subscriber, #5474) [Link] (4 responses)

Bruce, this is an interesting and valid point, but it's also a bit like the discussion of 3D rendering that happened in the mid 90s. Sure, 3D graphics cards were rare and expensive at first, and that meant there was a place for software rendering.

Nowadays though no serious 3D program (ie. no game!) comes with a software renderer, because the 3D hardware is everywhere, on motherboards, in open handhelds like the GP2x-Wiz, and even in experimental boards like the ARM-based Beagleboard.

Hardware virt support is in just about every new x86-64 processor that comes out. A few 32 bit netbooks don't have it right now, but it'll come to those too.

Also don't overlook the fact that KVM does have software emulation. OK, it's slow, it's in userspace, and it relies on qemu. Nevertheless, just running qemu-kvm will transparently fall back to software emulation if the hardware doesn't support virtualization.

Rich.

Let's step back a bit

Posted Jun 4, 2009 8:43 UTC (Thu) by rwmj (subscriber, #5474) [Link]

s/Bruce/nevets/ ...

Let's step back a bit

Posted Jun 4, 2009 12:18 UTC (Thu) by nye (subscriber, #51576) [Link]

While I mostly agree, there are still a number of new mainstream CPUs which don't support hardware virtualisation, mostly aimed at the budget or mobile market. I know if I could use KVM on this laptop it would make my life a little easier. I've never used Xen so I don't know if it would be worth the effort for my purposes, but I'd be a lot more likely to try if it were in the kernel already.

Let's step back a bit

Posted Jun 4, 2009 17:44 UTC (Thu) by buchanmilne (guest, #42315) [Link] (1 responses)

> Hardware virt support is in just about every new x86-64 processor that comes out.

But, an 18-month-old 16-core (8*"Dual Core AMD Opteron(tm) Processor 885") server (Sun X4600-M1) doesn't have it. With another 5 years of lifetime on these boxes, it really would be nice to keep Xen (which is what they are currently running). There's no way I would migrate this (with heavily utilised VMs) to qemu-kvm ...

Let's step back a bit

Posted Jun 4, 2009 21:28 UTC (Thu) by jimparis (guest, #38647) [Link]

But when you were purchasing a box 18 months ago, with a plan for a 5-year lifetime, wasn't it a huge mistake to overlook the hardware virtualization feature? I mean, KVM was merged into the kernel some 28 months ago.
I agree your situation sucks, but it seems more of a purchasing mistake than a reason to not move the world towards proper hardware virtualization.

Let's step back a bit

Posted Jun 3, 2009 20:10 UTC (Wed) by dtlin (subscriber, #36537) [Link]

xenner is a utility which is able to run xen paravirtualized kernels as guests on linux hosts, without the xen hypervisor, using kvm instead.

I haven't tried it out, but running Xen DomU on KVM seems perfectly possible.  In any case, KVM and Xen+HVM are about equal in terms of guest support.

KVM's "Dom0" is the unmodified Linux kernel, running on bare hardware — there's nothing special about it.  I'm not sure why you'd even want Xen's Dom0 there?

            HVM                     No HVM
KVM   Supports many guests    Not possible
Xen   Supports many guests    Supports paravirtualized guests

The "not possible" (unless you're satisfied with QEMU) is what the Xen supporters are really focusing on.

No, it's completely unrelated.

Posted Jun 3, 2009 20:35 UTC (Wed) by gwolf (subscriber, #14632) [Link] (10 responses)

Xen and KVM are similar in that both can be used to run _hardware-assisted_ virtual machines. The strategies are, yes, completely different - KVM uses Linux as the "uppermost" piece, and each virtual machine is just a process as far as the host Linux is concerned.
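
(For what it's worth, a minimal sketch of that "a VM is just a process" point, using the real /dev/kvm ioctl interface; error handling is mostly trimmed, and it obviously needs hardware virtualization support and the kvm modules loaded:)

    /* Minimal illustration: a KVM guest is created by an ordinary process
     * talking to /dev/kvm with ioctls; error handling mostly omitted. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    int main(void)
    {
        int kvm = open("/dev/kvm", O_RDWR);         /* fails without HW/driver support */
        if (kvm < 0) { perror("/dev/kvm"); return 1; }

        printf("KVM API version: %d\n", ioctl(kvm, KVM_GET_API_VERSION, 0));

        int vm   = ioctl(kvm, KVM_CREATE_VM, 0);    /* a new, empty virtual machine */
        int vcpu = ioctl(vm, KVM_CREATE_VCPU, 0);   /* its first virtual CPU */
        printf("vm fd = %d, vcpu fd = %d\n", vm, vcpu);

        /* Both are just file descriptors of this ordinary process; the host
         * scheduler and tools like top see the guest as part of it. */
        return 0;
    }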

KVM is great, say, if you want to run Windows instances - none of them will know (well, except for the hardware self-description strings) that they are running virtualized. The same, yes, applies to Xen.

However, Xen's paravirtualization functionality is completely unmatched by KVM - Xen can run DomU (guest) kernels that are explicitly aware they are running under a paravirtualized environment. This, of course, excludes non-free software, as it would have to be ported to the Xen pseudo-architecture. However, it is a very popular way to run completely independent Linux systems.

Why do you want to paravirtualize? Because the performance impact is way lower. You don't have to emulate hardware at all - in a regular virtualization setup, the guest OS will still shuffle bits around to hand them to, say, the ATA I/O interface, possibly aligning them to cylinder/head/sector - on a hard disk that just does not exist, being really a file on another filesystem or whatever. When it is paravirtualized, the guest OS just signals the host OS to do its magic.

My favorite way out for most of the cases I would be forced to handle with Xen for this kind of needs is to use vserver - Which is _not_ formally a virtualization technology, but a compartmentalization/isolation technology (akin to what was introduced as the BSD Jails around 2000), where many almost-independent hosts share a single kernel, but live within different security contexts.

No, it's completely unrelated.

Posted Jun 4, 2009 1:59 UTC (Thu) by drag (guest, #31333) [Link] (9 responses)

> My favorite way out for most of the cases I would be forced to handle with Xen for this kind of needs is to use vserver - Which is _not_ formally a virtualization technology, but a compartmentalization/isolation technology (akin to what was introduced as the BSD Jails around 2000), where many almost-independent hosts share a single kernel, but live within different security contexts.

Well, things like BSD Jails, Vserver, OpenVZ, etc. are all very much virtualization technologies in a very real sense. They just are not hardware virtualization.

> Why do you want to paravirtualize? Because the performance impact is way lower. You don't have to emulate hardware at all - In a regular virtualization setup, the guest OS will still shuffle bits around to give them to, say, the ATA I/O interface, possibly aligning them to cylinder/head/sector - On a hard disk that just does not exist, that is a file on another filesystem or whatever. When it is paravirtualized, the guest OS just signals the host OS to do its magic.

Heh. KVM has paravirt drivers that are built into the kernel right now.

virtio-blk = block driver
virtio-rng = random number generator
virtio-net = ethernet network driver
virtio-balloon = used for reclaiming memory from VMs
virtio-pci = pci driver
9pnet_virtio = plan9 networking

And that works fine with updated versions of Qemu also. So you should be able to take advantage of them if you're using Kqemu + Qemu for your virtualization. I think. But virtio is a standardized way of doing things. It should probably work with Qemu-dm for Xen stuff, too.

I think there are Windows drivers for virtio network. I am not sure about virtio block or balloon, though...

I don't know how well KVM + virtio compares to Xen PV...

Then on top of that you can use AMD's IOMMU or Intel's VT-d to map real hardware directly to virtualized hosts, which would be the fastest possible option since you're handing off direct access to the hardware.

No, it's completely unrelated.

Posted Jun 4, 2009 7:03 UTC (Thu) by sf_alpha (guest, #40328) [Link] (1 responses)

If KVM + virtio still needs processor support, it would be very slow compared to Xen when running on an unsupported processor.

No, it's completely unrelated.

Posted Jun 4, 2009 12:20 UTC (Thu) by drag (guest, #31333) [Link]

Yes.

You need to have Intel or AMD's virtualization support to take advantage of KVM.

Even with the virtualization support, KVM will be slower than PV. Xen's PV is far superior in terms of performance in almost all situations.

KVM's advantages over Xen are:

* Cleaner design. I am guessing that the KVM hypervisor code is between 20k-30k lines with all the architectures it supports, whereas Xen's hypervisor code is easily 10x that much.

* Much easier to administer and deal with. Does not require patches, does not require rebooting or anything of that nature. It's "just there". Does not require special console software or management tools beyond just qemu, if that is all you want. You can use top to monitor VMs and ctrl-z to pause them if you started them from a terminal, for example.

* Does not require your OS to be "lifted" into a Dom0... The way Linux interacts with the hardware does not change. This means (with the latest kernels) I can suspend my laptop while running VMs and it just works.

* Heavily leverages Linux's existing features. Instead of having to write various pieces of hardware support into the hypervisor, KVM gets all that and more by default. When Linux makes improvements to, say, memory management, people using KVM directly benefit from that work.
(This is not a huge advantage over Xen; it's more of a big improvement when compared to VMware ESX... no restrictions on hardware, network block protocols, SATA or anything like that... if Linux supports it, you can use it with KVM.)

* It is already installed and set up on your machine. All you have to do is install the qemu portion and the virt-manager or libvirt stuff if you want a nice and easy way to manage the VMs. All Linux distributions have KVM support; its modules are enabled by default in everything I've looked at.

disadvantages:

* PV on Xen is still easily performance king.

* Requires some hardware support.

No, it's completely unrelated.

Posted Jun 4, 2009 7:06 UTC (Thu) by bronson (subscriber, #4806) [Link] (4 responses)

> 9pnet_virtio

Wow, people are still writing 9p code? Given the sad state of http://sourceforge.net/projects/v9fs and http://sourceforge.net/projects/npfs I thought that these projects were stone dead.

I'd really like a network filesystem that is easier to administer than NFS and CIFS... Tried DRBD but didn't like it much. Is v9fs worth a look?

No, it's completely unrelated.

Posted Jun 4, 2009 12:03 UTC (Thu) by drag (guest, #31333) [Link] (2 responses)

No clue about plan9.

But DRBD is a way of keeping volumes in sync, not so much a file system.

The easiest FS to administer that I know of is sshfs. I use it heavily and it is stable and actually very fast. It can even beat NFS sometimes. And all you need is an OpenSSH server running and fuse support in the client. The ssh server is the real gauge of how well sshfs works. Anything other than a relatively recent version of OpenSSH and I doubt the results will be that good.

But if DRBD was even being considered, then your needs are going to be specialized. Other alternatives to look at could possibly be Red Hat's GNBD from GFS, or iSCSI.

No, it's completely unrelated.

Posted Jun 4, 2009 19:32 UTC (Thu) by bronson (subscriber, #4806) [Link] (1 responses)

Tried sshfs 5 or so years ago, rejected it because the crypto overhead prevented me from filling a 100 MBit link. I should probably try it again since that won't be a problem nowadays.

I only mentioned DRBD to illustrate how desperate I've become! It was actually pretty good except that I couldn't get the split brain recovery to work the way I wanted. So close and yet so far. Haven't gotten desperate enough to try AFS yet!

Why doesn't 9p or webdav or some simple protocol take off? It's amazing to me that NFS and CIFS are still state of the art. I guess I don't understand the trade-offs very well.

No, it's completely unrelated.

Posted Jun 4, 2009 20:20 UTC (Thu) by drag (guest, #31333) [Link]

For sshfs, if you want to have good performance you need to disable compression. If you think the crypto has too much overhead, then change the encryption method to RC4.

Very likely you were running something like 3DES, which has very high overhead. And like I said, you need a relatively recent version of OpenSSH (say, a version from the past 2 years or so) for reliable service.

You can set these on a per server basis in your ~/.ssh/config
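
(A minimal ~/.ssh/config sketch along those lines -- the host alias and hostname are made up; sshfs goes through regular ssh, so mounting that host picks these options up automatically:)

    # example host entry (alias and hostname are placeholders)
    Host fileserver
        HostName fileserver.example.com
        # skip compression for bulk transfers over a fast link
        Compression no
        # a lighter cipher than the 3DES-era defaults; fine on a trusted link
        Ciphers arcfour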

I have had no problem personally beating NFS when it comes to my personal usage at home over wireless and gigabit link.. although of course this sort of thing is not suitable for large numbers of users.

:)

No, it's completely unrelated.

Posted Jun 4, 2009 14:08 UTC (Thu) by sbergman27 (guest, #10767) [Link]

My understanding is that the main thrust of the 9p virtio stuff is to implement shared volumes without all the ugly network guts being exposed to the administrator. And hopefully, at lower latency than the rather significant local latencies one sees even using a virtio network driver.

I have an ugly situation where I have a (proprietary) cobol C/ISAM <-> SQL gateway to some cobol accounting files. Due to the brain-deadness of the proprietary vendor (political concerns, their licensing with their Cobol runtime supplier, yadda, yadda, yadda...), I have to run it virtualized in an old distro and it sees the C/ISAM files via NFS4. It's written to do a lot of fsync'ing and doesn't seem to make any use of any sort of NFS caching, and so latency absolutely kills its performance. I can't use any of the virtio stuff because the guest kernel is too old to support it, and even that has latencies in the hundreds of microseconds. So I'm using the software emulated E1000 driver, which is almost as efficient as virtio.

However, if I could use the 9p shared volume stuff, I suspect, but am not sure, that latency would be much improved. As it stands, it is still over twice as fast as running on a separate machine via NFS4 over 1000baseT.

So far as I know, the 9p-virtio thing is still an active project, but not yet in mainline KVM. Or, at least, it does not seem to be in Ubuntu 9.04 server.

No, it's completely unrelated.

Posted Jun 4, 2009 12:48 UTC (Thu) by gwolf (subscriber, #14632) [Link] (1 responses)

> Heh. KVM has paravirt drivers that are built into the kernel right now.

Yes, and that's good - I use KVM with paravirt network and disk devices for Windows hosts. Still, many things (i.e. memory access, real CPU mapping, even the kind of architecture the guests report as having) have to be emulated. Paravirt devices are a great boost, though - And by being much simpler, say, than hardware-specific drivers, I am also reducing the most common cause for Windows' instability.

Now, both with Xen and with KVM (and I'd expect with any other virtualization technology) you can forward a real device - Just remove support for it on the host (or Dom0) kernel and ask the virtualizer to forward the needed interrupts/mapped memory space/bus address, and you have it natively inside. Of course, you lose the ability to perform live migrations - But you cannot always win! :)

No, it's completely unrelated.

Posted Jun 10, 2009 17:37 UTC (Wed) by tmassey (guest, #52228) [Link]

You say you have virtualized *disk* drivers for Windows for KVM? I'm aware of the paravirt network drivers, but I've looked repeatedly for block drivers. They've always been 'planned for the future', but I've not been able to find them.

Where would I get paravirt Windows drivers for KVM?

Let's step back a bit

Posted Jun 3, 2009 20:13 UTC (Wed) by ncm (guest, #165) [Link] (7 responses)

Does it really matter any more whether a new release of Xen requires hardware virtualization support? Doesn't all the current hardware where people want to run Xen have such support already? This seems akin to compilers supporting funny x86 memory models long after everybody already had a 386. (There were lots of 286s still around, but their owners weren't buying new software.) How many of these 500,000 servers running Xen can't run KVM? And aren't those on a schedule to be retired, for other reasons (e.g. power consumption, increasing failure rate, etc.) soon?

Let's step back a bit

Posted Jun 4, 2009 12:56 UTC (Thu) by gwolf (subscriber, #14632) [Link] (5 responses)

> How many of these 500,000 servers running Xen can't run KVM? And aren't those on a schedule to be retired, for other reasons (e.g. power consumption, increasing failure rate, etc.) soon?

When I bought my laptop, January 2008, I shopped explicitly for one with virtualization capability. However, for a long time I just was not able to use it as such - Because of the lack of support in Xen for core features I want a laptop to support, such as ACPI (which is mainly useful for laptops, granted, but that could be very well used everywhere, leading to noticeable power savings). Virtualization does not only work at the server farm, it can also be very useful at desktops.

Let's step back a bit

Posted Jun 4, 2009 15:42 UTC (Thu) by TomMD (guest, #56998) [Link] (3 responses)

> Virtualization does not only work at the server farm, it can also be very useful at desktops.

YES! And it's not just for x86 anymore; there are architectures that don't have VT or SVM hackery and are perfectly viable users of Xen. I'd love to run Xen on the (ARM based) beagle board or a BB based laptop.

Let's step back a bit

Posted Jun 4, 2009 20:29 UTC (Thu) by drag (guest, #31333) [Link] (2 responses)

The VT and SVM cpu extensions are only needed for the x86 platform because the x86 ISA design is such a huge pile of shit.

KVM works fine on other architectures (like PowerPC), so that is all a bit of a red herring.

For x86 systems that do not have VT/SVM you can use Kqemu and get similar functionality and speed.

Let's step back a bit

Posted Jun 9, 2009 2:11 UTC (Tue) by xyzzy (guest, #1984) [Link]

I migrated my Xen DomUs to kqemu VMs a year ago. I didn't rigorously benchmark, but the performance drop was noticeable -- I went from being able to fill 100mbps to not being able to fill even half of it. And this was with wget and apache and static files, so mostly an I/O performance issue.

Let's step back a bit

Posted Jun 9, 2009 7:50 UTC (Tue) by paulj (subscriber, #341) [Link]

Kqemu is long unmaintained. The Qemu developers are discussing ripping it out. Kqemu guest-kernel-space is very buggy and nearly always unusable. So any deployment of Kqemu will run the guest kernel under emulation, which obviously leads to very poor performance for all applications except those which are near-completely userspace CPU bound.

Let's step back a bit

Posted Jun 7, 2009 10:41 UTC (Sun) by djao (guest, #4263) [Link]

> When I bought my laptop, January 2008, I shopped explicitly for one with virtualization capability. However, for a long time I just was not able to use it as such - Because of the lack of support in Xen for core features I want a laptop to support, such as ACPI

This is a fatal flaw in Xen, sure, but I don't understand why it would have stopped you from using KVM. You mention that you specifically bought a laptop with support for hardware virtualization, and KVM works fine with ACPI or any other core laptop feature, since KVM is just Linux.

I bought my laptop in April 2008 and I've been using it with KVM almost from day one. Everything works great, including ACPI.

Let's step back a bit

Posted Jun 4, 2009 13:28 UTC (Thu) by ESRI (guest, #52806) [Link]

I know we have a LOT of Dell PE 2850's and newer still with a lot of life and horsepower in them... perfect for running Xen, but not at all good for running KVM.

Let's step back a bit

Posted Jun 3, 2009 22:21 UTC (Wed) by nix (subscriber, #2304) [Link] (2 responses)

But KVM is just a speedup component for qemu, really. If you don't have KVM, qemu still works, only slower (much slower if you don't load kqemu).

If you don't have VT support, my understanding is that Xen similarly works, just slower.

So what's the substantive difference?

Let's step back a bit

Posted Jun 3, 2009 23:01 UTC (Wed) by nevets (subscriber, #11875) [Link] (1 responses)

I believe that a paravirtualized guest runs much faster than a qemu guest. But I have not taken any benchmarks.

I also think the issue is that Xen is still quite ahead of KVM in features, but this too is slowing down.

Let's step back a bit

Posted Jun 4, 2009 2:34 UTC (Thu) by drag (guest, #31333) [Link]

> I believe that a paravirtualized guest runs much faster than a qemu guest. But I have not taken any benchmarks.

YES, PV is massively faster than just plain Qemu. Massively faster in all respects. The overhead of Xen PV vs naked hardware is going to be just a few percent.

Of course this requires modification to the guest.

Let's step back a bit

Posted Jun 3, 2009 20:28 UTC (Wed) by skitching (guest, #36856) [Link] (3 responses)

My understanding is that XEN is a three-tier system:
1 x hypervisor -- manages CPU, ram, interrupts
1 x dom0 -- contains device drivers and admin apps
n x domU -- guest VMs

The dom0 impl therefore provides services (esp devices) to guests while using services provided by the hypervisor.

KVM is a two-tier system, collapsing the hypervisor and dom0 into one layer.

The advantages of the Xen approach appear to be:
1) critical hypervisor code is small, so more easily ported and audited
2) possible to use same hypervisor with different dom0 impls, eg windows/bsd/solaris as dom0.
3) this is the traditional approach, hence more research/experience
4) reduces changes needed to an OS to make it a dom0, as some of the virtualization-specific logic is in a separate layer. In particular, one article implied that KVM would have to rework the standard linux scheduler implementation to get good scheduling for guests. I guess this means that xen does at least some scheduling decisions in the hypervisor.

It seems that (4) isn't working out so well in practice though.

The recent suggestion to put a hypervisor implementation into the kernel git tree does seem interesting. AIUI, the suggestion is not to collapse the layers KVM-style; building the kernel would also generate a separate hypervisor image. However because (linux-hypervisor, linux-kernel) are always released as a pair there is no ABI issue. The downside is that no other dom0 implementation would run on that hypervisor, so
(a) XenSource would presumably have to maintain two similar-but-not-identical hypervisor implementations,
(b) If someone upgrades their linux kernel, they also need to upgrade the hypervisor to match.
(c) If someone switches to a different dom0 (eg windows) they need to change the hypervisor to match.

Presumably (b) and (c) sink any idea of putting the hypervisor into ROM. And just the idea of modifying the hypervisor in sync with linux changes means that the "auditability" benefit of a hypervisor is mostly lost: it's expensive to "recertify" a hypervisor impl that changes every 6 months.

I'm no expert on this, so all corrections welcome!

Cheers, Simon

Let's step back a bit

Posted Jun 3, 2009 20:58 UTC (Wed) by nevets (subscriber, #11875) [Link] (2 responses)

> 1) critical hypervisor code is small, so more easily ported and audited

I've heard this argument before, but it has one flaw. It still depends on Dom0. You still need to port Dom0 and audit it too. If you crack Dom0, you've cracked the box.

> 2) possible to use same hypervisor with different dom0 impls, eg windows/bsd/solaris as dom0.

I'm not sure this is that big of a deal. As Ted mentioned in an email, Linux supports more devices than anything else, which makes it the ideal Dom0.

> 3) this is the traditional approach, hence more research/experience

Or perhaps it's the approach that does not think outside the box.

> 4) reduces changes needed to an OS to make it a dom0, as some of the virtualization-specific logic is in a separate layer. In particular, one article implied that KVM would have to rework the standard linux scheduler implementation to get good scheduling for guests. I guess this means that xen does at least some scheduling decisions in the hypervisor.

You are right that this is not quite true in practice, as we see.

> (a) XenSource would presumably have to maintain two similar-but-not-identical hypervisor implementations,

Keeps them employed.

Actually, if they make the Linux version their main code base, they could add a kluge layer to interact with other Dom0s.

> (b) If someone upgrades their linux kernel, they also need to upgrade the hypervisor to match.

As I stated before, that would happen automatically. You would just install the vmlinuz, and it would load both the new Xen hypervisor and the new Linux kernel.

> (c) If someone switches to a different dom0 (eg windows) they need to change the hypervisor to match.

Or add the kluge layer.

> Presumably (b) and (c) sink any idea of putting the hypervisor into ROM.

This point hits the main argument that we have against adding the Dom0 interface. Once it is there, it is set in stone, and we cannot change it. Otherwise we will get all those people that loaded their box's ROM with a hypervisor crying to us that they can't run the latest kernel.

The current Dom0 ABI is intrusive and limits the development of Linux. This is the reason it is being rejected. Not to mention that it also comes with a certain degree of performance overhead when not in use.

If the Dom0 ABI were to come in without the Xen hypervisor, then it must be clean, and not scattered throughout the kernel.

> And just the idea of modifying the hypervisor in sync with linux changes means that the "auditability" benefit of a hypervisor is mostly lost: it's expensive to "recertify" a hypervisor impl that changes every 6 months.

Again, if you do not audit the Dom0, you are wasting your time. If you need to audit the hypervisor, pick an image and audit it. You had better audit the Dom0 in use too. Then after that is done, don't upgrade.

Let's step back a bit

Posted Jun 4, 2009 6:28 UTC (Thu) by drag (guest, #31333) [Link]

> 1) critical hypervisor code is small, so more easily ported and audited

Ya.. it starts up like that. But it rarely stays like that.

In the case of Xen I think you're looking at about 300-400K lines of code in a full-featured version.

In the case of KVM the initial patch was about 12K lines of code and was accepted almost immediately into the kernel (quite an achievement).

Let's step back a bit

Posted Jun 4, 2009 7:45 UTC (Thu) by sf_alpha (guest, #40328) [Link]

It isn't true that KVM does not have a Dom0; it just resides in a different layer. The point is that in both KVM and Xen, the Dom0 (or its equivalent) is privileged and handles the device drivers and the hardware, whether it sits on top of or below the virtualization layer.

For Xen, the Xen hypervisor is the controller that grants access to the real hardware/cpus, while Dom0 is effectively both the hardware driver domain and the admin system.

For KVM, it is the application plus the kernel part that grants access to the real hardware/cpus (using processor assistance), and the system running KVM is both the hardware driver and the admin system.

In both cases, if the parent or dom0 is cracked, the whole system is compromised.

I agree that if the Xen code has too much impact on the core x86 arch code it should not be merged until that is fixed, but again, KVM is not a Xen replacement at all. Not even with xenner, which is really KVM running Xen guests.

I can say that Xen lacks support from its users. If Xen shipped in the kernel it would be heavily tested. But right now it's not: even though Xen currently tracks the latest kernel in git, most people seem to use the stable and old Xen kernel, and not many are working on the new Xen Dom0 kernel.

And again, KVM is not something that would replace Xen; neither replaces the other. I cannot see any benefit in replacing Xen with KVM for now (I run a couple of servers using Xen, some of which do not support VT-d or AMD-V).

The crux of this problem is that Xen causes too many changes to the core x86 code and does not seem to be clean enough.

Let's step back a bit

Posted Jun 4, 2009 1:05 UTC (Thu) by ras (subscriber, #33059) [Link] (4 responses)

I am just exploring the implications of this issue myself.

When I go out and try to buy CPU cycles, the most attractive way to do it right now is a Xen VM. There are other options - shared hosting, VMware and probably others. All have warts compared to a Xen VM. It is really nice to be able to configure and debug my VM on my laptop, then send it to the hosting provider. So that is point 1: unlike KVM or any other solution described here, Xen is out there, in the real world. Because KVM isn't, it is in a practical sense useless for one of the major applications of VMs - cloud computing.

Point 2 is that many of those Xen images out there are para-virtualised for speed, so I can't use KVM to develop them.

Point 3 is I want to run the latest Linux kernel as my Dom0 - principally because nothing else seems to work on modern hardware. Applying the Xen patches myself is an absolute PITA.

The end result is not having Xen is making Linux hard to use in an emerging platform - cloud computing. I don't doubt there are real issues - the fact that Xen uses the Dom0 to talk to the hardware sounds to me like it has the makings of a real ugly patch. However Xen isn't a webcam that can be ignored. Xen is an entire platform - like Windows or Linux. And it is an open source solution locked in battle with closed alternatives. I want it to win - after all I could just use VMware. If Xen doesn't win, possibly no open source solution will. KVM is not even a player in this space.

Let's step back a bit

Posted Jun 4, 2009 1:23 UTC (Thu) by ncm (guest, #165) [Link] (3 responses)

This is a very interesting and persuasive view. I wonder, though: if Xen can be hosted on KVM, then the ultimate choice doesn't affect you, does it? You can run your Xen image on an old Xen or a new Xen, and not notice the difference. Where am I confused? Is it that a KVM Xen can't run your paravirtualized image?

Let's step back a bit

Posted Jun 4, 2009 1:51 UTC (Thu) by ras (subscriber, #33059) [Link] (2 responses)

ncm: Is it that a KVM Xen can't run your paravirtualized image?

Firstly, beware I am trying to do this as we speak. I've tried things. Some of them didn't work. But maybe it is just because I don't have a clue.

There are two issues here. One is that I don't seem to get a choice as to what DomU I am running. The VM hosting providers supply a couple and I choose one. If I want to work on it locally, I image it and have a fiddle. The essential point being, I can't choose one that suits KVM. On the other hand, if there is some magic way I can make any image run fine under KVM, and back again 100% reliably, then I am mostly satisfied. Right now I haven't found that way. As I said, this is possibly because I need a clue.

Secondly, I would like to use the same tools for managing the DomU the hosting provider does - it just makes it so much easier to understand what is going on. In other words, I want to run the xen tools in a Linux Dom0. This is very much a secondary consideration for me though. It is possibly more important from the Xen world domination point of view as you want to make it as easy as possible for the hosting providers, and right now they have to be very picky about what they use as their Dom0.

Currently I can use Linux 2.6.26 as my Dom0. Sadly a big chunk of my peripherals aren't supported by 2.6.26. In fact some aren't stable under 2.6.29 (wireless causes the machine to freeze if I use it heavily), so right now I am eagerly awaiting each new kernel release. But porting the Xen patches to each and every kernel release is simply too much work.

Let's step back a bit

Posted Jun 4, 2009 6:34 UTC (Thu) by jamesh (guest, #1159) [Link] (1 responses)

An earlier post linked to http://kraxel.fedorapeople.org/xenner/. Can you use that to run the guest images from your provider?

Let's step back a bit

Posted Jun 4, 2009 6:49 UTC (Thu) by ras (subscriber, #33059) [Link]

It may well be exactly what I want. Thanks.

Xen is not user-space

Posted Jun 4, 2009 12:11 UTC (Thu) by dunlapg (guest, #57764) [Link] (2 responses)

There seems to be a slight confusion in the article, where twice it refers to Xen as "user-space". In kernel terms, "user-space" has a technical definition: running in processor ring 3. In fact, Xen runs in ring 0, and when Linux runs as a domU or dom0 on Xen, it runs in ring 1.

Someone in the lkml discussion said that Xen wasn't an operating system, which was quickly refuted: the core job of an OS is to manage the sharing of resources, specifically memory and cpu time. Sharing memory and cpu time between multiple VMs is specifically what Xen does, delegating the sharing of almost all other resources (disk, network, &c) to dom0 or driver domains. This makes Xen very akin to a micro-kernel. (Xen proponents have acknowledged this similarity, and said that Xen is "a microkernel done right".)

So when the article says "it seems unlikely that something as large as Xen would be the first user-space component to break through that barrier", something is confused. Xen isn't a user-space component. If Xen were to be merged in (or an in-kernel hypervisor project started), it would run in ring 0 and include memory and processor scheduling algorithms, just as Linux currently does, but targeted towards sharing between VMs, as opposed to processes.

The reason this is being discussed in the first place is that some people (myself included) don't believe that the same code can work well both as a kernel and as a hypervisor. This includes all aspects of the hypervisor, but specifically whether the same scheduler can work well both as a scheduler of processes and a scheduler of VMs. KVM currently runs VMs as processes, with Linux as the de-facto hypervisor. So a VM running under KVM gets a processor scheduler rather than a VM scheduler. Maybe the Linux scheduler could be made to do both, but the Xen community thinks it's better to have a dedicated VM scheduler, as well as other dedicated VM-oriented algorithms.

Having this code in the kernel would address two issues:
* The fear that Linux will be stuck with supporting a very invasive ABI. If the hypervisor is in the kernel tree, it would be a lot easier to tie them together.
* The feeling that dom0 changes add a ton of ugly "hooks", but don't make the Linux code base any better. Since the Xen hypervisor functionality is outside of Linux, having cool functionality in Xen doesn't really count as improving Linux. But if a hypervisor component were in the kernel, rather than a separate project, then the kernel as a whole would seem to benefit. (This is a bit of a subtle one to understand; it depends on drawing a boundary and saying, "If it's inside this tree, it contributes to making Linux cool, and may be worth the cost of the hooks. But if it's outside the tree, it doesn't contribute to making Linux cool, so the hooks are a cost without a benefit.")

Whether it's practical to do a merge with Xen specifically is another question. :-)

Xen is not user-space

Posted Jun 4, 2009 13:19 UTC (Thu) by nevets (subscriber, #11875) [Link] (1 responses)

> The feeling that dom0 changes add a ton of ugly "hooks", but don't make the Linux code base any better. Since the Xen hypervisor functionality is outside of Linux, having cool functionality in Xen doesn't really count as improving Linux. But if a hypervisor component were in the kernel, rather than a separate project, then the kernel as a whole would seem to benefit. (This is a bit of a subtle one to understand; it depends on drawing a boundary and saying, "If it's inside this tree, it contributes to making Linux cool, and may be worth the cost of the hooks. But if it's outside the tree, it doesn't contribute to making Linux cool, so the hooks are a cost without a benefit.")

This is a very important point. I would look at it as: if it is in the kernel, it means Linux and Xen are packaged together. If it is out of the kernel, then Xen and Linux are two separate packages.

This is a key point. If they are as one package, and a change in Linux breaks the interface between kernel and hypervisor, the fix would be to update the hypervisor to handle the new change.

If they are two packages, and Linux breaks the interface between kernel and hypervisor, then the fix would be to redesign the Linux change to cope with keeping the same ABI to the hypervisor. This is a burden that the maintainers do not want to carry.

Having the two as one package would mean if you upgrade one, you also upgrade the other. A subtle point indeed, but an important one.

Xen is not user-space

Posted Jun 4, 2009 13:32 UTC (Thu) by dunlapg (guest, #57764) [Link]

-- Begin Quote --
If they are two packages, and Linux breaks the interface between kernel and hypervisor, then the fix would be to redesign the Linux change to cope with keeping the same ABI to the hypervisor. This is a burden that the maintainers do not want to carry.
-- End Quote --

That's an understandable concern. But Keir Fraser has unequivocally stated that he does not expect that. It's Xen's job to be backwards compatible with older kernels if it wants to be. It's not Linux's job to be backwards compatible with older hypervisors. If the ABI changes in Linux, Xen will upgrade to match; and if someone upgrades Linux over a dom0<->xen ABI change, then they will have to upgrade Xen over that same ABI change as well.

Real world KVM

Posted Jun 4, 2009 15:47 UTC (Thu) by cdmiller (guest, #2813) [Link] (2 responses)

Huh,

We have been using KVM for core servers for ~10 months now. We have ~25 Windows and Linux server VM's on our production cluster of 2 test and 4 production hardware servers (~$3000 each). We even run MSSQL on KVM. Shared storage is on NFS from a couple of DRBD'd storage servers.

When we tested the commercial Citrix Xen offering it could not create a Mandriva or Ubuntu server for us. We have found it much easier to roll VM's with KVM than with Xen. Using a wrapper script and some debconf with Ubuntu's VM builder we can spit out new, updated, Ubuntu server VM's in about 5 minutes custom configured for our environment.

Live migration was far superior with KVM than with VMWare when we tested last summer. KVM allows live migration over SSH (it used to be the default mechanism). We see a couple of pings (2 to 5 packets) with increased latency during the infamous migration ping test; compare that to VMWare. We didn't test against Xen due to the VM creation problem above.

The KVM GUI tools were lacking but were easily overcome with a minimal Perl driven CGI. We even get console in the web browser with a Java VNC applet.

In our study, we eliminated HyperV and Xen early on and it became a contest between VMWare and KVM. KVM won out despite some shortcomings in GUI tools and commercial support. The fact that a VMWare update had a critical bug last summer during our testing influenced our results a bit.

Real world KVM

Posted Jun 4, 2009 21:29 UTC (Thu) by drag (guest, #31333) [Link] (1 responses)

I know it's not done for you right now.

But just since you brought up the subject of GUI tools and that sort of thing...

Red Hat is moving to KVM for the Red Hat ES 5.4 release.

Red Hat/Fedora virt-manager is working very, very well with KVM on Fedora 11 beta.

http://libvirt.org/ is a library that is being developed that will provide a stable/unified API for people programming for Linux VM solutions. Not just KVM, but Xen, LXC (linux container system), OpenVZ, User mode linux, and Virtualbox.

Real world KVM

Posted Jun 10, 2009 20:04 UTC (Wed) by cdmiller (guest, #2813) [Link]

Yes, we looked closely at libvirt. At the time it was in transition as Qumranet was about to be bought by RedHat. When we first looked there was no support for live migration and it was out of sync with recent qemu's (changing CD's). We will probably be looking at libvirt again at some point.

Xen again

Posted Jun 4, 2009 23:05 UTC (Thu) by caitlinbestler (guest, #32532) [Link]

Actually Dom0 is not "the hypervisor portion of the Xen System"; it is the hypervisor's designated default owner of devices.

To be even more precise, Device Domains (or DomDs) are the owners of specific devices. Dom0 is the traditional and default DomD.

The separation of Device control from *the* hypervisor is one of the key strengths of the Xen architecture because it keeps the true Hypervisor (on which *all* other kernels rely) very stable. I suspect at least part of the opposition to "Xen Dom0" is not actually about "Dom0" but to the fact that the Xen Hypervisor is *not* Linux.

But focusing on what DomDs/Dom0 really requires the following pieces:

  1. The actual/native device driver(s).
  2. The "backend" drivers that support the Xen "frontend" drivers.
  3. Glue to connect the backend drivers to the actual drivers.
  4. The ability to configure that glue, preferably from user mode.

There is no particular reason why a Device backend has to be implemented in Linux. But Linux is particularly inviting for reasons #1,#3 and #4. Especially reason #1.

Understood, the "ABI" question is really a distraction. It is actually the Backend drivers that have stable ABIs, not Linux. The fact that Linux insists that kernel modules count as "Linux" is a Linux issue. People who develop those modules would really rather be free to follow their own coding standards, etc. Being GPL and following coding standards is reasonable, but having your module's interfaces being viewed as though they were Linux interfaces is a totally different issue.

So the real question is what kernel capabilities do backend modules truly need, and are these legitimate generic capabilities rather than Xen inflicting itself on the kernel?

Ultimately these relate to the ability to communicate with other Virtual Machines through shared memory and delegation of PCI functions and MSI-X interrupts. If you accept the concept that a non-Linux Hypervisor may partition a machine into physical machines, then is there any legitimate reason why mainline Linux should not be usable for Device controlling Domains?

Rejecting this application of Linux because Linux would rather also be the Hypervisor strikes me as taking your baseball home because the team wants somebody else to pitch. The true spirit of Open development would allow users to decide on Hypervisor versus Device Control separately rather than forcing them to lock the two decisions together.


Copyright © 2009, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds