Using the KVM API
Posted Sep 29, 2015 21:18 UTC (Tue) by pbonzini (subscriber, #60935)
In reply to: Using the KVM API by josh
Parent article: Using the KVM API
But KVM is not really needed in this case. On one hand, you don't need the near bare-metal performance that KVM provides, because dosemu/dosbox only need to emulate a 100 MHz machine or so, and a simple interpreter or a JIT compiler like QEMU's can handle it (QEMU is known to be slowish for a JIT translator, but there's some work being done on that side as well). On the other hand, KVM's performance comes with some fine print, which you cannot really afford in the case of dosemu/dosbox. A KVM_EXIT_IO exit is very slow, on the order of a few thousand cycles on the newest processors. By comparison, QEMU can dispatch a single memory-mapped I/O operation in about 100 clock cycles, so 60-150 times faster than KVM. Hence demos like Unreal (https://www.youtube.com/watch?v=VjYQeMExIwk#t=7m) don't run too well on QEMU with KVM, because they trigger an insane number of such exits.
To play old games (man, I should send those Jazz Jackrabbit patches upstream...) I typically use QEMU without KVM.
Posted Sep 29, 2015 21:26 UTC (Tue)
by josh (subscriber, #17465)
[Link] (5 responses)
True. Out of curiosity, does any means exist to turn that *off*? I have some interest in compiling out most or all of the in-kernel instruction emulation, to reduce attack surface area.
> A KVM_EXIT_IO exit is very slow, on the order of a few thousand cycles on the newest processors. By comparison, QEMU can dispatch a single memory-mapped I/O operation in about 100 clock cycles, so 60-150 times faster than KVM.
What about with coalesced or fd-ed I/O?
Posted Sep 30, 2015 13:45 UTC (Wed)
by pbonzini (subscriber, #60935)
[Link] (4 responses)
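> True. Out of curiosity, does any means exist to turn that *off*? I have some interest in compiling out most or all of the in-kernel instruction emulation, to reduce attack surface area.
With unrestricted_guest=1 you only exit to the emulator for a few privileged instructions (where for simplicity KVM emulates them instead of having a mini-interpreter in vmx.c/svm.c) and for I/O. But unfortunately, thanks to the x86 ISA's read-modify-write instructions, that's still a _lot_ of different instructions that you can emulate.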
So there's not much that you can compile out. You could simply modify KVM to refuse loading if unrestricted_guest=0, but you can still trigger any bit of emulator code by setting up a race between two VCPUs: one triggers I/O continuously, while the other races against the emulator by changing the opcodes of the I/O instruction into something else. This actually used to be a vulnerability, but it's been patched for several years and the emulator is now considered a security-sensitive component.
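> > A KVM_EXIT_IO exit is very slow, on the order of a few thousand cycles on the newest processors. By comparison, QEMU can dispatch a single memory-mapped I/O operation in about 100 clock cycles, so 60-150 times faster than KVM.
>
> What about with coalesced or fd-ed I/O?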
Still around 1500-2000 cycles. For ioeventfd you have to add the latency of waking up the I/O thread if it's sleeping (but if the fd is really busy, e.g. running fio in the guest, it won't have time to go to sleep).
Posted Sep 30, 2015 17:53 UTC (Wed)
by josh (subscriber, #17465)
[Link] (3 responses)
How much *minimum* latency comes from the vmexit itself, and how much gets added by the path from the in-kernel vmexit handling to whatever mechanism it uses to contact the I/O thread? If much of it comes from the latter, perhaps we could find a way to accelerate that via another (latency-optimized) interface.
Posted Oct 1, 2015 7:17 UTC (Thu)
by pbonzini (subscriber, #60935)
[Link] (2 responses)
Posted Oct 1, 2015 15:50 UTC (Thu)
by josh (subscriber, #17465)
[Link] (1 responses)
Posted Oct 1, 2015 16:02 UTC (Thu)
by pbonzini (subscriber, #60935)
[Link]
Posted Sep 30, 2015 3:42 UTC (Wed)
by voltagex (guest, #86296)
[Link] (1 responses)
Yes, yes you should. By the way, can you still buy that game? That and OMF 2097 are my favourites of all time.
Posted Sep 30, 2015 13:46 UTC (Wed)
by pbonzini (subscriber, #60935)
[Link]
Posted Sep 30, 2015 6:17 UTC (Wed)
by eru (subscriber, #2753)
[Link] (7 responses)
> on one hand you don't need near bare-metal performance that KVM provides, because dosemu/dosbox only need to emulate a 100 MHz machine or so, and a simple interpreter or a JIT compiler like QEMU's can handle it
Probably true for old games, but the situation I am thinking of involves using ancient cross-compilers to compile large masses of legacy code for a weird environment that still has to be maintained. One would think (and I did think) this is an I/O-bound operation, but it turned out the speed difference between dosemu with VM86 and dosemu on x86_64 with emulation is very noticeable (an order of magnitude for large inputs). On the other hand, dosemu also has advantages, because it can run "headless", can easily access native files, and starts up quickly. These are important features, because the ancient compilers are wrapped in layers that hide their MS-DOS internals, so from the Linux user's point of view they act like normal command-line tools.
Posted Sep 30, 2015 13:46 UTC (Wed)
by pbonzini (subscriber, #60935)
[Link] (6 responses)
Posted Sep 30, 2015 15:28 UTC (Wed)
by eru (subscriber, #2753)
[Link] (5 responses)
Posted Sep 30, 2015 17:15 UTC (Wed)
by felix.s (guest, #104710)
[Link] (4 responses)
Also, funnily enough, some time ago I was working on a DOS/BIOS ABI layer based on KVM (I tried to make the backends interchangeable, but I'm not sure how well I succeeded), and I think it would be ideal for the use case you describe. I even managed to include a simplistic packet driver, so I can use the FDNPKG package manager to download programs to test. However, the code is currently such a mess that I'm too embarrassed to publish it. Maybe some day...
Posted Sep 30, 2015 17:34 UTC (Wed)
by josh (subscriber, #17465)
[Link] (3 responses)
Posted Sep 30, 2015 17:49 UTC (Wed)
by kvaneesh (subscriber, #45646)
[Link] (2 responses)
Posted Oct 1, 2015 6:29 UTC (Thu)
by kleptog (subscriber, #1183)
[Link] (1 responses)
Now, I get that it's probably a configuration issue since it clearly wasn't caching anything, but I found it really hard to find documentation about qemu that explained this behaviour. On top of that I'm managing them via libvirt, so even if I find a command-line option to deal with something, if libvirt doesn't support it I'm still SOL.
Overall, it hasn't been a great experience; next time I'll probably do what other people do and use VirtualBox or VMware.
But back to the article, it's a pretty nice interface actually. Hopefully I'll find some reason to use it sometime :)
Posted Oct 8, 2015 16:48 UTC (Thu)
by LightDot (guest, #73140)
[Link]
<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
The VNC option doesn't need to be presented as a command line, I just left it as an example.
Posted Sep 30, 2015 21:16 UTC (Wed)
by luto (guest, #39314)
[Link] (1 responses)
Please tell me that this is at least *guest* vm86 mode and not host vm86 mode.
Also, why does it care how the guest->host physical mappings are set up?
Posted Oct 1, 2015 7:21 UTC (Thu)
by pbonzini (subscriber, #60935)
[Link]
> Also, why does it care how the guest->host physical mappings are set up?
Can you explain your question better? Who is the subject?
Posted Sep 30, 2015 23:04 UTC (Wed)
by josh (subscriber, #17465)
[Link] (2 responses)
Posted Oct 1, 2015 8:21 UTC (Thu)
by pbonzini (subscriber, #60935)
[Link] (1 responses)
If you don't call KVM_SET_TSS_ADDR you actually get a complaint in dmesg, and the TR stays at 0. I am not really sure what kind of bad things can happen with unrestricted_guest=0; probably you just get a VM Entry failure. The TSS takes 3 pages of memory. An interesting point is that you actually don't need to set the TR selector to a valid value (as you would do when running in "normal" vm86 mode); you can simply set the base and limit registers that are hidden in the processor and generally inaccessible except through VMREAD/VMWRITE or system management mode. So KVM needs to set up a TSS but not a GDT.
For paging, instead, 1 page is enough because we have only 4GB of memory to address. KVM disables CR4.PAE (physical address extension, aka 8-byte entries in each page directory or page table) and enables CR4.PSE (page size extensions, aka 4MB huge page support with 4-byte page directory entries). One page then fits 1024 4-byte page directory entries, each mapping a 4MB huge page, totaling exactly 4GB. Here, if you don't set it, the page table is at address 0xFFFBC000. QEMU changes it to 0xFEFFC000 so that the BIOS can be up to 16MB in size (the default only allows 256k between 0xFFFC0000 and 0xFFFFFFFF).
The different handling, where only the page table has a default, is unfortunate, but so goes life...
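The arithmetic in this comment can be checked with a small sketch. It is illustrative only and does not talk to KVM. Two things in it are assumptions not spelled out in the thread: the exact flag bits set in each page directory entry, and the placement of the 3 TSS pages immediately after the page directory page (which is consistent with the 256k and 16MB figures quoted). The function names are hypothetical.

```python
PAGE_SIZE = 4096
PDE_SIZE = 4                   # bytes per entry with CR4.PAE = 0
HUGE_PAGE = 4 * 1024 * 1024    # 4MB pages with CR4.PSE = 1

# Standard x86 page-directory-entry flag bits.
PDE_PRESENT = 1 << 0
PDE_WRITABLE = 1 << 1
PDE_USER = 1 << 2
PDE_PAGE_SIZE = 1 << 7         # PS bit: the entry maps a 4MB page directly

def identity_page_directory():
    """Return the entries of a one-page directory; entry i maps i*4MB to i*4MB."""
    # Assumed flag set; the thread does not say which bits KVM sets.
    flags = PDE_PRESENT | PDE_WRITABLE | PDE_USER | PDE_PAGE_SIZE
    return [(i * HUGE_PAGE) | flags for i in range(PAGE_SIZE // PDE_SIZE)]

def reserved_layout(identity_map_addr, tss_pages=3):
    """Return (tss_addr, end_of_reserved_area, bytes_left_below_4GB).

    Assumes the TSS pages directly follow the page directory page.
    """
    tss_addr = identity_map_addr + PAGE_SIZE
    end = tss_addr + tss_pages * PAGE_SIZE
    return tss_addr, end, (1 << 32) - end

pd = identity_page_directory()
default_layout = reserved_layout(0xFFFBC000)   # default page table address
qemu_layout = reserved_layout(0xFEFFC000)      # address QEMU uses instead
```

Under these assumptions the default address leaves the region from 0xFFFC0000 upward (256k) for the BIOS, while QEMU's choice ends the reserved area at 0xFF000000 and leaves 16MB.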
Posted Oct 1, 2015 15:54 UTC (Thu)
by josh (subscriber, #17465)
[Link]
Ah, I see.
> If you don't call KVM_SET_TSS_ADDR you actually get a complaint in dmesg, and the TR stays at 0.
While I saw the mention of that message in a few places, I don't actually get that message at any point. Presumably that only happens with unrestricted_guest=0?
Please consider documenting the use of these two ioctls and the data they point to, as well as what circumstances require them; the current KVM documentation doesn't mention any of that.
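As a rough starting point for such documentation, the sketch below derives the request numbers of the two ioctls from their definitions in linux/kvm.h (`_IO(KVMIO, 0x47)` and `_IOW(KVMIO, 0x48, __u64)`), assuming the x86 Linux ioctl encoding. KVM_SET_TSS_ADDR takes the guest physical address by value as the ioctl argument, while KVM_SET_IDENTITY_MAP_ADDR takes a pointer to a 64-bit address. The helper `set_identity_map_addr` is hypothetical, needs an already-open VM file descriptor, and is not exercised here.

```python
import fcntl
import struct

KVMIO = 0xAE
_IOC_NONE = 0
_IOC_WRITE = 1

def _ioc(direction, type_, nr, size):
    # Linux ioctl encoding as used on x86: dir | size | type | nr.
    return (direction << 30) | (size << 16) | (type_ << 8) | nr

# _IO(KVMIO, 0x47): no data structure, the address is the argument itself.
KVM_SET_TSS_ADDR = _ioc(_IOC_NONE, KVMIO, 0x47, 0)
# _IOW(KVMIO, 0x48, __u64): the argument points to an 8-byte address.
KVM_SET_IDENTITY_MAP_ADDR = _ioc(_IOC_WRITE, KVMIO, 0x48, 8)

def set_identity_map_addr(vm_fd, guest_phys_addr):
    """Hypothetical helper: pass the address as a native 64-bit value."""
    fcntl.ioctl(vm_fd, KVM_SET_IDENTITY_MAP_ADDR,
                struct.pack("Q", guest_phys_addr))
```

A caller following QEMU's choice from the comment above would pass 0xFEFFC000 to `set_identity_map_addr`; whether either call is required at all depends on the host, as discussed in the thread.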
I probably should look into qemu again some day. One problem is file system access. As noted, I want the MS-DOS compilers to transparently compile sources in the Linux file system and write the objects there, preferably without having to install any network support in the emulated MS-DOS or FreeDOS, so as to leave maximum "real" memory for the compilers. Both dosemu and dosbox handle this requirement.
...
<qemu:commandline>
<qemu:arg value='-vnc'/>
<qemu:arg value=':30,tls'/>
<qemu:arg value='-k'/>
<qemu:arg value='fr'/>
<qemu:arg value='-no-fd-bootchk'/>
</qemu:commandline>
</domain>