
Multiple kernels on a single system

By Jonathan Corbet
September 19, 2025
The Linux kernel generally wants to be in charge of the system as a whole; it runs on all of the available CPUs and controls access to them globally. Cong Wang has just come forward with a different approach: allowing each CPU to run its own kernel. The patch set is in an early form, but it gives a hint of what might be possible.

The patch set as a whole only touches 1,400 lines of code, adding a few basic features; there would clearly need to be a lot more work done to make this feature useful. The first part is a new KEXEC_MULTIKERNEL flag to the kexec_load() system call, requesting that a new kernel be booted on a specific CPU. That CPU must be in the offline state when the call is made, or the call will fail with an EBUSY error. It would appear that it is only possible to assign a single CPU to any given kernel in this mode; the current interface lacks a way to specify more than one CPU. There is a bunch of x86-64 assembly magic to set up the target CPU for the new kernel and to boot it there.
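
The user-space side of that interface is not spelled out in the posting; as a rough, hypothetical sketch of what a loader might do, the fragment below takes a CPU offline and then stages a kernel with the proposed flag. The flag's numeric value, the load address, and the way the target CPU is communicated to the kernel are assumptions for illustration, not part of any stable ABI.

    /*
     * Hypothetical sketch only: offline a CPU, then call kexec_load() with
     * the proposed KEXEC_MULTIKERNEL flag.  The flag value, destination
     * address, and CPU-selection convention are all assumptions.
     */
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/syscall.h>
    #include <unistd.h>
    #include <linux/kexec.h>

    #ifndef KEXEC_MULTIKERNEL
    #define KEXEC_MULTIKERNEL 0x4           /* hypothetical value */
    #endif

    static int offline_cpu(int cpu)
    {
        char path[64];
        snprintf(path, sizeof(path),
                 "/sys/devices/system/cpu/cpu%d/online", cpu);
        int fd = open(path, O_WRONLY);
        if (fd < 0)
            return -1;
        int ret = (write(fd, "0", 1) == 1) ? 0 : -1;
        close(fd);
        return ret;
    }

    int main(void)
    {
        /* The target CPU must be offline, or kexec_load() fails with EBUSY. */
        if (offline_cpu(3))
            perror("offline cpu3");

        struct kexec_segment seg = {
            .buf   = NULL,                     /* kernel image; loading elided */
            .bufsz = 0,
            .mem   = (void *)0x10000000UL,     /* assumed physical destination */
            .memsz = 0,
        };

        long ret = syscall(SYS_kexec_load, 0x10000000UL, 1, &seg,
                           KEXEC_ARCH_DEFAULT | KEXEC_MULTIKERNEL);
        if (ret)
            perror("kexec_load");
        return ret ? 1 : 0;
    }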

The other significant piece is a new inter-kernel communication mechanism, based on inter-processor interrupts, that allows the kernels running on different CPUs to talk to each other. Shared memory areas are set aside for the efficient movement of data between the kernels. While the infrastructure is present in the patch set, there are no users of it in this series. A real-world system running in this mode would clearly need to use this communication infrastructure to implement a lot of coordination of resources to keep the kernels from stepping on each other, but that work has not been posted yet.
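
The message format is not described in the posting; purely as an illustration of the general technique, a single-producer/single-consumer ring placed in one of those shared regions, with an interrupt used to kick the peer, could look something like the following. The structure layout and the send_ipi() hook are invented for the example and do not reflect the posted code.

    /* Generic shared-memory ring sketch; not the interface in the patches. */
    #include <stdatomic.h>
    #include <stdint.h>
    #include <string.h>

    #define RING_SLOTS 64
    #define MSG_SIZE   240

    struct mk_msg {
        uint32_t len;
        uint8_t  data[MSG_SIZE];
    };

    struct mk_ring {
        _Atomic uint32_t head;           /* written by the producer */
        _Atomic uint32_t tail;           /* written by the consumer */
        struct mk_msg    slot[RING_SLOTS];
    };

    extern void send_ipi(int target_cpu); /* assumed notification hook */

    /* Returns 0 on success, -1 if the ring is full or the message too big. */
    int mk_send(struct mk_ring *ring, int target_cpu,
                const void *buf, uint32_t len)
    {
        uint32_t head = atomic_load_explicit(&ring->head, memory_order_relaxed);
        uint32_t tail = atomic_load_explicit(&ring->tail, memory_order_acquire);

        if (len > MSG_SIZE || head - tail == RING_SLOTS)
            return -1;

        struct mk_msg *m = &ring->slot[head % RING_SLOTS];
        memcpy(m->data, buf, len);
        m->len = len;

        /* Publish the slot before telling the peer about it. */
        atomic_store_explicit(&ring->head, head + 1, memory_order_release);
        send_ipi(target_cpu);
        return 0;
    }

    /* Returns the message length, or -1 if the ring is empty. */
    int mk_recv(struct mk_ring *ring, void *buf)
    {
        uint32_t tail = atomic_load_explicit(&ring->tail, memory_order_relaxed);
        uint32_t head = atomic_load_explicit(&ring->head, memory_order_acquire);

        if (head == tail)
            return -1;

        struct mk_msg *m = &ring->slot[tail % RING_SLOTS];
        int len = m->len;
        memcpy(buf, m->data, len);

        atomic_store_explicit(&ring->tail, tail + 1, memory_order_release);
        return len;
    }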

The final patches in the series add a new /proc/multikernel file that can be used to monitor the state of the various kernels running in the system.

Why would one want to do this? In the cover letter, Wang mentions a few advantages, including improved fault isolation and security, better efficiency than virtualization, and the ease of zero-downtime updates in conjunction with the kexec handover mechanism. He also mentions the ability to run special-purpose kernels (such as a realtime kernel) for specific workloads.

The work that has been posted is clearly just the beginning:

This patch series represents only the foundational framework for multikernel support. It establishes the basic infrastructure and communication mechanisms. We welcome the community to build upon this foundation and develop their own solutions based on this framework.

The new files in the series carry copyright notices for Multikernel Technologies Inc, which, seemingly, is also developing its own solutions based on this code. In other words, this looks like more than a hobby project; it will be interesting to see where it goes from here. Perhaps this relatively old idea (Larry McVoy was proposing "cache-coherent clusters" for Linux at least as far back as 2002) will finally come to fruition.

Index entries for this article
Kernel: Multi-kernel



Some precedent for this in VMware's ESX kernel (version 5.0 and earlier)

Posted Sep 19, 2025 21:12 UTC (Fri) by tullmann (subscriber, #20149) [Link] (2 responses)

In the initial versions of VMware's ESX servers (up through version 5.0), a Linux kernel would boot with (standard) command-line options restricting it to CPU 0, the first chunk of physical RAM, and a subset of the PCI devices. A subsequent loader would load the VMware hypervisor and it would manage the remaining memory, CPUs, and the few PCI devices it understood. The hard-partitioning of hardware worked surprisingly well and two very different kernels could exist in tandem on a single machine.
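
(For reference, a kernel can still be constrained in roughly that way from its command line today; the values here are only illustrative:

    maxcpus=1 mem=512M

which limits it to one CPU and the first 512MB of RAM.)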

In a similar space, the Barrelfish OS (https://barrelfish.org/) was a "multikernel" research operating system in the 2010s that was built around separate "kernels" running on each CPU. But they worked to make the communication between cores work smoothly enough to present a single logical system to applications running on it.

Some precedent for this in VMware's ESX kernel (version 5.0 and earlier)

Posted Sep 20, 2025 21:57 UTC (Sat) by chexo4 (subscriber, #169500) [Link]

IIRC this is how multi-core systems under the seL4 microkernel work. At least in some configurations. Something about it being simpler to implement probably.

Some precedent for this in VMware's ESX kernel (version 5.0 and earlier)

Posted Sep 23, 2025 19:31 UTC (Tue) by acarno (subscriber, #123476) [Link]

Virginia Tech's SSRG has an on-going project similar to Barrelfish called Popcorn Linux (a fun joke about multiple "kernels" ;)

In addition to running natively (e.g., multiple kernels on a single multi-core system), they also investigated running across different architectures and performing stack transformation to migrate memory between nodes.

> The project is exploring a replicated-kernel OS model for the Linux operating system. In this model, multiple Linux kernel instances running on multiple nodes collaborate each other to provide applications with a single-image operating system over the nodes. The kernels transparently provide a consistent memory view across the machine boundary, so threads in a process can be spread across the nodes without an explicit declaration of memory regions to share nor accessing through a custom memory APIs. The nodes are connected through a modern low-latency interconnect, and each of them might be based on different ISA and/or hardware configuration. In this way, Popcorn Linux utilizes the ISA-affinity in applications and scale out the system performance beyond a single system performance while retaining full POSIX compatibility.

Project Website: https://popcornlinux.org/
2020 LWN Article: https://lwn.net/Articles/819237/

Lots of use cases

Posted Sep 19, 2025 21:25 UTC (Fri) by geofft (subscriber, #59789) [Link] (10 responses)

I can imagine a whole lot of interesting use cases for this. Off the top of my head:
  • You can run a hard realtime kernel and a more interactive-suitable kernel at the same time, doing e.g. audio processing on the realtime side and a normal desktop on the other side.
  • I wonder if you can extend this to run two different kernels, most attractively NT for the crowd who wants to do Windows gaming but otherwise have a Linux desktop, where the current best answer is something like VFIO. Here you could potentially offline all but one CPU as well as the GPU, then boot up Windows on that hardware, keeping the "host" machine accessible over a virtual network or something.
  • I have an M2 MacBook which supports virtualization but not nested virtualization, meaning I can't use qemu-kvm inside my Linux VM. This might let me get a user experience that is effectively like having the ability to launch VMs from inside Linux. I don't actually need security isolation between those VMs.
  • Sometimes you run into applications that for whatever reason need a specific old kernel. If you're running these applications under containerization (e.g. Kubernetes), this holds back either your entire container fleet or some subset of them. With multikernel you can treat the desired kernel version as a property of the container, provided you're willing to dedicate an integer number of cores to the container (which is a good idea for its own sake), and avoid the overhead of actual virtualization. Containers have their own network identity etc. anyway so it's probably doable to map that model onto this.

Lots of use cases

Posted Sep 19, 2025 22:23 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link]

> I wonder if you can extend this to run two different kernels, most attractively NT for the crowd who wants to do Windows

There was a project to do the inverse, run Linux on Windows NT. It did not isolate individual CPUs, but otherwise it was a similar idea.

Its website is still up: http://www.colinux.org/

Lots of use cases

Posted Sep 20, 2025 13:43 UTC (Sat) by jannex (subscriber, #43525) [Link] (1 responses)

> I have an M2 MacBook which supports virtualization but not nested virtualization, meaning I can't use qemu-kvm inside my Linux VM

Apple's M2 and later support nested virtualization. The issue is probably that arm64 nested virtualization support in Linux itself is rather fresh. It was just merged in 6.16 (I haven't tried it yet).

This approach will be hard or impossible to support on Apple silicon systems. There is no easy way to support PSCI so taking CPU cores offline is currently not supported.

Lots of use cases

Posted Sep 20, 2025 18:25 UTC (Sat) by geofft (subscriber, #59789) [Link]

For whatever reason Apple only enables it with the M3 chip and later, as documented for the high-level Virtualization.framework's VZGenericPlatformConfiguration.isNestedVirtualizationSupported.

I also get false from the lower-level Hypervisor.framework's hv_vm_config_get_el2_supported() on my machine.

Lots of use cases

Posted Sep 21, 2025 4:37 UTC (Sun) by kazer (subscriber, #134462) [Link] (4 responses)

> two different kernels

That second "foreign" kernel would need to understand the "partition" it is allowed to use so it won't try to take over the rest of the machine, where another kernel may be running. Unless there is a way to make the hardware understand where that other kernel is allowed to run (basically selectively removing supervisor rights from the foreign kernel).

So I can only see that happening if the second kernel understands multikernel situations correctly as well. Otherwise it is back to hypervisor virtualization.

> old kernel

Sorry, but for the reasons mentioned above (supervisor access to hardware) that old kernel would need to be multikernel compliant as well. Otherwise you need a plain old hypervisor for virtualization.

Lots of use cases

Posted Sep 21, 2025 12:16 UTC (Sun) by kleptog (subscriber, #1183) [Link] (3 responses)

> That second "foreign" kernel would need to understand the "partition" it is allowed to use so it won't try to take over rest of the machine where another kernel may be running

It is already the case that a booting kernel asks the underlying system which part of physical memory it is allowed to use. It can then prepare the kernel mapping so it can only access the parts it is allowed to. It can't assume anything about all the other parts.

Now, this only prevents accidental interference. There's nothing that prevents the kernel from modifying its mapping (dynamically adding RAM/devices is a thing) but it would give a very high degree of isolation. Not as good as a hypervisor, but pretty good.
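
(For illustration: x86 kernels can already be handed a deliberately restricted view of RAM from the command line, e.g.

    mem=8G         # use only memory below 8GB
    memmap=8G$8G   # treat the 8GB region starting at 8GB as reserved
                   # ('$' may need escaping, depending on the boot loader)

with the exact values here being made up; a multikernel setup would presumably pass something equivalent to the secondary kernel.)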

Lots of use cases

Posted Sep 21, 2025 17:46 UTC (Sun) by glettieri (subscriber, #15705) [Link] (2 responses)

> It is already the case that a booting kernel asks the underlying system which part of physical memory it is allowed to use

However, in this case the underlying system is the hardware, that doesn't know anything about these partitions. A non-multikernel-aware kernel would discover all the memory and all the devices, and think that it owns everything.

Lots of use cases

Posted Sep 22, 2025 4:50 UTC (Mon) by skissane (subscriber, #38675) [Link] (1 responses)

> However, in this case the underlying system is the hardware, that doesn't know anything about these partitions. A non-multikernel-aware kernel would discover all the memory and all the devices, and think that it owns everything.

Maybe someone just needs to add a “telling lies facility” to the hardware/firmware which the multikernel could use to get the hardware/firmware to lie to the non-multikernel-aware kernel? This could be much more lightweight than standard virtualisation since it wouldn’t be involved at runtime only in config discovery

Lots of use cases

Posted Sep 22, 2025 22:13 UTC (Mon) by Wol (subscriber, #4433) [Link]

And then the non-multi-kernel-aware kernel trips over a bug, tries to do something which would normally crash, and there just happens to be something real there that it accidentally trashes ...

Cheers,
Wol

Lots of use cases - Rolling kernel upgrade

Posted Sep 22, 2025 8:10 UTC (Mon) by rhbvkleef (subscriber, #154505) [Link] (1 responses)

I think another enticing use-case would be a kind of "rolling" kernel upgrade where we can start a newer kernel on a subset of cores, and migrate our userspace over to it gradually, before killing the old kernel.

Lots of use cases - Rolling kernel upgrade

Posted Sep 25, 2025 12:55 UTC (Thu) by Karellen (subscriber, #67644) [Link]

I'm not sure having multiple kernels accessing the same block device/filesystem is going to work very well. If it were going to work at all you'd probably need a separate virtual filesystem driver for the new kernel, which talks to a server on the old, and then once the old kernel is ready to unmount the filesystem do a switcheroo-handover type thing. The new kernel would have to swap out the virtual filesystem driver for the real ext4/btrfs/... and start accessing the "real" inodes directly?

Neat: but isn't this a type-1 hypervisor?

Posted Sep 19, 2025 21:30 UTC (Fri) by quotemstr (subscriber, #45331) [Link] (18 responses)

I've been wondering for a long time when we're going to start thinking of big heavily-NUMA oriented machines as clusters with fast interconnects instead of as single machines. This work is a step in that direction. That said, on a technical level, it seems to *amount* to a Xen-like hypervisor that just happens to pin guests to cores. A multi-kernel system, like any other hypervisor, has to worry about guest isolation, memory sharing, and device arbitration. Also, it's not clear from the article whether this system enforces guest memory isolation or whether the different guests (i.e. multikernel instances) just cooperatively agree not to stomp on each other.

It would be neat to be able to use industry-standard interfaces like libvirt to work with these guest kernels.

> better efficiency than virtualization,

Like I said, I'd consider this cool thing a *kind* of virtualization, one that trades flexibility for performance, not something *distinct* from virtualization.

Neat: but isn't this a type-1 hypervisor?

Posted Sep 20, 2025 0:45 UTC (Sat) by stephen.pollei (subscriber, #125364) [Link] (13 responses)

I can't seem to find a good source to cite, but I think it was Larry McVoy who thought something very similar: that about 16 CPUs/cores was a good limit, and that beyond that you should run multiple independent kernels and just have fast message passing between them. My memory could be faulty.

Neat: but isn't this a type-1 hypervisor?

Posted Sep 20, 2025 15:29 UTC (Sat) by ballombe (subscriber, #9523) [Link] (9 responses)

This seems to preclude workloads that spawn more than 16 threads.

Neat: but isn't this a type-1 hypervisor?

Posted Sep 20, 2025 17:10 UTC (Sat) by quotemstr (subscriber, #45331) [Link] (8 responses)

No it doesn't. You can have more threads than cores. If you mean that you can't get more than 16-way parallelism this way using threads: that's a feature, not a bug. Use a cross-machine distribution mechanism (e.g. dask) and handle work across an arbitrarily large number of cores on an arbitrarily large number of machines.

Neat: but isn't this a type-1 hypervisor?

Posted Sep 20, 2025 20:22 UTC (Sat) by roc (subscriber, #30627) [Link] (7 responses)

There are plenty of programs that work perfectly well with (e.g.) 200 threads on 200 cores, on hardware that exists today. Asking people to rewrite them to introduce a message-passing layer to get them to scale on your hypothetical cluster is a non-starter. Definitely a bug, not a feature.

If the Linux kernel had been unable to scale well beyond 16 cores then this cluster idea might have been a viable path forward. But Linux did and any potential competitor that doesn't is simply not viable for these workloads.

Neat: but isn't this a type-1 hypervisor?

Posted Sep 21, 2025 8:07 UTC (Sun) by quotemstr (subscriber, #45331) [Link] (6 responses)

> There are plenty of programs that work perfectly well with (e.g.) 200 threads on 200 cores, on hardware that exists today. Asking people to rewrite them to introduce a message-passing layer to get them to scale on your hypothetical cluster is a non-starter. Definitely a bug, not a feature.

Yes, and those programs can keep running. Suppose I'm developing a brand-new system and a cluster on which to run it. My workload is bigger than any single machine no matter how beefy, so I'm going to have to distribute it *anyway*, with all the concomitant complexity. If I can carve up my cluster such that each NUMA domain is a "machine", I can reuse my inter-box work distribution stuff for intra-box distribution too.

Not every workload is like this, but some are, and life can be simpler this way.

Neat: but isn't this a type-1 hypervisor?

Posted Sep 21, 2025 9:17 UTC (Sun) by ballombe (subscriber, #9523) [Link] (5 responses)

...or you can run an SSI OS that moves the complexity to the OS, where it belongs.
<https://en.wikipedia.org/wiki/Single_system_image>
... or HPE will sell you NUMAlink systems with coherent memory across 32 sockets.

But more seriously, when using message passing, you still want to share your working set across all cores in the same node to conserve memory.
Replacing a 128-core system with eight 16-core systems will require eight copies of the working set.

Neat: but isn't this a type-1 hypervisor?

Posted Sep 21, 2025 10:15 UTC (Sun) by willy (subscriber, #9762) [Link] (4 responses)

Well, there's two schools of thought on that. Some say that NUMA hops are so slow and potentially congested (and therefore have high variability in their latency) that it's worth replicating read-only parts of the working set across nodes. They even have numbers that prove their point. I haven't dug into it enough to know if I believe that these numbers are typical or if they've chosen a particularly skewed example.

Neat: but isn't this a type-1 hypervisor?

Posted Sep 21, 2025 12:42 UTC (Sun) by ballombe (subscriber, #9523) [Link] (3 responses)

This is correct. However, NUMA systems come with libraries to give you access to the physical layout so you can copy the working set only once per coherent NUMA blocks, which are much larger than 16 cores nowadays.
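
As a rough sketch of what that per-node replication looks like with libnuma (error handling elided; build with -lnuma):

    #define _GNU_SOURCE
    #include <numa.h>
    #include <sched.h>
    #include <stdlib.h>
    #include <string.h>

    /* Replicate a read-only working set once per configured NUMA node. */
    void **replicate_per_node(const void *data, size_t size)
    {
        if (numa_available() < 0)
            return NULL;

        int nodes = numa_num_configured_nodes();
        void **copy = calloc(nodes, sizeof(*copy));

        for (int node = 0; node < nodes; node++) {
            /* Allocation backed by that node's memory. */
            copy[node] = numa_alloc_onnode(size, node);
            memcpy(copy[node], data, size);
        }
        return copy;
    }

    /* A reader then picks the copy local to the node it is running on. */
    const void *local_copy(void **copy)
    {
        int node = numa_node_of_cpu(sched_getcpu());
        return copy[node < 0 ? 0 : node];
    }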

Neat: but isn't this a type-1 hypervisor?

Posted Sep 21, 2025 20:19 UTC (Sun) by willy (subscriber, #9762) [Link] (2 responses)

If those libraries already exist, why do people keep submitting patches to add this functionality to the kernel?

Neat: but isn't this a type-1 hypervisor?

Posted Sep 21, 2025 20:35 UTC (Sun) by quotemstr (subscriber, #45331) [Link] (1 responses)

Because the libraries have to have something to talk to? It's like asking why we add KVM syscalls when we have kvm command line. Separate jobs.

Neat: but isn't this a type-1 hypervisor?

Posted Sep 21, 2025 20:39 UTC (Sun) by willy (subscriber, #9762) [Link]

... no.

The patches are to do this automatically without library involvement. I think the latest round were called something awful like "Copy On NUMA".

Neat: but isn't this a type-1 hypervisor?

Posted Sep 20, 2025 18:59 UTC (Sat) by willy (subscriber, #9762) [Link] (2 responses)

You're right; Larry wanted a cluster of SMPs. Now, part of that was trying to avoid the locking complexity cliff; he didn't want Solaris to turn into IRIX with "too many" locks (I'm paraphrasing his point of view; IRIX fanboys need not be upset with me)

But Solaris didn't have RCU. I would argue that RCU has enabled Linux to scale further than Solaris without falling off "the locking cliff". We also have lockdep to prevent us from creating deadlocks (I believe Solaris eventually had an equivalent, but that was after Larry left Sun). Linux also distinguishes between spinlocks and mutexes, while I believe Solaris only has spinaphores. Whether that's terribly helpful or not for scaling, I'm not sure.

Neat: but isn't this a type-1 hypervisor?

Posted Sep 20, 2025 21:31 UTC (Sat) by stephen.pollei (subscriber, #125364) [Link] (1 responses)

I do seem to recall that it was for "locking complexity" reasons. If I recall correctly, around this time, there was the BKL and relatively fewer locks. With even just a BKL, it could scale to 2 to 4 cores/cpus with a lot of typical workloads. There was too much contention for the kernel to scale up to even the 12 to 16 core and beyond range effectively. Several people were of the opinion that Sun Solaris and others had their locks too fine-grained. For this reason, I think they tried to be very cautious in breaking up coarse-grained locks for finer-grained locks; they tried requiring that there were measurements on realistic loads that a lock was having contention or latency issues before they accepted patches to break it up. They tried to avoid too much locking complexity and over-head.

I don't know enough to have an opinion on how Linux kernel was able to scale as successfully as it has. There were certainly doubts in the past. If I recall correctly, RCU was being used in other kernels before it was introduced in Linux, but I don't recall which ones.

Neat: but isn't this a type-1 hypervisor?

Posted Sep 21, 2025 6:04 UTC (Sun) by willy (subscriber, #9762) [Link]

RCU was invented at Sequent (who were bought by IBM) and used in their Dynix/ptx kernel.

Neat: but isn't this a type-1 hypervisor?

Posted Sep 21, 2025 10:56 UTC (Sun) by kazer (subscriber, #134462) [Link] (1 responses)

> Like I said, I'd consider this cool thing a *kind* of virtualization

Virtualization is an abstraction of the hardware.

Better term for multi-kernel system would be *partition* (term has been used in mainframe-world already). In a multi-kernel design, kernel would still see the whole hardware as it is (not an abstraction), but it would be limited to a subset of the capabilities (a partition).

Linux already has various capabilities to limit certain tasks to run on certain CPUs so this would be taking that approach further, not adding abstractions.
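
(Those existing capabilities are things like cpusets and sched_setaffinity(); a minimal example of the latter, pinning the calling task to CPUs 2 and 3:)

    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
        cpu_set_t set;

        CPU_ZERO(&set);
        CPU_SET(2, &set);
        CPU_SET(3, &set);

        /* pid 0 means "the calling thread" */
        if (sched_setaffinity(0, sizeof(set), &set))
            perror("sched_setaffinity");
        return 0;
    }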

Neat: but isn't this a type-1 hypervisor?

Posted Sep 21, 2025 11:40 UTC (Sun) by quotemstr (subscriber, #45331) [Link]

> Virtualization is an abstraction of the hardware.

So VMMs doing PCIe pass-through aren't doing virtualization?

Anyway, the terminology difference is immaterial. In the purest view of virtualization, a guest shouldn't be aware that it's virtualized or that other guests exist. In the purest view of a partition, the whole system is built around multi-instance data structures. In reality, the virtualization is leaky, and deliberately so because the leaks are useful. Likewise, in a partition setup, especially one grafted into an existing system, at some point you arrange data structures such that code running on one partition "thinks" it owns a system --- there's your abstraction.

Besides: lots of people arrange VMs and assign resources such that the net effect ends up being a partition anyway. The multikernel work might be a way to achieve practically the same configuration with more performance and less isolation.

My point is that it would be nice to manage configurations like this using the existing suite of virtualization tools. Even if multi kernel is not virtualization under some purist definition of the word, it's close enough, practically speaking, that virtualization tools can be made to work well enough that the configuration stacks can be unified and people don't have to learn a new thing.

Neat: but isn't this a type-1 hypervisor?

Posted Sep 22, 2025 9:46 UTC (Mon) by farnz (subscriber, #17727) [Link]

It's certainly an interesting turn of the wheel; one of the selling points of NUMA over clusters back in the 1990s was that a cluster required you to work out what needed to be communicated between partitions of your problem, and pass messages, while a NUMA cluster let any CPU read any data anywhere in the system.

NUMA systems could thus be treated as just a special case of clusters (instead of running an instance per system, passing messages over the network, run an instance per NUMA node, bound to the node, passing messages over shared memory channels), but they benefited hugely for problems where you'd normally stick to your instance's data, but could need to get at data from anywhere to solve the problem, since that was now just "normal" reads instead of message passing.

I'd be interested to see what the final intent behind this work is - is it better RAS (since you can upgrade the kernel NUMA node by NUMA node), is it about sharing a big machine among smaller users (like containers or virtualization, but with different costs), or is it about giving people an incentive to write their programs in terms of "one instance per NUMA node" again?

Neat: but isn't this a type-1 hypervisor?

Posted Sep 22, 2025 10:05 UTC (Mon) by paulj (subscriber, #341) [Link]

> Like I said, I'd consider this cool thing a *kind* of virtualization, one that trades flexibility for performance, not something *distinct* from virtualization.

Similar stuff before has been called "Logical Partitions" (LPARs) by IBM, and "Logical Domains" (LDOMs) by Sun Microsystems (the sun4v stuff introduced in UltraSPARC T1 Niagara).

Message passing OS

Posted Sep 19, 2025 23:01 UTC (Fri) by linusw (subscriber, #40300) [Link]

To me it seems like essentially a message-passing (mailboxing) operating system with several kernels, which was the basic idea in several operating systems of the past and the ambition in things like CORBA or DCOM, just with CPUs separated by IPIs and passing shared memory on the same silicon instead of separate computers separated by a network and passing packets.

It seems to more or less require an in-kernel and inter-kernel IPC mechanism, so with kdbus and BUS1 having stalled, here is a new reason to have something like that, because these kernels and userspaces will really need to talk to each other in structured ways.

In, say, a desktop scenario, the messaging mechanism (if that is, say, systemd on D-Bus) is going to coordinate bringing up not just processes but entire kernels with processes in them.

Limited isolation

Posted Sep 20, 2025 1:44 UTC (Sat) by roc (subscriber, #30627) [Link] (2 responses)

The obvious downside here compared to running guest VMs is that there is no security boundary between the kernels --- Corbet's summary here says "improved fault isolation and security", which is true when you compare this approach to running workloads on the same kernel, but not when you compare it to running workloads in separate guest VMs. Anyone who cares strongly about workload isolation is already using guest VMs, so they're unlikely to move to multikernels.

However, as a way to do host kernel upgrades without interrupting guest VMs, it could definitely be useful.

Limited isolation

Posted Sep 21, 2025 9:23 UTC (Sun) by cyperpunks (subscriber, #39406) [Link] (1 responses)

Would it be possible to mix a vanilla kernel and a grsecurity kernel on the same system? Such a thing would indeed be very useful imho.

Limited isolation

Posted Sep 21, 2025 20:10 UTC (Sun) by Lionel_Debroux (subscriber, #30014) [Link]

Suitably configured grsec kernels typically use per-CPU PGDs for security reasons (IIRC, to avoid some race conditions), so I wonder how that would mix with a mainline kernel which doesn't.

memory and devices

Posted Sep 20, 2025 12:26 UTC (Sat) by meyert (subscriber, #32097) [Link] (1 responses)

How would memory and devices get managed in such a setup?

memory and devices

Posted Sep 20, 2025 17:34 UTC (Sat) by willy (subscriber, #9762) [Link]

You partition them. Assign various devices and memory to each kernel.

Shared access?

Posted Sep 20, 2025 14:42 UTC (Sat) by brchrisman (subscriber, #71769) [Link] (1 responses)

This is going to be entirely intra-node RDMA?
Bind them together in an old school MOSIX style single-system-image cluster?
Or SR-IOV devices/functions, one per kernel instance?

Shared access?

Posted Sep 20, 2025 18:38 UTC (Sat) by Lennie (subscriber, #49641) [Link]

The old https://en.wikipedia.org/wiki/OpenSSI code is still available too

interesting similarities to "hardware partitioning" of IBM mainframes

Posted Sep 21, 2025 7:19 UTC (Sun) by dale.hagglund (subscriber, #48536) [Link] (2 responses)

IBM mainframes (I won't say for sure about modern ones, but certainly the 370, 390, and compatible Amdahl systems I was aware of in the mid 80s at university) supported a feature where the hardware could be divided into "partitions", each of which could run a fully separate "real mode" OS instance. Again, I don't know this for sure, but I wouldn't be entirely surprised if there was some hardware help for controlling which CPUs, memory, devices, etc, could be discovered by the OS running in a particular partition. As I understand it, partitioning was commonly used for testing new releases of the OS and related software, to separate production from development and test, and no doubt other reasons.

Anyway, this new multi-kernel work could be used in many different and useful ways, as others have already noted, but it's always interesting to see how essentially every "new" idea has antecedents in the past.

interesting similarities to "hardware partitioning" of IBM mainframes

Posted Sep 21, 2025 23:13 UTC (Sun) by marcH (subscriber, #57642) [Link] (1 responses)

> Anyway, this new multi-kernel work could be used in many different and useful ways, as others have already noted, but it's always interesting to see how essentially every "new" idea has antecedents in the past.

There is a gazillion different potential reasons for that: the solution was in search of a problem, it was too expensive, it was not mature yet, it broke backwards compatibility too much, it was mature and successful for a while but displaced by less convenient but much cheaper commodity solutions, etc.

1% inspiration, 99% perspiration. The lone inventor and its eureka! moment is probably the least common case but it makes the best stories to read or watch and they massively skew our perception. Our tribal brain is hardwired for silver bullets and miracles and "allergic" to slow, global and real-world evolutions. Not just for science and technology, it's the same for economics, war, sociology, climate, etc.

interesting similarities to "hardware partitioning" of IBM mainframes

Posted Sep 22, 2025 22:10 UTC (Mon) by Wol (subscriber, #4433) [Link]

> There is a gazillion different potential reasons for that: the solution was in search of a problem, it was too expensive, it was not mature yet, it broke backwards compatibility too much, it was mature and successful for a while but displaced by less convenient but much cheaper commodity solutions, etc.

It wasn't interesting to Universities? (So students never knew about it.)

Cheers,
Wol

Shared memory

Posted Sep 22, 2025 1:10 UTC (Mon) by SLi (subscriber, #53131) [Link] (2 responses)

I can see how this could be interesting for some kind of fault tolerance and perhaps especially zero downtime kernel updates. A message passing model is neat and clean.

Having said that, I started to wonder. Would it still be possible, and would it make enough sense, to have some kind of a shared memory mechanism between userspace processes running on the different kernels? I don't think it can look like POSIX, but something stripped down.

What I'm basically thinking of: Multikernel gives us some benefits while arguably sacrificing other things as less important. Could we meaningfully claw back some of those lost things where it makes sense?

Shared memory

Posted Sep 22, 2025 3:28 UTC (Mon) by quotemstr (subscriber, #45331) [Link] (1 responses)

> I can see how this could be interesting for some kind of fault tolerance and perhaps especially zero downtime kernel updates. A message passing model is neat and clean.

You can do this with VMs today. Why would you use this new thing instead of the nice mature virtualization stack?

Shared memory

Posted Sep 22, 2025 8:09 UTC (Mon) by matthias (subscriber, #94967) [Link]

With this system you could upgrade the host kernel of a VM system by bringing up the new kernel and then migrating the VMs to the new kernel with zero copy operations. You only need to change the memory mappings of the old and new kernel. This could be way more efficient than migrating all VMs to a backup machine and migrating them back afterwards.

Firmware "kernels"

Posted Sep 22, 2025 17:55 UTC (Mon) by marcH (subscriber, #57642) [Link] (1 responses)

> The other significant piece is a new inter-kernel communication mechanism, based on inter-processor interrupts, that allows the kernels running on different CPUs to talk to each other. Shared memory areas are set aside for the efficient movement of data between the kernels.

Maybe this could become a standard for communicating with firmwares too, so drivers don't have to keep re-inventing this wheel?

It's quite different because it's heterogeneous (both at the HW and SW levels) but most systems are _already_ "multi-kernels" when you think about it!

Firmware "kernels"

Posted Sep 23, 2025 15:04 UTC (Tue) by linusw (subscriber, #40300) [Link]

Yes, they call it AMP ("asymmetric multi-processing"), and there are attempts such as OpenAMP to standardize around using rpmsg for communication across these.

Not sure about this ...

Posted Sep 26, 2025 12:39 UTC (Fri) by jsakkine (subscriber, #80603) [Link]

Looking at how the branching, inline comments, and even the cover letter have been laid out, my initial guess would be that this is generated. It just does not "feel" like how anyone would write their changes.

I'd like to point out that, personally, my viewpoint does not come from any opinionated standpoint. Using any form of code generation to get some stuff going is absolutely fine, as far as I'm concerned. The thing is, however, that it is exactly *placeholder code*, and to my eyes the patch set looks like placeholder/stub code as a feature.

Obviously I don't know the facts, this is just a guess; I absolutely don't enjoy making claims like this, and I hope that I have completely misunderstood the topic.


Copyright © 2025, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds