
Microkernels are better


Posted Feb 26, 2013 9:48 UTC (Tue) by khim (subscriber, #9252)
In reply to: Microkernels are better by Wol
Parent article: MINIX 3.2.1 released

> Firstly, you're going on about deep pipelines. Which causes processor stalls. Which was, I believe, a major reason for abandoning the Pentium 4 architecture - it was so prone to massive stalls it wasn't true.

Well, it had 31 stages and was able to execute up to three μops per cycle, which meant you always needed to have almost a hundred μops in flight. That was infeasible. Today the fastest CPUs have 16 stages and can execute up to four μops per cycle. That's a pipeline about half as deep, but it is still pretty hard to keep all these pipes filled, yes. What's your point? That you can reduce the depth of the pipeline and this will solve most problems? Yes, but speed will suffer: you'll have larger stages in the pipeline and they will naturally be slower.
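As a back-of-the-envelope check of these numbers, here is a minimal sketch. It assumes the number of μops needed in flight is simply pipeline depth times issue width, which ignores stalls, replays and everything else a real core does; the function name is made up for illustration.

```python
def uops_in_flight(stages: int, width: int) -> int:
    """Rough upper bound on the μops needed to keep every pipeline slot busy.

    Assumption: every stage can hold `width` μops at once.
    """
    return stages * width


# Figures quoted in the comment above: 31 stages x 3 μops, 16 stages x 4 μops.
pentium4 = uops_in_flight(stages=31, width=3)
modern = uops_in_flight(stages=16, width=4)
print(pentium4, modern)
```

Under this crude model the deeper pipeline needs "almost a hundred" μops in flight, while the shallower but wider one still needs several dozen.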

> And secondly, while I can't remember / don't know an awful lot about 50-series architecture, I don't understand why ring-switching should be slow. It's something to do with the memory segmentation, but the point was the segmentation gave you fast AND SAFE switching.

No matter how exactly the switching is done, it changes context. Either you need more context to keep all rings "in the loop" (which means larger pieces of CPU core, which means lower frequency, which means a slower CPU overall) or you need to load and unload said context (which means the ring switch is slow).

> The Intel architecture won. Intel architecture cannot do a fast ring-switch.

Yes, but why do you think it's a coincidence? It's not. The fact that Intel won the war may be an accident, but the fact that the architecture which won can't do a fast ring switch is not a coincidence. The very same tricks which bring you more raw speed for the same price (and that is how the Intel architecture won) make it harder to have a fast ring switch.

> Doesn't mean that other architectures can't, doesn't mean that Intel architecture is the best. It just happened to be the one that gained the market share needed for network effects to knock out the competition.

Yes and no. The Intel architecture won because it was faster. And it was faster because it used tricks to make the pieces of the CPU core smaller (that's the only way to keep the CPU frequency high enough), and to have smaller CPU core pieces you need less stuff in them.

> If Pr1me hadn't lost out in the market, and had continued development of their cpus, I'm sure they could have taken advantage of all the same things as Intel, and we would expect fast ring-switching as a matter of course.

Nope. To make a fast CPU you need to make its synchronously-executing pieces small. And that means you need to push "useless fat" out of them. You make a fast path which only executes the most important pieces and a slow path which does everything else. Either you keep the machinery needed for optional rare things like a ring switch in the fast path or you keep it on the slow path. In the first case you have a slow CPU (basically the CPU has a 2-3-4x lower frequency than streamlined AMD, IBM, or Intel CPUs); in the second case you have a slow ring switch.

P.S. PowerPC 601 had a 32 KiB cache back in 1992. Intel's latest and greatest CPUs still have a 32 KiB L1 cache. Think about it and about the implications for fancy techniques (like GC support or fast ring-switching or… whatever you can stuff into the CPU core to simplify life for OS and application writers). Twenty years ago "fancy techniques" meant "bigger price", and thus people used them where price was not the most important aspect. But fifteen or ten years ago (and most definitely today) the trade-offs changed and "fancy techniques" started to mean "slower CPU". And people have chosen "faster CPU" over fancy techniques. The fact that all these interesting architectures died off at that time and were replaced by dull AMD, IBM, and Intel (and for some time SGI and Sun) creations is not a coincidence.



Microkernels are better

Posted Feb 27, 2013 14:11 UTC (Wed) by gmatht (subscriber, #58961)

> No matter how exactly switching is done it changes context. Either you need more context to keep all rings "in the loop" (which means larger pieces of CPU core which means slower frequency which means slower CPU overall) or you need to load and unload said context (which means ring switch is slow).
How much context do we need per ring? According to Wikipedia, ring switches can be relatively fast, presumably because they don't need to reload the page table.

Microkernels are better

Posted Feb 27, 2013 14:36 UTC (Wed) by khim (subscriber, #9252)

> How much context do we need per ring?

Enough to distinguish an access from ring-0 from an access from ring-3, heh. Either you add tags to all the commands and all the data in the pipelines or you flush the pipeline after a ring switch.

Basically the question is: if "mov [some_address], register" should succeed in ring-0 and fail in ring-3 then how do you detect this? Either you keep this metainformation near the information itself (that is: when you assign registers you now have 2-3-4x more physical registers and thus more complex logic to assign them) or you need to flush the pipeline after a ring switch. The first approach means larger core pieces (and thus a lower CPU frequency); the second approach means a slow ring switch.
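To put rough numbers on this trade-off, here is a toy model. Every figure in it is a hypothetical assumption chosen for illustration, not a measurement: a "tagged" core that never flushes but runs at 1.5 GHz, a "flushing" core that runs at 3.0 GHz but pays 20 cycles per ring switch, and a flat IPC of 2 for both.

```python
def run_time_ns(instructions: int, ring_switches: int, ipc: float,
                freq_ghz: float, flush_cycles: int) -> float:
    """Time to run a workload: execution cycles plus flush cycles, over the clock rate."""
    cycles = instructions / ipc + ring_switches * flush_cycles
    return cycles / freq_ghz  # cycles / (cycles per ns) = ns


def break_even_switches(instructions: int, ipc: float, slow_ghz: float,
                        fast_ghz: float, flush_cycles: int) -> float:
    """Ring switches at which the flushing core stops being faster than the tagged one."""
    return (instructions / ipc) * (fast_ghz / slow_ghz - 1) / flush_cycles


# Hypothetical workload: one million instructions with 1000 ring switches.
tagged = run_time_ns(1_000_000, 1000, ipc=2, freq_ghz=1.5, flush_cycles=0)
flushing = run_time_ns(1_000_000, 1000, ipc=2, freq_ghz=3.0, flush_cycles=20)
print(tagged, flushing)
print(break_even_switches(1_000_000, ipc=2, slow_ghz=1.5, fast_ghz=3.0, flush_cycles=20))
```

With these assumed parameters the flushing core only loses once there is a ring switch roughly every 40 instructions, which is the shape of the argument above: a slow ring switch is cheap unless ring switches are extremely frequent.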

> According to Wikipedia, ring switches can be relatively fast, presumably because they don't need to reload the page table.

The key word here is "relatively". If you flush the pipeline then there is a 15-20 tick stall, and in that time the CPU can execute about 30-40 simple commands.
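The arithmetic behind that estimate can be sketched as follows. The sustained rate of 2 simple commands per tick is an assumption inferred from the figures above (15-20 ticks against 30-40 commands), not a quoted specification.

```python
def instructions_lost(stall_cycles: int, ipc: float) -> float:
    """Commands the CPU could have executed during a pipeline-flush stall."""
    return stall_cycles * ipc


ASSUMED_IPC = 2  # assumption: about two simple commands retired per tick

for stall in (15, 20):
    print(stall, instructions_lost(stall, ASSUMED_IPC))
```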

Microkernels are better

Posted Feb 27, 2013 18:09 UTC (Wed) by ARealLWN (guest, #88901)

Although a number of points you make are accurate, let's at least do our best to not rewrite history. Intel won the war because they were the processor architecture used in the IBM PC. If you want to talk about what was faster, the DEC Alpha was faster than the Pentium. If you want to talk about what was more affordable, Be made a computer with a pair of processors that ran faster than a 386 for less money. You could build a computer with a cheap RISC CPU and a DSP that would have much better MIPS per dollar/pound/franc than something with an Intel processor. Intel won because they offered decent price/performance that was still reasonably competitive with workstation offerings, thanks to a standardized way to add components to the CPU or motherboard chipset, which allowed competition to thrive in a commodity market. I thought everyone knew this.

Microkernels are better

Posted Feb 27, 2013 18:57 UTC (Wed) by hummassa (subscriber, #307)

I think you just revealed your age... ;-)
(and I'm probably half a dozen years older)

Microkernels are better

Posted Feb 27, 2013 20:34 UTC (Wed) by khim (subscriber, #9252)

> Intel won the war because they were the processor architecture used in the IBM PC.

Nope. Intel got money for the war because it built the architecture used in the IBM PC, that's true. But it won the war because it was faster. Do you think the developers of the monsters in the TOP500 list care about IBM PC compatibility? Nope: they care about performance. And this list was dominated by x86 CPUs for years.

> If you want to talk about what was faster, the DEC Alpha was faster than the Pentium.

For floating-point tasks, maybe at first, but for tasks which only use integers it was actually slower. And when you compare Alpha 21364 with Pentium 4 HT 3.06… it was no longer faster even for floating point.

> You could build a computer with a cheap RISC CPU and a DSP that would have much better MIPS per dollar/pound/franc than something with an Intel processor.

Then why aren't people doing it? Take a look at the list once more: 75% Intel x86-64, 12% AMD x86-64, 12% IBM POWER, and 1% SPARC. Where are these RISC CPUs and DSPs? Why are there so few of them in the list?

Microkernels are better

Posted Feb 27, 2013 21:51 UTC (Wed) by dlang (subscriber, #313)

The fact that the x86 was used on the most common platform meant that there was more money for speeding up the x86 chips, which made them more popular, which provided more money for speeding them up...

This is why small companies like Transmeta folded: they were compatible, but they didn't have the R&D budgets and manufacturing capability to compete with Intel. AMD is barely hanging on, and if Intel hadn't made the Itanium blunder (leaving the gap open for the AMD64 chips), I doubt AMD would have survived.

Network effects matter: when everyone is running binary software, being binary compatible matters. Since the IBM PC became the standard, any chips that weren't PC compatible became marginal and the popularity -> money -> R&D -> speed -> popularity cycle started.

With mobile devices NOT being x86 compatible, we are seeing a resurgence in competition at the architecture level again for consumer devices (enabled by Linux's cross platform support), and Microsoft and Intel have been trying for years to ignore and block this, but now they are having to really recognize the competition.

Microkernels are better

Posted Feb 27, 2013 22:07 UTC (Wed) by khim (subscriber, #9252)

> Since the IBM PC became the standard, any chips that weren't PC compatible became marginal and the popularity -> money -> R&D -> speed -> popularity cycle started.

Sure, but even if you have enough money you are still constrained by the laws of physics.

> With mobile devices NOT being x86 compatible, we are seeing a resurgence in competition at the architecture level again for consumer devices

Sure, but will fast ring switching survive this push? I very much doubt it. Note that POWER (which is actually slightly faster than x86, although more expensive) is also not all that fast with context switches, AFAICS.

Microkernels are better

Posted Feb 27, 2013 22:41 UTC (Wed) by dlang (subscriber, #313)

I am not trying to say that context switches will be fast; I was merely responding to the logic of why the x86 architecture won. It isn't because it's the best, it's because it's had the most R&D effort pumped into it to work around its problems.

This includes, to a large extent, being produced on the most advanced fab processes. If you took the competing designs and produced them at the same resolution that Intel uses for their x86 chips, they would be much smaller, cheaper, faster, and use significantly less power than they currently do. The fact that with all these handicaps they are competitive with Intel chips in many uses is a good indication of how bad the x86 architecture is.

Microkernels are better

Posted Mar 1, 2013 1:22 UTC (Fri) by ARealLWN (guest, #88901)

I would like to argue that the Itanium chip wasn't really a blunder on the part of Intel. The technical merits of it can certainly be called into question, but it killed off the DEC Alpha, SGI's interest in MIPS technology, and the PA-RISC architecture of HP simply with marketing, because everyone bought into the idea of EPIC being the future of high performance computing. They eliminated a large class of potential threats to their interests in the server and workstation market before shipping any silicon. That hardly seems like a blunder to me. The easiest way to make sure you win a race is to make sure anyone faster than you doesn't show up. Intel simply diverted attention away from other competition to make certain players who might pose a more immediate risk were out of the equation first. IMHO.

Microkernels are better

Posted Feb 28, 2013 15:20 UTC (Thu) by deater (subscriber, #11746)

> Then why aren't people doing it? Take a look at the list once more:

I did. Notice in the November 2012 list that Intel doesn't make the top 5 at all. Yet Power and SPARC do, both considered RISC chips by most people I think (although Power is debatable).

x86 got to the top just because of economies of scale and because it is good enough and relatively cheap. Being able to buy things off-the-shelf does help. Having spent some time in an HPC group I can tell you that x86 is used because it's there, not because it has any real benefits. How long has it taken them to get a fused multiply-add instruction?

Microkernels are better

Posted Mar 1, 2013 2:23 UTC (Fri) by ARealLWN (guest, #88901)

I believe that POWER (or PowerPC) claims to be a performance-optimized RISC architecture (my source is O'Reilly's High Performance Computing, second edition). As I understand it, that means they say they are RISC based but will include additional instructions if it seems like they could improve the performance of software written for the architecture. I do appreciate that you have given backing to my initial statements and would like to thank you for doing so.

Microkernels are better

Posted Mar 1, 2013 1:57 UTC (Fri) by ARealLWN (guest, #88901)

I was going to type a rebuttal stating how you are wrong and don't know what you're talking about. After carefully reading your reply I must say that I don't think I expressed my statements clearly the first time and that I believe you are probably correct about Intel being faster. I was trying to state that Intel was not faster, or faster per money invested, in the past; I was not talking about today. Currently if you want a fast general purpose processor Intel isn't a bad choice. If you want pure processing you can get a GPU, but those don't work well for general processing and only work with certain workloads, much like a DSP in the past. I'm not sure if anyone ever made a processing system based on RISC and DSP architecture, but in the past someone developed a system based on a bunch of TI DSPs with 2 MB of RAM on 72-pin SIMM modules (if memory serves) that had the best performance per dollar for its time. In order to break DES the EFF developed a machine with custom chips which certainly weren't Intel compatible but had much better performance. Building a computer with good performance depends as much on what applications you are running as on what CPU you choose and which peripherals you put inside it. As far as your reference to the Pentium 4 compared to an early Alpha processor, I won't comment except to mention that Alpha was dead by then as far as DEC was concerned and had been for a while. The engineers had moved to AMD or some other company, and the Athlon was competing with that processor very favorably in SPEC benchmarks without needing a 6 GHz ALU.


Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds