
Another round of speculative-execution vulnerabilities

Posted Aug 9, 2023 19:45 UTC (Wed) by willy (subscriber, #9762)
In reply to: Another round of speculative-execution vulnerabilities by Wol
Parent article: Another round of speculative-execution vulnerabilities

I told you why this argument was crap last time you espoused it. Knock it off.


Another round of speculative-execution vulnerabilities

Posted Aug 9, 2023 20:58 UTC (Wed) by Wol (subscriber, #4433) [Link] (19 responses)

Are you telling me we can communicate faster than light? Other people have explained the problem is worse than I thought, but from what I remember of your explanation you were saying that c wasn't a problem. (Well, it does seem it's not THE problem, but it does place a hard upper limit on chip frequency - of 5GHz on a 3cm chip ...)

Cheers,
Wol

Another round of speculative-execution vulnerabilities

Posted Aug 9, 2023 21:40 UTC (Wed) by willy (subscriber, #9762) [Link] (4 responses)

You have two misunderstandings relevant to your argument. They're opposite in sign, so they come close to cancelling each other out.

The first is that things need to happen in a single cycle. An instruction that needs data from L3 cache can and will stall for hundreds of cycles. During that time the CPU will execute some of the other dozens of instructions that it has ready. It's something like six clock ticks to retrieve data from L1. Data in registers is ready to operate on and incurs no delay.

The second is that the speed of communication between different parts of the CPU has anything to do with the speed of light. Speed of electrons in copper is much slower. That's the part other people are telling you that you have wrong.

(There are other problems with your argument, but those are the big two)

Another round of speculative-execution vulnerabilities

Posted Aug 9, 2023 21:59 UTC (Wed) by farnz (subscriber, #17727) [Link] (3 responses)

Note that the speed of electrons is irrelevant; the voltage change that represents a change in state moves much faster than the electrons do, typically at around 60% to 70% of the speed of light in a copper conductor.

But the point about things not needing to happen in a single cycle is key; I can design my logic to account for propagation delays in the circuit, and have it work perfectly. This is what the timing diagrams that are part of any digital logic chip datasheet (and in every CPU datasheet since the 4004) are all about - how do I connect up the entire system's worth of logic such that the system's timing constraints are met?

Another round of speculative-execution vulnerabilities

Posted Aug 10, 2023 7:59 UTC (Thu) by Wol (subscriber, #4433) [Link] (2 responses)

So, as farnz says, you appear to have completely mis-understood my argument and are arguing against a straw man.

Signals are carried by photons (or em waves, same(ish) thing) so the speed of light IS relevant, although from what others have said the telegraph effect is probably more important.

My argument has repeatedly been prefixed with "IF components need to communicate" so okay, I'm not necessarily talking about clock cycles, but a single communication cycle has that upper limit. I'm not always clear in what I say, I know that, but if you make no attempt to understand me, I can't understand you either. So IFF a communication cycle equals a clock cycle, 5GHz is the maximum clock possible between two random components in a chip. Of course, splitting a communication clock cycle into multiple clock cycles can speed OTHER stuff up, but it makes no difference to the speed at which a signal travels across a chip.

(And of course, without communication a chip can't work.)

Cheers,
Wol

Another round of speculative-execution vulnerabilities

Posted Aug 10, 2023 15:03 UTC (Thu) by farnz (subscriber, #17727) [Link]

A corollary of your argument is that Starlink satellites (communication clock rate of around 230 kHz) can be no higher than 1.3 km above the receiver, and Sky TV satellites (communication clock rate of 22 MHz or above) can be no higher than 13 metres above the receiver.

Another round of speculative-execution vulnerabilities

Posted Aug 10, 2023 16:23 UTC (Thu) by malmedal (subscriber, #56172) [Link]

Nobody is misunderstanding you, it is very easy to understand what you are saying. It's just that it is wrong.

However, you seem to be unable to understand what people are saying; please read more carefully.

For instance, there is no "telegraph effect"; the "telegrapher's equations" are just Maxwell's equations applied to signals in a wire.

If you wish to be able to say anything intelligible about chips you need to understand what "pipelines" are in this context. This appears to be a major gap in your knowledge, you completely ignore it when people bring this up. It is not just a word, it is one of the fundamental concepts.

Already the 8088 had a pipeline, it is not a new concept.

Another round of speculative-execution vulnerabilities

Posted Aug 9, 2023 21:57 UTC (Wed) by malmedal (subscriber, #56172) [Link] (13 responses)

As I said, there is no hard limit. A speed of 0.05mm per nanosecond would imply that a 3cm wide chip could only manage less than 2MHz. Yet the speeds of the latest processors are above 5GHz.
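
The arithmetic behind both numbers in this sub-thread can be checked with a short sketch. It assumes the (incorrect) model under discussion, in which nothing new may be sent until the previous signal has crossed the chip; the function name and the choice of units are mine, not anything from the comments.

```python
C_M_PER_S = 299_792_458.0  # speed of light in vacuum


def wait_for_arrival_limit_hz(speed_m_per_s: float, path_m: float) -> float:
    """Highest rate possible IF each signal must arrive before the next is sent.

    This encodes the disputed assumption, not how real chips are clocked.
    """
    return speed_m_per_s / path_m


# 0.05 mm per nanosecond, one way across a 3 cm chip
slow_speed = 0.05e-3 / 1e-9
print(wait_for_arrival_limit_hz(slow_speed, 0.03))  # a little under 2 MHz

# Speed of light, there and back across the same 3 cm chip (6 cm path)
print(wait_for_arrival_limit_hz(C_M_PER_S, 0.06))   # roughly 5 GHz
```

Real parts clock far above the first figure, which is the point: the wait-for-arrival assumption is what fails, not the physics.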

Another round of speculative-execution vulnerabilities

Posted Aug 9, 2023 22:04 UTC (Wed) by willy (subscriber, #9762) [Link] (11 responses)

To further emphasize this point, the speed of a PCIe gen 6 link is now 64GHz. And the maximum length of a trace is considerably longer than 3cm.

Another round of speculative-execution vulnerabilities

Posted Aug 9, 2023 22:21 UTC (Wed) by mathstuf (subscriber, #69389) [Link] (2 responses)

Another way to think about it: these signals are largely one-way communication. If it were round-trip, the speed of light would matter. But all that really matters is how accurately you can sample the wire for a signal (PCIe gen 6 can apparently do so 64 billion times a second).

And another way to sanity check things: if the size of space between communication endpoints limited your processing rate, we'd probably still be waiting for the first (quality) images from various Mars rovers.

At least I think that's somewhat closer than what Wol has as a model.

Another round of speculative-execution vulnerabilities

Posted Aug 10, 2023 8:00 UTC (Thu) by Wol (subscriber, #4433) [Link] (1 responses)

No. I did say the model is based on communication *between* components, ie there-and-back.

Cheers,
Wol

Another round of speculative-execution vulnerabilities

Posted Aug 10, 2023 20:28 UTC (Thu) by rschroev (subscriber, #4164) [Link]

If I were to send you a good old-fashioned letter, it would take a day or 4 (or somewhat less or more, I don't actually know; the exact value doesn't matter for this discussion); a reply from you to me would take 4 days too; there and back is then 8 days. Does that mean that I can only send you a letter once every 8 days? Only when my letter needs information from your reply; in that case I have to wait until I get your letter. But in all other cases, I can easily send new letters while old ones are still in transit. The distance between you and me sets a lower limit on latency, but does not affect bandwidth. It's the same for communication in computer systems.
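
The letter analogy can be put into numbers with a minimal sketch. The 4-day delivery time comes from the comment above; the one-letter-per-day posting interval and both function names are assumptions made for illustration.

```python
def stop_and_wait_days(n_letters: int, one_way_days: float) -> float:
    """Days until the last reply arrives when every letter waits for the
    reply to the previous one (each exchange costs a full round trip)."""
    return n_letters * 2 * one_way_days


def pipelined_days(n_letters: int, one_way_days: float,
                   send_interval_days: float) -> float:
    """Days until the last letter arrives when letters are posted at a fixed
    interval without waiting for any replies."""
    return (n_letters - 1) * send_interval_days + one_way_days


print(stop_and_wait_days(10, 4.0))   # 80.0
print(pipelined_days(10, 4.0, 1.0))  # 13.0
```

In the pipelined case the 4-day latency is paid once, and the rate of delivery is set by how often letters are posted, not by the distance.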

Another round of speculative-execution vulnerabilities

Posted Aug 9, 2023 22:23 UTC (Wed) by malmedal (subscriber, #56172) [Link]

It's amazing how far they have pushed the technology, just ten years ago I would have said it was impossible to get that high outside of a lab-setting.

Another round of speculative-execution vulnerabilities

Posted Aug 10, 2023 10:49 UTC (Thu) by james (guest, #1325) [Link] (6 responses)

> To further emphasize this point, the speed of a PCIe gen 6 link is now 64GHz.

I'm pretty sure this isn't technically correct, at least when talking about how far the signal propagates before the next signal is generated. PCIe 6.0 uses

> PAM4 (Pulse Amplitude Modulation with 4 Levels) [...] a multilevel signal modulation format used to transmit data. [...] It packs two bits of information into the same amount of time on a serial channel. The utilization of PAM4 allows the PCIe 6.0 specification to reach 64 GT/s data rate and up to 256 GB/s bidirectional bandwidth via a x16 configuration.

It's basically the same concept as MLC versus SLC in flash.

This is the key difference between PCIe 5.0 (which used NRZ, or one bit per cycle) and PCIe 6.0. Both run at 32 billion signals per second: it's just with PCIe 6.0 each signal conveys two bits.

Your main point is correct, though -- this isn't what limits the length of a PCIe 6.0 connection.

Another round of speculative-execution vulnerabilities

Posted Aug 10, 2023 15:39 UTC (Thu) by kpfleming (subscriber, #23250) [Link] (5 responses)

And those 32 billion transfers per second are spread across 16 parallel lanes... so each lane is nowhere close to '64 GHz' :-)

Another round of speculative-execution vulnerabilities

Posted Aug 10, 2023 16:44 UTC (Thu) by malmedal (subscriber, #56172) [Link] (4 responses)

> And those 32 billion transfers per second are spread across 16 parallel lanes...

No. Each lane separately transmits 64 Gigabits per second.

Standard terminology is 64GT/s and 32GHz.

Another round of speculative-execution vulnerabilities

Posted Aug 23, 2023 5:28 UTC (Wed) by JosephBao91 (subscriber, #157211) [Link] (3 responses)

Well, I think this statement is still not correct.
PCIe Gen5 is 32GT/s, with a frequency of 16GHz (it transfers data on both posedge and negedge). Gen6 uses PAM4 instead of NRZ, so it transfers 2 bits each time; the frequency is still 16GHz, but the speed is 64GT/s.
And for hardware design, PAM4 at 16GHz is more difficult than NRZ at 16GHz.

Another round of speculative-execution vulnerabilities

Posted Aug 23, 2023 11:34 UTC (Wed) by malmedal (subscriber, #56172) [Link] (2 responses)

Do you have a reference for this? I haven't read the actual specs myself, but all the articles I've read say 5.0 is 32GHz e.g. https://www.tomshardware.com/reviews/pcie-definition,5754...

Another round of speculative-execution vulnerabilities

Posted Aug 23, 2023 12:13 UTC (Wed) by excors (subscriber, #95769) [Link] (1 responses)

I think that may just be a different meaning of "frequency": the sampling rate is 32GHz, while the Nyquist frequency (basically the frequency of the sine wave corresponding to the worst-case signal 1010101...) is 16GHz. Same as describing audio CDs as 44kHz (the sampling rate) or 22kHz (the highest audio frequency that can be encoded without aliasing) - both are reasonable in different contexts.

For example https://blog.samtec.com/post/why-did-pcie-6-0-adopt-pam4-... describes the Nyquist frequency of PCIe 5.0/6.0 as 16GHz. (The sampling rate is also the same in both, the difference is that in 6.0 each sample encodes 2 bits, so it's 16GHz Nyquist frequency with 32GHz sampling rate and 64GT/s data rate.)
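
To keep the three numbers apart, here is a small sketch relating them, using only the figures quoted in this sub-thread (32 billion samples per second per lane, 1 bit per sample for NRZ, 2 for PAM4). The function and key names are illustrative assumptions, not terminology from the PCIe specification.

```python
def link_rates(samples_per_second_g: float, bits_per_sample: int) -> dict:
    """Derive the per-lane Nyquist frequency and bit rate from the sampling rate.

    samples_per_second_g: samples (symbols) per second, in billions.
    bits_per_sample: 1 for NRZ, 2 for PAM4.
    """
    return {
        # Worst-case pattern 1010... repeats every two samples
        "nyquist_ghz": samples_per_second_g / 2,
        "sampling_rate_ghz": samples_per_second_g,
        "bit_rate_gbit_s": samples_per_second_g * bits_per_sample,
    }


print(link_rates(32.0, 1))  # figures described above for PCIe 5.0 (NRZ)
print(link_rates(32.0, 2))  # figures described above for PCIe 6.0 (PAM4)
```

Both calls give the same Nyquist frequency and sampling rate; only the bit rate doubles, which matches the explanation above.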

Another round of speculative-execution vulnerabilities

Posted Aug 23, 2023 13:14 UTC (Wed) by malmedal (subscriber, #56172) [Link]

Thank you, sounds plausible.

Another round of speculative-execution vulnerabilities

Posted Aug 9, 2023 22:26 UTC (Wed) by farnz (subscriber, #17727) [Link]

For a very clear example, a geostationary TV satellite is typically transmitting at 22 MHz or higher symbol rates; if the signal has to propagate all the way from the satellite to the receiver before the satellite can start the next symbol, then geostationary orbit has to be no higher than 14 meters above the satellite dish. In practice, everything is designed to handle this delay, and thus it's fine.

If you insist on two-way communication, Starlink's signal has been partially reverse engineered, and has a symbol time of 4.4 µs; this corresponds to 1.3 km path length in free space. And yet, a Starlink satellite is around 550 km above the Earth's surface, for a propagation delay of around 1,800 µs - significantly more than the symbol time.
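
The figures above can be reproduced with a short sketch. The symbol rate, symbol time and 550 km altitude are taken from the comment; the function names are mine, and the calculation assumes free-space propagation straight up and down.

```python
C_M_PER_S = 299_792_458.0  # speed of light in vacuum


def symbol_length_m(symbol_rate_hz: float) -> float:
    """Distance a signal travels in free space during one symbol time."""
    return C_M_PER_S / symbol_rate_hz


def symbols_in_flight(distance_m: float, symbol_rate_hz: float) -> float:
    """How many symbols are on their way at once over the given distance."""
    return distance_m / symbol_length_m(symbol_rate_hz)


tv_rate_hz = 22e6                 # geostationary TV example
starlink_rate_hz = 1 / 4.4e-6     # 4.4 microsecond symbol time

print(symbol_length_m(tv_rate_hz))                 # between 13 and 14 metres
print(symbol_length_m(starlink_rate_hz))           # about 1.3 km
print(550e3 / C_M_PER_S * 1e6)                     # one-way delay in microseconds
print(symbols_in_flight(550e3, starlink_rate_hz))  # several hundred symbols
```

Hundreds of symbols are in transit at any moment on the Starlink link, so the transmitter is clearly not waiting for each one to arrive before sending the next.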


Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds