
Factoring in I/O

Posted Jul 1, 2008 23:08 UTC (Tue) by jd (guest, #26381)
Parent article: The Kernel Hacker's Bookshelf: Ultimate Physical Limits of Computation

I/O is a relatively simple thing to factor in, provided we assume something analogous to the
Transputer, which communicated with "adjacent" nodes, where "adjacent" meant either the
physically adjacent Transputers or the ones that would be logically adjacent in a hypercube
topology. The latter was generally faster, but here we have no wires and cannot organize the
atoms that way.

Basically, you need to consider a cluster of atoms such that the cluster supports your
logical operation plus the ability to receive input and to deliver output. The Transputer
used four serial links. Could we have more than that here? Well, that depends on the maximum
density at which you could pack the atoms and the arrangement they took. The tetrahedron is a
nice, packed structure and, if I remember correctly, gives you 6 bonds per atom. (One bond to
each of the other three atoms in the tetrahedron, plus one bond to each of the three adjacent
tetrahedra.) Each bond is capable of conveying data, so each bond can be regarded as a link.
As with the Transputer, data can flow in either direction, but only in one direction at a
given time.

So, your processing atom has 6 lines for I/O. Assuming it can handle only two input lines and
one output line (a very basic gate), you have enough lines to feed it at double the I/O
bandwidth if necessary, without running out of data. Taking an I/O transaction to be simply
the delivery of information to a processing atom used for I/O, a gate operation, and then the
delivery of the result, each I/O operation consumes the time for one gate transaction plus
twice the time it takes for an electron to travel the distance between atoms in this
structure.
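The accounting above is easy to sketch numerically. The time constants below are placeholders of my own choosing, not measured values; the only thing the sketch asserts is the structure of the sum (one gate transaction plus two inter-atom hops per I/O operation):

```python
# Per-I/O-transaction latency model: deliver input to the I/O atom,
# perform one gate operation, deliver the result.
# T_GATE and T_HOP are illustrative placeholders, not physical values.

T_GATE = 1.0  # time for one gate transaction (arbitrary units)
T_HOP = 0.1   # time for an electron to hop between adjacent atoms

def io_transaction_time(t_gate=T_GATE, t_hop=T_HOP):
    """One gate transaction plus two inter-atom hops (in and out)."""
    return t_gate + 2 * t_hop
```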

(Not sending to a specific atom is equivalent to ANDing with zero, so this drives the heat
output up considerably. We are assuming gate transactions are based on instantaneous input
values and do not need to be actually measured, per se. Otherwise, we need to add the time
for two independent states to stabilize and be measured, where the two state changes do not
take place simultaneously from the perspective of the observing atom. IIRC, the correct value
to use is 1.5x the stabilization time for a single state.)

So, the total time for a local transaction is now the time for three gate transactions, plus
the time for four electron hops, plus optionally 3x the time it takes for a state to
stabilize. An electron hop is bounded below by the speed of light, but the exact distance
depends on the forces involved. For simplicity, I will disregard the hop distance, but if you
want a more accurate value, I suggest picking the carbon atom, and perhaps diamond as the
structure, then figuring out the correct values from that. The other values are in the
article.
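Taking my own suggestion of diamond, a rough lower bound on the hop time is easy to compute: the C-C bond length in diamond is about 0.154 nm, and nothing travels faster than light, so one hop takes at least half an attosecond. The gate and stabilization times below are still placeholders; only the bond length and the speed of light are real numbers:

```python
# Lower bound on the electron-hop time for a diamond lattice, plus the
# total local-transaction time: three gate transactions + four hops
# (+ optionally 3x the single-state stabilization time).

C = 2.998e8             # speed of light, m/s
BOND_LENGTH = 1.54e-10  # C-C bond length in diamond, m

def hop_time_lower_bound(distance=BOND_LENGTH):
    """Speed-of-light bound on one inter-atom hop (~5.1e-19 s)."""
    return distance / C

def local_transaction_time(t_gate, t_hop, t_stabilize=0.0):
    """Three gate transactions, four hops, optional 3x stabilization."""
    return 3 * t_gate + 4 * t_hop + 3 * t_stabilize
```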

However, transactions are multicast. Each I/O atom can deliver to anywhere between zero and
two target processing atoms (assuming it cannot deliver to the originating atom), plus zero to
three other I/O atoms, which can in turn cascade the data to processing atoms and other I/O
atoms. The exact number depends on what happens to be adjacent.

This means a signal can travel between any given processing atom and any given set of
processing atoms (excluding any set that includes itself). From the per-hop figure, we can
derive the average time for an actual transaction by computing the average volume the signal
must traverse. The maximum time is the time for a signal to sweep the entire structure;
assuming any-to-any communication, the maximum is the time for the signal to sweep half that
distance.
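The "average volume" step can be illustrated with a Monte Carlo estimate. Modelling the structure as a unit cube and endpoints as uniformly random points inside it (an assumption of mine, purely for illustration), the mean straight-line separation comes out near 0.66 side lengths, versus a full diagonal sweep of sqrt(3) ~ 1.73, which is the gap between the average and worst-case figures:

```python
# Monte Carlo estimate of the mean distance between two random points
# in a unit cube -- a stand-in for the "average volume" a signal must
# traverse, versus the worst case of sweeping the whole structure.
import math
import random

def mean_pair_distance(samples=100_000, seed=1):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        a = [rng.random() for _ in range(3)]
        b = [rng.random() for _ in range(3)]
        total += math.dist(a, b)
    return total / samples
```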

Because 2/3 of the atoms are used for I/O, and because it takes quite some time for data to
reach a given processing atom, the maximum theoretical speed should be reduced accordingly.
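The derating is straightforward: if two of every three atoms serve as I/O switches, only a third of the structure computes, so the article's raw per-atom limit scales by 1/3 before any latency penalty. The extra latency factor below is left as a free parameter, since the comment gives no specific value for it:

```python
# Scale the raw theoretical operation rate by the fraction of atoms
# that actually compute (1/3 here, since 2/3 are I/O), and by an
# unspecified latency derating factor (assumed 1.0 by default).

def effective_op_rate(raw_rate, compute_fraction=1/3, latency_factor=1.0):
    return raw_rate * compute_fraction * latency_factor
```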

This has assumed we are using electrons to convey data, but we don't have to. An electron
striking an atom at a suitable speed will release an X-Ray; an X-Ray of suitable energy
striking an atom will release an electron. This is the basis of X-Ray fluorescence, a
widely-used technique.

This reduces (but does not eliminate) the need to use processing atoms as I/O switches. If
there's line-of-sight, it's possible for an X-Ray to deliver data to a target cluster of
atoms. You would use the cluster to absorb the X-Rays and then deliver the data as electrons
to the processing atom. The processing atom would need to (somehow) fire an electron fast
enough to convert the result back into an X-Ray.

This I/O is point-to-point, not multicast, but has significantly less latency, allowing for
far larger "neighbourhoods".



