Question about Linus' QOTW
Question about Linus' QOTW
Posted Aug 25, 2011 10:47 UTC (Thu) by pr1268 (guest, #24648)Parent article: Quotes of the week
I've questions about Linus' message—immediately preceding "Has anybody ever looked at a real computer?" he says "YOU ARE DIVIDING A UNSIGNED VALUE BY TWO" (emphasis his).
Is his complaint about misuse (or unnecessary use) of the do_div macro? It seems that dividing an unsigned value by two is as easy as a bit shift (instead of a six-line macro in /usr/src/linux/include/asm-generic/div64.h). Thanks for the clarification.
Posted Aug 25, 2011 15:40 UTC (Thu)
by bronson (subscriber, #4806)
[Link] (2 responses)
Posted Aug 25, 2011 16:30 UTC (Thu)
by hmh (subscriber, #3838)
[Link]
Linus says this later (which I also consider QotW material): Which is in no way an overstatement of the badness, so I think you got it just right. It is ungodly expensive, indeed.
Posted Aug 25, 2011 17:51 UTC (Thu)
by blitzkrieg3 (guest, #57873)
[Link]
Posted Aug 25, 2011 17:50 UTC (Thu)
by blitzkrieg3 (guest, #57873)
[Link] (8 responses)
Posted Aug 26, 2011 4:53 UTC (Fri)
by pr1268 (guest, #24648)
[Link] (7 responses)
Now I've another question: where's the compile error? The following code compiles and runs fine on my decrepit 32-bit P4: Or are there 32-bit non-x86 architectures on which Linux runs which don't have such support for 64-bit integral operations? (And if so, just curious, which archs?) Thanks!
Posted Aug 26, 2011 4:59 UTC (Fri)
by dtlin (subscriber, #36537)
[Link] (5 responses)
Posted Aug 28, 2011 3:03 UTC (Sun)
by giraffedata (guest, #1954)
[Link] (4 responses)
Not when I do it. Never mind pr1268's fancy Pentium IV. On my Pentium II, with GCC 2 (-O0), "unsigned long long a; a /= 2" compiles to two instructions, after loading 'a' into two 32 bit registers: 1) shrdl to shift the lower 32 bits right 1, shifting in the low bit of the upper 32 bits and 2) shrl to shift the upper 32 bits right 1.
No call to libgcc is emitted.
Posted Aug 28, 2011 20:25 UTC (Sun)
by robbe (guest, #16131)
[Link]
Or does the kernel use some compiler flags that forbid the use of the simple two shift instructions?
Posted Aug 29, 2011 13:44 UTC (Mon)
by blitzkrieg3 (guest, #57873)
[Link] (2 responses)
Here's the error they were trying to fix:
> "ERROR: "__udivdi3" [fs/btrfs/btrfs.ko] undefined!
From what I can tell, the compiler can't translate: min_free /= 2; to a single operation on 32 bit, so it's replaced with the __udivdi3 call. In userspace, since it's a division by two, you get two bit shifts. In other cases, it would be two divl operations. The kernel can do a double divide by using do_div.
By not including __udiv in the 32-bit libraries, Linus is essentially forcing the programmer to:
A) realize that integer division is costly
http://gcc.gnu.org/onlinedocs/gccint/Integer-library-rout...
Posted Aug 30, 2011 2:34 UTC (Tue)
by giraffedata (guest, #1954)
[Link] (1 responses)
It's a statically linked call to the library. I think you mean "run time function" -- that's what the GCC manual page you point to calls it." It means a function in the GCC runtime library, i.e. libgcc.
__udivdi3() is for use when compiling for a CPU that doesn't have a divide instruction (that used to be pretty common -- divide is a pretty complex instruction; I don't know if it is common now. x86 has always had divide). But if the compiler is smart enough to recognize a divide by two and know you can do that by shifting, I would expect it not to use __udivdi3() on any CPU.
But the error message makes it clear that the compiler in this case generated a call to __udivdi3() when it created btrfs.ko. I assume, then, that this is not x86 because as we've seen, the compiler generates shift instructions instead on x86 (and even if it didn't, it would just generate a divide instruction). I also have to believe the divide-by-shifting cleverness is in a platform-dependent part of GCC and is absent for whatever platform this is.
One other thing to note: we've been saying this is problem with 64 bit numbers on a 32 bit CPU. But __udivdi3() is the divide function for "unsigned long" operands, i.e. on a 32 bit machine it is a 32 bit divide.
The compiler would use __udivti3() instead if it wanted to divide 64 bit numbers.
Posted Aug 30, 2011 3:34 UTC (Tue)
by nybble41 (subscriber, #55106)
[Link]
That's not quite correct. The preface to the runtime library section[1] of the GCC Internals manual states that the C types are "for illustrative purposes", and that "these routines take arguments and return values of a specific machine mode, not a specific C type." The Machine Modes section[2] defines SImode, DImode, and TImode as a four-byte integer, an eight-byte integer, and a 16-byte integer, respectively. In other words, in the hypothetical architecture invented to illustrate these library routines, the "int" in __udivsi3() is 32-bit, the "long" in __udivdi3() is 64-bit, and the "long long" in __udivti3() is 128-bit.
A simple test program confirms that 64-bit division generates a call to __udivdi3() in 32-bit mode. Similarly, division of unsigned __int128 values in 64-bit mode generates a call to __udivti3().
[1] http://gcc.gnu.org/onlinedocs/gccint/Libgcc.html
Posted Aug 30, 2011 3:00 UTC (Tue)
by jrn (subscriber, #64214)
[Link]
But how? you might wonder. Well, the patch fixing it changes two lines --- one contains a division by two, which should be compiled to a shift, and the other a division by a variable amount, which requires human attention and hence results in an unresolved reference to __udivdi3, presumably. I imagine the patch author noticed both divisions in the same function and needlessly changed the former at the same time as the latter.
Question about Linus' QOTW
Question about Linus' QOTW
Big 64-bit divides are bad bad bad. They are so horrendously bad on 32-bit that we don't even support them.
Question about Linus' QOTW
do_div
was already there
http://lxr.linux.no/#linux+v3.0.3/arch/x86/include/asm/div64.h#L20
It is only as easy as a single bit shift on 64 bit architecture. Here's what I gather. The patch doesn't compile on 32-bit, because of
Question about Linus' QOTW
min_free /= 2;
Since min_free is a 64 bit unsigned integer, this is a simple bit shift for 64 bit architectures (your compile will optimize it for you) but is impossible for 32 bit architectures and generates a compiler error.
So to fix that, they use do_div
, which is an expensive macro that allows 32-bit machines to do 64-bit division. But all you really need to do on 32-bit machines is two bit shifts instead of one: Bit shift the high register, than the low register.
Instead, on 64 bit, the do_div
fix changes a single simple instruction operation to the more expensive divl
operation. On 32 bit, the fix changes what should be two simple instruction operations to the expensive multiple instruction do_div
macro.
64-bit division on 32-bit arch - where's the compiler error?
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
int main(int argc, const char* argv[])
{
uint64_t val = 0;
if (argc > 1)
val = strtoull(argv[1], 0, 10);
if (val > 2)
val /= 2; /* or val >>= 1; */
(void) fprintf(stdout, "%llu\n", val);
return 0;
}
That works in userspace thanks to libgcc, which isn't linked into the kernel.
64-bit division on 32-bit arch - where's the compiler error?
64-bit division on 32-bit arch - where's the compiler error?
That works in userspace thanks to libgcc, which isn't linked into the kernel.
64-bit division on 32-bit arch - where's the compiler error?
64-bit division on 32-bit arch - where's the compiler error?
B) Manually decide whether to bitshift, or do two divisions with do_div.
64-bit division on 32-bit arch - where's the compiler error?
It's not a call to the library, it's statically linked with the "real time function" __udivdi3.
64-bit division on 32-bit arch - where's the compiler error?
[2] http://gcc.gnu.org/onlinedocs/gccint/Machine-Modes.html
64-bit division on 32-bit arch - where's the compiler error?