| From: | Thomas Schwinge <thomas-AT-codesourcery.com> |
| To: | <info-gnu-AT-gnu.org>, <bug-hurd-AT-gnu.org>, <hurd-devel-AT-gnu.org> |
| Subject: | GNU Hurd 0.6 released |
| Date: | Wed, 15 Apr 2015 22:51:08 +0200 |
| Message-ID: | <8738415d4z.fsf@kepler.schwinge.homeip.net> |
Hi!

We're pleased to announce version 0.6 of the GNU Hurd, <http://www.gnu.org/software/hurd/>.

The GNU Hurd is the GNU project's replacement for the Unix kernel. It is a collection of servers that run on the Mach microkernel to implement file systems, network protocols, file access control, and other features that are implemented by the Unix kernel or similar kernels (such as Linux). In more detail: <http://www.gnu.org/software/hurd/hurd/documentation.html>, <http://www.gnu.org/software/hurd/hurd/what_is_the_gnu_hur...>.

GNU Hurd runs on 32-bit x86 machines. A version running on 64-bit x86 (x86_64) machines is in progress. Volunteers interested in ports to other architectures are sought; please contact us (see below) if you'd like to help.

To compile the Hurd, you need a toolchain configured to target i?86-gnu; you cannot use a toolchain targeting GNU/Linux. Also note that you cannot run the Hurd "in isolation": you'll need to add further components such as the GNU Mach microkernel and the GNU C Library (glibc) to turn it into a runnable system.

This new release bundles bug fixes and enhancements done since the last release:

| Version 0.6 (2015-04-10)
|
| Numerous cleanups and stylistic fixes of the code base. Several
| problems have been identified using static analysis and exercising
| tools, and have subsequently been fixed.
|
| The message dispatching code in the Hurd servers has been improved.
| Among other things, we now make use of the protected payloads
| introduced in GNU Mach 1.5.
|
| The embedded gz and bz2 decompressor code has been removed; libz and
| libbz2 are used instead.
|
| The native fakeroot tool has been greatly improved and is now able to
| build many packages. The portinfo and rpctrace tools now offer a
| better debugging experience.
|
| The performance of the integer hashing library has been improved.
|
| The init server has been split into the startup server (handling early
| system bootstrap and shutdown) and a SysV-style init program (aptly
| named `init').
|
| The procfs and random translators have been merged.

Many thanks to all the people who are helping!

Releases may be downloaded from <ftp://ftp.gnu.org/gnu/hurd/>, or checked out of Git, <http://git.savannah.gnu.org/cgit/hurd/hurd.git/>.

The MD5 and SHA1 checksums for this distribution are:

7d69c5e1bb47c9d5636054c57fbc0304  hurd-0.6.tar.bz2
0ac9af94761e5b59a3f19756c6f8d059  hurd-0.6.tar.bz2.sig
0b5130fffe640edc8e60fea3ce7b3d68  hurd-0.6.tar.gz
6c1ad02e1bfe8219341fae218612abc4  hurd-0.6.tar.gz.sig

08ef505f425db3a15d2ecee5f35897d1b7ef7755  hurd-0.6.tar.bz2
9049c1bbcc71fafc459f07a582575804cfd48ebb  hurd-0.6.tar.bz2.sig
a5d90c51d2b778c1a79895e11c1699ac98796020  hurd-0.6.tar.gz
bff54932420a7e290a096a8582acf69c7b2bafec  hurd-0.6.tar.gz.sig

Please read the FAQ at <http://www.gnu.org/software/hurd/faq.html>. Bug reports should be sent to <bug-hurd@gnu.org> or filed on <http://savannah.gnu.org/bugs/?group=hurd>. Requests for assistance should be sent to <help-hurd@gnu.org> or filed on <http://savannah.gnu.org/support/?group=hurd>. You can also find us on the Freenode IRC network in the #hurd channel.

For the maintainers,
Thomas

--
If you have a working or partly working program that you'd like to offer to the GNU project as a GNU package, see https://www.gnu.org/help/evaluation.html.
Finally 64-bit support in sight
Posted Apr 16, 2015 21:46 UTC (Thu) by proski (subscriber, #104) [Link]
Unless things have improved, the filesystems supported by GNU/Hurd are limited to 2 gigabytes, as they use mmap for the whole partition. Switching to a 64-bit architecture should fix that limitation. Considering the shortage of GNU/Hurd developers, I think it would make sense to give up 32-bit support completely and concentrate on x86_64.
Finally 64-bit support in sight
Posted Apr 16, 2015 23:21 UTC (Thu) by rahvin (subscriber, #16953) [Link]
They just don't have enough developers, because Linux draws them away and they can get paid for working on Linux. Though I suspect they will keep moving along very slowly, and by the time they get a 64-bit bootable kernel the rest of the world might be moving to 128-bit (yes, I'm that skeptical). You just can't build a kernel of that complexity with a dozen developers working part-time for fun.
Finally 64-bit support in sight
Posted Apr 17, 2015 15:55 UTC (Fri) by flussence (subscriber, #85566) [Link]
TempleOS is doing pretty well as a one-man operation. Maybe Hurd has too *many* people working on it?
Finally 64-bit support in sight
Posted Apr 18, 2015 17:30 UTC (Sat) by ncm (subscriber, #165) [Link]
The full 256-bit instruction set is ridiculously fat and unhealthy because 8086 ate all the slim instructions, but small loops won't notice because the instruction cache sheds the fat on the way in.
When the chips that support AVX512 come out, you will have native 512-bit word support. The languages are running way, way behind, and the compiler infrastructure, too. I don't think there is even a native ABI for AVX2 yet, never mind a Linux kernel. What does "natural word size" mean, on a 256-bit ISA, and do you really want int to be that big?
The languages, even totally new ones, have not begun to come to grips with the new facts. What does it mean to have a C ABI, or C APIs, or a C FFI when there are no types that name the native word size?
Finally 64-bit support in sight
Posted Apr 19, 2015 3:02 UTC (Sun) by roc (subscriber, #30627) [Link]
Finally 64-bit support in sight
Posted Apr 20, 2015 12:25 UTC (Mon) by ncm (subscriber, #165) [Link]
128-bit addressing is not needed, or usable, by anybody. When it is needed, if ever, it will be trivial to add.
Finally 64-bit support in sight
Posted Apr 21, 2015 8:30 UTC (Tue) by jem (subscriber, #24231) [Link]
Indeed. The current 64-bit processors don't even offer 64-bit addressing. For example, the x86-64 is currently limited to 48 bits.
Finally 64-bit support in sight
Posted Apr 24, 2015 18:10 UTC (Fri) by cwitty (subscriber, #4600) [Link]
Finally 64-bit support in sight
Posted Apr 25, 2015 6:00 UTC (Sat) by jzbiciak (subscriber, #5246) [Link]
Long vector processors have been around for many, many years. By your measure, ARMv7 was a 128-bit architecture due to NEON. Heck, the Pentium MMX (which came out almost 20 years ago) would have been a 64-bit processor, and I doubt you'd find many folks who call the Pentium MMX a 64-bit machine.
While there have been many debates on what defines the "bitness" of an architecture, the world seems to have settled on a criterion that I will state roughly as: "What is the width of the largest integer data type that the architecture supports, for which there is no faster, smaller datatype?"
So, for example, the 68000 offered 32-bit registers and 32-bit math, but it offered a 16-bit ALU, and the fastest integer operations on that machine were 16 bits. Later, when the ALU widened to 32 bits in that family (with the 68020, IIRC), there was no difference in performance between 16-bit and 32-bit arithmetic for register-to-register operations. So, by that measure the 68000 was a 16-bit architecture and the 68020 was a 32-bit architecture.
There have been interesting hybrids. The Z80, for example, has a 4-bit ALU. But, it doesn't offer any 4-bit instructions that are faster than its 8-bit instructions. Most everyone considers it an 8-bit processor.
Note that this is separate from the pointer size / address reach. 8-bit computers generally had a 16-bit address space. The 68000 only had a 24-bit address space. The original ARM only had a 26-bit address space. The modern x86-64 architecture only has a 48-bit virtual address space.
Vectorizing just puts more of the same ALUs in parallel, and forces them to run in lockstep—Single Instruction, Multiple Data (SIMD). It doesn't widen the natural width of the arithmetic. Vector processors have been around for decades. (The TI ASC came out in the early 70s with a 256-bit vector width. But its fundamental integer word was 32 bits.) While you might argue that SIMD widened the effective arithmetic width, I disagree.
Each of those operations, despite being issued in parallel, was a separate operation. How do you compare that to an out-of-order superscalar processor that has an equal ALU bandwidth (say, 8 parallel 32-bit ALUs) that are kept busy by dynamic scheduling algorithms? To call one 256-bit and the other 32-bit really seems to miss the point. And if you do sum up the total ALU bandwidth to determine a bitness, then multiple closely related processors that are binary object code compatible would all get different bitness scores. That seems cross-purpose to the "bitness" measure. There are other measures (FLOPs, SPECint, etc.) that measure total compute performance.
So really, by the more sane and consistent definition of "bitness" I offered above, the arrival of 128-bit machines will be heralded by the arrival of ALUs that perform arithmetic on 128-bit values as fast as any narrower type. And those machines truly seem to be a long ways off.
Now, the size of the address space and the size of the fundamental machine word are decoupled, but not completely; they've generally remained close to each other. Large linear address spaces have taken over. Since the advent of 32-bit processors, addresses have generally stayed smaller than or equal to the width of the processor's ALU. (OK, there were hacks to extend 32-bit addressing at the end, delaying 64-bit. That doesn't really invalidate the argument I'm trying to make.)
Current systems seem to cope well with ~40 bit address spaces (1TB), and heck, I'll throw you a bone and offer that 16TB address spaces represent a good upper bound for today's systems (44 bits). Let's further assume that Moore's Law continues to double transistor densities (and therefore reasonable problem sizes) every 18 months, so that every 3 years, you add 2 bits to that. To get from 44 bits to 64 bits of required address space, by that measure, will take 30 years. That implies we won't need to expand to a larger "bitness" in our basic calculations for another 3 decades.
So please, don't confuse vector length with the size of the widest fundamental operation that defines the baseline performance.
Finally 64-bit support in sight
Posted Apr 25, 2015 6:05 UTC (Sat) by jzbiciak (subscriber, #5246) [Link]
> And if you do sum up the total ALU bandwidth to determine a bitness, then multiple closely related processors that are binary object code compatible would all get different bitness scores. That seems cross-purpose to the "bitness" measure.
And by this, I mean to suggest looking at the current crop of x86 processors (ignoring SSE/AVX/etc), and just compare the number of integer units they have. You have everything from single-issue to quite wide super-scalar issue, but all are called "64 bit machines" because they run x86-64 code.
Both the 486 and Pentium were called 32-bit machines, despite the Pentium being able to issue two 32-bit instructions in parallel. And both could compute addresses in parallel with the main compute, so both were doing more than 32 bits of computation every cycle. You'd be hard pressed to find anyone remotely mainstream that would describe either of these processors as something other than 32-bit processors, though.
Finally 64-bit support in sight
Posted Apr 16, 2015 23:57 UTC (Thu) by DrPhil (guest, #102022) [Link]
>The 2 GiB limit has been removed.
>IDE disk drivers however currently do not support more than 2^28 sectors, i.e. 128GiB.
>The AHCI disk driver supports up to 2^32 sectors, i.e. 2TiB.
Just give it up already
Posted Apr 16, 2015 22:21 UTC (Thu) by HelloWorld (guest, #56129) [Link]
Even if one believes in the idea of multi-server microkernel designs, why bother with the Hurd?
Just give it up already
Posted Apr 16, 2015 23:18 UTC (Thu) by xslogic (guest, #97478) [Link]
I had heard that it's a bit of a bloaty microkernel - not that I've verified it.
Just give it up already
Posted Apr 17, 2015 16:03 UTC (Fri) by HelloWorld (guest, #56129) [Link]
And of course there are dozens of other points, but if you're not convinced at this point, probably nothing will convince you.
Just give it up already
Posted Apr 17, 2015 16:49 UTC (Fri) by kreijack (guest, #43513) [Link]
I was unaware of this before. But... what would the problem be?
Give me an architecture where your statement is true, and show me which *not broken* languages exist on that same architecture.
What I am saying is that, even if C is far from perfect, it is difficult to find a better "medium/low-level" language. Maybe Rust...
And finally, the Hurd's problems are unrelated to the programming language. Otherwise, if C were the problem, how could you explain the success of the Linux kernel?
Just give it up already
Posted Apr 17, 2015 17:49 UTC (Fri) by HelloWorld (guest, #56129) [Link]
This has nothing to do with architectures; it can cause problems even on normal architectures where the NULL pointer is indeed represented as a word with all bits set to zero. The issue is that compiler optimisations can make use of this in order to elide checks. For code such as
memset(&p, 0, sizeof p);
if (foo)
    p = &bar;
if (p) {
    …
}
the compiler is allowed to optimise the if (p) test away, regardless of how the machine actually happens to represent NULL pointers. The problem is that C looks as if it were close to the machine, but it really isn't, because a lot of knowledge (e.g. “I know my machine's add instruction will wrap around on overflow”) just doesn't carry over to C. And most C programmers are completely unaware of this situation.
> Otherwise if C would be the problem, how could you explain the success of the Linux kernel ?
If you throw enough smart people at it you can overcome problems with the language, that's a well-established fact. Hell, tons of software is written in PHP or COBOL; are those languages not problematic in your opinion? There are reasons to choose C, like the availability of tons of skilled programmers and portability to even the most obscure architectures. But that doesn't make C a well-designed language.
Just give it up already
Posted Apr 17, 2015 18:11 UTC (Fri) by scottwood (subscriber, #74349) [Link]
Just give it up already
Posted Apr 17, 2015 18:48 UTC (Fri) by HelloWorld (guest, #56129) [Link]
There's an interesting article about this:
http://blog.regehr.org/archives/213
> What implementation does this?
I don't care about implementations, my problem is that the specification allows this sort of thing. Even if this specific instance doesn't currently happen in practice (I don't know if it does), others do and have led to exploitable bugs in the past. It's too subtle a model for most programmers to comprehend.
Just give it up already
Posted Apr 17, 2015 19:23 UTC (Fri) by kreijack (guest, #43513) [Link]
Sorry, I can't agree. The compiler *is not free* to assume; it depends on the architecture...
And again, could you give us an example of those architectures? And do those kinds of architectures have a better programming language?
We could discuss at length how bad C is, but which language is better (and supports the same architectures...)?
My hope is Rust, but it still has to prove its goodness...
Just give it up already
Posted Apr 17, 2015 20:13 UTC (Fri) by HelloWorld (guest, #56129) [Link]
Just give it up already
Posted Apr 21, 2015 14:19 UTC (Tue) by jwakely (guest, #60262) [Link]
Just give it up already
Posted Apr 17, 2015 19:32 UTC (Fri) by scottwood (subscriber, #74349) [Link]
And implementations matter. Even if one gives up on fully portable C (which is implied by using memset to initialize pointers to NULL, as well as several other things OS kernels do), one can still have "sane implementations of C" or some subset thereof as the target, rather than C itself.
I'm not saying C is great, but this falls a bit short of "broken in almost every imaginable way".
Just give it up already
Posted Apr 17, 2015 20:12 UTC (Fri) by HelloWorld (guest, #56129) [Link]
> or they don't know and thus should treat the NULL-ness as unknown after a memset()
Wrong. C doesn't care what the bits in memory are, that's just not how the C standard works. If you don't believe me, go read the article I've linked to, perhaps you'll believe him.
> I'm not saying C is great, but this falls a bit short of "broken in almost every imaginable way".
You've picked one argument of a list and tried to debunk it (and involuntarily confirmed it). Even if that argument were wrong, there's a bunch of others that you simply ignored. I've mentioned syntax, type system, control structures, modularity, runtime semantics. What remains?
Just give it up already
Posted Apr 17, 2015 20:37 UTC (Fri) by kleptog (subscriber, #1183) [Link]
The way I understand it is that certain operations (like pointer dereferencing) have preconditions to make sense (in that case not-NULLness) and the compiler is allowed to assume the preconditions are true.
For example, when shifting left the compiler is allowed to assume things about the amount being shifted and generate code based on that assumption. This is because at least on the popular x86 architecture the SHL instruction ignores part of the shift so "SHL eax, 32" in fact leaves eax unchanged. No doubt other architectures behave differently. If the compiler is allowed to assume the shifted amount is less than the word size, then it can generate efficient code on both architectures.
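A minimal sketch of that point, assuming a 32-bit unsigned int (the function name is invented for illustration):

unsigned shift(unsigned x, unsigned n)
{
    return x << n;   /* the compiler may assume n < 32 and emit a bare shift */
}

/* shift(1u, 31) is well defined and yields 0x80000000.
 * shift(1u, 32) is undefined: x86's SHL masks the count to 5 bits and would
 * leave x unchanged, ARM's LSL uses the low byte of the count and would yield 0,
 * so the standard defines neither and the compiler assumes it never happens. */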
This leads to the reason why the undefined behaviour exists: if C specified what happened when you added one to the largest integer, it would make it hard to write efficient code on a non-two's-complement architecture. Since the goal is to be able to use C to write efficient programs on any architecture, undefined behaviour is there to give compilers room to manoeuvre.
You may think that since everyone uses two's complement these days such a restriction is crazy, but you can't say there's no reason for it.
Which goes back to your example: there are no operations in your example which require preconditions to make sense. "Definedness" is not a relevant precondition since the if(p) cannot be optimised away whether or not p is defined. Here the compiler is allowed to assume p is defined, so the code works exactly as expected.
Just give it up already
Posted Apr 17, 2015 22:33 UTC (Fri) by khim (subscriber, #9252) [Link]
> I'm not convinced it works that way (but am willing to be educated).
It actually works that way if undefined behavior is involved. Of course, in that particular case there are no undefined behaviors at all. It's implementation-defined behavior, which, of course, works as expected.
C may not be all that pretty, but when someone starts cooking up horror stories not related to reality at all, the simplest thing is to ignore him or her. And it looks like it's time to do that with LWN as a whole: too many guys who don't know what they are blabbing about yet produce dozens upon dozens of useless comments.
> Which goes back to your example: there are no operations in your example which require preconditions to make sense.
Indeed. The memset function is defined in the following way:
> The memset function copies the value of c (converted to an unsigned char) into each of the first n characters of the object pointed to by s.
And of course the “converted to an unsigned char” is there for a reason: strict aliasing rules specifically explain that arrays of chars could alias with anything in a valid program:
> Values stored in non-bit-field objects of any other object type [except for unsigned char] consist of n × CHAR_BIT bits, where n is the size of an object of that type, in bytes. The value may be copied into an object of type unsigned char[n] (e.g., by memcpy); the resulting set of bytes is called the object representation of the value.
> "Definedness" is not a relevant precondition since the if(p) cannot be optimised away whether or not p is defined.
It's kinda relevant: if the compiler discovers that a particular code path triggers undefined behavior in all possible cases, then it starts to assume that variables on that code path are “undefined” and thus return “false” in comparisons no matter what. But of course there are no undefined behaviors in this particular example, which just shows that C-haters rarely understand C.
P.S. Thanks for explaining why undefined behavior is there, how to catch it and why it's not as nasty as people like to pretend.
Just give it up already
Posted Apr 18, 2015 12:42 UTC (Sat) by HelloWorld (guest, #56129) [Link]
Nevertheless I stand by my point that the whole undefined/unspecified/implementation-defined quagmire is too subtle for most programmers to comprehend. There are other examples such as the vulnerabilities in the Linux kernel due to NULL pointer checks being optimised away. I also stand by all my other points, e. g. lack of necessary features, bad syntax, antifeatures like the preprocessor. The fact that one example I've given was wrong doesn't change the big picture.
Just give it up already
Posted Apr 18, 2015 16:50 UTC (Sat) by khim (subscriber, #9252) [Link]
> Of course there had been implementations where null pointer isn't all-bits-zero; you would need more than that to get an undefined behaviour, though - implementation should have trapping pointer representations *and* all-bits-zero should be one of those.
And even that will not make the result into an undefined behavior.
I find it really, really strange that people are talking so much about undefined behavior without a clear understanding of what it is, how it works, and why it's there.
Let's start with the TL;DR first: any language without undefined behaviors is a language which couldn't be used to write portable yet efficient code (and thus is most likely not something you would use for systems programming). Undefined behavior is a natural consequence of the C language's purpose, and C couldn't fulfill that purpose without it. And any other language which tries to fill that same niche will need to have undefined behaviors, too.
─────────────────────
What? How? Why?
Well, let's go back to the beginning. Just what is the C language? The answer is obvious: it's the language developed for the portable OS, UNIX. Thus two main requirements:
1. Programs written in it must work across wildly different architectures (otherwise the OS wouldn't be portable).
2. Programs written in it must be efficient, comparable to hand-written assembler (otherwise nobody would use it for an OS kernel).
And that's it. The combination of these two requirements automatically leads to the “undefined behavior” notion.
Because, you see, if we want to keep property #2 then we need certain assumptions. For example, we couldn't use a pointer after a call to free. And it's easy to see why: sometimes it works, sometimes it doesn't, and to make it work reliably you'd need a quite costly and expensive GC or something like that.
Thus we have the rule: “a good, portable program shouldn't ever use a pointer after a call to the free function.” Note that this is a rule for the software developer, not for the compiler!
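A minimal sketch of the rule being described (not from the comment, just an illustration):

#include <stdlib.h>

int main(void)
{
    int *p = malloc(sizeof *p);
    if (!p)
        return 1;
    *p = 42;
    free(p);
    /* *p = 7;  -- or even just reading *p -- would be the forbidden
     * "use after free": sometimes it appears to work, sometimes it
     * doesn't, and the rule is on the programmer, not the compiler. */
    return 0;
}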
Ok. If we follow that rule then we could easily write portable programs. But then, to make our code faster we could decide to trap NULL accesses and… oops: we really couldn't! There are systems where every bit sequence is a valid pointer (mostly embedded systems). Even if not all addresses are valid, on some other systems it may not be easy to trap a NULL memory access (think of the 8086 in real mode). If we want to keep property #2 then we must constrain the software developer still further: “a good, portable program shouldn't ever use a NULL pointer.” The alternative is awful code where each and every memory access includes an additional check for NULL on these systems! This is, again, a rule for the software developer, not for the compiler!
Now, are we done? Well, what'll happen if someone wants to convert a value bits into a bitmask exactly bits bits long? Surely (1U << bits) - 1U should do just nicely? Oops, no: when we try to create the “maximum mask” it works just fine on, e.g., ARM (where the LSL instruction uses the lowest byte of the count) but fails on Intel (where the SHL instruction uses only the 5 lowest bits of the count). Thus to make our code portable we'll need to impose yet another rule on the programmer: “a good, portable program shouldn't ever use overflowing shifts, not even unsigned ones.” Again: that's what the software developer should avoid, not something the compiler should treat in a certain way.
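One portable way out, as a sketch (assuming the mask is wanted in an unsigned int), is to special-case the full-width shift so the undefined case never happens:

#include <limits.h>

static unsigned mask(unsigned bits)   /* bits <= width of unsigned */
{
    if (bits >= sizeof(unsigned) * CHAR_BIT)
        return ~0u;                   /* full mask without shifting */
    return (1u << bits) - 1u;         /* shift count is now always < width */
}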
And so on. Every time we find a place where a “good, portable program shouldn't ever do X” we add some new restriction on the software developer, to make the resulting portable code faster, of course!
Now, when C was finally standardized, that set of rules was codified into said standard: they are mentioned there under the name of… “undefined behaviors”! Again: these things did not arise from the properties of the CPU (well, they did, but in a very indirect way); they are a codification of the set of laws (they were just rules before, but after the standard listed them they became laws) which the software developer should honour!
Now it's easy to see why it does not matter what a particular implementation does with an all-zero-bits pointer! For “undefined behavior” to do us any good (that is, for software developers to do their homework and write portable programs which avoid it), we need to codify it in a standard first! A non-standard “undefined behavior” makes no sense whatsoever. Well, it could make sense if you are writing code for some “really” exotic architecture, but it's not something you could introduce easily. Because if you do that you are no longer dealing with C! You are dealing with some other language, which means that you could no longer use all that wonderful portable code written in C! You must re-audit everything to make sure that the new, non-standard “undefined behavior” is not triggered anywhere!
─────────────────────
I think the unfortunate shift happened when C became too popular and people started using it to write system-specific code. The list of undefined behaviors, quite understandable and natural to anyone who tries to write portable code, became a nightmare for the developers who want to write system-specific code!
There are some CPUs which couldn't trap memory accesses? Why should I care, Linux does not run on these CPUs anyway!
There are one's complement CPUs? Yeah, maybe, I've never seen one and I certainly don't want to cater for their needs, why couldn't that stupid compiler understand that?
And, indeed, the fact that all rules are always in effect is really tiring. Some rules were relaxed somewhat: e.g. most rules which protect one's complement CPUs don't belong to the list of “undefined behavior” rules; these are “implementation defined” things. This means that if your program relies on them it is no longer 100% portable C, but it's still something you could rely upon on CPUs where two's complement is used. Yes, it'll not work on a system with one's complement, but do you really care?
Indeed, one may argue that C has put too many rules into the “undefined behavior” bucket (which every software developer must avoid like the plague) and too few into the “implementation defined” bucket (which could be used if 100% C portability is not a goal), but what's done is done.
Just give it up already
Posted Apr 18, 2015 17:05 UTC (Sat) by viro (subscriber, #7872) [Link]
If all-bits-zero is such a trap representation for a pointer (which is implementation-defined, not undefined), the effect of memset() will leave you in a state where if (p) will, indeed, trigger undefined behaviour. Conversion from integer is unrelated - it's not required to produce the same value and it _can't_ produce a trap representation by definition - its result is not a stored value of an object at all.
Processor traps are only tangentially related to that - basically, the implementation is not required to avoid them (or any other behaviour) in such a situation. And on a sufficiently weird architecture it might make sense to declare all-bits-zero to be a trap representation for pointers in order to avoid serious overhead.
Just give it up already
Posted Apr 18, 2015 19:09 UTC (Sat) by khim (subscriber, #9252) [Link]
Ah, I see now. Right. Yeah, on such an extremely weird architecture the compiler may remove this whole function completely.
> Conversion from integer is unrelated - it's not required to produce the same value and it _can't_ produce a trap representation by definition - its result is not a stored value of an object at all.
Actually it could produce a trap representation, by definition: “An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation.”
Which means that yes, if all-zero bits are such a trap representation then a simple "int x = 0; void *p = x;" will trigger undefined behavior. Which is a little weird, but makes sense if you recall things like the iAPX 432 or E2K.
This whole exercise just shows that unconditional selection of “extremely wide” portability may not be a good idea for C. It would have been better to add some pragmas to control the most onerous requirements, but that's not our call to make.
Just give it up already
Posted Apr 18, 2015 23:54 UTC (Sat) by dashesy (guest, #74652) [Link]
What C lacks is a compiler that does not silently use undefined behavior for optimization, to get some inflated numbers in comparison charts. If every undefined behavior came with a warning (that could be silenced per-statement or per-function perhaps using a pragma) then pragmatically no one would be bitten by it. So it is a problem with sneaky compilers trying to be too smart.
Just give it up already
Posted Apr 19, 2015 0:49 UTC (Sun) by khim (subscriber, #9252) [Link]
> If every undefined behavior came with a warning (that could be silenced per-statement or per-function perhaps using a pragma) then pragmatically no one would be bitten by it.
…because no one would ever use it. Do you really think that anyone would use a language where something “extremely awful” like “a[i] = b[i] + c[i]” would need a dozen pragmas (and a dozen is most likely an underestimate)? That's just impractical: pragmatically you'll not be able to find the actual code behind all these pragmas. Even if you could attach that dozen of pragmas to a function and not to that one line.
Just give it up already
Posted Apr 19, 2015 1:21 UTC (Sun) by dashesy (guest, #74652) [Link]
BTW, you should not underestimate the power of wiggly lines and them showing in code preview, since I started using Pycharm it has become a duty for me to get rid of them all whenever I see one.
Just give it up already
Posted Apr 19, 2015 2:05 UTC (Sun) by khim (subscriber, #9252) [Link]
> BTW, you should not underestimate the power of wiggly lines and them showing in code preview, since I started using Pycharm it has become a duty for me to get rid of them all whenever I see one.
Wiggly lines are great, but they only work because they are shown not on every line which could trigger undefined behavior but only on lines where the “code stinks especially badly”. Think of the aforementioned “a[i] = b[i] + c[i]” example. You could use dangling pointers, or you could use an index which goes beyond the end of an array, or you could use different types which violate the “strict aliasing” rules, or you could have overflow or underflow, or you could access a trap representation in memory, or you could dereference NULL there, or any of these four variables could be uninitialized, etc, etc.
The trouble is: if the compiler or IDE dutifully lists all these troublesome possibilities you'll scream bloody murder: most of these are not even actively kept in mind, yet are satisfied by the developer automatically. I shouldn't try to use uninitialized variables? Duh! Yeah, I know. I shouldn't try to reach beyond the beginning of an array? Of course! And so on. You check for them subconsciously because they are “kinda natural and obvious”; wiggly lines are not needed. And then you miss one rule out of dozens because for you it's not “kinda natural and obvious”! Sometimes you even do that because you know too much… yet not enough. E.g. you may know that there are a couple of words before a malloc-allocated array used by the OS for book-keeping purposes and decide to leave a potential b[-1] access in place because you know that the randomly picked result will not matter, yet forget that such an access by itself triggers undefined behavior, which means that the compiler could then remove that function from your program! Oops: you have a problem.
It's relatively easy to find and show all possible undefined behaviors in a given program. But if you want to see a somewhat useful list instead of thousands upon thousands of mostly useless complaints… that's not easy to do. Because what may look “kinda natural and obvious” to you may be not all that natural or obvious to me (or the other way around)!
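To make the point concrete, here is a sketch (function and parameter names invented) of how many separate rules that single innocuous statement quietly relies on:

void add_one(int *a, const int *b, const int *c, int i)
{
    a[i] = b[i] + c[i];
    /* Defined only if:
     *  - a, b and c are non-NULL and not dangling (no earlier free),
     *  - i is within the bounds of all three arrays,
     *  - b[i] and c[i] hold initialized values, not trap representations,
     *  - the memory isn't also accessed through an incompatible type
     *    (strict aliasing),
     *  - b[i] + c[i] does not overflow the range of int.
     * The compiler assumes all of this and warns about almost none of it. */
}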
Just give it up already
Posted Apr 22, 2015 8:55 UTC (Wed) by michaeljt (subscriber, #39183) [Link]
That has got me curious - what is wrong with that line?
Just give it up already
Posted Apr 22, 2015 8:58 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link]
Just give it up already
Posted Apr 22, 2015 9:42 UTC (Wed) by michaeljt (subscriber, #39183) [Link]
Just give it up already
Posted Apr 22, 2015 9:05 UTC (Wed) by michaeljt (subscriber, #39183) [Link]
Or what about a compiler which was smarter about when to optimise and when not? We learn as human programmers that optimising non-critical code usually does more harm than good, but (please correct me if I am wrong, which I am quite often) it seems to me that compilers happily assume that they should do every optimisation they can. I would expect that compiler authors clever enough to create those optimisations would also be capable of analysing code to spot which places are worth optimising, though user input might help too. They could then warn more diligently about potential problems in those areas and add stricter run-time checks and better debugger-friendliness to the others.
As an added bonus this might even improve build times if the additional analysis was not too expensive.
Just give it up already
Posted Apr 21, 2015 14:29 UTC (Tue) by jwakely (guest, #60262) [Link]
But "enum { E = 0 }; void* p = E;" will not, because unlike x, E is an integer constant expression.
Just give it up already
Posted Apr 21, 2015 22:22 UTC (Tue) by nix (subscriber, #2304) [Link]
Just give it up already
Posted Apr 19, 2015 23:24 UTC (Sun) by roc (subscriber, #30627) [Link]
Rust shows that it's possible to prevent use-after-free bugs in the language without requiring GC, in almost all cases.
We've learned a lot about language design, compile-time checking and optimization over the last 30 years, and at the same time hardware behaviors have become a lot less diverse; you don't see non-2's-complement machines anymore, or machines where "char" isn't 8 bits, or machines where NULL isn't all zero bytes, and even big-endian hardware is on the way out. This means we can now have languages that are just as fast and portable as C while eliminating most or all of the undefined behaviors.
Just give it up already
Posted Apr 21, 2015 14:30 UTC (Tue) by jwakely (guest, #60262) [Link]
The compiler is only free to choose if it knows memset(&p, 0, sizeof p) definitely doesn't produce a valid pointer value for that particular implementation, not just because it might produce an invalid pointer value for some implementations.
Just give it up already
Posted Apr 17, 2015 21:05 UTC (Fri) by mbunkus (subscriber, #87248) [Link]
The C standard has nearly 200 cases (!!) of undefined behavior. Undefined behavior means that the standard basically says »anything goes, nothing is guaranteed«. So if the compiler encounters a case of undefined behavior it is NOT under any obligation to produce anything valid. In fact, it is not under any obligation to produce any code for such statements at all! Read the link HelloWorld has provided, it's very educational for non-C experts.
Now for the case of optimizations. The compiler's goal with optimizations is to make the code as fast as possible while still being compliant with the C standard. If it encounters code which may (or may not) have undefined behavior it has to consider two cases:
1. Due to corner cases in the input value the statement has undefined behavior. In this case (like I said above) the compiler has no obligation to do anything, no restrictions at all.
2. Due to sane input values the code has defined behavior. In this case there are obligations and restrictions.
With the goal of producing fast code in mind the compiler can now completely ignore case 1 and optimize for 2 only – even if the input values turn out to lead to case 1.
And this is the step where things like removal of checks for NULL pointer values come from.
Again, please read the article HelloWorld linked to. It contains a nice null pointer dereference in the kernel due to undefined behavior, even though a human might think that such a case could never be compiled that particular way.
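The kernel bug described in that article follows roughly this shape (simplified sketch, hypothetical struct and names):

struct device {
    int flags;
};

int get_flags(struct device *dev)
{
    int flags = dev->flags;   /* dereference first: undefined if dev is NULL */
    if (!dev)                 /* so the compiler may delete this check... */
        return -1;            /* ...and this early return along with it */
    return flags;
}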
Anyway, HelloWorld's main point was that C is a badly designed language. All this discussion proves this point! Way too few people actually understand how undefined behavior works and why it is so dangerous; even fewer can recognize undefined behavior reliably. It's incredibly hard for us humans to detect it, even in code reviews (the linked article contains a VERY nice example of six reviewers not catching one such case!). If that isn't a sign of bad design then I don't know what is.
Just give it up already
Posted Apr 17, 2015 22:56 UTC (Fri) by khim (subscriber, #9252) [Link]
> The C standard has nearly 200 cases (!!) of undefined behavior.
This sounds awful till you try to recall why they are there and why people [relatively] rarely are bitten by them. E.g. if you shift an int by 32 bits then the result is undefined, yes, but why? It's easy: because different CPUs give you different results! If you use SHL on Intel the value will be unchanged, but if you use LSL on ARM then the value will be zero. That's why a portable program shouldn't ever do that and that's why the compiler may rely on it.
Oh, and that's why an attempt to show how awful these undefined behaviors are usually leads to embarrassment: either you produce a program which works and works correctly (as the program under discussion does), or you produce a program which does something stupid that couldn't be efficiently implemented on all platforms (e.g. the aforementioned shift by 32), and then the obvious reaction is “don't do that, duh”.
> Anyway, HelloWorld's main point was that C is a badly designed language. All this discussion proves this point!
Really? How? It starts with a completely correct program which was deemed incorrect for no good reason, and then scary links were added and other stuff which just shows that the topic starter does not understand how C works. A FUD campaign at its best. If anything it shows how pathetic the topic starter is, nothing more.
Well, sure, if you want to use C you need to know C, but that's kinda true for all other languages, too.
Yes, some “undefined behaviors” are really subtle (like the aliasing rules), but that's why most sane compilers have a way to define them (options such as -fno-strict-aliasing, -fwrapv and so on).
P.S. Strict aliasing rules are actually easy to understand and learn if you recall the history of computing. Think of the good old 8086 and the good old 8087. These two chips worked in parallel on top of the same memory. Which meant that, naturally, if you tried to access the same piece of memory in two adjacent lines of code as float and long you could easily have gotten garbage, because one of them could easily have been “too fast” or “too slow”. To circumvent that you needed specialized functions (memcpy) which included a special wait instruction which guaranteed that everything would be properly synchronized. And that's how you are supposed to write portable code even today! End of story.
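A minimal sketch of that discipline (not from the comment; assumes float and unsigned have the same size):

#include <string.h>

unsigned float_bits_bad(float f)
{
    return *(unsigned *)&f;       /* type-punning cast: violates strict aliasing */
}

unsigned float_bits_ok(float f)
{
    unsigned u;
    memcpy(&u, &f, sizeof u);     /* copying the bytes is the sanctioned way */
    return u;
}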
Just give it up already
Posted Apr 17, 2015 19:36 UTC (Fri) by kreijack (guest, #43513) [Link]
Could you give me an example of software as big as, as complex as, and with as many installations as the Linux kernel written in PHP and/or COBOL?
I am sure that the success of the Linux kernel depends on a lot of factors, *all necessary*:
- smart people
- it reached a critical mass which attracted a lot of interest from industry when it was young
- it is still capable of growing
- it is based on good tools (the C language, the gcc compiler, the git VCS...)
> There are reasons to choose C, like availability of tons of skilled
> programmers and portability to even the most obscure architectures.
> But that doesn't make C a well-designed language.
This doesn't mean that C is a bad language
Just give it up already
Posted Apr 17, 2015 20:27 UTC (Fri) by HelloWorld (guest, #56129) [Link]
> This doesn't mean that C is a bad language
Sure, *that* doesn't. The reasons I pointed out earlier do.
Just give it up already
Posted Apr 17, 2015 21:58 UTC (Fri) by peter-b (subscriber, #66996) [Link]
Facebook would be a good example — it's all PHP.
Just give it up already
Posted Apr 18, 2015 6:01 UTC (Sat) by b7j0c (guest, #27559) [Link]
and even the view layer is only PHP in the syntax...facebook has written two major alternative execution platforms for PHP - HipHop and HHVM. HipHop was an attempt to do source code translation into native executables (i.e., translate PHP to C/C++)...HHVM is a JIT runtime. In either case, the "stock" PHP runtime is not used. afaik the stock PHP runtime has not been in use there for many years.
to address issues with PHP's syntax, they are also developing Hack, which is basically PHP with some extra type annotation features that allow tooling to do some cursory static analysis
if you want to talk about the biggest "pure" (stock) PHP deployment, it is easily wordpress.
Just give it up already
Posted Apr 18, 2015 6:54 UTC (Sat) by kreijack (guest, #43513) [Link]
Do you have any data?
* as big: the Linux kernel source has about 20,000,000 LOC; Facebook?
* as complex: for this I don't have any metric; the number of subsystems/drivers could give an idea... but Facebook?
* as installed: even not counting the number of installations (think only of all the Android devices, routers...), Linux is installed on a great variety of hardware (as a shorter list: MIPS, PowerPC, x86, ARM, both in 32-bit and 64-bit, but there are about 30 directories under arch/)... Facebook?
I don't know Facebook (really, I don't have an account); I consider managing all the Facebook users a very complex job... but I suspect that it still is not comparable to the Linux kernel.
Just give it up already
Posted Apr 17, 2015 21:27 UTC (Fri) by PaXTeam (guest, #24616) [Link]
that's wrong, the compiler cannot optimize the test away unless it can prove that 'foo' is always true. think about it this way: if you swap memset out for an arbitrary function that takes a pointer to 'p', then the compiler cannot in general know what that function will do to 'p' therefore it cannot assume anything about its value in the test unless it can prove something about 'foo'.
Just give it up already
Posted Apr 17, 2015 21:50 UTC (Fri) by HelloWorld (guest, #56129) [Link]
Just give it up already
Posted Apr 17, 2015 23:12 UTC (Fri) by khim (subscriber, #9252) [Link]
> More specifically, it might well know that memsetting a pointer yields something that is neither NULL nor a valid pointer and thus invokes undefined behaviour when inspected by an if statement.
It couldn't invoke undefined behaviour: “An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation.”
The result could be a valid or invalid pointer, it could be a pointer which couldn't be used at all, but it's still an implementation-defined result, not an undefined result.
P.S. “As previously specified” refers to the explanation that an integer constant expression with the value 0 can be converted to a pointer and will produce a null pointer. Note the “constant” in the requirement: it's there to cope with architectures where the null pointer is not all zero bits, but something different.
Just give it up already
Posted Apr 18, 2015 0:28 UTC (Sat) by viro (subscriber, #7872) [Link]
Just give it up already
Posted Apr 18, 2015 7:03 UTC (Sat) by kreijack (guest, #43513) [Link]
http://stackoverflow.com/questions/2597142/when-was-the-n...
Just give it up already
Posted Apr 18, 2015 7:37 UTC (Sat) by viro (subscriber, #7872) [Link]
Again, it's about the situation when _fetching_ the pointer from memory would let nasal daemons fly on invalid bit combinations and when all-zeroes happens to be such a combination. Of course, it can't happen if all-zeroes is used for null pointers, but simply having something else used to represent null pointer doesn't guarantee that it happens.
Just give it up already
Posted Apr 18, 2015 16:54 UTC (Sat) by khim (subscriber, #9252) [Link]
Hit the wrong “reply” button and my answer was posted elsewhere. Sorry.
Just give it up already
Posted Apr 19, 2015 23:05 UTC (Sun) by nix (subscriber, #2304) [Link]
Just give it up already
Posted Apr 18, 2015 12:07 UTC (Sat) by HelloWorld (guest, #56129) [Link]
However I don't quite understand your explanation. Integers may be converted to pointer types in an implementation-defined manner, i. e. something like "int i = 42; void *p = i;" yields an implementation-defined pointer. But how does that relate to the representations of pointers? After all memset messes around with the representation. When you memcpy a float to an int, you'll also get a different result than if you assign it.
Just give it up already
Posted Apr 18, 2015 15:38 UTC (Sat) by khim (subscriber, #9252) [Link]
> When you memcpy a float to an int, you'll also get a different result than if you assign it.
That's because there is a special conversion defined for that case (6.3.1.4 in C11: Conversions/Real floating and integer).
> After all memset messes around with the representation.
It takes the object representation of a zero unsigned char and stuffs it into a pointer (or any other object). That's a valid thing to do, for pointers at least. According to 6.2.6.2 Integer types:
> For unsigned integer types other than unsigned char, the bits of the object representation shall be divided into two groups: value bits and padding bits (there need not be any of the latter). If there are N value bits, each bit shall represent a different power of 2 between 1 and 2^(N−1), so that objects of that type shall be capable of representing values from 0 to 2^N − 1 using a pure binary representation; this shall be known as the value representation. The values of any padding bits are unspecified. (Some combinations of padding bits might generate trap representations, for example, if one padding bit is a parity bit. Regardless, no arithmetic operation on valid values can generate a trap representation other than as part of an exceptional condition such as an overflow, and this cannot occur with unsigned types. All other combinations of padding bits are alternative object representations of the value specified by the value bits.)
Any sequence of bits is an integer (it may be a trap representation, but it's still not undefined behavior). You could copy any object to any other object of the same size in C. See 6.2.6 Representations of types, General:
> Values stored in non-bit-field objects of any other object type [except for unsigned char] consist of n × CHAR_BIT bits, where n is the size of an object of that type, in bytes. The value may be copied into an object of type unsigned char[n] (e.g., by memcpy); the resulting set of bytes is called the object representation of the value.
Since any bit sequence is an integer (it may be a trap representation, but it's still not undefined behavior) and you could copy any integer into a pointer (again, it could be a trap representation, but it's still not undefined behavior), you could stuff literal zeros in there, too. The result is implementation-specific, of course.
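A tiny sketch of the difference being discussed, assuming float and int are both 32 bits and IEEE 754 floats:

#include <stdio.h>
#include <string.h>

int main(void)
{
    float f = 1.0f;

    int converted = f;             /* value conversion (6.3.1.4): yields 1 */

    int raw;
    memcpy(&raw, &f, sizeof raw);  /* object representation: 0x3f800000 here */

    printf("%d vs 0x%x\n", converted, (unsigned)raw);
    return 0;
}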
Just give it up already
Posted Apr 17, 2015 23:20 UTC (Fri) by PaXTeam (guest, #24616) [Link]
Just give it up already
Posted Apr 17, 2015 17:49 UTC (Fri) by gb (subscriber, #58328) [Link]
It's complete b.....t to say that C is bad language to build kernel.
Just give it up already
Posted Apr 17, 2015 17:54 UTC (Fri) by HelloWorld (guest, #56129) [Link]
> It's complete b.....t to say that C is bad language to build kernel.
Learn english.
Just give it up already
Posted Apr 17, 2015 18:38 UTC (Fri) by zorro (subscriber, #45643) [Link]
Just give it up already
Posted Apr 17, 2015 18:50 UTC (Fri) by HelloWorld (guest, #56129) [Link]
Just give it up already
Posted Apr 18, 2015 6:09 UTC (Sat) by b7j0c (guest, #27559) [Link]
furthermore, Facebook has the funds to support the massive testing and tooling infrastructure required to make a shoddy tool like PHP safe and reliable enough. with enough QA and driven developers, even PHP can be made to produce something of acceptable quality
the Egyptians built the pyramids with manual labor. with the right incentives and a mountain of cash, it's possible to produce parts of Facebook with junk like PHP
Just give it up already
Posted Apr 18, 2015 17:22 UTC (Sat) by tjc (guest, #137) [Link]
I'm sure I've mentioned this before, but since you seem compelled to post the same rant every year or so, I will state it again for completeness. :)
The example cited above works for declarations, but not for expressions; the postfix indirection operator will conflate with the infix multiplication operator in some not-too-obscure expressions, and it's not easy to get around this since there is no limit on indirection. (I'm assuming you don't want to design a language that requires more than one look-ahead symbol to parse.) You could change the lexeme from '*' to something else, but then you end up with awkward cast expressions, such as ((int *)p)@, since postfix operators have higher precedence than prefix operators. You could make the type cast operator postfix (or at least something other than pre-outfix), but try that and see how it looks....
Just give it up already
Posted Apr 19, 2015 9:52 UTC (Sun) by HelloWorld (guest, #56129) [Link]
Anyway, it doesn't even matter. Maybe my proposal has its own problems, but that doesn't make C any better.
> You could change the lexeme from '*' to something else, but then you end up with awkward cast expressions, such as ((int *)p)@, since postfix operators have higher precedence than prefix operators.
Well-written code doesn't have many casts anyway (though I suppose there's more of them in C due to the lacking type system).
Just give it up already
Posted Apr 24, 2015 22:10 UTC (Fri) by HelloWorld (guest, #56129) [Link]
Just give it up already
Posted Apr 20, 2015 20:35 UTC (Mon) by sorokin (subscriber, #88478) [Link]
I disagree with a rant about undefined behavior. What is the alternative to undefined behavior? Missed optimizations?
In every project I worked on, a lot of effort was spent optimizing the program, because usually there is no such thing as "fast enough": if your program is fast enough, just load more data into it. There is constant pressure from users who want to handle larger and larger data sets. Performance does matter, and as time goes on optimizations are harder and harder to make. For example, in my current project it is difficult to get a 5% speed-up on critical algorithms. There are a few 1-2% speed-ups that are relatively easy to achieve, and I believe somebody will do them soon.
If you suggest that I drop undefined behavior and lose 1% of performance, I would say no, because this 1% costs you the time spent optimizing the program. I cannot stress enough that writing a program is simple; writing a fast program is incredibly difficult.
Undefined behavior is not a bad thing per se. What is a bad thing is that it is not possible to check for it. Fortunately the situation has now changed: compilers have got {address,memory,thread,...} sanitizers. I don't think you can argue that defining a behavior that is already checked by sanitizers will save debugging time. Ideally all undefined behaviors should be checked by sanitizers, and I think the number of undefined behaviors checked by sanitizers will gradually increase. The most problematic undefined behaviors occurring in real programs are checked already.
Sometimes somebody says that undefined behavior is bad for correctness (or for security), because it can drop bounds checks/overflow checks and so on. These people are nuts. Programs must be tested. Bounds checks/overflow checks/anything should be tested. -fno-strict-aliasing or -fwrapv or -fno-delete-null-pointer-checks in the name of security is insanity. All code shipped to users must be extensively tested (automatically, of course) and fuzzed.
What I want is more optimizations. If it were possible to add a new undefined behavior and speed up the program by 1%, I'm for it. For example, make unsigned arithmetic overflow undefined (it is a bad example, because it won't give 1%, but let's assume it does). Let's make unsigned overflow undefined, let's add functions wrap_add/wrap_sub/... for the cases when wrapping behavior is desired (very rare cases), and let's add checks for wrapping to the undefined-behavior sanitizer, so the user can catch all cases where wrapping occurs and use the wrap_xxx functions there. Profit: the program is faster. The example with unsigned overflow is artificial, but sometimes after reading the code generated by the compiler I wish I could convey aliasing information to the compiler, allowing it to reorder and LICM code more aggressively.
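wrap_add here is the commenter's hypothetical primitive; purely as a sketch, it could be spelled today by doing the arithmetic in a wider type and reducing explicitly:

#include <stdint.h>

/* Explicitly wrapping 32-bit addition: the widening makes the modulo
 * reduction visible instead of relying on implicit unsigned wrap-around. */
static inline uint32_t wrap_add(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a + b) & 0xffffffffu);
}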
Just give it up already
Posted Apr 21, 2015 10:13 UTC (Tue) by jezuch (subscriber, #52988) [Link]
There's ubsan which I guess does what you want (I haven't used it).
Just give it up already
Posted Apr 21, 2015 10:59 UTC (Tue) by mathstuf (subscriber, #69389) [Link]
Just give it up already
Posted Apr 21, 2015 12:31 UTC (Tue) by zmower (subscriber, #3005) [Link]
What if I told you my strongly typed preferred language doesn't have a pre-processor, has modules (specs and bodies so you can change a generic body and not have to recompile all the client code), does most of the bounds checking at compile time and has parallel processing and synchronisation primitives baked into the language. For thirty years (sigh).
Just give it up already
Posted Apr 21, 2015 15:21 UTC (Tue) by peter-b (subscriber, #66996) [Link]
Then I'd assume you were referring to Ada...
Just give it up already
Posted Apr 22, 2015 7:16 UTC (Wed) by zmower (subscriber, #3005) [Link]
For example I recently replaced buggy shared memory and semaphores C code with an Ada task. The former was a nightmare to maintain, the latter compact and readable.
Just give it up already
Posted Apr 21, 2015 22:55 UTC (Tue) by HelloWorld (guest, #56129) [Link]
> If you suggest me to drop undefined behavior and lose 1% of performance I would say no.
And the rest of the world would say “hell yes”. We've traded performance for robustness before. Nobody is using computers without memory protection any more even though it does have overhead.
> These people are nuts. Programs must be tested. Bound checks/overflow checks/anything should be tested.
Writing an automated test that exposes a bug requires exactly the same set of skills that are required to avoid creating the bug in the first place. And that's why tests offer virtually no protection against bugs; they protect against regressions. „Program testing can be a very effective way to show the presence of bugs, but it is hopelessly inadequate for showing their absence.“ – this classic Dijkstra quote is as true today as it always was. Fuzzing is in the same category, it doesn't really prove anything.
Just give it up already
Posted Apr 21, 2015 23:04 UTC (Tue) by pizza (subscriber, #46) [Link]
Excuse me? Protected memory doesn't apply in kernel space.
Can we just agree that different problem domains require different tradeoffs between performance and safety, and there's no such thing as one thing to do everything?
Just give it up already
Posted Apr 22, 2015 7:56 UTC (Wed) by HelloWorld (guest, #56129) [Link]
Just give it up already
Posted Apr 24, 2015 10:52 UTC (Fri) by sorokin (subscriber, #88478) [Link]
This works only for simple cases. Try to write something a bit more complex. For example a doubly linked list or a boost::intrusive::list.
> And the rest of the world would say “hell yes”.
For some reason memcpy and memset in libc are still written in assembler and are optimized for different processor architectures. Why not rewrite them in C? The answer is that people don't care about implementation details, as long as the library works fast and correctly.
> Nobody is using computers without memory protection any more, even though it does have overhead.
The cost of memory protection in the processor is very small. It costs you something if you bounce your data between processes, but people who care about performance don't do that in the first place. Virtual memory doesn't slow down the fast path.
> Writing an automated test that exposes a bug requires exactly the same set of skills that are required to avoid creating the bug in the first place.
Wrong. Please read about fuzzing and the results it gives. For example a report about fuzzing ffmpeg: http://googleonlinesecurity.blogspot.co.uk/2014/01/ffmpeg...
Writing a test is significantly simpler than writing a correct program.
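To make that concrete, here is what a fuzz target can look like (a sketch assuming LLVM's libFuzzer, whose entry point is LLVMFuzzerTestOneInput; parse_packet is a made-up toy parser standing in for real code under test):

#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy parser standing in for real code under test: it trusts a length
   byte taken from the input, which is exactly the kind of bug fuzzing
   finds quickly (the 16-byte buffer can be overrun). */
static int parse_packet(const uint8_t *buf, size_t len)
{
    uint8_t header[16];
    size_t claimed;

    if (len < 1)
        return -1;
    claimed = buf[0];                 /* attacker-controlled length */
    if (claimed > len - 1)
        claimed = len - 1;
    memcpy(header, buf + 1, claimed); /* overruns when claimed > 16 */
    return (int)claimed;
}

/* libFuzzer entry point: called repeatedly with mutated inputs. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    parse_packet(data, size);
    return 0;
}

Built with something like clang -g -fsanitize=fuzzer,address, this runs mutated inputs until one of them trips the broken bounds handling; the harness itself is a handful of lines, which is the sense in which writing the test is simpler than writing the correct program.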
Just give it up already
Posted Apr 24, 2015 13:15 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link]
Intrusive lists are not a problem at all: Rust supports smart pointers (autoderef).
> For some reason memcpy and memset in libc are still written in assembler and are optimized for different processor architectures.
I'm using musl libc for some of our software now. It has a simple C-based memcpy - it works fast enough that our programs are not slower in any measurable way.
> The cost of memory protection in the processor is very small. It costs you something if you bounce your data between processes, but people who care about performance don't do that in the first place. Virtual memory doesn't slow down the fast path.
It certainly does. TLB misses are expensive, and the number of TLB slots is not going to get any greater.
> Writing a test is significantly simpler than writing a correct program.
Fuzzing doesn't detect all errors.
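Returning to the memcpy point above, the "simple C" approach looks roughly like this (a naive byte-at-a-time sketch under an assumed name, not musl's actual implementation, which also adds word-sized copy paths):

#include <stddef.h>

/* Plain-C memcpy sketch: correct, portable, and often "fast enough",
   which is the point being made about musl above. */
void *my_memcpy(void *restrict dest, const void *restrict src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    while (n--)
        *d++ = *s++;
    return dest;
}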
Twisty little (word) passages
Posted Apr 16, 2015 23:22 UTC (Thu) by dowdle (subscriber, #659) [Link]
Just give it up already
Posted Apr 16, 2015 23:25 UTC (Thu) by rahvin (subscriber, #16953) [Link]
I don't care for microkernels either but you have got to admit they've been successful when you have enough money and manpower to work around the problems with the design.
Just give it up already
Posted Apr 17, 2015 0:04 UTC (Fri) by HelloWorld (guest, #56129) [Link]
Just give it up already
Posted Apr 17, 2015 0:50 UTC (Fri) by brouhaha (subscriber, #1698) [Link]
> Considering Linux is the only real monolithic kernel in existence

Does that mean that xBSD and Illumos (Solaris) don't use monolithic kernels any more? Or does it mean that they aren't "real"? (I haven't followed either very closely in recent years, so I honestly don't know what they've been up to.)
Just give it up already
Posted Apr 17, 2015 9:08 UTC (Fri) by HelloWorld (guest, #56129) [Link]
Just give it up already
Posted Apr 18, 2015 8:56 UTC (Sat) by justincormack (subscriber, #70439) [Link]
Just give it up already
Posted Apr 17, 2015 0:53 UTC (Fri) by BenHutchings (subscriber, #37955) [Link]
> Considering Linux is the only real monolithic kernel in existence I'd say the Microkernel's won the war. Microsoft sort of proved you could make one and have it functional enough to be a major success in both business and consumer markets.
Microkernels consist of a scheduler, VM, and IPC, and little else. NT and XNU never looked anything like that.
Just give it up already
Posted Apr 17, 2015 3:21 UTC (Fri) by roc (subscriber, #30627) [Link]
Between Windows and a Linux desktop system, *Linux* is the one that looks more like a microkernel.
Just give it up already
Posted Apr 17, 2015 9:01 UTC (Fri) by ledow (guest, #11753) [Link]
IIS does HTTP parsing INSIDE THE KERNEL.
MS software is nowhere *near* a microkernel architecture whatsoever.
If anything, the contrary has been proven: microkernels are niche, and if you want a real, live, practical, usable system that developers can work on in the real world, that world is dominated almost entirely by monolithic kernels.
Just give it up already
Posted Apr 17, 2015 9:32 UTC (Fri) by khim (subscriber, #9252) [Link]
HTTP parsing could be justified: it's a performance-critical piece of code, and once upon a time Linux had something similar.
But when you stuff font parsing into your microkernel, I don't even know what to say. It's not “everything-and-a-kitchen-sink-in-a-kernel”. That's “the-whole-bathhouse-in-a-kernel-and-maybe-the-neighbouring-shops-too”.
I guess salespeople could still sell Windows as a “microkernel”, but I really hoped that such people wouldn't post on LWN.
Just give it up already
Posted Apr 18, 2015 21:07 UTC (Sat) by jengelh (subscriber, #33263) [Link]
Just give it up already
Posted Apr 19, 2015 0:56 UTC (Sun) by khim (subscriber, #9252) [Link]
Not really. Kernel Mode Linux gives you the ability to execute [almost] any program in kernel mode, but it does not, by itself, come with any programs which must be executed in the kernel or are typically executed in the kernel. Yet most Windows systems include things like HTTP servers and font parsers in their itsy bitsy teenie weenie microkernel!
Just give it up already
Posted Apr 17, 2015 10:30 UTC (Fri) by Wol (guest, #4433) [Link]
The problem with elegant maths is that it is often inelegant engineering. That's true of microkernels, and just as true of relational DBMSs :-)
Put an engineer in charge and you get a solution that works in practice if not in theory. Put a mathematician (a Comp Sci?) in charge and you get a solution that works in theory but is cobbled together with baling wire and duck tape in practice.
Cheers,
Wol
Just give it up already
Posted Apr 17, 2015 12:20 UTC (Fri) by HelloWorld (guest, #56129) [Link]
> The problem with elegant maths is that it is often inelegant engineering. That's true of microkernels, and just as true of relational DBMSs :-)
>
> Put an engineer in charge and you get a solution that works in practice if not in theory. Put a mathematician (a Comp Sci?) in charge and you get a solution that works in theory but is cobbled together with baling wire and duck tape in practice.
That's just bullshit anti-intellectual ranting.
Just give it up already
Posted Apr 17, 2015 20:59 UTC (Fri) by Wol (guest, #4433) [Link]
Is it? There are far too many fancy mathematical solutions with no solution in a reasonable amount of time. After all, doesn't cryptography rely on the fact that some problems, even though the solution is simple to *describe*, take far too long to calculate? If I have someone's public RSA key, that gives me EVERYTHING I need to decipher every message they send or receive. It's just that to decode those messages without the private key would take so long it's usually not worth trying.
You know I rail at RDBMSs. But they are an engineering nightmare. The SLOWEST part of a computer is disk access. So why the f do relational databases insist on breaking up data into tiny little chunks to scatter all over the disk!?!?!? What's that quote again? From the Pick FAQ, "relational databases optimise the easy task of finding data in memory. Pick databases optimise the hard task of getting it into memory in the first place". And who on earth thought that having a database that could only store sets was a good idea ... nearly all data comes as bags or lists, which an RDBMS can't store. They have to be modelled, which is a massive unnecessary cost. Einstein said "make it as simple as possible, BUT NO SIMPLER". Far too many mathematical solutions have far too simple axioms, pushing loads of unnecessary complexity into the engineering solution.
imho, that's why Linux is such a great product - Linus is an engineer, and when presented with a load of maths "proving" that something is a great idea, he just comes back with a "does it work?". All too often, it doesn't! And that's not a complaint against the mathematicians - all too often their work is perfectly okay within stated constraints. But people who don't understand the maths promptly ignore the constraints and apply it where it was never supposed to work.
Cheers,
Wol
Just give it up already
Posted Apr 17, 2015 23:11 UTC (Fri) by HelloWorld (guest, #56129) [Link]
I also completely fail to see what microkernels are supposed to have to do with math. The only reason people are interested in microkernels is that the people who write the stuff aren't able to get it right most of the time, and that, in order to minimize the impact of their mistakes, the components thus need to be isolated from each other. It doesn't really get any more engineery than that; after all, there's no reason in principle one couldn't write a bug-free monolithic kernel.
Just give it up already
Posted Apr 18, 2015 15:47 UTC (Sat) by Wol (guest, #4433) [Link]
> No sane mathematician would claim that RSA is somehow broken because it's possible to decrypt it if you have infinite amounts of time. Everyone understands that what matters is the complexity of an algorithm that does it, and if the runtime complexity is exponential then you can't consider it broken. You put words into people's mouths, that does qualify as bullshit.
But you miss my point completely! I said that I don't blame the mathematician when OTHERS use the maths outside its limits. What's there to stop a PHB from designing a system that relies on the existence of a solution to RSA? Yes the mathematicians will laugh him out of court, but will there be any mathematicians with the authority to over-rule him?
And I can give you a real-life example of exactly this, one that killed a space crew! The Challenger disaster. We all know the O-ring failed because it was too cold. But that *shouldn't* have mattered! If the rocket had been properly circular, the O-ring would have fitted snugly, and it wouldn't have leaked. But the rocket recycling plant PHBs knew that a circle has a constant diameter, so they decreed that if the diameter as measured in certain places was constant, then the rocket would pass muster. Read Dick Feynman's report - a rocket could pass the test for being circular, and yet be seriously out of true. And it killed seven people :-( because people abused the maths.
Another, in this case perfectly excusable, example is Galileo vs Copernicus. Modern people can't understand why it wasn't blatantly obvious that Galileo was right. We forget that you need *ellipses* to model planetary orbits, and Kepler only discovered the relevant maths a century *after* Galileo. So Galileo used the *wrong* maths, that of circles, and people didn't believe him because his maths was a mess (but it wasn't his fault).
Cheers,
Wol
Just give it up already
Posted Apr 18, 2015 15:55 UTC (Sat) by Wol (guest, #4433) [Link]
But the PHB's definition of a circle was so bad it left gaping holes that the o-ring had no chance of sealing. Hence the fatal leak.
Cheers,
Wol
Just give it up already
Posted Apr 18, 2015 16:30 UTC (Sat) by viro (subscriber, #7872) [Link]
Just give it up already
Posted Apr 18, 2015 16:41 UTC (Sat) by viro (subscriber, #7872) [Link]
Just give it up already
Posted Apr 19, 2015 23:11 UTC (Sun) by nix (subscriber, #2304) [Link]
Just give it up already
Posted Apr 20, 2015 13:57 UTC (Mon) by Wol (guest, #4433) [Link]
Cheers,
Wol
Just give it up already
Posted Apr 18, 2015 18:38 UTC (Sat) by Wol (guest, #4433) [Link]
Cheers,
Wol
Just give it up already
Posted Apr 18, 2015 15:50 UTC (Sat) by Wol (guest, #4433) [Link]
Because it's a lot easier to prove a microkernel "correct". Because programming IS MATHS.
So if you design your monolithic kernel along modular, microkernel-like lines, then you know the design is good.
Cheers,
Wol
Just give it up already
Posted Apr 19, 2015 12:06 UTC (Sun) by kleptog (subscriber, #1183) [Link]
You know, I've been hearing about people "proving programs correct" since I was in university, and yet there appears to be no progress whatsoever. There has been progress in using computers to verify theorems, but no one seriously considers trying to prove any significant program correct. I think this is mostly due to the fact that to prove a program correct you need to be able to describe what it does, and there appears to be no way of describing what it does better than the program itself.
Instead, the focus has shifted to better testing, which verifies that the program produces the correct output for a given input. This requires some modularity to work well, but it's not clear to me at all that it requires a microkernel architecture.
> Because programming IS MATHS.
Every time I read that I understand less what it is trying to say. Programming is maths in the sense that a Turing machine can produce anything that is computable. But what you do while programming is so far removed from any other branch of mathematics that the cross-pollination appears to be non-existent, and thus it should possibly be considered something separate.
Just give it up already
Posted Apr 19, 2015 13:29 UTC (Sun) by HelloWorld (guest, #56129) [Link]
Just give it up already
Posted Apr 19, 2015 14:51 UTC (Sun) by PaXTeam (guest, #24616) [Link]
that doesn't mean anything unless you know what is in that specification. in particular, said specification misses out on at least one class of undefined behaviour and not surprisingly the code does actually contain such instances. it took me less than a minute of manual inspection to find them last year when the code was released with great fanfare. fortunately for them it's the kind of UB that compilers don't exploit (yet) so the actual binary works as intended but one would think that something 'formally verified' would do better than this.
Just give it up already
Posted Apr 20, 2015 0:48 UTC (Mon) by ineol (guest, #101525) [Link]
Just give it up already
Posted Apr 20, 2015 11:36 UTC (Mon) by PaXTeam (guest, #24616) [Link]
Just give it up already
Posted Apr 20, 2015 18:14 UTC (Mon) by cebewee (guest, #94775) [Link]
Just give it up already
Posted Apr 20, 2015 18:27 UTC (Mon) by PaXTeam (guest, #24616) [Link]
Just give it up already
Posted Apr 20, 2015 19:43 UTC (Mon) by Cyberax (✭ supporter ✭, #52523) [Link]
The real major problem is multithreading - it's often too complicated to prove anything useful.
Just give it up already
Posted Apr 20, 2015 11:34 UTC (Mon) by renox (subscriber, #23785) [Link]
Just give it up already
Posted Apr 20, 2015 14:35 UTC (Mon) by Wol (guest, #4433) [Link]
> Every time I read that I understand less what it is trying to say. Programming is maths in the sense that a Turing machine can produce anything that is computable. But what you do while programming is so far removed from any other branch of mathematics that the cross-pollination appears to be non-existent, and thus it should possibly be considered something separate.
Which is another thing I rail against a bit :-) But no, when your teacher set you a maths problem in class, your solution was a computer program (of sorts). Certain programming languages (APL, J) are simply mathematical languages that happen to have a computer compiler. Given that all computer languages are (I believe) Turing-equivalent, that means all programs written in such languages are simply mathematical expressions.
So actual programming, actually writing a program, is just doing maths. The program is the resultant mathematical proof of all your work. You need to separate that from all the ancillary stuff that is not maths (defining the problem, etc etc).
(Terminology doesn't help here. I am of the opinion that maths and science are complementary, while others take the viewpoint that science is a subset of maths. But if you take my viewpoint, Theoretical Physics promptly ceases to be a science, and becomes a mathematical discipline. Experimental Physics, on the other hand, becomes a scientific exercise in Technology. To me, science is doing the experiments and interpreting the results, maths is building the model of what you expect to happen, and analysing the results against the model.)
Then you need to realise that all maths is simply modelling an ideal world. As Einstein said, you have NO GUARANTEE WHATSOEVER that what goes on in that ideal world bears any resemblance to what actually happens in our reality. As PaXTeam said (or rather, as I'll paraphrase him), "if there's a hole in your spec, your proof is worthless".
That's why I rail at RDBMSs :-) Because I use an alternate maths, I can see that the relational model does not match reality. Because Pick has a close approximation to reality at the disk interface level, I can prove that I can store and retrieve data very efficiently. For the relational die-hards, I can then convert that data to relational in RAM. I don't care about the conversion cost, it's far less than the overhead that is directly caused by breaking up real-world data to splat it all over the disk in an RDBMS. (I would also argue that relational tables muddle data and metadata, which Pick doesn't do, and this then causes horrendous unnecessary complexity at the application level!)
Cheers,
Wol
Just give it up already
Posted Apr 20, 2015 15:25 UTC (Mon) by pizza (subscriber, #46) [Link]
"All models are wrong, but some are useful" -- George E.P. Box
"Beware of bugs in the above code; I have only proved it correct, not tried it." -- Donald Knuth
Just give it up already
Posted Apr 21, 2015 7:32 UTC (Tue) by dgm (subscriber, #49227) [Link]
Just give it up already
Posted Apr 21, 2015 7:36 UTC (Tue) by dgm (subscriber, #49227) [Link]
Just give it up already
Posted Apr 21, 2015 18:36 UTC (Tue) by Wol (guest, #4433) [Link]
On the other hand, what I think of as Science (things like operating CERN, astronomical observation of the Universe) are imho clearly NOT maths.
At the end of the day, it comes down to Philosophy - the big questions ... "What is Maths?", "What is Science?", "What is Philosophy?".
Philosophy is the use of logic and reason, and imho while maths and science are both subsets of philosophy I would place them as disjoint subsets. Other people equate philosophy and maths, which would place science as a proper subset of maths. To some extent it's how you define it, but the reason I see a clear distinction is that a Theory is the pinnacle of Science, while a Theorem is the pinnacle of Maths. A Scientific Proof destroys its subject, while a Mathematical Proof elevates its subject. (The similarity between the two is that both scientists and mathematicians (supposedly) search for proofs.)
That's why Theoretical Physics is not Science :-) because you can prove its conjectures correct. But if somebody then comes up with a Scientific Proof you have to throw the whole thing away and start again.
The confusion arises because, let's take the Higgs Boson, the experimenters found what they were looking for. This was taken as proof the Higgs Boson exists. Actually it's the other way round. The fact they found what they were looking for is *evidence* the Higgs Boson exists, not proof, but if they hadn't found it then it would have been proof the Higgs Boson does not exist.
This stuff is HARD. Not because it's difficult, but because it goes against everything we are programmed to believe - as Einstein said, "God does not play dice".
Cheers,
Wol
Just give it up already
Posted Apr 21, 2015 22:24 UTC (Tue) by jspaleta (subscriber, #50639) [Link]
In fact we have..considerable evidence..that Einstein's intuition with regard to quantum mechanics was flawed. He hated the theoretical QM concept of "spooky action at a distance".... otherwise known as quantum entanglement.. the very concept utilized in current applied research into quantum crypto and quantum computing. Einstein thought quantum entanglement impossible. It's been experimentally confirmed to happen and we are now even starting to build physical applications which rely on its existence to do useful things.
Einstein was wrong, God plays dice. I had a nice chat with him at the craps table at the Tropicana in Vegas while I was at CES in January. Nice guy.
And you have the experimental proof and evidence logic sort of backwards. Not seeing the Higgs particle is not proof that it doesn't exist. Creating a Higgs particle, like all nuclear reactions, is inherently a game of dice. You smash some fast-moving particles together, with enough energy, and very rarely you get a Higgs particle pooped out of the collision. So it's a game of dice... a handful of 1-billion-sided dice... and you are looking to roll the 1-billion-sided-dice equivalent of snake eyes.
Under certain conditions, there is a finite probability that a particular nuclear interaction will produce a Higgs boson. Under certain other conditions the probability of it happening is different. The probability of a Higgs boson... its nuclear cross-section... is theoretically quite small even at the high energy of our biggest collider. The LHC was designed to produce enough energy to make sure the target particle mass as predicted by the standard model was obtainable. But really, how many collisions would need to be done to observe a Higgs particle was effectively unbounded, beyond being able to say it was a highly unlikely outcome for any collision.
Highly improbable things are hard to produce experimentally. So to produce a Higgs boson, you smash things into each other again and again and again hoping to see the improbable happen. Einstein would have thought the entire concept of experimental particle physics insane. Here let me quote Einstein again... "The definition of insanity is doing the same thing over and over again, but expecting different results.” But that is exactly what particle physics experiments do. Do the same bloody thing over and over again...expecting different outcomes with every throw of the dice..err i mean accelerated molecular beams.
As it stands... with Higgs production cross-sections at picobarn levels (that's a super tiny nuclear cross-section, btw)... it took trillions of collisions in the LHC to reach the previously agreed-on levels of statistical significance... to be able to claim experimental validation of the existence of the Higgs particle being observed amongst all the other much more probable nuclear interactions resulting in similar particle relativistic mass. Though designing experiments like this, with an open-ended number of trials until you reach a desired significance level, is a nuanced problem with the frequentist approach to probability.
<cut and paste standard Bayesian inference rant here>
If they had performed just a measly billion collisions, it would not have been enough to validate the existence of the Higgs particle, but it wouldn't have been proof of non-existence either. Highly improbable things are not disproved just because you don't see them happen in your finite set of observations. I do one smashing experiment, and if I don't see a Higgs particle I don't get to claim it doesn't exist. All I can say is that it's not guaranteed to be produced. If I run N such experiments, for any value of N, and don't see it, all I can claim is that if it exists, I've just put a bound on its nuclear cross-section.
When you don't produce enough Higgs boson observations to reach statistical significance within a certain regime inside a specific number of experimental runs..you've just reduced the probability bounds of being able to produce it. But you haven't proved it doesn't exist. You'd have to repeat the experiment an infinite number of times to _prove_ that the higgs boson does not exist. Infinity and beyond!
The lack of existence of something is generally quite difficult to prove. Not experimentally seeing it is not proof. Just as the lack of affirmative results in the experiments currently going on looking for CPT violations does not prove that CPT violations do not occur. These CPT null results just set tighter bounds on CPT violations if they are occurring. Should all those CPT violation experimenters just pack it up and give up hunting for it?
Just give it up already
Posted Apr 22, 2015 10:57 UTC (Wed) by Wol (guest, #4433) [Link]
That's why I quoted it!!! Einstein didn't want to believe it! As a true scientist, I hope he knuckled under and said "well if the evidence supports it then it must be true", but my point was that our belief systems often prevent us from seeing the facts in front of us ... it's a widely known and recognised problem.
> Highly improbable things are hard to produce experimentally. So to produce a Higgs boson, you smash things into each other again and again and again hoping to see the improbable happen. Einstein would have thought the entire concept of experimental particle physics insane. Here let me quote Einstein again... "The definition of insanity is doing the same thing over and over again, but expecting different results.” But that is exactly what particle physics experiments do. Do the same bloody thing over and over again...expecting different outcomes with every throw of the dice..err i mean accelerated molecular beams.
Yep. But at the end of the day, all those experiments will NEVER provide proof! I know if there's a one-in-ten-million chance that any individual experiment will produce a Higgs, then doing one million experiments is no real evidence either way. But if the chance was one-in-a-thousand, then a million experiments would make us pretty confident either way. But that is my point - we would be *confident*, we would not have *proof*.
Statistical certainty isn't proof. And that's the difference between science and maths, imho.
Cheers,
Wol
Sigh
Posted Apr 22, 2015 13:53 UTC (Wed) by renox (subscriber, #23785) [Link]
WHAT? The experiments showing that the 'spooky action at a distance' effect couldn't be produced by local hidden variables (1) were done a long time after he was dead...
1: well, most of the local hidden-variable theories.
Just give it up already
Posted Apr 20, 2015 21:27 UTC (Mon) by kleptog (subscriber, #1183) [Link]
This doesn't make any sense to me. When I was (for example) asked to prove the continuity of the exponential function I don't see how what I wrote can be considered a computer program, but do enlighten me. Unless you mean primary school math, which is one of my pet peeves since that would be more accurately termed arithmetic, since it has little to do with what mathematicians do.
> Given that all computer languages are (I believe) Turing-equivalent, that means all programs written in such languages are simply mathematical expressions.
A language is Turing-equivalent if you can emulate a Turing machine in it. Just about any programming language can do that. But you have the Turing thesis myth: that a Turing machine can compute anything a computer can. This is false, because Turing machines don't do I/O.
And that is the crux of the issue for me: a Turing machine can compute anything computable by any algorithm, but the programs we write are a small number of algorithms with a huge amount of interactive behaviour, none of which can be modelled by a Turing machine and hence cannot be described by an algorithm.
The Origins of the Turing Thesis Myth: http://www.engr.uconn.edu/~dqg/papers/myth.pdf
Why Interaction is more Powerful than Algorithms: http://wit.tuwien.ac.at/events/wegner/cacm_may97_p80-wegn...
Just give it up already
Posted Apr 20, 2015 21:41 UTC (Mon) by Cyberax (✭ supporter ✭, #52523) [Link]
Fully formalized proofs are indeed nigh indistinguishable from a computer program. See for yourself: https://coq.inria.fr/tutorial/2-induction
After all, in formal logic one definition of a proof is: "A syntactically valid expression which automatically leads to its semantic validity".
Just give it up already
Posted Apr 21, 2015 7:24 UTC (Tue) by kleptog (subscriber, #1183) [Link]
If you call Coq a programming language, which I can accept since it's a similar idea to Prolog: if it completes, it has achieved some useful task.
However, you have only shown the inclusion one way: any mathematical proof can be written as a program. What about the other way: can any program be converted to a mathematical proof?
To make it concrete, take the following Python program:
s=0
for i in range(10): s += int(raw_input("Number:"))
print s
What is the corresponding proof? I'm truly curious, perhaps I've missed some advance with respect to I/O.
Just give it up already
Posted Apr 21, 2015 7:45 UTC (Tue) by dgm (subscriber, #49227) [Link]
Just give it up already
Posted Apr 21, 2015 8:31 UTC (Tue) by sorokin (subscriber, #88478) [Link]
Under the Curry–Howard isomorphism, types correspond to formulae and type inhabitation corresponds to provability. A function that returns a value of type X and terminates in a finite amount of time corresponds to a proof of the formula X. I.e. having such a function you have a proof, and having a proof you have a function.
When the type system is simple, the fact that a type is inhabited is not very exciting. But when the type system is complex, the problem of type inhabitation is not trivial.
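A tiny concrete instance of that correspondence, written in Lean 4 here purely as an illustration (the Coq tutorial linked above makes the same point):

-- Curry–Howard in miniature: writing a total function of this type is the
-- same act as proving the proposition "P implies ((P → Q) implies Q)".
theorem modus_ponens (P Q : Prop) (hp : P) (hpq : P → Q) : Q :=
  hpq hp

-- The same shape as ordinary programming: apply a function to a value.
def applyFn {A B : Type} (f : A → B) (a : A) : B :=
  f a

The theorem and the function have exactly the same shape; the proof checker and the type checker are doing the same job.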
Just give it up already
Posted Apr 20, 2015 22:36 UTC (Mon) by viro (subscriber, #7872) [Link]
Just give it up already
Posted Apr 21, 2015 11:19 UTC (Tue) by Wol (guest, #4433) [Link]
> This doesn't make any sense to me. When I was (for example) asked to prove the continuity of the exponential function I don't see how what I wrote can be considered a computer program, but do enlighten me. Unless you mean primary school math, which is one of my pet peeves since that would be more accurately termed arithmetic, since it has little to do with what mathematicians do.
I sympathise. When I started doing Quantum Mechanics at Uni I just fell apart and failed the module, it was just too much for me. Relativity, on the other hand ... if you can turn off your common sense it was just so simple ... high school maths at most :-)
But approach it from a completely different angle. Newton wasn't a scientist - the concept didn't exist back then. He was a Natural Philosopher - someone who tried to explain the world in terms of logic. From that came maths and science, and if you follow down the line, it is clear that programming is a branch of maths.
> > Given that all computer languages are (I believe) Turing-equivalent, that means all programs written in such languages are simply mathematical expressions.
> A language is Turing-equivalent if you can emulate a Turing machine in it. Just about any programming language can do that. But you have the Turing thesis myth: that a Turing machine can compute anything a computer can. This is false, because Turing machines don't do I/O.
Actually, it's true. Because i/o belongs in the science world, not the maths world, therefore a computer can't compute i/o, either! (The science world is anything that has a physical effect upon reality - a vdu, a printer, etc.)
It is incredibly hard for people to separate reality from logic - we are biologically designed to rationalise things that are complex and chaotic. We jump to conclusions, we draw false connections, etc etc.
Look at the computers they used to design the atomic bomb in the early forties. Ask yourself, in what way does the LOGIC differ from a modern computer. It doesn't. Then look at the hardware, it was a bunch of people with adding machines! Once you start back in history and work your way forward, you see things were a lot simpler and more obvious then. In modern times, it's become much more complex but the fundamentals haven't changed. Just that people today can't see the fundamentals because of the complexity on top.
(Oh, and just be very aware, as I think is obvious from what I said, that your view of reality could well be, indeed probably is, at odds with the view of the person you're talking to. Especially here when we get into complex subjects like this. If you can understand the other person's view, then you'll understand why there's a disagreement. Like the Turing machine and i/o, just above :-)
Cheers,
Wol
Just give it up already
Posted Apr 21, 2015 21:38 UTC (Tue) by kleptog (subscriber, #1183) [Link]
If you don't mind my saying, that feels like a massive cop-out: "all programming is math, except for programs that interact with the real world". That's so trivial as to be useless.
The fact is that a Turing machine can't regulate the temperature of my house, but a computer can. For me this makes any discussion about programming based on what a Turing machine can and can't do rather suspect. A choice machine could do it, but nobody talks about them.
Just give it up already
Posted Apr 22, 2015 11:41 UTC (Wed) by Wol (guest, #4433) [Link]
> If you don't mind my saying, that feels like a massive cop-out: "all programming is math, except for programs that interact with the real world". That's so trivial as to be useless.
But that's the point. A program is A LIST OF INSTRUCTIONS. It's symbolic, abstract. The data you feed into the program is A BUNCH OF NUMBERS. It's symbolic, abstract. IT'S ALL MATHS.
Feed that program, that data, into a physical computer, that carries out those instructions, and you have your interaction with the real world. The *instruction* "print a 0" is maths. The machine that does it, is not.
Start at the very beginning. Let's draw a distinction between the symbol and the meaning. Let's cut out a cardboard "P". You've just cut out the letter "pee", haven't you? Actually, no, you've cut out the letter "ar"! Okay, cut out a "B". You've just cut out the letter "vee". Our Eastern European members will know exactly where I'm going :-) Mathematics is the manipulation of meaning. We don't care what the symbols are, so long as they're consistent. Does "2" "0" mean 20, or space? The computer doesn't give a monkeys, it's a symbol to be manipulated.
Let's go back to a very simple computer. It has three i/o interfaces, I'll assume they're punch tape. Input 1 reads a program tape. Input 2 reads a data tape. Output 1 outputs a data tape. Our machine reads "2" then "4" from the input data tape, and "multiply" from the program tape. It outputs "8" on the output tape.
Why paper tape and not a disk drive? Is it EBCDIC or ASCII? What processor is it, MIPS or x86 or SPARC? IT DOESN'T MATTER. And it doesn't matter BECAUSE IT'S MATHS. This *is* a Turing Machine. It's also the Arithmetic Logic Unit that you find inside a Central Processing Unit that you find inside what, in modern parlance, is known as a computer. Actually, if you have a person who can read the punch marks on the tape, there's no reason why it shouldn't be said person ... it just DOESN'T MATTER.
A modern CPU manipulates electric charges instead of holes in paper tape, but that's all irrelevant. That paper tape could then go to a machine that prints the number VIII, that is your i/o. Or as I said, someone could read the tape directly ...
When you're programming, you're manipulating symbols, be they variable names, instructions, whatever. Manipulating symbols according to logic is what maths is. A Turing Machine is an abstract machine that takes in a string of abstract symbols, carries out the instructions contained therein, and spits a string of abstract symbols out. A computer is a physical machine, that takes in a string of physical symbols, carries out the instructions contained therein, and spits a string of physical symbols out.
Your i/o occurs where your brain converts the logical symbols to physical symbols for input to the computer, ie the keyboard, or where your brain converts the physical symbols output by the computer into meaning in your brain.
It is hard to grasp, but you need to separate the meaning from the representation. It's why, from the mathematical point of view, a load of pits in a CD are THE SAME THING as a load of north and south poles in a .wav file on the hard disk are THE SAME THING as a bunch of + and - charges being fed to a soundcard.
(And it's why current patent and copyright law are such a clusterfsck: because they're written by politicians who keep trying to make distinctions between two different REPRESENTATIONS of THE SAME THING.)
And it's why, in the early days, it was perfectly normal for computer programmers to write programs when they didn't have access to a computer to run it. They would run it in their head, and many of these programs, when transcribed onto a computer, ran perfectly first time. TeX springs to mind as a likely example, and I know my first boss did exactly that, too. A program, is a string of logical instructions, is maths.
Cheers,
Wol
Just give it up already
Posted Apr 22, 2015 21:23 UTC (Wed) by kleptog (subscriber, #1183) [Link]
I've heard many definitions of math, but this one is for me a very poor one. It seems to me more correct to say that some of maths is represented by the manipulation of symbols, but maths itself is not the manipulation of symbols. My personal favourite definition is: math is the study of patterns.
But if I understand your view correctly you have a proof by definition: programming is the manipulation of abstract symbols, maths is the manipulation of abstract symbols, therefore programming is math. This is for me very unsatisfying, because it provides no insight.
> (And it's why current patent and copyright law are such a clusterfsck because they're written by politicians who keep trying make distinctions between two different REPRESENTATIONS of THE SAME THING.)
I'm not sure why that is necessarily crazy. If I have a random number generator that happens to spit out a copy of the latest Hollywood film then I have not committed copyright infringement, even if it is a bit-for-bit identical copy (it's frightfully unlikely of course). The metadata about the bits is different. There's an article online called "What colour are your bits" which talks about this much more eloquently than I could; see here: http://ansuz.sooke.bc.ca/entry/23
Just give it up already
Posted Apr 23, 2015 11:19 UTC (Thu) by Wol (guest, #4433) [Link]
(the first bit below I wrote, then went back and re-read what you'd written. It doesn't quite address what you were getting at, but please bear with me ...)
Unfortunately, this sort of thing is what Mathematics is built upon. They're called Axioms.
To my mind, it's simple. Philosophy (and maths) is the use of logic. To use logic, you need things to manipulate with logic. Those things are called symbols. That to me is an axiom, because I really can't see any way of going any deeper. It's not turtles all the way down, at some point you really need some sort of foundation. And the argument in favour of that foundation is going to be "Because!!", because that's the only foundation there can be.
Yes, it's unsatisfying. But the best mathematical minds in the world have attacked this problem, and that's all they can come up with. Godel's incompleteness theorem proves, in fact, that that's all there is. Good luck with proving him wrong ...
(now to try and address what you really wrote)
Let's split "programming" into several parts. Firstly there's the gathering of requirements - some people would call that maths.
Then there's actually writing the program. That most definitely is "doing maths". Because you're writing a stream of instructions, that are to be carried out in the abstract, and for any given starting position, will always end up at the same end position. And I can see your frustration with my "because" argument.
Lastly, there's testing. Science. Making sure the model fits reality. You know that if your program says "x = 2 * 4", then x will be 8. But when you run the program, the x register (thanks to a stray cosmic ray) may end up containing 9. (And that could happen every single time you run the program. Implausible, yes. Impossible, no.)
Let's go back to my Physics example. A guy thinks he's doing physics when he designs a theoretical model, thinks through all the implications, and works out the consequences of reality being like that. Yes he is doing Physics, but he's not doing Science. In my world view, that activity belongs completely within the purview of maths. Then he goes down the lab, runs an experiment and tries to see if he gets the results he expects. That is still Physics, but it's no longer maths. It's science. If you think of programming as the wider field, it's not all maths. But when you're writing a program you're thinking in abstract symbols - that's logic ie that's doing maths. And the program is manipulating abstract symbols - it is itself maths. But as a programmer, not all of your activities are maths.
At the end of the day, if it looks like a duck, walks like a duck, quacks like a duck then the only conclusion you can come to is that it IS a duck. (the main reason we have arguments like this is that people argue over where to place the dividing line between their classifications. imnsho, maths is abstract, if it's not abstract it's not maths. Maths defines uncertainty as a probability spread, Science deals with uncertainty by asking "what happened?")
Cheers,
Wol
Just give it up already
Posted Apr 23, 2015 11:30 UTC (Thu) by Wol (guest, #4433) [Link]
> I'm not sure why that is necessarily crazy. If I have a random number generator that happens to spit out a copy of the latest Hollywood film then I have not committed copyright infringement, even if it is a bit-for-bit identical copy (it's frightfully unlikely of course). The metadata about the bits is different. There's an article online called "What colour are your bits" which talks about this much more eloquently than I could; see here: http://ansuz.sooke.bc.ca/entry/23
The fuzziness of the real world :-) But I would say that, as soon as you look at what the physical object represents, the difference is clear.
Your big random number is just that. On an x86 PC it is identical to the representation of the Hollywood blockbuster. But if you move your number to a machine of a different architecture it stays the same number, although the representation may change. If you move the film to the same machine, the representations there may change, too. DIFFERENTLY. Change the representation and the two are no longer physically identical.
The legislation is trying to legislate about abstract meaning, using physical representation to do so. It doesn't work ... :-(
Cheers,
Wol
Just give it up already
Posted Apr 27, 2015 20:36 UTC (Mon) by kleptog (subscriber, #1183) [Link]
No, the law legislates actions by and interactions between people. Copyright rests on a work, not a number which happens to be a possible representation of it. The law doesn't care one bit about how it is represented, it cares about which actions people have taken.
So copying the random number is not copyright infringement, copying the hollywood blockbuster may be, but in any case the representation has nothing to do with it.
Just give it up already
Posted Apr 22, 2015 22:45 UTC (Wed) by nix (subscriber, #2304) [Link]
> And it's why, in the early days, it was perfectly normal for computer programmers to write programs when they didn't have access to a computer to run it. They would run it in their head, and many of these programs, when transcribed onto a computer, ran perfectly first time. TeX springs to mind as a likely example

TeX did not run perfectly first time. We know this because Knuth wrote a paper, 'The Errors of TeX', listing all his bugs. (This list is still being maintained, because TeXnicians are wondrous pedants.)
Just give it up already
Posted Apr 23, 2015 0:19 UTC (Thu) by viro (subscriber, #7872) [Link]
Just give it up already
Posted Apr 23, 2015 10:32 UTC (Thu) by Wol (guest, #4433) [Link]
And how many people here are examples of the opposite? "I'm a mathematician, I do X therefore Y can't be maths" or "I'm not a mathematician, I do X therefore X can't be maths"?
I did say earlier on this was a big philosophical question. And that my ideas weren't that mainstream.
Philosophy unfortunately has got a bad rap because of questions like "How many angels can dance on the head of a pin". But it really is a fascinating subject - the attempt to deduce things by the application of reason and logic. And just by starting from the most innocuous point you can get into some really weird arguments. The world is NOT rational. The world is NOT logical. Try and use logic to define what a number is. You get into a "reductio ad absurdum" mess (Godel's incompleteness theorem). One only has to look at the Planck constant and the Heisenberg Uncertainty Principle to see the logical uncertainty of Godel feed into the physical uncertainty of the real world. I used to think philosophy was rubbish - and then I tried doing it (using logic to try to answer big questions like Life, the Universe, and Everything, that is). It's amazing what it does to you when you try to back up your ideas with evidence and proof...
Okay, maybe you're correct in accusing me of taking a seagull view of things, but what's that quote about doctors? "Consultants know everything about nothing, while GPs know nothing about everything". Unfortunately the world of logic and scientific enquiry is now FAR too big for any person to know more than a smattering. Most people seem to focus in on their specialty and ignore anything else. I try and have an overview of most of it. And unfortunately, most people are unaware of how their specialty interacts with the world around it - how many real-world bugs/disasters/cock-ups could have been avoided if only someone had taken their blinkers off and stepped back a bit from the situation ...
(Oh, and I didn't say TeX worked first time :-) I said it was an obvious candidate - looks like I was wrong :-)
Cheers,
Wol
Just give it up already
Posted Apr 22, 2015 9:08 UTC (Wed) by renox (subscriber, #23785) [Link]
SR yes, GR no, the maths are tough..
Just give it up already
Posted Apr 21, 2015 22:05 UTC (Tue) by nix (subscriber, #2304) [Link]
> Which is another thing I rail against a bit :-) But no, when your teacher set you a maths problem in class, your solution was a computer program (of sorts). Certain programming languages (APL, J) are simply mathematical languages that happen to have a computer compiler. Given that all computer languages are (I believe) Turing-equivalent, that means all programs written in such languages are simply mathematical expressions.

That's just not true. Mathematics is two things: a set of formalisms, and a way of thinking about those formalisms ("mathematics is what mathematicians do"). Software development in all but the most formal of languages (mathematical systems like Mathematica, Octave and CAS systems; among more conventional languages, perhaps Haskell would just barely count, if you eschew monads) does not use those formalisms in any but the most trivial of fashions, and people writing in those languages do not think like mathematicians do (or, at any rate, need not: I know some people who do, and just as many who instead think of programming as a mix between literary and musical composition in a very formal mode: as a rule the two tribes find it fairly difficult to understand how the other tribe works).

> So actual programming, actually writing a program, is just doing maths. The program is the resultant mathematical proof of all your work. You need to separate that from all the ancillary stuff that is not maths (defining the problem, etc etc).
I'd say that computer programs are formally equivalent to mathematics, but not that they are mathematics. Because you don't need to think like a mathematician to manipulate them just as well as those who do think that way.
Just give it up already
Posted Apr 22, 2015 7:13 UTC (Wed) by HelloWorld (guest, #56129) [Link]
Just give it up already
Posted Apr 22, 2015 22:49 UTC (Wed) by nix (subscriber, #2304) [Link]
Just give it up already
Posted Apr 22, 2015 11:50 UTC (Wed) by Wol (guest, #4433) [Link]
You don't have to be a musician to play music. You don't have to be a mathematician to do maths.
To me programming is one calculation after another ("calculation" in a loose sense, "if I'm here and I do this, I will get to there"). It helps that much of my early programming was numerical manipulation. But if you don't understand logic, if you don't understand cause and effect (essential tools for the mathematician), you will make a lousy programmer.
Cheers,
Wol
Just give it up already
Posted Apr 22, 2015 11:54 UTC (Wed) by Wol (guest, #4433) [Link]
Incidentally, there is a strong connection between mathematical and musical ability. A strikingly large number of mathematicians are musical, and despite political ignorance and relegation of music from the curriculum, there is plenty of evidence that an hour spent practising music is (for students weak at maths) much more beneficial to their maths results than the same hour spent practising maths!!!
Cheers,
Wol
Just give it up already
Posted Apr 17, 2015 21:07 UTC (Fri) by Wol (guest, #4433) [Link]
> Complete nonsense. Linux has nothing at all to do with microkernel theory and there are efficient implementations of message passing.
I could be wrong, and I guess you're referring to non-Intel architectures, but from what I've heard, it is pretty much IMPOSSIBLE to get x86 to implement efficient message passing in a micro-kernel. Because the context switching just kills any decent response speed. The linux coders expend a LOT of energy trying to keep context switching to a minimum. A microkernel spends most of its time context switching - that's pretty much its entire job!
I worked on Pr1mes, and they had a segmented memory architecture. I gather they also did lightning-fast context switching. But Pr1mos was never ported to x86, precisely because it relied on the segmented memory and context switching. (Or rather, I believe attempts were made to port it, but they foundered commercially on those problems.)
Cheers,
Wol
Just give it up already
Posted Apr 17, 2015 22:47 UTC (Fri) by HelloWorld (guest, #56129) [Link]
Just give it up already
Posted Apr 17, 2015 23:14 UTC (Fri) by Wol (guest, #4433) [Link]
So. Am I correct in thinking that micro-kernels (thanks to inefficient message passing) will always be a bad idea on "real" computers? That's the "man on the Clapham Omnibus"'s definition of a "real" computer :-) I think so.
Cheers,
Wol
Fast Message passing on x86
Posted Apr 18, 2015 10:05 UTC (Sat) by gmatht (guest, #58961) [Link]
Optimizing message passing on x86 was a major goal of EROS-OS. See for example "The Measured Performance of a Fast Local IPC". As I understand it, a plain bcopy within a process tended to be less than 50% faster than a copy between processes, and this led to a very modest drop in overall performance (see e.g. "Operating System Support for Active Networks"). Whether a tiny performance drop is worth it is obviously a judgment call.
Fast Message passing on x86
Posted Apr 18, 2015 22:39 UTC (Sat) by Wol (guest, #4433) [Link]
That's hard to parse ... do you mean that a message passed within a context took half the time, or a message passed between contexts took 50% more time?
Either way, for workloads that consist mostly of message passing (ie, in practice, an awful lot) that's a performance drop of about 30% for a microkernel - a lot more than minimal.
I get the impression that sort of hit would horrify the linux kernel engineers ... :-)
Cheers,
Wol
Microkernels are currently niches
Posted Apr 17, 2015 13:20 UTC (Fri) by david.a.wheeler (subscriber, #72896) [Link]
Microkernels are currently used only in niches. Monolithic kernels include the Linux kernel, all the *BSDs, the Windows kernel, and Apple's MacOS. Yes, historically MacOS was based on the Mach microkernel, but Apple abandoned the microkernel architecture and links it all into a monolithic kernel.
There are microkernels in use, esp. QNX, and L4 might seriously grow in use over time.
Microkernels are currently niches
Posted Apr 18, 2015 10:36 UTC (Sat) by CChittleborough (subscriber, #60775) [Link]
Microkernels are currently niches
Posted Apr 20, 2015 21:19 UTC (Mon) by Nelson (subscriber, #21712) [Link]
I think there is an important distinction here. On many phones running OKL4, there is a host processor and effectively two operating systems running, and they're not really sharing much (other than the processor); they even sell it as a "microvisor" or something like that. One OS runs the radio and cell communication and maybe some real-time type stuff just for communication; the other is the interface and runs the actual handset (or it's roughly like that), and those individual OSes have processes of their own and resources that they manage. That's a very different design from what the Hurd is attempting to do. It's not an apples-to-apples comparison.
Microkernels are currently niches
Posted Apr 23, 2015 11:37 UTC (Thu) by khim (subscriber, #9252) [Link]
It's also the very definition of a “niche”: an OS which is not designed to run general-purpose applications and which is supposed to be used in places where robustness is more important than versatility or speed.
Microkernels are not a bad idea in general; they are just not usable as a foundation for a general-purpose OS.
And it's easy to see why: “a chain is only as strong as its weakest link”, and the kernel (the Linux kernel, the MacOS kernel, the NT kernel) is rarely that link. In the last few years the systems I've worked with were hosed many times, and very rarely was the kernel the culprit. Why would you replace the kernel with a less performant one if it won't make the whole system measurably better?
Just give it up already
Posted Apr 17, 2015 2:02 UTC (Fri) by rsidd (subscriber, #2582) [Link]
I guess your comment is only because this is GNU's pet project: otherwise since when have we told developers not to bother with their pet projects, no matter how obscure? Hurd will never make a dent in the market, but it continues to amuse a few people... so let them have fun with it. Though people who want to get their feet newly wet in microkernels should probably go for minix3 instead...
Viva la Hurd!
Posted Apr 17, 2015 13:30 UTC (Fri) by ksandstr (guest, #60862) [Link]
They seemingly have all of the damage of monolithic kernels with very few of the benefits of in-system decentralization. To wit, they substitute UNIX-equivalent system calls (synchronous, unqueued) with bufferbloat from naïve IP routing, inflicting not just fundamental IPC's disconnection from scheduling but also unbounded latency, cascading priority inversions, and premature wilting of all the beautiful flowers, too.
They also put many device drivers, besides those related directly to the microkernel's own function, within the microkernel; thereby failing to avoid a monolithic design's various troubles with special contexts of execution, implicit long-term APIs, low-level dependency on the microkernel for hardware-interaction tasks besides interrupt delivery, and so forth.
It's hard to even imagine what gains would possibly make up for even one of the deficiencies above. For example, even in the lightest of microkernels it used to be the case that using inter-process IPC for syscalls was an unjustifiably major performance pain-point until QNX happened. The nerds won't be convinced of anything short of lowering the flesh-and-blood user's UI latency to the "completed input-to-display roundtrip within a frame" demoscene level, and rightly so. [0]
In order to be realistically practicable, radical ideas such as those in Hurd (and even Mach itself, as seen in OS X) must have a rigorously measurable benefit. It's best for that benefit to be radical in itself. As much as I'd like for it to be, the Hurd isn't there. From the way it's been before, and the way it's going, it doesn't look like it'll be there anytime soon.
However the Hurd is incredibly valuable as a playground for a horrific juxtaposition of kernel-design ideas, and as a glorious cluster-intercourse of (seemingly) politics-over-technology for all to study.
[0] For examples of this phenomenon in action, see HTTP.SYS, the khttpd of yore, and today's proposals of kdbus. It turns out that not even a whiz-bang sysenter/exit mechanism quenches the thirst for a layering violation as good as the first one...
Just give it up already
Posted Apr 18, 2015 13:36 UTC (Sat) by taylanub (guest, #99527) [Link]
See, I can troll as well!
Maybe LWN needs some comment moderation.
Just give it up already
Posted Apr 19, 2015 20:47 UTC (Sun) by pboddie (guest, #50784) [Link]
GNU Hurd might become important
Posted Apr 17, 2015 15:36 UTC (Fri) by coriordan (guest, #7544) [Link]
In some years time, we might need a project which can be an incubator for fully free drivers, or we might need a kernel that's GPLv3'd so that we can put resources into it without also contributing to locked-down devices.
One problem is that Hurd currently needs the Linux drivers, so current Hurd can't jump to GPLv3, AFAICT.
The pace can be picked up when needs be, and the design can be rethought, but it's always good to have some working code that can be built upon.
GNU Hurd might become important
Posted Apr 20, 2015 17:34 UTC (Mon) by JoeBuck (guest, #2330) [Link]
If that's your belief, you'd still be better off working with the Linux kernel with a GNU runtime than with GNU Hurd, because the Hurd is making negative progress. By that I mean that the speed of hardware development is proceeding so much faster than Hurd development that Hurd is falling further and further behind.

Sure, you can say that you don't want to contribute to code that will be used in ways you don't like. But if you isolate yourself, then you can't use all the free code produced by people who don't share your philosophy but find GPLv2 acceptable.
GNU Hurd might become important
Posted Apr 21, 2015 1:53 UTC (Tue) by coriordan (guest, #7544) [Link]
Will the Linux devs fight for our right to control our hardware? Will a fork of Linux be the right path? I don't know but having a parallel project which is focussed on freedom is smarter than leaving something so crucial to chance.
The Hurd could provide a project which attracts people who are interested in forcing free software onto hardware that was manufactured to lock free software out. Rockbox did it for the iPod nano 2. The native firmware was encrypted and for five years they thought it was impossible to get their free software onto it, but then they noticed a flaw and now Rockbox runs on the iPod nano 2.
Or another example is that, as a community, we might have to develop our own hardware to run free software operating systems. There might be some technical problems with running Linux (or Hurd) on this hardware and Hurd might be more willing to change to accommodate this new hardware.
(Since these are hypothetical situations, anyone can argue against them or say "Yeah but Linux will do that", but we don't know today what the problems will be, so we can't say what will or won't be the solution. A Hurd-based solution might be unlikely, but it's not impossible and we might be glad it's there.)
(Regarding the technical problems mentioned in other posts: Hurd doesn't exist to be a microkernel. It exists to be a free kernel. If it needs to advance quickly to solve a freedom problem, I'm sure the current microkernel design will get the necessary compromises or even be abandoned with the possibility of importing a pile of code from a *BSD kernel.)
GNU Hurd might become important
Posted Apr 21, 2015 5:18 UTC (Tue) by dlang (subscriber, #313) [Link]
GNU Hurd might become important
Posted Apr 21, 2015 13:58 UTC (Tue) by coriordan (guest, #7544) [Link]
GNU Hurd might become important
Posted Apr 21, 2015 6:27 UTC (Tue) by tao (subscriber, #17563) [Link]
Mind you, the Rockbox project is impressive. But it didn't do one iota to improve the hardware situation for future hardware, and the Hurd wouldn't have improved the situation (rather the opposite, because Rockbox would then have had to spend their time reinventing the wheel). Or has Hurd magically gained ARM support lately?
When it comes to Linux I think you should give credit where credit is due; the amount of hardware where the drivers have been provided by the companies making said hardware is minuscule compared to the amount of hardware that has had its drivers reverse-engineered in one way or another, mainly by Linux developers, and secondly by various BSD developers.
How much hardware have the GNU HURD/Mach developers helped make usable by writing free drivers for it? A laudable goal would be, say, a faster and more system-friendly driver for NVidia cards. BSD-licensed, to allow all Free Software projects to benefit from it -- after all, the benefit of wide adoption here would be bigger than the risk of someone taking the source and making a closed driver from it (after all, there's already a -- performance-wise -- superior driver from Nvidia).
GNU Hurd might become important
Posted Apr 21, 2015 13:19 UTC (Tue) by coriordan (guest, #7544) [Link]
We have free software for that task today because someone worked on it even when most thought it was impossible.
Hurd developers are among the developers we can rely on to work on the freest solution possible, and our need for kernel developers with that mindset might sharply increase in the coming years.
> A laudable goal would be, say, a faster & more
> system friendly driver for NVidia cards.
Lots of projects need help. That doesn't mean that other projects shouldn't be worked on.
> BSD licensed
Copyleft is working very well for kernel-space projects so far. There's no reason to be defeatist.
GNU Hurd might become important
Posted Apr 21, 2015 14:32 UTC (Tue) by tao (subscriber, #17563) [Link]
Still, feel free to live in dreamland. But I'm willing to bet quite a fair amount of money on Linux still being relevant 10 years from now, while HURD will either be abandoned or still be a pet project of GNU that no one cares about.
You claim that HURD is a way to fight against non-free hardware. HOW, exactly?
I can imagine company X, just about to release their new hardware:
"Oh, wait, if we only supply a binary driver people won't be able to use our new hardware with GNU/HURD!"
"Damn, we have to cancel the whole project!!!"
That scenario might be feasible if the installed user base is large enough to matter. But no one of consequence uses GNU/HURD.
The Linux kernel, OTOH, has a chance to influence HW makers; it has already influenced a lot of companies. It hasn't worked in every case -- some companies simply cannot comprehend why it'd be to everyone's advantage to free their software -- but those cases would be unlikely to be affected by the "don't cooperate, don't accept any shades of grey -- the world is black and white" stance of HURD either.
As you seem to glorify Rockbox, don't forget the multitude of hardware that is usable only because Linux or BSD developers have, through reverse engineering, written free drivers for it (having then borrowed what was needed from each other). How many drivers in Linux or one of the BSD kernels are based on reverse-engineering work or hardware specs originating from HURD?
GNU Hurd might become important
Posted Apr 21, 2015 15:02 UTC (Tue) by pizza (subscriber, #46) [Link]
Rockbox is also an example of an almost-dead project, because the market for standalone music players has been subsumed by smartphones and constantly fighting intransigent hardware manufacturers gets quite old when there are no users to be had.
It's a shame, really.
GNU Hurd might become important
Posted Apr 21, 2015 15:32 UTC (Tue) by coriordan (guest, #7544) [Link]
> You claim that HURD is a way to fight against non-free hardware
My claim is different. It's that we might soon live in a world where the vast majority of hardware is designed to lock free software out. We'll need to work on various projects to find ways to continue to run free software without the cooperation of the big hardware manufacturers.
The Linux devs will play a role. Some will try hard to get the hardware working with free software. Others won't care and will copy more non-free binary blobs into the Linux kernel.
Linux forks and blob-replacing patch projects will play a role. As Linux-libre is doing now.
Projects to make our own computer hardware might play a role.
And Hurd might play a role, for reasons given in previous posts which aren't realistic today but are realistic in the future I mention.
GNU Hurd might become important
Posted Apr 21, 2015 16:22 UTC (Tue) by pizza (subscriber, #46) [Link]
> The Linux devs will play a role. Some will try hard to get the hardware working with free software. Others won't care and will copy more non-free binary blobs into the Linux kernel.
I call BS on this, because that last statement has nothing to do with your stated claim.
* Linux, as distributed by kernel.org, is Free Software. If you're seriously claiming otherwise, I suggest you cite some references in Linus's git tree.
* Folks who distribute Linux-derived binaries without complete corresponding source code (in violation of Linux's license!) do not make Linux itself less Free. Call out the violators!
* The mere presence of an opaque blob used in the initialization of a device (one that does not execute on the host processor) does not make Linux less free, because those blobs are entirely independent of, and not derived from, Linux. Call the hardware non-Free, but that is entirely independent from the OS itself.
I seriously don't understand why some folks claim that having the host processor transfer said blob to the hardware is bad, but the same blob stuffed into an eeprom on the hardware is not only okay, but preferable.
Meanwhile, none of this has anything to do with "locked-down" hardware, which uses cryptographic mechanisms to only allow pre-blessed binaries to be executed, and the legal system which treats circumvention of these mechanisms as a greater offense than molesting a child.
GNU Hurd might become important
Posted Apr 21, 2015 17:16 UTC (Tue) by coriordan (guest, #7544) [Link]
That was true until 1996. If you want a free software Linux kernel, try Linux-libre: http://www.fsfla.org/ikiwiki/selibre/linux-libre/
> If you're sersiously claiming otherwise, I suggest
> you cite some references in Linus's git tree.
https://git.kernel.org/cgit/linux/kernel/git/torvalds/lin...
It's a binary blob in Linus's git tree and the comment at the top says it downloads a further binary blob.
It's also a GPL violation because the binary blob in Linus' repository claims to be GPL but doesn't provide any source code. (The GPL requires source code, which it defines as "the preferred form of the work for making modifications to it".) The comment also says that the blob that gets downloaded is non-free.
If the functionality controlled by this blob is important enough for the manufacturer to want to be able to update it, then it's probably also important enough for the user to be allowed to see what it's doing and modify it as desired - like the rest of the kernel code on kernel.org.
GNU Hurd might become important
Posted Apr 21, 2015 18:14 UTC (Tue) by pizza (subscriber, #46) [Link]
You apparently didn't read the *very first sentence* in that file, "The firmware this driver downloads into the Localtalk card is a separate program and is not GPL'd source code.."
So you're correct that this blob is not Free Software -- but it's also not *Linux*, so claiming that this makes "Linux" non-free is a specious argument at best.
> If the functionality controlled by this blob is important enough for the manufacturer to want to be able to update it, then it's probably also important enough for the user to be allowed see what it's doing and modify it as desired
I actually agree with this statement. But again, I don't understand why you consider an identical binary blob to be okay (if not preferable) when it's embedded in field-updatable EEPROM rather than a file on disk. Your distinction reeks of arbitrary hair-splitting.
(I might add that the long-term trend is to pull blobs out of the kernel sources; if this matters that much to you I suggest you submit a patch to transfer this blob from linux.git into linux-firmware.git)
This isn't a theoretical distinction. I wrote the prism2_usb driver in the Linux kernel. Some devices have onboard eeprom, but others don't. They both utilize an identical binary firmware image that is proprietary but can be redistributed freely. By your definition, a Linux system which uses the former is somehow freer than a Linux system that includes the firmware to use the latter... which is utter nonsense. The end result is the same; the overall system still has one copy of non-free firmware that is necessary to utilize the hardware.
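(Not the actual prism2_usb code -- just a minimal C sketch of the point, with all names and bytes made up: whether the image sits in onboard EEPROM or is uploaded by the host driver at init time, the device ends up running the very same bytes.)

    #include <stdio.h>
    #include <string.h>

    /* Made-up, stand-in firmware image (the real one is proprietary). */
    static const unsigned char fw_image[] = { 0xde, 0xad, 0xbe, 0xef };

    struct card {
        const unsigned char *eeprom;          /* NULL if the card has no EEPROM */
        unsigned char ram[sizeof fw_image];   /* what the card actually runs */
    };

    static void bring_up(struct card *c)
    {
        if (c->eeprom)
            memcpy(c->ram, c->eeprom, sizeof fw_image);   /* card loads its own copy */
        else
            memcpy(c->ram, fw_image, sizeof fw_image);    /* host uploads the same image */
    }

    int main(void)
    {
        struct card with_eeprom = { .eeprom = fw_image }; /* factory-programmed copy */
        struct card without_eeprom = { .eeprom = NULL };

        bring_up(&with_eeprom);
        bring_up(&without_eeprom);
        printf("device state identical: %s\n",
               memcmp(with_eeprom.ram, without_eeprom.ram, sizeof fw_image) ? "no" : "yes");
        return 0;
    }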
Honestly, the FSF has far more productive battles to fight -- like going after folks that blatantly violate the GPL. (Allwinner and Engenius in particular come to mind.)
GNU Hurd might become important
Posted Apr 21, 2015 18:50 UTC (Tue) by coriordan (guest, #7544) [Link]
I read the whole file. There are two blobs. The first, which is mentioned by the comment, is downloaded from some website.
You say downloaded blobs are ok. I disagree, but let's ignore this since it's not the blob I'm talking about.
I'm talking about the second blob, which begins just after the comment. The non-comment content of the file is a binary executable formatted as an array of ints. A blob. No human-readable source code is provided.
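(For readers who haven't opened such a file, here is a hypothetical illustration of the pattern being described -- the name and values below are made up, not the actual Localtalk data: an opaque program for the card's own processor, checked in only as numbers, with no source to regenerate it from.)

    /* Hypothetical sketch, not the real firmware: an initialised array of
     * words for the card's processor, with no corresponding C or assembly
     * source anywhere in the tree. */
    static const unsigned int card_firmware_sketch[] = {
        0x0000a5f0, 0x4e714e71, 0x600001fa, 0x33fc0001,
        0x00ff1400, 0x4e754e71, /* ...thousands more words in the real file... */
    };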
> Your distinction reeks of arbitrary hair-splitting.
No distinction will be perfect. The existence of grey areas or borders of inconsistency means we should fine-tune the distinction, not abandon it.
Your case is a corner case, so I don't think it makes the general distinction ridiculous, but it's certainly a good example to think of when trying to improve the distinction. I've noted it here:
http://libreplanet.org/wiki/When_should_firmware_be_free#...
(I don't think anyone from FSF looks at that page, but at least it's noted and maybe myself or someone in the future will raise the issue.)
GNU Hurd might become important
Posted Apr 21, 2015 19:41 UTC (Tue) by mathstuf (subscriber, #69389) [Link]
Care to provide the URL it fetches the blob you're talking about from?
GNU Hurd might become important
Posted Apr 21, 2015 23:31 UTC (Tue) by coriordan (guest, #7544) [Link]
In that case it's clear that the Linux source code includes non-free software.
GNU Hurd might become important
Posted Apr 22, 2015 2:09 UTC (Wed) by mathstuf (subscriber, #69389) [Link]
GNU Hurd might become important
Posted Apr 21, 2015 21:46 UTC (Tue) by pizza (subscriber, #46) [Link]
The file URL you provided [1] only lists one blob, and has been essentially unchanged since the beginning of the git era more than ten years ago [2], so you obviously did *not* read the whole file. And there's also no reference to "some website" in that comment either.
Meanwhile.
> Your case is a corner case, so I don't think it makes the general distinction ridiculous, but it's certainly a good example to think of when trying to improve the distinction. I've noted it here:
The text under that section does not represent the situation with the prism2 devices; it mentions ROM vs EEPROM, but I was specifically referring to one device with [EEP]ROM vs one that has none, with the otherwise-identical firmware instead transferred over by the host system.
Incidentally, these days nobody uses mask ROM unless it's embedded within another IC. A separate ROM will actually be EEPROM/flash, and the technical capability to update it is nearly always present, even if not disclosed to system integrators or the end user. Perversely, discovering a previously-unknown update mechanism will make formerly acceptable hardware unacceptable under the FSF definition, which is ...silly.
IMO the line/distinction the FSF takes with regards to device firmware is nonsensical, as it uses an implementation detail to distinguish between "free" and "non-free", which leads to some downright stupid decisions [3]. Let's call a spade a spade -- If you don't have the source code (and the rights to utilize it, ie the FSF's Four Freedoms) then it isn't Free -- No matter if said blob is stored in ROM, local EEPROM, system EEPROM, some other mass storage device, or downloaded "from the cloud".
(BTW, the only non-FOSS software on my systems are device firmware blobs, including the system and graphics card BIOSes, so I do actually practice what I preach.)
All that said, this hair-splitting is a sideshow to the real threat -- laws that prevent folks from tinkering with hardware they supposedly own, eg the DMCA's section 1201. No amount of technical (or license) hand-wavery will let us work around the legal system's massive cudgel. I'm glad the EFF is actively fighting on that front [3].
[1] https://git.kernel.org/cgit/linux/kernel/git/torvalds/lin...
[2] https://git.kernel.org/cgit/linux/kernel/git/torvalds/lin...
[3] http://lwn.net/Articles/460654/ (and many good comments too)
GNU Hurd might become important
Posted Apr 21, 2015 23:46 UTC (Tue) by coriordan (guest, #7544) [Link]
It says that the driver downloads some firmware, but in light of mathstuf's comment I'm guessing that the comment is actually wrong and the non-free licence actually refers to the code below the comment, so the official Linux source repo includes non-free code.
> The text under that section does not represent the situation
Thanks for the clarifications. What I had scribbled down was just some mental notes while I rushed out to watch a football match that had already started. I've now added a link to your posts.
GNU Hurd might become important
Posted Apr 22, 2015 0:20 UTC (Wed) by dlang (subscriber, #313) [Link]
It contains data that's loaded onto a device. That is not kernel code, it's driver data.
The driver may execute some of the data, it may use it to define a state table, it may use it as config info, or it may be the 'hardware' definition of an FPGA.
In no case is it code that executes as part of the kernel.
Now, why this data is deemed "EVIL" by some people if it's in the git tree where it can be changed/replaced by the kernel developers, but is "Good" if it's in a flash chip on the device makes no sense to many of us. The first case makes it easier for the user to replace it, and the data is exactly the same in either case.
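(To make that framing concrete, a minimal, hypothetical C sketch -- no real driver or device is being described: the host only copies the opaque words out to the card and never executes them itself, so the same bytes could equally have come from the git tree, linux-firmware, or a flash chip on the board.)

    #include <stdio.h>
    #include <stddef.h>

    /* Opaque device data: microcode, a state table, config, or an FPGA
     * bitstream -- the driver doesn't need to know which. (Made-up values.) */
    static const unsigned int device_data[] = { 0x1234abcd, 0x0badf00d };

    /* Stand-in for poking a memory-mapped download register on the card. */
    static void push_word(unsigned int word)
    {
        printf("-> device: 0x%08x\n", word);
    }

    int main(void)
    {
        size_t i;

        /* The host CPU only copies the words out; it never jumps into them. */
        for (i = 0; i < sizeof device_data / sizeof device_data[0]; i++)
            push_word(device_data[i]);
        return 0;
    }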
GNU Hurd might become important
Posted Apr 22, 2015 16:39 UTC (Wed) by coriordan (guest, #7544) [Link]
"Data". We know what sort of data it is, it's settings and instructions, just like a kernel or a kernel module. The only difference is that Linux will be run by my CPU and this other code will be run by another processor in my computer.
I see the point that it's difficult to draw a line between stuff we expect to control and stuff we accept we can't control, but difficulty doesn't mean we should just give up and reduce our expectations to the bare minimum.
GNU Hurd might become important
Posted Apr 22, 2015 19:52 UTC (Wed) by dlang (subscriber, #313) [Link]
that is a reasonable position to take
but saying that the same blob of binary data is EVIL if it's in the Linux git tree, but perfectly acceptable if it's in flash or ROM, does not seem reasonable. It's far easier to change something that the device driver feeds to the hardware than something hidden in the hardware.
GNU Hurd might become important
Posted Apr 22, 2015 6:24 UTC (Wed) by mbunkus (subscriber, #87248) [Link]
GNU Hurd might become important
Posted Apr 20, 2015 18:49 UTC (Mon) by Wol (guest, #4433) [Link]
Cheers,
Wol
GNU Hurd might become important
Posted Apr 20, 2015 18:55 UTC (Mon) by pboddie (guest, #50784) [Link]
Unfortunately, the online historical trail isn't very reliable (or Google and the other search engines are being particularly unhelpful - a possibility these days), and it becomes difficult to obtain and evaluate the different variants of Mach that emerged at various points. That might be interesting just to see whether anyone's objections to GNU Mach could be overturned by looking at, say, OSF Mach or a derivative thereof.
GNU Hurd might become important
Posted Apr 20, 2015 19:03 UTC (Mon) by khim (subscriber, #9252) [Link]
Linus has been quite clear that the GPL does not jump the kernel/userspace barrier
Who said anything about kernel/userspace barrier? Please read what Linus actually wrote: NOTE! This copyright does *not* cover user programs that use kernel services by normal system calls - this is merely considered normal use of the kernel, and does *not* fall under the heading of "derived work".
It says nothing about a kernel/userspace barrier and everything about normal system calls! Which is logical: system calls are designed to separate two independent pieces. Some of these are just POSIX, some are Linux-specific, but they are designed to separate the kernel from userspace, thus of course they act as a kind of copyright barrier! If you use them in the kernel, you are golden if the end result is some kind of kernel module; if you refactor kernel code to put bits of it in userspace, you still have not created an independent program if you are using kernel-internal interfaces!
GNU Hurd might become important
Posted Apr 20, 2015 19:14 UTC (Mon) by bfields (subscriber, #19510) [Link]
Linus has been quite clear that the GPL does not jump the kernel/userspace barrier
The only statement I'm aware of beyond the GPL itself is from COPYING:
This copyright does *not* cover user programs that use kernel services by normal system calls...
(Well, Linus may have said any number of things elsewhere, but a) I don't remember anything as general as your statement, b) things he says elsewhere may count for something, but not as much as what's in COPYING.)
GNU Hurd might become important
Posted Apr 20, 2015 21:38 UTC (Mon) by Wol (guest, #4433) [Link]
Back to front, of course, as in the driver is in userspace whereas in the normal Linux world it's in the kernel, but that's what I was thinking. The driver can be an independent GPL2 user-space program, and as long as it only communicates with the microkernel via standard interfaces, it's golden...
Cheers,
Wol
GNU Hurd might become important
Posted Apr 21, 2015 2:45 UTC (Tue) by coriordan (guest, #7544) [Link]
(I say "Linus' Linux" because you also couldn't cut Linux in half, add message passing system calls to continue all the previous communication, and then say that the top half of your Linux is a user program.)
GNU Hurd might become important
Posted Apr 21, 2015 5:20 UTC (Tue) by dlang (subscriber, #313) [Link]