
Writing kernel modules in Haskell

There must be a crowd of people out there thinking that they would get into kernel development, but only if they could do it in Haskell. Here is a web site with instructions on how to do just that. "By making GHC and the Linux build system meet in the middle we can have modules that are type safe and garbage collected. Using the copy of GHC modified for the House operating system as a base, it turns out to be relatively simple to make the modifications necessary to generate object files for the Kernel environment." This leads to code which looks like:

    hello = newCString "hello" >>= printk >> return 0

Just don't try to merge it upstream.



Writing kernel modules in Haskell

Posted Sep 13, 2009 14:38 UTC (Sun) by Baylink (guest, #755) [Link] (15 responses)

ZOMGBBQ. These people are serious, aren't they?

It's not April 1st?

Writing kernel modules in Haskell

Posted Sep 13, 2009 15:16 UTC (Sun) by Baylink (guest, #755) [Link] (14 responses)

No, I have a serious comment, too.

Garbage Collection? In a kernel module?

As someone who spends an inordinate amount of time staring at a spinning clock on a blackberry because *its* code is all written in garbage-collected Java, no thanks.

Writing kernel modules in Haskell

Posted Sep 13, 2009 15:56 UTC (Sun) by nix (subscriber, #2304) [Link] (5 responses)

There is actually such a thing as hard-realtime garbage collection (where
the GC is guaranteed to take no longer than some time T to complete). It's
hardly mainstream but it exists.

Writing kernel modules in Haskell

Posted Sep 14, 2009 5:11 UTC (Mon) by jordanb (guest, #45668) [Link] (4 responses)

You can do real-time garbage collection, such that no GC run will take more than a given amount of time, but you then can't guarantee that all stale objects will be collected. If the GC strategy is to run, when a limit is reached, in order to free space for a newly created object, then you can't guarantee that space for the object will be found within the time bound, even if it is available somewhere.

Think about it like this. Someone asks you to do a finite-time tree traversal, where the tree can be of an arbitrary size. You can't guarantee complete traversal and the time constraint at the same time. You can only meet the constraint by being willing to stop traversal early.

In the case of GC you may be willing to stop before traversal is complete, and you can work out your strategy such that you're likely to make gains early. But the time constraint means that there will always be a possibility of failing to find sufficient space even when it exists.
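
The constraint can be seen in a toy traversal (an illustrative C++ sketch, not any real collector's code): a visitor given a hard step budget must be allowed to return before reaching every node, which is exactly the "stop early" trade-off described above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A tree node; the tree can be arbitrarily large.
struct Node {
    std::vector<Node*> kids;
};

// Visit at most `budget` nodes and report how many were actually reached.
// With an arbitrary-size tree, reached < total is always a possibility:
// the time bound is honored only by being willing to stop early.
std::size_t bounded_traverse(Node* root, std::size_t budget) {
    std::vector<Node*> stack{root};
    std::size_t reached = 0;
    while (!stack.empty() && reached < budget) {
        Node* n = stack.back();
        stack.pop_back();
        ++reached;
        for (Node* k : n->kids) stack.push_back(k);
    }
    return reached;  // may be less than the total node count
}
```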

Writing kernel modules in Haskell

Posted Sep 14, 2009 6:03 UTC (Mon) by nybble41 (subscriber, #55106) [Link] (1 responses)

I'm not sure about the GC strategies you're referring to, but it is at least possible to implement a compacting GC algorithm in real-time. The trick is that you don't try to set an upper bound on the time required to traverse the entire active set; rather, you perform part of the GC at every allocation, and include some translation/bookkeeping when reading or writing pointer fields. The amount of work performed per allocation determines the amount of extra memory needed vs. your maximum active set: more time spent on GC per allocation results in more efficient use of memory.
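
A minimal sketch of this idea (hypothetical code; real incremental collectors also need write barriers and their own heap): each allocation pays for a bounded slice of marking work, so no single pause exceeds the per-allocation budget, at the cost of stale objects lingering longer.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

struct Obj {
    std::vector<Obj*> refs;
    bool marked = false;
};

// Incremental marker: a fixed number of "grey list" steps runs at every
// allocation instead of one long stop-the-world pause.
class IncrementalGC {
    std::deque<Obj*> grey_;      // objects discovered but not yet scanned
    std::size_t steps_per_alloc_;
public:
    explicit IncrementalGC(std::size_t k) : steps_per_alloc_(k) {}
    void add_root(Obj* o) { grey_.push_back(o); }

    // Each allocation performs at most steps_per_alloc_ units of marking.
    Obj* allocate() {
        for (std::size_t i = 0; i < steps_per_alloc_ && !grey_.empty(); ++i) {
            Obj* o = grey_.front();
            grey_.pop_front();
            if (o->marked) continue;
            o->marked = true;
            for (Obj* r : o->refs) grey_.push_back(r);
        }
        return new Obj;  // a real collector would carve this from its heap
    }
    std::size_t pending() const { return grey_.size(); }
};
```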

Writing kernel modules in Haskell

Posted Sep 14, 2009 8:26 UTC (Mon) by nix (subscriber, #2304) [Link]

Yes indeed. This is part of how the Metronome GC works, for instance: GC
times with that can be in the submillisecond range.

Writing kernel modules in Haskell

Posted Sep 14, 2009 18:57 UTC (Mon) by Pc5Y9sbv (guest, #41328) [Link]

Right: while deterministic GC is unsolvable (it reduces to the halting problem), the realistic goal is just to converge towards the same non-deterministic memory usage as well-designed manual memory management for the same application.

Through compile-time analysis and incremental GC methods, you can play with these space and time overheads and with their granularity.

Writing kernel modules in Haskell

Posted Sep 17, 2009 2:51 UTC (Thu) by Pseudonym (guest, #60861) [Link]

Of course. The issue here is essentially how you want to divide up work between the programmer and the implementation.

There's always the risk of your GC not having enough time to reclaim all of memory, but that is a risk with a non-GC'd implementation, too.

First off, in a concurrent environment deallocating memory is always more expensive than allocating it. You can allocate memory from whichever region has the lowest contention, but you can only deallocate it to the place that it came from.

Secondly, if your time constraints were that tight, then you probably didn't have enough time to manually free all that memory right now, either.

Yes, there's always the possibility of your program misbehaving. But in the real-time space, that's always a risk. Using a GC'd implementation doesn't free you from hard thinking, it just frees you from bothering with a certain amount of boilerplate.

Not just GC

Posted Sep 13, 2009 22:44 UTC (Sun) by salimma (subscriber, #34460) [Link] (3 responses)

.. it's worse than that: while Haskell supports SMP, its garbage collector is still single-threaded -- http://www.haskell.org/ghc/docs/latest/html/users_guide/u...

A friend of mine worked on it a couple of summers ago at MSR Cambridge; I guess it's not quite done yet.

Not just GC

Posted Sep 14, 2009 17:20 UTC (Mon) by TomMD (guest, #56998) [Link] (1 responses)

No - you can use a parallel GC with GHC. SimonM has some publications on the details of what they implemented [1]. Nevertheless, the RTS used by House (which I used as a base) is single-threaded, so of course the GC will be. Getting hold of multiple processors in the kernel for any stop-the-world GC would not be very cool.

[1] http://www.haskell.org/~simonmar/bib/bib.html

Not just GC

Posted Sep 15, 2009 20:07 UTC (Tue) by solidsnack (guest, #60840) [Link]

Last I checked, the multicore collector for GHC was a parallel, stop-the-world collector. From Parallel Generational-Copying Garbage Collection with a Block-Structured Heap (2008):

We focus on parallel, rather than concurrent, collection. In a concurrent collector the mutator and collector run at the same time, whereas we only consider garbage collecting in parallel while the mutator is paused.

Parallel GC doesn't directly address the problem of intermittent, hard to control GC latency. For soft real-time apps like the kernel, file systems, games and network services, this is less than ideal.

I haven't measured typical pause times for Haskell programs. Non-stop Haskell (2000) mentions average pause times of between 1 and 117 milliseconds on various benchmarks with GHC 4's standard collector. Their concurrent garbage collector, implemented for GHC 4, is able to get sub-millisecond average pause times (and, for a price, average pauses of much less than a millisecond).

Another interesting piece of work in this vein is John Meacham's region inference for JHC (2009). The idea is that some (all?) Haskell programs can have their memory statically allocated if that's desired; then you don't need a garbage collector.

I'd be very unhappy with my kernel module if it froze for 100ms; depending on what you're doing, even 1ms might be too much. On the other hand, strongly typed, purely functional programming is the way of the future and a lot of good work is being done to sort this stuff out.

Not just GC

Posted Sep 14, 2009 19:07 UTC (Mon) by simonmar (guest, #60806) [Link]

Thank you, I really need to fix that section in the manual :)

Writing kernel modules in Haskell

Posted Sep 14, 2009 0:31 UTC (Mon) by flewellyn (subscriber, #5047) [Link] (3 responses)

GC isn't the reason for Java's slowness. Blame that on the VM.

Writing kernel modules in Haskell

Posted Sep 14, 2009 12:55 UTC (Mon) by nye (subscriber, #51576) [Link] (2 responses)

Indeed. Java is perfectly fast enough for long running, non-graphical tasks.

I suspect this is why a lot of Java developers think that the 'Java slowness myth' is complete FUD, whereas users who sit twiddling their thumbs waiting for the JVM to start or some godawful Swing application to draw think those developers must be on crack.

Totally different experiences.

Writing kernel modules in Haskell

Posted Sep 15, 2009 1:01 UTC (Tue) by mikov (guest, #33179) [Link] (1 responses)

Offtopic:

As someone who has the misfortune of running Java applications on 200MHz x86 CPUs, I feel that it is my duty to announce in public: The "Java slowness" is not a myth! It is real :-)

The overabundance of computing power has made everybody lose their objectivity. Nowadays we even have people claiming that JavaScript (!!!) is as fast as C. Similarly, a couple of years ago we had (the same?) people claiming the same for Java. In reality, I am thinking that Java is probably about 10x slower than C, and JavaScript 10x slower than Java :-)

People seem to believe that JITs can offer infinite improvements. In reality I suspect that we will never see a 2x improvement for JavaScript JITs compared to now without changing the language itself.

Java speed arguments

Posted Sep 17, 2009 22:29 UTC (Thu) by smoogen (subscriber, #97) [Link]

It all depends on how the application is written. Many C applications can be quite boggy slow when you make them do the same things that most java bogs down on: graphics, UTF-8, etc.

Look at how slow grep and some other Linux applications got when they started dealing with UTF-8 environments versus good old 'C' environment. Or how slow some applications could be when run in a 'newer' graphical terminal versus say a console or older xterm.

The biggest problem that people run into is that few development groups work on the hardware many users have (though I don't think I have any 200Mhz working systems here anymore myself).

Writing kernel modules in Haskell

Posted Sep 13, 2009 15:11 UTC (Sun) by MisterIO (guest, #36192) [Link]

I'd probably try Vala instead.

Writing kernel modules in Haskell

Posted Sep 13, 2009 16:13 UTC (Sun) by dilinger (subscriber, #2867) [Link]

More likely, there's a crowd of people who already do kernel work, but are also haskell fans. Or
maybe it's just me. :)

Writing kernel modules in Haskell

Posted Sep 13, 2009 16:23 UTC (Sun) by bjacob (guest, #58566) [Link] (52 responses)

While this is obviously just a fun experiment, there are real use cases for writing kernel code in a language more feature-full than C.

An example is if a kernel module eventually needs to do matrix computations. C++ or Fortran matrix libraries are much more powerful than C ones (Fortran because of native array arithmetic; C++ because templates allow that and much more), so one could imagine a kernel module written in C++ with extern "C", for example. Actually, this will happen as soon as some kernel module needs to implement some clever math algorithm.

Writing kernel modules in Haskell

Posted Sep 13, 2009 16:49 UTC (Sun) by cesarb (subscriber, #6266) [Link] (40 responses)

It is more probable that the kernel developers will push the clever math algorithm to a userspace daemon.

Writing kernel modules in Haskell

Posted Sep 13, 2009 16:58 UTC (Sun) by bjacob (guest, #58566) [Link] (39 responses)

Sure, that makes a lot of sense. I don't have the experience though to tell if that is always possible or if sometimes stuff really must stay in-kernel. For example, what if a filesystem, or a scheduler, needs to do some statistical analysis that relies on a matrix computation, etc.

Writing kernel modules in Haskell

Posted Sep 13, 2009 21:10 UTC (Sun) by csamuel (✭ supporter ✭, #2624) [Link] (38 responses)

Then they'll probably do it in C for the general case and hand optimised
assembler for specific CPUs. ;-)

Writing kernel modules in Haskell

Posted Sep 13, 2009 21:13 UTC (Sun) by bjacob (guest, #58566) [Link] (37 responses)

That can sure be done -- with 10x the effort that it would take using a good C++ template library ;)

Writing kernel modules in Haskell

Posted Sep 13, 2009 21:30 UTC (Sun) by daney (guest, #24551) [Link] (34 responses)

I have worked with a Linux kernel module that was written in C++ *and* used
floating point. From a technical point of view, it is not difficult to do. The politics of merging it upstream on the other hand...

Writing kernel modules in Haskell

Posted Sep 13, 2009 22:57 UTC (Sun) by cesarb (subscriber, #6266) [Link] (32 responses)

Just for fun, an old (2004) thread about someone who did exactly that: http://kerneltrap.org/node/2067

Writing kernel modules in Haskell

Posted Sep 13, 2009 23:17 UTC (Sun) by bjacob (guest, #58566) [Link] (31 responses)

Interesting...

Linus's quote leaves little hope that C++ code will ever make it into the kernel, indeed.

Just for the record, here's how I would counter his claims (in the link you posted):

Quote:
"The fact is, C++ compilers are not trustworthy. They were even worse in 1992, but some fundamental facts haven't changed: 1) the whole C++ exception handling thing is fundamentally broken. It's _especially_ broken for kernels. 2) any compiler or language that likes to hide things like memory allocations behind your back just isn't a good choice for a kernel. 3) you can write object-oriented code (useful for filesystems etc) in C, _without_ the crap that is C++."

1) Doesn't matter; nobody's forced to use exceptions, and there are lots of C++ projects that don't (KDE, for example) for a variety of reasons. So exceptions being inappropriate isn't an argument against C++ in general.

2) C++ doesn't hide any memory allocation behind your back. When you replace ptr = (T*)malloc(sizeof(T)); by ptr = new T;, you haven't hidden the memory allocation!

3) meh, if Linus is happy about the object-oriented-C style that they use in the kernel, who am I to argue against that. I just have yet to hear the details of how C++ is "crap" here. It just does this repetitive work for you, in a standardized way.

***

It's funny that Linus doesn't mention what seems to me to be the most sensible reason to oppose C++: since C is such a simple language, it's far easier to organize a community of C programmers than a community of C++ programmers, to make everyone understand and use the same coding paradigms, to review patches, and so on.

Writing kernel modules in Haskell

Posted Sep 13, 2009 23:36 UTC (Sun) by cesarb (subscriber, #6266) [Link] (3 responses)

The best email I have seen so far from Linus on why he does not like C++ was in fact on the git mailing list: http://article.gmane.org/gmane.comp.version-control.git/5...

As a bonus, I just found two threads even older than the 2004 one (the link above is from 2007):

http://web.archive.org/web/20020821190433/kt.linuxcare.co... (from 2000)

http://web.archive.org/web/20020717162937/kt.linuxcare.co... (from 1999!)

Writing kernel modules in Haskell

Posted Sep 13, 2009 23:46 UTC (Sun) by bjacob (guest, #58566) [Link] (2 responses)

Ah, the 2007 link is better indeed (once you take out the Linus-isms!)

These arguments are against introducing c++ throughout the kernel, object models etc. Still, coming back to the original topic --- these arguments don't apply to the case where C++ would only be used in the back-end of a kernel module.

Writing kernel modules in Haskell

Posted Sep 14, 2009 6:26 UTC (Mon) by jgg (subscriber, #55211) [Link] (1 responses)

I've seen the 2007 mail used as a strong damning of C++ quite a few times, but honestly, I find it very strange when taken in context.

The general gist seems to be a damning of OOP principles, which I think is entirely appropriate for large swaths of git, but as usual, has very little to do with C++ in particular. The strange thing is that in the end git ended up using C for all the performance sensitive stuff and really sketchy languages like shell script for much of the hard-to-do-wrappering tasks that are painful in C..

To me the argument was never that a OOP/C++ style of programming should replace the excellent optimized inner stuff, but that a C++ style and language would have been a better, more efficient and maintainable choice than all the shell script/perl/etc.

But these days it seems a lot has been (painstakingly?) implemented in C, so no matter.

Anyhow, Linus is bang on, about most C++ programmers.. C programmers that understand and use OOP concepts tend to understand that it is just a tool and not everything needs to be done with OOP, while C++ programmers with narrow experience think everything is an OOP problem. Java programmers too. Usually makes a very long and painful mess.

To me the biggest programmer folly is to design, then execute, a huge slogging painful pointless mess. You see this a lot with inexperienced programmers. Some languages (like C and C++) seem to make it a lot worse. There is a lot of slogging to do in C if you want to handle errors sensibly.

Look at something like *swan or trousers for great examples of this in action. I had to write an IKEv2 daemon for commercial/embedded/special case/blah, and it turned out to be 8,000 LOC of C++. Does about 80% of what strongswan does and maybe another 5% that strongswan doesn't (the 5% was why this was done, no hope of fitting the 5% into strongswan without being a frickin expert at that mess). Strongswan is 140,000 lines of C. I kid you not, the difference is that great.

Did C++ cause this LOC reduction? Not really, just a smart design, a lot of care and an eye toward eliminating repetition.

Writing kernel modules in Haskell

Posted Sep 14, 2009 14:00 UTC (Mon) by marcH (subscriber, #57642) [Link]

> Anyhow, Linus is bang on, about most C++ programmers.. C programmers that understand and use OOP concepts tend to understand that it is just a tool and not everything needs to be done with OOP, while C++ programmers with narrow experience think everything is an OOP problem. Java programmers too. Usually makes a very long and painful mess.

Aaahhh, OOP... "Dude, not everything is an object."

http://steve-yegge.blogspot.com/2006/03/execution-in-king...

Writing kernel modules in Haskell

Posted Sep 14, 2009 0:14 UTC (Mon) by cesarb (subscriber, #6266) [Link] (12 responses)

Just a note in your second point: yes, C++ can hide memory allocation behind your back. Even a seemingly simple statement like "a = b + c;" can allocate memory in C++, while in C the only way that statement will allocate memory is if the compiler has to spill registers, and even then the allocation will be small and on the stack. In C++, the same "a = b + c;" statement could read from the disk, send a network packet, and draw a smiley face on your screen.

Of course, that can only happen if you create a class elsewhere which does memory allocation (and the other evil things) on "operator+" (and if you have not decided to forbid operators in your classes). But I think Linus's point is that, once you allow the language, people will want to use the features of the language, and before long you lose the ability to tell at a glance the costs and side effects of even simple operations like "a = b" (which in C can only mean the equivalent of a simple "MOV").

In fact, if you look at it this way, the fact that C is less expressive than C++ can be a feature. Having to write things like "obj->ops->fn(obj, foo)" instead of "obj->fn(foo)" makes explicit the extra costs (in this case, an additional memory read for the vtable and the extra "hidden" parameter).
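
For illustration, here is a sketch (the class and names are invented) of exactly the hidden cost described above: an innocent-looking "a = b + c" that performs a heap allocation, because operator+ builds its result in dynamically allocated storage.

```cpp
#include <cassert>
#include <cstddef>

static std::size_t allocations = 0;  // instrumentation for the sketch

struct BigNum {
    int* digits;  // pretend this is a big, heap-allocated digit array
    explicit BigNum(int v) : digits(new int(v)) { ++allocations; }
    BigNum(const BigNum& o) : digits(new int(*o.digits)) { ++allocations; }
    BigNum& operator=(const BigNum& o) { *digits = *o.digits; return *this; }
    ~BigNum() { delete digits; }
};

// The '+' looks free at the call site, but it allocates on the heap.
BigNum operator+(const BigNum& b, const BigNum& c) {
    return BigNum(*b.digits + *c.digits);
}
```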

Writing kernel modules in Haskell

Posted Sep 14, 2009 0:32 UTC (Mon) by bjacob (guest, #58566) [Link] (2 responses)

All these arguments are indeed defensible --- but again, they are arguments against using C++ throughout the kernel, and they hardly constitute an argument against using C++ internally in the back-end of a kernel module.

That said, if we discuss why C++ isn't used throughout the kernel --- I understand your point. One remark though: about your first paragraph, one might turn the argument upside down and say that C is evil because a function named print_hello() may actually do a lot of malloc's and send a network packet too ;) The difference between C++ and C here is that functions in C++ may appear in more varied forms than in C (operators, constructors, etc.). It's really a matter of knowing what to expect: an experienced C++ programmer will understand spontaneously when an operator+ is likely to have to allocate memory for the result. Which takes us back to what is really Linus's best argument against C++ throughout the kernel, and to your last paragraph... in other words, we agree.

Writing kernel modules in Haskell

Posted Sep 14, 2009 6:13 UTC (Mon) by drag (guest, #31333) [Link]

Well, a module is still just a hunk of the kernel, and for best results, long-term, it should be every module's goal to get integrated into the kernel proper. Therefore it's counterproductive to use any language other than the C that is acceptable to the kernel folks.

Now, the Haskell thing is an interesting experiment, and as such it does not make sense for it to be constrained by conventions. But it may make some change to the status quo if something is learned from it. So it's neat.

Writing kernel modules in Haskell

Posted Sep 14, 2009 15:17 UTC (Mon) by bfields (subscriber, #19510) [Link]

they are arguments against using C++ throughout the kernel, and hardly constitute an argument against using C++ internally in the back-end of a kernel module.

I don't see how you're using that distinction.

The bulk of the kernel is drivers. If drivers were written in a variety of languages, that would make the drivers as a whole more difficult to maintain.

Writing kernel modules in Haskell

Posted Sep 14, 2009 7:36 UTC (Mon) by alankila (guest, #47141) [Link] (5 responses)

Quibble: a = b can be a struct copy. It will turn into a memcpy, if I recall correctly.

Writing kernel modules in Haskell

Posted Sep 14, 2009 13:20 UTC (Mon) by cesarb (subscriber, #6266) [Link] (4 responses)

Oops, you are right, I forgot about that trick (which I have already used several times myself). Let me rephrase then: "a = b" in C can only mean the equivalent of a simple MOV for each byte of the variable (that is, it will read from each byte of "b" once, and write to each byte of "a" once; of course this will be optimized to read/write several bytes each time). The performance characteristics are still quite transparent.
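
A quick illustration (hypothetical struct) of why plain struct assignment stays transparent: the copy touches each byte of the source and destination once, with no constructors and no heap traffic.

```cpp
#include <cassert>
#include <cstring>

// A plain C-style struct: assignment is a member-wise byte copy, typically
// compiled down to a memcpy proportional to sizeof(Big).
struct Big {
    char payload[4096];
};

void copy_struct(Big& a, const Big& b) {
    a = b;  // byte-for-byte copy; nothing else runs
}
```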

Writing kernel modules in Haskell

Posted Sep 14, 2009 21:27 UTC (Mon) by alankila (guest, #47141) [Link] (2 responses)

Yes. I fully agree with your basic argument.

Since Java is sort of C++'s successor, perhaps I'm justified in dragging it into this discussion. There is something fairly nice about the way it has essentially two languages in it: a simple unextendable arithmetic language for doing underlying operations with -- no confusion possible about what any of the expressions mean, and it's analogous to C -- and an object language which is used for extending the core language. It has plenty of the good parts of C++, I guess, but I would agree that it's far from perfect.

However, the discussion we are having now cannot occur in Java. In fact, since defining classes is the only possible way you can extend Java in any sense of the word, you are even encouraged not to know or care what happens behind a method. It's an unusual dichotomy, but I must agree that it works. It's sort of brilliant, even.

(I'm watching myself turn into a Java enthusiast. Oh dear.)

Writing kernel modules in Haskell

Posted Sep 16, 2009 2:37 UTC (Wed) by ringerc (subscriber, #3071) [Link] (1 responses)

(I'm watching myself turn into a Java enthusiast. Oh dear.)

I'm suffering the same pain. Built-in GUI toolkit (even if it's pretty dire), sane and complete libraries, native Unicode string class (do not mention std::wstring in my presence), no build system pain, solid built-in threading and concurrency ... I'm getting to like it.

I really miss RAII, though. It's not really possible in Java, as there's no way to ensure a dtor fires reliably when an object exits scope, as with stack-based instances in C++. It's a real shame and a pain point for me, as working with program flow when exceptions are involved is excruciating without RAII. If I'm missing something and there's a reasonable way to do or approximate RAII in Java, please let me know...

Writing kernel modules in Haskell

Posted Sep 17, 2009 6:10 UTC (Thu) by pynm0001 (guest, #18379) [Link]

Just like I wouldn't write C without one of glib or apr, I wouldn't write
C++ without Qt. Just as an aside, Qt has a built-in GUI toolkit (not dire,
btw), sane and complete libraries (although not as comprehensive as Java
I'd imagine), a native Unicode string class, minimal build system pain
(usually as easy as qmake -project && qmake && make), and solid built-in
threading and concurrency.

And you get RAII.

Writing kernel modules in Haskell

Posted Sep 15, 2009 9:42 UTC (Tue) by etienne_lorrain@yahoo.fr (guest, #38022) [Link]

> of course this will be optimized to read/write several bytes each time

Do not be so sure of this "of course", for instance GCC:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22141
and last time I tried Clang (LLVM) was worse (mov $4,ecx; rep mov stosb).

I have to say that older compiler versions copied structure fields more efficiently, because they were not optimised for compiling C++ code, which may trigger a constructor for each field of the structure.

Writing kernel modules in Haskell

Posted Sep 18, 2009 1:25 UTC (Fri) by kbob (guest, #1770) [Link] (2 responses)

Even a seemingly simple statement like "a = b + c;" can allocate memory in C++, while in C the only way that statement will allocate memory is if the compiler has to spill registers

    #define a (buf[read(open("/dev/hda", 0), buf, sizeof buf) - 1])
    #define b (send(some_sock, buf, sizeof buf, 0))
    #define c (system("xloadimage /usr/share/cups/doc-root/images/smiley.jpg &"))

You were saying?

Writing kernel modules in Haskell

Posted Sep 18, 2009 4:04 UTC (Fri) by njs (subscriber, #40338) [Link] (1 responses)

Don't be ridiculous, the kernel wouldn't hide magic in macro calls.

Writing kernel modules in Haskell

Posted Sep 29, 2009 16:14 UTC (Tue) by Auders (guest, #53318) [Link]

Of course, no one would do anything like that
http://lwn.net/Articles/308520/
:)

Writing kernel modules in Haskell

Posted Sep 14, 2009 0:17 UTC (Mon) by Ed_L. (guest, #24287) [Link] (13 responses)

When you replace ptr = (T*)malloc(sizeof(T)); by ptr = new T;, you haven't hidden the memory allocation!

evaluates "true" only when ptr = new T; is in fact replaceable by ptr = (T*)malloc(sizeof(T)). Are you certain you want to debate this one? ;)
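
The distinction can be sketched with a hypothetical type whose constructor does real work: new T runs the constructor, while malloc hands back raw, uninitialized bytes, so the two are only interchangeable for types with trivial construction.

```cpp
#include <cassert>
#include <cstdlib>

struct T {
    int value;
    T() : value(42) {}  // construction does real work
};

// new T: allocates AND runs T's constructor.
// malloc(sizeof(T)): allocates raw memory only; reading ->value from it
// before construction would be undefined behavior.
```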

C++, the steampunk language

Posted Sep 14, 2009 7:24 UTC (Mon) by felixfix (subscriber, #242) [Link] (11 responses)

My experience with C++ is summed up by its having replaced the simple malloc/free paradigm with THREE different ways to allocate: new/delete, new[]/delete[], and the original. Heaven help you if you mix them up. (I may have details wrong here; it's been a good long time since I had to deal with that muck.)

That is no way to design anything. I remember being disgusted with quite a few other things, which I have gratefully mostly forgotten. Virtual should have been the default, I think; instead it's some pretense at giving you another knob to fine-tune efficiency. You can define methods with a ridiculous number of keywords -- more pretentious knobs. There are a zillion things you can do to hide the complexity of your class, to the point that there is no way to tell which operations are expensive, and debugging is a nightmare of chasing from one source file to another to see the secret details of classes, to find out what little pitfalls were thrown in simply because they could be. People like to call Perl line noise, but C++ makes Perl look like a nice kindergarten language in the ways you can write write-only code, and every fool CS major thinks he is cool for doing so.

It reminds me of steampunk after it got past the initial fun stage: full of pointless knobs stuck on in an attempt to look like there's a reason behind them, when all they do is ... nothing except add clutter to make a fashion statement.

C++, the steampunk language

Posted Sep 14, 2009 8:11 UTC (Mon) by alankila (guest, #47141) [Link]

C++ is, sadly, an expert's language. For good or bad, all that crap is there to squeeze out every last performance bit available -- if you know how to use them and judge them worth the effort. C++ had to beat C on its own turf, and thus every feature of C++ was compromised (or made completely optional) to ensure that C++ could compile at least as well as C.

From design point of view, C++ seems to be quite unpleasant to program with, so I agree with you on that one. I think I feel some kind of a personal laundry list banging the backside of my skull wanting to come out, but there's no point to it -- C++ is a lost cause, and as long as backwards compatibility is required there is not much improvement on the horizon.

C++, the steampunk language

Posted Sep 14, 2009 16:17 UTC (Mon) by tjc (guest, #137) [Link] (4 responses)

My experience with C++ is summed up by having replaced the simple malloc/free paradigm with THREE different ways to allocate : new/delete, new[]/delete[], and the original.
Since you bring it up, do you happen to know why C++ requires a separate form (new[] v. new) for allocating and deallocating arrays?

I'm aware that arrays are converted to pointers in parameter declarations and (most) expressions, but I don't think that's an issue here. There should be enough information in the symbol table to identify an object as an array and [de]allocate memory for it without resorting to a special form.

C++, the steampunk language

Posted Sep 14, 2009 16:21 UTC (Mon) by avik (guest, #704) [Link] (1 responses)

delete[] needs to call the destructors of every element of the array, so it needs to know it is destroying an array.
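
A small illustration (the Tracked type is invented): delete[] must run the destructor for every element of the array, which is why the language needs a form that knows an array is being destroyed.

```cpp
#include <cassert>

static int destroyed = 0;  // counts destructor invocations

struct Tracked {
    ~Tracked() { ++destroyed; }
};

// delete[] p runs ~Tracked once per element; plain delete on the same
// pointer would be undefined behavior.
```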

C++, the steampunk language

Posted Sep 14, 2009 16:30 UTC (Mon) by nye (subscriber, #51576) [Link]

Note that there is a bit more to it than that, otherwise it would be okay to use delete on an array of POD types, which it isn't (see my other post). That's undoubtedly the root reason though.

C++, the steampunk language

Posted Sep 14, 2009 16:26 UTC (Mon) by nye (subscriber, #51576) [Link] (1 responses)

The first response to http://stackoverflow.com/questions/659270/why-is-there-a-... gives a reasonable answer.

C++, the steampunk language

Posted Sep 14, 2009 17:43 UTC (Mon) by tjc (guest, #137) [Link]

Thanks for the link. There is some other useful information on that page; see the post by Andrew Grant.

IIUC the size of an object could be stored in the symbol table, but that would only work with objects within lexical scope; a dynamically allocated object returned from a library function would still be a problem, since there is no portable way to attach size information to an object in C.

C++, the steampunk language

Posted Sep 14, 2009 17:37 UTC (Mon) by jgg (subscriber, #55211) [Link] (4 responses)

> My experience with C++ is summed up by having replaced the simple malloc/free paradigm with THREE different ways to allocate : new/delete, new[]/delete[], and the original. Heaven help you if you mix them up. (I may have details wrong here; it's been a good long time since I had to deal with that muck.)

And C has an infinite variety of alloc/free functions:

    Foo *foo = calloc(1, sizeof(Foo));
    Foo *foo = LIBRARY_foo_Alloc();
    Foo foo; LIBRARY_foo_Init(&foo);
    Foo **foo = malloc(n * sizeof(Foo *));
    for (int i = 0; i != n; i++) foo[i] = LIBRARY_foo_Alloc();

It is ever so much fun mixing and matching them! Heaven help you if you use the incorrect free for the alloc that was chosen!!

new/new [] is actually a substantial increase in API uniformity compared to C and the multitude of libraries.

C++, the steampunk language

Posted Sep 14, 2009 18:28 UTC (Mon) by felixfix (subscriber, #242) [Link] (1 responses)

Why are you dragging in library routines? They have nothing to do with core C.

C++, the steampunk language

Posted Sep 14, 2009 20:06 UTC (Mon) by jgg (subscriber, #55211) [Link]

> Why are you dragging in library routines? They have nothing to do with core C.

Wow, that is just so pedantic. You do realize that the ISO C standard includes a library, and that library includes functions like I described (fopen/fclose for instance). SUS adds more. It isn't like there is another way to do this.

C++, the steampunk language

Posted Sep 14, 2009 19:46 UTC (Mon) by bronson (subscriber, #4806) [Link] (1 responses)

Er, by definition C++ has all of C's "infinite variety" of alloc/free functions. Plus it adds more of its own (it's great fun watching people incorrectly override delete[]... from a distance).

If you say, "well, don't use X in C++" then I'll just say "don't use X in C."

C++, the steampunk language

Posted Sep 15, 2009 14:33 UTC (Tue) by nix (subscriber, #2304) [Link]

Yes, but in C++ code you don't use that 'infinite variety': you wrap whateveritis up in an object and call the appropriate C allocation/freeing functions *once*, from the constructor and destructor of that object. Then you can consistently use the new/delete family of functions *everywhere* else. So the chaos of allocation and freeing functions that can easily be mixed up transforms into a single allocation/freeing function for all objects and a single one for all arrays of objects (which are sort of deprecated now 'cos the STL types are just as good). And it becomes impossible to use the wrong allocation and freeing functions anywhere, and memory leaks become harder to accidentally introduce as well (as the majority of object allocations are on the stack, so construction and destruction are automatic).

(The same applies to malloc()/free(): they shouldn't be used in well-written C++ programs, at all. They're not typesafe.)

Writing kernel modules in Haskell

Posted Sep 17, 2009 6:17 UTC (Thu) by pynm0001 (guest, #18379) [Link]

> > When you replace ptr = (T*)malloc(sizeof(T)); by ptr = new T;, you
> > haven't hidden the memory allocation!

> evaluates "true" only when ptr = new T; is in fact replaceable by ptr =
> (T*)malloc(sizeof(T)); Are you certain you want to debate this one? ;)

You mean as opposed to the more generic C:

ptr = (T*)malloc(sizeof(T));
foo_type_initialize_default(ptr);

that a C++ "ptr = new T;" is capable of expressing? You weren't going to
use that ptr uninitialized were you? This is 2009, there's no reason to
be using uninitialized structs!

Writing kernel modules in Haskell

Posted Sep 15, 2009 0:24 UTC (Tue) by jengelh (guest, #33263) [Link]

>I have worked with a Linux kernel module that was written in C++ *and* used
floating point.

VMware WS has, or used to have, some C++ (IIRC only for some little templates; there were not too many visible C++ classes, operator overloads or otherwise).

The latter is not particularly hard ever since the FPU state is properly saved and restored.

Writing kernel modules in Haskell

Posted Sep 13, 2009 21:31 UTC (Sun) by Randakar (guest, #27808) [Link]

10x the effort might still be cheap enough for some people.

A factor-10 effort multiplication on some minor addition to an already working system might easily be dwarfed by a) how much effort is already expended daily on regular maintenance, and b) the effort saved further down the chain because a million distributors and kernel devs don't need to understand how to get Haskell going for their kernel build when that module is written in C ;-)

Writing kernel modules in Haskell

Posted Sep 14, 2009 9:01 UTC (Mon) by dgm (subscriber, #49227) [Link]

First, the 10x figure is bogus at best.
Second, 1/10th the development time at the cost of losing control would be OK for a one-off number-crunching app, but it's not an option for an OS kernel.
Finally, if you care so much about rapid development, then maybe C++ is not your best option. Go with Python (I have read there are wonderful math libraries) or something like that.
That's coming from somebody who has been writing C++ code for the last 14 years.
My biggest gripe with C++ is that it's like an ubercomplex laser scalpel: great in the right hands, deadly in the wrong ones. It's a great language for top-down communication (for example, writing libraries), but sucks for interchange of code among peers.

Writing kernel modules in Haskell

Posted Sep 14, 2009 9:22 UTC (Mon) by sagrailo (guest, #48054) [Link] (10 responses)

> An example is if eventually a kernel module needs to do matrix computations. C++ or Fortran matrix libraries are much more powerful than C ones (Fortran because of native array arithmetic; C++ because templates allow to do that and much more)...

If you need vector/matrix computation, you need to use BLAS/LAPACK/etc., period; no one should try to re-implement that kind of code on his own. As for general numerical calculations, I'd agree with your statement that so far Fortran compilers were able to generate much better code than C compilers, but I certainly disagree that C++ template numerical libraries are better than C code - I've worked with a number of them, and they are all crap regarding performance, readability, etc.; this stuff is good only if you're into writing papers on how smart and wonderful your usage of templates is.

Writing kernel modules in Haskell

Posted Sep 14, 2009 21:23 UTC (Mon) by gmaxwell (guest, #30048) [Link] (1 responses)

"If you need vector/matrix computation, you need to use BLAS/LAPACK/etc., period;"

Any kind? Er, so the kernel should be including megabytes of unswappable finite field matrix algebra library code just to perform raid-6? I think not…

General tools are nice for general things. The kernel is not, almost by definition, a general thing.

Writing kernel modules in Haskell

Posted Sep 15, 2009 10:24 UTC (Tue) by sagrailo (guest, #48054) [Link]

Have you ever actually looked into the layout of a BLAS implementation? Netlib BLAS consists of hundreds of small files, mostly one file per numerical method, and it's easy to pick up the needed files and put them in your code. On the other hand, Netlib BLAS is admittedly just a reference implementation, not fast at all (and also actually requiring a Fortran 77 compiler to build), and things get more complicated with optimized BLAS implementations. For example, with ATLAS these files are generated, so either you'd have to generate them up-front for all needed processors, or include the whole infrastructure for auto-generating them in your code (lots of code there, indeed, but if you wanted to do it right yourself, you'd actually have to write all of this too). As for vendor libraries (like MKL etc.), you may have better luck negotiating with vendors to provide you with some kind of license to use the needed routines. But in any case you're better off using an existing solution than reimplementing it yourself (especially as numerical codes are notoriously hard to get right, which is why it makes even more sense to stick to re-using existing code).

Writing kernel modules in Haskell

Posted Sep 14, 2009 22:50 UTC (Mon) by Ed_L. (guest, #24287) [Link] (7 responses)

I'm sure you have good foundation for your opinions. But my experience, specifically with the Template Numeric Toolkit, is decidedly different. In one application where I implemented Lapack and TNT equivalents side by side, the TNT code was roughly twice as fast as Lapack. This was on CentOS, GCC 4.1.2. I don't recall whether I was linking the system-supplied Lapack or one I custom compiled, and I don't know if the difference was attributable to optimization flag differences or not. FWIW, library code is typically optimized at -O2; my TNT segment was -O3, and there were several adjacent function-call candidates for successive inline optimization. Perhaps a different example would skew a different direction.

Call me a crappy coder, but I have been singularly unsuccessful using multidimensional C++ arrays without resort to template wrappers, for which there is a discernible run-time penalty only for the smallest of arrays. (Numerical Recipes style solutions also work.)

Writing kernel modules in Haskell

Posted Sep 15, 2009 10:35 UTC (Tue) by sagrailo (guest, #48054) [Link] (6 responses)

Yeah, right... I just quickly wrote this code (to multiply two matrices):

#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <ctime>
#include <iostream>

extern "C" {
#include <cblas.h>
}

#include <tnt.h>

#define N 1024

int
main(void)
{
    float* A = new float[N * N];
    float* B = new float[N * N];
    float* C = new float[N * N];
    std::fill(C, C + N * N, 0);

    TNT::Array2D<float> ATNT(N, N);
    TNT::Array2D<float> BTNT(N, N);
    TNT::Array2D<float> CTNT(N, N);

    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j) {
            A[i * N + j] = ATNT[i][j] = 1;
            B[i * N + j] = BTNT[i][j] = 1;
        }

    clock_t start, end;

    start = clock();
    cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, N, N, N, 1, A, N, B,
                N, 0, C, N);
    end = clock();
    std::cerr
        << "ATLAS: "
        << (float) (end - start) / CLOCKS_PER_SEC
        << "s"
        << std::endl;

    start = clock();
    CTNT = TNT::matmult(ATNT, BTNT);
    end = clock();
    std::cerr
        << "TNT:   "
        << (float) (end - start) / CLOCKS_PER_SEC
        << "s"
        << std::endl;

    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j) {
            assert(C[i * N + j] == N);
            assert(CTNT[i][j] == N);
        }

    delete [] A;
    delete [] B;
    delete [] C;
}

Compiled with (I've put TNT into /tmp, and I have ATLAS installed in /usr/local):

g++ -O3 -o foo -I/tmp/tnt foo.cpp -lcblas -latlas

The output, on my old ThinkPad T61p:

ATLAS: 0.22s
TNT:   17.64s

So C++ is almost two orders of magnitude slower (and that's typical, in my experience).

Writing kernel modules in Haskell

Posted Sep 15, 2009 12:01 UTC (Tue) by jbh (guest, #494) [Link] (4 responses)

Interesting! I've wanted to look into Eigen as well, which is another
template linear algebra library (http://eigen.tuxfamily.org). It seems to
hold up pretty well against Atlas (on Thinkpad T60, compiled with -msse2
flag):

ATLAS: 0.43s
TNT: 32.25s
Eigen: 0.76s

The major additions to your test program are (in different places):

#include <Eigen/Core>
--
MatrixXf Aeig(N,N), Beig(N,N), Ceig(N,N);
Aeig.setOnes(); // or Aeig = MatrixXf::Ones(N,N);
Beig.setOnes();
Ceig.setZero();
--
Ceig = Aeig * Beig;
--
assert(Ceig(i,j) == N);

Writing kernel modules in Haskell

Posted Sep 15, 2009 12:50 UTC (Tue) by bjacob (guest, #58566) [Link] (3 responses)

Hm, small world, I'm actually one of the main developers of Eigen. I didn't want to make a plug, but now that you've mentioned it...

I'm surprised by your result, because in our experience, we are consistently faster than ATLAS for matrix products. Already with Eigen 2.0, and even more so in the development branch. In fact, on single-CPU systems, we have almost the performance of vendor-tuned BLAS.

development branch:
http://eigen.tuxfamily.org/index.php?title=Benchmark

2.0:
http://eigen.tuxfamily.org/index.php?title=Benchmark-Augu...

It's good that you passed -msse2, indeed.

Some things to consider:

* did you allow ATLAS to use more than one thread? That would probably not be applicable in the setting of a kernel module (as far as I understand), so perhaps it's fairer to require ATLAS to use only one thread.

* did you tune ATLAS for your hardware, especially CPU cache size? You can do the same for Eigen, then. For example, with the development version it's EIGEN_TUNE_FOR_CPU_CACHE_SIZE; see the file Macros.h.

Writing kernel modules in Haskell

Posted Sep 15, 2009 13:09 UTC (Tue) by jbh (guest, #494) [Link] (1 responses)

In short, no tuning at all, neither ATLAS or Eigen. Standard ubuntu (jaunty)
packages for everything. This was on my laptop, and I only bother to tune
for production runs; those are not on my laptop :)

Eigen already looks very impressive, for a relatively young project in a
problem space that's dominated by dinosaurs --- I've meant to look into it
for a while, but not enough time and all that, and last time I checked it
didn't have sparse matrix support (I see it does now, although
experimental).

Writing kernel modules in Haskell

Posted Sep 15, 2009 13:29 UTC (Tue) by bjacob (guest, #58566) [Link]

OK, my best guess then is that ATLAS by default uses more than one thread...
anyway thanks for the kind words. Yes, there are a lot of dinosaurs in this area and we're sort of the small mammals in comparison.

(If it wasn't obvious, the largest advantage of a C++ template library like Eigen over a large C or Fortran binary library like a BLAS is that it generates only the code that's needed at build time, and after that you can forget about it, so your kernel module is a self-contained few dozen kilobytes; by comparison, vendor-tuned BLAS libraries take dozens of megabytes.)

Writing kernel modules in Haskell

Posted Sep 29, 2009 23:14 UTC (Tue) by Auders (guest, #53318) [Link]

I did the same test with standard jaunty packages and the same compilation parameters and for me Eigen outperformed ATLAS, as you expected:
ATLAS: 0.69s
TNT: 21.93s
Eigen: 0.56s

Matrix Libraries

Posted Sep 16, 2009 20:50 UTC (Wed) by cry_regarder (subscriber, #50545) [Link]

Of course ublas has bindings for atlas so you can use this approach:

...
#include <boost/numeric/ublas/matrix.hpp>
#include <boost/numeric/bindings/atlas/cblas.hpp>
#include <boost/numeric/bindings/traits/ublas_matrix.hpp>
...
namespace ublas = boost::numeric::ublas;
namespace atlas = boost::numeric::bindings::atlas;
...
atlas::gemm(CblasNoTrans, CblasNoTrans, 1.0,
Aublas, Bublas, 0.0, Cublas);
...

and this compile:
g++ -O3 -o mat mat.c++ -I/tmp/boost-numeric-bindings -L /usr/lib64/atlas -latlas -lcblas -DNDEBUG

to get run times like:

ATLAS: 0.16s
uBLAS: 0.16s

...

Cry

Writing kernel modules in Haskell

Posted Sep 14, 2009 8:31 UTC (Mon) by simlo (guest, #10866) [Link] (7 responses)

Yes!

At my work I write "drivers" in Java (in user space, through simple native calls to get access to memory-mapped hardware registers). There is nothing wrong with writing hardware-near stuff in higher-level languages, where you have no direct access to low-level bit manipulation. In fact, to make C portable you must avoid doing the very low-level bit manipulations directly and put them away in libraries instead. Then you can just as well access those libraries from another language.

In fact, if I were to write an OS kernel from scratch, I would start out by making a small runtime system for a language like OCaml and then write drivers, filesystems etc. in that language. Some of the scheduler and the garbage collector would have to be written in C. The major technological hurdle would be to make a scalable, realtime garbage collector, but if you can construct it along with the compiler and the scheduler, it might turn out to be easier than doing it in userspace. An "almost functional" language also helps here: you make far fewer memory modifications than in Java, i.e. you can optimize the system for "many readers, few writers". Old memory can not have pointers to new memory.

This also tells us one of the benefits: better scalability, since the garbage collector supersedes the RCU. A functional language makes it natural to write good RCU client code, since you never change objects. You (almost) only need locks to protect the hardware from simultaneous access, since (almost) all ordinary memory structures are read-only. And a strongly typed functional language makes for much safer code. Writing the kernel in such a system would be a more efficient way to get a stable system than using microkernels.

Writing kernel modules in Haskell

Posted Sep 14, 2009 14:56 UTC (Mon) by PaulMcKenney (✭ supporter ✭, #9624) [Link]

Still working on getting the equivalent of rcu_dereference() and rcu_assign_pointer() into the Java standard. Thus far, the powers that be remain unamused by this suggestion. That said, there seemed to be a bit less resistance this time around than last time.

In the meantime, make sure you use the "volatile" keyword appropriately...

Writing kernel modules in Haskell

Posted Sep 15, 2009 18:10 UTC (Tue) by dambacher (subscriber, #1710) [Link]

don't reinvent!
Look at smalltalk80 .-)=====

Writing kernel modules in Haskell

Posted Sep 17, 2009 10:56 UTC (Thu) by bboissin (subscriber, #29506) [Link] (3 responses)

Some students at my school did a small OCaml kernel some years ago; it was
booting in qemu and on real hardware.

http://gna.org/projects/funk/

Writing kernel modules in Haskell

Posted Sep 18, 2009 14:19 UTC (Fri) by simlo (guest, #10866) [Link] (2 responses)

Last time I looked at OCaml, the implementation was bad: it used a "grand interpreter lock" (just like Python). You have to get rid of that for it to make sense.

I also disagree with the main page statement

"It could then be used as an educational tool for teaching OS design, for instance, or may be just replace Linux before Hurd goes out. "

Sorry, it is the other way around: it would be faster and more stable than any microkernel; it is just a question of optimizing the compiler. Even in very optimized assembly a microkernel will be slower - and each "service" will be more buggy since it will typically be written in C.

Writing kernel modules in Haskell

Posted Sep 18, 2009 14:30 UTC (Fri) by bboissin (subscriber, #29506) [Link] (1 responses)

Here Ocaml isn't run in interpreted mode, and maybe the GC has a global
lock but I don't think there's any other lock when Ocaml is run natively.

Secondly, I think you took that comment too seriously, the reference to
hurd shows it was meant as a joke :)
(It was just students having fun building an OS from scratch)

Writing kernel modules in Haskell

Posted Sep 19, 2009 22:41 UTC (Sat) by simlo (guest, #10866) [Link]

Well, I haven't checked again, but years ago I looked into it and that version of OCaml had something similar to the "grand interpreter lock" even though it was compiled to native code :-(

But with help from the compiler and the scheduler, it ought to be possible to make a system with no central lock - making it scale - although it will be very hard. But using an (almost) pure functional language will help with that: you for instance know that an old piece of memory can never have references to later-allocated memory, since it wasn't changed. And in the few places where you do update a reference, the compiler can put in checks to help the GC, as well as the right read/write barriers for readers on other CPUs.

Well, I can always keep on dreaming..

assignment-free garbage-collected data processing considered no substitute for cache locality

Posted Sep 27, 2009 13:46 UTC (Sun) by xoddam (guest, #2322) [Link]

Unfortunately one consequence of avoiding assignment in your language is that sane-looking code requires new locations for new data. The only way to say "I want this data updated in place" is to declare a monad for that data, and a multitude of monads is probably as problematic to debug as C code.

Otherwise you will end up doing *everything* as garbage-collected RCU. Great, in theory, for scalability on the grand scale but dreadful for performance on typical machines with only a few CPUs(!) and a few gigabytes (!) of RAM.

Perhaps best to implement the kernel of your kernel in well-audited C but begin to adopt the secure languages for peripheral components such as device drivers.

Writing kernel modules in Haskell, you might say.

Writing kernel modules in Haskell

Posted Sep 21, 2009 14:12 UTC (Mon) by shlomif (guest, #11299) [Link] (4 responses)

this IRC conversation on my page should be instructive (http://www.shlomifish.org/humour/fortunes/shlomif.html#sh...) (I am "rindolf" there). Essentially, even an elegant dinky "log analyser" that people showed me how to write properly using ghc segfaulted. And I don't mean thrown an exception - it segfaulted. As much as I like Haskell, I still think the language has many inherent limitations due to its restrictive design decisions, and ghc still has many bugs, and is a huge memory hog (possibly due to a good reason). Haskell and ghc are still fine to play with and highly enlightening, but I'd never use them in anything "industrial strength", that I want to depend on.

Someone I talked with on IRC, said he'd like to create an alternative operating system written in Haskell, that would be incompatible with Linux. Then when I said no one will use it because it won't run any UNIX (or Windows for that matter) applications (http://www.joelonsoftware.com/articles/fog0000000054.html), he said that "fine, we can implement a nice POSIX emulation subsystem in a jail.". The reason GNU/Linux was successful was because the GNU run-time and other open-source UNIX/POSIX programs were ported to it easily, and in time it also proved attractive to vendors of commercial and/or non-FOSS software. If the Linux kernel was incompatible with anything else, it would have faded into obscurity.

So I don't see why I should use the new OS with a POSIX-compatible system that will run everything I'm using now more slowly, less natively and less reliably. Granted, the C/UNIX/TCP/IP system is not without its inherent philosophical problems, but everything today relies on it too much. In a Star Trek screenplay I started writing (a fan-fiction of sorts) (http://www.shlomifish.org/humour/Star-Trek/We-the-Living-...) I whimsically contemplated that their tricorder still runs a UNIX-based OS, and can be operated in X mode from the command line. I'm not sure UNIX (whether Linux, BSD or a different kernel) will still be prevalent 300 years from now, but we cannot rule it out.

My experience told me that such grand re-inventing everything from scratch schemes tend to amount to very little of promise. (And yes - I've tried some of them myself). Inertia is important and like it or not, reality needs to be obeyed in order to be conquered.

In short - nice hack, but not terribly useful. The kernel developers are likely to reject Haskell modules, and I would too in their shoes. Now, it's time for Perl, Python, Ruby, Lua, Tcl, PHP, etc. to reciprocate... ;-)

Writing kernel modules in Haskell

Posted Sep 21, 2009 14:13 UTC (Mon) by shlomif (guest, #11299) [Link] (3 responses)

(Sorry for the bad formatting, I pressed the wrong button by accident.)

this IRC conversation on my page should be instructive (I am "rindolf" there). Essentially, even an elegant dinky "log analyser" that people showed me how to write properly using ghc segfaulted. And I don't mean thrown an exception - it segfaulted. As much as I like Haskell, I still think the language has many inherent limitations due to its restrictive design decisions, and ghc still has many bugs, and is a huge memory hog (possibly due to a good reason). Haskell and ghc are still fine to play with and highly enlightening, but I'd never use them in anything "industrial strength", that I want to depend on.

Someone I talked with on IRC, said he'd like to create an alternative operating system written in Haskell, that would be incompatible with Linux. Then when I said no one will use it because it won't run any UNIX (or Windows for that matter) applications, he said that "fine, we can implement a nice POSIX emulation subsystem in a jail.". The reason GNU/Linux was successful was because the GNU run-time and other open-source UNIX/POSIX programs were ported to it easily, and in time it also proved attractive to vendors of commercial and/or non-FOSS software. If the Linux kernel was incompatible with anything else, it would have faded into obscurity.

So I don't see why I should use the new OS with a POSIX-compatible system that will run everything I'm using now more slowly, less natively and less reliably. Granted, the C/UNIX/TCP/IP system is not without its inherent philosophical problems, but everything today relies on it too much. In a Star Trek screenplay I started writing (a fan-fiction of sorts) I whimsically contemplated that their tricorder still runs a UNIX-based OS, and can be operated in X mode from the command line. I'm not sure UNIX (whether Linux, BSD or a different kernel) will still be prevalent 300 years from now, but we cannot rule it out.

My experience told me that such grand re-inventing everything from scratch schemes tend to amount to very little of promise. (And yes - I've tried some of them myself). Inertia is important and like it or not, reality needs to be obeyed in order to be conquered.

In short - nice hack, but not terribly useful. The kernel developers are likely to reject Haskell modules, and I would too in their shoes. Now, it's time for Perl, Python, Ruby, Lua, Tcl, PHP, etc. to reciprocate... ;-)

Writing kernel modules in Haskell

Posted Sep 23, 2009 5:45 UTC (Wed) by TomMD (guest, #56998) [Link] (2 responses)

> this IRC conversation on my page should be instructive (I am "rindolf" there).

Perl folks bashing on Haskell in #perl is instructive? I think you're a tad under the bar.

> Essentially, even an elegant dinky "log analyser" that people showed me how to write properly using ghc segfaulted.

On a practical level: this isn't helpful without a version of GHC, platform information, and a copy of the program that was "properly written". On a theoretical level: this says nothing about the language.

> As much as I like Haskell, I still think the language has many inherent limitations due to its restrictive design decisions

There are many design decisions I'd happily question - but I wouldn't call those ones limiting. Which ones are you referring to?

> and ghc still has many bugs

True! But it's been a while since I've been caught by a segfault.

> and is a huge memory hog

Are you saying your programs are huge memory hogs? The compilation process takes too much memory? Perhaps you are claiming "all" or "too many" programs use unnecessary amounts of memory? If your claim is either of the second two, I must disagree. I've built many memory-friendly programs, but it does take a little more skill to predict the memory use than most people think it should.

> Someone I talked with on IRC, said he'd like to create an alternative operating system written in Haskell

Hundreds of them out there (alternative OSes) and Haskell is fun to play with. I'd like to do this too. His lack of care about POSIX doesn't say anything about the language or the tool - so I'm not sure what you're getting at there or if you're just going on a tangent.

> The kernel developers are likely to reject Haskell modules

Upstreaming was never a goal - oh, that would be a headache without any reward!

Writing kernel modules in Haskell

Posted Sep 27, 2009 7:25 UTC (Sun) by shlomif (guest, #11299) [Link] (1 responses)

> > Essentially, even an elegant dinky "log analyser" that people showed me how to write properly using ghc segfaulted.

> On a practical level: this isn't helpful without a version of GHC, platform information, and a copy of the program that was "properly written". On a theoretical level: this says nothing about the language.

Here is the Haskell program in all its glory - I lost the original one, but people on #haskell have recreated a mostly equivalent one:

import System.Environment
import Data.List

main = do [infile,outfile] <- System.Environment.getArgs
          text <- readFile infile
          writeFile outfile . unlines . map (\xs -> (show (length xs)) ++ "\t" ++ head xs) . group . sort . words $ text

This program no longer segfaults (even though I recall it segfaulting), but it consumes 38.3% of my 2.5 GB of RAM (I still remember my first 20 MB hard-disk), and takes a long time to run. On the other hand, the equivalent Perl program consumes less than 2% of my RAM and finishes faster:

#!/usr/bin/perl 

use strict;
use warnings;

my %hash;
open my $in, "<", "FreeNode-#perlcafe.log"
    or die "Could not open - $!";
while (my $l = <$in>)
{
    chomp($l);
    foreach my $word (split (/ +/, $l))
    {
        $hash{$word}++;
    }
}
close($in);
open my $out, ">", "analysis-perl.txt"
    or die "Could not open for writing - $!";
while (my ($word, $occurs) = each(%hash))
{
    print {$out} "$occurs\t$word\n";
}
close($out);

The Lisp program using SBCL also runs adequately.

I'm running it on this file:

$ wc FreeNode-#perlcafe.log
  383760  3773752 22564515 FreeNode-#perlcafe.log

> > As much as I like Haskell, I still think the language has many inherent limitations due to its restrictive design decisions

> There are many design decisions I'd happily question - but I wouldn't call those ones limiting. Which ones are you referring to?

The very strong typing, the lazy-only evaluation, the lack of assignment. All of these are essential cornerstones of Haskell, yet result in under-performing code when put together.

Writing kernel modules in Haskell

Posted Oct 7, 2009 17:53 UTC (Wed) by TomMD (guest, #56998) [Link]

1) You are comparing different algorithms.
Your algorithms for these two programs are entirely different. Your Perl program will take one word, modify a map of counters, and then process the next word - a constant-space algorithm. This is entirely doable in Haskell, but you have decided not to. Instead, you read in the entire file, sort the words, group them, and count the resulting list lengths, requiring O(n) memory (actually more, given that you are sorting).

2) You are using linked lists of linked lists for processing
Anyone serious about time and space would be using packed strings, from the bytestring package.

The first issue seems especially obvious, making me think this is done on purpose. Please don't troll on LWN.


Copyright © 2009, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds