LWN.net Logo

Writing kernel modules in Haskell

Writing kernel modules in Haskell

Posted Sep 13, 2009 16:23 UTC (Sun) by bjacob (subscriber, #58566)
Parent article: Writing kernel modules in Haskell

While this is obviously just a fun experiment, there are real use cases for writing kernel code in a language more feature-full than C.

An example is if eventually a kernel module needs to do matrix computations. C++ or Fortran matrix libraries are much more powerful than C ones (Fortran because of native array arithmetic; C++ because templates allow to do that and much more) so one could imagine a kernel module that would be written in C++ with extern "C", for example. Actually, this will happen as soon as some kernel module needs to implement some clever math algorithm.


(Log in to post comments)

Writing kernel modules in Haskell

Posted Sep 13, 2009 16:49 UTC (Sun) by cesarb (subscriber, #6266) [Link]

It is more probable that the kernel developers will push the clever math algorithm to a userspace daemon.

Writing kernel modules in Haskell

Posted Sep 13, 2009 16:58 UTC (Sun) by bjacob (subscriber, #58566) [Link]

Sure, that makes a lot of sense. I don't have the experience though to tell if that is always possible or if sometimes stuff really must stay in-kernel. For example, what if a filesystem, or a scheduler, needs to do some statistical analysis that relies on a matrix computation, etc.

Writing kernel modules in Haskell

Posted Sep 13, 2009 21:10 UTC (Sun) by csamuel (✭ supporter ✭, #2624) [Link]

Then they'll probably do it in C for the general case and hand optimised
assembler for specific CPUs. ;-)

Writing kernel modules in Haskell

Posted Sep 13, 2009 21:13 UTC (Sun) by bjacob (subscriber, #58566) [Link]

That can sure be done -- with 10x the effort that it would take using a good C++ template library ;)

Writing kernel modules in Haskell

Posted Sep 13, 2009 21:30 UTC (Sun) by daney (subscriber, #24551) [Link]

I have worked with a Linux kernel module that was written in C++ *and* used
floating point. From a technical point of view, it is not difficult to do. The politics of merging it upstream on the other hand...

Writing kernel modules in Haskell

Posted Sep 13, 2009 22:57 UTC (Sun) by cesarb (subscriber, #6266) [Link]

Just for fun, an old (2004) thread about someone who did exactly that: http://kerneltrap.org/node/2067

Writing kernel modules in Haskell

Posted Sep 13, 2009 23:17 UTC (Sun) by bjacob (subscriber, #58566) [Link]

Interesting...

Linus's quote leave little hope that C++ code ever makes it into the kernel, indeed.

Just for the record, here's how I would counter his claims (in the link you posted):

Quote:
"The fact is, C++ compilers are not trustworthy. They were even worse in 1992, but some fundamental facts haven't changed: 1) the whole C++ exception handling thing is fundamentally broken. It's _especially_ broken for kernels. 2) any compiler or language that likes to hide things like memory allocations behind your back just isn't a good choice for a kernel. 3) you can write object-oriented code (useful for filesystems etc) in C, _without_ the crap that is C++."

1) Doesn't matter, nobody's forced to use exceptions, there are lots of C++ project who don't (KDE for example) for a variety of reasons. So exceptions being inappropriate isn't an argument against C++ in general.

2) C++ doesn't hide any memory allocation behind your back. When you replace ptr = (T*)malloc(sizeof(T)); by ptr = new T;, you haven't hidden the memory allocation!

3) meh, if Linus is happy about the object-oriented-C style that they use in the kernel, who am I to argue against that. I just have yet to hear the details of how C++ is "crap" here. It just does this repetitive work for you, in a standardized way.

***

It's funny that Linus doesn't mention what seems to me to be the most sensible reason to oppose C++ : it's far easier to organize a community of C programmers than a community of C++ programmers, to make everyone understand and use the same coding paradigms, to review patches etc... since C is such a simple language.

Writing kernel modules in Haskell

Posted Sep 13, 2009 23:36 UTC (Sun) by cesarb (subscriber, #6266) [Link]

The best email I have see so far from Linus on why he does not like C++ was in fact on the git mailing list: http://article.gmane.org/gmane.comp.version-control.git/5...

As a bonus, I just found two threads even older than the 2004 one (the link above is from 2007):

http://web.archive.org/web/20020821190433/kt.linuxcare.co... (from 2000)

http://web.archive.org/web/20020717162937/kt.linuxcare.co... (from 1999!)

Writing kernel modules in Haskell

Posted Sep 13, 2009 23:46 UTC (Sun) by bjacob (subscriber, #58566) [Link]

Ah, the 2007 link is better indeed (once you take out the Linus-isms!)

These arguments are against introducing c++ throughout the kernel, object models etc. Still, coming back to the original topic --- these arguments don't apply to the case where C++ would only be used in the back-end of a kernel module.

Writing kernel modules in Haskell

Posted Sep 14, 2009 6:26 UTC (Mon) by jgg (guest, #55211) [Link]

I've seen the 2007 mail used as a strong damning of C++ quite a few times, but honestly, I find it very strange when taken in context.

The general gist seems to be a damning of OOP principles, which I think is entirely appropriate for large swaths of git, but as usual, has very little to do with C++ in particular. The strange thing is that in the end git ended up using C for all the performance sensitive stuff and really sketchy languages like shell script for much of the hard-to-do-wrappering tasks that are painful in C..

To me the argument was never that a OOP/C++ style of programming should replace the excellent optimized inner stuff, but that a C++ style and language would have been a better, more efficient and maintainable choice than all the shell script/perl/etc.

But these days it seems a lot has been (painstakingly?) implemented in C, so no matter.

Anyhow, Linus is bang on, about most C++ programmers.. C programmers that understand and use OOP concepts tend to understand that it is just a tool and not everything needs to be done with OOP, while C++ programmers with narrow experience think everything is an OOP problem. Java programmers too. Usually makes a very long and painful mess.

To me the biggest programmer folly is to design, then execute a huge slogging painful pointless mess. You see this a lot with inexperienced programmers. Some languages (like C and C++) seem to make it a lot worse. There is ALOT of slogging to do in C if you want to handle errors sensibly.

Look at something like *swan or trousers for great examples of this in action. I had to write an IKEv2 daemon for commercial/embedded/special case/blah, and it turned out to be 8,000 LOC of C++. Does about 80% of what strongswan does and maybe another 5% that strongswan doesn't (the 5% was why this was done, no hope of fitting the 5% into strongswan without being a frickin expert at that mess). Strongswan is 140,000 lines of C. I kid you not, the difference is that great.

Did C++ cause this LOC reduction? Not really, just a smart design, a lot of care and an eye toward eliminating repetition.

Writing kernel modules in Haskell

Posted Sep 14, 2009 14:00 UTC (Mon) by marcH (subscriber, #57642) [Link]

> Anyhow, Linus is bang on, about most C++ programmers.. C programmers that understand and use OOP concepts tend to understand that it is just a tool and not everything needs to be done with OOP, while C++ programmers with narrow experience think everything is an OOP problem. Java programmers too. Usually makes a very long and painful mess.

Aaahhh, OOP... "Dude, not everything is an object."

http://steve-yegge.blogspot.com/2006/03/execution-in-king...

Writing kernel modules in Haskell

Posted Sep 14, 2009 0:14 UTC (Mon) by cesarb (subscriber, #6266) [Link]

Just a note in your second point: yes, C++ can hide memory allocation behind your back. Even a seemingly simple statement like "a = b + c;" can allocate memory in C++, while in C the only way that statement will allocate memory is if the compiler has to spill registers, and even then the allocation will be small and on the stack. In C++, the same "a = b + c;" statement could read from the disk, send a network packet, and draw a smiley face on your screen.

Of course, that can only happen if you create a class elsewhere which does memory allocation (and the other evil things) on "operator+" (and if you have not decided to forbid operators in your classes). But I think Linus's point is that, once you allow the language, people will want to use the features of the language, and before long you lose the ability to tell at a glance the costs and side effects of even simple operations like "a = b" (which in C can only mean the equivalent of a simple "MOV").

In fact, if you look at it this way, the fact that C is less expressive than C++ can be a feature. Having to write things like "obj->ops->fn(obj, foo)" instead of "obj->fn(foo)" makes explicit the extra costs (in this case, an additional memory read for the vtable and the extra "hidden" parameter).

Writing kernel modules in Haskell

Posted Sep 14, 2009 0:32 UTC (Mon) by bjacob (subscriber, #58566) [Link]

All these arguments are indeed defendable --- but again, they are arguments against using C++ throughout the kernel, and hardly constitute an argument against using C++ internally in the back-end of a kernel module.

That said, if we discuss why C++ isn't used throughout the kernel --- I understand your point. One remark though: about your first paragraph, one might turn the argument upside down and say that C is evil because a function named print_hello() may actually do a lot of malloc's and send a network packet too ;) The difference between C++ and C here is that functions in C++ may appear is more various forms than in C (operators, constructors, etc). It's really a matter of knowing what to expect: an experienced C++ programmer will understand spontaneously when an operator+ is likely to have to allocate memory for the result. Which takes us back to what is really Linus' best argument against C++ throught the kernel, and to your last paragraph... in other words, we agree.

Writing kernel modules in Haskell

Posted Sep 14, 2009 6:13 UTC (Mon) by drag (subscriber, #31333) [Link]

Well a module is still just a hunk of the kernel and for best results, long-
term, it should be every modules goal to get integrated into the kernel
proper. Therefore its counter productive to use any language other than the C
version that is acceptable by the kernel folks.

Now the haskell thing is a interesting experiment and as such it does not
make sense to be constrained by conventions.. But it may make some change to
the status quo if something is learned from it. So it's neat.

Writing kernel modules in Haskell

Posted Sep 14, 2009 15:17 UTC (Mon) by bfields (subscriber, #19510) [Link]

they are arguments against using C++ throughout the kernel, and hardly constitute an argument against using C++ internally in the back-end of a kernel module.

I don't see how you're using that distinction.

The bulk of the kernel is drivers. If drivers were written in a variety of languages, that would make the drivers as a whole more difficult to maintain.

Writing kernel modules in Haskell

Posted Sep 14, 2009 7:36 UTC (Mon) by alankila (subscriber, #47141) [Link]

Quibble: a = b can be a struct copy. It will turn into a memcpy, if I recall correctly.

Writing kernel modules in Haskell

Posted Sep 14, 2009 13:20 UTC (Mon) by cesarb (subscriber, #6266) [Link]

Oops, you are right, I forgot about that trick (which I have already used several times myself). Let me rephrase then: "a = b" in C can only mean the equivalent of a simple MOV for each byte of the variable (that is, it will read from each byte of "b" once, and write to each byte of "a" once; of course this will be optimized to read/write several bytes each time). The performance characteristics are still quite transparent.

Writing kernel modules in Haskell

Posted Sep 14, 2009 21:27 UTC (Mon) by alankila (subscriber, #47141) [Link]

Yes. I fully agree with your basic argument.

Since Java is sort of C++'s successor, perhaps I can be justified to drag it into this discussion. There is something fairly nice about its way to have essentially two languages in it: a simple unextendable arithmetic language for doing underlying operations with -- no confusion possible about what any of the expressions mean and it's analogous to C -- and an object language which is used for extending the core language. It has plenty of the good parts of C++, I guess, but I would agree that it's far from perfect.

However, the discussion we are having now can not occur in Java. In fact, since defining classes is the only possible way you can extend Java in any sense of word, you are even encouraged to not know or care what happens behind a method. It's an unusual dichotomy, but I must agree that it works. It's sort of brilliant, even.

(I'm watching myself turn into a Java enthusiast. Oh dear.)

Writing kernel modules in Haskell

Posted Sep 16, 2009 2:37 UTC (Wed) by ringerc (subscriber, #3071) [Link]

(I'm watching myself turn into a Java enthusiast. Oh dear.)

I'm suffering the same pain. Built-in GUI toolkit (even if it's pretty dire), sane and complete libraries, native Unicode string class (do not mention std::wstring in my presence), no build system pain, solid built-in threading and concurrency ... I'm getting to like it.

I really miss RAII, though. It's not really possible in Java as there's no way to ensure a dtor fires reliably when an object exists scope, as with stack-based instances in C++. It's a real shame and a pain point for me, as working with program flow when exceptions are involved is excruciating without RAII. If I'm missing something and there's a reasonable way to do or approximate RAII in Java, please let me know...

Writing kernel modules in Haskell

Posted Sep 17, 2009 6:10 UTC (Thu) by pynm0001 (guest, #18379) [Link]

Just like I wouldn't write C without one of glib or apr, I wouldn't write
C++ without Qt. Just as an aside, Qt has a built-in GUI toolkit (not dire,
btw), sane and complete libraries (although not as comprehensive as Java
I'd imagine), a native Unicode string class, minimal build system pain
(usually as easy as qmake -project && qmake && make), and solid built-in
threading and concurrency.

And you get RAII.

Writing kernel modules in Haskell

Posted Sep 15, 2009 9:42 UTC (Tue) by etienne_lorrain@yahoo.fr (guest, #38022) [Link]

> of course this will be optimized to read/write several bytes each time

Do not be so sure of this "of course", for instance GCC:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22141
and last time I tried Clang (LLVM) was worse (mov $4,ecx; rep mov stosb).

I have to say that older compiler version were copying fields of structure more efficiently, because they were not optimised to compile
C++ code which may trigger constructor for each fields of the structure.

Writing kernel modules in Haskell

Posted Sep 18, 2009 1:25 UTC (Fri) by kbob (guest, #1770) [Link]

Even a seemingly simple statement like "a = b + c;" can allocate memory in C++, while in C the only way that statement will allocate memory is if the compiler has to spill registers
#define a (buf[read(open("/dev/hda", 0), buf, sizeof buf) - 1])
#define b (send(some_sock, buf, sizeof buf, 0))
#define c (system("xloadimage /usr/share/cups/doc-root/images/smiley.jpg &"))
You were saying?

Writing kernel modules in Haskell

Posted Sep 18, 2009 4:04 UTC (Fri) by njs (subscriber, #40338) [Link]

Don't be ridiculous, the kernel wouldn't hide magic in macro calls.

Writing kernel modules in Haskell

Posted Sep 29, 2009 16:14 UTC (Tue) by Auders (subscriber, #53318) [Link]

Of course, no one would do anything like that
http://lwn.net/Articles/308520/
:)

Writing kernel modules in Haskell

Posted Sep 14, 2009 0:17 UTC (Mon) by Ed_L. (guest, #24287) [Link]

When you replace ptr = (T*)malloc(sizeof(T)); by ptr = new T;, you haven't hidden the memory allocation!
evaluates "true" only when ptr = new T; is in fact replaceable by ptr = (T*)malloc(sizeof(T)); Are you certain you want to debate this one? ;)

C++, the steampunk language

Posted Sep 14, 2009 7:24 UTC (Mon) by felixfix (subscriber, #242) [Link]

My experience with C++ is summed up by having replaced the simple malloc/free paradigm with THREE different ways to allocate : new/delete, new[]/delete[], and the original. Heaven help you if you mix them up. (I may have details wrong here; it's been a good long time since I had to deal with that muck.)

That is no way to design anything. I remember being disgusted with quite a few other things which I have gratefully mostly forgotten, Virtual should have been the default, I think, some pretense at giving you another knob to fine tune efficiency, You can define methods with a ridiculous number of keywords, more pretentious knobs. There are a zillion things you can do to hide the complexity of your class, to the point that there is no way to tell which operations are expensive, and debugging is a nightmare of chasing from one source file to another to see the secret details of classes to find out what little pitfalls were thrown in simply because they could be. People like to call Perl line noise, but C++ makes Perl look like a nice kindergarten language in the ways you can write write-only code, and every fool CS major thinks he is cool for doing so.

It reminds me of steampunk after it got past the initial fun stage: full of pointless knobs stuck on in an attempt to look like there's a reason behind them, when all they do is ... nothing except add clutter to make a fashion statement.

C++, the steampunk language

Posted Sep 14, 2009 8:11 UTC (Mon) by alankila (subscriber, #47141) [Link]

C++ is, sadly, an expert's language. For good or bad, all that crap is there to squeeze out every last performance bit available -- if you know how to use them and judge them worth the effort. C++ had to beat C on its own turf, and thus every feature of C++ was compromised (or made completely optional) to ensure that C++ could compile at least as well as C.

From design point of view, C++ seems to be quite unpleasant to program with, so I agree with you on that one. I think I feel some kind of a personal laundry list banging the backside of my skull wanting to come out, but there's no point to it -- C++ is a lost cause, and as long as backwards compatibility is required there is not much improvement on the horizon.

C++, the steampunk language

Posted Sep 14, 2009 16:17 UTC (Mon) by tjc (guest, #137) [Link]

My experience with C++ is summed up by having replaced the simple malloc/free paradigm with THREE different ways to allocate : new/delete, new[]/delete[], and the original.
Since you bring it up, do you happen to know why C++ requires a separate form (new[] v. new) for allocating and deallocating arrays?

I'm aware that arrays are converted to pointers in parameter declarations and (most) expressions, but I don't think that's an issue here. There should be enough information in the symbol table to identify an object as an array and [de]allocate memory for it without resorting to a special form.

C++, the steampunk language

Posted Sep 14, 2009 16:21 UTC (Mon) by avik (subscriber, #704) [Link]

delete[] needs to call the destructors of every element of the array, so it needs to know it is destroying an array.

C++, the steampunk language

Posted Sep 14, 2009 16:30 UTC (Mon) by nye (guest, #51576) [Link]

Note that there is a bit more to it than that, otherwise it would be okay to use delete on an array of POD types, which it isn't (see my other post). That's undoubtedly the root reason though.

C++, the steampunk language

Posted Sep 14, 2009 16:26 UTC (Mon) by nye (guest, #51576) [Link]

The first response to http://stackoverflow.com/questions/659270/why-is-there-a-... gives a reasonable answer.

C++, the steampunk language

Posted Sep 14, 2009 17:43 UTC (Mon) by tjc (guest, #137) [Link]

Thanks for the link. There is some other useful information on that page; see the post by Andrew Grant.

IIUC the size of an object could be stored in the symbol table, but that would only work with objects within lexical scope-- a dynamically allocated object returned from a library function would still be a problem, since there is no portable way to attach size information to an object in C.

C++, the steampunk language

Posted Sep 14, 2009 17:37 UTC (Mon) by jgg (guest, #55211) [Link]

> My experience with C++ is summed up by having replaced the simple malloc/free paradigm with THREE different ways to allocate : new/delete, new[]/delete[], and the original. Heaven help you if you mix them up. (I may have details wrong here; it's been a good long time since I had to deal with that muck.)

And C has an infinite variety of alloc/free functions..

Foo *foo = calloc(1,sizeof(Foo));
Foo *foo = LIBRARY_foo_Alloc();
Foo foo; LIBRARY_foo_Init(&foo);
Foo **foo = malloc(n*sizeof(Foo *)); for (int i = 0; i != n; i++) foo[i] = LIBRARY_foo_Alloc();

It is ever so much fun in mixing and matching them! Heaven help you if you use the incorrect free for the alloc that was chosen!!

new/new [] is actually a substantial increase in API uniformity compared to C and the multitude of libraries.

C++, the steampunk language

Posted Sep 14, 2009 18:28 UTC (Mon) by felixfix (subscriber, #242) [Link]

Why are you dragging in library routines? They have nothing to do with core C.

C++, the steampunk language

Posted Sep 14, 2009 20:06 UTC (Mon) by jgg (guest, #55211) [Link]

> Why are you dragging in library routines? They have nothing to do with core C.

Wow, that is just so pedantic. You do realize that the ISO C standard includes a library and that library includes functions like I described (fopen/fclose for instance). SUS adds more. It isn't like there is another way to do this ..

C++, the steampunk language

Posted Sep 14, 2009 19:46 UTC (Mon) by bronson (subscriber, #4806) [Link]

Er, by definition C++ has all of C's "infinite variety" of alloc/free functions. Plus it adds more of its own (it's great fun watching people incorrectly override delete[]... from a distance).

If you say, "well, don't use X in C++" then I'll just say "don't use X in C."

C++, the steampunk language

Posted Sep 15, 2009 14:33 UTC (Tue) by nix (subscriber, #2304) [Link]

Yes, but in C++ code you don't use that 'infinite variety': you wrap whateveritis up in an object and call the appropriate C allocation/freeing functions *once*, from the constructor and destructor of that object. Then you can consistently use the new/delete family of functions *everywhere* else. So the chaos of allocation and freeing functions that can easily be mixed up transforms into a single allocation/freeing function for all objects and a single one for all arrays of objects (which are sort of deprecated now 'cos the STL types are just as good). And it becomes impossible to use the wrong allocation and freeing functions anywhere, and memory leaks become harder to accidentally introduce as well (as the majority of object allocations are on the stack, so construction and destruction are automatic).

(The same applies to malloc()/free(): they shouldn't be used in well-written C++ programs, at all. They're not typesafe.)

Writing kernel modules in Haskell

Posted Sep 17, 2009 6:17 UTC (Thu) by pynm0001 (guest, #18379) [Link]

> > When you replace ptr = (T*)malloc(sizeof(T)); by ptr = new T;, you
> > haven't hidden the memory allocation!

> evaluates "true" only when ptr = new T; is in fact replaceable by ptr =
> (T*)malloc(sizeof(T)); Are you certain you want to debate this one? ;)

You mean as opposed to the more generic C:

ptr = (T*)malloc(sizeof(T));
foo_type_initialize_default(ptr);

that a C++ "ptr = new T;" is capable of expressing? You weren't going to
use that ptr uninitialized were you? This is 2009, there's no reason to
be using uninitialized structs!

Writing kernel modules in Haskell

Posted Sep 15, 2009 0:24 UTC (Tue) by jengelh (subscriber, #33263) [Link]

>I have worked with a Linux kernel module that was written in C++ *and* used
floating point.

VMware WS has, or used to have, some C++ (IIRC, only for some little templates, there was not much too much visible C++ classes, operator-overloads or otherwise)

The latter is not particularly hard ever since the FPU state is properly saved and restored.

Writing kernel modules in Haskell

Posted Sep 13, 2009 21:31 UTC (Sun) by Randakar (guest, #27808) [Link]

10x the effort might still cheap enough for some people.

A factor 10 effort multiplication on some minor addition to an already working system might easily be dwarfed by a) how much effort is already expended daily on regular maintenance and b) is saved further down the chain because a million distributers and kernel devs don't need to understand how to get Haskell going for their kernel build when that module is written in C ;-)

Writing kernel modules in Haskell

Posted Sep 14, 2009 9:01 UTC (Mon) by dgm (subscriber, #49227) [Link]

First, the 10x figure is bogus at best.
Second, 1/10th the development time, at the cost of losing control would be OK for a one off number crunching app, but it's not an option for an OS Kernel.
Finally, if you care so much about rapid development, then maybe C++ is not your best option. Go with Python (I have read there are wonderful math libraries) or something like that.
That coming from somebody that has been writing C++ code for the last 14 years.
My biggest gripe with C++ is that it's like a ubercomplex laser scalpel: great in the right hands, deadly in the the wrong ones. It's a great language for top-down communication (for example, writing libraries), but sucks for interchange of code among peers.

Writing kernel modules in Haskell

Posted Sep 14, 2009 9:22 UTC (Mon) by sagrailo (guest, #48054) [Link]

An example is if eventually a kernel module needs to do matrix computations. C++ or Fortran matrix libraries are much more powerful than C ones (Fortran because of native array arithmetic; C++ because templates allow to do that and much more)...

If you need vector/matrix computation, you need to use BLAS/LAPACK/etc., period; no one should try to re-implement that kind of code on his own. As for general numerical calculations, I'd agree with your statement that so far Fortran compilers were able to generate much better code than C compilers, but I certainly disagree that C++ template numerical libraries are better than C code - I worked with number of them, and these are all crap, regarding performance, readability etc.; this stuff is good only if you're into writing some papers on how smart and wonderful your usage of templates is.

Writing kernel modules in Haskell

Posted Sep 14, 2009 21:23 UTC (Mon) by gmaxwell (subscriber, #30048) [Link]

"If you need vector/matrix computation, you need to use BLAS/LAPACK/etc., period;"

Any kind? Er, so the kernel should be including megabytes of unswappable finite field matrix algebra library code just to perform raid-6? I think notÂ…

General tools are nice for general things. The kernel is not, almost by definition, a general thing.

Writing kernel modules in Haskell

Posted Sep 15, 2009 10:24 UTC (Tue) by sagrailo (guest, #48054) [Link]

Have you ever actually looked into the layout of a BLAS implementation? Netlib BLAS consists of hundreds of small files, mostly one file per numerical method, and it's easy to pickup needed files and put them in your code. On the other side, Netlib BLAS is admittedly just a reference implementation, not fast at all (and also actually requiring Fortran 77 compiler to build), and things get more complicated with optimized BLAS implementation. For example, with ATLAS, these files are generated, so either you'd have to generate them up-front for all needed processors, or to include the whole infrastructure for auto-generating these into your code (lots of code there, indeed, but if you want to do it right, you'd have actually to write all of this it yourself). As for vendor libraries (like MKL etc.), you may be in better luck with negotiating with vendors to provide you with some kind of license to use needed routines. But in any case you're better with using an existing solution, than to reimplementing it yourself (especially as numerical codes are notoriously hard to get right, which is why it makes even more sense to stick to re-using existing code).

Writing kernel modules in Haskell

Posted Sep 14, 2009 22:50 UTC (Mon) by Ed_L. (guest, #24287) [Link]

I'm sure you have good foundation for your opinions. But my experience, specifically with the Template Numeric Toolkit, is decidedly different. In one application where I implemented Lapack and TNT equivalents side by side, the TNT code was roughly twice as fast as Lapack. This was on CentOS, GCC 4.1.2. I don't recall whether I was linking the system-supplied Lapack or one I custom compiled, and I don't know if the difference was attributable to optimization flag differences or not. FWIW, library code is typically optimized -O2, my TNT segment was -O3 and there were several adjacent function call candidates for successive inline optimization. Perhaps a different example would skew a different direction.

Call me a crappy coder, but I have been singularly unsuccessful using multidimensional C++ arrays without resort to template wrappers, for which there is discernible run-time penalty for only the smallest of arrays. (Numerical Recipes style solutions also work.)

Writing kernel modules in Haskell

Posted Sep 15, 2009 10:35 UTC (Tue) by sagrailo (guest, #48054) [Link]

Yeah, right... I just quickly wrote this code (to multiply two matrices):

#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <ctime>
#include <iostream>

extern "C" {
#include <cblas.h>
}

#include <tnt.h>

#define N 1024

int
main(void)
{
    float* A = new float[N * N];
    float* B = new float[N * N];
    float* C = new float[N * N];
    std::fill(C, C + N * N, 0);

    TNT::Array2D<float< ATNT(N, N);
    TNT::Array2D<float< BTNT(N, N);
    TNT::Array2D<float< CTNT(N, N);

    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j) {
            A[i * N + j] = ATNT[i][j] = 1;
            B[i * N + j] = BTNT[i][j] = 1;
        }

    clock_t start, end;

    start = clock();
    cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, N, N, N, 1, A, N, B,
                N, 0, C, N);
    end = clock();
    std::cerr
        << "ATLAS: "
        << (float) (end - start) / CLOCKS_PER_SEC
        << "s"
        << std::endl;

    start = clock();
    CTNT = TNT::matmult(ATNT, BTNT);
    end = clock();
    std::cerr
        << "TNT:   "
        << (float) (end - start) / CLOCKS_PER_SEC
        << "s"
        << std::endl;

    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j) {
            assert(C[i * N + j] == N);
	    assert(CTNT[i][j] == N);
        }

    delete [] A;
    delete [] B;
    delete [] C;
}

Complied with (I've put TNT into /tmp, and I have ATLAS installed in /usr/local):

g++ -O3 -o foo -I/tmp/tnt foo.cpp -lcblas -latlas

The output, on my old ThinkPad T61p:

ATLAS: 0.22s
TNT:   17.64s
So - C++ almost two orders of magnitude slower (and that's typical, in my experience).

Writing kernel modules in Haskell

Posted Sep 15, 2009 12:01 UTC (Tue) by jbh (subscriber, #494) [Link]

Interesting! I've wanted to look into Eigen as well, which is another
template linear algebra library (http://eigen.tuxfamily.org). It seems to
hold up pretty well against Atlas (on Thinkpad T60, compiled with -msse2
flag):

ATLAS: 0.43s
TNT: 32.25s
Eigen: 0.76s

The major additions to your test program are (in different places):

#include <Eigen/Core>
--
MatrixXf Aeig(N,N), Beig(N,N), Ceig(N,N);
Aeig.setOnes(); // or Aeig = MatrixXf::Ones(N,N);
Beig.setOnes();
Ceig.setZero();
--
Ceig = Aeig * Beig;
--
assert(Ceig(i,j) == N);

Writing kernel modules in Haskell

Posted Sep 15, 2009 12:50 UTC (Tue) by bjacob (subscriber, #58566) [Link]

Hm, small world, I'm actually one of the main developers of Eigen. I didn't want to make a plug, but now that you've mentioned it...

I'm surprised by your result, because in our experience, we are consistently faster than ATLAS for matrix products. Already with Eigen 2.0, and even more so in the development branch. In fact, on single-CPU systems, we have almost the performance of vendor-tuned BLAS.

development branch:
http://eigen.tuxfamily.org/index.php?title=Benchmark

2.0:
http://eigen.tuxfamily.org/index.php?title=Benchmark-Augu...

It's good that you passed -msse2, indeed.

Some things to consider:

* did you allow ATLAS to use more than one thread? That would probably not be applicable in the setting of a kernel module! (As far as I understand) so perhaps it's more fair to require ATLAS to use only one thread.

* did you tune ATLAS for your hardware, especially CPU cache size? You can do the same for Eigen, then. For example with the development version it's
EIGEN_TUNE_FOR_CPU_CACHE_SIZE, see in the file Macros.h.

Writing kernel modules in Haskell

Posted Sep 15, 2009 13:09 UTC (Tue) by jbh (subscriber, #494) [Link]

In short, no tuning at all, neither ATLAS or Eigen. Standard ubuntu (jaunty)
packages for everything. This was on my laptop, and I only bother to tune
for production runs; those are not on my laptop :)

Eigen already looks very impressive, for a relatively young project in a
problem space that's dominated by dinosaurs --- I've meant to look into it
for a while, but not enough time and all that, and last time I checked it
didn't have sparse matrix support (I see it does now, although
experimental).

Writing kernel modules in Haskell

Posted Sep 15, 2009 13:29 UTC (Tue) by bjacob (subscriber, #58566) [Link]

OK, my best guess then is that ATLAS by default uses more than one thread...
anyway thanks for the kind words. Yes, there are a lot of dinosaurs in this area and we're sort of the small mammals in comparison.

(If it wasn't obvious, the largest advantage of a c++ template library like Eigen over a large C or Fortran binary library like a BLAS, is that it spontaneously generates only the code that's needed at build-time, and after that you can forget about it, so your kernel module is only a few dozen kilobytes self-contained; by comparison vendor-tuned BLAS take dozens of megabytes).

Writing kernel modules in Haskell

Posted Sep 29, 2009 23:14 UTC (Tue) by Auders (subscriber, #53318) [Link]

I did the same test with standard jaunty packages and the same compilation parameters and for me Eigen outperformed ATLAS, as you expected:
ATLAS: 0.69s
TNT: 21.93s
Eigen: 0.56s

Matrix Libraries

Posted Sep 16, 2009 20:50 UTC (Wed) by cry_regarder (subscriber, #50545) [Link]

Of course ublas has bindings for atlas so you can use this approach:

...
#include <boost/numeric/ublas/matrix.hpp>
#include <boost/numeric/bindings/atlas/cblas.hpp>
#include <boost/numeric/bindings/traits/ublas_matrix.hpp>
...
namespace ublas = boost::numeric::ublas;
namespace atlas = boost::numeric::bindings::atlas;
...
atlas::gemm(CblasNoTrans, CblasNoTrans, 1.0,
Aublas, Bublas, 0.0, Cublas);
...

and this compile:
g++ -O3 -o mat mat.c++ -I/tmp/boost-numeric-bindings -L /usr/lib64/atlas -latlas -lcblas -DNDEBUG

to get run times like:

ATLAS: 0.16s
uBLAS: 0.16s

...

Cry

Copyright © 2012, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds