Interview: Eigen Developers on 2.0 Release (KDEDot)

Recently Eigen 2.0 was released. You might already have heard of Eigen; it is a small but very high-performance maths library which has its roots in KDE. Below, the two core developers are interviewed about it.

vs. VSIPL++?
Posted Feb 18, 2009 3:20 UTC (Wed) by ncm (guest, #165) [Link] (12 responses)

I would like to see comparisons to VSIPL++, a library with similar structure, range, and goals.

One difference I note is that VSIPL++ is dual-licensed, GPL2 and "proprietary". Another is that VSIPL++ automatically parallelizes matrix operations to as many cores as are available. A third is that it may be compiled to use (e.g.) Intel's MKL underneath.

(Disclosure: I worked on an early version of VSIPL++.)

vs. VSIPL++?
Posted Feb 18, 2009 6:04 UTC (Wed) by csamuel (✭ supporter ✭, #2624) [Link] (9 responses)

That sort of automatic parallelisation can be a real problem when users share multiple cores on our systems. Take our current dual quad-core systems: if two users end up running 4-CPU jobs on the same node you don't want them to both try and use all 8 cores.

Fortunately our queuing system (Torque) puts the jobs into a CPU set, so all they'll affect are themselves; but in general it's probably not what you want.

vs. VSIPL++?
Posted Feb 18, 2009 8:12 UTC (Wed) by ncm (guest, #165) [Link] (1 responses)

The degree of parallelism is configurable at run time, much like the OMP_NUM_THREADS variable for OpenMP applications.

vs. VSIPL++?
Posted Feb 18, 2009 10:35 UTC (Wed) by csamuel (✭ supporter ✭, #2624) [Link]

Thanks for the clarification!

vs. VSIPL++?
Posted Feb 18, 2009 9:55 UTC (Wed) by epa (subscriber, #39769) [Link] (1 responses)

> Take our current dual quad-core systems, if two users end up running 4 CPU jobs on the same node you don't want them to both try and use all 8 cores.

The best policy might be for each job to use all 8 cores, but run the two jobs in sequence. Then the first user gets his results after an hour and the second user after another hour, rather than making them both wait two hours as would happen if running in parallel with 4 cores each. (Assuming perfect scalability from 4 to 8 cores.)

vs. VSIPL++?
Posted Feb 18, 2009 10:36 UTC (Wed) by csamuel (✭ supporter ✭, #2624) [Link]

That doesn't really work for us: if users have asked for 4 cores they are likely not to be expecting any more than that.

vs. VSIPL++?
Posted Feb 18, 2009 11:41 UTC (Wed) by endecotp (guest, #36428) [Link] (4 responses)

I think this is the right thing to do for single-user systems. When you run something with a CPU set [you're talking about the cgroups feature, right?], does sysconf(_SC_NPROCESSORS_ONLN) give a different answer? I suspect not. Maybe it should. Or is there some other API that a process can use to determine how many processors are available to it, taking into account the cgroups stuff?

We don't want to end up with each library or application using its own environment variable that the user is expected to set.
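
On Linux there is such an API, though it is easy to miss: cpuset placement narrows a task's affinity mask, so sched_getaffinity() reports the CPUs a process may actually use, while sysconf(_SC_NPROCESSORS_ONLN) keeps reporting every online CPU in the machine. A minimal sketch (my illustration, not code from the thread):

#ifndef _GNU_SOURCE
#define _GNU_SOURCE // for sched_getaffinity() and CPU_COUNT()
#endif
#include <sched.h>
#include <unistd.h>
#include <cstdio>

int main() {
    cpu_set_t mask;
    CPU_ZERO(&mask);
    // The affinity mask is narrowed by cpusets, so this count is what
    // the scheduler will actually let this process use.
    if (sched_getaffinity(0, sizeof(mask), &mask) == 0)
        std::printf("CPUs available to this process: %d\n", CPU_COUNT(&mask));
    // ...whereas this is simply every online CPU in the machine.
    std::printf("CPUs online in the system: %ld\n",
                sysconf(_SC_NPROCESSORS_ONLN));
    return 0;
}

A library could use the affinity count as its default degree of parallelism and still honour an explicit override, avoiding yet another per-library environment variable.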

vs. VSIPL++?
Posted Feb 18, 2009 13:04 UTC (Wed) by csamuel (✭ supporter ✭, #2624) [Link] (3 responses)

A private environment variable is fine if it's only your own code and you know what you're doing, but if it is ever going to be given to anyone else or if you yourself are ever going to want to run it on a shared system I'd suggest rethinking it.

If you're writing multi-threaded HPC codes then you should look at using MPI (which lets you span multiple nodes and not just be tied to the cores on a single SMP system) or use an existing SMP framework like OpenMP. Both MPI and OpenMP are open standards with multiple implementations (and it is OpenMP that implements the OMP_NUM_THREADS limits for you).

We prefer people to use MPI as the explicit parallelisation you need to do seems to make people produce more efficient code - we've certainly seen commercial codes that come as both MPI and SMP variants where the MPI version scales better on an SMP system than the SMP version!

We're using the cpusets virtual filesystem (which is part of cgroups now, but you can't mount both views at the same time) and I don't believe your code will see a different number of cores available; it'll just try and run all its threads on whatever cores have been allocated. It would be nice if it could figure it out automatically!
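
To make the OpenMP side concrete, a minimal sketch of my own (built with e.g. g++ -fopenmp), showing that the limit comes from the standard OMP_NUM_THREADS variable rather than anything library-specific:

#include <omp.h>
#include <cstdio>

int main() {
    // Defaults to the value of OMP_NUM_THREADS if it is set.
    std::printf("max threads: %d\n", omp_get_max_threads());

    #pragma omp parallel
    std::printf("hello from thread %d of %d\n",
                omp_get_thread_num(), omp_get_num_threads());
    return 0;
}

Running the same binary as OMP_NUM_THREADS=4 ./a.out caps it at 4 threads, which is exactly the per-job control a queuing system can apply.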

vs. VSIPL++?
Posted Feb 18, 2009 13:59 UTC (Wed) by endecotp (guest, #36428) [Link]

vs. VSIPL++?
Posted Feb 19, 2009 1:30 UTC (Thu) by ncm (guest, #165) [Link] (1 responses)

MPI strikes me as markedly better in every way. The experience of VSIPL++ demonstrates that practically nothing of the "explicit parallelization" that csamuel mentions need leak out to the library API, at least for a matrix operations library. I would guess that the performance advantage mentioned is a consequence of better cache behavior.
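
For contrast, here is a bare-bones sketch of my own (not from the comment) of the explicit decomposition being discussed: with raw MPI, each rank handles its own share of the work and the combining step is spelled out in user code, all of which a library could hide behind its API.

#include <mpi.h>
#include <cstdio>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank = 0, size = 1;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    // Each rank computes its own piece; combining the results is an
    // explicit collective operation in user code.
    double local = rank + 1.0, total = 0.0;
    MPI_Allreduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

    if (rank == 0)
        std::printf("sum over %d ranks: %g\n", size, total);
    MPI_Finalize();
    return 0;
}

Run with e.g. mpirun -np 4 ./a.out; only standard MPI calls are used.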

vs. VSIPL++?
Posted Feb 21, 2009 20:30 UTC (Sat) by jedbrown (subscriber, #49919) [Link]

vs. VSIPL++?
Posted Feb 18, 2009 9:10 UTC (Wed) by halla (subscriber, #14185) [Link] (1 responses)

I don't know VSIPL++ well enough to make a real comparison.

Just from judging the website, I'd say that VSIPL++ focuses on signal processing, while Eigen2 is a versatile library for linear algebra: vectors, matrices, and related algorithms.

And, as you say, VSIPL++ is a commercial, dual-licensed product, while Eigen2 is available under both GPL and LGPLv3 (to avoid all header-include licensing rannygazoo).

For me, as a free software developer, there is an even bigger difference: the VSIPL++ people have never given me patches to make my application (Krita) use VSIPL++, while the Eigen people have done just that.

The Eigen manual is rather more extensive and gives useful examples; the VSIPL++ tutorial has nice examples for high-level things like convolution. The Eigen API seems a lot nicer to me, but that may be because I am used to it, or because even I, who hasn't had maths at school to speak of, can understand what the expressions mean. Sometimes.

But as for performance and functionality... that's something you are much better placed to compare. And I'm sure the Eigen people would like to see it, too :-)

vs. VSIPL++?
Posted Feb 19, 2009 1:56 UTC (Thu) by ncm (guest, #165) [Link]

Without looking, I would guess that Eigen's API is probably nicer than VSIPL++'s. Besides history, the latter suffers from committee design. A fork to replace the API while retaining the underlying implementation (allowing plug-in parallelization libraries and assembly-language cores) could make it nicer to use.

It ought to be pretty easy to port Krita over to use VSIPL++, and benchmarks would be instructive. It's a heady thing to see a program pick up and use extra CPUs without changing source or even recompiling.

Eigen vs. uBLAS
Posted Feb 18, 2009 17:11 UTC (Wed) by cry_regarder (subscriber, #50545) [Link] (3 responses)

How does Eigen compare to Boost's uBLAS?

http://www.boost.org/doc/libs/1_38_0/libs/numeric/ublas/d...

"uBLAS is a C++ template class library that provides BLAS level 1, 2, 3 functionality for dense, packed and sparse matrices. The design and implementation unify mathematical notation via operator overloading and efficient code generation via expression templates."

Thanks,
Cry
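
To make the quoted design concrete, a small usage sketch of my own (not from the comment): the arithmetic on the right-hand side builds an expression template that is only evaluated when assigned.

#include <boost/numeric/ublas/matrix.hpp>
#include <boost/numeric/ublas/vector.hpp>
#include <boost/numeric/ublas/io.hpp>
#include <iostream>

int main() {
    namespace ublas = boost::numeric::ublas;
    ublas::matrix<double> A(3, 3);
    ublas::vector<double> x(3);
    for (unsigned i = 0; i < 3; ++i) {
        x(i) = i + 1;
        for (unsigned j = 0; j < 3; ++j)
            A(i, j) = 3.0 * i + j;
    }
    // Mathematical notation via operator overloading; the expression
    // is evaluated lazily, without temporaries for intermediates.
    ublas::vector<double> y = ublas::prod(A, x) + 2.0 * x;
    std::cout << y << std::endl;
    return 0;
}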

Eigen vs. uBLAS
Posted Feb 18, 2009 17:32 UTC (Wed) by bo (guest, #56215) [Link] (2 responses)

There is a benchmark on the Eigen website that includes (among others) uBlas.

It may be found here:

http://eigen.tuxfamily.org/index.php?title=Benchmark

Eigen vs. uBLAS
Posted Feb 18, 2009 18:11 UTC (Wed) by cry_regarder (subscriber, #50545) [Link] (1 responses)

Interesting chart. There are MKL <-> uBLAS bindings, so one should be able to get the MKL performance also.

Cry

Eigen vs. uBLAS
Posted Feb 18, 2009 19:52 UTC (Wed) by halla (subscriber, #14185) [Link]

From the article, when asked to explain what is different about Eigen:

"Generality: we need many different kinds of matrices: fixed-size, dynamic-size dense, sparse. For example, BLAS and LAPACK handle only dynamic-size dense matrices. Even MKL and vecLib have only limited support for fixed-size matrices."
"Generality: we need many different kinds of matrices: fixed-size,
I would like to see comparisons to VSIPL++, a library with similar structure, range, and goals.
vs. VSIPL++?
vs. VSIPL++?
multiple cores on our systems.
jobs on the same node you don't want them to both try and use all 8 cores.
Fortunately our queuing system (Torque) puts the jobs into a CPU set so all
they'll affect are themselves, but in general it's probably not what you
want..
vs. VSIPL++?
vs. VSIPL++?
variable for OpenMP applications.
vs. VSIPL++?
Take our current dual quad-core systems, if two users end up running 4 CPU
jobs on the same node you don't want them to both try and use all 8 cores.
The best policy might be for each job to use all 8 cores, but run the two jobs in sequence. Then the first user gets his results after an hour and the second user after another hour, rather than making them both wait two hours as would happen if running in parallel with 4 cores each. (Assuming perfect scalability from 4 to 8 cores.)
vs. VSIPL++?
for 4 cores they are likely not be expecting any more than that.
vs. VSIPL++?
vs. VSIPL++?
you're doing, but if it is ever going to be given to anyone else or if you yourself are
ever going to want to run it on a shared system I'd suggest rethinking it.
lets you span multiple nodes and not just be tied to the cores on a single SMP
system) or use an existing SMP framework like OpenMP. Both MPI and OpenMP are
open standards with multiple implementations (and it is OpenMP that implements
the OMP_NUM_THREADS limits for you).
make people produce more efficient code - we've certainly seen commercial codes
that come as both MPI and SMP variants where the MPI version scales better on an
SMP system than the SMP version!
can't mount both views at the same time) and I don't believe your code will see a
different number of cores available, it'll just try and run all its threads on whatever
cores have been allocated. It would be nice if it could figure it out automatically!
vs. VSIPL++?
MPI strikes me as markedly better in every way. The experience of VSIPL++ demonstrates that practically nothing of the "explicit parallelization" that csamuel mentions need leak out to the library API, at least for a matrix operations library. I would guess that the performance advantage mentioned is a consequence of better cache behavior.
vs. VSIPL++?
vs. VSIPL++?
vs. VSIPL++?
comparison.
while Eigen2 is a versatile library for linear algebra: vectors, matrices, and related algorithms.
both GPL and LGPLv3 (to avoid all header-include licensing rannygazoo).
never given me patches to make my application (krita) use VSIPL++, while the eigen people have
done just that.
nice examples for high-level things like convolution. The Eigen api seems a lot nicer to me, but
that may because I am used to it or because even I, who hasn't had maths at school to speak of,
can understand what the expression mean. somtimes.
compare. And I'm sure the Eigen people would like to see it, too :-)
vs. VSIPL++?
Eigen vs. uBLAS
Eigen vs. uBLAS
includes (among others) uBlas.
It may be found here:
http://eigen.tuxfamily.org/index.php?title=Benchmark
Eigen vs. uBLAS
Eigen vs. uBLAS
article, when asked to explain what is different about Eigen:
dynamic-size dense, sparse. For example, BLAS and LAPACK handle only
dynamic-size dense matrices. Even MKL and vecLib have only limited support
for fixed-size matrices."
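
To illustrate the distinction the quote is drawing, a short sketch of my own using Eigen's public types (not an excerpt from the article): fixed-size matrices carry their dimensions in the type, so storage can live on the stack and loops can be unrolled, while dynamic-size matrices are sized at run time, as in BLAS/LAPACK.

#include <Eigen/Core>
#include <iostream>

int main() {
    // Fixed-size: 3x3 known at compile time.
    Eigen::Matrix3f a = Eigen::Matrix3f::Identity();

    // Dynamic-size: dimensions chosen at run time.
    Eigen::MatrixXf b = Eigen::MatrixXf::Random(5, 3);

    // The same expression syntax works for both kinds.
    std::cout << (b * a).transpose() << std::endl;
    return 0;
}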