Ruby Performance (Linux Journal)
Antonio Cangiano posted a Ruby Implementation Shootout on his blog last week. While it's an interesting piece (and will likely be more interesting over time), it's still very premature.

Ruby Performance (Linux Journal)
Posted Feb 27, 2007 0:52 UTC (Tue) by BrucePerens (guest, #2510) [Link] (14 responses)

My Rails applications seem to fragment the heap over time, without obvious leaks (I've looked pretty thoroughly). So I think there's room for experimentation with the memory allocator running under Ruby.

Bruce

A reminder to benchmarkers
Posted Feb 27, 2007 3:43 UTC (Tue) by ncm (guest, #165) [Link] (13 responses)

These days, memory bandwidth is a crucial factor for real applications, and may not be captured in benchmarks. If you are writing about performance, you should present some material on locality of reference and memory usage.

We can expect implementations based on garbage-collecting VMs to do badly where locality matters. If previous experience is any guide, we will see plenty of artificial benchmarks demonstrating rosy (or, in Sr. Cangiano's case, verdant) performance numbers on these targets, but badly disappointed users. I hope this expectation proves wrong, but wouldn't bet that way.

garbage-collecting VMs
Posted Feb 27, 2007 4:19 UTC (Tue) by BrucePerens (guest, #2510) [Link] (2 responses)

There is lots of room for tuning allocation and object storage, so I'm not quite so willing to give up. And the interpreter's pretty good now. I have a few Rails applications in production, and I wish I had as much load as they can service with only one dispatcher.

Bruce

garbage-collecting VMs
Posted Feb 27, 2007 4:53 UTC (Tue) by ncm (guest, #165) [Link] (1 response)

I'm optimistic for a different reason: there's no need to run one's Ruby programs on a GC VM. If they turn out slow under real loads, it will be easy to tell and easy to switch. It will only be unfortunate on shared servers, where it may be hard to convince some users that they're unfairly hammering the rest, because their CPU usage looks minimal while they thrash the bus.

garbage-collecting VMs
Posted Feb 27, 2007 5:42 UTC (Tue) by BrucePerens (guest, #2510) [Link]

I wasn't aware of any non-GC environments for Ruby. There are node-traversal interpreters rather than VMs, but they still GC. And I'm curious about what you think would be better. I sometimes prefer reference counting, as with smart pointers in C++, and double-indirect schemes that work with compacting collectors, but the cache performance of such things is worse than GC on average. GC is only worse when the collector is running.

Thanks

Bruce

garbage-collecting VMs
Posted Feb 27, 2007 9:04 UTC (Tue) by flewellyn (subscriber, #5047) [Link] (9 responses)

> We can expect implementations based on garbage-collecting VMs to do badly where locality matters.

That's not necessarily true. It depends on the type of GC. A good compacting or copying collector, an ephemeral collector, and other strategies can help enormously with locality issues, especially with long-lived objects (and with a copying or ephemeral collector, short-lived objects don't matter anyway). Collectors can, in fact, be tuned to improve locality over hand-allocated memory. So let's not spread old misconceptions about GC, okay? It's the 21st century; we've had the technique for fifty years, and it's well proven by now. :-)

garbage-collecting VMs
Posted Feb 27, 2007 19:29 UTC (Tue) by ncm (guest, #165) [Link] (8 responses)

People have been saying that for decades. Somehow, fifty years down the road, programs that depend on GC and address real-world problems still exhibit abysmal memory-usage patterns, and often abysmal performance, despite stellar benchmark numbers. "Strategies that can help" evidently do not help enough, or are hard enough to apply that they aren't applied. GC programs have always made bad neighbors, and there is no evidence of progress there. Meanwhile, the machines are not getting faster any more, and are ever more dependent on good cache behavior.

How many more decades will it be before it's OK to say that the GC experiment has failed?

garbage-collecting VMs
Posted Feb 27, 2007 20:29 UTC (Tue) by flewellyn (subscriber, #5047) [Link] (7 responses)

Actually, people have been saying "Eww, GC = slow" for decades, despite huge amounts of research and development into making GCs faster and more robust.

I will freely admit that there are probably workloads in which GC can cause problems, but let's not relegate the whole concept to the bit bucket just because it doesn't work in all cases.

Perhaps what we really need is some kind of "happy medium" system, in which a programmer COULD manage memory manually if necessary, but can otherwise leave it up to a GC. I'm not sure what such a system would look like, but it would be interesting to see.

garbage-collecting VMs
Posted Feb 28, 2007 1:38 UTC (Wed) by drag (guest, #31333) [Link] (5 responses)

Maybe something like Pyrex?

http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/...

As you probably know, Python is a slow language. However, it is used in lots of places that demand very high performance from programs. The way you deal with this, as I am told, is that you write the program you want in Python. Use it and profile it. If it doesn't give acceptable performance, find the portions of the code that are slow and figure out optimizations for them.

If you optimize as much as you can in Python and it's still slow, then you take your now highly optimized Python code and translate it to C, which you then compile and import into your Python program as a module.

The key to making it all work is to write the program using generic code and _then_ identify bottlenecks (instead of speculating ahead of time), then refactor the code until it's no longer possible to get better performance, and _then_ rewrite it in a lower-level language. That way you end up with a working program first (albeit a slow one), and then spend your time as wisely as possible making it fast, with the benefit of some real-world experience and testing.

That's the theory, anyway. How well it works out is anyone's guess.

One of the major downsides to this approach is that although writing Python programs is easy, writing Python extensions in C is not. It requires a significant amount of boilerplate code, and it can be tricky to make things 'Python-ish'.

That is why Pyrex was invented. You can take straight Python code (with some caveats), compile it with Pyrex, and get compiled Python. Not really that much faster by itself, but the key is that you can then mix and match C code with Python code, letting the two intermingle freely in the module you're making, without all the boilerplate cruft. Often you can end up with superior results compared to writing in pure C or importing a C/C++ library through bindings and such.

To sum it up: "Pyrex is Python with C data types."

garbage-collecting VMs
Posted Feb 28, 2007 2:29 UTC (Wed) by flewellyn (subscriber, #5047) [Link] (1 response)

Well, that's kind of a separate issue from memory management, though. You can use a GC with a compiled language: Lisps have had it for years, and have been compiled since the 70s. Heck, you can use the Boehm-Demers-Weiser GC with C or C++. So memory management is orthogonal to compilation vs. interpretation.

Thanks for the tip, though. I do write a fair amount of Python code.

garbage-collecting VMs
Posted Feb 28, 2007 4:27 UTC (Wed) by drag (guest, #31333) [Link]

I wasn't really thinking about compiled vs. interpreted. But I figured that if you wrote in something like Pyrex, you could use C-like code to do memory management manually, and then use Python-like code when you don't feel like it.

garbage-collecting VMs
Posted Feb 28, 2007 5:31 UTC (Wed) by jdell (guest, #25923) [Link] (2 responses)

You can do the same thing in Ruby with RubyInline. Very elegant:

http://www.zenspider.com/ZSS/Products/RubyInline/

garbage-collecting VMs
Posted Feb 28, 2007 6:11 UTC (Wed) by drag (guest, #31333) [Link] (1 response)

Yeah, there is a PyInline also, if you're curious.

http://pyinline.sourceforge.net/

On that Pyrex page I linked to, they mention it. They say that it's nice, but it only converts basic types, and you can't use it to make new Python types. Maybe RubyInline is better. Both of them are based on the Perl Inline concept, of course.

garbage-collecting VMs
Posted Feb 28, 2007 18:24 UTC (Wed) by jdell (guest, #25923) [Link]

RubyInline also only auto-converts basic types. From their web page: "Automatic conversion between ruby and C basic types: char, unsigned, unsigned int, char *, int, long, unsigned long".

Yes, you are absolutely correct: it seems to me that WRT Python and Ruby, if it is interesting and worth doing, it has already been done in Perl :-)

garbage-collecting VMs
Posted Feb 28, 2007 7:53 UTC (Wed) by ldo (guest, #40946) [Link]

The problem is that even while the software continues to improve the locality of reference of garbage collectors, the hardware continues to increase the cost of violating locality of reference. CPU speeds keep increasing much faster than RAM speeds, which is why you have L1 and L2 caches (and sometimes even L3 ones) to bridge the gap. As the gap widens, so does the cost of crossing it.

It's been about half a century since the concepts of machine-independent languages and garbage collection were invented; the first is pretty much universally taken for granted these days, the second is not. This is despite processor speeds improving by something like five orders of magnitude over that time: everybody now accepts that you should write your code in machine-independent languages (even for something as machine-dependent as the Linux kernel!), but garbage collection is still too expensive for many uses, and will probably remain that way forever.

Ruby Performance (Linux Journal)
Posted Feb 27, 2007 16:05 UTC (Tue) by josh_stern (guest, #4868) [Link] (1 response)

Are there so many tradeoffs in the implementation of byte-code interpreters that it makes sense for so many new and different ones to be created?

Ruby Performance (Linux Journal)
Posted Feb 28, 2007 3:08 UTC (Wed) by josh_stern (guest, #4868) [Link]

Coincidentally, my question was mostly answered by this thread:

http://lambda-the-ultimate.org/node/1617
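
[Editor's illustration] The profile-first, optimize-later workflow drag describes above can be sketched with Python's standard-library profiler. This is a hypothetical sketch, not code from the thread: `slow_norms` and `fast_norms` are made-up stand-ins for a real bottleneck and its pure-Python optimization, the step you would take before reaching for C or Pyrex.

```python
import cProfile
import io
import math
import pstats

# Hypothetical hot spot, written the straightforward way first
# (step 1 of the workflow: just get a working program).
def slow_norms(points):
    total = 0.0
    for x, y in points:
        total += math.sqrt(x ** 2 + y ** 2)
    return total

# Step 3: once profiling shows slow_norms dominates, optimize it
# within Python before rewriting it in a lower-level language.
def fast_norms(points):
    return sum(math.hypot(x, y) for x, y in points)

points = [(i * 0.5, i * 0.25) for i in range(100_000)]

# Step 2: measure instead of speculating about where the time goes.
profiler = cProfile.Profile()
profiler.enable()
result = slow_norms(points)
profiler.disable()

stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
report = stream.getvalue()      # per-function call counts and times
assert "slow_norms" in report   # the bottleneck shows up by name

# The optimized version must remain a drop-in replacement.
assert math.isclose(result, fast_norms(points), rel_tol=1e-9)
```

If `fast_norms` still dominated the profile after this, the next step in the workflow would be rewriting just that function as a C extension module (or in Pyrex), leaving the rest of the program untouched.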
      
 
           