The Managed Runtime Initiative
Posted Jun 17, 2010 4:27 UTC (Thu) by elanthis (guest, #6227)
When you start dealing with very, very large applications -- like an entire browser, office suite, Photoshop-esque image editor, long-running server process, game engine, or many scientific tools -- then the number of allocated objects is very high, the total amount of allocated memory is often high, and the memory manager deals with an awful lot of allocation churn. These things are all really bad for most garbage collection algorithms.
They're really not all that great for classic C/C++-style manual memory management, either. A particularly good GC can actually be faster overall than even a well-coded application that manages memory manually. The problem is that the manual approach spreads its cost across every allocation and deallocation. That means there is never a big pause on any single allocation, and it also means that all of the memory manager's work happens at places in the code the programmer can easily identify. With automatic memory management, a big chunk of the garbage collection process (possibly all of it) can happen at any allocation, with no deterministic way to predict which one. Even with an incremental collector, this can lead to long pauses right in the middle of a relatively speed-sensitive bit of code; it's not a big deal if the GC runs during general idle-loop processing in a GUI app, but it can be noticeable if it kicks in during animation processing, for example.
So anyway, yeah, all garbage collected languages have the problem, just most garbage collected languages don't have so much as a single app on the scale of what Java or C# are frequently used for.
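To make the "pause can land on any allocation" point concrete, here is a minimal sketch in Python. It assumes CPython, whose cycle collector is far simpler than a JVM collector, so it only illustrates the shape of the problem. The function name `measure_gc_pauses` and the object counts are made up for the illustration; `gc.callbacks` is the standard hook CPython calls at the start and end of each collection.

```python
import gc
import time

def measure_gc_pauses(n_objects=200_000):
    """Allocate many cyclic objects and record how long each GC run takes."""
    pauses = []
    started_at = [0.0]

    def on_gc(phase, info):
        # CPython calls this with phase "start" before a collection
        # and "stop" after it.
        if phase == "start":
            started_at[0] = time.perf_counter()
        elif phase == "stop":
            pauses.append(time.perf_counter() - started_at[0])

    gc.callbacks.append(on_gc)
    try:
        keep = []
        for _ in range(n_objects):
            node = {}
            node["self"] = node  # reference cycle: only the cycle collector frees it
            keep.append(node)
            if len(keep) > 1000:
                keep.clear()     # drop our references, leaving garbage cycles behind
        gc.collect()             # force one final collection so it is also timed
    finally:
        gc.callbacks.remove(on_gc)
    return pauses

pauses = measure_gc_pauses()
print(f"{len(pauses)} collections, longest {max(pauses) * 1e3:.3f} ms")
```

The allocation loop never asks for a collection; the collector decides when to run, and whichever iteration happens to trigger it pays the whole pause.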
Posted Jun 17, 2010 4:57 UTC (Thu) by skitching (subscriber, #36856)
Note that Azul Systems isn't just talking about "large apps like an office suite". They sell 54-core CPUs and Java virtual machines designed to "scale to 670 GB of memory".
So we are not talking about optimising applets here, but systems that can run stock markets, large brokerage houses, etc. But such applications are also sometimes sensitive to latency. The old "stop the world" garbage collectors would be a real problem here, but even "incremental" collectors may cause latencies to some threads.
And (I haven't read the released code) perhaps they are more interested in moving some memory-management tasks from the kernel to user space. Things like large database systems often try to bypass OS behaviour and manage some things themselves (e.g. file caches) because they know better than the OS what the usage patterns are. It sounds plausible that a JVM may also have these needs.
Posted Jun 17, 2010 7:17 UTC (Thu) by helge.bahmann (subscriber, #56804)
... but only when you are willing to throw about 4 times the memory an equivalent explicitly-managed program would use at the problem: http://www.cs.umass.edu/~emery/pubs/gcvsmalloc.pdf
This unfortunately means that performance-wise garbage collection fares particularly badly with "large" applications where the working set size approaches the available RAM.
GC myth propagation
Posted Jun 17, 2010 7:17 UTC (Thu) by ncm (subscriber, #165)
"Eager", by the way, would be a more accurate moniker for what is normally done in C++, as there memory management, with the aid of destructors, is no wise more manual than in Lisp. Adopting the usage would allow GC promoters to hint at more honesty and scruples than their colleagues are inclined to exhibit.
Posted Jun 17, 2010 15:42 UTC (Thu) by martinfick (subscriber, #4455)
Hmmm, that sounds like better memory management to me. Good programmers have been writing small modular programs which do not live forever and work well together to create larger systems for decades. There are many reasons for this and easier/better memory management is on the list.
Java programmers/system designers seem to be the only programmers in the world who do not take advantage of this model. Maybe because with java, no one ever wants to leave the JVM, and doing a fork from java is considered a horrible mistake (and will take many servers down)? This NIH java disease is the reason that many java apps have horrible memory problems (a program already exists to do that; well, it's not in java... rewrite). Without using pipes and forking, it is hard to create large robust modular programs in any language. Only java programmers would even dream of it.
So, it is not java so much that is the problem, but the mindset when using it: I cannot fork, that would not be portable...(and uncool?) Not to mention the slow JVM startup times.
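As a rough illustration of the pipes-and-forking model described above, here is a minimal sketch in Python using the standard `subprocess` module. The two child programs and the function name `run_pipeline` are invented for the example; the point is only that each stage is a short-lived process whose memory is handed back to the OS when it exits.

```python
import subprocess
import sys

def run_pipeline():
    """Connect two short-lived processes with a pipe, shell-style."""
    # Stage 1: a producer that writes the numbers 1..100, one per line.
    producer = subprocess.Popen(
        [sys.executable, "-c", "for i in range(1, 101): print(i)"],
        stdout=subprocess.PIPE,
    )
    # Stage 2: a consumer that reads lines from stdin and prints their sum.
    consumer = subprocess.Popen(
        [sys.executable, "-c",
         "import sys; print(sum(int(line) for line in sys.stdin))"],
        stdin=producer.stdout,
        stdout=subprocess.PIPE,
        text=True,
    )
    # Close our copy of the pipe so the producer sees a broken pipe
    # if the consumer exits early.
    producer.stdout.close()
    output, _ = consumer.communicate()
    producer.wait()
    return int(output)

total = run_pipeline()
print(total)
```

Neither stage needs a long-lived heap: when a process exits, everything it allocated goes away at once, with no collector involved.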
Posted Jun 18, 2010 10:05 UTC (Fri) by marcH (subscriber, #57642)
On the other hand, please tell us how to implement Eclipse or Emacs using a bunch of independent and loosely connected processes.
Even better: please tell us how to implement something as powerful and fast as Linux using a micro-kernel and a bunch of loosely connected daemons :-P
Posted Jun 18, 2010 15:48 UTC (Fri) by martinfick (subscriber, #4455)
Linux, well, we have to reverse things here a bit, linux is somewhat large, but still modularization was defined early by the original unix devs. So the image looks more like this.
multics -> linux + 1000 other small unix utilities.
Remember how often the "do it in userspace" slogan is shouted by linux kernel devs. Perhaps if the java devs used a bit more of this approach (i.e. do not use the same JVM for everything) they would not be suggesting to expand the kernel just so that JVMs can get bigger! Note how different that is from normal requests for scalability. This is not a request "to be able to create 1 billion threads/processes and to switch efficiently and fairly among them", or "to communicate between them with low overhead"...; no, it is a request to fundamentally support using larger and larger monolithic programs. One might ask why the JVM even needs an OS: why not just run it straight on the iron and optimize the hell out of it for this single use case? Isn't that what the java devs really want? ;)
Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds