User: Password:
Subscribe / Log in / New account

Garbage collection and MM

Garbage collection and MM

Posted Mar 8, 2007 10:55 UTC (Thu) by aanno (guest, #6082)
In reply to: Garbage collection and MM by ncm
Parent article: Short topics in memory management

I completly agree with you, mcn. Cache locality will probably never be fixed with GC based languages. But as far as I understand, in many GC algorithms, there is also a 'walker' that marks memory that is not reachable any more. It is a bad idea to walk onto a swapped-out page, though.

And I could imagine that this problem could be diminished. The paper I linked to takes this direction.

Even if GC based language will not predominate, they have to be taken into account more than let's say 20 years ago. The fact that many Linux users don't like Java or .NET will not make this langauges to disappear. And even if they would, what's about Python, Ruby, Perl and the like?

GC is even mentioned in this discussion to be used within firefox (see above). And I heart that the GCC suite also uses GC (beginning with version 3.0).

GC will always be problematic. But it is a thing that could be improved by collaboration of kernel and user land knowledge. My point of view is that the research of Hertz points to the right direction. I was interesting to read that this sort of collaboration could also solve other MM related problems.

(Log in to post comments)

Garbage collection and MM

Posted Mar 8, 2007 12:22 UTC (Thu) by nix (subscriber, #2304) [Link]

Yes, GCC 3.0 did start using GC (because the lifetime rules of objects in the compiler were unfathomably complex). It fixed an unknown but large number of bugs... and slowed down the compiler a *lot* due to cache locality issues.

Years ago now, someone (Mike Stump?) ran some benchmarks that showed GCC incurring cache stalls every *twenty instructions* or thereabouts. Small wonder that it slowed down!

Careful moves are now underway (and have been for a while) to migrate objects with simple lifetime rules back into obstacks. The obstacks are still garbage-collected, but the obstack is a *single* GCed object with good cache locality, where the myriads of objects it replaces were not.

I doubt GCC will ever leave GCC, either: for objects with complex lifetime rules, there's really no maintainable alternative. But for the simple ones, using suballocators with better cache locality (like obstacks) is a good idea.

Garbage collection and MM

Posted Mar 8, 2007 15:25 UTC (Thu) by pflugstad (subscriber, #224) [Link]

But as far as I understand, in many GC algorithms, there is also a 'walker' that marks memory that is not reachable any more. It is a bad idea to walk onto a swapped-out page, though.

This has not been the case for several years now. Most GCs (certainly in Java) systems use a generational/compacting collector. As such, dead objects aren't touched at all. And this was 3+ years ago - it's gotten even better since then. When you do Java, you really do need to re-think how you program.

As someone else said, GC is everywhere these days, even embedded. The ease and clarity of developing in a GC language (Python, Perl, Java, etc), far outweigh the performance penalty you may see with GC. This is especially true for the vast majority of programs where performance is not seriously a concern, such as those with human interactions. I've done a lot C. I've done a lot of C++. I've done Python. I've done Java. I'll take Python/Java 6 days a week (but not twice on Sundays - sometimes you do need performance :-).

Garbage collection and MM

Posted Mar 8, 2007 16:32 UTC (Thu) by aanno (guest, #6082) [Link]

Right. I oversimplified this. Generational/compacting collectors have no walker but (a) have a second indirection (that slows down things) and (b) are not memory efficient (as they roughly use twice the space that is theoretically needed). Certainly Hertz compares his GC against this type. Refer to the paper to get all the details.

Garbage collection and MM

Posted Mar 8, 2007 20:35 UTC (Thu) by ncm (subscriber, #165) [Link]

Python is not, at present, a GC language. Neither is Perl.

However, people are working on GC implementations. Expect complaints about cache abuse by scripts, too, soon.

Garbage collection and MM

Posted Mar 9, 2007 0:04 UTC (Fri) by njs (guest, #40338) [Link]

Python is certainly a GC language (since version 2.0). It happens to optimize that GC by using reference counting to catch the easy cases, and only doing mark/sweep type stuff occasionally to catch reference cycles, but it definitely has a full GC.

I'm told that some of the hottest Java GC techniques actually involve reference counting these days, because you can massively optimize your actual walking -- the only time a cycle can be created is when a reference count is decremented, and thus achieves a number greater than 0. This is pretty rare, and it also tells you that any cycle that was just created must involve that object in particular, so you don't have to tromp through all memory either.

None of this affects your original point, though, because reference counting already trashes caches by itself -- especially in the multiprocessor case, where supposedly read-only access to variables is suddenly triggering cache flushes...

Depending on your cache hierarchy and the characteristics of your GC, you can minimize its impact, though. E.g., in gcc, I thought I remember some trick where you only run the collector between passes, since you know already that that's when everything becomes garbage, and also where it doesn't matter if you trash the cache? Similarly, Graydon was saying something about in initial implementations of firefox's GC, they would just run it after page load, because no-one notices if the browser pauses for 400 milliseconds then, they're just looking at the page.

Long run: build garbage collection into the RAM hardware! That'll work around those pesky cache issues ;-)

Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds