> That's the problem with GC. It works quite well in tests, but not so well in practice. What happens when your application works for half-hour and memory is badly fragmented?
Actually, fragmentation is usually less of an issue on garbage-collected systems, because the GC can defragment memory, which isn't feasible in languages like C where pointers aren't opaque.
> What happens when there are some other applications in background which also need to frequently run GC?
Why should that be a problem?
Posted Aug 22, 2011 12:43 UTC (Mon) by khim (subscriber, #9252)
[Link]
Actually, fragmentation is usually less of an issue on garbage-collected systems, because the GC can defragment memory, which isn't feasible in languages like C where pointers aren't opaque.
Right. And this is where you experience the most extreme dropouts and slowdowns. How will you compact a heap with multimegabyte arrays without significant delays?
And THAT is the problem
Posted Aug 22, 2011 15:04 UTC (Mon) by HelloWorld (guest, #56129)
[Link]
> Right. And this is where you experience the most extreme dropouts and slowdowns.
Really? Do you have any data to back this up? Can you cite any measuremeants to that effect?
> How will you compact a heap with multimegabyte arrays without significant delays?
I don't know, but that doesn't mean it's not possible. I mean, people have been writing books about GCs (e. g. http://www.amazon.com/dp/0471941484/), do you really expect me to answer this kind of questions in an lwn comment?
And THAT is the problem
Posted Aug 22, 2011 17:00 UTC (Mon) by khim (subscriber, #9252)
[Link]
> Right. And this is where you experience the most extreme dropouts and slowdowns.
Really? Do you have any data to back this up? Can you cite any measuremeants to that effect?
Ah, now we back to the whole BFS debate. No, I have no benchmarks present. I know how to make Java interface not "extremely sucky" but "kind-of-acceptable" - but yes, it's kind of black magic and I'm not 100% sure all techniques are actually required and proper. The main guide are dropout benchmarks for real programs: you just log timing of operations and tweak the architecture till they show acceptable timings. And I know I never need such black magic for simple, non-GC-driven programs: there I can measure timings without a lot of complicated experiments and be reasonably sure this will translate well to the end product. Not so with GC.
> How will you compact a heap with multimegabyte arrays without significant delays?
I don't know, but that doesn't mean it's not possible. I mean, people have been writing books about GCs (e. g. http://www.amazon.com/dp/0471941484/), do you really expect me to answer this kind of questions in an lwn comment?
Sure. It's typical problems for real programs. And if the best answer you can offer "there are a lot of papers on subject, surely CS wizards solved the problem long ago" then I'm not convinced.
Because these are the same "wizards from Ivory Tower" that proclaimed 20 years ago "Among the people who actually design operating systems, the debate is essentially over. Microkernels have won."
This was nice theory but practice is different. In practice two out of three surviving major OSes are microkernel-based only in name and one is not microkernel at all. I suspect the same will happen with GC: wizards promised that with GC you can just ignore memory issues and concentrate on what you need to do, but in practice it only works for packet computation (things like compilers or background indexers) while in UI you spend so much time fighting GC the whole savings become utterly pointless.
And THAT is the problem
Posted Aug 22, 2011 17:42 UTC (Mon) by cmccabe (guest, #60281)
[Link]
> Sure. It's typical problems for real programs. And if the best answer you
> can offer "there are a lot of papers on subject, surely CS wizards solved
> the problem long ago" then I'm not convinced
Why don't you check out the incremental garbage collector that was implemented in Android 2.3? It exists and is deployed in the real world, not an ivory tower.
Posted Aug 25, 2011 13:30 UTC (Thu) by renox (subscriber, #23785)
[Link]
incremental GC != real time GC.
To have no pause, you need a real time GC not merely an incremental GC!
And real time GCs, especially free one are rare indeed.
And THAT is the problem
Posted Sep 4, 2011 20:41 UTC (Sun) by cmccabe (guest, #60281)
[Link]
Non sequitor. User interfaces have never been a hard real-time problem. Human reaction latencies are measured in hundreds of milliseconds.
And THAT is the problem
Posted Aug 22, 2011 18:06 UTC (Mon) by pboddie (subscriber, #50784)
[Link]
I suspect the same will happen with GC: wizards promised that with GC you can just ignore memory issues and concentrate on what you need to do, but in practice it only works for packet computation (things like compilers or background indexers) while in UI you spend so much time fighting GC the whole savings become utterly pointless.
Someone gives you data and you just come back with more conjecture!
The good versus bad of GC is debated fairly actively in various communities. For example, CPython uses a reference-counting GC whose performance has been criticised from time to time by various parties for different reasons. As a consequence, implementations like PyPy have chosen different GC architectures. The developer of the HotPy implementation, who now appears to be interested in overhauling CPython, also advocates a generational GC, I believe, which means that there is some kind of emerging consensus.
There has even been quite a bit of work to measure the effect of garbage collection strategies on the general performance of virtual machines, and that has definitely fed into actual implementation decisions. This isn't a bunch of academics hypothesising, but actual real-world stuff.
And THAT is the problem
Posted Aug 22, 2011 19:38 UTC (Mon) by zlynx (subscriber, #2285)
[Link]
Personal experience in programming user interactive software in Java is not conjecture. It is fact.
I can back that up with my own personal experience. Java software and C#/.NET too will show unexpected and very annoying pauses whenever the GC is required to run. C or C++ software, even Perl and Python software never demonstrated this erratic behavior.
If you would like to see it for yourself, please run JEdit on a system with 256MB RAM while editing several files of several megabytes each. That is one application I know I experienced problems with while Emacs and vi never acted funny.
And THAT is the problem
Posted Aug 22, 2011 22:40 UTC (Mon) by cmccabe (guest, #60281)
[Link]
> I can back that up with my own personal experience. Java software and
> C#/.NET too will show unexpected and very annoying pauses whenever the GC
> is required to run.
The JVMs in use on servers do not use incremental garbage collection. I mentioned this is in the post that started this thread.
> C or C++ software, even Perl and Python software never
> demonstrated this erratic behavior.
CPython's garbage collector is based on reference counting, which is inherently incremental. So it's no surprise that you don't see long pauses there.
> If you would like to see it for yourself, please run JEdit on a system
> with 256MB RAM while editing several files of several megabytes each.
I would like to, but unfortunately systems with 256MB of RAM are no longer manufactured or sold.
> That
> is one application I know I experienced problems with while Emacs and vi
> never acted funny.
Ironically, emacs actually implements its own garbage collector, which is based on mark-and-sweep. So apparently even on your ancient 256 MB machine, old-fashioned stop-the-world GC is fast enough that you don't notice it. As a side note, it's a little strange to see emacs held up as a shining example of efficient programming. The sarcastic joke in the past was that emacs stood for "eight megs and constantly swapping." Somehow that doesn't seem like such a good punchine any more, though. :)
The bottom line is that well-implemented garbage collection has its place on modern systems.
And THAT is the problem
Posted Aug 23, 2011 2:26 UTC (Tue) by viro (subscriber, #7872)
[Link]
boot with mem=256M in kernel command line, assuming that it wasn't a snide "who the fsck cares about insufficiently 31137 boxen???"...
And THAT is the problem
Posted Aug 24, 2011 18:07 UTC (Wed) by cmccabe (guest, #60281)
[Link]
It was just a snark. But thanks for the suggestion; it could be interesting for testing low-memory conditions.
There are nothing ironic here...
Posted Aug 24, 2011 9:07 UTC (Wed) by khim (subscriber, #9252)
[Link]
Ironically, emacs actually implements its own garbage collector, which is based on mark-and-sweep. So apparently even on your ancient 256 MB machine, old-fashioned stop-the-world GC is fast enough that you don't notice it.
This machine may be ancient, but emacs is beyond ancient. You said it yourself:
As a side note, it's a little strange to see emacs held up as a shining example of efficient programming. The sarcastic joke in the past was that emacs stood for "eight megs and constantly swapping."
It's still good punchline - but now it can be used as showcase for GC. If you system is so beefy, is so overpowered, is so high-end that you can actually throw 10 times as much on the problem as it actually needs... then sure as hell GC is acceptable. But this is not what GC proponents will tell you, right?
And THAT is the problem
Posted Aug 23, 2011 5:55 UTC (Tue) by alankila (subscriber, #47141)
[Link]
Note that even in java 6, there are multiple GC strategies to pick from today. Things change -- what was true in 2000 or whenever you conducted your tests (based on 256 MB of memory) isn't true today. And we don't know if this was because jEdit was configured with so much heap that it swapped out of your RAM or whatever...
I do know however that even the mark-and-sweep collectors in practice limit pauses to less than 100 ms because I have written audio applications in java with shorter buffers than 100 ms and they run without glitching. This sort of application should have its heap size tuned appropriately because too large heap will have lot of objects to collect when the cycle triggers, and this in turn can cause glitches, whereas a small heap will have frequent but fast enough collections.
G1GC strategy looks very promising, because it concentrates GC effort to memory regions that are likely to be free with very little work, and supports soft realtime constraints for limiting the length of a single GC cycle. It looks to be something to the tune of order of magnitude faster than the other strategies, but I haven't personally tried it yet.
This is my experience as well...
Posted Aug 24, 2011 9:00 UTC (Wed) by khim (subscriber, #9252)
[Link]
This sort of application should have its heap size tuned appropriately because too large heap will have lot of objects to collect when the cycle triggers, and this in turn can cause glitches, whereas a small heap will have frequent but fast enough collections.
Yeah. It works. But the main stated goal of the GC, it's raison d'être is the ability to forget about memory allocation. Remember: no memory leaks, no more confusion about ownership, etc? GC pseudoscience fails to deliver, sorry. Just like relational database theory fails to deliver. It does not mean both are useless - if your goal is "something working" they often are "good enough". But the problem with Java is the fact that GC is imposed. You can not avoid it because standard library requires it. So in the end you fight to the death with the one thing which should "free you" from some imaginary tyranny.
This is my experience as well...
Posted Aug 25, 2011 10:37 UTC (Thu) by alankila (subscriber, #47141)
[Link]
Your arguments sound really bizarre to me.
Think about what I just said: I have positive experience working with a relatively low-latency application in the real world. To get it, all I have to do is adjust one tunable -- the heap size. And I hinted that with G1GC even that adjustment is now unnecessary, but I'll wait until G1GC is actually the default. JDK7 maybe, when it rolls out for OS X I'll probably check it out.
This is my experience as well...
Posted Aug 31, 2011 14:37 UTC (Wed) by nix (subscriber, #2304)
[Link]
So rather than agonizing over object ownership, we tweak one tunable, the heap size, yet the existence of this single tunable apparently makes GCs 'pseudoscience'?
Hint: GCs are actually studied, quite intensively, by actual computer scientists. Science is what scientists do: thus, GC research is science. Enough of the badmouthing. (It is quite evident to me at least that no amount of evidence will change your opinion on this score: further discussion is clearly pointless.)
And THAT is the problem
Posted Aug 23, 2011 11:00 UTC (Tue) by pboddie (subscriber, #50784)
[Link]
I can back that up with my own personal experience. Java software and C#/.NET too will show unexpected and very annoying pauses whenever the GC is required to run. C or C++ software, even Perl and Python software never demonstrated this erratic behavior.
Yet Perl and Python employ garbage collection. My point was that blanket statements about GC extrapolated from specific implementations of specific platforms don't inform the discussion, but I'm pretty sure that this is a replay of a previous discussion, anyway, fuelled by the same prejudices, of course.
Puhlease...
Posted Aug 24, 2011 8:52 UTC (Wed) by khim (subscriber, #9252)
[Link]
1. Perl and python are unbearably slow and laggy when used by itself. Thankfully noone in the right mind will ever try to use then without wide array of support C libraries. These "fast core" libraries don't employ GC.
2. Most perl and python scripts are batch-mode scripts. GC is perfectly Ok for such use (when you only care about throughput and not about latency).
3. Current implementations of perl and python use refcounting GC which is, of course, it as reliable WRT latency as manual memory allocation. We'll see what happens when PyPy and other "advanced" implementations will become mainstream.
Puhlease...
Posted Aug 25, 2011 10:55 UTC (Thu) by alankila (subscriber, #47141)
[Link]
1. The language speed is due to their interpreting loop rather than the fact they use GC. As a rule of thumb, an interpreter evaluating an opcode stream appears to be 10 times slower than compiler that translates it into native form.
2. Probably true. Missing from this discussion is the observation that even malloc/free can be unpredictable with respect to latency because they must maintain the free list and occasionally optimize it to maintain performance of allocations. No doubt advances to malloc technology have happened and will happen, and there are multiple implementations to pick from that expose different tradeoffs.
3. Python, I've been told, also contains a true GC in addition to the refcounter. I think the largest single advantage of a refcounter is that it gives a very predictable lifecycle for an object, often removing it as soon as the code exits a block. This makes user code simpler to write because filehandles don't need to be closed and database statement handles go away automatically. Still, this is synchronous object destruction, and scheduling object destruction during user think-time would give better user experience.
Puhlease...
Posted Aug 25, 2011 11:41 UTC (Thu) by pboddie (subscriber, #50784)
[Link]
1. The language speed is due to their interpreting loop rather than the fact they use GC. As a rule of thumb, an interpreter evaluating an opcode stream appears to be 10 times slower than compiler that translates it into native form.
Indeed. One can switch off GC and just let programs allocate memory until they exit to see the performance impact of GC, if one is really interested in knowing what that is. This is something people seem to try only infrequently and in pathological cases - it's not a quick speed-up trick.
2. Probably true. Missing from this discussion is the observation that even malloc/free can be unpredictable with respect to latency because they must maintain the free list and occasionally optimize it to maintain performance of allocations. No doubt advances to malloc technology have happened and will happen, and there are multiple implementations to pick from that expose different tradeoffs.
Quite right. This is not so different from any discussions of latency around garbage collectors.
3. Python, I've been told, also contains a true GC in addition to the refcounter.
CPython uses reference counting and a cycle detector. PyPy uses a generational GC by default, if I remember correctly. The PyPy people did spend time evaluating different GCs and found that performance was significantly improved for some over others.
I think the largest single advantage of a refcounter is that it gives a very predictable lifecycle for an object, often removing it as soon as the code exits a block.
This advantage is almost sacred to the CPython developers, but I don't think it is entirely without its own complications. Since we're apparently obsessed with latency now, I would also note that a reference counting GC is also unlikely to be unproblematic with regard to latency purely because you can have a cascade of unwanted objects and the GC would then need to be interruptable so that deallocation work could be scheduled at convenient times. This could be done (and most likely is done) in many different kinds of GC.
And THAT is the problem
Posted Aug 22, 2011 19:39 UTC (Mon) by HelloWorld (guest, #56129)
[Link]
> Ah, now we back to the whole BFS debate. No, I have no benchmarks present.
Well, then you don't really have a point, do you?
> Sure. It's typical problems for real programs.
Again, do you have any data to back this up? Because I really doubt that this is a problem for 99% of all applications.
> Because these are the same "wizards from Ivory Tower" that proclaimed 20 years ago "Among the people who actually design operating systems, the debate is essentially over. Microkernels have won."
> This was nice theory but practice is different. In practice two out of three surviving major OSes are microkernel-based only in name and one is not microkernel at all.
That doesn't mean anything as long as you don't show that the failure of microkernel based OSes on the desktop can be attributed directly to the microkernel design. There may well have been other factors: inertia, lack of hardware and software vendors, bad luck etc.. Also, microkernels were actually a success in the embedded world. QNX is just one example of this.
And THAT is the problem
Posted Aug 23, 2011 5:00 UTC (Tue) by raven667 (subscriber, #5198)
[Link]
Microkernels are also a success in the modern server world, the name has changed but the design is there, it is just called a hypervisor now
And THAT is the problem
Posted Aug 23, 2011 17:19 UTC (Tue) by bronson (subscriber, #4806)
[Link]
I don't think this is true. Microkernels are a collection of OS processes interacting to provide a rich interface to userland (file systems, networking stack, IPC, etc), while hypervisors don't intend to provide OS services at all. They're just a thin hardware abstraction layer so that multiple operating systems (microkernel or not) can be fooled into coexisting and maybe some very coarse resource management.
Like atoms vs. the solar system, they might look similar if you look from afar. It's certainly possible to cherry-pick theoretical similarities. For real-world work, though, they tend to be quite different.
And THAT is the problem
Posted Aug 25, 2011 20:37 UTC (Thu) by raven667 (subscriber, #5198)
[Link]
I was thinking that the combination of the hypervisor and the guest kernel in total formed the microkernel. Each guest has its own personality and file systems, network, ipc, etc. Information is passed back and forth through the hypervisor using interfaces that happen to look a lot like network cards and block devices, or use PV drivers which can do pure message passing. The big thing is that every bit runs in its own protected memory space which I thought was the big difference and value of a microkernel vs. a traditional kernel.
Like you said, I'm certainly standing from afar and squinting more than a little. 8-)
And THAT is the problem
Posted Aug 23, 2011 23:36 UTC (Tue) by rodgerd (guest, #58896)
[Link]
You mean like the microkernels used in MacOS, the iPhone that you're touting as the model of how to do things, and Windows?
Boy microkernels didn't get anywhere, did they.
And THAT is the problem
Posted Aug 24, 2011 0:12 UTC (Wed) by rahulsundaram (subscriber, #21946)
[Link]
MacOS and Windows do not claim to be micro kernels anymore and they are not. The only widespread use of micro kernel that I am aware of is QNX.
And THAT is the problem
Posted Aug 24, 2011 7:44 UTC (Wed) by anselm (subscriber, #2796)
[Link]
MacOS may be based on the Mach microkernel, but given that it has a big monolithic BSD emulation layer and nothing else on top it can by no stretch of the imagination be called a »microkernel OS«. (Andrew Tanenbaum probably wouldn't like it any more than he likes Linux.) Very similar considerations apply to Windows; having the graphics driver in the kernel is not what one would expect in a microkernel OS.
It is fair to say that the microkernel concept, while academically interesting, has so far mostly failed to stand up to the exigencies of practical application. There are exceptions (QNX comes to mind), but despite previous claims to the contrary no mainstream operating system would, in fact, pass as a »microkernel OS«. At least Linux is honest about it :^)